[comp.lang.c++] Managing C++ Libraries: Subdirectories and .c files

coggins@coggins.cs.unc.edu (Dr. James Coggins) (11/03/88)
Managing C++ Libraries: Subdirectories and .c Files

James Coggins and Greg Bollella
Computer Science Department
UNC-Chapel Hill

Organizing a library in a hierarchical directory structure simplifies
development, maintenance, and use of the library by contributing to
two desirable goals: (1) separation of concerns and (2) information
hiding.

Part of the justification for object-oriented design is the formal
separation of the concerns of the user, designer, and implementer of a
class.  In C++, the interface between these communities is formalized
in the header file.  The public part of the header file is a contract
between the user community and the class designer.  The entire header
file is a contract between the class designer and the implementer. It
is only natural that this separation of concerns should be reflected
in the large-scale organization of a library as well as in the
small-scale coding structures.  The encapsulation implemented in the
"class" concept is reinforced by physically separating the code of
different class implementations.  A single directory containing links
to headers helps to reinforce the separation between user concerns
(requiring frequent access to headers of many classes) and implementer
concerns (requiring access to code and header of the particular class
being implemented). 

The humble admission of the limited capacity of short-term memory leads
to the technique called information hiding (encapsulation).  We seek
to minimize the amount of data that a software developer or user must
manipulate at one time to correctly use the system.  This notion is an
essential justification of object-oriented design and should be
reflected in the large-scale structure of libraries as well as in the
design of code and languages. A suitable directory hierarchy hides
code, headers, and administrative concerns that are irrelevant to the
current activity.  Minimizing the number of names that must be
understood or manipulated at once ("surface area" in software physics
terms) is an important technique for simplifying the use of a large
body of software.

We organize Dr. Coggins' library in a three-level directory hierarchy
as shown in the simplified example below.  The library is located
under a directory (called "mainlib" below) with subdirectories for
sub-libraries consisting of groups of related classes - several of
these groups are stand-alone inheritance hierarchies, while others are
collections of topically related classes.  Each class has its own
subdirectory in the appropriate sublibrary.  Of course, a smaller
library might be organized without the intermediate level; the main
directory might contain the class subdirectories. Since users most
often refer to header files, we have another subdirectory called
"headers" that contains soft links (ln -s) to all of the .h files
throughout the library. 

                             mainlib
                            /   |   \
                           /    |    \
                    sublib1  headers  sublib2
                    /     \          /     |   \ 
                   /       \        /      |    \
                ClassA   ClassB  ClassC ClassD  ClassE

      Figure 1: Example of subdirectory structure for a library
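
The layout in Figure 1, including the "headers" directory of soft
links, could be set up along the following lines.  This is only a
sketch: the function names make_skeleton and link_headers are
hypothetical, and it assumes each class directory holds its .h files
directly (for example ClassA/ClassA.h), which the text above does not
spell out.

```shell
#!/bin/sh
# Sketch only.  Directory names are the placeholders from Figure 1;
# the function names and the header layout are assumptions.

# Create mainlib/{headers,sublib1,sublib2} and the class directories.
make_skeleton() {
    root=$1
    mkdir -p "$root/headers" \
             "$root/sublib1/ClassA" "$root/sublib1/ClassB" \
             "$root/sublib2/ClassC" "$root/sublib2/ClassD" \
             "$root/sublib2/ClassE"
}

# Place a soft link in headers/ for every .h file found in a class
# directory (mainlib/<sublib>/<Class>/*.h).
link_headers() {
    root=$1
    abs=$(cd "$root" && pwd)      # absolute targets keep the links valid
    for h in "$abs"/*/*/*.h; do
        [ -f "$h" ] || continue   # skip the unexpanded pattern if no match
        ln -s -f "$h" "$abs/headers/$(basename "$h")"
    done
}
```

Running make_skeleton once and link_headers after each new header is
added keeps the "headers" directory current; the -f flag lets the
links be re-created without error.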

Each directory contains a Makefile that performs appropriate
operations for a directory at that level.  The Makefile in mainlib
invokes the makefiles for sublibraries; those makefiles invoke the
makefiles for class directories.  Thus, the directory structure
implements a version of the object-oriented philosophy at the level of
library organization.  The Makefiles at each level share a
common format, reproduced below, that simplifies addition of new member
functions, classes, or sublibraries.


#  MAKEFILE FOR MAINLIB -------------------------------------------

.SILENT:
A = sublib1
B = sublib2
#C = 
#D = 

compile: 
	echo "Perform all compilations"
	(cd $A;  make all)
	(cd $B;  make all)
#	(cd $C;  make all)
#	(cd $D;  make all)
	echo "Compilations complete"

cleanup:
	echo "Cleanup all libraries"
	(cd $A;  make cleanup)
	(cd $B;  make cleanup)
#	(cd $C;  make cleanup)
#	(cd $D;  make cleanup)
	echo "Cleanup complete"

create:
	echo "Create mainlib.a from scratch"
	(cd $A;  make library)
	(cd $B;  make library)
#	(cd $C;  make library)
#	(cd $D;  make library)
	touch mainlib.a
	rm mainlib.a
	mv newlib.a mainlib.a
	ranlib mainlib.a
	echo "mainlib.a complete"

#  END OF MAKEFILE FOR MAINLIB------------------------------------
#  MAKEFILE FOR SUBLIB1 ------------------------------------------

.SILENT:
LIB = SUBLIB1
# MAINLIB must be the full path name of the .a file
MAINLIB = /.../mainlib/newlib.a
A = ClassA
B = ClassB
#C = 
#D = 

all:
	echo "$(LIB) Begin Compilation"
	(cd $A; make all)
	(cd $B; make all)
#	(cd $C; make all)
#	(cd $D; make all)
	echo "$(LIB) Compilation Complete"

cleanup:
	echo "$(LIB) Begin Cleanup"
	(cd $A; make cleanup)
	(cd $B; make cleanup)
#	(cd $C; make cleanup)
#	(cd $D; make cleanup)
	echo "$(LIB) Cleanup Complete"

library:
	echo "$(LIB) Create library"
	ar lq $(MAINLIB) $A/*.o
	ar lq $(MAINLIB) $B/*.o
#	ar lq $(MAINLIB) $C/*.o
#	ar lq $(MAINLIB) $D/*.o
	echo "$(LIB) Create library complete"

#  END OF MAKEFILE FOR SUBLIB1 ------------------------------------------
#  MAKEFILE FOR CLASS CLASSA --------------------------------------------

.SILENT:
CC=CC
CFLAGS= +e0 -fswitch
CLASS = CLASSA
OBJ = CDest.o CNull.o Cintint.o reset.o \
      draw.o compute.o

all:
	echo "$(CLASS) Begin Compilation"
	make $(OBJ)
	echo "$(CLASS) Compilation complete"

cleanup:
	echo "$(CLASS) Cleanup"
	/bin/rm *.o
	echo "$(CLASS) Cleanup complete"

.c.o:
	echo "Begin $*.c"
	$(CC) $(CFLAGS) -c $<

#  END OF MAKEFILE FOR CLASSA ------------------------------------------


To add a new sublibrary or to add a new class to a sublibrary, define
the next available symbol and remove the comment marks from the
subsequent lines involving that symbol.  Adding a new member function
to a class's Makefile is even easier: add the name of its .o file to
the OBJ symbol definition. (Note that if the list extends to multiple
lines, the \ escape must be used at the end of each nonfinal line.)

The third Makefile above takes advantage of the implicit rules in Make
for transforming .c files to .o files (see make(1)).  We modify the
implicit rule .c.o to echo a status message as each compilation
begins. 

The third Makefile illustrates another technique we have used to
organize and optimize use of the library.  We store each non-inline
member function in a separate .c file.  This practice requires
tradeoffs that must be evaluated to determine whether the technique is
worthwhile for a particular case. 

Advantages of storing each function in a separate .c file include:

(1) The linker will not attempt to break apart .o files.  Therefore,
if all functions of a class are stored in a single .c file, a
reference to ANY function will cause ALL of the functions to be linked
into the application program's a.out file.  Separating member
functions into separate .c files helps to minimize the size of a.out
files and decrease link time. 

(2) A member function can be recompiled without recompiling the entire
class.  Since this is a common operation, we consider the savings to
be significant.

Disadvantages of storing each function in a separate .c file include:

(1) You must name the .c file for each function.  This becomes a
problem if there exist many overloaded versions of some operations
since, in effect, the file name must encode the arguments to
distinguish the functions.  By convention, we name all constructors
with an upper-case C followed by an encoding of the arguments to the
constructor.  CDest is the destructor, CNull is the constructor with
no arguments, CCopy is the constructor that takes a reference to an
object of the same class, Cintint takes two integers, etc.  If the
names assigned to the .c files are not unique across the entire
library, then when the library is created, you must suppress
elimination of duplicate names by using the q option on the ar
command.  Unfortunately, this means that the .a file must be
re-created from the .o files whenever any member function in the
library is recompiled.  However, by assigning a (one-character) prefix
to the .c file names designating the class, one can sidestep this
problem at a small cost in setup overhead. 

(2) Compilation of the whole library takes MUCH longer because all
relevant header files must be processed for each member function
instead of once for each class.  We have found this potential
disadvantage to be largely irrelevant because we rarely need to
recompile the entire library.
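
Whether the file names are unique across the library, the condition
raised under disadvantage (1), can be checked mechanically.  The
following is a minimal sketch; the function name
find_duplicate_sources is hypothetical, and it assumes every .c file
under the given directory ends up as a like-named .o member of the
archive.

```shell
#!/bin/sh
# Sketch only: list each .c base name that appears in more than one
# directory under the given root.  Such names would collide as archive
# members unless ar is run with the q option.
find_duplicate_sources() {
    root=$1
    find "$root" -type f -name '*.c' |
        sed 's|.*/||' |    # strip the directory part, keep the file name
        sort |
        uniq -d            # print only names that occur more than once
}
```

If the function prints nothing, no two classes share a file name.  A
name such as CNull.c appearing in the output means that two or more
classes use it, which is the case the per-class prefix is meant to
avoid.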