coggins@coggins.cs.unc.edu (Dr. James Coggins) (11/03/88)
Managing C++ Libraries: Subdirectories and .c Files

James Coggins and Greg Bollella
Computer Science Department
UNC-Chapel Hill

Organizing a library in a hierarchical directory structure simplifies development, maintenance, and use of the library by contributing to two desirable goals: (1) separation of concerns and (2) information hiding.

Part of the justification for object-oriented design is the formal separation of the concerns of the user, designer, and implementer of a class. In C++, the interface between these communities is formalized in the header file. The public part of the header file is a contract between the user community and the class designer. The entire header file is a contract between the class designer and the implementer. It is only natural that this separation of concerns should be reflected in the large-scale organization of a library as well as in the small-scale coding structures. The encapsulation implemented in the "class" concept is reinforced by physically separating the code of different class implementations. A single directory containing links to headers helps to reinforce the separation between user concerns (requiring frequent access to headers of many classes) and implementer concerns (requiring access to the code and header of the particular class being implemented).

The humble admission of the limited capacity of short-term memory leads to the technique called information hiding (encapsulation). We seek to minimize the amount of data that a software developer or user must manipulate at one time to correctly use the system. This notion is an essential justification of object-oriented design and should be reflected in the large-scale structure of libraries as well as in the design of code and languages. A suitable directory hierarchy hides code, headers, and administrative concerns that are irrelevant to the current activity.
Minimizing the number of names that must be understood or manipulated at once ("surface area" in software physics terms) is an important technique for simplifying the use of a large body of software.

We organize Dr. Coggins' library in a three-level directory hierarchy as shown in the simplified example below. The library is located under a directory (called "mainlib" below) with subdirectories for sub-libraries consisting of groups of related classes - several of these groups are stand-alone inheritance hierarchies, while others are collections of topically related classes. Each class has its own subdirectory in the appropriate sublibrary. Of course, a smaller library might be organized without the intermediate level; the main directory might contain the class subdirectories. Since users most often refer to header files, we have another subdirectory called "headers" that contains soft links (ln -s) to all of the .h files throughout the library.

                      mainlib
                    /    |    \
                   /     |     \
            sublib1   headers   sublib2
             /   \              /  |  \
            /     \            /   |   \
        ClassA  ClassB    ClassC ClassD ClassE

Figure 1: Example of subdirectory structure for a library

Each directory contains a Makefile that performs appropriate operations for a directory at that level. The Makefile in mainlib invokes the makefiles for sublibraries; those makefiles invoke the makefiles for class directories. Thus, the directory structure implements a version of the object-oriented philosophy at the level of library organization. All of the Makefiles at each level have a common format, duplicated below, that simplifies addition of new member functions, classes, or sublibraries.
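
As a brief aside before the Makefiles: the "headers" directory described above could be populated by a small script. The following is a minimal sketch, not the authors' actual procedure. The function name link_headers, the choice of relative link targets, and the temporary demo tree are all assumptions made for illustration; it also assumes the three-level layout of Figure 1, with the .h files living in the class directories.

```shell
#!/bin/sh
# link_headers MAINLIB  (hypothetical helper, not part of the original library)
# Create MAINLIB/headers and fill it with soft links to every .h file
# found at MAINLIB/<sublib>/<Class>/*.h.
link_headers() {
    mainlib=$1
    mkdir -p "$mainlib/headers"
    for h in "$mainlib"/*/*/*.h; do
        [ -f "$h" ] || continue      # skip the literal pattern if nothing matched
        rel=${h#"$mainlib"/}         # e.g. sublib1/ClassA/ClassA.h
        # Relative target, resolved from inside MAINLIB/headers.
        ln -sf "../$rel" "$mainlib/headers/$(basename "$h")"
    done
}

# Demonstration on a throwaway tree shaped like part of Figure 1.
demo=$(mktemp -d)
mkdir -p "$demo/mainlib/sublib1/ClassA" "$demo/mainlib/sublib2/ClassC"
echo '// ClassA interface' > "$demo/mainlib/sublib1/ClassA/ClassA.h"
echo '// ClassC interface' > "$demo/mainlib/sublib2/ClassC/ClassC.h"
link_headers "$demo/mainlib"
ls -l "$demo/mainlib/headers"
```

Relative targets keep the links valid if mainlib is moved as a whole; absolute targets would serve equally well for a library with a fixed location.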

# MAKEFILE FOR MAINLIB -------------------------------------------
.SILENT:

A = sublib1
B = sublib2
#C =
#D =

compile:
	echo "Perform all compilations"
	(cd $A; make all)
	(cd $B; make all)
#	(cd $C; make all)
#	(cd $D; make all)
	echo "Compilations complete"

cleanup:
	echo "Cleanup all libraries"
	(cd $A; make cleanup)
	(cd $B; make cleanup)
#	(cd $C; make cleanup)
#	(cd $D; make cleanup)
	echo "Cleanup complete"

create:
	echo "Create mainlib.a from scratch"
	(cd $A; make library)
	(cd $B; make library)
#	(cd $C; make library)
#	(cd $D; make library)
	touch mainlib.a
	rm mainlib.a
	mv newlib.a mainlib.a
	ranlib mainlib.a
	echo "mainlib.a complete"
# END OF MAKEFILE FOR MAINLIB------------------------------------

# MAKEFILE FOR SUBLIB1 ------------------------------------------
.SILENT:

LIB = SUBLIB1
# MAINLIB is the full path name of the .a file
MAINLIB= /.../mainlib/newlib.a
A = ClassA
B = ClassB
#C =
#D =

all:
	echo "$(LIB) Begin Compilation"
	(cd $A; make all)
	(cd $B; make all)
#	(cd $C; make all)
#	(cd $D; make all)
	echo "$(LIB) Compilation Complete"

cleanup:
	echo "$(LIB) Begin Cleanup"
	(cd $A; make cleanup)
	(cd $B; make cleanup)
#	(cd $C; make cleanup)
#	(cd $D; make cleanup)
	echo "$(LIB) Cleanup Complete"

library:
	echo "$(LIB) Create library"
	ar lq $(MAINLIB) $A/*.o
	ar lq $(MAINLIB) $B/*.o
#	ar lq $(MAINLIB) $C/*.o
#	ar lq $(MAINLIB) $D/*.o
	echo "$(LIB) Create library complete"
# END OF MAKEFILE FOR SUBLIB1 ------------------------------------------

# MAKEFILE FOR CLASS CLASSA --------------------------------------------
.SILENT:

CC=CC
CFLAGS= +e0 -fswitch
CLASS = CLASSA
OBJ = CDest.o CNull.o Cintint.o reset.o \
      draw.o compute.o

all:
	echo "$(CLASS) Begin Compilation"
	make $(OBJ)
	echo "$(CLASS) Compilation complete"

cleanup:
	echo "$(CLASS) Cleanup"
	/bin/rm *.o
	echo "$(CLASS) Cleanup complete"

.c.o:
	echo "Begin $*.c"
	$(CC) $(CFLAGS) -c $<
# END OF MAKEFILE FOR CLASSA ------------------------------------------

To add a new sublibrary or to add a new class to a sublibrary, the next available
symbol should be defined and the comment marks should be removed from subsequent lines involving that symbol. Adding a new member function to the class's Makefile is even easier: the name of the .o file should be added to the OBJ symbol definition. (Note that if the list extends to multiple lines, the \ escape must be used at the end of each nonfinal line.)

The third Makefile above takes advantage of the implicit rules in Make for transforming .c files to .o files (see make(1)). We modify the implicit rule .c.o to echo a status message as each compilation begins.

The third Makefile illustrates another technique we have used to organize and optimize use of the library. We store each non-inline member function in a separate .c file. This practice requires tradeoffs that must be evaluated to determine whether the technique is worthwhile for a particular case.

Advantages of storing each function in a separate .c file include:

(1) The linker will not attempt to break apart .o files. Therefore, if all functions of a class are stored in a single .c file, a reference to ANY function will cause ALL of the functions to be linked into the application program's a.out file. Separating member functions into separate .c files helps to minimize the size of a.out files and decrease link time.

(2) A member function can be recompiled without recompiling the entire class. Since this is a common operation, we consider the savings to be significant.

Disadvantages of storing each function in a separate .c file include:

(1) You must name the .c file for each function. This becomes a problem if there exist many overloaded versions of some operations since, in effect, the file name must encode the arguments to distinguish the functions. By convention, we name all constructors with an upper-case C followed by an encoding of the arguments to the constructor.
CDest is the destructor, CNull is the constructor with no arguments, CCopy is the constructor that takes a reference to an object of the same class, Cintint takes two integers, etc.

If the names assigned to the .c files are not unique across the entire library, then when the library is created, you must suppress elimination of duplicate names by using the q option on the ar command. Unfortunately, this means that the .a file must be re-created from the .o files whenever any member function in the library is recompiled. However, by assigning a (one-character) prefix to the .c file names designating the class, one can sidestep this problem at a small cost in setup overhead.

(2) Compilation of the whole library takes MUCH longer because all relevant header files must be processed for each member function instead of once for each class. We have found this potential disadvantage to be largely irrelevant because we rarely need to recompile the entire library.
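
To decide whether the class-prefix convention from disadvantage (1) is needed, it may help to check which .c file names recur across class directories. The following is a minimal sketch under stated assumptions, not a tool from the original library: the function name dup_names, the demo tree, and the prefix letters "a" and "b" are hypothetical, and the three-level layout of Figure 1 is assumed.

```shell
#!/bin/sh
# dup_names MAINLIB  (hypothetical helper)
# Print each .c base name that appears in more than one class directory
# under MAINLIB/<sublib>/<Class>/.
dup_names() {
    for f in "$1"/*/*/*.c; do
        [ -f "$f" ] || continue      # skip the literal pattern if nothing matched
        basename "$f"
    done | sort | uniq -d
}

# Demonstration: two classes that both name their null constructor CNull.c.
demo=$(mktemp -d)
a="$demo/mainlib/sublib1/ClassA"
b="$demo/mainlib/sublib1/ClassB"
mkdir -p "$a" "$b"
touch "$a/CNull.c" "$a/draw.c" "$b/CNull.c" "$b/reset.c"

before=$(dup_names "$demo/mainlib")
echo "$before"                       # prints: CNull.c

# Apply a one-character class prefix (letters chosen arbitrarily here).
mv "$a/CNull.c" "$a/aCNull.c"
mv "$b/CNull.c" "$b/bCNull.c"

after=$(dup_names "$demo/mainlib")   # empty: all names are now unique
```

An empty result suggests that member names in the archive would be unique, which is the situation the prefix convention is meant to establish.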