richw@rosevax.UUCP (11/19/87)
I have some questions about the MSC linker and compiler that I hope someone can answer.

1) Does the MSC linker link an entire library to an object file, or does it extract only the functions actually used by the object code?

2) Why is the executable size of a program so much larger than the object file size? The De Smet compiler I've used seems to produce much smaller executables than MSC.

3) If the MSC linker does link in an entire library, are there programs which will remove unused library functions?

Thanks in advance,
Rich W
farren@gethen.UUCP (Michael J. Farren) (11/21/87)
richw@rosevax.Rosemount.COM (Rich Wagenknecht) writes:
>
>1) Does the MSC linker link an entire library to an object file
>or does it extract only the functions actually used by the
>object code?

It extracts only those functions actually used. This, however, can include functions needed by those functions, so it isn't as simple as a one-to-one match.

>2) Why is the executable size of program so much larger
>than the object file size?

Because the object file does not include any of the code for the library functions; this is included later, by the linker, and expands the size significantly. In the case of small programs which use a lot of library functions, the code for the library functions can be massively larger than the code for the program itself. Also, the .EXE file produced by the linker generally has a large amount of initialization data and relocation information not present in the object file.

>3) If the MSC linker does link in an entire library are there
>programs which will remove unused library functions.

Not needed; see #1.

--
----------------
Michael J. Farren
unisoft!gethen!farren
gethen!farren@lll-winken.arpa
"... if the church put in half the time on covetousness that it does on lust, this would be a better world ..." Garrison Keillor, "Lake Wobegon Days"
daveb@laidbak.UUCP (Dave Burton) (11/23/87)
In article <3195@rosevax.Rosemount.COM> richw@rosevax.Rosemount.COM (Rich Wagenknecht) writes:
>1) Does the MSC linker link an entire library to an object file
>or does it extract only the functions actually used by the
>object code?
>2) Why is the executable size of program so much larger
>than the object file size? The De Smet compiler I've used seems to
>produce much smaller executables than MSC.
>3) If the MSC linker does link in an entire library are there
>programs which will remove unused library functions.

1) No. No.
2) See below. DeSmet has a better engineered library.
3) Not that I know of/I seriously doubt it.

In general:

Linkers can only remove objects called "modules" from libraries. A module is simply a source file compiled/assembled into an object file. If the source file contains more than one function, so will the object. A library archiver then places the object module into the library, noting only the existence and location of the module and any external symbols. The linker simply searches the library for these symbols, extracting the MODULE the symbol is defined within.

There are pros and cons to the implementation of libraries with multi-function modules:

-- Pro --
a) Placing several related functions in the same source file allows them to share variables/buffer space, thus reducing the data space requirements.
b) This also means that the related functions can communicate through "private channels" (static global variables) which any potential caller cannot access symbolically.
c) Maintenance of these modules is easier than maintaining several source files, especially without the assistance of automated maintenance tools such as SCCS and make.
d) Link time will (usually) be improved because the linker may already have the external symbol required (due to retrieving an earlier required module).

-- Con --
1) Executable files contain potentially many dead areas of code, increasing load time, memory usage, and disk space usage.
2) The output of a static analysis of these executables will be more complex due to the presence of dead code.
3) Automatic program verification becomes more difficult when a section of code is never used: is it because the test suite is incomplete, the program is flawed, or the code is actually dead?

Single function modules must be carefully written, however, or references to other external symbols can have a domino effect and chain-link the entire library.

(A good test of the granularity and quality of a library is the size of the code produced for a program such as:

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        float a = 1.2;
        printf("this is a test\n");
        exit(0);
    }

The printf() will want to bring in several different modules to satisfy its complex/diverse conversion requirements. Many compilers define symbols such as "__floatused" to help the linker in determining if certain modules are needed, so the float assignment should trigger this. This is definitely *NOT* the definitive test, but an indicator.)

Further, single function modules must now pass information via global symbols (although usually undocumented). As an example of the need to pass information in this manner, consider a set of functions which manage a video screen, several of which can modify the state of the screen, such as current page, window, font, attribute, etc. If the library writer is not careful, using just one of these functions can bring in most, if not all, of the related functions because of this interaction.

Library engineering is the major reason for the reported differences in code size between MSC and DeSmet.

While the size of the executable is important, code speed and efficiency are more so. If I had to choose between a 50k executable that would run a given program in 10 seconds vs. a 30k executable that took 25 seconds, I would take the larger.

In summary: Although the linker actually does the work and is seen as the culprit of large executables, the library writer is actually at 'fault'.
What you are seeing is the result of the library's engineering decisions and the quality of their implementation.

--
--------------------"Well, it looked good when I wrote it"---------------------
 Verbal: Dave Burton            Net: ...!ihnp4!laidbak!daveb
 V-MAIL: (312) 505-9100 x325    USSnail: 1901 N. Naper Blvd.
#include <disclaimer.h>                  Naperville, IL 60540
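
The module-extraction behaviour described in the post above can be sketched as a toy model. This is an illustration only, not a description of how Microsoft's LINK is implemented: the `struct module` layout, the `link_size` function, the two example libraries and every byte count are invented for the example. The sketch pulls in whole modules to satisfy undefined symbols, follows each module's own references, and adds up the bytes that end up in the executable.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMS    4   /* symbols per module in this toy model */
#define MAX_MODULES 16  /* modules per library in this toy model */

/* One library module: the object file produced from one source file. */
struct module {
    const char *name;
    size_t      size;               /* bytes of code and data (made up) */
    const char *defines[MAX_SYMS];  /* external symbols defined here    */
    const char *needs[MAX_SYMS];    /* external symbols referenced here */
};

/* Hypothetical coarse library: printf and scanf share one module. */
const struct module coarse_lib[] = {
    { "stdio", 3000, { "printf", "scanf" }, { "write", "read" } },
    { "sysio",  400, { "write", "read" },   { NULL } },
};

/* Hypothetical fine library: one function per module. */
const struct module fine_lib[] = {
    { "printf", 1600, { "printf" }, { "write" } },
    { "scanf",  1400, { "scanf" },  { "read" } },
    { "write",   200, { "write" },  { NULL } },
    { "read",    200, { "read" },   { NULL } },
};

/* Return the index of the module that defines sym, or -1 if none does. */
static int find_definer(const struct module *lib, int n, const char *sym)
{
    for (int i = 0; i < n; i++)
        for (int j = 0; j < MAX_SYMS && lib[i].defines[j]; j++)
            if (strcmp(lib[i].defines[j], sym) == 0)
                return i;
    return -1;
}

/* Extract the module defining sym (if not already extracted), then
   everything that module references. Returns the bytes added. */
static size_t pull(const struct module *lib, int n, const char *sym, int *used)
{
    int i = find_definer(lib, n, sym);
    if (i < 0 || used[i])
        return 0;
    used[i] = 1;
    size_t total = lib[i].size;
    for (int j = 0; j < MAX_SYMS && lib[i].needs[j]; j++)
        total += pull(lib, n, lib[i].needs[j], used);
    return total;
}

/* Total library bytes linked in to satisfy the given undefined symbols. */
size_t link_size(const struct module *lib, int n,
                 const char *const *undefined, int count)
{
    int used[MAX_MODULES] = { 0 };
    size_t total = 0;
    assert(n <= MAX_MODULES);
    for (int k = 0; k < count; k++)
        total += pull(lib, n, undefined[k], used);
    return total;
}
```

With these invented numbers, a program that references only `printf` gets 3400 bytes from the coarse library (the `scanf` code and `read` come along for the ride) and 1800 bytes from the fine one. The linker logic is identical in both cases; only the way the library was cut into modules differs.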
daveb@geac.UUCP (11/27/87)
In article <1259@laidbak.UUCP> daveb@laidbak.UUCP (Dave Burton) writes:
>In general:
>Linkers can only remove objects called "modules" from libraries.
>A module is simply a source file compiled/assembled into an object
>file. If the source file contains more than one function, so will
>the object. A library archiver then places the object module into
>the library, noting only the existance and location of the module
>and any external symbols. The linker simply searches the library for
>these symbols, extracting the MODULE the symbol is defined within.

Well, that's the usual implementation of C. Not all languages/compilers do that, though. The alternative is to put all the functions in as separate linkable items, while arranging for the "top-level statics" to be given a name invisible to the casual user and arranging for the functions which require the statics to reference that name.

An example from CP/M (!) is:

    /* foo.c */
    #include <stdio.h>

    static int harold;

    void foo(void)   { harold = 2; }
    void bar(void)   { printf("%d\n", harold); }
    void maude(void) { ; }
    /* end */

Linking either foo or bar will drag in "foo^statics", a block of data two bytes long, containing "harold". Linking maude will not drag in foo^statics. (I forget what character the linker used for the separator: it wasn't really ^).

--dave
--
David Collier-Brown.                 {mnetor|yetti|utgpu}!geac!daveb
Geac Computers International Inc.,   | Computer Science loses its
350 Steelcase Road,Markham, Ontario, | memory (if not its mind)
CANADA, L3R 1B3 (416) 475-0525 x3279 | every 6 months.
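
The hidden-statics scheme in the post above can be imitated by hand in ordinary C. This is a minimal sketch, assuming each function would be compiled as its own linkable item. The name `foo_statics_block` is hypothetical and stands in for whatever name the CP/M tool chain generated, which user code could not spell. Everything is kept in one translation unit here so that it compiles as written; the comments mark where the item boundaries would fall. `bar` also returns the value it prints, purely so the behaviour can be checked.

```c
#include <stdio.h>

/* Item "foo^statics": the file-level statics of foo.c, gathered into one
   block. Only functions that reference the block cause it to be linked. */
struct foo_statics { int harold; };
struct foo_statics foo_statics_block;

/* Item "foo": references the statics block. */
void foo(void)
{
    foo_statics_block.harold = 2;
}

/* Item "bar": references the statics block. */
int bar(void)
{
    printf("%d\n", foo_statics_block.harold);
    return foo_statics_block.harold;
}

/* Item "maude": no reference to the block, so linking maude alone
   would bring in neither foo, bar, nor the statics. */
void maude(void)
{
}
```

The trade-off against a true `static` is that the block needs external linkage so the separate items can find it, which is why the compiler has to choose a name the programmer cannot collide with.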
tim@doug.UUCP (Tim J Ihde) (11/30/87)
In article <3195@rosevax.Rosemount.COM>, richw@rosevax.Rosemount.COM (Rich Wagenknecht) writes:
> 1) Does the MSC linker link an entire library to an object file
> or does it extract only the functions actually used by the
> object code?

If you are using a given function, then the entire .obj file that included that function will be included in your executable. The whole library is not included, but you still might get some functions that you don't need. For example, you might use printf and then find that the object code for scanf has been included in your executable even though you don't use it.

For the most part, Microsoft and other library vendors try to lump external functions/variables into one object module only if they are closely related, under the assumption that you will probably want the extraneous stuff as well. If you give the LIB program a filename to list to, it will produce a giant file telling which modules contain which functions.

> 2) Why is the executable size of program so much larger
> than the object file size? The De Smet compiler I've used seems to
> produce much smaller executables than MSC.

The executable contains a fair amount of code from the library just to start running, so it will always be bigger than your object code. How much bigger depends on how much startup work the library does: things like allocating stack/heap space and whatnot. Plus you've got any library code you've used stuck in there someplace; this could easily be bigger than your object code all by itself.

I'm not familiar with DeSmet C, but it must do something similar. Perhaps they have broken their library modules down further, so you end up with less unused object code in the executable?

tim