[net.unix-wizards] "ld" and ".a" files

FIRTH@TL-20B.ARPA (01/15/85)

We are using BSD 4.2 Unix.  The manual for the linker "ld" says
that the program will search archive ".a" files for previously
undefined global symbols, and will include any ROUTINES that
define such symbols.

After finding strange anomalies, I eventually looked at the code.
The conclusion I reached (with some difficulty, since the relevant
programs "ar" and "ranlib" are almost totally devoid of commentary)
is that "ld" in fact loads not the routine but the FILE containing
the definition of the global symbol.

Is this the intended behaviour, and should I change the manual?
Or is it a bug, and, if so, does anyone have a fix?

Robert Firth
-------

Doug Gwyn (VLD/VMB) <gwyn@Brl-Vld.ARPA> (01/15/85)

The entire whatever.o file is linked in, whether from an archive
or by itself.  This is by design and is indeed necessary for C
programs.

guy@rlgvax.UUCP (Guy Harris) (01/17/85)

> The manual for the linker "ld" says that the program will search
> archive ".a" files for previously undefined global symbols, and
> will include any ROUTINES that define such symbols.
> 
> ...The conclusion I reached is that "ld" in fact loads not the
> routine but the FILE containing the definition of the global symbol.

Most OSes that I know of work that way - object libraries are archives of
object modules (an object module being an object file built from one
source file) and if a given object module resolves an undefined symbol
the whole module, not just the part of the module that resolves the symbol,
is included.  UNIX is no exception.  Perhaps TENEX/TOPS-20 is.  I presume
the documentation was assuming familiarity with linkers which work the way
the UNIX linker does.
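
The rule described above can be sketched as a toy model. This is not
the "ld" source: the function name link(), the tuple layout, and the
member names stack.o, queue.o and malloc.o are all hypothetical, and the
repeat-until-nothing-changes loop is a simplification of how a real
linker walks an archive. It only shows that the unit of inclusion is the
whole object module, so asking for one routine also brings in its
neighbours.

```python
# Toy model of archive search (illustrative only, not the real "ld").
# Each archive member is (member_name, symbols_defined, symbols_referenced).

def link(undefined, archive):
    """Return (members loaded in order, set of all symbols now defined)."""
    undefined = set(undefined)
    defined = set()
    loaded = []
    progress = True
    while progress:                      # repeat until a pass loads nothing
        progress = False
        for name, defs, refs in archive:
            if name in loaded:
                continue
            if undefined & set(defs):    # member resolves something we need
                loaded.append(name)      # ...so the WHOLE member comes in
                defined |= set(defs)
                undefined |= set(refs)   # its own references become needs
                undefined -= defined
                progress = True
    return loaded, defined

# Hypothetical archive: stack.o holds two routines in one module.
archive = [
    ("stack.o",  ["push", "pop"],        ["malloc"]),
    ("queue.o",  ["enqueue", "dequeue"], ["malloc"]),
    ("malloc.o", ["malloc", "free"],     []),
]

loaded, defined = link({"push"}, archive)
print(loaded)            # ['stack.o', 'malloc.o']
print("pop" in defined)  # True: never referenced, but it shares a module
```

Only "push" was requested, yet "pop" and "free" end up in the image
because they live in the same modules; queue.o is left out because
nothing in it was needed.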

	Guy Harris
	{seismo,ihnp4,allegra}!rlgvax!guy

rpw3@redwood.UUCP (Rob Warnock) (01/19/85)

+---------------
| > ...The conclusion I reached is that "ld" in fact loads not the
| > routine but the FILE containing the definition of the global symbol.
| Most OSes that I know of work that way - object libraries are archives of
| object modules (an object module being an object file built from one
| source file) and if a given object module resolves an undefined symbol
| the whole module, not just the part of the module that resolves the symbol,
| is included.  UNIX is no exception.  Perhaps TENEX/TOPS-20 is...
| 	Guy Harris | 	{seismo,ihnp4,allegra}!rlgvax!guy
+---------------

No, at least TOPS-10 works the same way, and I believe so does TOPS-20.
The whole module gets loaded (but not necessarily the whole file).

Here is one bit of trivia for you though: Some systems allow more than one
"module" per source file. The PDP-10 assembler (MACRO-10) has a pseudo-op
called "PRGEND" which can end the module and allow another module to begin
in the same source file (with another "TITLE" pseudo-op). It was added to the
assembler primarily to assist in compiling the standard system libraries
(FORTRAN, COBOL, etc.) to avoid the overhead of running the assembler from
scratch to compile each of hundreds of little routines! (Another advantage
is that if any macro libraries were loaded ("UNIVERSALS"), they need be loaded
only once per assembler run.) But each of them is a separate module, and may
(and often does) contain multiple routines or entry points.  The output from
such a compile is a "library" (UNIX "archive").
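
As a rough illustration of one source file yielding several modules, the
sketch below splits assembler-like text at module boundaries. It is a toy:
the function split_modules() and the sample text are made up, the sample
is not claimed to be valid MACRO-10, and the use of "ENTRY" lines to name
entry points and "END" to close the last module are assumptions made for
the example.

```python
# Toy splitter: one source text in, a list of modules out.
# Only TITLE, ENTRY, PRGEND and END are recognised; everything else is skipped.

def split_modules(source):
    modules = []
    current = None
    for line in source.splitlines():
        words = line.split()
        if not words:
            continue
        op = words[0].upper()
        if op == "TITLE" and len(words) > 1:
            current = {"title": words[1], "entries": []}   # new module begins
        elif op == "ENTRY" and current is not None:
            names = " ".join(words[1:]).replace(",", " ").split()
            current["entries"].extend(names)
        elif op in ("PRGEND", "END") and current is not None:
            modules.append(current)                        # module is closed
            current = None
    return modules

# Made-up sample: two modules in one file, the first with two entry points.
sample = """
TITLE SINCOS
ENTRY SIN,COS
PRGEND
TITLE SQRT
ENTRY SQRT
END
"""

modules = split_modules(sample)
for m in modules:
    print(m["title"], m["entries"])
```

Each dictionary stands for one module in the resulting library, so a
loader that needs SIN would take the whole SINCOS module (COS included)
and leave SQRT alone.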


Rob Warnock
Systems Architecture Consultant

UUCP:	{ihnp4,ucbvax!dual}!fortune!redwood!rpw3
DDD:	(415)572-2607
USPS:	510 Trinidad Lane, Foster City, CA  94404