sommar@enea.se (Erland Sommarskog) (03/02/88)
Lee Sailer (UH2@PSUVM.BITNET) writes:
>How does this "smart linker" business tie into the "shared libraries"
>in Unix V.3. As I understand it, (1) when I need a module, the whole
>library is loaded, but (2) when another program needs a module from
>the library, it shares the core image that is already in memory.
>
>So, for example, at any moment, there is only one copy of all the stdio
>(that's standard input-output in Unix-speak) stuff in memory at any given
>moment, and all programs that need it share. (This also makes the
>executables smaller and saves disk space and load time.)

Well, I know nothing of shared libraries or even System V.3 as such.
But I guess it looks much like shareable images in VMS. If you really
want to save space for your binaries under VMS, you put the routines
in a shareable image. No matter how many of these procedures you call,
none will be included in your executable, only references to the
shareable image.

Slowly I am beginning to realize that this concept is not standard
under Unix. Well, that explains why even the simplest of programs
exceeds 100 kbytes when linked. (Pascal, f77 and Ada) Library routines,
or even entire libraries, in the language environment are included in
my private executable. Needless to say, all such routines are provided
in shareable images in VMS, unless you explicitly tell the linker not
to use them.

To make it even more fun, VMS permits you to install these images, just
as other heavily used programs like compilers, editors and the usual
utilities are installed. My exact notion of this "installation" is
uncertain, but I believe it means that the file header is kept
permanently in physical memory. (To INSTALL may also involve other
things, such as privileges, but that is beside the subject.) Does Unix
have such a concept?

On the whole: many Unix fans have reacted to the criticism of the Unix
linker with: "It does what you want, just if you use it the right way."
Remember that this strikes back at you the next time you flame another
OS. Some manoeuvres are the way to go under Unix, but meet problems
under VMS, and vice versa, often because you don't know the best way
under the other operating system. But if you look, you very often find
that you can easily do what you like, "just if you use it the right
way." But sometimes you fall flat. And depending on where you stumble,
you pick your favourite system, which doesn't have to be Unix by
necessity. It's not mine.
--
Erland Sommarskog
ENEA Data, Stockholm
sommar@enea.UUCP
"Souvent pour s'amuser les hommes d'equipages
and it's like talking to a stranger" -- H&C.
cml@tove.umd.edu (Christopher Lott) (12/05/89)
in <9185@hoptoad.uucp> tim@hoptoad.UUCP (Tim Maroney) writes:
| WHAT? What year is this? I don't think I've ever used a linker that
| didn't eliminate unused routines. Any such linker would be seriously
| brain damaged.
| --
| Tim Maroney, Mac Software Consultant, sun!hoptoad!tim, tim@toad.com

Well! I think this is the transitive-closure problem in a linker,
quite difficult to solve in a straightforward link process. I'm
thinking of the unix linker, which (I am told) makes exactly 1 pass
through the object files it is told to read, collecting everything that
it finds, and finally resolving unknown externals from libraries.
Deciding whether or not to link in a module requires the linker to know
if it is ever going to be used. Since the linker hasn't seen all the
code yet, it can't know, so it adds it in. In an ideal world, the
linker should keep track of ALL references to externals, and run
through its data 1 more time to kick out all externals that were not
referenced.

This can be verified easily on any machine.

sort-of-c-code:
-----snip-------
#include <stdio.h>

int main(void)
{
    printf("hello, world\n");
    return 0;
}

/* Never referenced anywhere: only here to take up space. */
void never_called_func(int i, int j, int k)
{
    i = 1; printf("i = %d\n", i);
    j = 2; printf("j = %d\n", j);
    k = 3; printf("k = %d\n", k);
    /* < repeat above lines a hundred times or so > */
}
-----snip-------
(You really need > 100 filler lines for them to take up noticeable space.)

Compile and link this, and examine the executable size. Then remove
function "never_called_func", recompile, and reexamine. Then if you
really want to be convinced, compile never_called_func into a library
and link the simple program with that library.

I know from personal experience using Microsoft C v5.1 on a PC that
that particular linker includes ALL code in all modules that are ".o"
files and includes ONLY the modules needed from libraries.

I agree with the previous poster's assessment of "brain damaged" but
perhaps a better description would be "efficient in time, not space."

....time out for testing.....ok.
On tove, a vax something running 4.3Tahoe, the standard linker is dumb.
(uh, "efficient in time, not space" :-)

....another time-out....ok.
Ditto for the GNU linker.

Anyone know if the GNU project has plans to build a better linker??

chris...
--
cml@tove.umd.edu    Computer Science Dept, U. Maryland at College Park
4122 A.V.W.  301-454-8711    <standard disclaimers>
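The "ideal world" pass described in this post (record every reference
to an external, then discard whatever was never reached) amounts to a
reachability walk over the symbol reference graph. What follows is a
minimal sketch in C, assuming a toy fixed-size adjacency matrix rather
than any real object file format. The names MAX_SYMS, SymGraph and
mark_reachable are hypothetical and belong to no actual linker.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SYMS 8  /* hypothetical limit for this toy graph */

/* refs[a][b] != 0 means symbol a contains a reference to symbol b. */
typedef struct {
    size_t count;
    unsigned char refs[MAX_SYMS][MAX_SYMS];
} SymGraph;

/*
 * Mark every symbol reachable from `root` in keep[] (1 = keep, 0 = drop).
 * An explicit stack is used instead of recursion; each symbol is pushed
 * at most once, so MAX_SYMS slots are always enough.
 * Returns the number of symbols kept.
 */
size_t mark_reachable(const SymGraph *g, size_t root,
                      unsigned char keep[MAX_SYMS])
{
    size_t stack[MAX_SYMS];
    size_t top = 0, kept = 0, i;

    assert(g->count <= MAX_SYMS && root < g->count);

    for (i = 0; i < MAX_SYMS; i++)
        keep[i] = 0;

    keep[root] = 1;
    stack[top++] = root;

    while (top > 0) {
        size_t sym = stack[--top];
        kept++;
        for (i = 0; i < g->count; i++) {
            if (g->refs[sym][i] && !keep[i]) {
                keep[i] = 1;        /* mark before pushing */
                stack[top++] = i;
            }
        }
    }
    return kept;
}
```

With symbol 0 standing for main, 1 for printf and 2 for
never_called_func, an edge from 2 to 1 does not rescue symbol 2,
because nothing reachable from main refers to it.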
davecb@yunexus.UUCP (David Collier-Brown) (12/06/89)
cml@tove.umd.edu (Christopher Lott) writes:
[talking about a common linker...]
| that particular linker includes ALL code in all modules that are ".o"
| files and includes ONLY the modules needed from libraries.
|
| Anyone know if the GNU project has plans to build a better linker??

I'd suggest writing a binder, which is a .o->.o translator which makes
previously-global function and variable names local static, and only
"exports" a small list of names. Then it could be used with any
linker.

I'd suggest using the Gnu linker **code** as a framework, though...

--dave (I wrote one once upon a time...) c-b
--
David Collier-Brown,  | davecb@yunexus, ...!yunexus!davecb or
72 Abitibi Ave.,      | {toronto area...}lethe!dave
Willowdale, Ontario,  | Joyce C-B:
CANADA. 416-223-8968  | He's so smart he's dumb.
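In C source terms, what such a binder does after the fact corresponds
roughly to declaring everything `static` except a short list of entry
points. A small sketch of that idea follows; the names scale, offset
and lib_transform are hypothetical, and the example shows the
source-level equivalent only, not the object-file rewriting itself.

```c
#include <assert.h>
#include <limits.h>

/*
 * Internal helpers. `static` gives them internal linkage, so their
 * names are not visible to other object files at link time.
 */
static int scale(int x)  { return x * 10; }
static int offset(int x) { return x + 3; }

/* The single "exported" name: the only symbol other modules can use. */
int lib_transform(int x)
{
    /* Keep x * 10 + 3 inside the range of int. */
    assert(x >= INT_MIN / 10 && x <= (INT_MAX - 3) / 10);
    return offset(scale(x));
}
```

Another module can call lib_transform(), but an attempt to reference
scale or offset from outside this file fails at link time.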
sommar@enea.se (Erland Sommarskog) (12/07/89)
Tim Maroney (tim@hoptoad.UUCP) writes:
>WHAT? What year is this? I don't think I've ever used a linker that
>didn't eliminate unused routines. Any such linker would be seriously
>brain damaged.

While this may seem credible at first glance, it is not at second.

I am very happy that the linker we use in our project (VMS LINK)
doesn't remove uncalled routines. If it did, it would notice that this
routine is never called, nor is that one, and so forth, and rapidly it
would have removed the 250 top modules. In the next step it would
remove the modules they call, etc., and instead of giving us the
10500-block executable we want, it would leave a tiny thing of 100-200
blocks.

The trick is in Cobol where you can say
    Routine PIC x(32).
    ...
    CALL Routine USING....
Routine is not a literal, it is a string variable. All top entries
correspond to menu choices. When the user enters a menu choice,
it is looked up in a database, which gives you the name of the
procedure to call.

Then of course there are other reasons why you may want uncalled
routines to be included in the final image. Debugging is one.

As has been pointed out by other posters, most linkers include all
object files you feed them with (and that's probably the behaviour you
want), but from libraries they only include referenced modules, and
those you particularly ask for.

I don't know about the Unix linker, but the VMS linker takes the entire
object module, even if only one routine in it is called. This may seem
stupid, but if the linker were to pick out pieces, it would have to
analyze the object module to see exactly which routines the referenced
routine calls, both inside and outside the object module.

However, the language processor may help the linker. VAX-Cobol makes a
separate module of each procedure. I don't know about VAX-C, but I
would guess that one file gives one object module, although one could
envision the opposite, with a separate object module for variables on
file level. (Or couldn't one? Don't flame me, I don't speak C.)
--
Erland Sommarskog - ENEA Data, Stockholm - sommar@enea.se
Mail me your votes on comp.lang.cobol.
rang@cs.wisc.edu (Anton Rang) (12/07/89)
In article <21107@mimsy.umd.edu> cml@tove.umd.edu (Christopher Lott) writes:
>in <9185@hoptoad.uucp> tim@hoptoad.UUCP (Tim Maroney) writes:
>| WHAT? What year is this? I don't think I've ever used a linker that
>| didn't eliminate unused routines. Any such linker would be seriously
>| brain damaged.
>| --
>| Tim Maroney, Mac Software Consultant, sun!hoptoad!tim, tim@toad.com

No linker should link in unused modules from an object library.
However, there is a bit of a tradeoff involved with object files. Many
compilers/assemblers resolve references to symbols within the object
file at compile time, so that linking is faster. If this is done, the
linker can no longer arbitrarily "munge" an object file.

>Well! I think this is the transitive-closure problem in a linker,
>quite difficult to solve in a straightforward link process.

It's not that difficult, as long as references are always resolved at
link time (and never at compile time--or at least, if they are resolved
at compile time, relocation information is still kept around). The
THINK Pascal compiler on the Macintosh works this way: the compiler
resolves references at compile time, but when the final build is done
the linker removes all procedures which are unused.

The VMS compilers mostly create a single object module per source file
(not keeping around relocation info). The FORTRAN compiler, however,
generates one module per routine (within one object file), which allows
the linker to remove unused code.

>I know from personal experience using Microsoft C v5.1 on a PC that
>that particular linker includes ALL code in all modules that are ".o"
>files and includes ONLY the modules needed from libraries.

Any system using the BSD UNIX object file format (or similar ones) has
to do this; the information needed to move code within a module is not
available.

>I agree about the previous poster's assessment of "brain damaged" but
>perhaps a better description would be "efficient in time, not space."

Well...sort of. Hopefully, references within a module will be resolved
at compile time instead of link time, which will speed up links. It
would still be nice if the object file format included full relocation
information, which could be skipped for a fast link, or used for a
"small" link.

Just my thoughts....

Anton

+---------------------------+------------------+-------------+
| Anton Rang (grad student) | rang@cs.wisc.edu | UW--Madison |
+---------------------------+------------------+-------------+
peter@ficc.uu.net (Peter da Silva) (12/08/89)
Smart linkers aren't that much of a win in practice, but they are pretty
safe for high-level languages. For example, the following situation is
not a problem:

In article <530@enea.se> sommar@enea.se (Erland Sommarskog) writes:
>     Routine PIC x(32).
>     ...
>     CALL Routine USING....
> Routine is not a literal, it is a string variable. All top entries
> correspond to menu choices. When the user enters a menu choice,
> it is looked up in a database, which gives you the name of the
> procedure to call.

In which case the routine would be referenced in the *database*, and so
would be linked in.
--
`-_-' Peter da Silva. +1 713 274 5180. <peter@ficc.uu.net>.
 'U`  Also <peter@ficc.lonestar.org> or <peter@sugar.lonestar.org>.
"If you want PL/I, you know where to find it." -- Dennis
tim@hoptoad.uucp (Tim Maroney) (12/09/89)
In article <530@enea.se> sommar@enea.se (Erland Sommarskog) writes:
>Tim Maroney (tim@hoptoad.UUCP) writes:
>>WHAT? What year is this? I don't think I've ever used a linker that
>>didn't eliminate unused routines. Any such linker would be seriously
>>brain damaged.
>
>While this may seem credible at first glance, it is not at second.

Try the third....

>I am very happy that the linker we use in our project (VMS LINK)
>doesn't remove uncalled routines. If it did, it would notice that
>this routine is never called, nor is that one, and so forth, and
>rapidly it would have removed the 250 top modules.
>
>The trick is in Cobol where you can say
>    Routine PIC x(32).
>    ...
>    CALL Routine USING....
>Routine is not a literal, it is a string variable. All top entries
>correspond to menu choices. When the user enters a menu choice,
>it is looked up in a database, which gives you the name of the
>procedure to call.

And of course, all such routines are referenced; pointers to them are
stored in this database. The analogue in C is when a routine is never
explicitly called, but a function pointer to it is used in a referenced
routine. The routine is referenced, just as routines entered into a
late-binding database are referenced.
--
Tim Maroney, Mac Software Consultant, sun!hoptoad!tim, tim@toad.com

"This signature is not to be quoted." -- Erland Sommarskog
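The C analogue mentioned in this post can be sketched as follows. This
is a minimal illustration with hypothetical names (report, backup,
menu_table, run_choice): neither handler is ever called by name, yet
both are referenced, because their addresses are stored in a table used
by a routine that is called.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical menu handlers: nothing in the program calls them by name. */
static int report(void) { return 1; }
static int backup(void) { return 2; }

/* Taking the addresses here is a reference the compiler and linker see. */
static int (*const menu_table[])(void) = { report, backup };

#define MENU_COUNT (sizeof menu_table / sizeof menu_table[0])

/* Run the handler for a menu choice; returns -1 for an invalid choice. */
int run_choice(size_t choice)
{
    if (choice >= MENU_COUNT)
        return -1;
    assert(menu_table[choice] != NULL);
    return menu_table[choice]();
}
```

A linker that removes unreferenced routines still has to keep report
and backup here, since run_choice reaches them through menu_table. The
table lives inside the program, though, which is not the same as a
name stored in an external database.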
sommar@enea.se (Erland Sommarskog) (12/10/89)
Peter da Silva (peter@ficc.uu.net) writes, quoting me:
)Smart linkers aren't that much of a win in practice, but they are pretty
)safe for high-level languages. For example, the following situation is
)not a problem:
))    Routine PIC x(32).
))    ...
))    CALL Routine USING....
)) Routine is not a literal, it is a string variable. All top entries
)) correspond to menu choices. When the user enters a menu choice,
)) it is looked up in a database, which gives you the name of the
)) procedure to call.
)
)In which case the routine would be referenced in the *database*, and so
)would be linked in.

Eh? You link relational databases with your executable? With arbitrary
relations and columns? The linker has not only to be smart, but to be
clairvoyant to see which column in which relation is the function name.

To make it even worse, one code may map to different routines at
different sites, since two customers want different behaviour. To make
it simple, you link both routines with your executable, and the
contents of the menu database at the customer site decide which variant
they will run.
--
Erland Sommarskog - ENEA Data, Stockholm - sommar@enea.se
Mail me your votes on comp.lang.cobol.
peter@ficc.uu.net (Peter da Silva) (12/13/89)
> )In which case the routine would be referenced in the *database*, and so
> )would be linked in.

> Eh? You link relational databases with your executable?

Database != relational database. In this case it just means a table of
function names and locations.

> To make it even worse, one code may map to different routines
> at different sites, since two customers want different behaviour.

So far, so good.

> To make it simple, you link both routines with your executable,
> and the contents of the menu database at the customer site decide
> which variant they will run.

Say what? You have an external relational database containing absolute
addresses in your executable? What happens when you want to ship them a
new version of the program? They have to rebuild the database?
--
`-_-' Peter da Silva. +1 713 274 5180. <peter@ficc.uu.net>.
 'U`  Also <peter@ficc.lonestar.org> or <peter@sugar.lonestar.org>.
"It was just dumb luck that Unix managed to break through the Stupidity
Barrier and become popular in spite of its inherent elegance."
-- gavin@krypton.sgi.com
sommar@enea.se (Erland Sommarskog) (12/17/89)
Peter da Silva (peter@ficc.uu.net) writes:
>> )In which case the routine would be referenced in the *database*, and so
>> )would be linked in.
>
>> Eh? You link relational databases with your executable?
>
>Database != relational database. In this case it just means a table of
>function names and locations.

Peter, if you think you are clairvoyant, I've bad news for you. There
is something disturbing your reception. I introduced this thread,
including the database. So maybe you should try to understand what I'm
talking about instead of deciding that on your own.

The database I'm talking of is an RDB database, and last time I looked,
R stood for relational. That it is relational is beside the point; the
point is that it does not contain any information on function
locations.

In short: the thing is a generic menu handler. The user enters a code
or a number which is looked up in the database. What you get is the
*name* of the function to call. You also get some information on what
access rights the user has to this particular function. The menu
handler then calls the function, which is application specific.

The menu handler has its own set of menus for maintenance: for adding
or modifying users, but also for adding or modifying function entries.
For instance, if there is a function which only one customer has paid
for, that function is available only in that customer's menu database.
The others have it in the executable, but cannot access it. (Unless
they know the name of the function, in which case they can add it. That
is less likely with our crude name standards.)

>> To make it simple, you link both routines with your executable,
>> and the contents of the menu database at the customer site decide
>> which variant they will run.
>
>Say what? You have an external relational database containing absolute
>addresses in your executable? What happens when you want to ship them a
>new version of the program? They have to rebuild the database?

No. Who said that there were any absolute addresses in the database,
Peter? Certainly not me. You made that up yourself. I even told the
trick in my first article, and I have mentioned it before. But let's
take it again:

    Routine PIC X(32).
    ...
    CALL Routine USING ...

If you didn't recognize it, this is Cobol. Peter is probably stuck in
C thinking, hence his talk of absolute addresses. Anyway, Routine is a
string variable, into which we load the contents of the database entry.
Then we call the function with the name Routine contains. If there is
no such routine (on VMS at least this has to be another Cobol
procedure) we get a run-time error. (Which is handled by the menu
handler.) Yes, somewhere there is a coupling function name <-> address,
but that is handled by the Cobol compiler and the Cobol run-time
library.

My original comment to the linker discussion was that the menu handler
wouldn't work with a linker that removed unreferenced modules, since
the routines called by the menu handler are neither compile-time nor
link-time references, but run-time references and beyond the linker's
horizon. Of course this doesn't mean that a linker shouldn't be allowed
to remove unreferenced routines, but that you need a mechanism to tell
such a linker that it should include a routine no matter whether it's
referenced or not.

(Some people may question the wisdom of making everything one big
executable, as we do. We are heading for a better solution. We're
making every function a shareable image of its own to be activated by
LIB$Find_image_symbol. This means that the main executable will only be
the menu handler.)
--
Erland Sommarskog - ENEA Data, Stockholm - sommar@enea.se
Mail me your votes on comp.lang.cobol.
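The name <-> address coupling and the run-time error for an unknown
name can be imitated in C with an explicit table. This is only a sketch
of the idea, not a description of how the VAX Cobol run-time library
works; registry, call_by_name, payroll and invoice are hypothetical
names. The string passed in stands for the value read from the menu
database.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical application routines reached only through their names. */
static int payroll(int arg) { return arg + 1; }
static int invoice(int arg) { return arg * 2; }

struct entry {
    const char *name;       /* name as it would be stored in the menu database */
    int (*fn)(int);         /* address, known only inside the program */
};

static const struct entry registry[] = {
    { "PAYROLL", payroll },
    { "INVOICE", invoice },
};

/*
 * Look up `name` at run time and call the matching routine.
 * Returns 0 and stores the routine's result in *result on success,
 * or -1 if no routine of that name exists (the "run-time error" case).
 */
int call_by_name(const char *name, int arg, int *result)
{
    size_t i;

    assert(name != NULL && result != NULL);

    for (i = 0; i < sizeof registry / sizeof registry[0]; i++) {
        if (strcmp(registry[i].name, name) == 0) {
            *result = registry[i].fn(arg);
            return 0;
        }
    }
    return -1;
}
```

Note the difference from the Cobol case: in this sketch the table is
itself a reference that a linker can see, whereas a purely name-driven
CALL leaves no such trace in the object code, which is why a
"keep this routine regardless" mechanism would be needed.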