stephen@alberta (07/23/83)
It has bothered me for a while that every time I compile a C program, all of the routines which I use are physically added to the object file. When you consider that many of those routines are included in almost every object file, perhaps it would be better to share one copy of the more common routines between all tasks.

The most obvious advantage of this approach would be the savings in disk space. As an example: on our system, with about 1200 object files, the savings from sharing the startup routines and printf (approximately 7k per file) would come to about 9 meg. Sharing routines would also result in decreased memory usage and possibly faster loading times (since only the user's own routines would have to be loaded). Those subroutines used by the kernel would have to be designated as shared non-paged, and the others could simply be shared.

The big question is how easy it would be to make the change. Memory management would have to be modified, as would the loader, and possibly things like 'adb' as well. And where would the routines be placed? If they are put at the bottom of memory, would this cause problems for programs that didn't expect them there? And if they are placed at the top of memory, would that cause problems for adb and cdb, or get in the way of the stack? I don't yet know enough about UN*X to really answer those questions well. Does it sound like a feasible idea, or am I out in left field?

Stephen Samuel (ubc-visi!alberta!stephen)
msc@qubix.UUCP (07/24/83)
There was a discussion about shared libraries less than 2 months ago. The conclusion was to leave things the way they are.
-- Mark
...{decvax,ucbvax,ihnp4}!decwrl!
...{ittvax,amd70}!qubix!msc
decwrl!qubix!msc@Berkeley.ARPA
guy@rlgvax.UUCP (Guy Harris) (07/24/83)
There was some discussion of shared libraries in UNIX a while ago. Many other operating systems do support them, and they probably do cut down on the physical memory requirements of the programs that use them.

The main tricky part is that the shared routines cannot call routines outside the shared library except through an indirect pointer of some sort. If they called them using the standard subroutine call instruction provided on most machines, the address of the called routine would be hard-coded into the code of the shared routine itself; but that address may be different in different programs which include the routine. Furthermore, if the shared routines reference any globals, they would also have to reference them through such a pointer, so some current routines might have to be modified. Another alternative is to have two data segments: one for globals referenced by the library, and one for the others. The globals referenced by the library must be DEFINED (not just referenced) by the library routines; their size must be assigned at the time the library is built, not at the time the program using the library is built.

Also, the shared library routines must either be position-independent code (which the PDP-11 C compiler does not generate) or must always appear at the exact same place in the virtual address space in all processes. As long as the shared library routines always appeared at the same address in all programs' virtual address spaces, the exact placement wouldn't be a problem; put them wherever your machine's memory mapping hardware wants you to. I suspect the various debuggers wouldn't care too much where they appeared, except that UNIX prefers that the data segments appear before the stack segment in virtual address space - but this can probably be gotten around if necessary (I'm sure there's at least ONE machine out there that makes this difficult).

I suspect it's feasible, but it'll take a lot of work. Cooperative hardware and compilers would help; you may also want to impose or use certain coding conventions within shared library routines (use of pointers to variables passed as arguments rather than global variables, for example).

Guy Harris
{seismo,mcnc,we13,brl-bmd,allegra}!rlgvax!guy
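As a rough illustration of the indirect-pointer approach (the names and the one-entry table are invented for this sketch, not taken from any real implementation), a program can reach a "shared" routine only through a per-process pointer table, so the routine's address is never wired into the calling code:

/* Sketch: call a library entry through a per-process transfer vector
 * instead of a direct subroutine-call instruction.  Here everything is
 * in one program; only the call path is the point of interest. */
#include <stdio.h>

/* the routine that would live in shared, read-only text */
static int shared_strlen(char *s)
{
    int n = 0;
    while (*s++)
        n++;
    return n;
}

/* per-process transfer vector; each process fills in its own copy,
 * so the shared text never contains a hard-coded address */
static int (*libvec[])(char *) = { shared_strlen };

#define SHARED_STRLEN(s)    ((*libvec[0])(s))

int main()
{
    printf("%d\n", SHARED_STRLEN("hello"));    /* prints 5 */
    return 0;
}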
Michael.Young%cmu-cs-g@sri-unix.UUCP (07/25/83)
In order to allow shared modules, you'd also probably need a global variable dictionary for each shared module. [Unless you choose to put the shared modules at *fixed* places in every user's address space, which is not inconceivable on a Vax, for example.] When a shared module wants to access a global variable (which cannot be shared), it must look up its address in this "dictionary", merely because the address of the global variable may be different in separate users' address spaces. Likewise for shared modules calling external routines. Thus, inter-module calls and externals cost more.

Assuming the Unix non-shared disk structure (that is, a given disk block is in one file only), you'd have to meddle with the loader to handle the incorporation of these modules. [That is, not only the linker, which would generate references to these shared code files, but the kernel loader, which would interpret them.] A big problem here is that changing one of these modules' code files probably breaks everything that requires it.

A much simpler, but also a *lot* less flexible, approach would be to make system-wide fixed-location library routines. *All* of these routines would be at a known location in *every* process's address space; kernel tables for such stuff could be kept once (instead of once per process). Adb/sdb would have to be taught that, when looking up addresses in the shared area, they should look in their own address spaces rather than that of the child/core-file. The linker would have to be changed to understand the new addresses for these things, but that's not too tough. Again, the kernel's loader would be the hardest change; I'm not sure how I'd deal with page tables, but it could be done. You'd probably have to build some mechanism for changing the shared modules (like adding some, or even changing some without rebooting (!)); requiring that all entries into shared modules go through an indirect dictionary (even if the call comes from a non-shared module) would help in that regard.

A nice idea, and one whose time has come, but probably not for Unix systems. Capability systems, as well as better virtual memory systems, stand a much better chance of pulling this off.

Michael
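A toy rendering of the per-module "dictionary" (the slot numbers and names here are hypothetical; a real one would be laid out by the linker): shared code refers to a global only by slot number, and each process's copy of the dictionary supplies the address valid in that process:

/* Hypothetical per-process global dictionary.  Shared code names
 * globals by slot; each process's dictionary holds the addresses
 * valid in that process's own address space. */
#include <stdio.h>

#define SLOT_ERRNO 0
#define NSLOTS     1

/* lives in the (unshared) data segment of each process */
static void *global_dict[NSLOTS];

/* this routine would live in shared, read-only text */
static void shared_set_errno(int e)
{
    *(int *)global_dict[SLOT_ERRNO] = e;    /* one extra indirection */
}

static int my_errno;    /* the per-process global itself */

int main()
{
    global_dict[SLOT_ERRNO] = &my_errno;    /* filled in at startup */
    shared_set_errno(5);
    printf("errno slot now holds %d\n", my_errno);
    return 0;
}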
james.umcp-cs%udel-relay@sri-unix.UUCP (07/25/83)
From: James O'Toole <james.umcp-cs@udel-relay>

On PRIME machines running the PRIMOS operating system, sharing of library routines is accomplished via a strange call-by-name. The name of the routine is provided to the OS; it looks up the address and MODIFIES the calling instruction to call this address directly. I don't like this, but it works pretty well.
mike.rice@rand-relay@sri-unix.UUCP (07/26/83)
From: Mike.Caplinger <mike.rice@rand-relay>

Anybody ever heard of shareable images under VMS? There's no additional overhead once a process's address space is mapped in, because the shared routines are just magically mapped in, virtual-memory-wise. I think there were once plans to put such things in 4.2 BSD, but they seem to have been abandoned. They are nice in VMS, and can save lots of disk space. Also, if you change something you just reinstall the shareable library - nothing need be relinked (if you're clever and use a vector table, anyway). I don't like too many things about VMS, but this seems to be a major win.
ron%brl-bmd@sri-unix.UUCP (07/26/83)
From: Ron Natalie <ron@brl-bmd>

Now don't think this is unique to VMS. If you really want to get weird, have someone explain common banks in EXEC 8 to you.

-Ron
edhall%rand-unix@sri-unix.UUCP (07/26/83)
The idea precedes VMS; DEC's RSTS/E for the PDP-11 has had `Run-Time Systems' for some time. They essentially allow re-entrant code to be accessed by any number of jobs at once. Even such a beast as Perkin-Elmer's OS/32 has re-entrant libraries. Any discussion as to why (or how) such a thing could/couldn't be added to UNIX? -Ed
tbray@mprvaxa (07/26/83)
Shared system routines were a primary design objective of VMS, and several different tools are provided for building and using them. However, the goal was not easily achieved; even now, 5 years later, some pretty fundamental changes were made to the linker (read: loader) in VMS version 3, to eliminate some obscure contradictions that had been introduced. People in the group are correct when they predict horrible problems in making the loader smart enough to correctly handle all the permutations and combinations this can introduce. And with real memory getting so cheap (I keep thinking that the new Amdahls support 64M phys memory (!!!!!!!)), I wonder if it's worth it.

The way it's done at VMS run-time is that the shared stuff can appear anywhere in the address space, with the corresponding entries in the page table containing a flag indicating a shared reference. The code is then found via ANOTHER page table (called a global section table). This is referred to as a 'global valid page fault'.

...microsoft!ubc-visi!mprvaxa!tbray
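A very rough model of that two-level lookup (the structures are made up for illustration; the real VMS tables are considerably more involved): a per-process page table entry either holds a frame itself or carries a flag that sends the translation through the system-wide global section table:

/* Made-up structures illustrating the "global" page-table flag: a PTE
 * either names a private frame or defers to the global section table. */
#include <stdio.h>

struct gpte { int frame; };                     /* global section table entry */
static struct gpte global_section_table[] = { { 4321 } };

struct pte {
    int global;     /* flag: defer to the global section table */
    int index;      /* frame number, or index into the global table */
};

static int frame_of(struct pte p)
{
    if (p.global)
        return global_section_table[p.index].frame;     /* shared page */
    return p.index;                                      /* private page */
}

int main()
{
    struct pte private_page = { 0, 1234 };
    struct pte shared_page  = { 1, 0 };
    printf("private page -> frame %d\n", frame_of(private_page));
    printf("shared page  -> frame %d\n", frame_of(shared_page));
    return 0;
}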
jhh@ihldt.UUCP (07/27/83)
If shared memory had execution permission turned on, no other kernel changes would need to be made to support shared libraries on System V; everything else is there. All that remains is to create special library interface routines, and the shared memory manipulator.
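For concreteness, here is a minimal use of the System V shared-memory calls being referred to (shmget/shmat/shmdt/shmctl are the real interfaces; the program itself is only an illustration, and it shares data, not executable library text, which is exactly the missing piece noted above):

/* Minimal System V shared-memory example: create a segment, attach it,
 * store something in it, then detach and remove it.  Executing library
 * code out of such a segment would additionally need execute permission
 * on the mapping. */
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

int main()
{
    int id = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
    if (id < 0) {
        perror("shmget");
        return 1;
    }

    char *seg = shmat(id, (void *)0, 0);        /* attach anywhere */
    if (seg == (char *)-1) {
        perror("shmat");
        return 1;
    }

    strcpy(seg, "one copy, visible to every process that attaches it");
    printf("%s\n", seg);

    shmdt(seg);                                 /* detach... */
    shmctl(id, IPC_RMID, (struct shmid_ds *)0); /* ...and remove the segment */
    return 0;
}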
barmar@mit-eddie.UUCP (07/27/83)
The discussion of shared libraries that occurred a while ago was mostly about whether library routines that just about everyone uses should be moved into the kernel, since people didn't want to have to deal with these issues. It died, luckily.

BTW, shared libraries were implemented in Multics (a "pre-clone" of Unix :-)) from day 1 (nearly twenty years ago). We call it dynamic linking, and I wouldn't want to live without it. All it takes is a bit in indirect pointers which causes a reference to fault; the OS traps the linkage fault, follows the faulted pointer to find the symbolic name of the reference, finds the library routine, patches the indirect pointer to reference it, and restarts the instruction.

-- Barry Margolin
ARPA: barmar@MIT-Multics
UUCP: ..!genrad!mit-eddie!barmar
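The linkage-fault mechanism can be caricatured in C (the names are invented, and a real system does the resolution in a trap handler rather than with an explicit test): an unresolved link holds only the symbolic name, and the first use looks it up, patches the pointer, and retries:

/* Caricature of dynamic linking: a "link" starts out faulted, holding
 * only the symbolic name; the first use resolves it, patches the
 * pointer, and later uses go straight through. */
#include <stdio.h>
#include <string.h>

/* the routine that would live in the shared library */
static double sqrt_approx(double x)
{
    double g = x > 1 ? x : 1;           /* crude Newton iteration */
    int i;
    for (i = 0; i < 20; i++)
        g = (g + x / g) / 2;
    return g;
}

struct link {
    const char *name;           /* symbolic name, kept until resolved */
    double (*addr)(double);     /* 0 means "fault bit still set" */
};

/* stand-in for the OS searching for the named routine */
static double (*resolve(const char *name))(double)
{
    if (strcmp(name, "sqrt_approx") == 0)
        return sqrt_approx;
    return 0;
}

static double call_through(struct link *l, double arg)
{
    if (l->addr == 0)                   /* linkage fault */
        l->addr = resolve(l->name);     /* patch ("snap") the link */
    return (*l->addr)(arg);             /* restart the call */
}

int main()
{
    struct link lk = { "sqrt_approx", 0 };
    printf("%g\n", call_through(&lk, 2.0));     /* faults, resolves, runs */
    printf("%g\n", call_through(&lk, 9.0));     /* link already snapped */
    return 0;
}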
mat@hou5e.UUCP (M Terribile) (07/27/83)
There is one reason for not putting a shared library system on an OS: it is very hard to do it in a way that is both general and right without sacrificing machine cycles, I/O bandwidth, or something else. I have seen it done efficiently, but in a very specific way, to help maintain the speed of a UNI*X machine dedicated to a few special applications. I have seen it done generally, on the HP 3000, with HARDWARE SUPPORT. As a result, loading the COBOL compiler can take over 3 seconds of dedicated disk usage (run-time linking). It IS possible to get around it (the HP's OS has a ``sticky-bit'' type of facility), but the problem affects EVERYTHING that runs on the machine, and the difference between a dog of a machine and a smooth-running one lies almost entirely in the technical savvy of the system administrators. Not desirable!

Perhaps a middle ground could be found in a multi-kernelized system with an efficient ``sys call'' facility, but if you are talking about cheap machines (managers DO buy DG machines, you know) it may take a while to happen.

Mark Terribile
Duke of deNet
thomas@utah-gr.UUCP (Spencer W. Thomas) (07/27/83)
I'm surprised nobody has mentioned IBM yet. The 360 architecture certainly supported "shared libraries" quite well, but they called it "dynamic linking". I wrote quite a few programs which used shared database libraries on IBM 360/370 equipment (IMS and CICS). =Spencer
jbray@bbn-unix@sri-unix.UUCP (07/27/83)
From: James Bray <jbray@bbn-unix>

What you are talking about here is a Run-Time Library. This is something the Gods would indeed smile upon, had they not in their imponderable wisdom created Unix without shared segments or anything of the sort. As Unix grows onto the more advanced hardware it now finds itself on, these should become available. We are told that System V, which I should have and be upgrading our Unix to any day now, has some sort of shared-memory capability between processes. I would be most interested if someone who has actually seen the code could describe it, since shared memory for Unix can be done either of two ways: as a major architectural change involving work all over the kernel and breaking everything in the process (the way it should be done), or as a bizarre and inelegant hack, sort of like using pipes and ports as interprocess communication.

In any case, what you want is something like the way Perkin-Elmer's OS/32 does it, to hark back for the umpteenth time to my last job. (OS/32 is a big assembly-language mess with a horrible user interface that makes it look like a bizarre form of torture compared to Unix, but it has all this neat real-time type stuff that makes Unix look like a toy. I'll take Unix any day, but would really like both.) It provides named, shared, read-only segments in memory which are loaded when the first process needing their contents is loaded, and which subsequent users merely link to. The way this all works is that one builds the thing just like a library, putting all the right stuff in it, and then one has one's loader scan it for needed library routines before getting them in the usual way; the loader builds a link into the task image that points to this segment, and pulls the segment in off the disk at run time if it is not already resident.

You can do all sorts of great things with shared segments, but as one can imagine they rather complicate questions of core management, especially swapping. But it is worth it; they save not only disk space, but a lot of core as well. What you don't want to do is start putting the stuff in the kernel, unless you do it via a very restrictive and well-defined system-service interface. It is much nicer to have shared run-time libraries.

--Jim Bray
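A small sketch of the loader-side behavior described above (all names and addresses are hypothetical): when resolving an undefined symbol, look in the shared segment's directory first, and only fall back to copying the routine out of the ordinary library if it isn't there:

/* Sketch of the loader's resolution order: shared segment first,
 * archive library second.  The directories here are invented. */
#include <stdio.h>
#include <string.h>

struct symbol { const char *name; long addr; };

/* directory of the named, read-only shared segment */
static struct symbol shared_dir[] = {
    { "printf", 0x7f0000L },
    { "fopen",  0x7f0800L },
};

/* routines that would have to be copied in from the archive library */
static struct symbol archive_dir[] = {
    { "obscure_routine", 0L },
};

static const char *resolve(const char *name)
{
    int i;
    for (i = 0; i < (int)(sizeof shared_dir / sizeof shared_dir[0]); i++)
        if (strcmp(shared_dir[i].name, name) == 0)
            return "link to shared segment";
    for (i = 0; i < (int)(sizeof archive_dir / sizeof archive_dir[0]); i++)
        if (strcmp(archive_dir[i].name, name) == 0)
            return "copy from archive library";
    return "undefined";
}

int main()
{
    printf("printf:          %s\n", resolve("printf"));
    printf("obscure_routine: %s\n", resolve("obscure_routine"));
    return 0;
}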
guy@rlgvax.UUCP (Guy Harris) (07/27/83)
The way Multics solved the "global references from shared libraries" problem was to use a giant transfer vector called the "(common) linkage segment". This was also necessary for the dynamic linking. Any reference to an external of any sort went through an indirect pointer in the common linkage segment. This pointer was initially a special pointer which caused a trap, and it pointed to a character string which was the name of the external. When the fault occurred due to this pointer, the OS would find the segment referenced by that pointer (using a search rule similar to PATH) and "initiate" it (i.e., map it into your address space). It would copy the prototype of its common linkage segment section into the system common linkage segment (which was also the per-process static data segment, so this copy would also initialize static variables), so any time any routine in that segment referenced an external, the same fault process would occur. Then it would paste the address of the given entry point in the given segment into the pointer in the common linkage segment.

Unfortunately for this scheme under UNIX, existing compilers don't produce code to reference externals through such a transfer vector (at least not on the machines I'm familiar with; I've seen references to transfer vectors on the 3B machines), so the Multics solution can't just be dropped into UNIX.

Guy Harris
{seismo,mcnc,we13,brl-bmd,allegra}!rlgvax!guy
guy@rlgvax.UUCP (Guy Harris) (07/27/83)
P.S. In case anyone wants to pick nits, I should have said "COMBINED linkage segment" when I said "COMMON linkage segment". It's been over 8 years since I've been around Multics.
guy@rlgvax.UUCP (Guy Harris) (07/27/83)
If the HP 3000 does shared library routines with full generality (i.e., dynamic linking), that probably accounts for most of the load. RSX-11 does not do dynamic linking; you must "link" in the references to the shared library at program link time. Of course, this means you can't just stick in a new copy of the shared library whenever you change a routine and expect everybody's programs to use the new version automatically (which is one of the side benefits of system calls; just re-sysgen the OS and everybody making a system call gets the new code). So there are some tradeoffs available, depending on how general or right you want to be. I've not used RSX-11M style shared libraries (i.e., you bind at program link time, and they are accessed mostly like regular libraries), so I don't know how inconvenient the restrictions on such are. The fully general approach (i.e., bind at program execute time) does impose the cost of a linker each time you run the program, but Multics provided a binder which permitted you to bind references between the modules given to the binder before program run time. Guy Harris
mark.umcp-cs@udel-relay@sri-unix.UUCP (07/29/83)
From: Mark Weiser <mark.umcp-cs@udel-relay>

"... thinking that the new Amdahls support 64M phys memory (!!!!!!!)"
    -- ...microsoft!ubc-visi!mprvaxa!tbray

And now even Vaxes support 32M phys...
hal@cornell.UUCP (Hal Perkins) (07/31/83)
Someone made a remark about the Univac 1100 EXEC 8 "common banks". This was something of a kludge, but it did allow shared libraries WITHOUT having to make changes to the existing compilers or loaders, and old programs could take advantage of the shared library just by relinking them.

It worked something like this. The shared routines were kept in a common area of virtual memory (that's the basic idea; the details are much more grungy and you probably don't want to know them). [On a VAX, this could be done by placing the shared routines in the system quarter of virtual memory and using common page tables for all users.] The shared routines were preceded by an address vector, and all calls to shared routines went through this vector. Thus, shared routines could be modified and moved around in memory as long as the vector was updated. This avoided wiring absolute entry point addresses into user programs, and meant that programs always used the currently installed version of the routines.

The interesting thing is how this was made to work with old user programs. In the system libraries, the existing routines were replaced by little stubs that had exactly the same calling sequence as the old (non-shared) library routines. These stubs jumped to the appropriate shared routine to do the work. The stubs were linked with compiler object files to produce absolute files just as before. But once this change was made, the size of absolute files shrank by almost the entire size of the library routines, and this was achieved without modifying any of the compilers or loaders. Eventually, some of the compilers were modified to call the shared library directly (I believe), which eliminated the small overhead of calling the shared library through the stub routines.

I can't see why it wouldn't be possible to implement a similar setup in Unix, at least on systems with large virtual memories. The savings in disk space for linked files and the reduction in individual program working sets might well be worth the small cost in extra CPU time needed to call shared routines indirectly. Any volunteers? I am not a qualified wizard and don't have any spare time to attempt something like this even if I knew enough about the system to do it.

Hal Perkins
Cornell Computer Science
UUCP: {decvax|vax135|...}!cornell!hal
ARPA: hal@cornell
BITNET: hal@crnlcs
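A toy C version of the stub-plus-vector arrangement (the names are invented; the real thing was done at the EXEC 8 level, not in C): the stub keeps the old calling sequence but simply forwards through a slot in the shared address vector, so installing a new library version only means updating the vector:

/* Toy version of the EXEC 8 stub/vector arrangement.  The stub has the
 * same calling sequence as the old library routine but only forwards
 * through the address vector, so a new release can be installed by
 * changing the vector alone. */
#include <stdio.h>

static int shared_abs_v1(int x) { return x < 0 ? -x : x; }
static int shared_abs_v2(int x) { return x >= 0 ? x : -x; }   /* "new release" */

/* the address vector that precedes the shared routines */
static int (*address_vector[])(int) = { shared_abs_v1 };

/* the stub linked into user programs in place of the old routine */
int lib_abs(int x)
{
    return (*address_vector[0])(x);     /* jump through the vector */
}

int main()
{
    printf("%d\n", lib_abs(-7));            /* uses the installed version */
    address_vector[0] = shared_abs_v2;      /* "reinstall" the library */
    printf("%d\n", lib_abs(-7));            /* same stub, new routine */
    return 0;
}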
guy@rlgvax.UUCP (08/01/83)
Just out of curiosity, how did the EXEC 8 system handle global references from the shared library to data, rather than routines, outside of it? Did it use the Multics technique of using an indirect pointer to refer to any external, whether code or data; did it just forbid such references; or did it find a third way out? Guy Harris {seismo,mcnc,we13,brl-bmd,allegra}!rlgvax!guy
andrew@orca.UUCP (Andrew Klossner) (08/02/83)
"The idea precedes VMS; DEC's RSTS/E for the PDP-11 has had `Run-Time Systems' for some time. They essentially allow re-entrant code to be accessed by any number of jobs at once." And the idea precedes RSTS. TOPS-10, the PDP-10 operating system upon whose architecture RSTS was based ("hey, let's write TOPS-10 in Basic for a minicomputer!") implemented "Object Time Systems", which occupied the upper half of the logical address space of a program written in Fortran, Algol, or Cobol. -- Andrew Klossner (decvax!teklabs!tekecs!andrew) [UUCP] (andrew.tektronix@rand-relay) [ARPA]