rbbb@rice.EDU (04/08/87)
(Why this belongs in info-c is beyond me, but maybe it will be educational.)

I received replies from

    BOB%UTAHMED.BITNET@wiscvm.wisc.edu (Bob Wheeler)
    blarson%castor.usc.edu@usc-oberon.ARPA (Bob Larson)
    Beebe@SCIENCE.UTAH.EDU (Nelson Beebe)

telling of shared libraries under VMS 4.* and PRIMOS 19.4-*. I also discussed details of the VMS implementation with Scott Comer here.

First, under VMS it is NOT necessary to INSTALL a shared library; that is only done for purposes of speed or for protected (i.e., privileged) shareable images. Any user can create and/or use a shareable image. Bob Wheeler and Nelson Beebe both report that these are Good Things, both because of the real memory saved and because of the flexibility. Shareable libraries are particularly nice in a graphics application--the device driver library can be chosen at run-time rather than link-time, so we avoid a proliferation of .EXE files loaded for a number of display devices (PLOT79 supports over 40 different ones).

PRIMOS has shareable images, and in 19.4 and beyond does them in a "new" nicer way. A link to a shared library, known as a dynt (dynamic entry), is a pointer, with the fault bit set, to a character string holding the subroutine name. When a procedure is called indirectly through a pointer with the fault bit set, PRIMOS handles the exception and searches the libraries specified in the user's entrypoint search rules for a routine with that name. When the routine is found, the library is mapped into the user's address space and the faulting pointer is replaced with a pointer to the actual routine. (Such pointers must be in the user's data rather than pure procedure space, since the program may not be mapped in at the same address for other users.)
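
To make the dynt idea concrete, here is a rough sketch in portable C. It is only a simulation, not PRIMOS code: the "fault bit" is modelled as a NULL function pointer, the "entrypoint search rules" as a small static table, and every name in it (struct dynt, dynt_call, resolve, twice, negate) is invented for the illustration. A real implementation would take a hardware fault and map a library file rather than test a pointer in software.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* All names below are hypothetical; this only simulates the mechanism. */
typedef int (*routine_fn)(int);

/* Stand-ins for routines that live in a shared library. */
static int twice(int x)  { return 2 * x; }
static int negate(int x) { return -x; }

/* Stand-in for the libraries reachable through the search rules. */
struct lib_entry { const char *name; routine_fn fn; };

static const struct lib_entry search_rules[] = {
    { "twice",  twice  },
    { "negate", negate },
};

/* A link slot kept in the caller's private data. Until first use it
 * holds only the routine name; target == NULL plays the part of the
 * fault bit. */
struct dynt {
    const char *name;    /* character-string name of the routine     */
    routine_fn  target;  /* NULL until the first call resolves it    */
    int         faults;  /* how many times resolution has been done  */
};

/* Search the table for a routine with the given name. */
static routine_fn resolve(const char *name)
{
    size_t i;
    for (i = 0; i < sizeof search_rules / sizeof search_rules[0]; i++)
        if (strcmp(search_rules[i].name, name) == 0)
            return search_rules[i].fn;
    return NULL;
}

/* Call through a link slot. On the first call the slot "faults": the
 * name is looked up and the slot is overwritten with the real address,
 * so later calls go straight through. Returns 0 on success, -1 if no
 * routine of that name can be found. */
int dynt_call(struct dynt *slot, int arg, int *result)
{
    assert(slot != NULL && result != NULL);
    if (slot->target == NULL) {
        slot->target = resolve(slot->name);
        if (slot->target == NULL)
            return -1;
        slot->faults++;
    }
    *result = slot->target(arg);
    return 0;
}
```

A caller would declare something like `struct dynt link = { "twice", NULL, 0 };` in its own data and call `dynt_call(&link, 21, &out)`. Only the first call pays for the lookup. Because the slot is rewritten in place, it has to live in per-user data, which is the point of the parenthetical remark above.
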
The pure procedure portion of the library is shared among all users using the library and paged against the run file (rather than copying to the paging device), and the static data is allocated out of process-class or procedure-class storage depending on the class of the library. (Process class storage will not be reallocated for further use of the library across program execution bounds.) Libraries are referenced by their file name, and protected via the normal file system protection. Search rules are part of the user's environment, are initialized to the system default at login time, and may be manipulated via the set_search_rules command and some system subroutines.

This approach seems to be doable entirely at run-time. Any unresolved references at link-time are just converted to these character-string references. This approach is (according to Scott) also used in Cedar Mesa for unresolved references.

Another approach, less flexible but still imaginable, is to specify the shared libraries at link-time and leave holes in the address space for the shared routines (you must specify them to the linker so it knows how big a hole to make). At run-time, when the image page map is created, these holes are mapped to the shared libraries; if the mapped routine has become bigger than its hole, you lose.

Both approaches require some additional fooling around in the operating system to be sure that the same physical pages are actually mapped by all current users of the image. Without this, you still have the flexibility, but every user of the image has his own private set of physical pages (but everyone pages against the same image).

To add shareable libraries to unix (or any other operating system), the following things must be taken care of:

1) Position Independent Code. Without this, any shareable library implementation is bound to be brain-damaged.
It must be possible to load shared (or just run-time-bound) code into any place in text address space to avoid conflicts or wasted address space.

2) Debugging and shared code. It is Not Friendly to actually set a breakpoint in a shared page; this must either be prohibited or handled by copying and remapping the page (ptrace is a system call, so it can handle this if it really wants to).

3) Correct paging and swapping of shared code. There is no reason that a piece of shared code should not get paged out, but this should only happen after it has been ejected from the working set of every one of its users (or else paging it out should cause this to happen). This creates an odd sort of dilemma--on the one hand, there is "no reason" to eject a heavily shared page from a working set because it "doesn't cost that much" (only one physical page is used). On the other hand, if it isn't ejected from working sets then it will never be released. A possible solution to this is to page shared memory as if it belonged to a separate process (in fact, it might be managed by a separate process). Another complication arises because it becomes possible for several faults to the same physical page to occur at the same time.

4) Calls from one shared, dynamically loaded library to another shared, dynamically loaded library. To implement this, there must be a few unshared (data) pages containing the traps and names of the routines so that each user gets the correct calls.

I think I've said enough. It should be clear that shared memory is Not Enough to get shared libraries; one also needs position independent code and some idea of what to do with debugging and shared calls to dynamically loaded code.

David