[net.unix-wizards] shared libraries

jmcg (04/08/83)

Printf should not be a system call.  System calls should be reserved
for functions requiring access to something not available to the
process by normal means.

Shared libraries would provide all the advantages claimed.  Only one
copy of the library needs to be kept in memory.  Only one copy of the
library's routines needs to be kept on disk.  Linking and loading would
be faster.

The main reason that shared libraries are not a part of UNIX is that
they were not reasonable on PDP-11s, where dedicating a chunk of address
space to the library would have decreased the amount of space available
to programs (few programs load the ENTIRE library, after all).

Besides having to consume enough address space to hold the entire
library, there are other implementation decisions that have to be made
for shared libraries.  A scheme for dynamic linkage is needed, to allow
for loading the library initially and to allow for updating it without
rebooting the system or having to recompile everything.  For instance,
VMS uses a transfer vector at the front of the library so that a
particular routine does not get a new address when the library is
updated.  On large Burroughs systems, the operating system would trap
on the first call to a library routine, load it if necessary, and then
patch in the address so later calls would take place without overhead.
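
To make the transfer-vector idea concrete, here is a rough sketch in C;
the names and the function-pointer table are invented for illustration
and are not how VMS or Burroughs actually spell it.  The point is that
user programs are bound to fixed slots rather than to the routines
themselves, so the routines can move when the library is rebuilt:

	/*
	 * Hypothetical transfer vector: a table of function pointers
	 * kept at a fixed, known place at the front of the library.
	 * A new release only has to fill the same slots again.
	 */
	extern int lib_open(), lib_read(), lib_printf();

	int (*transfer_vector[])() = {
		lib_open,	/* slot 0 is always "open"   */
		lib_read,	/* slot 1 is always "read"   */
		lib_printf,	/* slot 2 is always "printf" */
	};

	/* A program linked long ago still works after the library is
	 * updated, because it depends only on slot numbers: */
	int
	call_printf(str)
	char *str;
	{
		return (*transfer_vector[2])(str);
	}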

The code in a shared library generally must be position-independent, and
access to static data must go through indirection, since the addresses at
which the library's text and data will sit are not known when the library
itself is linked.  These restrictions can be avoided by reserving fixed
locations in the address space for library text and data, but that approach
has several drawbacks: it makes it difficult to use several shared
libraries at once, it wastes page table space on the VAX, and you have to
find a place that will satisfy everyone.
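
A minimal sketch of the data-indirection part (the structure and pointer
names here are made up for illustration): instead of referring to its
statics at addresses fixed when the library was linked, the library
reaches everything through one base pointer that is set up when the
library is mapped into a particular process:

	/* All of the library's static data gathered behind one pointer. */
	struct libdata {
		int	errcount;
		char	buf[512];
	};

	/* Filled in by the startup code or dynamic loader to point at
	 * this process's copy of the library's data segment. */
	struct libdata *__libdata;

	int
	lib_geterr()
	{
		return __libdata->errcount;	/* indirect, not absolute */
	}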

Shared libraries offer significant performance advantages, so they will
eventually appear despite requiring an extension to the UNIX process
architecture.  We should see them implemented at about the same rate as we
see PCC replaced (or supplemented) by C compilers that do a much better job
of generating code for their particular machine: that is, it is a
machine-dependent improvement that is likely to occur only as manufacturers
start supporting UNIX.

In their talk at Winter UNICOM in San Diego, Glenn Skinner and Bill
Jolitz described their plans for shared libraries for UNIX on the
16032.  Neither SUN nor Berkeley has mentioned similar plans for
4.2BSD, but some of the new virtual memory capabilities planned would
make it easier to do.
							Jim McGinness
		sdcsvax!jmcg	(619)452-4016		UC San Diego, Chemistry
	   or	decvax!jmcg

tim (04/08/83)

Does bsd UNIX on the VAX not follow the VAX Architecture Handbook?
If it does, then half the address space (2 gigabytes, isn't it?)
is system space, shared by all processes. What is the objection
to putting printf and other very commonly used routines in system
space?

Tim Maroney

BRUCE@umdb@sri-unix.UUCP (07/29/83)

From:  Bruce Crabill <BRUCE@umdb>

There was nothing intrinsic to IBM's 360 architecture that made dynamic
linking possible.  It was all done with software and should be do-able on
almost any machine.  Univac, however, does have hardware features that make
shared libraries very efficient to use.  They have a memory mapping concept
that allows up to 4 address spaces ("banks") to be visible at any given
time.  They have an instruction that says to make a given bank visible and
to jump to it.  If the bank is not currently in memory at the time, the
operating system is given an interrupt and it swaps the bank in.  Entry
points to these "common banks" (as Univac calls these pieces of shared
code) are usually handled by having a jump vector at the beginning of the
common bank.  Thus, the entry points always have fixed addresses, and
changes to the common bank do not require modifications to existing
programs.  Back to IBM: VM/370 has a shared-code feature much like Univac's
common banks, except that it is implemented in software.  In VM, a program
must issue a DIAGNOSE (an operating system call) to cause a given piece of
shared code to be mapped into its address space.  This requires much more
work on the part of the operating system than Univac's approach, since it
always requires OS intervention, rather than only when the shared code is
not already in memory (which, for commonly used routines, it probably is).
IBM has a feature called DAS ("Dual Address Space") that appears to do all
of this in hardware, much as Univac does.  It is available only on the 308x
as a standard feature, and on the 3031, 3033, and 4341 as an optional
feature.  It appears to have been designed for MVS.

                                       Bruce

Arpanet:  BRUCE%UMDB.BITNET@BERKELEY
Bitnet:   BRUCE@UMDB

gwyn@brl-vld@sri-unix.UUCP (07/29/83)

From:      Doug Gwyn (VLD/VMB) <gwyn@brl-vld>

I like libraries the way they are, thank you.

mo@lbl-csam@sri-unix.UUCP (07/31/83)

From:  Mike O'Dell [system] <mo@lbl-csam>

In general, I think Unix has gotten along quite well without 
shared libraries, running on very small machines (dual rk05 systems
have 5-megs total), and should probably continue to do so.

While shared libraries do have some useful things going for them,
they also have one extreme liability:  everyone using the
shared library gets ALL of the library, even when they only use
small parts of it.  On machines with 16-bit program counters,
this can be disastrous.  Example: no way will VI run without overlays
on non-split I/D machines if all of STDIO gets loaded.  This is not
a bad example, since it is probably one of the more heavily used programs
at many installations and would be expected to benefit a great deal.  It
seems to me that the machines which can support it most easily (e.g., the
VAX) usually need it the least: VAXen usually don't run from 10-meg
winnies (modulo the dual-RL02 730).  RSX on the PDP-11 has had shared
partitions (which can be read-only or read-write) since T=0, but shared
libraries aren't used terribly much as a general facility (like sharing
STDIO, etc.; they are used a great deal for other things) because of the
large granularity of memory management on the PDP-11 (8K bytes).

This might be mitigated, however, by the large-address-space micros like
the 68K and the 16K.  In that case, you either get to do a general memory
sharing scheme, or do a special case hack.  If a hack is proposed, I have
some suggestions.

One way would be to add a new magic number for a.out files.  This number
says to load at the address specified in the header, but share the first
ZZ bytes (determined from the header or as a system constant) with other
programs having the same magic number.  The "real" text segment could still
be shared with a little extra work.  The scheme is then to package the
library code with the C startup code, which is then loaded at the front of
the a.out.  This would let the normal loader work fine without any special
hacks - with the possible exception of adding a flag which says to start
loading at a certain address (4.x already has this), and a flag to generate
the appropriate magic number.  The cc command would by default then generate
an ld command which says

	ld -Z sharedcrt0.o -T<ZZ> main.o user1.o user2.o ...

The -Z flag says to generate this new magic number.
There would have to be some way to get this shared part initialized.
Probably the easiest way would be for something run out of /etc/rc to
load code into the shared area with a shared magic number.  The stuff is
then administered like sticky-bit text.
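
To make the header part of this concrete, here is a sketch of what the
new a.out header might carry.  The structure, field, and magic-number
names below are invented for illustration; the stock 4.x struct exec has
nothing like them:

	/*
	 * Hypothetical header for the scheme above.  The new magic number
	 * tells exec() to map the first a_shared bytes (ZZ) from the
	 * system-wide shared image instead of reading them from this file.
	 */
	#define	SMAGIC	0510	/* invented value, for illustration */

	struct shared_exec {
		long	a_magic;	/* SMAGIC, or the usual 0407/0410/0413 */
		long	a_shared;	/* size of shared library prefix (ZZ)  */
		long	a_text;		/* size of private text                */
		long	a_data;		/* size of initialized data            */
		long	a_bss;		/* size of uninitialized data          */
		long	a_entry;	/* entry point                         */
	};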

What is wrong with this??  Lots of things.

It isn't general - there is provision for only one such shared library.
One solution would be to add another word (somewhere!) which indicates
which shared prefix is being used, and keep track of multiple pieces, but
that doesn't really solve the problem of them all loading at the same
place.  It also doesn't solve the problem of a shared read-write segment,
which one would like to see done if one is going to all this trouble.  A
shared partition scheme like the one RSX uses could certainly be done, and
could be supported by the current loader without too much work, but if you
have ever seen an RSX-11 Task Builder input file, you know the evils which
must be strenuously avoided.

Again, the whole issue of shared libraries sounds like a special-case
hack one does when one is out of some critical resource (e.g., disk or
physical memory), but both of those are getting cheaper almost infinitely
faster than programmer time.  So I would argue that this is NOT
a feature which belongs in Unix for the future, but might well be
a hack I would put into my own system running on toy disks, or possibly
into a system like Xenix which is specifically targeted to run on
radically-underconfigured machines.

	Save me Rob, save me!!!
	-Mike

db@cbosgd.UUCP (08/04/83)

A reason for shared libraries that doesn't seem to have gotten much
discussion is: being able to update common application libraries without
having to recompile & load the entire system.

For example, our product uses a database library that is referenced by the
majority of the application code.  If we need to modify this library, we
rebuild the entire system to make sure that everything is properly loaded.
This takes *eight hours* (or more) on a PDP-11/70 (NOT including the time
to load the resultant tapes and restart the system).

Most of our application programs can be updated in the field by putting
the revised version on the customer's machine and compiling and installing
it there, but this cannot be easily done with the library code.

So, some kind of shared-library mechanism that would allow new versions
of common code to be installed (as long as the function calls remain the
same) sure would be handy.

-Dave Bursik/BTL Columbus/..cbosg!db, ..cbosgd!db

zrm@mit-eddie.UUCP (Zigurd R. Mednieks) (08/06/83)

The NS16000 seems to be able to hack shared libraries, and with some
only slightly awful software hacks, seems capable of Multics-like
dynamic linking. Now THAT would be a really nice thing to have in Unix.
No more relinking huge database system applications, no more dozen or
more copies of huge stdio libraries in core. And you could even supply
your own versions of libraries to be used by other programs -- your own
version of curses, for instance.
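
(For comparison, the closest you can get today with ordinary archive
libraries is to name your own object file ahead of the library at link
time -- mycurses.o below is a made-up name -- so the loader satisfies
those entry points from your file and, as long as nothing else drags them
in, never pulls the library's copies:

	cc -o prog main.o mycurses.o -lcurses

Dynamic linking would give the same effect without relinking every
program that uses the library.)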

I was a bit taken aback that the National Semi people at Usenix hadn't
even heard of Multics and didn't know what dynamic linking was.
It's amazing how the best ideas of the past become lost arts. No wonder
the VAX turned out the way it did.

Cheers,
Zig

laura@utcsstat.UUCP (08/07/83)

The flip side of the "gee, wouldn't it be nice if everyone got the new
changes without having to recompile everything" is not so much fun.

Many users write software that depends on the features (bugs) of existing
routines. If you fix the bugs, you break their code. The standard
situation is for some soul to come staggering into your office complaining
that this program which worked perfectly on June 4, 1980 is no longer
working. They don't know when it stopped working. They weren't even at
the university 2 years ago, but their supervisor said that his last
grad student used it and...

And they don't know how to program in C, and they need the program
right away, and they don't know the algorithm it is trying to implement,
and would you please fix it right away??

This is a hassle. People who are interested in saving themselves work
need to consider this. It doesn't matter that people should have written
correct code in the first place -- I can guarantee that they won't...

laura creighton
utcsstat!laura

gwyn@brl-vld@sri-unix.UUCP (08/07/83)

From:      Doug Gwyn (VLD/VMB) <gwyn@brl-vld>

You are exactly right.

BRUCE@umdb@sri-unix.UUCP (08/07/83)

From:  Bruce Crabill <BRUCE@umdb>

In reply to Laura Creighton's comment about the disadvantages of shared
libraries, I would just like to point out that if a program is taking
advantage of bugs in libraries, then that program is going to die the next
time anyone builds a new version of it anyway.

                                       Bruce

Arpanet:  BRUCE%UMDB.BITNET@BERKELEY
Bitnet:   BRUCE@UMDB

padpowell@wateng.UUCP (PAD Powell[Admin]) (08/07/83)

Laura made a comment that I think we should all be aware of.  Besides the
standard "Murphy's Laws", which always apply to any program, I have the
following observations (henceforth termed "Patrick's Nitpicks"):
1.  No program done by an undergrad will work after he graduates.
2.  No program done by a grad student will work after he completes his
	thesis.
3.  No program done by a "haque" will work unless he is on the system
	("falling leaf in desert side effect").
4.  Programs done by U. of W. coop students will immediately fail as soon as
	the student starts working for your bitterest corporate/academic
	rival.
5.  Portable Unix libraries are for nerds.  Real Programmers use their own
	that give 50% more effectiveness.  This is called the "high speed
	chase in a minefield" syndrome, based on what users would like to do
	to the Real Programmer.

I think you all have further additions to the list...
Patrick ("I hear that UUCP was done by a haque") Powell

padpowell@wateng.UUCP (PAD Powell[Admin]) (08/07/83)

Gee, Ziggie, I think you have hit on something.  Let's see now.
1.  According to "popular belief", all the system architects and so forth
	must be wonder whiz kids.
2.  Seeing how smart they are, they usually (with two exceptions that I know
	personally) could not care less about dead/obsolete/nonworking
	systems.  The only things to care about are the current/newest/fastest
	ones on the market/currently reviewed.
3.  All system designs have ridiculously short lead times.  For example, in my
	personal experience,  a company was told to come up with a system
	design for a real time system, with UNIX like structure, with
	documents,  an implementation schedule,  and supporting technical
	arguments in 6 weeks.

Conclusion:
	System design seems to be like running in a stampede: it doesn't
	matter what you do, as long as you keep in front and don't look back,
	or you get trampled to death.

Patrick ("UNIX still hasn't got inter process communication") Powell

dmmartindale@watcgl.UUCP (Dave Martindale) (08/08/83)

Now wait a minute, Patrick.  Hacks tend to write code which, at some level,
they feel proud of.  It may be elegant, or it may run faster than anything
anyone else can write, or just make use of an instruction that no one else
could find a use for, but there's something special about it.
Uucp doesn't fit this description.  I would guess that at least some of
the code was written by a novice, not a hack.

laura@utcsstat.UUCP (08/09/83)

A lot of people around here will hang onto their binaries until the day
they die, retire, or graduate, and they will not be interested in
'rebuilding' anything. The problem is more serious, though. What do
you do the day you decide to change malloc or printf and you make a
mistake? What do you do when your whole kernel is broken and can't come
up? Get a new disk pack and regen a new system? What? You don't *have*
a spare disk pack? I'd say you were rather thoroughly stuck, then...

laura creighton
utzoo!utcsstat!laura

gwyn@brl-vld@sri-unix.UUCP (08/10/83)

From:      Doug Gwyn (VLD/VMB) <gwyn@brl-vld>

Come on, guys, there are different user populations out here.
Some of us like to play with computers and some of us like to use
them to get things done.  Either way, UNIX is nifty.

UNIX most certainly DOES have IPC.  Maybe you should check your
system vendor out more carefully next time you buy a UNIX.

jbray@bbn-unix@sri-unix.UUCP (08/10/83)

From:  James Bray <jbray@bbn-unix>

However someone may choose to do shared libraries, they should neither be in the
kernel nor associated with it in any way. Thus, the kernel will come up. It
is possible that one will not be able to do anything once it does if one has
broken libc or if it has gotten trashed accidentally; thus, one should keep a
survival kit of things like rm and mv in /systools, and backup copies of the
essential libraries in some safe place. 

keith@soph.UUCP (Keith Crews) (06/20/85)

A couple of weeks ago someone indicated that code to implement
shared libraries would soon be appearing in net.sources.  If I missed
it, could someone please mail it to me?  Thanks in advance.

glenn@emory.UUCP (Glenn A. Zazulia) (11/04/85)

Does anyone have any information or bibliographic references on shared
libraries?  I am about to begin writing a paper on this topic and would
appreciate any suggestions of good places to search.

Please reply through electronic mail, if possible.  I will post a summary
if there is interest.

Thanks!

-- 
Glenn Zazulia

Emory University         |   {akgua,sb1,gatech,decvax}!emory!glenn    USENET
Dept of Math and CS      |   glenn@emory                              CSNET
Atlanta, Ga 30322        |   glenn.emory@csnet-relay                  ARPANET