[comp.unix.internals] Shared Lib Question

pat@rwing.UUCP (Pat Myrto) (05/06/91)

I have noticed with interest the discussion going on regarding shared
libraries.  However, what is obvious is that there are several kinds
of shared libraries, all using some different scheme to operate.

Does anyone out there know how the type that is used on the ISC SysV
R3 version operates?  Does the whole image load into core and remain
there, and the programs that use the shared library access the
functions they need?  Or does an attempt to load the library occur and
the system finds it's already loaded and shares the text, similar to
the way separate processes of the same program share text?  Once
loaded by a program using it, does it STAY in core, or do parts stay,
or if nothing is running that uses it, is the core space freed?

If the questions seem stupid, it's because I have NO IDEA of how this
works - docs not being much help - so other than the obvious saving on
disk space, I am not really able to make an intelligent decision on
whether a given application would be best built with or without using
the shared library.

Thanks for any info...
-- 
pat@rwing                                       (Pat Myrto),  Seattle, WA
                            ...!uunet!pilchuck!rwing!pat
      ...!uw-beaver!uw-entropy!dataio!/
WISDOM:    "Travelling unarmed is like boating without a life jacket" 

mohta@necom830.cc.titech.ac.jp (Masataka Ohta) (05/07/91)

In article <276@rwing.UUCP> pat@rwing.UUCP (Pat Myrto) writes:

>I have noticed with interest the discussion going on regarding shared
>libraries.  However, what is obvious is that there are several kinds
>of shared libraries, all using some different scheme to operate.

It proves that the concept of shared libraries is not so simple.

						Masataka Ohta

guy@auspex.auspex.com (Guy Harris) (05/09/91)

>>I have noticed with interest the discussion going on regarding shared
>>libraries.  However, what is obvious is that there are several kinds
>>of shared libraries, all using some different scheme to operate.
>
>It proves that the concept of shared libraries is not so simple.

Only to people who, for whatever mysterious reason, thought that:

	1) there was only one way that every OS in the universe uses to
	   implement shared libraries;

or

	2) every UNIX system in creation that provides some
	   functionality does so in the same fashion.

Anybody who believed neither 1) nor 2) already knew that the concept
wasn't "so simple" that they could assume that every system in the
universe did shared libraries the same way.  Anybody who believed 2)
hasn't seen very many UNIX systems; anybody who believed 1) hasn't seen
many OSes, or hasn't noticed that, in fact, different OSes provide
*lots* of different functions - not just shared libraries - in different
fashions. 

I.e., it's not an argument of any sort against shared libraries, if
that's what you had in mind....

jfh@rpp386.cactus.org (John F Haugh II) (05/09/91)

In article <162@titccy.cc.titech.ac.jp> mohta@necom830.cc.titech.ac.jp (Masataka Ohta) writes:
>In article <276@rwing.UUCP> pat@rwing.UUCP (Pat Myrto) writes:
>>I have noticed with interest the discussion going on regarding shared
>>libraries.  However, what is obvious is that there are several kinds
>>of shared libraries, all using some different scheme to operate.
>
>It proves that the concept of shared libraries is not so simple.

You are confusing the concept and the implementation of the
concept.  I would argue that the wide variety of implementation
schemes is proof that the fundamental concept is very simple.

The trouble (as I see it) is that the C libraries were not
designed well from the start.  The notion that there should
never be global variables with all manner of hidden side
effects was beaten into my brain as a CS undergrad.  Not to
slight Ritchie, et al, but they appear not to have suffered
the same violent abuse at the hands of their instructors ...

Were the code in the C library pure, shared libraries would
be extremely simple to implement.  Data, which isn't sharable,
is the worst of the flies in the ointment.
-- 
John F. Haugh II        | Distribution to  | UUCP: ...!cs.utexas.edu!rpp386!jfh
Ma Bell: (512) 255-8251 | GEnie PROHIBITED :-) |  Domain: jfh@rpp386.cactus.org
"If liberals interpreted the 2nd Amendment the same way they interpret the
 rest of the Constitution, gun ownership would be mandatory."

mohta@necom830.cc.titech.ac.jp (Masataka Ohta) (05/09/91)

In article <7690@auspex.auspex.com> guy@auspex.auspex.com (Guy Harris) writes:


>Anybody who believed 2)
>hasn't seen very many UNIX systems; anybody who believed 1) hasn't seen
>many OSes, or hasn't noticed that, in fact, different OSes provide
>*lots* of different functions - not just shared libraries - in different
>fashions. 

Apparently, you haven't used many OSes. Most OSes do many things badly.

There are only a few (if not zero) ways to do something right.

Moreover, there seems to be no right implementation of shared libraries, so
far.

						Masataka Ohta

kre@cs.mu.oz.au (Robert Elz) (05/09/91)

jfh@rpp386.cactus.org (John F Haugh II) writes:

>The notion that there should
>never be global variables with all manner of hidden side
>effects was beaten into my brain as a CS undergrad.

All kinds of generally true, but occasionally inadequate, principles
are beaten into the brains of CS undergrads (or those that actually
have such things) - CS undergrads are, as a rule, lacking in the maturity
and experience needed to understand the subtleties of what really should
be done.  I suspect that dmr just may have had that experience and ability.

>Were the code in the C library pure, shared libraries would
>be extremely simple to implement.

True, they'd also be close to useless.

kre

mohta@necom830.cc.titech.ac.jp (Masataka Ohta) (05/10/91)

In article <19252@rpp386.cactus.org>
	jfh@rpp386.cactus.org (John F Haugh II) writes:

>The trouble (as I see it) is that the C libraries were not
>designed well from the start.  The notion that there should
>never be global variables with all manner of hidden side
>effects was beaten into my brain as a CS undergrad.  Not to
>slight Ritchie, et al, but they appear not to have suffered
>the same violent abuse at the hands of their instructors ...

Consider an OS. It contains process-private global variables, such as uid,
which have global hidden side effects.

As you know, UNIX libraries are well designed so that there is no distinction
between library calls and pure system calls.

A pure system call one day may be implemented as a complex library call
the next.

Thus, it is natural that a library call has its own state, like the stdio
library.

>Were the code in the C library pure, shared libraries would
>be extremely simple to implement.  Data, which isn't sharable,
>is the worst of the flies in the ointment.

It is absurd to make C libraries complex only for the simple implementation
of shared library.

BTW, data for indirect jump to library routines is, anyway, not sharable.

						Masataka Ohta

jfh@rpp386.cactus.org (John F Haugh II) (05/10/91)

In article <kre.673798776@mundamutti.cs.mu.OZ.AU> kre@cs.mu.oz.au (Robert Elz) writes:
>jfh@rpp386.cactus.org (John F Haugh II) writes:
>>Were the code in the C library pure, shared libraries would
>>be extremely simple to implement.
>
>True, they'd also be close to useless.

Would you care to elaborate on this point?  Name a single library function
which cannot be implemented well without global variables.  Justify your
answer.

jfh@rpp386.cactus.org (John F Haugh II) (05/10/91)

In article <174@titccy.cc.titech.ac.jp> mohta@necom830.cc.titech.ac.jp (Masataka Ohta) writes:
>Thus, it is natural that a library call has its own state, like the stdio
>library.

There is a difference between `state' and `global variables'.  For example,
the stdio library you mention, could keep its state in the (FILE *) object
you pass as its argument in the `no global variables' version of our
stdio library.  Functions which implement the `no (FILE *) argument'
version of the library routines would be simple wrappers ala

	printf (args ...)
	{
		return fprintf (stdout, args ...);
	}

where what you do to make "args ..." work is left to the reader ...
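
One way to make "args ..." work is the <stdarg.h> route.  The following is
only a minimal sketch: the name my_printf is hypothetical (it stands in for
the library's own printf), and it assumes a vfprintf that takes a va_list is
available.  Note that stdout is itself still a global, so the wrapper is the
part that would stay out of the pure code.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical name: my_printf stands in for the library's printf. */
int my_printf(const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    /* All per-stream state lives in the FILE object; the wrapper only
       supplies the implied stream and forwards the argument list. */
    n = vfprintf(stdout, fmt, ap);
    va_end(ap);
    return n;
}
```

Called as my_printf("%d-%s\n", 42, "ok"), it returns the number of
characters written, exactly as fprintf would for the same arguments.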

fmayhar@hermes.ladc.bull.com (Frank Mayhar) (05/11/91)

In article <162@titccy.cc.titech.ac.jp>, mohta@necom830.cc.titech.ac.jp (Masataka Ohta) writes:
-> In article <276@rwing.UUCP> pat@rwing.UUCP (Pat Myrto) writes:
-> >I have noticed with interest the discussion going on regarding shared
-> >libraries.  However, what is obvious is that there are several kinds
-> >of shared libraries, all using some different scheme to operate.
-> It proves that the concept of shared libraries is not so simple.

The _concept_ is simple.  The _implementation_ is complex.  And has certain
problems that may be solved in a number of different ways, hence the
different shared library schemes.
-- 
Frank Mayhar  fmayhar@hermes.ladc.bull.com (..!{uunet,hacgate}!ladcgw!fmayhar)
              Bull HN Information Systems Inc.  Los Angeles Development Center
              5250 W. Century Blvd., LA, CA  90045    Phone:  (213) 216-6241

barmar@think.com (Barry Margolin) (05/11/91)

In article <19255@rpp386.cactus.org> jfh@rpp386.cactus.org (John F Haugh II) writes:
>  Name a single library function
>which cannot be implemented well without global variables.

Malloc() needs a global variable that points to the arena.

Stdio uses the global variables stdin, stdout, and stderr.

Errno is a global variable, and some library routines set it.
-- 
Barry Margolin, Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar

allbery@NCoast.ORG (Brandon S. Allbery KB8JRR/AA) (05/11/91)

As quoted from <162@titccy.cc.titech.ac.jp> by mohta@necom830.cc.titech.ac.jp (Masataka Ohta):
+---------------
| In article <276@rwing.UUCP> pat@rwing.UUCP (Pat Myrto) writes:
| >I have noticed with interest the discussion going on regarding shared
| >libraries.  However, what is obvious is that there are several kinds
| >of shared libraries, all using some different scheme to operate.
| 
| It proves that the concept of shared libraries is not so simple.
+---------------

No, it proves that anything, regardless of its simplicity, can be made
arbitrarily and unnecessarily complex.  SVR3 shared libraries are a pretty
good example of that.  But it does NOT mean that any given complex
implementation is proof that the *concept* is complex.

Quite aside from the other complexities underlying such things as varying
shared library implementations:  marketing decisions, for example.  Now
THERE'S a complex system for you to try to unravel.  Good luck --- you'll need
it.

++Brandon
-- 
Me: Brandon S. Allbery			  Ham: KB8JRR/AA  10m,6m,2m,220,440,1.2
Internet: allbery@NCoast.ORG		       (restricted HF at present)
Delphi: ALLBERY				 AMPR: kb8jrr.AmPR.ORG [44.70.4.88]
uunet!usenet.ins.cwru.edu!ncoast!allbery       KB8JRR @ WA8BXN.OH

jim@segue.segue.com (Jim Balter) (05/12/91)

In article <kre.673798776@mundamutti.cs.mu.OZ.AU> kre@cs.mu.oz.au (Robert Elz) writes:
>>Were the code in the C library pure, shared libraries would
>>be extremely simple to implement.
>
>True, they'd also be close to useless.

Nah.  If every library routine took a pointer argument that was a handle for
the data area, the library could trivially be pure.  You could even write
internal versions that take the handle and external versions (with the familiar
names and arguments) that don't but pick it up from a global and pass it and
all their arguments to the internal routines.  Voila, you have pure library
routines compatible with the current C library interface.  The internal
routines would be just like the current routines except that, instead of using
globals, they would use members of a structure, an instance of which would be
pointed to by the handle.
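
The internal/external split described above might look roughly like this in
C.  It is a sketch under assumptions: struct lib_data, lib_rand_r, lib_rand
and lib_handle are all hypothetical names, and a small random-number routine
stands in for "a library routine that uses globals".

```c
/* Hypothetical per-process data area: everything that would otherwise be
   a library global becomes a member of this structure. */
struct lib_data {
    unsigned long rand_state;   /* state for the random-number routine */
    int           last_error;   /* stand-in for an errno-like variable */
};

/* Internal, pure version: touches nothing but its argument. */
int lib_rand_r(struct lib_data *h)
{
    h->rand_state = h->rand_state * 1103515245UL + 12345UL;
    return (int)((h->rand_state / 65536UL) % 32768UL);
}

/* The one remaining global: a handle to this process's data area. */
static struct lib_data lib_default_data = { 1UL, 0 };
static struct lib_data *lib_handle = &lib_default_data;

/* External version with the familiar argument list: it picks the handle
   up from the global and passes it on to the internal routine. */
int lib_rand(void)
{
    return lib_rand_r(lib_handle);
}
```

Only lib_rand and the two static objects need to be private to each process;
lib_rand_r contains no reference to any global and could be shared as is.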

There's generally a cost for the extra call level, although on architectures
with lots of registers the handle could be loaded into one register and never
changed, and the internal library routines could be called directly (or even
vectored through the handle) given compiler support for accessing the register.


Since it is trivial to turn a routine that uses globals into one that doesn't
by adding an argument, it is silly to say that such routines are "close to
useless".  On the other hand, if "pure" means "has no side effects", then
we are talking about Functional Programming, which is a whole other subject.

jfh@rpp386.cactus.org (John F Haugh II) (05/13/91)

In article <1991May10.192648.3147@Think.COM> barmar@think.com writes:
>In article <19255@rpp386.cactus.org> jfh@rpp386.cactus.org (John F Haugh II) writes:
>>  Name a single library function
>>which cannot be implemented well without global variables.
>
>Malloc() needs a global variable that points to the arena.
>
>Stdio uses the global variables stdin, stdout, and stderr.
>
>Errno is a global variable, and some library routines set it.

The key word was "well", not "at all".  malloc() can be
implemented as a function which is bound static and has
a single pointer to all the private data that it requires.
It then calls the real routine which takes a pointer to
the arena glarp and size of the desired object.
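
A minimal sketch of that split, under stated assumptions: struct arena,
arena_alloc and my_malloc are hypothetical names, and a trivial bump
allocator (no free) stands in for the real allocation algorithm.

```c
#include <stddef.h>

/* Hypothetical arena descriptor. */
struct arena {
    unsigned char *base;   /* start of the storage being carved up */
    size_t         size;   /* total bytes available */
    size_t         used;   /* bytes handed out so far */
};

/* The "real routine": pure code, all state reached through the pointer. */
void *arena_alloc(struct arena *a, size_t n)
{
    size_t rounded = (n + 15u) & ~(size_t)15u;   /* round up to 16 bytes */
    void *p;

    if (rounded < n || rounded > a->size - a->used)
        return NULL;                             /* overflow or arena full */
    p = a->base + a->used;
    a->used += rounded;
    return p;
}

/* Private, per-process part: the storage, its descriptor, and the wrapper
   that supplies the descriptor.  This is what stays out of the shared text. */
static union { unsigned char bytes[4096]; long double align; } my_heap;
static struct arena my_arena = { my_heap.bytes, sizeof my_heap.bytes, 0 };

void *my_malloc(size_t n)
{
    return arena_alloc(&my_arena, n);
}
```

The global has not gone away; it has moved into the small wrapper that each
program links privately.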

Likewise for the stdio library.  Functions which have
implied (FILE *) objects can be re-implemented as wrappers
for the versions which require the explicit argument.  The
FILE _iob[] 

Given that the functions were =designed= with the notion
of shared global data, I don't see any reason not to cheat
and leave certain parts =out= of the shared library.

I'll leave errno as an exercise for the reader.

[ Hint: How does the system get errno out of the kernel and
  into the user space, if it is a user space global variable? ]
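
One possible reading of the hint, as a sketch only: the kernel hands back an
error indication, and a small user-space stub stores it into errno.  The
names fake_kernel_close and my_close are hypothetical, and the convention
assumed here (a negative return value carries the error number) is just one
of several in use; it is not taken from this thread.

```c
#include <errno.h>

/* Stand-in for the trap into the kernel.  Assumed convention: a negative
   return value is the negated error number. */
static long fake_kernel_close(int fd)
{
    return (fd < 0) ? -(long)EBADF : 0L;
}

/* User-space stub: this, not the kernel, is what writes errno. */
int my_close(int fd)
{
    long r = fake_kernel_close(fd);

    if (r < 0) {
        errno = (int)-r;   /* the only store to the global */
        return -1;
    }
    return 0;
}
```

Because the store to errno happens in the stub, the stub can live in the
non-shared part while everything behind it stays pure.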

guy@auspex.auspex.com (Guy Harris) (05/13/91)

>Apparently, you haven't used many OSes. Most OSes do many things badly.

Irrelevant.  I said that different OSes provide various functions in
different fashions, which means that the fact that different OSes
implement shared libraries differently isn't any sort of valid argument against
shared libraries; your statement doesn't have any relevance to that.

So what's an OS that doesn't "do many things badly"?

>Moreover, there seems to be no right implementation of shared libraries, so
>far.

OK, so what would you consider a "right" implementation of them?  What
don't you like about, say, Multics's implementation, or VMS's, or
Aegis's, or SunOS 4.x/S5R4's, or OSF/1's, or....?

mohta@necom830.cc.titech.ac.jp (Masataka Ohta) (05/13/91)

In article <7762@auspex.auspex.com>
	guy@auspex.auspex.com (Guy Harris) writes:

>>Apparently, you haven't used many OSes. Most OSes do many things badly.
>
>Irrelevant.  I said that different OSes provide various functions in
>different fashions, which means that the fact that different OSes
>implement shared libraries differently isn't any sort of valid argument against
>shared libraries; your statement doesn't have any relevance to that.

The problem is that NO OS supports shared libraries right, perhaps because
there is no way to do so.

>>Moreover, there seems to be no right implementation of shared libraries, so
>>far.

>OK, so what would you consider a "right" implementation of them?  

Do you consider there is a "right" one?

>What
>don't you like about, say, Multics's implementation, or VMS's, or
>Aegis's, or SunOS 4.x/S5R4's, or OSF/1's, or....?

Indirect jumps and accompanied process private data for the jump table.

						Masataka Ohta

jim@segue.segue.com (Jim Balter) (05/14/91)

In article <19256@rpp386.cactus.org> jfh@rpp386.cactus.org (John F Haugh II) writes:
>There is a difference between `state' and `global variables'.  For example,
>the stdio library you mention, could keep its state in the (FILE *) object
>you pass as its argument in the `no global variables' version of our
>stdio library.

However, ANSI mandates that fflush(NULL) flush all open output streams, which
requires a global pointing to the list of FILEs.  Of course, fflush could be a
non-shared wrapper around a shared routine; you would also need fopen and fdopen
to be wrappers, since they need to hang new FILEs on the list.  So, one can
implement the C library as a shared library without globals if one is willing
to claim that the wrappers and the overhead they require isn't part of the
implementation, but that is stretching it.
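
To make the wrapper arrangement concrete, here is a minimal sketch of an
operation that must visit every open stream.  Everything in it is
hypothetical: struct stream is a stand-in for FILE, and stream_link,
stream_flush_all, my_open and my_flush_all are illustrative names, not part
of any real stdio.

```c
#include <stddef.h>

/* Stand-in for FILE: just enough state to show the list handling. */
struct stream {
    struct stream *next;
    int            dirty;   /* nonzero when buffered output is pending */
};

/* Shareable routines: the list head is always passed in explicitly. */
void stream_link(struct stream **head, struct stream *s)
{
    s->next = *head;
    *head = s;
}

int stream_flush_all(struct stream **head)
{
    struct stream *s;
    int flushed = 0;

    for (s = *head; s != NULL; s = s->next) {
        if (s->dirty) {
            s->dirty = 0;   /* a real version would write the buffer out */
            flushed++;
        }
    }
    return flushed;
}

/* Non-shared part: the per-process global and the wrappers around it. */
static struct stream *open_streams = NULL;

void my_open(struct stream *s)
{
    s->dirty = 0;
    stream_link(&open_streams, s);
}

int my_flush_all(void)
{
    return stream_flush_all(&open_streams);
}
```

The list head open_streams is exactly the global in question; the sketch
only shows where it ends up, not that it disappears.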

The obvious other C library routine that requires globals is malloc.
You could pass it an arena descriptor, but that would require that any library
routine that might ever be shared and use malloc be passed the
descriptor.  Making malloc take an arena descriptor is bad library design.
However, it is again possible to have the internal version of malloc accept
such a descriptor, and have the wrapper for malloc, as well as the wrapper for
any shared routine that uses malloc, pass a global descriptor to malloc.
Of course, any shared library routine that does not get passed the descriptor
can never be reimplemented to use malloc, at least not compatibly with old
executables (but then, that's what library version numbers are for).

It is far better to either keep a register pointing to a global data area,
as mentioned in my previous note, or to support shared libraries with
private data, with all the attendant complexity, than to cripple a library
specification on the basis of a misapplication of CS principles.

>"If liberals interpreted the 2nd Amendment the same way they interpret the
> rest of the Constitution, gun ownership would be mandatory."

Actually, they would merely oppose gun control.  But why be honest when you
can be hyperbolic?  Whereas liberals want to ignore the 2nd while honoring
the rest, reactionaries want to honor the 2nd and ignore the rest, and call
liberals hypocrites while they are at it.  And then there are the centrists,
who consider themselves superior by virtue of too little knowledge or concern
to take any position.  But the less one recognizes one's own hypocrisy,
the more hypocritical one is.

P.S.  If you don't want your propaganda discussed here, don't post it.

"Guard against the impostures of pretended patriotism."

"Overgrown military establishments are under any form of government
inauspicious to liberty ..."

	-- George Washington

jfh@rpp386.cactus.org (John F Haugh II) (05/14/91)

In article <184@titccy.cc.titech.ac.jp> mohta@necom830.cc.titech.ac.jp (Masataka Ohta) writes:
>The problem is that NO OS supports shared libraries right, perhaps because
>there is no way to do so.

This is trivially false, as is the conclusion you reached along with it.

>>What
>>don't you like about, say, Multics's implementation, or VMS's, or
>>Aegis's, or SunOS 4.x/S5R4's, or OSF/1's, or....?
>
>Indirect jumps and accompanied process private data for the jump table.

Oh, so you don't like any shared library because it has to use things
that you don't like?  And this is the basis for your proof that NO OS
can "support shared libraries right."

Well, boyo, it's off to the KILL file for you!

mohta@necom830.cc.titech.ac.jp (Masataka Ohta) (05/15/91)

In article <19273@rpp386.cactus.org>
	jfh@rpp386.cactus.org (John F Haugh II) writes:

>>The problem is that NO OS supports shared libraries right, perhaps because
>>there is no way to do so.

>This is trivially false, as is the conclusion you reached along with it.

To claim so, he could simply have named an OS which does shared libraries
right, which he doesn't and perhaps can't do.

>>Indirect jumps and accompanied process private data for the jump table.

>Oh, so you don't like any shared library because it has to use things
>that you don't like?  And this is the basis for your proof that NO OS
>can "support shared libraries right."

Below is JFH's claim.

	From: jfh@rpp386.cactus.org (John F Haugh II)
	Newsgroups: comp.unix.internals
	Subject: Re: Shared Lib Question (ISC)
	Message-ID: <19252@rpp386.cactus.org>
	Date: 9 May 91 00:00:52 GMT

	Were the code in the C library pure, shared libraries would
	be extremely simple to implement.  Data, which isn't sharable,
	is the worst of the flies in the ointment.

And, as I pointed out, the jump table IS unsharable global data.

>Well, boyo, it's off to the KILL file for you!

Good-bye.

						Masataka Ohta

rickert@mp.cs.niu.edu (Neil Rickert) (05/15/91)

In article <187@titccy.cc.titech.ac.jp> mohta@necom830.cc.titech.ac.jp (Masataka Ohta) writes:
>In article <19273@rpp386.cactus.org>
>	jfh@rpp386.cactus.org (John F Haugh II) writes:
>
>>>The problem is that NO OS supports shared libraries right, perhaps because
>>>there is no way to do so.
>
>>This is trivially false, as is the conclusion you reached along with it.
>
>To claim so, he could simply have named an OS which does shared libraries
>right, which he doesn't and perhaps can't do.

 I sent Masataka Ohta private email suggesting that he look at a particular
(non-unix) implementation of shared libraries.  Had he followed up my
suggestion he would have found an implementation which justified many of
John's claims.

 Instead, Ohta responded that there is nothing which works in that system.

 When you have a closed mind, all discussion finishes.
-- 
=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=
  Neil W. Rickert, Computer Science               <rickert@cs.niu.edu>
  Northern Illinois Univ.
  DeKalb, IL 60115                                   +1-815-753-6940

mohta@necom830.cc.titech.ac.jp (Masataka Ohta) (05/15/91)

In article <1991May15.043318.7046@mp.cs.niu.edu>
	rickert@mp.cs.niu.edu (Neil Rickert) writes:

> When you have a closed mind, all discussion finishes.

You may claim I have a closed mind based on nothing. But please don't
post that here.

Instead, please make technical and informative postings based on evidence.

By now, I have claimed there can be no right implementation of shared
libraries because of the complexity related to the indirect jump table.

						Masataka Ohta

mohta@necom830.cc.titech.ac.jp (Masataka Ohta) (05/15/91)

In article <7516@segue.segue.com> jim@segue.segue.com (Jim Balter) writes:

>It is far better to either keep a register pointing to a global data area,
>as mentioned in my previous note, or to support shared libraries with
>private data, with all the attendant complexity, than to cripple a library
>specification on the basis of a misapplication of CS principles.

What if, as is often the case, you have several shared libraries linked
to one executable?

							Masataka Ohta

rickert@mp.cs.niu.edu (Neil Rickert) (05/15/91)

In article <194@titccy.cc.titech.ac.jp> mohta@necom830.cc.titech.ac.jp (Masataka Ohta) writes:
>In article <1991May15.043318.7046@mp.cs.niu.edu>
>	rickert@mp.cs.niu.edu (Neil Rickert) writes:
>
>> When you have a closed mind, all discussion finishes.
>
>You may claim I have a closed mind based on nothing. But please don't
>post that here.

 Your public display of hypocrisy, and your public criticism of others based
on it, warrants public exposure.

>Instead, please make technical and informative postings based on evidence.

  What would be the point?  You have refused to consider the evidence.


brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (05/16/91)

In article <7516@segue.segue.com> jim@segue.segue.com (Jim Balter) writes:
> So, one can
> implement the C library as a shared library without globals if one is willing
> to claim that the wrappers and the overhead they require isn't part of the
> implementation, but that is stretching it.

Yeah. I've been arguing this with John via e-mail. He keeps repeating
the same old description of how to separate a library into a sharable
part and the part with the global variables. I keep pointing out that
the global variables are still there. We all know how to write a shared
library under the *constraint* of no static data, but the issue here is
whether that is a real constraint---i.e., whether good libraries can use
globals. John keeps ignoring the fact that malloc() does use globals.

C'mon, John, would you just admit that eliminating global variables is
not always a good idea, and that requiring that shared libraries be pure
is a real constraint?

> The obvious other C library routine that requires globals is malloc.
> You could pass it an arena descriptor, but that would require that any library
> routine that might ever be shared and use malloc be passed the
> descriptor.

Even worse, it requires that any library routine that might *use* one of
those malloc-using routines must also take the pointer. So much for
separating interface from implementation.

Even worse than that, if you want to keep the same interface forever,
you have to predict all libraries that the implementation might use,
directly or indirectly, not just now but in the far future. So much for
finite argument lists. :-(

> It is far better to either keep a register pointing to a global data area,
> as mentioned in my previous note, or to support shared libraries with
> private data, with all the attendant complexity, than to cripple a library
> specification on the basis of a misapplication of CS principles.

Amen.

---Dan

rickert@mp.cs.niu.edu (Neil Rickert) (05/16/91)

In article <14213:May1522:13:2291@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>In article <7516@segue.segue.com> jim@segue.segue.com (Jim Balter) writes:
>> So, one can
>> implement the C library as a shared library without globals if one is willing
>> to claim that the wrappers and the overhead they require isn't part of the
>> implementation, but that is stretching it.
>
>Yeah. I've been arguing this with John via e-mail. He keeps repeating
>the same old description of how to separate a library into a sharable
>part and the part with the global variables. I keep pointing out that
>the global variables are still there. We all know how to write a shared
>library under the *constraint* of no static data, but the issue here is
>whether that is a real constraint---i.e., whether good libraries can use
>globals. John keeps ignoring the fact that malloc() does use globals.

  Why must you assume that a global variable must be static?

  Take a look at how IBM implemented a shared PL/I library in MVS.  And
don't bother to send me your flames that IBM stinks, or that MVS stinks,
or that PL/I stinks, or even that the PL/I shared library stinks.  It
doesn't matter, and is not relevant.  The fact is, the implementation manages
to use pure code, but still use global variables.  There is a register
dedicated to carrying the data structure via which the global variables are
accessed.

  Look at it this way - if you think IBM stinks, and PL/I stinks, then you
should be able to do this even better with C and Unix.


fmayhar@hermes.ladc.bull.com (Frank Mayhar) (05/16/91)

In article <184@titccy.cc.titech.ac.jp>, mohta@necom830.cc.titech.ac.jp (Masataka Ohta) writes:
-> In article <7762@auspex.auspex.com>
-> 	guy@auspex.auspex.com (Guy Harris) writes:
-> >Irrelevant.  I said that different OSes provide various functions in
-> >different fashions, which means that the fact that different OSes
-> >implement shared libraries differently isn't any sort of valid argument against
-> >shared libraries; your statement doesn't have any relevance to that.
-> The problem is that NO OS supports shared libraries right, perhaps because
-> there is no way to do so.

Again I ask, what do you consider a "right" way to implement them?  As opposed
to what you consider a "wrong" way.  Ignore existing implementations.  I mean,
in the best of all possible worlds, how should shared libraries be implemented.
(And don't say that in the best of all possible worlds, shared libraries
wouldn't exist.  See the last paragraph, below.)

-> >>Moreover, there seems to be no right implementation of shared libraries, so
-> >>far.
-> >OK, so what would you consider a "right" implementation of them?  
-> Do you consider there is a "right" one?

It really sounds like you're saying here that you don't like shared libraries
because none are done "right" and that none are done "right" because none are
done "right."

I _do_ consider that there are "right" ways to implement shared libraries, in
that there are effective, relatively efficient ways as opposed to ineffective,
relatively inefficient ways.  There is probably more than one "right" way, in
fact.  There may actually not be any "right" implementations extant at the
moment (this is debatable), but that's not the point.

-> >What
-> >don't you like about, say, Multics's implementation, or VMS's, or
-> >Aegis's, or SunOS 4.x/S5R4's, or OSF/1's, or....?
-> Indirect jumps and accompanied process private data for the jump table.

So what would be a better way to do it?

Really, there's a tradeoff between the utility of shared libraries and
efficiency.  This is the way operating systems work, unfortunately.  Odds
are, using a shared library will always be (perhaps only slightly) less
efficient than using unshared libraries, in terms of execution speed.  In
other terms, such as ease of maintenance or disk or memory usage (given
that shared libraries' instruction space is sharable) it can be much
more efficient.  This is the tradeoff.  And, certainly, not all applications
are suited to the use of shared libraries.  But that doesn't mean that _no_
application should use them.

rang@cs.wisc.edu (Anton Rang) (05/16/91)

In article <194@titccy.cc.titech.ac.jp> mohta@necom830.cc.titech.ac.jp (Masataka Ohta) writes:
>By now, I have claimed there can be no right implementation of shared
>libraries because of the complexity related to the indirect jump table.

  I'm not sure what the big deal about an indirect jump table is,
honestly.  Are you just complaining that it's slower?  If most of the
time a program is running is spent in loops, I'd hope that the
compiler would keep function addresses in registers, anyway (though of
course there are never enough registers).  If that's not your
complaint, then what's the problem with having a jump table?

  Anyway....

  You don't need an indirect jump table if your shared libraries link
at a static location (messy and not general), or if you're willing to
do load-time fixups on your code (expensive), or if you have an
architecture which supports segmentation.

  VMS allows you to build shared libraries which load at a fixed
location, and does not use indirect jumps in this case.  This is
rarely used, since it requires that shared libraries built in this
manner not conflict with each other (in terms of address space) and
must load at a high enough address to leave space for the user image.

  I don't know of any operating systems which support multiple
processes which use the load-time fixup approach.  Certain micro
systems (e.g. the Apple //gs) do this, but the cost on a multi-user
machine of not being able to share pages (and of paging everything in
as you do the load-time fixup) is high.

  You could do run-time fixup if you were willing to write to your
code pages, by making 'call XXXX' into a trap instruction and setting
the real address up the first time that call was hit.  I don't think
this would be worth it for many real applications.

  I'm not familiar enough with OSes running on segmented architectures
to be sure if there are any that use this approach, but I seem to
recall that MULTICS did, and that the cost to call a shared library
routine was higher on the *first* call from a particular point but
that the code was then patched to jump directly to the routine.
(Which presumably defeated code sharing?  I'll shut up now because I'm
not really sure and can't find my MULTICS papers.)

  The benefits of shared libraries, both in saving space on disk and
in sharing code in memory, seem to me to outweigh the costs.  I
haven't seen any empirical studies of this yet, though.  I'm sure they
exist, somewhere.

	Anton
   
+---------------------------+------------------+-------------+----------------+
| Anton Rang (grad student) | rang@cs.wisc.edu | UW--Madison | "VMS Forever!" |
+---------------------------+------------------+-------------+----------------+

barmar@think.com (Barry Margolin) (05/16/91)

In article <RANG.91May15231758@nexus.cs.wisc.edu> rang@cs.wisc.edu (Anton Rang) writes:
>  I'm not familiar enough with OSes running on segmented architectures
>to be sure if there are any that use this approach, but I seem to
>recall that MULTICS did, and that the cost to call a shared library
>routine was higher on the *first* call from a particular point but
>that the code was then patched to jump directly to the routine.
>(Which presumably defeated code sharing?  I'll shut up now because I'm
>not really sure and can't find my MULTICS papers.)

No, Multics (capital M, small rest) doesn't patch the text segment -- it is
almost always sharable.  Multics uses indirection through the static data
segment for dynamic linking.  It's the entry in this segment that is
patched the first time the routine is called, so future calls to the
routine are non-trapping indirect calls.  The first call is more expensive
simply because that is when the routine is linked.
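
A rough C sketch of that idea, assuming a function pointer in
per-process data can stand in for a Multics link and an ordinary stub
function for the linkage trap.  The names (square_link,
square_first_call, real_square) are hypothetical, and this is not how
Multics itself is coded.

```c
#include <assert.h>

static int resolve_count = 0;   /* how many times the "linker" ran */

/* The real routine, standing in for code in a shared text segment. */
static int real_square(int x) { return x * x; }

static int square_first_call(int x);

/* Linkage slot in per-process data; it starts out pointing at the
 * resolver stub, playing the role of a not-yet-resolved link. */
static int (*square_link)(int) = square_first_call;

/* The first call lands here: "link" the routine by patching the data
 * slot, then complete the call.  The code itself is never modified. */
static int square_first_call(int x)
{
    resolve_count++;
    square_link = real_square;
    return square_link(x);
}

/* What a call site amounts to: an indirect call through the slot. */
int call_square(int x) { return square_link(x); }

int times_resolved(void) { return resolve_count; }
```

Only the data slot is written, so the text stays sharable, and the
second and later calls go straight to the routine.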


-- 
Barry Margolin, Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar

jfh@rpp386.cactus.org (John F Haugh II) (05/16/91)

In article <14213:May1522:13:2291@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>Yeah. I've been arguing this with John via e-mail. He keeps repeating
>the same old description of how to separate a library into a sharable
>part and the part with the global variables. I keep pointing out that
>the global variables are still there. We all know how to write a shared
>library under the *constraint* of no static data, but the issue here is
>whether that is a real constraint---i.e., whether good libraries can use
>globals. John keeps ignoring the fact that malloc() does use globals.

Oh.  If you argue that there is no real need to have global static
data removed from the library, sure, in the absolute sense there is
no real need to do that.  All you need is a dynamic loader in the
kernel that can fix up all the references and create the additional data
space on the fly as needed.  And I'm not ignoring anything, I've
already said that yes, malloc() uses globals, and pretty much has
to given its current implementation.  It preserves state across
invocations, and that pretty much requires some static data somewheres
to handle.

>C'mon, John, would you just admit that eliminating global variables is
>not always a good idea, and that requiring that shared libraries be pure
>is a real constraint?

I don't know, I think the performance loss due to dynamic binding
might be the real constraint - constraint to future sales, that is.

>Even worse, it requires that any library routine that might *use* one of
>those malloc-using routines must also take the pointer. So much for
>separating interface from implementation.

Nope.  All that is needed is for the other routines to know how to
find the external interface to the malloc command.  Put a jump table
at a well-known address, and the internal routines are free to
invoke malloc via its external interface.  This even works if you
supply your own malloc() routine - the jump table, which resides
in your address space, gets fixed up by the binder or loader or
wherever you want to defer the problem off to.

>Even worse than that, if you want to keep the same interface forever,
>you have to predict all libraries that the implementation might use,
>directly or indirectly, not just now but in the far future. So much for
>finite argument lists. :-(

Glad to see that the implementations I've seen don't have that
problem.  That could be pretty tough to get around ;-)

>> It is far better to either keep a register pointing to a global data area,
>> as mentioned in my previous note, or to support shared libraries with
>> private data, with all the attendant complexity, than to cripple a library
>> specification on the basis of a misapplication of CS principles.
>
>Amen.

I do have to agree that using a register to point to global data is
a very good idea.  My only complaint is that you have to include so
much global data ...

I saw Larry McEvoy's (right name, right person?) posting regarding SunOS's
loader speed and don't exactly think we want to make it worse by having
more address fixups needed.

But seriously, you aren't crippling anything.  Last time I checked,
which was 5 years ago, the library scheme that John Bremsteller did
at Pinnacle had none of these problems.  It did have others, but that
was 5 years ago when no one knew what we were getting into.  I don't
know if it is still being used by them, they sacked me on 1/1/87,
but it worked just fine in my office on my desk for the short time
I was there playing with it.  Not counting me (because I was working
for the "marketing department"), there were two other UNIX programmers
there and they managed to get some "shared library" scheme working in
just a few weeks.  It really isn't that hard.  The original prototype
I worked up took me less than a week, and I had half the library or
more shared in that time.  I think the other John finished the idea
off in just a few more weeks of work, along with his other work load.
It really wasn't that hard, and it solved quite a few of the "hard"
problems that Dan dreams up.
-- 
John F. Haugh II        | Distribution to  | UUCP: ...!cs.utexas.edu!rpp386!jfh
Ma Bell: (512) 255-8251 | GEnie PROHIBITED :-) |  Domain: jfh@rpp386.cactus.org
"If liberals interpreted the 2nd Amendment the same way they interpret the
 rest of the Constitution, gun ownership would be mandatory."

goykhman_a@apollo.HP.COM (Alex Goykhman) (05/17/91)

In article <1991May10.192648.3147@Think.COM> barmar@think.com writes:
>In article <19255@rpp386.cactus.org> jfh@rpp386.cactus.org (John F Haugh II) writes:
>>  Name a single library function
>>which cannot be implemented well without global variables.
>
>Malloc() needs a global variable that points to the arena..........................
>
>Barry Margolin, Thinking Machines Corp.
>
>barmar@think.com
>{uunet,harvard}!think!barmar

Malloc() allocates memory from a process' data region.  Its internal data structures 
are only global to a process, but not the kernel and shared libraries.

Malloc() is always called in a particular context, and always has to deal with a single
set of memory structures.  It should not matter whether malloc() is linked to a
particular process statically, or dynamically.


Alex Goykhman                    speaking for myself  
Chelmsford System Software Lab   mit-eddie!apollo!goykhman_a
Hewlett-Packard, Company         goykhman_a@apollo.hp.com

goykhman_a@apollo.HP.COM (Alex Goykhman) (05/17/91)

In article <7516@segue.segue.com> jim@segue.segue.com (Jim Balter) writes:
>In article <19256@rpp386.cactus.org> jfh@rpp386.cactus.org (John F Haugh II) writes:
>>There is a difference between `state' and `global variables'.  For example,
>>the stdio library you mention, could keep its state in the (FILE *) object
>>you pass as its argument in the `no global variables' version of our
>>stdio library.
>
>However, ANSI mandates that fclose(NULL) close all open FILEs, which requires
>a global pointing to the list of FILEs.  Of course, fclose could be a non-shared
>wrapper around a shared global routine; you would also need fopen and fdopen to
>be wrappers, since they need to hang new FILEs on the list.  So, one can
>implement the C library as a shared library without globals if one is willing
>to claim that the wrappers and the overhead they require isn't part of the
>implementation, but that is stretching it.

    Fclose(NULL) is only defined within the context of a single process, all it 
    needs to do is to go through the per-process fd table and close every fd that
    remains open.  It should be easy to implement this call as a part of a shared
    library, and I am not sure what kind of overhead you are referring to.

[deleted]


Alex Goykhman                    speaking for myself  
Chelmsford System Software Lab   mit-eddie!apollo!goykhman_a
Hewlett-Packard, Company         goykhman_a@apollo.hp.com

pauld@stowe.cs.washington.edu (Paul Barton-Davis) (05/17/91)

In article <RANG.91May15231758@nexus.cs.wisc.edu> rang@cs.wisc.edu (Anton Rang) writes:
>  I don't know of any operating systems which support multiple
>processes which use the load-time fixup approach.  Certain micro
>systems (e.g. the Apple //gs) do this, but the cost on a multi-user
>machine of not being able to share pages (and of paging everything in
>as you do the load-time fixup) is high.
>

Anton, I'm not quite sure what you mean by a load-time fixup approach,
but if it describes resolving the references to library symbols at
load time, then just check out OSF/1 :-)

Just got back from a 1 day seminar on OSF/1, and this behaviour
provoked some interesting questions from several people. Apparently,
some systems use another level of indirection to solve the problem of
globals during load-time resolution ("look at 0xXXXXXX to find out
where errno lives" type of thing). No-one at the seminar knew what
OSF/1 does, but it was claimed that Mach's lazy evaluation approach to
VM made whatever it does do much easier :-)
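
A small sketch of that extra level of indirection, assuming a pointer
cell whose location the library knows and whose contents are set per
process.  The names (errno_cell, LIB_ERRNO, relocate_errno) are
invented for illustration; nothing here is taken from OSF/1.

```c
#include <assert.h>

/* Per-process storage; in a real system the loader would choose where
 * this lives. */
static int process_errno;

/* Hypothetical well-known cell: the library knows only the address of
 * this pointer, not the address of the variable itself. */
static int *errno_cell = &process_errno;

/* All library code reaches the variable through the cell. */
#define LIB_ERRNO (*errno_cell)

/* What a loader might do when it places the variable elsewhere. */
void relocate_errno(int *where)
{
    assert(where != 0);
    errno_cell = where;
}

int lib_fail(int code)
{
    LIB_ERRNO = code;
    return -1;
}

int lib_last_error(void) { return LIB_ERRNO; }
```

The library text never contains the address of the variable, only the
address of the cell, so the same text works wherever the variable ends
up.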

[ BTW - from the impression I received, I intend to stay as far
  away from OSF/1 as possible. Maybe OSF/2, if it's a proper
  microkernel-based implementation of Mach, will be more
  satisfactory. However, my impressions may, of course, be wrong ]



-- 
Paul Barton-Davis <pauld@cs.washington.edu> UW Computer Science Lab	 

"People cannot cooperate towards common goals if they are forced to
 compete with each other in order to guarantee their own survival."

guy@auspex.auspex.com (Guy Harris) (05/18/91)

>The problem is that NO OS support shared libraries right, perhaps because
>there is no way to do so.

Or, at least, you don't consider any OS to support shared libraries
right.  Others would disagree with you.

>>>Moreover, there seems to be no right implementation of shared libraries, so
>>>far.
>
>>OK, so what would you consider a "right" implementation of them?  
>
>Do you consider there is a "right" one?

Answer the question - and not with another question!  Do *you* consider
there to be a "right" one, or do you think that it's impossible to do
shared libraries "right"?  Given your statement:

>>>Moreover, there seems to be no right implementation of shared libraries, so
>>>far.

one might well reasonably conclude, from the "so far", that you believe
that it is possible.  If so, you obviously have *some* criterion for
deciding whether a shared library implementation is "right" or not; what
is that criterion?

>>What
>>don't you like about, say, Multics's implementation, or VMS's, or
>>Aegis's, or SunOS 4.x/S5R4's, or OSF/1's, or....?
>
>Indirect jumps and accompanied process private data for the jump table.

So if there were a shared library scheme wherein, say, each shared
library were placed at a fixed address in the virtual address space,
known somehow to the linker (perhaps it's in a header in the shared
library file), and the linker, when you link against a shared library,
turns a jump to some particular routine in that shared library into a
jump to the address that routine will have at run time, that might be a
"right" implementation?

Is your objection to the indirect jumps due to the extra CPU time spent
doing the jumps?  Some of us would probably be willing to pay the price
for those indirect jumps in exchange for the flexibility of being able
to plug in a new version of a shared library and still have old binaries
continue to run, just as some of us are willing to pay the performance
price for having software written in higher-level languages than
assembler in exchange for the flexibility of being able to move the
software to a different processor - others might not, but, hey, that's
what makes horse races. 

ka@felix.UUCP (Kenneth Almquist) (05/18/91)

jfh@rpp386.cactus.org (John F Haugh II) writes:
>>>Were the code in the C library pure, shared libraries would
>>>be extremely simple to implement.

and later challenges:
> Name a single library function which cannot be implemented well
> without global variables.  Justify your answer.

How about:

1)  getpwent and relatives -- needs global variables to hold the current
    state and the returned passwd structure.  (Placing these in a structure
    which is passed to getpwent would complicate the calling program,
    especially if the passwd file is examined in several places.)

2)  malloc/realloc/free -- requires a global data structure to keep track
    of which areas of memory are free.

3)  stdio/exit -- needs a global variable so that exit can locate and close
    all open files.
					Kenneth Almquist

jim@segue.segue.com (Jim Balter) (05/18/91)

In article <19261@rpp386.cactus.org> jfh@rpp386.cactus.org (John F Haugh II) writes:
>The key word was "well", not "at all".  malloc() can be
>implemented as a function which is bound static and has
>a single pointer to all the private data that it requires.
>It then calls the real routine which takes a pointer to
>the arena glarp and size of the desired object.

As I've already pointed out, this requires that every library routine that ever
might call [a routine that ever might call ...] (I meant to imply this closure,
Dan) malloc must also be a wrapper that passes the malloc arena pointer to the
real routine, which must in turn pass the arena pointer to any other real
routine that ever might call [a real routine that ever might call ...] real
malloc.  Whether this is "implemented well" is certainly a matter of opinion.
Passing this arena pointer around violates all reasonable coupling rules.
Better to pass a pointer to a structure containing all global data,
in a hidden register if possible, else as the first or last (by convention)
arg to every routine.

>Likewise for the stdio library.  Functions which have
>implied (FILE *) objects can be re-implemented as wrappers
>for the versions which require the explicit argument.  The
>FILE _iob[] 

Better to have a wrapper for every function and pass one pointer
for every function than to try to guess ahead of time which functions
might lead to a call to malloc and which functions might lead to
a reference to _iob.  Of course, this requires a single structure
definition that contains the malloc arena as well as _iob and any
other globals that might be needed, which is grossly bad coupling,
although you could build the structure up cleanly from pieces in
various modules, avoiding the need to couple all this disparate info.
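
A minimal sketch of the one-pointer convention, assuming a single
per-process block passed as the first argument to every real routine.
The struct and function names are hypothetical, and the fields are
stand-ins for the real malloc arena and _iob.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-process block holding every "global" the library
 * needs. */
struct lib_globals {
    size_t arena_used;     /* stands in for malloc's arena state */
    int    open_streams;   /* stands in for stdio's _iob bookkeeping */
};

/* Every real routine takes the block as its first argument. */
size_t real_reserve(struct lib_globals *g, size_t n)
{
    assert(g != NULL);
    g->arena_used += n;
    return g->arena_used;
}

int real_open_stream(struct lib_globals *g)
{
    assert(g != NULL);
    real_reserve(g, 512);   /* an inner call just forwards the pointer */
    return ++g->open_streams;
}
```

No routine has to know in advance which of its callees will end up
needing the arena; it forwards the same pointer regardless.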

>[ Hint: How does the system get errno out of the kernel and
>  into the user space, if it is a user space global variable? ]

Here's where DMR's bad CS becomes evident; system call interfaces should
have taken a pointer through which to store the error number (or an error
structure, for more detailed info), instead of the global errno hack.
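
A sketch of what such an interface might look like from the caller's
side.  close_e() is a hypothetical name; since the real close(2) still
reports through errno, this wrapper can only copy the value out, which
is an assumption of the sketch rather than a change to the system call
itself.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical wrapper: same job as close(2), but the error number is
 * stored through a caller-supplied pointer instead of being left for
 * the caller to read from the global errno. */
int close_e(int fd, int *err)
{
    assert(err != 0);
    if (close(fd) == -1) {
        *err = errno;   /* copy out immediately */
        return -1;
    }
    *err = 0;
    return 0;
}
```

With the error slot supplied by the caller, a library routine built on
such calls needs no writable global of its own for error reporting.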

>"If liberals interpreted the 2nd Amendment the same way they interpret the
> rest of the Constitution, gun ownership would be mandatory."

Foo on political propaganda in technical newsgroups.

jfh@rpp386.cactus.org (John F Haugh II) (05/19/91)

In article <7611@segue.segue.com> jim@segue.segue.com (Jim Balter) writes:
>As I've already pointed out, this requires that every library routine that ever
>might call [a routine that ever might call ...] (I meant to imply this closure,
>Dan) malloc must also be a wrapper that passes the malloc arena pointer to the
>real routine, which must in turn pass the arena pointer to any other real
>routine that ever might call [a real routine that ever might call ...] real
>malloc.  Whether this is "implemented well" is certainly a matter of opinion.
>Passing this arena pointer around violates all reasonable coupling rules.
>Better to pass a pointer to a structure containing all global data,
>in a hidden register if possible, else as the first or last (by convention)
>arg to every routine.

No, it only requires that every routine which may ever call malloc, either
directly or indirectly, use the externally visible interface.

Any routine which directly calls malloc invokes malloc via its external
interface at a well known address.  All the normal rules for binding the
executable ensure that the external interface is bound - it is the only
symbol that can satisfy the undefined reference to malloc.  If, for example,
I invoke some stdio routine that gets a buffer from malloc, I will collect
the undefined symbols from that module and attempt to resolve them.  I find
malloc to be one of those symbols, and I load the static part.  If my
routine calls a routine that calls a routine ... I eventually find malloc
as an undefined symbol, and load the static part.

By placing the address of certain routines in a jump-table that resides
at a well-known address, it is possible to find the statically bound
malloc wrapper.  Internal routines do not need to have the arena handle
because they have the external interface address, and the external
interface is the only routine that needs to know about the handle.

>Better to have a wrapper for every function and pass one pointer
>for every function than to try to guess ahead of time which functions
>might lead to a call to malloc and which functions might lead to
>a reference to _iob.  Of course, this requires a single structure
>definition that contains the malloc arena as well as _iob and any
>other globals that might be needed, which is grossly bad coupling,
>although you could build the structure up cleanly from pieces in
>various modules, avoiding the need to couple all this disparate info.

Sure, if that were the case, it would make sense to pass a pointer
to the global shared data.  However, it isn't the case - it is
possible to determine the name of every routine that is invoked, and
to bind the required parts.

Here is a construction proof that malloc can be implemented in the
fashion I describe -

	1). It is possible to code a function which is pure
	    text and has no external data references (simple,
	    we all agree you can have a pure-code shared library
	    routine) and have it referenced by non-library
	    code.
	2). It is possible to code a function which is impure
	    and unshared.  (again pretty simple - we do it
	    all the time.)
	3). It is possible to have a shared library function
	    invoke unshared code (we publish the address of
	    the function at some well-known address, so no
	    linkage is required)
	4). Define malloc() to be an unshared library routine
	    which contains private data [ Just a definition,
	    and permitted by 2) and 3) ] and is invoked by
	    an unknown number of shared library functions.
	5). Implement malloc() so that it invokes shared
	    pure-text library functions and passes locally
	    defined static variables as arguments [ permitted
	    by 1) and 2) - that is, I can make a shared library
	    routine, and now I'm going to invoke it. ]
	6). Define the shared library function which our
	    unshared malloc() invokes to perform the
	    operations which malloc() traditionally performs,
	    using a passed structure pointer as its argument.
	    [ another definition - our pure-code malloc'()
	    can do whatever it wants with the contents of
	    the structure ].
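
Steps 4) through 6) can be sketched in C roughly as follows, assuming
a trivial bump allocator in place of a real free-list malloc and a
fixed static buffer in place of sbrk().  The names pure_alloc,
wrapper_malloc and struct arena are made up for the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* State block that the pure routine operates on. */
struct arena {
    unsigned char *base;
    size_t size;
    size_t used;
};

/* Step 6: the sharable malloc'() part.  It touches no static data;
 * everything comes in through the pointer. */
void *pure_alloc(struct arena *a, size_t n)
{
    size_t rounded = (n + 7u) & ~(size_t)7u;   /* round size up to 8 */
    assert(a != NULL);
    if (rounded < n || rounded > a->size - a->used)
        return NULL;                           /* overflow or no room */
    a->used += rounded;
    return a->base + (a->used - rounded);
}

/* Steps 4 and 5: the unshared wrapper.  It owns the private data and
 * passes its address to the pure routine. */
static unsigned char heap[4096];
static struct arena private_arena = { heap, sizeof heap, 0 };

void *wrapper_malloc(size_t n)
{
    return pure_alloc(&private_arena, n);
}
```

Only wrapper_malloc() and its static data need to be bound into each
executable; pure_alloc() has no writable data of its own.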

By induction every other library function can be implemented in the
same manner.  Indeed, malloc'() probably needs to invoke a function
which is a wrapper about the sbrk() or brk() system calls since they
traditionally keep the end of the break as a variable and one or
the other calls the system after some manipulation has been performed
on the user's argument and the current break variable value.

>>[ Hint: How does the system get errno out of the kernel and
>>  into the user space, if it is a user space global variable? ]
>
>Here's where DMR's bad CS becomes evident; system call interfaces should
>have taken a pointer through which to store the error number (or an error
>structure, for more detailed info), instead of the global errno hack.

That then is your answer.  However Ritchie did it, I'll do it too ;-)
-- 
John F. Haugh II        | Distribution to  | UUCP: ...!cs.utexas.edu!rpp386!jfh
Ma Bell: (512) 255-8251 | GEnie PROHIBITED :-) |  Domain: jfh@rpp386.cactus.org
"If liberals interpreted the 2nd Amendment the same way they interpret the
 rest of the Constitution, gun ownership would be mandatory."

jfh@rpp386.cactus.org (John F Haugh II) (05/19/91)

In article <162950@felix.UUCP> ka@felix.UUCP (Kenneth Almquist) writes:
>jfh@rpp386.cactus.org (John F Haugh II) writes:
>> Name a single library function which cannot be implemented well
>> without global variables.  Justify your answer.
>
>How about:
>
>1)  getpwent and relatives -- needs global variables to hold the current
>    state and the returned passwd structure.  (Placing these in a structure
>    which is passed to getpwent would complicate the calling program,
>    especially if the passwd file is examined in several places.)
>
>2)  malloc/realloc/free -- requires a global data structure to keep track
>    of which areas of memory are free.
>
>3)  stdio/exit -- needs a global variable so that exit can locate and close
>    all open files.
>					Kenneth Almquist

Kenneth, you should know better.  The answer is going to be to separate the
function into two parts.  The first part we call "the hard part".  The
second part we call "the easy part."

The hard part deals with the issues that you raise concerning the persistent
state or other global static data.  The easy part is implemented using the
traditional techniques for pure-text programming, which is simply don't use
no global data, let the hard part provide you the address of the global
variables.

See my other recent posting describing how to construct the routines you've
listed above.  I'll admit that the statement "without global variables" was
very poorly worded.  In the context of the discussion, I intended to say
"without global variables in the shared library segment", since the topic
at hand is shared libraries.
-- 
John F. Haugh II        | Distribution to  | UUCP: ...!cs.utexas.edu!rpp386!jfh
Ma Bell: (512) 255-8251 | GEnie PROHIBITED :-) |  Domain: jfh@rpp386.cactus.org
"If liberals interpreted the 2nd Amendment the same way they interpret the
 rest of the Constitution, gun ownership would be mandatory."

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (05/19/91)

In article <19311@rpp386.cactus.org> jfh@rpp386.cactus.org (John F Haugh II) writes:
> See my other recent posting describing how to construct the routines you've
> listed above.  I'll admit that the statement "without global variables" was
> very poorly worded.  In the context of the discussion, I intended to say
> "without global variables in the shared library segment", since the topic
> at hand is shared libraries.

Okay. Do you agree, then, that global variables (or at least static
variables) can be part of a well-designed, well-programmed library? In
other words, do you agree that there are well-designed, well-programmed
libraries (like malloc() or stdio) which cannot be put into a ``pure''
shared library without some work? In other words, do you agree that
``pure'' shared libraries restrict a good programmer, by forcing him to
do extra work to make some of his libraries sharable? In other words, do
you agree that the feature of ``impure'' shared libraries is indeed
beneficial in some cases?

Good. Glad we settled that.

---Dan

jfh@rpp386.cactus.org (John F Haugh II) (05/20/91)

In article <23997:May1901:27:0891@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>Okay. Do you agree, then, that global variables (or at least static
>variables) can be part of a well-designed, well-programmed library?

Sure.  But you can't get to the rest of your argument by this point.

> In
>other words, do you agree that there are well-designed, well-programmed
>libraries (like malloc() or stdio) which cannot be put into a ``pure''
>shared library without some work?

Nope.  Define "some work".  I don't know what the hell your agenda
is, but I really resent this notion that the implementations that _do_
exist somehow _don't_ exist.

>                                   In other words, do you agree that
>``pure'' shared libraries restrict a good programmer, by forcing him to
>do extra work to make some of his libraries sharable? In other words, do
>you agree that the feature of ``impure'' shared libraries is indeed
>beneficial in some cases?

No.  Pure code has its own rewards, in particular, it can be used for
purposes which the author originally did not imagine.  For example, if
malloc() took the address of a structure describing 1) the arena in
some abstract terms, 2) the address of a function to get more memory,
and 3) the address of a function to return unneeded memory, the
standard malloc() function could be adapted to work with kernel
virtual memory, local static memory, mapped file memory, etc.  Instead,
it is married to the user application and memory in the data segment
only.  Sounds like quite the well-designed piece of code to me.  The
more you pin down with hidden state the less you are free to adapt
the code to your own uses.  As just one example, what is the difference
between DBM and NDBM?  DBM uses a global data structure and supports
at most one open database at a time.  NDBM takes a handle to the
open database and supports multiple open databases.  Is this a
coincidence?
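
To make the contrast concrete, here is a toy sketch of a handle-based
interface with a global-state layer on top of it.  This is not the
real DBM or NDBM API; struct store, store_put, store_get, g_put and
g_get are all hypothetical names.

```c
#include <assert.h>
#include <string.h>

#define STORE_MAX 8

struct kv { const char *key; int val; };
struct store { struct kv items[STORE_MAX]; int count; };

/* Handle-based routines: all state arrives through the pointer, so any
 * number of stores can be live at once. */
int store_put(struct store *s, const char *key, int val)
{
    assert(s != NULL && key != NULL);
    if (s->count >= STORE_MAX)
        return -1;
    s->items[s->count].key = key;
    s->items[s->count].val = val;
    s->count++;
    return 0;
}

int store_get(const struct store *s, const char *key, int *out)
{
    int i;
    assert(s != NULL && key != NULL && out != NULL);
    for (i = 0; i < s->count; i++) {
        if (strcmp(s->items[i].key, key) == 0) {
            *out = s->items[i].val;
            return 0;
        }
    }
    return -1;
}

/* Global-state layer: one hidden store, hence at most one "open
 * database" per process. */
static struct store the_store;

int g_put(const char *key, int val) { return store_put(&the_store, key, val); }
int g_get(const char *key, int *out) { return store_get(&the_store, key, out); }
```

The handle-based routines are pure in the sense used above, and the
global-state layer is a two-line wrapper around them.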

>Good. Glad we settled that.

In your mind, Dan.  In your mind.
-- 
John F. Haugh II        | Distribution to  | UUCP: ...!cs.utexas.edu!rpp386!jfh
Ma Bell: (512) 255-8251 | GEnie PROHIBITED :-) |  Domain: jfh@rpp386.cactus.org
"If liberals interpreted the 2nd Amendment the same way they interpret the
 rest of the Constitution, gun ownership would be mandatory."

jim@segue.segue.com (Jim Balter) (05/20/91)

In article <519a6ad6.20b6d@apollo.HP.COM> goykhman_a@apollo.HP.COM (Alex Goykhman) writes:
>    Fclose(NULL) is only defined within the context of a single process, all it 
>    needs to do is to go through the per-process fd table and close every fd that
>    remains open.  It should be easy to implement this call as a part of a shared
>    library, and I am not sure what kind of overhead you are referring to.

The subject at hand was global data, not ease of implementation.  In order to
implement fclose(NULL), there must be a global pointer to the head of a list of
FILE's, or a global table of FILE's.  Of course, given the global data, the
implementation is trivial.  The fd table is in the u-structure and isn't really
relevant to a discussion of stdio routines (unless you want to provide a system
call to allow a shared library to access global data saved in the u-structure;
a conceptually interesting but non-pragmatic approach).  Note that I brought up
fclose(NULL) because it is contrary to jfh's point about pointers to state info
(FILE *) being explicitly passed to stdio routines.

The overhead referred to is the overhead of a wrapper routine to pass the
global data (maintained per-process) to the "real" routine in the shared
library.  This was all pretty evident from a careful reading of the thread.

mohta@necom830.cc.titech.ac.jp (Masataka Ohta) (05/20/91)

It should be noted this thread of a library of pure text is meaningless.

In article <19310@rpp386.cactus.org>
	jfh@rpp386.cactus.org (John F Haugh II) writes:

>By placing the address of certain routines in a jump-table that resides
>at a well-known address,

He is now referring to assigning static addresses (a well-known address).

If we can have statically assigned addresses, it is easy to have global
variables, because they can also have statically assigned addresses.

So, pureness is not necessary.

Of course, it is wrong to simply assume we can assign static addresses.

						Masataka Ohta

jim@segue.segue.com (Jim Balter) (05/20/91)

In article <519a55d6.20b6d@apollo.HP.COM> goykhman_a@apollo.HP.COM (Alex Goykhman) writes:
>Malloc() is always called in a particular context, and always has to deal with a single
>set of memory structures.  It should not matter whether malloc() is linked to a
>particular process statically, or dynamically.

The thread is about global data.  The issue is how to pass "a particular
context" to routines in a shared library when that context is not passed as
explicit arguments to those routines.  What do you mean by "it should not
matter"?  Is that a moral statement?  We are discussing implementation issues,
not waving our hands vaguely.  Certainly the details of access to static data
differ ("matter") between statically and dynamically linked libraries, and
especially between libraries that are copied per-process, whether statically
or dynamically linked, and those that are shared among processes.

goykhman_a@apollo.HP.COM (Alex Goykhman) (05/23/91)

Reply-To: goykhman_a@apollo.HP.COM (Alex Goykhman)
Organization: Hewlett-Packard Apollo Division - Chelmsford, MA

In article <7616@segue.segue.com> jim@segue.segue.com (Jim Balter) writes:
>In article <519a6ad6.20b6d@apollo.HP.COM> goykhman_a@apollo.HP.COM (Alex Goykhman) writes:
>>    Fclose(NULL) is only defined within the context of a single process, all it 
>>    needs to do is to go through the per-process fd table and close every fd that
>>    remains open.  It should be easy to implement this call as a part of a shared
>>    library, and I am not sure what kind of overhead you are referring to.
>
>The subject at hand was global data, not ease of implementation.  In order to
>implement fclose(NULL), there must be a global pointer to the head of a list of
>FILE's, or a global table of FILE's.  Of course, given the global data, the
>implementation is trivial.  The fd table is in the u-structure and isn't really
>relevant to a discussion of stdio routines (unless you want to provide a system
>call to allow a shared library to access global data saved in the u-structure;
>a conceptually interesting but non-pragmatic approach).  Note that I brought up
>fclose(NULL) because it is contrary to jfh's point about pointers to state info
>(FILE *) being explicitly passed to stdio routines.
>
>The overhead referred to is the overhead of a wrapper routine to pass the
>global data (maintained per-process) to the "real" routine in the shared
>library.  This was all pretty evident from a careful reading of the thread.

    What I was getting at is this: the issue is not the presence of "global"
    data vs. the lack of it, nor is it "pure" vs. "impure" routines.

    The issue here is static vs. dynamic linking.  In other words, the issue is
    whether external address references are resolved the old-fashioned way
    by statically linking everything into one big happy executable, or they
    are resolved at run time via a dynamic linking (call it "shared library")
    mechanism, or via a combination of both.

    It should not matter if a routine is in a shared library or not, as long as 
    every external reference the routine employs is resolved to the same value in
    both cases.  In fact, there is no reason for a routine to know if it is a part 
    of a shared library.   While a particular shared library mechanism is bound
    to be influenced by the underlying hardware, I can't think of a modern
    computer architecture that would dictate individual "wrappers" for "impure" 
    shared library routines.
    
    As to overhead, every time we are moving from a compiler (linker) to an 
    interpreter (shared libraries mechanism), we are trading speed for memory.
    Given the current level of technology where cpus are already well ahead of 
    memories, and getting more so as compilers (RISC) are replacing interpreters
    (CISC),  shared libraries can play an important role in balancing a system's
    workload, therefore increasing the system's throughput.