[comp.arch] Smart Linking

lindsay@MATHOM.GANDALF.CS.CMU.EDU (Donald Lindsay) (06/07/90)

In article <caPYdHG00jdX06u1kN@cs.cmu.edu> Rick.Rashid@CS.CMU.EDU writes:
>The runtime size of a lot of X programs (e.g. xterm, xclock, X, etc.)
>varies strongly with the page size of the machine being used. 

>The easiest guess is that
>what you are looking at is heavy internal fragmentation due to the
>modularity of the code (most subroutines don't call other routines in
>the same module but rather those in other modules), the lack of
>an intelligent linker (which would figure that out and relocate routines)
>and memory allocation routines which are not working set sensitive.

Intelligent linking is a strangely unaddressed problem.  There has
been some work on this in the mainframe world - I recall articles in
the mid-70's, in the IBM Systems Journal (or IBM JRD?).  But, I don't
recall any Unix products.  Most OSes don't want you to know what
pages were in your working set, much less the order of the page
faults.  Of course, good information could still be gathered at the
user level, using the calling-convention techniques of "prof" and
"gprof".

This isn't a silly concern.  In a former life I speed-tuned my
company's product, some 600 KB of code.  It was clear to me that
Unix's "ld" was giving me dumb results. For example, there were 350
init routines that were only called on startup, but placing them
together was so hard that I didn't do it.
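
For what it's worth, here is roughly how such startup-only routines could be
herded together, assuming a gcc-style compiler that honors section
attributes; classic "ld" only lets you control layout by object-file order,
which is why doing it by hand across 350 routines was so painful.  The
section and routine names below are invented for illustration:

/* Tag startup-only code so it all lands in one section. */
#define INIT_TEXT __attribute__((section(".text.init")))

INIT_TEXT static void init_symbol_table(void) { /* startup-only work */ }
INIT_TEXT static void init_io_channels(void)  { /* startup-only work */ }

void program_startup(void)
{
    init_symbol_table();
    init_io_channels();
    /* ... the other init routines, tagged the same way, all land in
     * .text.init, so they share pages with each other instead of with
     * the routines that run for the life of the program. */
}

On a toolchain with a linker script or mapfile, .text.init can then be placed
at the tail of the text segment, paged in once at startup and never touched
again.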

Sounds like a job for OSF ... or someone battling the SPEC wars ...



-- 
Don		D.C.Lindsay 	leaving CMU .. make me an offer!

jeremy@cs.ua.oz.au (Jeremy Webber) (06/08/90)

In article <9557@pt.cs.cmu.edu> lindsay@MATHOM.GANDALF.CS.CMU.EDU (Donald Lindsay) writes:

   In article <caPYdHG00jdX06u1kN@cs.cmu.edu> Rick.Rashid@CS.CMU.EDU writes:
   >The runtime size of a lot of X programs (e.g. xterm, xclock, X, etc.)
   >varies strongly with the page size of the machine being used. 

   [ observations about page sizes and smart linking ]

Of course, if the X libraries have been made sharable (which they should be on
a machine which supports shared libraries) it becomes harder to allocate more
than one module to a page.  It's still worth pursuing on machines with
large page sizes, though.  Does anyone's linker handle this situation neatly?
--
Jeremy Webber			   ACSnet: jeremy@chook.ua.oz
Digital Arts Film and Television,  Internet: jeremy@chook.ua.oz.au
60 Hutt St, Adelaide 5001,	   Voicenet: +61 8 223 2430
Australia			   Papernet: +61 8 272 2774 (FAX)

rec@dg.dg.com (Robert Cousins) (06/08/90)

In article <JEREMY.90Jun8093813@chook.ua.oz.au> jeremy@cs.ua.oz.au (Jeremy Webber) writes:
>In article <9557@pt.cs.cmu.edu> lindsay@MATHOM.GANDALF.CS.CMU.EDU (Donald Lindsay) writes:
>
>   In article <caPYdHG00jdX06u1kN@cs.cmu.edu> Rick.Rashid@CS.CMU.EDU writes:
>   >The runtime size of a lot of X programs (e.g. xterm, xclock, X, etc.)
>   >varies strongly with the page size of the machine being used. 
>
>   [ observations about page sizes and smart linking ]
>
>Of course, if the X libraries have been made sharable (which they should on
>a machine which supports shared libraries) it becomes harder to allocate more
>than one module to a page.  It's still worth persuing though on machines with
>large page sizes though.  Does anyone's linker handle this situation neatly?
>Jeremy Webber			   ACSnet: jeremy@chook.ua.oz
>Digital Arts Film and Television,  Internet: jeremy@chook.ua.oz.au
>60 Hutt St, Adelaide 5001,	   Voicenet: +61 8 223 2430
>Australia			   Papernet: +61 8 272 2774 (FAX)

Actually, the issues of optimal linkage editing and shared libraries, while
similar in effect, offer additive opportunities for substantial improvement in
performance.  Let's talk about linking first.

Most machines have large enough caches (or set-associative caches) to allow
more than one routine to be resident at a time.  An intelligent linkage editor
should look at the call tree and assign routines to addresses so that, whenever
possible, routines which call each other do not flush one another out of the
cache, and so that their placement reduces the TLB thrashing which is becoming
more common as text areas grow.  There have been papers written on this
subject.
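
As a rough illustration of the kind of greedy placement such a linkage editor
might use (in the spirit of the published call-graph positioning work), the
toy sketch below chains caller/callee pairs together in decreasing weight
order, so the hottest pairs end up adjacent in the text segment.  The graph
and names are invented, not taken from any real linker:

#include <stdio.h>
#include <stdlib.h>

struct edge { int caller, callee; long weight; };

/* Sort edges by decreasing call count. */
static int cmp_edge(const void *a, const void *b)
{
    const struct edge *x = a, *y = b;
    return (y->weight > x->weight) - (y->weight < x->weight);
}

int main(void)
{
    /* Toy call graph: routine ids 0..3, weights from a profile. */
    struct edge e[] = { {0,1,900}, {1,2,850}, {0,3,40}, {2,3,10} };
    int n_routines = 4, n_edges = 4;
    int chain_of[4], next[4], prev[4], i;

    for (i = 0; i < n_routines; i++) {
        chain_of[i] = i; next[i] = -1; prev[i] = -1;
    }
    qsort(e, n_edges, sizeof e[0], cmp_edge);

    for (i = 0; i < n_edges; i++) {
        int a = e[i].caller, b = e[i].callee;
        /* Join only if a is a chain tail, b is a chain head, and they sit
         * in different chains, so each chain stays a simple sequence. */
        if (next[a] == -1 && prev[b] == -1 && chain_of[a] != chain_of[b]) {
            int c, old = chain_of[b];
            next[a] = b; prev[b] = a;
            for (c = 0; c < n_routines; c++)
                if (chain_of[c] == old) chain_of[c] = chain_of[a];
        }
    }

    /* Emit the layout: walk each chain from its head.  Routines in the
     * same chain would be placed back to back in the text segment. */
    for (i = 0; i < n_routines; i++)
        if (prev[i] == -1) {
            int c = i;
            printf("chain:");
            while (c != -1) { printf(" f%d", c); c = next[c]; }
            printf("\n");
        }
    return 0;
}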

The problems with shared libraries are twofold.  First, many developers hate to
use them because they increase the *perceived* support burden, since different
vendors may supply different libraries with subtle differences known as *bugs*.
Second, large libraries (such as some widget libraries) which require large
amounts of space ("hello world" with OSF/Motif is almost 1 megabyte on my 88K
machine) are often neither offered in nor designed for a shared library
format.  To address this problem, ABIs and BCSs must specify
the nature of shared libraries and provide viable certification tools. Shared
libraries will not, however, reduce the memory requirements for data space
which many of these libraries require in large quantities.

Robert Cousins
Dept. Mgr, Workstation Dev't
Data General Corp.

Speaking for myself alone.

beede@sctc.com (Mike Beede) (06/11/90)

jeremy@cs.ua.oz.au (Jeremy Webber) writes:

>Of course, if the X libraries have been made sharable (which they should on
>a machine which supports shared libraries) it becomes harder to allocate more
>than one module to a page.  It's still worth persuing though on machines with
>large page sizes though.  Does anyone's linker handle this situation neatly?

Yes.  Multics.  Unfortunately, the hardware/software combination is
too expensive . . . .


-- 
Mike Beede         Secure Computing Technology Corp
beede@sctc.com     1210 W. County Rd E, Suite 100           
			Arden Hills, MN  55112
                         (612) 482-7420