[comp.arch] Large executables: Where's the space going?

teami125@ics.uci.edu (klefstad 125b team i) (06/05/90)

With all this talk of 1.3Meg 'xclock' executables, etc., has anyone
done any digging to find exactly what is taking so much space? Obviously,
it's library code, et al., but which routines? Why? How much data?

It just floors me when a simple executable is that big. I didn't even
think the X libraries were that big! Every X library I can find, added
together, comes to only about 800K. Where's the other 500K?
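
For anyone inclined to do that digging, size(1) and nm(1) are the
obvious starting points, and on systems with a BSD-style a.out format
you can pull the numbers out yourself. Here's a minimal sketch,
assuming <a.out.h> defines the classic struct exec (field names vary
from port to port, so treat it as illustrative rather than portable):

/* aoutsize.c -- print the section sizes of a BSD-style a.out binary.
 * A first cut at "where's the space going?": text is code, data is
 * initialized data, bss costs nothing on disk, and a_syms is the
 * symbol table (which strip(1) removes).
 */
#include <stdio.h>
#include <a.out.h>

int main(int argc, char **argv)
{
    FILE *fp;
    struct exec hdr;

    if (argc != 2) {
        fprintf(stderr, "usage: aoutsize binary\n");
        return 1;
    }
    if ((fp = fopen(argv[1], "r")) == NULL) {
        perror(argv[1]);
        return 1;
    }
    if (fread(&hdr, sizeof hdr, 1, fp) != 1 || N_BADMAG(hdr)) {
        fprintf(stderr, "%s: not an a.out binary\n", argv[1]);
        return 1;
    }
    printf("text %lu  data %lu  bss %lu  syms %lu\n",
           (unsigned long)hdr.a_text, (unsigned long)hdr.a_data,
           (unsigned long)hdr.a_bss, (unsigned long)hdr.a_syms);
    return 0;
}

Running nm on the binary then lists which library routines actually
got dragged in, which is the "which routines?" half of the question.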

Also, one issue not yet mentioned in this discussion of 'small'
programming is the impact of sloppy linking. IMHO, a smarter linker
could alleviate much of this 'code bloat'. Obviously, this requires a
higher-level view of an object file than just an array of bytes, which
is difficult with existing C compilers and object-module formats. It
also requires library writers to be more careful in segmenting their
routines, so as not to suck in everything else when one routine is
referenced.
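
To make the segmentation point concrete, here is a hedged sketch (the
file and symbol names are invented). A classical Unix ld resolves
references at object-module granularity: pull one symbol out of a
library member and you get the whole member, plus everything it
references.

/* mathlib.c -- a hypothetical library source file.  Because tiny()
 * and huge() live in the same object module, a program that calls
 * only tiny() still links in huge() and its 100K table:
 *
 *     cc -c mathlib.c
 *     ar rv libmath.a mathlib.o
 *     cc -o prog prog.c libmath.a     (prog carries huge_table too)
 *
 * Put tiny() and huge() in separate source files -- hence separate
 * library members -- and ld can leave the unreferenced one behind.
 */
static char huge_table[100 * 1024] = { 1 };  /* initialized, so it
                                              * occupies 100K of the
                                              * data segment on disk */

int tiny(int x)
{
    return x + 1;
}

int huge(unsigned i)
{
    return huge_table[i % sizeof huge_table];
}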

Ada-language programs are a case in point. On the Unix system we use
here (a Sequent Symmetry running Verdix Ada), even a 'hello world'
program often comes to 100K. Much of that is obviously silly linking
and code generation by the compiler.

Even though some of these 'huge executable' problems are lessened with
shared libraries, they really don't make the problem go away. They just
amortize the bloat across every process.

                                           Steve Klein

ath@prosys.se (Anders Thulin) (06/06/90)

In article <266B4748.28579@paris.ics.uci.edu> teami125@ics.uci.edu (klefstad 125b team i) writes:
>
>With all this talk of 1.3Meg 'xclock' executables, etc., has anyone
>done any digging to find exactly what is taking so much space? Obviously,
>it's library code, et al., but which routines? Why? How much data?

I suspect that there's a mistake somewhere. I've checked the X
binaries on A/UX, Sun3(3.5), Aviion, Ultrix, and HPUX, and only in a
few cases has xclock (or any other binary) been larger than 300K.
None of these, it appears, uses dynamic linking.

It seems more likely that the 1.3Meg figure refers to libraries/code
compiled with debug options on, which easily produces executables
larger than a meg.
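
One quick way to test that theory, as a rough recipe (exact numbers
will vary by compiler and object format):

/* hello.c -- a trivial test case for measuring debug overhead.
 *
 *     cc -o hello hello.c         ; ls -l hello;    size hello
 *     cc -g -o hello.g hello.c    ; ls -l hello.g;  size hello.g
 *     strip hello.g               ; ls -l hello.g
 *
 * size(1) typically reports nearly identical text/data/bss for both
 * builds, because -g grows the symbol and debug tables, which sit
 * outside the loaded segments.  ls -l shows the difference -- and
 * strip(1) removes it.  If a 1.3Meg xclock strips down to a few
 * hundred K, the mystery is solved.
 */
#include <stdio.h>

int main(void)
{
    printf("hello, world\n");
    return 0;
}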

-- 
Anders Thulin       ath@prosys.se   {uunet,mcsun}!sunic!prosys!ath
Telesoft Europe AB, Teknikringen 2B, S-583 30 Linkoping, Sweden

chip@tct.uucp (Chip Salzenberg) (06/07/90)

According to ath@prosys.se (Anders Thulin):
>I suspect that there's a mistake somewhere. I've checked the X
>binaries on A/UX, Sun3(3.5), Aviion, Ultrix, and HPUX, and only in a
>few cases has xclock (or any other binary) been larger than 300K.

Any window system where a clock program binary is larger than 30K has
a problem.

No, I take that back.  Such a window system *is* a problem.
-- 
Chip Salzenberg at ComDev/TCT     <chip@tct.uucp>, <uunet!ateng!tct!chip>

davecb@yunexus.UUCP (David Collier-Brown) (06/07/90)

   Where? I don't care.
   Why?  Ah, now that's more to the point.

   It could be going many interesting places (to us denizens of comp.arch)
like compiler misoptimization or the "semantic gap" of the vCISC era, but
it's probably going somewhere dull and boring: to the learning curve.

   It seems like any time a new dimension is added to the computer
programming problem, we see a number of stereotyped results:
	1) the programs get large and slow
	2) management complains bitterly
	3) some strange old techniques resurface, many
		of which solve the wrong problem
	4) some organization/vendor releases a non-large, non-slow
		program
	5) several years later most organizations are writing
		small, fast versions.

    Dick McMurray said it best:
	First we get it right.
	Then we get it fast.
	Then we get it small.

--dave (history, how dull) c-b
-- 
David Collier-Brown,  | davecb@Nexus.YorkU.CA, ...!yunexus!davecb or
72 Abitibi Ave.,      | {toronto area...}lethe!dave 
Willowdale, Ontario,  | "And the next 8 man-months came up like
CANADA. 416-223-8968  |   thunder across the bay" --david kipling

Rick.Rashid@CS.CMU.EDU (06/07/90)

The runtime size of a lot of X programs (xterm, xclock, the X server,
and so on) varies strongly with the page size of the machine being
used.  In particular, the size of the same programs run on my Sun 3/60
(8K page) and my Toshiba 5200 laptop (4K page) varies by about a
factor of two.  Code densities for the 68000 and i386 are not
identical, but they differ by only on the order of 10%, so that cannot
be the explanation.  The easiest guess is that what you are looking at
is heavy internal fragmentation, due to the modularity of the code
(most subroutines call routines in other modules rather than in their
own), the lack of an intelligent linker (which would figure that out
and relocate routines to keep callers and callees together), and
memory allocation routines which are not working-set sensitive.
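
A toy simulation makes the fragmentation arithmetic concrete (the
constants below are invented for illustration, not measured from any
of the machines above):

/* fragsim.c -- toy model of internal fragmentation vs. page size.
 * N_TOUCHED small routines are scattered at random through a large
 * text segment; each routine the program touches faults in at least
 * one whole page of resident memory.
 */
#include <stdio.h>
#include <stdlib.h>

#define TEXT_SIZE  (4L * 1024 * 1024)  /* 4Meg of linked-in text */
#define N_TOUCHED  200                 /* routines actually executed */
#define ROUTINE_SZ 256                 /* bytes per routine */

static long resident(long pagesize)
{
    char *touched = calloc(TEXT_SIZE / pagesize, 1);
    long i, pages = 0;

    if (touched == NULL)
        return -1;
    srand(1);  /* identical routine placement for every page size */
    for (i = 0; i < N_TOUCHED; i++) {
        long addr = (long)((double)rand() / ((double)RAND_MAX + 1.0)
                           * (TEXT_SIZE - ROUTINE_SZ));
        long p;

        for (p = addr / pagesize;
             p <= (addr + ROUTINE_SZ - 1) / pagesize; p++)
            if (!touched[p]) {
                touched[p] = 1;
                pages++;
            }
    }
    free(touched);
    return pages * pagesize;
}

int main(void)
{
    printf("4K pages: %ldK of text resident\n", resident(4096) / 1024);
    printf("8K pages: %ldK of text resident\n", resident(8192) / 1024);
    return 0;
}

With small routines scattered through a large text segment, the
8K-page figure comes out nearly double the 4K one -- the same factor
of two as the Sun/Toshiba comparison.  A linker that clustered the
actually-executed routines together would shrink both numbers.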