[comp.lang.c] Keeping executables small

scs@adam.pika.mit.edu (Steve Summit) (01/08/89)

In article <8634@bloom-beacon.MIT.EDU> I described a way, using
a global function pointer, to keep _cleanup and the rest of stdio
out of programs which don't need it, while still having exit
call _cleanup when appropriate.

In article <9295@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>Your technique is equivalent to use of atexit(), but specialized to
>support just STDIO.  (It is more efficient, though...)

Doug is correct.  In the case of exit/cleanup, the global
function pointer technique is embarrassingly close to atexit,
amounting to a special-cased implementation of the same concept,
which I would have mentioned in my article if I had realized it
in time.  (Efficiency, run-time efficiency anyway, was hardly my
concern.)
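
The hook discussed above can be sketched roughly as follows.  This is
only an illustration under stated assumptions, not the actual library
source: the names my_exit_hook, my_stdio_cleanup, my_stdio_init,
my_run_exit_hook and my_flush_count are hypothetical stand-ins for the
real exit and _cleanup, and the pieces are shown in one file although
in a library they would sit in separate object files.

```c
#include <stddef.h>

/* Would be defined alongside exit: stays NULL until stdio registers.
 * (Hypothetical name, not the real library variable.) */
void (*my_exit_hook)(void) = NULL;

/* Stands in for the real buffer-flushing work, so the effect is visible. */
int my_flush_count = 0;

/* Would live in the stdio object file, next to the buffering code. */
void my_stdio_cleanup(void)
{
    my_flush_count++;
}

/* Called by any stdio routine that sets up buffered output.
 * Referencing my_stdio_cleanup here is what drags it into the link. */
void my_stdio_init(void)
{
    my_exit_hook = my_stdio_cleanup;
}

/* What exit would do just before terminating the process.
 * Returns 1 if a cleanup routine was registered and called, else 0. */
int my_run_exit_hook(void)
{
    if (my_exit_hook != NULL) {
        (*my_exit_hook)();
        return 1;
    }
    return 0;
}
```

A program that never calls my_stdio_init leaves the pointer NULL, so
the exit path has no reference to the cleanup code.  With atexit, the
registration line would instead read atexit(my_stdio_cleanup).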

The global function pointer technique is quite useful, and its
benefits are more obvious, in cases where a general function-call
registering mechanism (such as atexit) is neither available nor
appropriate.  An example will illustrate.

In one of my (many) device-dependent libraries of Unix-style
plot subroutines, I have a function which draws dashed lines (the
hardware doesn't support them).  The code is quite large (it
draws nice looking dashed lines under almost all circumstances),
but is only needed if the calling program calls linemod() (to
request a line style of other than the default "solid").  Yet the
dashed-line drawing code is called from the main line-drawing
routine, which is called by virtually all graphics programs, and
so would at first seem to be loaded whether needed or not.

In fact, line() calls _dashline() indirectly through a function
pointer:

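	#include <stdio.h>	/* for NULL only; a header pulls in no library code */

	/* set by linemod(); stays NULL in programs that never call it */
	int (*_dashlinefunc)() = NULL;

	int _linestyle = 0;	/* 0 is the default "solid" style */

	line(x1, y1, x2, y2)
	int x1, y1, x2, y2;
	{
	if(_linestyle != 0 && _dashlinefunc != NULL)
		(*_dashlinefunc)(x1, y1, x2, y2);
	else	{
		/* draw simple, solid line */
		}
	}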

In a separate file (linemod.c):

	extern int _linestyle;
	extern int (*_dashlinefunc)();
	extern int _dashline();

	linemod(style)
	char *style;
	{
	/* parse style string and set _linestyle accordingly */
	_dashlinefunc = _dashline;
	}

A program that never calls linemod() only draws solid lines and
needs neither linemod() nor _dashline().  If and when the program
calls linemod(), linemod.o is pulled out of the library (during
the next recompile and relink), which adds _dashline to the
undefined external list, which then (and only then) pulls in all
of the dashed-line machinery.  (The code for _dashline could
equivalently be included in the same source file with linemod.)
In any case, _dashline is not directly an undefined external of
the normal line function, even though that is the only spot from
which it is called.
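
The arrangement above can be tried out in a single file, as in the
following sketch.  It is not the library's actual code: the names
plot_line, plot_linemod, dash_line, dashline_hook and line_style are
hypothetical (chosen to avoid the reserved leading underscore), the
two call counters exist only to make the chosen path observable, and
the style parsing is reduced to "anything other than solid is dashed".
The comments mark which object file each piece would belong to.

```c
#include <stddef.h>
#include <string.h>

/* ---- would be line.c: linked into every graphics program ---- */

int (*dashline_hook)(int, int, int, int) = NULL; /* set by plot_linemod */
int line_style = 0;     /* 0 means the default solid style */
int solid_calls = 0;    /* illustration only: counts solid lines drawn */
int dashed_calls = 0;   /* illustration only: counts dashed lines drawn */

int plot_line(int x1, int y1, int x2, int y2)
{
    /* The only call to the dashed-line code goes through the pointer,
     * so this file holds no external reference to dash_line. */
    if (line_style != 0 && dashline_hook != NULL)
        return (*dashline_hook)(x1, y1, x2, y2);

    solid_calls++;      /* stands in for drawing a simple, solid line */
    return 0;
}

/* ---- would be linemod.c: linked only if plot_linemod is called ---- */

int dash_line(int x1, int y1, int x2, int y2)
{
    (void)x1; (void)y1; (void)x2; (void)y2;
    dashed_calls++;     /* stands in for the large dashed-line code */
    return 0;
}

void plot_linemod(const char *style)
{
    /* Simplified parsing: any style other than "solid" selects dashes. */
    line_style = (strcmp(style, "solid") != 0);

    /* This reference is what pulls dash_line into the executable. */
    dashline_hook = dash_line;
}
```

Calling plot_line before any plot_linemod call takes the solid path;
after plot_linemod("dotted") the same call is routed to dash_line, and
plot_linemod("solid") switches it back.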

(The technique requires a mildly objectionable stylistic
concession, in that the defining instances of the global
variables _linestyle and _dashlinefunc cannot be in the file with
the rest of the dashed-line code, where it seems they belong.)

If the authors of large, general-purpose libraries would pay
attention to which routines are required by all callers (as
opposed to those needed only by programs using the more exotic
features), and then use a few little tricks to make the linker's
behavior more closely match the actual call pattern (i.e., include
only those modules expected to be actually called), we wouldn't
hear as many complaints about huge, bloated executables.

(In the case of exit and stdio, I suspect that the excuse for not
taking steps to make loading of most of stdio truly optional is
that the number of useful programs that don't use stdio is
small.)

In the case of other libraries, such as plotting packages, window
managers, or user interface subsystems, the potential for
substantial trimming of executables is significant.  Associated
with most such libraries are horror stories about the incredible
resultant size of apparently simple programs (open a window and
print "Hello, world!" in it, for instance).

                                            Steve Summit
                                            scs@adam.pika.mit.edu

flaps@dgp.toronto.edu (Alan J Rosenthal) (01/12/89)

scs@adam.pika.mit.edu (Steve Summit) writes:
>In the case of other libraries ...  the potential for
>substantial trimming of executables is significant.  Associated
>with most such libraries are horror stories about the incredible
>resultant size of apparently simple programs (open a window and
>print "Hello, world!" in it, for instance).

On the other hand, don't forget that "Hello, world" isn't a typical
program.  If it's 32K, but some much larger programs are 32K too, this
might be fine.  Optimizing for "Hello, world" rather than for larger
programs would be a mistake.

ajr

--
"The goto statement has been the focus of much of this controversy."
	    -- Aho & Ullman, Principles of Compiler Design, A-W 1977, page 54.

henry@utzoo.uucp (Henry Spencer) (01/16/89)

In article <8658@bloom-beacon.MIT.EDU> scs@adam.pika.mit.edu (Steve Summit) writes:
>(In the case of exit and stdio, I suspect that the excuse for not
>taking steps to make loading of most of stdio truly optional is
>that the number of useful programs that don't use stdio is
>small.)

Don't forget, also, that stdio was meant to be cheap enough that there
wouldn't be any significant reason *not* to use it.  The one part that
was significantly bulky in the original implementation, printf's code
for floating-point conversions, was organized so that its loading was
indeed optional.

Unfortunately, all too many Unix suppliers nowadays don't care how big
the executables are... after all, it helps sell memory...
-- 
"God willing, we will return." |     Henry Spencer at U of Toronto Zoology
-Eugene Cernan, the Moon, 1972 | uunet!attcan!utzoo!henry henry@zoo.toronto.edu