[comp.sys.atari.st] C Compiler Startup Code

HOWESDW@wsuvm1.BITNET (01/08/87)

Received: by WSUVM1 (Mailer X1.23) id 7097; Thu, 08 Jan 87 12:57:14 PLT
Date:         Thu, 08 Jan 87 12:56:28 PLT
From:         Don Howes <HOWESDW@WSUVM1>
Subject:      C Compiler Startup Code
To:           INFO-ATARI16@SCORE.STANFORD.EDU


In a recent posting, Moshe Braner asked why the code generated by a C
compiler is longer than that for an equivalent assembler program, and if
it would be possible to eliminate the additional code. In a word, the
answer to that is no, since the additional code is generated by the
compiler to handle program startup and termination.

(The following discussion is abstracted from Rex Jaeschke, 1986,
"Solutions in C", chapter 6, "Program Startup and Termination", pp. 169-194;
see his account for the details.)

Unlike other programming languages, C does not make a distinction between
the main() function and other C functions. In addition, the ability to pass
arguments to main() (argc, argv) suggests the presence of additional code
which calls main() and acts as the actual entry point for the program. This
code, generally called _main, is the startup code for the compiler and is
present in every program generated by the compiler.

The startup code performs a number of tasks, such as:
(1) setting up the stack for the definition of auto variables and the storage
    of function argument lists.
(2) reserving space for the heap, to perform dynamic memory allocation by
    malloc() and calloc().
(3) ensuring the correct opening of the files, "stdin", "stdout", "stderr".
(4) passing "argc" and "argv" to main().
(5) ensuring the graceful termination of the program.
(6) an additional series of environment-specific functions related to the
    hardware/software environment (these vary from system to system).
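
As a rough illustration of tasks (4) and (5), here is a sketch in portable C
of what a startup routine does around the call to main(). It is not taken
from any of the compilers discussed here. The names fake_startup, split_args,
count_args and MAX_ARGS are made up for this example, and a real startup
module is normally written in assembler and tied to one compiler and one
operating system.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

#define MAX_ARGS 16   /* hypothetical limit, chosen for this sketch */

/* Split a writable command line on whitespace, in place.
   argv[0] gets a dummy name; returns the resulting argc. */
int split_args(char *cmdline, char *argv[], int max_args)
{
    int argc = 0;
    char *p = cmdline;

    assert(max_args >= 2);
    argv[argc++] = "prog";            /* real name is often unavailable */

    while (*p != '\0' && argc < max_args - 1) {
        while (isspace((unsigned char)*p))
            p++;                      /* skip blanks before an argument */
        if (*p == '\0')
            break;
        argv[argc++] = p;             /* start of one argument */
        while (*p != '\0' && !isspace((unsigned char)*p))
            p++;
        if (*p != '\0')
            *p++ = '\0';              /* terminate the argument */
    }
    argv[argc] = NULL;                /* argv[argc] is a null pointer */
    return argc;
}

/* Stand-in for the true entry point: build argc/argv, call the
   program's main-like function, and return its result. A real startup
   routine would pass that result to the operating system's exit call. */
int fake_startup(const char *raw_cmdline, int (*entry)(int, char **))
{
    char buf[128];
    char *argv[MAX_ARGS];
    int argc;

    strncpy(buf, raw_cmdline, sizeof buf - 1);
    buf[sizeof buf - 1] = '\0';
    argc = split_args(buf, argv, MAX_ARGS);
    return entry(argc, argv);
}

/* Example "program": returns its argument count as the exit status. */
int count_args(int argc, char **argv)
{
    (void)argv;
    return argc;
}
```

Calling fake_startup("a b  c", count_args) hands count_args an argc of 4: the
dummy program name plus three arguments. Even this small amount of work pulls
in string and character-classification routines, which gives some idea of
where the extra bytes come from.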

To check the size of the startup code for your compiler, compile and
link the shortest legal C program:

         main()
         {}

Here are the results from some compilers I have access to, to compare
against:

Ecosoft Eco-C88              (MSDOS)          1536 bytes
Datalight C                  (MSDOS)          2674 bytes
Microsoft C 4.0              (MSDOS)          1986 bytes
Alcyon C (pre 4.14)          (GEM)            6271 bytes

The much larger size for the Alcyon compiler may be related to a more
complex environment, or simply to bloated code. What do other people get
for their ST C compilers?

Don Howes     HOWESDW@WSUVM1  (BITNET)

braner@batcomputer.tn.cornell.edu (braner) (01/09/87)

[]

The Megamax C compiler only adds about 1500 bytes or so (this is from
my memory...) for the startup code.  Not really that bad.  One trick it uses
is that if you write "main()" with no arguments it skips the I/O redirection
stuff.  Code for malloc(), for example, is linked in only if you actually
use it.  BUT: it includes a bunch of library modules, with descriptive
names like "fopen".  I can't believe all of that is really necessary,
and would like to reduce it even further.

- Moshe Braner

holloway@drivax.UUCP (Bruce Holloway) (01/09/87)

In article <8701082100.AA14694@ucbvax.Berkeley.EDU> HOWESDW%WSUVM1.BITNET@forsythe.stanford.edu writes:
>In a recent posting, Moshe Braner asked why the code generated by a C
>compiler is longer than that for an equivalent assembler program, and if
>it would be possible to eliminate the additional code. In a word, the
>answer to that is no, since the additional code is generated by the
>compiler to handle program startup and termination.

Alcyon C does not generate extra code to handle the "main" function. The
command line interpretation is handled by another small module that is linked
in later.

And the answer is 'yes, you CAN get rid of it'. If you use no command line,
and more importantly, none of the I/O subroutines, then you can replace it
with a simple module which merely shrinks the memory -- or if that's not
important, just be sure that the MAIN subroutine is the first piece of code
linked, and don't include anything at all.
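
For anyone wondering what "shrinks the memory" involves: GEMDOS hands a
newly launched program the largest free block of memory, so a replacement
startup module usually adds up the program's own segment lengths from its
basepage and gives the rest back with Mshrink(). The sketch below shows only
that arithmetic, in portable C. The struct and its field names are
illustrative, the stack size is whatever the programmer chooses, and the
actual Mshrink() call and stack pointer setup are left out because they
belong in the assembler part of the module.

```c
#include <assert.h>

#define BASEPAGE_SIZE 256UL   /* the basepage itself occupies 0x100 bytes */

/* Only the three lengths the calculation needs (names are illustrative). */
struct seg_lengths {
    unsigned long text_len;   /* size of the code segment */
    unsigned long data_len;   /* size of initialized data */
    unsigned long bss_len;    /* size of uninitialized data */
};

/* Number of bytes the program should keep, counted from the start of
   its basepage: basepage + text + data + bss + stack, rounded up to an
   even number so the top of the stack stays word aligned. */
unsigned long bytes_to_keep(const struct seg_lengths *s,
                            unsigned long stack_size)
{
    unsigned long total;

    assert(s != 0);
    total = BASEPAGE_SIZE + s->text_len + s->data_len + s->bss_len
            + stack_size;
    return (total + 1UL) & ~1UL;
}
```

With 1000 bytes of text, 200 of data, 51 of bss and a 4096-byte stack this
comes to 5604 bytes. That figure is what a startup module would pass to
Mshrink() as the new block size.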

I've used a lot of 'C' compilers, and I know of none that generate extra code
for the "main" subroutine. I have seen compilers that change the name of
the "main" subroutine, though.
-- 
....!ucbvax!hplabs!amdahl!drivax!holloway
"What do you mean, 'almost dead'?" "Well, when you stop breathing, and moving
around, and seeing things... that kind of almost dead."

dan@dshovax.UUCP (dan dexter) (01/16/87)

<<-- lineater beware -->>

Well, here's the sad news for those of us with Lattice C . . .

I have the Lattice 68000 C Compiler from MetaComCo Version 3.03.04 and
the linker signs on as GST 68000 Linker R132V039.
After compiling the empty program, here are the results:

           MAIN.PRG     10264 bytes
           MAIN.BIN        73 bytes

I linked using the standard C.LNK file that came with the compiler which
links in the file STARTUP.BIN and uses CLIB.BIN as a library.

           CLIB.BIN     55091 bytes
           STARTUP.BIN    887 bytes

This sure seems like a lot of extra stuff, especially when I compare it to what
has already been posted from other compilers.  UGH.

I have included the following numbers that have been posted for comparison.

>From:         Don Howes <HOWESDW@WSUVM1>
>
>Here are the results from some compilers I have access to, to compare
>against:
>
>Ecosoft Eco-C88              (MSDOS)          1536 bytes
>Datalight C                  (MSDOS)          2674 bytes
>Microsoft C 4.0              (MSDOS)          1986 bytes
>Alcyon C (pre 4.14)          (GEM)            6271 bytes
>
>From: LSI@UMass.BITNET (Peter Lawall, Logical Solutions, (413) 256-6800)
>Here are two more code sizes for the "empty program":
>main()
>{}
>
>Old Version of Alcyon C on CP/M68K: 9856 bytes!
>LightSpeed C on the Macintosh     : 1544 bytes
>
>From: braner@batcomputer.tn.cornell.edu (braner)
>
>The Megamax C compiler only adds about 1500 bytes or so (this is from
>my memory...)