[comp.lang.ada] Sizes of executables from Unix compilers

mfeldman@seas.gwu.edu (Michael Feldman) (06/05/90)

Thought I'd try to shed a little light on the executable-size issue, since
it has come up again. First, a few statistics for "hello, world." Here
are 2 programs:

with Text_IO;
procedure Hello is
begin
  Text_IO.Put_Line("Hello, World.");
end Hello;

#include <stdio.h>
main()
{
   printf("Hello, World\n");
}

I compiled these using 4 Ada compilers and 2 C compilers. Here are the
sizes of the executables, with no optimization switches or anything else,
right out of the box:

C   (HP835)     34816
Ada (HP835)     86016

C   (Sun-3)     32768
Ada1(Sun-3)     57344
Ada2(Sun-3)    106496
Ada3(Sun-3)    139264

I am not identifying vendors because it serves no purpose to do so.
First of all, note that the C programs aren't so tiny either. Second,
the BIG differences in the Ada compilers suggest differing optimization
policies from the 4 vendors, for the _default_ case of no optimization.
My guess is that it has to do with how much of the Text_IO library they
are naively hauling into the executable. Given the small size of the
equivalent PC programs (as posted previously), I am conjecturing that
nobody is all that concerned about the _size_ of an executable intended
for a timesharing system, and that all vendors are probably optimizing 
for time rather than space.

To test my conjecture about Text_IO, I will try a program with no IO at all.

Regarding the size of the run-time system: there is absolutely nothing in
the "nature of the language" that precludes Ada programs being small. If
current compilers don't squeeze out every last byte, it's because "the market"
hasn't said it's that critical to do so. In any case the programs have
really come down in size over the successive compiler versions.

There is no need to link in a tasking kernel if no tasking is being done in 
the program. There is also no need to do unnecessary range checking, as the
LRM states very clearly.  The program only needs to check what it _needs_ to
check.

Suppose the programmer can second-guess the compiler about checking,
because (s)he knows that certain types or certain parts of the program
can be guaranteed not to raise Constraint_Error (or Numeric_Error)?
Well, using pragma SUPPRESS to cut out checking that the compiler leaves 
in is perfectly OK - that's why the pragma is in the language. 
This will make the program both smaller and faster, at the possible cost 
of erroneous behavior (because something happened in the program that
would've raised the exception if it hadn't been SUPPRESSed).

Here's another neat optimization issue: I believe that Ada programs have
the potential to be _faster_ than those in other languages if the type
system is used right:

Since A := B is defined for all nonlimited types, large arrays can
be copied with a single Ada statement instead of the loop that Fortran
or C would require. Assuming the target machine has a fast block copy,
the compiler can potentially implement A := B much faster than it could
do the corresponding explicit loop. Yes, I know that a _really_ smart
compiler could detect that the whole array was being copied and throw
the loop away anyhow, but how many compilers do you know of that are
this smart?
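
To make the comparison concrete, here is a minimal C sketch of the two
approaches. The function names and the array length N are hypothetical,
chosen only for illustration. Whether the block copy actually beats the
loop depends on the compiler and the target, so read this as a sketch of
the idea and not as a benchmark.

```c
#include <stddef.h>
#include <string.h>

#define N 1024  /* hypothetical array length, for illustration only */

/* Element-by-element copy: the explicit loop a C or Fortran
   programmer would write by hand. */
void copy_loop(double dst[N], const double src[N])
{
    for (size_t i = 0; i < N; i++)
        dst[i] = src[i];
}

/* Whole-array copy in one call: roughly what a compiler is free to
   emit for Ada's A := B. The library may use a block move if the
   target has one. */
void copy_block(double dst[N], const double src[N])
{
    memcpy(dst, src, N * sizeof dst[0]);
}
```

Both functions leave dst with the same contents; the difference is only
in how much freedom the implementation has to pick a fast copy.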

Similarly, the statement

A := (others => (others => (others => 1.0)));

for a 3-level array can be implemented very quickly.

The bottom line: compiler optimization is a function of the maturity of
the compilers and of the marketplace. I believe there is nothing in Ada
that requires its programs to be slower or larger than _comparable_
programs in other languages. Be sure to compare apples to apples. If you
allow Ada to do range checking, for example, make sure your C code has the
corresponding "if" statements to do the same thing.
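
As a rough illustration of what "apples to apples" means here, the C
sketch below shows an array store with and without the explicit "if".
The names (TABLE_SIZE, store_checked, store_unchecked) are hypothetical,
and returning false is only a stand-in for raising Constraint_Error; C
has no direct equivalent.

```c
#include <stdbool.h>
#include <stddef.h>

#define TABLE_SIZE 10  /* hypothetical table length */

/* Unchecked store: comparable to Ada code with the index check
   suppressed. An out-of-range index is undefined behavior. */
void store_unchecked(int table[TABLE_SIZE], size_t index, int value)
{
    table[index] = value;
}

/* Checked store: the explicit "if" that does the work of Ada's
   index check. Returns false where Ada would raise an exception. */
bool store_checked(int table[TABLE_SIZE], size_t index, int value)
{
    if (index >= TABLE_SIZE)
        return false;
    table[index] = value;
    return true;
}
```

A fair size or speed comparison against checked Ada code would use
store_checked; comparing against store_unchecked is fair only if the
Ada side has its checks suppressed as well.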

sommar@enea.se (Erland Sommarskog) (06/06/90)

Michael Feldman (mfeldman@seas.gwu.edu) gives the sizes of some
"hello world" programs:
>C   (HP835)     34816
>Ada (HP835)     86016
>
>C   (Sun-3)     32768
>Ada1(Sun-3)     57344
>Ada2(Sun-3)    106496
>Ada3(Sun-3)    139264

I don't think this is a language issue, but one of operating system.
Shared libraries are apparently not a standard feature on Unix; they
have been on VMS for as long as I have known it. I haven't tried a "Hello
world" on VMS, but it should be less than 10 blocks (= 5120 bytes).

Of course, an Ada system under Unix could do various optimizations
at link time to keep down the size; on the other hand, why adapt
to an ancient technology?
-- 
Erland Sommarskog - ENEA Data, Stockholm - sommar@enea.se

mcdonald@aries.scs.uiuc.edu (Doug McDonald) (06/07/90)

In article <1700@enea.se> sommar@enea.se (Erland Sommarskog) writes:
>Michael Feldman (mfeldman@seas.gwu.edu) gives the sizes of some
>"hello world" programs:
>>C   (HP835)     34816
>>Ada (HP835)     86016
>>
>>C   (Sun-3)     32768
>>Ada1(Sun-3)     57344
>>Ada2(Sun-3)    106496
>>Ada3(Sun-3)    139264
>
>I don't think this is a language issue, but one of operating system.

Perhaps. Let me add:

assembler(IBM-PC)    22
C(IBM-PC)         3-8000 depending

Does anybody recall how many (decimal digits, not bytes) it would be
on an IBM 1620 -  my guess would be 23 or 24 decimal digits (5 bits
per decimal digit). (This would, of course be machine language :-), 
input at the console typewriter.) Anybody out there still remember
the codes? 

Doug McDonald

diamond@tkou02.enet.dec.com (diamond@tkovoa) (06/07/90)

In article <1990Jun6.230600.4736@ux1.cso.uiuc.edu> mcdonald@aries.scs.uiuc.edu (Doug McDonald) writes:
>In article <1700@enea.se> sommar@enea.se (Erland Sommarskog) writes:
>>Michael Feldman (mfeldman@seas.gwu.edu) gives the sizes of some
>>"hello world" programs [deleted]
>>I don't think this is a language issue, but one of operating system.
>Perhaps. Let me add:
>assembler(IBM-PC)    22
>C(IBM-PC)         3-8000 depending
>Does anybody recall how many (decimal digits, not bytes) it would be
>on an IBM 1620 -  my guess would be 23 or 24 decimal digits (5 bits
>per decimal digit). (This would, of course be machine language :-), 

Awright, you asked for it.
It's impossible.
OK, here goes for a "HELLO WORLD" program (which is possible in upper
case), and I've forgotten the typewriter control digit for carriage
return so I put a ? in the listing below.  Also there's no record
mark in ASCII, so I put a # in the listing.
locn.  instruction   mnemonic
00000  360002700100  WATY 27 (write alpha to ty, starting locn. 00026/00027)
00012  34000000010?  RCTY    (return carriage on typewriter)
00024  484845535356  H2      (halt)  Note that the addresses are ignored
         ||||||||||                  in a halt instruction, so we start
        //////////                   using that space for data instead of
       ||||||||||                    wasting it.
00026  48455353560066565953440#   DAC 12,HELLO WORLD@  (define alpha const)
So you would just type in:
36000270010034000000010?4848455353560066565953440#
Looks like fifty digits.

Typical software management here, underestimated by a factor of 2.

-- 
Norman Diamond, Nihon DEC     diamond@tkou02.enet.dec.com
Proposed group comp.networks.load-reduction:  send your "yes" vote to /dev/null.

arny@cbnewsl.att.com (arny.b.engelson) (06/09/90)

In article <1930@sparko.gwu.edu> mfeldman@seas.gwu.edu (Michael Feldman) writes:
>Thought I'd try to shed a little light on the executable-size issue, since
>it has come up again. First, a few statistics for "hello, world." Here
>are 2 programs:
>
>with Text_IO;
>procedure Hello is
>begin
>  Text_IO.Put_Line("Hello, World.");
>end Hello;
>
>#include <stdio.h>
>main()
>{
>   printf("Hello, World\n");
>}

I can't believe I'm doing this.  I got curious enough to go run these on
a VAX/VMS system using DEC's Ada and C compilers.  Given the results, I
decided to post them (and start a war I'm sure, but what the hell)...

First of all, I ran the compilers with their default settings, no special
optimizations, pragmas, options, nothing.  Second, the source looks exactly
like Mike's programs above.  Last, the results:

Ada executable =  6 blocks (512 byte blocks)
C   executable = 87 blocks   "

Obviously, I now conclude that Ada is 14.5 times more efficient than C.

Go and analyze this to death if you want, it's still one of the stupidest
language comparison tests I've ever seen; I just found the results amusing
enough to post.  But then I didn't get enough sleep last night and it's
been a long week.

  -- Arny Engelson   att!wayback!arny

mfeldman@seas.gwu.edu (Michael Feldman) (06/09/90)

In article <1990Jun8.213230.10693@cbnewsl.att.com> arny@cbnewsl.att.com (arny.b.engelson) writes:
>In article <1930@sparko.gwu.edu> mfeldman@seas.gwu.edu (Michael Feldman) writes:
[ a bunch of stuff deleted ]
>
>I can't believe I'm doing this.  I got curious enough to go run these on
>a VAX/VMS system using DEC's Ada and C compilers.  Given the results, I
>decided to post them (and start a war I'm sure, but what the hell)...
>
>First of all, I ran the compilers with their default settings, no special
>optimizations, pragmas, options, nothing.  Second, the source looks exactly
>like Mike's programs above.  Last, the results:
>
>Ada executable =  6 blocks (512 byte blocks)
>C   executable = 87 blocks   "
>
>Obviously, I now conclude that Ada is 14.5 times more efficient than C.
>
>Go and analyze this to death if you want, it's still one of the stupidest
>language comparison tests I've ever seen; I just found the results amusing
>enough to post.  But then I didn't get enough sleep last night and it's
>been a long week.

This may sound really humorless, but the whole point of the "stupid"
language comparison tests I originally posted was to show precisely that
it is stupid to judge a language based on silly things like the size of
executables for trivial programs like these. I was responding to a posting
from Ted Holden (who else?) that made a preposterous claim about the size of
a "hello world" in Ada and asserted that the reason for the humongous
executables was somehow in the nature of the language. In case I didn't
make the points articulately, here they are:

1. even programs of apparently trivial nature CAN lead to nontrivially large
   executables, because of library stuff that gets linked in;

2. this characteristic is by no means unique to Ada, and has _very little_
   to do with the "nature of the language" - it has much more to do with
   the engineering tradeoffs made by compiler/linker implementers;

3. the whole point of showing 4 different Ada compilers was to demonstrate
   point (2) - that implementers make lots of decisions that have little
   to do with the "nature" of the language;

4. it's also the case that large executables aren't necessarily bad.
   Pragma INLINE comes to mind as an example of a time/space tradeoff in
   which a procedure is in-lined at each invocation, saving the time
   (and stack space!) of a subroutine call per invocation. But for an
   inlined procedure with a nontrivial number of invocations, this will
   increase the size over the non-inlined one. This is good, not bad,
   as it will speed the program's execution up.
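
The tradeoff in point 4 can be sketched in C as well, assuming a C99
compiler; the function names are hypothetical. As with pragma INLINE,
the "inline" keyword is only a request, and the compiler is free to
ignore it.

```c
/* Small helper marked for inlining. If the compiler honors the
   request, each call site gets its own copy of the body: no call
   overhead, but more code if there are many call sites. */
static inline int square(int x)
{
    return x * x;
}

/* One call site inside a loop; with inlining, the multiply is done
   in place and no subroutine call happens per element. */
int sum_of_squares(const int *v, int n)
{
    int total = 0;
    for (int i = 0; i < n; i++)
        total += square(v[i]);
    return total;
}
```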

Please don't anyone think I was trying to make a serious comparison of Ada
vs. C here. It was just the opposite, really - to show how foolish these
generalizations are. Sorry if I misled any readers.
---------------------------------------------------------------------------
Prof. Michael Feldman
Department of Electrical Engineering and Computer Science
The George Washington University
Washington, DC 20052
+1-202-994-5253
mfeldman@seas.gwu.edu
---------------------------------------------------------------------------

sommar@enea.se (Erland Sommarskog) (06/13/90)

arny.b.engelson (arny@cbnewsl.att.com) gives some figures for
"Hello World" programs on VMS:
>First of all, I ran the compilers with their default settings, no special
>optimizations, pragmas, options, nothing.  Second, the source looks exactly
>like Mike's programs above.  Last, the results:
>
>Ada executable =  6 blocks (512 byte blocks)
>C   executable = 87 blocks   "

Again, an operating system issue. C doesn't really fit with VMS.
Linking C is not as straightforward as for other languages. I don't
use C myself, but I think a link command like

   $ LINK prog,sys$input/option
   vaxcrtlg/share

or something similar will cut down the C executable to reasonable sizes.

I'd like to restress the point I made a while ago: if your tiny "hello
world" program makes the disk explode, don't blame the language;
blame the operating system that doesn't provide shareable images.
If the OS does, blame the compiler that doesn't use them.
-- 
Erland Sommarskog - ENEA Data, Stockholm - sommar@enea.se