[gnu.gcc] Common Compilers for benchmarks

shs@aldebaran.Berkeley.EDU (Steve Schoettler) (10/31/88)

In article <7352@wright.mips.COM> earl@mips.COM (Earl Killian) writes:
>I think it would be interesting to benchmark various different
>machines using gcc as the compiler.  This partially removes one
>variable: how much performance is due to the compiler and how much to
>the hardware.

That might get you somewhere, but I don't think it will tell you what you
want to find out about the architectures, and it won't completely
remove the compiler as a variable.

Consider that gcc compiles down into an intermediate RTL description
of the original C code.  The RTL describes an idealized abstract machine,
and is designed to be abstract enough so that it can be mapped into
a variety of processors: 68K,386,370, etc.  How efficiently this RTL
description is compiled into the target machine code reflects how
close the abstract machine is to the actual target machine.

So, I think what you'll find from such a study is which machine most
closely resembles the abstract machine Richard Stallman et al 
had in mind when the RTL was designed.

I am currently working on some silicon compiler software that is
designed to give instruction set designers rapid feedback about
design choices.  One thing I have been considering is automatically
generating a gcc machine description file, which can be used to generate
a C compiler, which can be used to compile and run benchmarks. 
In other words, with the right software, you should be able to input
an instruction set and get out dhrystones (etc).
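
A minimal sketch of the first step in that pipeline, in Python: turning a
table of instructions into machine description patterns.  The table format
and the mnemonics are made up for illustration, and the function names
(define_insn, machine_description) are hypothetical.  The emitted text
follows the general shape of a gcc define_insn pattern, but predicate and
constraint names may differ between gcc versions, so treat the output as a
starting point rather than a working port.

```python
# Hypothetical instruction table:
# (gcc pattern name, RTL operator, assembler mnemonic).
# The mnemonics are invented for this sketch.
INSTRUCTIONS = [
    ("addsi3", "plus", "add"),
    ("subsi3", "minus", "sub"),
    ("mulsi3", "mult", "mul"),
]


def define_insn(name, rtl_op, mnemonic):
    """Return one pattern for a three-register, 32-bit instruction."""
    return (
        f'(define_insn "{name}"\n'
        f'  [(set (match_operand:SI 0 "register_operand" "=r")\n'
        f'        ({rtl_op}:SI (match_operand:SI 1 "register_operand" "r")\n'
        f'                 (match_operand:SI 2 "register_operand" "r")))]\n'
        f'  ""\n'
        f'  "{mnemonic} %0,%1,%2")\n'
    )


def machine_description(instructions):
    """Join the patterns for every instruction in the table."""
    return "\n".join(define_insn(*insn) for insn in instructions)


if __name__ == "__main__":
    print(machine_description(INSTRUCTIONS))
```

A real machine description also needs the target macro file, register
classes, calling conventions and so on, none of which this sketch attempts.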

If you used only results from these C benchmarks to design your processor,
I believe that you would, after a few iterations, converge on an
instruction set that very closely matched the intermediate abstract
machine used by gcc.  

Maybe that's not such a bad idea.  Perhaps the best instruction set for
the GNU C compiler is one that uses as instructions the same basic
blocks that are described by the RTL.  Of course, you'd have to do 
something about the infinite number of registers, etc, but then
you'd have a fantastic C compiler for it!

But what about the tradeoffs and assumptions made in the design of
the abstract machine?   Certainly there were intended source languages,
and intended target machines, so how ideal is it?
Let me know if you have any answers.

I believe that using gcc for benchmarks across machines is a good idea,
as it is sort of a "constant" and removes some of the variation we do
see across machines.  I just don't know how meaningful the results
will be.

For a demonstration of how much compiler technology affects "performance",
I'd ask the MIPS folks what the difference is between typical unix code
compiled with no optimizations and with all optimizations turned on.

Steve
shs@ji.Berkeley.EDU

csimmons@hqpyr1.oracle.UUCP (Charles Simmons) (11/03/88)

In article <26627@ucbvax.BERKELEY.EDU> shs@ji.Berkeley.EDU (Steve Schoettler) writes:
>In article <7352@wright.mips.COM> earl@mips.COM (Earl Killian) writes:
>>I think it would be interesting to benchmark various different
>>machines using gcc as the compiler.  This partially removes one
>>variable: how much performance is due to the compiler and how much to
>>the hardware.
>
>Consider that gcc compiles down into an intermediate RTL description
>of the original C code.  The RTL describes an idealized abstract machine,
>and is designed to be abstract enough so that it can be mapped into
>a variety of processors: 68K,386,370, etc.  How efficiently this RTL
>description is compiled into the target machine code reflects how
>close the abstract machine is to the actual target machine.

Um...  Seeing how I've worked on the GCC 370 compiler, I'd argue
with this point of view.  One of the really neat aspects of GCC is
that you can, in some sense, generate machine-dependent RTL code.

For example, the original RTL code doesn't have any support at all
for double word integers (or 'long long's).  But, in an early pass
of the compiler, the compiler calls a machine dependent routine to
generate intermediate RTL code.  Because of this pass, the generated
intermediate RTL code tends to describe an abstract machine that very closely
mimics the target machine.
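
To illustrate the idea (this is a toy in Python, not GCC's actual code), here
is a machine-dependent expander that lowers one double-word add into two
word-sized operations, plus a small interpreter to check the result.  The
operation names "add" and "addc" and every function name below are
assumptions made for the sketch.

```python
MASK32 = 0xFFFFFFFF


def expand_add64(dst, a, b):
    """Hypothetical expander: lower a 64-bit add into two 32-bit
    operations on (lo, hi) register pairs."""
    return [
        ("add", dst + ".lo", a + ".lo", b + ".lo"),   # produces a carry
        ("addc", dst + ".hi", a + ".hi", b + ".hi"),  # consumes the carry
    ]


def run(ops, regs):
    """Interpret the word-sized operations on a dict of 32-bit registers."""
    carry = 0
    for op, d, x, y in ops:
        total = regs[x] + regs[y] + (carry if op == "addc" else 0)
        regs[d] = total & MASK32
        carry = total >> 32
    return regs


def split(value):
    """Split a 64-bit value into (low word, high word)."""
    return value & MASK32, (value >> 32) & MASK32


def add64(a, b):
    """Add two 64-bit values using only the expanded 32-bit operations."""
    regs = {}
    regs["a.lo"], regs["a.hi"] = split(a)
    regs["b.lo"], regs["b.hi"] = split(b)
    run(expand_add64("r", "a", "b"), regs)
    return (regs["r.hi"] << 32) | regs["r.lo"]
```

A target with a native double-word add would supply a different expander
that emits a single operation, which is the sense in which the intermediate
code ends up shaped like the target.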

>So, I think what you'll find from such a study is which machine most
>closely resembles the abstract machine Richard Stallman et al 
>had in mind when the RTL was designed.

There are, of course, other issues.  The 370 has various aspects
that make writing a compiler for it difficult.  In particular,
allowing subroutines that require more than 4K bytes of instructions
is somewhat tricky; and the fact that the 370 doesn't have negative
offsets makes it difficult to implement a stack on the 370 that
is compatible with the type of stack that GCC would like to implement.
(I guess what I'm saying here is that GCC does contain an abstract
machine that doesn't map real well onto the 370, and the abstract
machine is partially described by portions of the compiler that
have nothing to do with the RTL code.)
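
Both difficulties come from the 370's base+displacement addressing, where
the displacement is a 12-bit unsigned field (0 to 4095).  The Python sketch
below shows the arithmetic only; the function names and the example frame
layout are hypothetical, and it is not a description of what the GCC 370
port actually does.

```python
MAX_DISPLACEMENT = 4095  # largest value a 12-bit unsigned displacement holds


def encodable(offset):
    """True if `offset` fits a base+displacement operand directly."""
    return 0 <= offset <= MAX_DISPLACEMENT


def rebase_frame(offsets):
    """Shift frame-pointer-relative offsets so that none is negative.

    Returns (shift, new_offsets); the base register would have to point
    `shift` bytes below the original frame pointer.
    """
    shift = max(0, -min(offsets.values()))
    return shift, {name: off + shift for name, off in offsets.items()}


def base_registers_needed(code_bytes):
    """Each base register reaches 4096 bytes, so big routines need several."""
    return -(-code_bytes // (MAX_DISPLACEMENT + 1))  # ceiling division
```

For a frame such as {"local_x": -8, "local_y": -4, "arg0": 8}, rebase_frame
returns a shift of 8 and offsets 0, 4 and 16, all of which are encodable.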

Of course, even using GCC to run all benchmarks still leaves you
open to differences in the skill of a compiler writer.  An untuned
implementation of GCC might not contain as many peephole optimizations
as would be desirable, or one of the instructions in the instruction
set may not have been described.  On the other hand, if you give me
a benchmark, it becomes relatively easy to tune GCC to run that particular
benchmark quickly.
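
As a toy illustration of what a peephole optimization is (in Python, over a
made-up instruction format, and unrelated to how GCC expresses its own
peephole rules), consider a pass with two rules.  The tuple layout
(opcode, register, operand) and the opcodes are assumptions of the sketch.

```python
def peephole(code):
    """Apply two toy peephole rules to a list of instruction tuples.

    ("store", reg, addr) writes reg to addr; ("load", reg, addr) reads
    addr into reg; ("mov", dst, src) copies src to dst.
    """
    out = []
    for insn in code:
        # Rule 1: copying a register to itself does nothing.
        if insn[0] == "mov" and insn[1] == insn[2]:
            continue
        # Rule 2: reloading a value that was just stored from the same
        # register to the same address is redundant.
        if (insn[0] == "load" and out
                and out[-1] == ("store", insn[1], insn[2])):
            continue
        out.append(insn)
    return out
```

A port that lacks rules like these emits the extra instructions on every
occurrence, so two GCC ports can differ in benchmark results for reasons
that have nothing to do with the hardware.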

>Steve

-- Chuck

anand@amax.npac.syr.edu (Anand Rangachari) (11/08/88)

In article <474@oracle.UUCP> csimmons@oracle.UUCP (Charles Simmons) writes:
[...]
>Of course, even using GCC to run all benchmarks still leaves you
>open to differences in the skill of a compiler writer.  An untuned
>implementation of GCC might not contain as many peephole optimizations
>as would be desirable, or one of the instructions in the instruction
>set may not have been described.  On the other hand, if you give me
>a benchmark, it becomes relatively easy to tune GCC to run that particular
>benchmark quickly.

  I was just wondering if that is such a bad thing.  After all, a
benchmark is supposed to be representative of the typical programs a user
may want to run. Thus in improving the speed of a benchmark, you may 
actually improve the speed of a sizeable number of programs.

  An excellent argument against this is, of course, that we don't have such
benchmarks available (so I have gathered from the discussions on this
group).

                                                R. Anand
Internet: anand@amax.npac.syr.edu
Bitnet:   ranand@sunrise