[comp.arch] Checkered Benchmark History

jimv@radix (Jim Valerio) (08/28/88)

For several years now, I have been feeling guilty whenever SIEVE was quoted
as a benchmark.  You see, I'm afraid that I may be indirectly responsible
for its use.  I'm hoping that by confessing my story, someone will be able
to tell me that it wasn't my fault, and that the benchmark has a different
etymology.  Herewith follows the story.

In 1979 (I think), I was working at Tektronix when Motorola's 68000 Pascal
compiler was first released to us.  My friend Roger Critchlow was working
at Sidereal, a company that had a 68000-based product written in assembly
language.  Roger asked me what kind of code the Motorola compiler generated,
and I responded, not so good.  He was interested in more detail, so I
offered to type in some random statements and give him a listing.

Instead of typing in entirely random statements, I typed in from memory
a preprocessing program that I had glanced at the previous evening.  That
trivial program generated a table of primes, using the compacted Eratosthenes'
Sieve.  (This table was then included as part of a sophisticated factoring
program built by Mike Penk, which was used earlier that year to pull a
factor off of Mersenne (sp?) 257 and find a 303 digit twin prime.)
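
For readers who have not met the term, here is a minimal sketch of what a
"compacted" sieve might look like.  It assumes "compacted" means keeping
flags only for the odd numbers, so that flag i stands for 2*i + 3; that
reading is an assumption, not something the post spells out.  The name
compacted_sieve and the Python rendering are purely illustrative and are
not the original Pascal.

```python
def compacted_sieve(limit):
    """Return all primes <= limit, keeping flags for odd numbers only.

    Assumption: "compacted" means even numbers are never stored, so
    flags[i] represents the odd number 2*i + 3.
    """
    if limit < 2:
        return []
    size = (limit - 1) // 2          # how many odd numbers lie in 3..limit
    flags = [True] * size
    primes = [2]                     # the only even prime, handled by hand
    for i in range(size):
        if flags[i]:
            p = 2 * i + 3
            primes.append(p)
            # Strike the odd multiples of p, starting at p*p.
            # p*p sits at index (p*p - 3) // 2, and stepping the index
            # by p advances the represented number by 2*p.
            for k in range((p * p - 3) // 2, size, p):
                flags[k] = False
    return primes


if __name__ == "__main__":
    print(compacted_sieve(30))
```

The table needs half the storage of a flag-per-integer sieve, at the cost
of the index arithmetic shown in the comments.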

I compiled that 1-page program, and printed the listing file for Roger.
The code was worse than I thought.  For example, initializing a variable
to "-1" was done by moving a "1" into a register, negating it, and then
storing that in the destination.  I dropped the listing off on Roger's
doorstep, only slightly wet from being toted through the rain.

A week later, I called Roger to verify that he'd received the listing.
He said yes, and that it was atrocious code generation.  He also suggested
that next time I don't leave the listing next to the gutter downspout.
Roger also mentioned, in passing, that his co-worker Chuck Forsberg liked
the sample program, and was using it to evaluate another compiler.
(Perhaps Chuck can comment more on this.)

The next thing I hear, a year or two later, is that Byte is using the
Sieve benchmark, contributed by Chuck Forsberg.  I think I remember seeing
the article, and noticing that the code was slightly different than my
original code.  I recall being irked that the computed function was not
described as a "compacted" sieve.

Over the years, the benchmark has changed enough that I don't see any of
my code there.  But I'm left with the guilty feeling that with an hour's
work and a soggy listing, I am responsible for one of the worst of the
often-quoted benchmarks.
--
Jim Valerio	jimv%radix@omepd.intel.com, {verdix,omepd}!radix!jimv

colwell@mfci.UUCP (Robert Colwell) (08/29/88)

In article <62@radix> jimv@radix.UUCP (Jim Valerio) writes:
>For several years now, I have been feeling guilty whenever SIEVE was quoted
>as a benchmark.  You see, I'm afraid that I may be indirectly responsible
>for its use.

Valerio, you slime, you've single-handedly screwed the entire
computing world forever.  (There.  I know you expected that of
someone, and I didn't want to let you down.)

>Over the years, the benchmark has changed enough that I don't see any of
>my code there.  But I'm left with the guilty feeling that with an hour's
>work and a soggy listing, I am responsible for one of the worst of the
>often-quoted benchmarks.

But but but....it's not the benchmark, it's what the benchmarker is
trying to make of it.  The Sieve benchmark has its place.  For
instance, if you store the flags array as a set of bits, and try to
access them a bit at a time on a scientific minisuper, you may find
out just how often such a machine needs to execute extracts and
merges.  That could conceivably be an interesting metric under
certain extreme conditions.
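
A rough sketch of that bit-packed variant follows, with counters standing
in for the extract and merge operations.  Everything here is an assumption
made for illustration: the 32-bit word size, the names BitFlags and
sieve_bits, and the odd-numbers-only loop shape.  It shows only where the
extracts and merges arise, not how any particular machine performs them.

```python
WORD_BITS = 32  # assumed word size, chosen only for illustration


class BitFlags:
    """Flags packed one per bit, counting each extract and merge."""

    def __init__(self, size):
        nwords = (size + WORD_BITS - 1) // WORD_BITS
        self.words = [(1 << WORD_BITS) - 1] * nwords  # all flags set
        self.extracts = 0
        self.merges = 0

    def test(self, n):
        # Extract: fetch the word, shift the wanted bit down, mask it off.
        self.extracts += 1
        return (self.words[n // WORD_BITS] >> (n % WORD_BITS)) & 1

    def clear(self, n):
        # Merge: read the word, clear one bit, write the word back.
        self.merges += 1
        self.words[n // WORD_BITS] &= ~(1 << (n % WORD_BITS))


def sieve_bits(size):
    """Count odd primes among 2*i + 3 for i in 0..size-1."""
    flags = BitFlags(size)
    count = 0
    for i in range(size):
        if flags.test(i):
            prime = 2 * i + 3
            for k in range(i + prime, size, prime):
                flags.clear(k)
            count += 1
    return count, flags


if __name__ == "__main__":
    count, flags = sieve_bits(8191)
    print(count, flags.extracts, flags.merges)
```

Every flag test costs one extract and every strike costs one merge, so the
two counters give the operation mix directly for a given table size.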

If you're implying that we could or should be trying to make more
bulletproof benchmarks, so that it is more difficult to twist their
results to favor particular machines or architectures, then I'd say
that's a worthwhile effort.  But until that succeeds (IF it succeeds,
which I doubt) then there's a place in the world for toy benchmarks.


Bob Colwell            mfci!colwell@uunet.uucp
Multiflow Computer
175 N. Main St.
Branford, CT 06405     203-488-6090

aburto@marlin.NOSC.MIL (Alfred A. Aburto) (09/02/88)

--------------

The first published Sieve Of Eratosthenes 'benchmark' results appeared
in the Sept 1981 issue of BYTE ( Jim Gilbreath, 'A High-Level Language
Benchmark', BYTE, Sep 1981, pg 180 ).  Gilbreath says (in that article)
that he learned of the Sieve Of Eratosthenes program from Chuck Forsberg
at the Jan 1980 Unix conference at Boulder.  Gilbreath goes on to say that
he modified Knuth's program (Knuth, Donald E., 'The Art Of Computer 
Programming', Vol 2: Semi-Numerical Algorithms, Reading MA, Addison-Wesley,
every accessible high-level language.  There was a follow-on report by Jim
and Gary Gilbreath with hundreds of results (it seems) published in BYTE
Jan 1983.

Al Aburto
aburto@marlin.nosc.mil.UUCP