[comp.arch] Bar Wars again

grenley@nsc.nsc.com (George Grenley) (09/30/88)

Welcome Benchmark Fans!
The wheel's come full circle here at National, and I am again in the
32 bit micro benchmarking business.  Many moons ago I proposed on this net
a public, head to head comparison of the various 32 bit chips, preferably
"offsite" at a local tavern.  The idea met with some favorable response; let's
see if the world is still interested.

Each prospective participant should bring his iron, and the usual benchmark
code.  Be prepared to have ALL code inspected (including library sources)
by the referees.  We will compile and run the code, and all results will
be published, and placed in the public domain.  We will also verify clock
rates, etc.

We will need some "impartial referees".  Kent Porter, of Dr. Dobbs' Journal,
has expressed interest in this.  I think we should have 1 or 2 others as
well.  Volunteers?

I will be happy to arrange time, place, and other logistical details.  
Interested parties can contact me at nsc!grenley, or (408) 721-5513.
I'm especially interested in SPARC hardware, as well as 68030 and 386
platforms.  Also, I understand the 29000 is far enough along to appear in
public.  Tim Olson, are you interested?

Enough for now.

George Grenley (I own Intel, Mot, AND NSC based systems - I'm either unbiased
or indecisive)

earl@mips.COM (Earl Killian) (09/30/88)

In article <6684@nsc.nsc.com> grenley@nsc.nsc.com (George Grenley) writes:

   From: grenley@nsc.nsc.com (George Grenley)
   Date: 29 Sep 88 20:08:54 GMT

   Welcome Benchmark Fans!
   The wheel's come full circle here at National, and I am again in
   the 32 bit micro benchmarking business.  Many moons ago I proposed
   on this net a public, head to head comparison of the various 32 bit
   chips, preferably "offsite" at a local tavern.  The idea met with
   some favorable response; let's see if the world is still
   interested.

   Each prospective participant should bring his iron, and the usual
   benchmark code.  Be prepared to have ALL code inspected (including
   library sources) by the referees.  We will compile and run the
   code, and all results will be published, and placed in the public
   domain.  We will also verify clock rates, etc.

The logistics of this are a problem.  Why don't you produce a
performance brief along the lines of the MIPS Performance Brief and
then post it, as MIPS has done?

To start with, what are your results on the following public-domain
benchmarks?

livermore fortran kernels: harmonic, geometric,
	and arithmetic means, sp and dp

linpack 100x100, fortran and coded blas, sp and dp

linpack 1000x1000, sp and dp

spice2g6, bipole, digsr, and comparator inputs (as posted on comp.arch
	last year)
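
Aside: the three LFK summary statistics are just the classical means of
the per-kernel rates.  A minimal sketch in C (the rates below are made-up
placeholders, not measured data; compile with -lm):

#include <stdio.h>
#include <math.h>

int main(void)
{
    /* hypothetical per-kernel rates in MFLOPS -- placeholders only */
    double rate[] = { 1.2, 0.8, 2.5, 0.6, 1.9 };
    int n = sizeof rate / sizeof rate[0];
    double sum = 0.0, recip = 0.0, logsum = 0.0;
    int i;

    for (i = 0; i < n; i++) {
        sum    += rate[i];        /* arithmetic: mean of the rates       */
        recip  += 1.0 / rate[i];  /* harmonic: n over sum of reciprocals */
        logsum += log(rate[i]);   /* geometric: exp of the mean log      */
    }
    printf("arithmetic mean: %.3f MFLOPS\n", sum / n);
    printf("harmonic mean:   %.3f MFLOPS\n", n / recip);
    printf("geometric mean:  %.3f MFLOPS\n", exp(logsum / n));
    return 0;
}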

Hmm, as usual, no decent public-domain integer benchmarks come to
mind.  MIPS uses diff, grep, yacc, nroff, espresso, and timberwolf as
integer benchmarks given suitable inputs, but these aren't
public-domain, and aren't generally agreed upon.
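
Should anyone want to try that at home: timing such a utility is mostly
a matter of running it as a child process against a fixed input and
reading the child's CPU time.  A rough sketch (the command and input
file are placeholders, not the inputs MIPS uses):

#include <stdio.h>
#include <sys/types.h>
#include <sys/times.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    struct tms before, after;
    long hz = sysconf(_SC_CLK_TCK);   /* clock ticks per second */
    pid_t pid;

    times(&before);
    pid = fork();
    if (pid == 0) {
        /* placeholder run: count occurrences of a word in a big file */
        execlp("grep", "grep", "-c", "register", "input.c", (char *)0);
        _exit(127);                   /* exec failed */
    }
    waitpid(pid, (int *)0, 0);
    times(&after);                    /* child's times appear after wait */

    printf("child cpu: %.2f s\n",
           (double)((after.tms_cutime - before.tms_cutime) +
                    (after.tms_cstime - before.tms_cstime)) / hz);
    return 0;
}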
-- 
UUCP: {ames,decwrl,prls,pyramid}!mips!earl
USPS: MIPS Computer Systems, 930 Arques Ave, Sunnyvale CA, 94086

grenley@nsc.nsc.com (George Grenley) (10/01/88)

The response this time seems less enthusiastic than before...

In article <4263@wright.mips.COM> earl@mips.COM (Earl Killian) writes:
>In article <6684@nsc.nsc.com> grenley@nsc.nsc.com (George Grenley) writes:
>   Many moons ago I proposed
>   on this net a public, head to head comparison of the various 32 bit
>   chips 
>   Each prospective participant should bring his iron, and the usual
>   benchmark code.  Be prepared to have ALL code inspected (including
>   library sources) by the referees.  We will compile and run the
>   code, and all results will be published, and placed in the public
>   domain.  We will also verify clock rates, etc.
>
>The logistics of this are a problem.  Why don't you produce a
>performance brief along the lines of the MIPS Performance Brief and
>then post it, as MIPS has done?

At last, someone at NSC is working on a comprehensive benchmark summary.
Better late than never.  For various reasons, we at NSC have not been
aggressive in producing scientifically controlled benchmark data.

I do want to compliment MIPS on the quality of their reports; in fact, I
have used them internally as examples of how it ought to be done.

I had several reasons for my proposal, though.  One was to provide an
unofficial, engineers-only environment.  At many companies, pressures exist
to go with some official number rather than measured data.  Also, there are
frequently variations in supposedly standard code.  I have different versions
of Dhrystone 1.1 whose results vary by over 40%, even though they are
supposedly the same code.

Also, the matter of verification remains.  Last February, I saw a Mot 030
machine (at Buscon) running in excess of 13,000 Dhrystones per second.  They
would not show me the source, or let me touch it in any way.  I can only
wonder about that number - no other 030 user I know of has been able to
verify it.
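
For those who have never looked inside it: a Dhrystone figure is just
iterations divided by elapsed seconds, and both the timing method and the
work done per iteration live in the source - which is exactly why the
source has to be inspectable.  A skeleton of the measurement (the loop
body is a stand-in, not the real Dhrystone code):

#include <stdio.h>
#include <time.h>

#define LOOPS 5000000L   /* raise until the run lasts a few seconds */

static volatile long sink;           /* defeats dead-code elimination */

int main(void)
{
    long i;
    clock_t start = clock();
    double elapsed;

    for (i = 0; i < LOOPS; i++)
        sink += i;                   /* stand-in for the benchmark body */

    elapsed = (double)(clock() - start) / CLOCKS_PER_SEC;
    printf("%ld loops in %.2f s = %.0f loops/sec\n",
           LOOPS, elapsed, LOOPS / elapsed);
    return 0;
}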

By doing a demo publicly, all participants can examine one another's machines
and code in whatever detail seems necessary.

Logistics may be awkward for some, but most participants are right here in 
Silicon Valley.

>To start with, what are your results on the following public-domain
>benchmarks?
>
>livermore fortran kernels: harmonic, geometric,
>	and arithmetic means, sp and dp
>
>linpack 100x100, fortran and coded blas, sp and dp
>
>linpack 1000x1000, sp and dp
>
>spice2g6, bipole, digsr, and comparator inputs (as posted on comp.arch
>	last year)
>
>Hmm, as usual, no decent public-domain integer benchmarks come to
>mind.  MIPS uses diff, grep, yacc, nroff, espresso, and timberwolf as
>integer benchmarks given suitable inputs, but these aren't
>public-domain, and aren't generally agreed upon.

I will post this info as soon as I have it, but it will be a while - I am
just getting started, and I don't even have sources for some of this yet.

George Grenley

aglew@urbsdc.Urbana.Gould.COM (10/17/88)

>/* Written  3:39 am  Oct 15, 1988 by eugene@eos.UUCP in urbsdc:comp.arch */
>In article <6005@june.cs.washington.edu> pardo@cs.washington.edu (David Keppel) writes:
>>rik@june.cs.washington.edu (Rik Littlefield) writes:
>>>[ large "real" program benchmarks vs. synthetic benchmarks ]
>>
>>Oh, gee, an opportunity to apply the scientific method :-)
>>
>>(a) Benchmark a bunch of computer systems (hardware/os/compiler)
>>    using synthetic benchmarks.
>>(b) Compare the benchmark performance to observations in the
>>    "real" world.
>>(c) Learn something about benchmarks, refine your synthetic
>>    benchmarks.
>>(d) go to (a)   (Oh no, not a GOTO!)
>
>I am sorry.
>
>I don't see the scientific method in this.  I don't see a theory,
>a hypothesis, a controlled experiment, nor even a control. 8-)
>Actually, don't worry, I get this all the time from the other
>"real" sciences myself.  I do see the beginnings of empirical work.
>Better luck next time.
>
>Another gross generalization from
>
>--eugene miya, NASA Ames Research Center, eugene@aurora.arc.nasa.gov

This has little to do with the discussion, but Eugene has touched
upon a sore point of mine here. 

	SCIENCE >= SCIENTIFIC METHOD

The theory-hypothesis-controlled experiment cycle is only one
part of the scientific process, the part that is applied to
relatively (1) well-understood and (2) easy to manipulate systems.
E.g. biology - it at least meets criterion (2); criterion (1) is
often not met (which is why so many biological experiments
cannot really be considered "controlled").
E.g. psychology (with a willing group of undergrads to study).
Etc.

Much of science goes on before the theory-hypothesis-experiment
cycle, and is purely observational.  I.e. you must have some information
before you can formulate a theory.  E.g. astronomy - it's kind of hard
to set up an experiment out in space.  E.g. descriptive biology,
morphology - cut the animal up and describe what you see.
    True, observation can be selective, like an astronomer 
concentrating on a star cloud that he thinks will have a
supernova -- but if such an astronomer throws out evidence of
a black hole, even though he wasn't looking for it explicitly,
he is stupid.

I would even go so far as to say that people who think that science
is only the theory-hypothesis-experiment cycle verge on the
Aristotelian in their dogmatism.

What does this have to do with computer performance evaluation?
Well, I'll admit that we need more controlled experiments.
But in a practical sense, the people who apply computer performance
evaluation are system operators and programmers, even individual
workstation users, who aren't in the business of doing experiments.
They can, however, observe, on a regular and ongoing basis.
How can we make these observations as efficient as possible
-- how can we use this great mass of potential observers to
generate hypotheses that we can seek to prove or disprove by
experiment, and that will ultimately result in improved computer
system performance?
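
One half-baked sketch of what such ongoing observation might look like
(my own construction, with arbitrary constants): a tiny "canary" program
that reruns a fixed loop at intervals and logs its throughput, so that
ordinary users accumulate a record from which hypotheses can later be
drawn.

#include <stdio.h>
#include <time.h>

#define WORK 2000000L    /* fixed amount of work per sample */

static volatile long sink;

static double sample_rate(void)
{
    long i;
    clock_t t0 = clock();
    for (i = 0; i < WORK; i++)
        sink += i;
    return WORK / ((double)(clock() - t0) / CLOCKS_PER_SEC);
}

int main(void)
{
    int s;
    for (s = 0; s < 10; s++) {       /* a real tool would run indefinitely */
        printf("%ld %.0f\n", (long)time((time_t *)0), sample_rate());
        fflush(stdout);
        /* a real sampler would sleep() between samples; note also that
           clock() measures our own cpu time, so a fuller tool would log
           wall-clock time as well, to capture the effects of load */
    }
    return 0;
}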


Andy "Krazy" Glew. 
at: Motorola Microcomputer Division, Champaign-Urbana Development Center
    (formerly Gould CSD Urbana Software Development Center).
mail: 1101 E. University, Urbana, Illinois 61801, USA.
email: (Gould addresses will persist for a while)
    aglew@gould.com     	    - preferred, if you have MX records
    aglew@fang.gould.com     	    - if you don't
    ...!uunet!uiucuxc!ccvaxa!aglew  - paths may still be the only way
   
My opinions are my own, and are not the opinions of my employer, or any
other organisation. I indicate my company only so that the reader may
account for any possible bias I may have towards our products.

PS. I promise to shorten this .signature soon.

ok@quintus.uucp (Richard A. O'Keefe) (10/26/88)

In article <28200213@urbsdc> aglew@urbsdc.Urbana.Gould.COM writes:
>The theory-hypothesis-controlled experiment cycle is only one
>part of the scientific process, the part that is applied to
>relatively (1) well-understood and (2) easy to manipulate systems.
>E.g. biology - it at least meets criterion (2);

Pull the other one.  Maybe some parts of biology have easy-to-manipulate
systems, but the part which is most analogous to performance measurement
-- bioenergetic measurements on live animals engaged in natural behaviour --
is notoriously hard.  [You think not?  How would _you_ measure
instantaneous oxygen consumption in a bat chasing a moth?]