[comp.arch] brash micros ... [really: CPI versus benchmarks]

mash@mips.UUCP (John Mashey) (09/09/87)

In article <1251@pdn.UUCP> alan@pdn.UUCP (0000-Alan Lovejoy) writes:
>In article <649@winchester.UUCP> mash@winchester.UUCP (John Mashey) writes:
>>Again, this is not to denigrate cycles/instruction as something architects
>>use:...
>>a given benchmark.  This was the original anti-RISC argument: "yes, the
>
>Aha!  "on a given benchmark"!  Precisely!  Normalizing to "units of
>work" NECESSARILY involves using a (set of) "given benchmark(s)".  It is
>a virtual certainty that any set of benchmarks you choose skews the
>measurement of work so that some CPUs will either suffer or benefit
>to an extreme degree in their perceived performance using normalized
>instructions to measure work accomplished.  Eventually, of course,
>one must decide on a benchmark suite, but that "binding" should be
>delayed for as long as possible, so that the performance numbers
>are not "skewed" before they have to be.

Whether you know it or not, anyone who quotes a CPI is assuming something
about the benchmark set.  You cannot compute a CPI for any but the
simplest machine without using benchmarks, because the CPI can vary
substantially on the same machine, even with honest real benchmarks;
the CPI can be driven very high or very low by artificial ones.
CPIs DON'T EXIST APART FROM BENCHMARKS.  For example, on the same
CPU:
	1.05	small loopy benchmark that fits in caches [Dhrystone]
	2-3	bigger one, with some FP, or cache misses, depending on
		memory system [Hspice]
	30+	handcoded program that does mostly integer divides
One perhaps can argue that a CPU has a given CPI on "typical" programs,
whatever those are, but like it or not, a CPI was always derived from
benchmarks, if the CPI is an engineering number at all.  If one at least
specifies the benchmarks, then others can judge whether the benchmarks
are relevant.  For example, although the Ridge folks have published some
CPI numbers I don't quite understand, at least they've spec'd that the
CPIs are for Whetstones [good: at least I know what they're for and can
judge whether the number might be relevant.  For example, a CPI under 2
for Whetstone means a RISC CPU probably has decent floating point.]
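
Since CPI is nothing more than cycles retired divided by instructions
retired, the spread in the list above is easy to make concrete.  Here is
a minimal C sketch; the cycle and instruction counts in it are made up,
chosen only to reproduce that kind of spread on one imaginary CPU:

/* cpi.c -- CPI is a property of a (machine, workload) pair, not of
 * the machine alone.  All counts below are hypothetical.
 */
#include <stdio.h>

int main(void)
{
    struct run { const char *workload; double cycles, instrs; };
    struct run runs[] = {
        { "small cache-resident loop",      1.05e9, 1.0e9 },
        { "big program, FP + cache misses", 2.50e9, 1.0e9 },
        { "handcoded integer divides",     32.00e9, 1.0e9 },
    };
    int i;

    for (i = 0; i < 3; i++)             /* CPI = cycles / instructions */
        printf("%-34s CPI = %.2f\n",
               runs[i].workload, runs[i].cycles / runs[i].instrs);
    return 0;
}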

Anyway, maybe it's time to see if anybody else has useful thoughts on
this, else we ought to continue the discussion via email.
-- 
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	{decvax,ucbvax,ihnp4}!decwrl!mips!mash  OR  mash@mips.com
DDD:  	408-991-0253 or 408-720-1700, x253
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086

eugene@pioneer.arpa (Eugene Miya N.) (09/09/87)

>this, else we ought to continue the discussion via email.

That's the smartest thing said ;-) in this discussion.

BTW: for those interested in the performance of graphics systems, there
will be another TIGPE lunch in the valley shortly; watch for Julian
Gomez's posting on comp.graphics.

--eugene miya

lamaster@pioneer.arpa (Hugh LaMaster) (09/09/87)

In article <667@winchester.UUCP> mash@winchester.UUCP (John Mashey) writes:

>ones.  CPI's DON'T EXIST APART FROM BENCHMARKS.  For example, on the same
>CPU:
>	1.05	small loopy benchmark that fits in caches [Dhrystone]
>	2-3	bigger one, with some FP, or cache misses, depending on
>		memory system [Hspice]
>	30+	handcoded program that does mostly integer divides

Exactly right.  I only consider cycles per instruction to be useful when
measured using the LINPACK benchmark myself :-)   [Weitek, Fairchild, MIPS,
and to a lesser extent Sun, Motorola, and AMD all seem to be taking floating
point very seriously now.  Thanks, folks.]  Many people out there would no
doubt prefer using a Un*x-oriented benchmark with lots of typical utilities,
fork and exec exercisers, TCP/IP/network etc. stuff, rn ...  It all depends
upon the set of applications that you will run on the machine.  The benchmarks
used to measure the performance of the machine for your applications also can
measure the "architectural efficiency" in various ways for your applications.
IF you care.  Presumably, an end user wouldn't care about it anyway, except
out of curiosity, since the net performance is what the user sees, not the
design tradeoffs of system architects.
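
If you want to see why a LINPACK CPI is really a floating-point number,
note that the benchmark spends nearly all its time in the DAXPY loop,
and timing that loop is enough to back out cycles per flop.  A C sketch;
the vector length, repetition count, and the 16 MHz clock are assumptions
for illustration, not measurements of any particular machine:

/* daxpy.c -- the loop that dominates LINPACK.  Each iteration does
 * 2 flops (one multiply, one add), so cycles-per-flop falls straight
 * out of the timing once you assume a clock rate.
 */
#include <stdio.h>
#include <time.h>

#define N    100000
#define REPS 1000

static double x[N], y[N];

int main(void)
{
    const double a = 3.14159;
    const double clock_mhz = 16.0;   /* assumed clock rate, for the sketch */
    int i, rep;
    clock_t t0, t1;
    double secs, mflops;

    for (i = 0; i < N; i++) { x[i] = 1.0; y[i] = 2.0; }

    t0 = clock();
    for (rep = 0; rep < REPS; rep++)
        for (i = 0; i < N; i++)      /* daxpy: y := a*x + y */
            y[i] += a * x[i];
    t1 = clock();

    secs   = (double)(t1 - t0) / CLOCKS_PER_SEC;
    mflops = (double)REPS * 2.0 * N / secs / 1e6;
    printf("%.1f MFLOPS; %.2f cycles/flop assuming %.0f MHz\n",
           mflops, clock_mhz / mflops, clock_mhz);
    return 0;
}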

  Hugh LaMaster, m/s 233-9,  UUCP {seismo,topaz,lll-crg,ucbvax}!
  NASA Ames Research Center                ames!pioneer!lamaster
  Moffett Field, CA 94035    ARPA lamaster@ames-pioneer.arpa
  Phone:  (415)694-6117      ARPA lamaster@pioneer.arc.nasa.gov


                 "IBM will have it soon"


(Disclaimer: "All opinions solely the author's responsibility")

eugene@pioneer.arpa (Eugene Miya N.) (09/09/87)

In article <2716@ames.arpa> lamaster@ames.UUCP (Hugh LaMaster) writes:
>point very seriously now.  Thanks, folks.]  Many people out there would no
>doubt prefer using a Un*x-oriented benchmark with lots of typical utilities,
>fork and exec exercisers, TCP/IP/network etc. stuff, rn ...  It all depends
>upon the set of applications that you will run on the machine.

I have real problems running Unix programs like nroff as benchmarks.
The problem is that these programs are very data-dependent.  I would
want consistent, well-considered data run through various programs like
this.  Consider the performance of nroff on a file with absolutely no
directives versus one heavily loaded with directives.  I stop short of
endorsing any "standard test file," since no such thing exists.
(Actually run both cases.)  This is why I don't care for the AIM
benchmark and certain other well-intended benchmarks.
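
To see how big the swing can be, here is a toy C sketch; the leading-'.'
test and the cost of the expensive path are invented stand-ins for what
nroff does with directives, not a model of nroff itself:

/* datadep.c -- same program, same number of input lines, wildly
 * different cost per line depending on the data.  A leading '.'
 * stands in for an nroff directive and triggers an expensive path.
 */
#include <stdio.h>
#include <time.h>

static double process(const char *line)
{
    double v = 0.0;
    int i;

    if (line[0] == '.')              /* "directive": expensive path */
        for (i = 0; i < 1000; i++)
            v += (double)i / (i + 1);
    else                             /* plain text: cheap path */
        v = 1.0;
    return v;
}

static void run(const char *line, long nlines, const char *label)
{
    clock_t t0 = clock();
    double sink = 0.0;               /* keeps the work from being elided */
    long n;

    for (n = 0; n < nlines; n++)
        sink += process(line);
    printf("%-16s %.3f sec  (sink=%g)\n", label,
           (double)(clock() - t0) / CLOCKS_PER_SEC, sink);
}

int main(void)
{
    run("plain text line", 1000000L, "no directives:");
    run(".TH DIRECTIVE",   1000000L, "all directives:");
    return 0;
}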

From the Rock of Ages Home for Retired Hackers:

--eugene miya
  NASA Ames Research Center
  eugene@ames-aurora.ARPA
  "You trust the `reply' command with all those different mailers out there?"
  "Send mail, avoid follow-ups.  If enough, I'll summarize."
  {hplabs,hao,ihnp4,decwrl,allegra,tektronix,menlo70}!ames!aurora!eugene