[comp.arch] $/CPUmark is a worthless measure

rcd@ico.isc.com (Dick Dunn) (04/03/91)

I guess I've seen one too many comments like...
> But I wouldn't hesitate to use $/SPECint as a performance measure...

$/SPECmark, or $/any-CPU-benchmark, is about as useful for comparing
systems as lines-of-code/day for comparing programmers, and both are
about as useful as manure output for measuring the work done by horses.
They're no better than order-of-magnitude, and they stink.

People don't buy CPUs.  They buy systems.  This discussion has focused on
workstations.  That means disk, memory, keyboard, display, and maybe a
network card.  (Or does it?  One of the egregious bogus comparisons here
actually compared $/SPECxyzzy for machines with and without disks!)

How you can divide "dollars per system" by "SPECmarks per CPU" and expect
to get anything useful is beyond me.  (The result has the entertaining
dimension of "dollar-CPUs per system-SPECmark".)

The only thing the big numbers-in-lights $/SPECstuff (usually to three
significant figures!!) emphasis produces is more incentive for
manufacturers to produce low-ball "wheels extra" system configurations--as if
they needed any such incentive.  Face it, if you want to give your product
a cheap boost, which is easier: speed up the CPU or take hardware out of
the bottom-of-the-line?
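
A minimal sketch of the arithmetic behind that point, in Python.  Every
price and the benchmark score below are made-up illustrative values, not
figures from any vendor or from this thread, and the helper name
dollars_per_specmark is hypothetical.  It assumes the benchmark score does
not change when the disk and display are left out of the base price.

```python
# Hypothetical component prices and benchmark score; none of these
# figures come from any vendor or from this thread.
def dollars_per_specmark(component_prices, specmark):
    """Return total system price divided by the benchmark score."""
    return sum(component_prices.values()) / specmark

full_system = {
    "cpu_and_chassis": 9000,
    "memory": 2000,
    "disk": 3000,
    "display": 1000,
}

# Same machine with the disk and display taken out of the base price.
stripped = {
    name: price
    for name, price in full_system.items()
    if name not in ("disk", "display")
}

# Assumption: the score is unchanged by removing those parts.
SPECMARK = 50.0

full_ratio = dollars_per_specmark(full_system, SPECMARK)
stripped_ratio = dollars_per_specmark(stripped, SPECMARK)

print(f"full:     ${full_ratio:.0f}/SPECmark")
print(f"stripped: ${stripped_ratio:.0f}/SPECmark")
```

Under these assumed numbers the ratio improves by more than a quarter
without the CPU getting any faster.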

Sure, you can try to "normalize" the comparisons by insisting that all the
machines you compare have the same configuration.  That's just painting
over the rust.  Why?  The way to win the normalized comparison is to put
your fastest CPU in your most stripped-down machine--thereby comparing
based on maximally-unbalanced and unrealistic configurations!

BTW, I'm NOT saying this is what HP has done!  I'm saying this is why
$/SPECmark is a stupid, worthless measure.  (I can't yet tell what HP has
done.  The one trade rag that quoted figures seemed to say that the $12k
basic machine was diskless, but it also suggested that a 400-Mb add-on disk
would add $22.5k to the machine price!  I assume this was the usual "what--me
proofread?" trade rag quality and discounted the whole thing...some real
info on HP's configurations would help.)
-- 
Dick Dunn     rcd@ico.isc.com -or- ico!rcd       Boulder, CO   (303)449-2870
   The Official Colorado State Vegetable is now the "state legislator".

dhinds@elaine18.Stanford.EDU (David Hinds) (04/03/91)

In article <1991Apr3.010831.3603@ico.isc.com> rcd@ico.isc.com (Dick Dunn) writes:
>$/SPECmark, or $/any-CPU-benchmark, is about as useful for comparing
>systems as lines-of-code/day for comparing programmers...

>People don't buy CPUs.  They buy systems.  This discussion has focused on
>workstations.  That means disk, memory, keyboard, display, and maybe a
>network card.

    Come on, there are a significant number of users for whom this is a
significant number.  Lots of people use workstations primarily as compute
servers, and would be very interested in knowing simply how much CPU power
they get per dollar, with the minimum number of frills.

>The only thing the big numbers-in-lights $/SPECstuff (usually to three
>significant figures!!) emphasis produces is more incentive for
>manufacturers to produce low-ball "wheels extra" system configurations--as if
>they needed any such incentive.  Face it, if you want to give your product
>a cheap boost, which is easier: speed up the CPU or take hardware out of
>the bottom-of-the-line?

Hey, if I have X dollars to spend today, for my purposes, I will go out
and buy the lowest of the low-ball configurations of the fastest CPU I
can find, because that is the bottom line for the work I do.  Sure, this
number stinks for most purposes, but it isn't useless.  If the company
goes out of its way to put a superfast CPU in a supercheap stripped-down
box, for me, that is a great deal.

 -David Hinds
  dhinds@cb-iris.stanford.edu

vac@crux.fac.cs.cmu.edu (Vincent Cate) (04/03/91)

>$/SPECmark, or $/any-CPU-benchmark, is about as useful 
>[...]
>They're no better than order-of-magnitude, and they stink.
>
>People don't buy CPUs.  They buy systems.  This discussion has focused on
>workstations.  That means disk, memory, keyboard, display, and maybe a
>network card.  (Or does it?  One of the egregious bogus comparisons here
>actually compared $/SPECxyzzy for machines with and without disks!)

SPEC runs on a system, not a CPU.  The compiler, system bus, memory
speed and size, cache speed and size, and the CPU speed all go into
this number.  Sure, it's not an I/O benchmark, but you should talk to
some i860 people or something if you still think SPEC is a CPU benchmark.

Now SPEC/MHz, that is a ridiculous number.  Does someone think that
all CPUs have the same number of gate-delays between pipeline stages?
MHz is like MIPS: it just does not compare across architectures.
Normalizing based on cost is the best that we can really do.
$/SPECmark is a number I very much like seeing even though it does 
not include things like how many bus slots, screen size, or the number 
of function keys on the keyboard.  I hope we see $/SPEC again in the
future.

    -- Vince

mccalpin@perelandra.cms.udel.edu (John D. McCalpin) (04/03/91)

>>>>> On 3 Apr 91 01:08:31 GMT, rcd@ico.isc.com (Dick Dunn) said:

Dick> $/SPECmark, or $/any-CPU-benchmark, is about as useful for comparing
Dick> systems as lines-of-code/day for comparing programmers, and both are
Dick> about as useful as manure output for measuring the work done by horses.
Dick> They're no better than order-of-magnitude, and they stink.

In my posts to comp.benchmarks, I have used two MFLOPS/Million$
numbers.  One used the LINPACK 1000x1000 "anything goes" results from
Dongarra's report, and the other used MFLOPS based on long vector
dyads which are main memory bandwidth-limited.

In both cases I estimated the prices (with maximum university
discounts) for a configuration of
	cpu
	128 MB RAM (or less, if 128 MB won't fit)
	1.2 GB SCSI disk
	no monitor, keyboard, or mouse

I disagree with Dunn's comment that these ratios are no better than
order-of-magnitude.  I am confident that they are significant to
within a factor of two, and my experience has been that they are
useful for estimating both the performance on my codes and the actual
cost of the system to within 25% each.
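
A rough sketch of how such a ratio and the factor-of-two reading could be
computed, in Python.  The machine figures are invented for illustration and
are not taken from my posts or from Dongarra's report; the function names
and the tolerance_factor parameter are hypothetical.  Prices are assumed to
be for the fixed configuration listed above.

```python
def mflops_per_million_dollars(mflops, price_dollars):
    """Sustained MFLOPS divided by system price in millions of dollars."""
    return mflops * 1_000_000 / price_dollars

def differ_significantly(ratio_a, ratio_b, tolerance_factor=2.0):
    """Treat two ratios as distinguishable only if they differ by more
    than the stated uncertainty (a factor of two by default)."""
    high, low = max(ratio_a, ratio_b), min(ratio_a, ratio_b)
    return high / low > tolerance_factor

# Invented example machines (MFLOPS, price in dollars).
machine_a = mflops_per_million_dollars(20.0, 100_000)
machine_b = mflops_per_million_dollars(300.0, 2_000_000)
machine_c = mflops_per_million_dollars(10.0, 200_000)

print(machine_a, machine_b, machine_c)
print("A vs B:", differ_significantly(machine_a, machine_b))
print("A vs C:", differ_significantly(machine_a, machine_c))
```

With these invented figures, A and B fall within a factor of two of each
other and should be read as roughly equal, while A and C do not.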

The results indicate that for memory-bandwidth-limited operations, the
Cray machines are approximately as cost-effective as the "Killer
Micros".   When system utilization is taken into account, they are
likely to be significantly more cost-effective.

For number-crunching, these seem to be reasonable estimates of what
performance can be bought for how many $$$.  Do you have any
suggestions of a better approach?
--
John D. McCalpin			mccalpin@perelandra.cms.udel.edu
Assistant Professor			mccalpin@brahms.udel.edu
College of Marine Studies, U. Del.	J.MCCALPIN/OMNET

davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (04/04/91)

In article <1991Apr3.010831.3603@ico.isc.com> rcd@ico.isc.com (Dick Dunn) writes:

| People don't buy CPUs.  They buy systems.  This discussion has focused on
| workstations.  That means disk, memory, keyboard, display, and maybe a
| network card.  (Or does it?  One of the egregious bogus comparisons here
| actually compared $/SPECxyzzy for machines with and without disks!)

  Actually most people do buy CPUs. Because no matter who builds the
CPU, chances are that the disk was built by people who build disks for
other CPU vendors. Yes, there are exceptions: a few companies are
willing and able to build their own, or at least have them built to
proprietary specs.

  Most of us are willing to buy a system and add the disk we need, be it
ESDI, SMD, SCSI, or whatever. On workstations this is almost always
true; on large systems the chance of some proprietary parts
increases, even if it's only the drive electronics.

  And some applications are so CPU intensive that they are insensitive
to i/o performance. Would that the CPU were fast enough to make i/o an
issue again.
-- 
bill davidsen	(davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
        "Most of the VAX instructions are in microcode,
         but halt and no-op are in hardware for efficiency"

irf@kuling.UUCP (Bo Thide') (04/06/91)

In article <1991Apr3.010831.3603@ico.isc.com> rcd@ico.isc.com (Dick Dunn) writes:
>I guess I've seen one too many comments like...
>> But I wouldn't hesitate to use $/SPECint as a performance measure...
>
>$/SPECmark, or $/any-CPU-benchmark, is about as useful for comparing
>systems as lines-of-code/day for comparing programmers, and both are
>about as useful as manure output for measuring the work done by horses.
>They're no better than order-of-magnitude, and they stink.

Who says that comparing the cost of a workstation per whatever *mark is
appropriate for the user means comparing systems?  You compare PRICES.

If some people don't give a damn about having to pay $100,000+ for a
workstation that has the same (or even worse) overall performance
as another workstation at 1/10 of the price, it's their problem.
At least I find such price info very useful.  It opens my eyes to
the cheaper of the two.

On the other hand, if a person responsible for computer procurement
always buys the cheapest system around regardless of performance, or
buys the fastest system regardless of price, I think that person
should look for another job.

Bo