[comp.arch] Benchmarking ...really Nelson benchmarks

mash@mips.UUCP (John Mashey) (05/21/87)

In article <6024@steinmetz.steinmetz.UUCP> davidsen@kbsvax.steinmetz.UUCP (William E. Davidsen Jr) writes:
>
>As the developer of a benchmark suite of my own I would love to cast
>bricks at the Nelson suite. In truth it's a pretty good set of benchmarks,
>and has been run on hundreds of configurations. I agree that it would be
>a suitable measure of machines.

Is there anybody out there who:
	a) Knows enough of what's going on inside the Nelson suite to
	explain it, and
	b) Is legally allowed to?

I have several reasons for asking:
a) Many people have referenced it, but I haven't seen a publicly-available
description of how it works.  (If I've just missed it, please point me at it).

b) A while back, an (unnamed) prospect wished to get Nelson numbers on our
machine, which I assisted in.  There were several oddities:
	1) From the outside, the test mostly seemed to measure raw
	disk performance, and the general performance statistics (as seen
	by vmstat) didn't look very much like what I've seen real systems
	run like.  [I can't remember why; I just remember they looked odd.
	Can anybody else comment?  This was on a very early version of the
	OS, so it may have been strange.]  In particular, it seemed like
	there was a higher percentage of idle time, even with many scripts
	running, than one would expect.
	2) Although I don't still have a copy of the benchmark source
	around, I do remember in the process of getting it working to
	have seen some "arithmetic" test that looked like:
		a = b + c;
		d = e - f;
		g = h * i;
		j = k / l;
	that purported to measure arithmetic speed of the processor.
	I have no idea how that counted in the tests, or how it was
	actually used, but seeing something like that instantly raises
	my skepticism level rather high, since it makes no attempt
	whatsoever to act like real code.

Anyway, this is not an attempt to throw rocks at that benchmark, but just a wish
that somebody would tell us what it's supposed to measure, and generally,
how it does it, so we can assess whether or not it means anything.
-- 
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	{decvax,ucbvax,ihnp4}!decwrl!mips!mash, DDD:  	408-720-1700, x253
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086

bobw@wdl1.UUCP (05/26/87)

/ wdl1:comp.arch / mash@mips.UUCP (John Mashey) /  1:36 am  May 21, 1987 /
In article <6024@steinmetz.steinmetz.UUCP> davidsen@kbsvax.steinmetz.UUCP (William E. Davidsen Jr) writes:
>
>As the developer of a benchmark suite of my own I would love to cast
>bricks at the Nelson suite. In truth it's a pretty good set of benchmarks,
>and has been run on hundreds of configurations. I agree that it would be
>a suitable measure of machines.

One "curious" point about the Nelson benchmarks is their intentional
use of goto throughout the c source. This is supposed to make the
results independent of optimizing in the compilers on different
systems.

I've never been quite sure what that accomplishes. To put it another
way, what is the benchmark supposed to be measuring: SYSTEM
performance, or HARDWARE performance? If the first, then a
sophisticated compiler/optimizer seems to me a significant part of
system performance. After all, the application code you run on the
system will presumably get some benefit out of the optimizer too.
If the second, what relevance has it to buying a SYSTEM? What the
customer presumably cares about is how well the system runs his/her
application, not the rate at which it can access memory or increment
a register. (The latter may well be of interest to system designers,
looking for bottlenecks to relieve, or for marketing types trying to
design a benchmark that makes their system look better than the
competition, but Nelson is being used to make purchase decisions.)

-----------------------------------------------------------------
I disclaim almost everything, probably including this line.

chuck@amdahl.UUCP (05/27/87)

In article <3490003@wdl1.UUCP> bobw@wdl1.UUCP (Robert Lee Wilson Jr.) writes:
>What the
>customer presumably cares about is how well the system runs his/her
>application, not the rate at which it can access memory or increment
>a register.

I'm getting a little bored with this topic so I probably shouldn't
help proliferate it, but...

There are a number of people suggesting that a holistic approach to
measuring system performance is necessary.  Being a reductionist at
heart, I think it quite likely that a reductionist approach could
be viable.

The holistic approach suggests that you need to simulate the application
environment on each system that you are considering buying.  This
obviously has a number of difficulties associated with it.  Perhaps
the biggest difficulty is accurately determining what the application
environment will actually look like.

The reductionistic approach suggests that system performance can
be estimated by examining a number of pieces of the system.  In
particular, I would suggest the following pieces:

1)  raw cpu power:  how many VAX 785 MIPS does the processor get?

2)  raw I/O power:  how long does it take to get data off the disk?
Both serial access and random access should be measured.

3)  context switching speeds:  how long does it take for the system
to process an interrupt?  how long does it take for the system to
switch between user programs?

4)  compiler performance:  how well does the compiler optimize?
how slow is the compiler?  how large are the binaries produced by
the compiler?

5)  system cost.

With numbers like these I can make some ballpark estimates as
to how well a system will perform for my applications, and if nothing
else, I can quickly narrow down the number of systems that I am
considering buying.

Any other reductionists want to help me out with my argument?

-- Chuck

tuba@ur-tut.UUCP (Jon Krueger) (05/28/87)

In article <7261@amdahl.amdahl.com> chuck@amdahl.UUCP (Charles Simmons) writes:
>...
>The holistic approach suggests that you need to simulate the application
>environment on each system that you are considering buying.  This
>obviously has a number of difficulties associated with it.  Perhaps
>the biggest difficulty is accurately determining what the application
>environment will actually look like.
>...
>Any other reductionists want to help me out with my argument?
>-- Chuck

If you know what application you're going to run, and your
requirements and software won't change significantly over time, and
you don't mind performing unnecessary tests (of systems that clearly
can't meet requirements), and you don't care which of your candidates
is more likely to remain competitive and save you money in the future,
by all means go holistic!  Throw away all those questionable
benchmarks and perform the only test that matters to you.

If life isn't that simple in your shop, you might need some
measurements and analysis to guide a purchase choice in a complex
world.  See, for example, Richardson's comments in the clarify.doc
file he supplies in the dhrystone distribution.

					-- jon