[net.micro] Yet Another Set of Benchmarks.

dan@idis.UUCP (Dan Strick) (12/05/83)
This is yet another collection of benchmarks.  The value of these
numbers as criteria for selecting a machine is uncertain.  There
are more important things than speed.

Most of the measurements are obsolete (machines and software change
rapidly).  The configuration (and sometimes the model) of each machine
is not specified.  The benchmarks are very simple and not comprehensive
in any sense.  They are intended to measure the speed at which a
machine can do basic things like fetching instructions and pushing words
around.  None of these benchmarks measure disk throughput or the
efficiency of the operating system in any precise way.  The compile
times were recorded, but be warned that they may be misleading
because the compilers may differ and the compile times may be very
configuration dependent.  The cpu benchmarks measure something that
depends on both the speeds of the processors and the quality of the
code produced by the compilers for the particular C language
statements that were used in the benchmarks.  It is not clear what
this has to do with the relative speeds with which a particular program
would run on the various machines.  Note in particular that these
benchmarks do not reflect the benefits of independent i/o or graphics
processors, dma terminal interfaces, or other hardware features that
can make a big difference in real applications, especially on multiuser
systems.

Unlike others who have posted benchmarks, I intend to draw some
conclusions and make some comments about the relative performance
of certain machines.  These conclusions and comments do not reflect
opinions held by my employer (or even by me as I reserve the right
to change my mind if seriously challenged).

------------------------------------------------------------------

These are the benchmarks (called "b1" and "b2").  B1 is essentially
the same loop that has been making the rounds at computer conferences
during the last year or so.  B2 is a modification of B1 that uses
32 bit integers in an expression characteristic of certain programs
that are important to me.

b1:
main ()
{
	register int i, j, k;

	k = 0;
	for (i = 1;  i <= 1000; ++i)
		for (j = 1;  j <= 1000; ++j)
			++k;
}

b2:
main ()
{
	register int i;
	register long *p;
	register long l;
	long v[1000];

	for (i = 1; i <= 1000;  ++i) {
		l = 0;
		for (p = v;  p < v+1000;)
			l += *p++ >> 5;
	}
}

-------------------------------------------------------------------------------

All times are in seconds.  All sizes are in bytes.  Most of the machines
were otherwise idle when the benchmarks were run.  The bigger time sharing
systems were exceptions.  The machines are ordered according to the run time
of the first benchmark.

The first column identifies the conference at which the measurement was made.
"b" is for Boston, "s" is for San Diego, "t" is for Toronto, "f" is for a
small "computer faire" held recently at my university, and "p" is for
measurements made on my department's machines.


			      COMPILE		      EXECUTE	       SIZE.O
			user	sys	real	user	sys	real	text cpu
			--------------------	--------------------	---- ---

s SEL 32/87	b1	 0.6	 2.6	32	 1.0	 0.0	 2	 96

t SEL 32/67	b1	 1.5	 3.0	12	 1.8	 0.0	 3	 96
		b2	 1.7	 3.0	11	 4.4	 0.0	 4	 96

p VAX 11/780	b1	 1.9	 2.7	 8	 2.1	 0.3	 3	 36
  compat mode	b2	 1.9	 2.7	 8	13.5	 0.3	14	 76

p VAX 11/780	b1	 0.7	 1.2	 3	 2.6	 0.0	 3	 32
		b2	 0.8	 1.1	 3	 5.7	 0.1	 6	 44

s plexus p/25	b1	 2.7	 3.1	16	 3.6	 0.1	 4	 38  z8k
  model 1025	b2				16.5	 0.2	20	 66

t onyx c8002a	b1	 1.7	 3.9	15	 3.7	 0.1	 4	 26  z8k
  (6 MHz)	b2	 2.0	 4.0	15	11.9	 0.1	12	 54

t crds		b1	 1.9	 2.8	24	 3.9	 0.1	 5	 54  68k
  universe 68	b2	 2.0	 2.9	24	 7.0	 0.1	 8	 66

t zilog sys 8k	b1	 2.2	 4.8	26	 3.9	 0.0	 4	 30  z8k
  model 11	b2	 2.4	 4.7	27	16.0	 0.1	16	 54

f masscomp	b1	 1.2	 2.1	11	 4.2	 0.1	 5	 48  68k
  workstation	b2	 1.3	 2.2	10	 6.5	 0.1	 7	 56

t pixel 100/ap	b1	 3.1	 3.5	43	 4.4	 0.1	 8	     68k
		b2	 3.6	 3.3	44	 6.8	 0.1	10

b onyx c8002	b1	 2.5	 4.6	18	 4.6	 0.0	 6	     z8k
  (4 MHz?)

t hp 9000	b1	 1.5	 4.7	17	 5.1	 0.0	 6
  series 500	b2	 1.9	 4.7	18	 9.3	 0.0	11

s cci power5	b1	 3.9	 6.9	16	 5.2	 0.0	 6	     68k
  model 20

t callan	b1	 7.4	 8.3	31	 5.5	 0.1	 6	 52  68k
  unistar 200	b2	 8.0	 8.2	30	 8.5	 0.1	 9	 60

s onyx c5002a	b1	 2.5	 6.0	20	 5.7	 0.1	 6	 26  z8k

t wicat sys 200	b1	 1.6	 5.3	18	 6.0	 0.1	 6	 48  68k
		b2	 2.0	 5.3	19	 9.2	 0.1	10	 56

f ncr tower	b1	 3.8	 6.1	30	 6.1	 0.2	 7	 52  68k
		b2	 4.0	 6.2	28	 9.2	 0.1	10	 60

t spectrix	b1	 4.8	 3.8	25	 6.1	 0.1	 7	     68k
  series 10	b2	 4.8	 3.9	25	 7.7	 0.1	10

s altos 586	b1	 0.9	 3.9	17	 6.2	 0.2	 6	    8086
		b2				29.0	 0.1	29

s pacific	b1	 3.6	 5.9	22	 6.6	 0.2	 7	     68k
  pm400		b2				 8.9	 0.2	 9

s Hawk 32	b1	 3.6	12.8	26	 6.9	 0.2	 8	 52  68k
		b2				10.7	 0.1	11	 60

t intel ?286	b1	 5.8	 4.2	20	 6.9	 0.0	 7	   80286
		b2	 6.2	 3.5	20	34.6	 0.0	35

p pdp 11/40	b1	 1.3	 5.2	15	 7.8	 0.0	 8	 36
  (unix v6.5)	b2	 1.7	 5.2	15	38.0	 0.0	39	 76

s wicat sys 150	b1	 2.4	 5.3	23	 8.2	 0.1	 8	     68k
		b2				12.3	 0.1	13

b intel 86/330x	b1	 3.2	 4.6	13	 8.5	 0.1	 9	    8086

t fortune	b1	 3.4	 5.0	13	 8.0	 0.1	 8	 64  68k
  32:16		b2	 3.8	 5.0	14	10.6	 0.1	11	 68

t apple lisa	b1	 3.7	 6.0	39	 8.0	 0.1	 8	 52  68k
  (uniplus)	b2	 3.9	 6.1	40	12.9	 0.1	13	 60

s dual sys 83	b1	 5.3	 6.0	20	 8.8	 0.1	 9	     68k
		b2	 5.4	 7.2	22	13.1	 0.1	13

s altos 6800	b1	 4.7	 8.9	40	 9.1	 0.2	10	     68k
		b2				23.1	 0.2	24

t dec pro 350	b1	 1.1	 5.9	25	 9.2	 0.0	10	   11/23
  (venix)	b2	 1.5	 6.1	26	29.0	 0.1	29

p crds mb-211	b1	 1.2	 8.5	23	10.1	 0.1	11	 36 11/23
  (venix)	b2	 1.6	 8.6	23	59.2	 0.2	76	 76

b altos 8600	b1	 2.6	 8.7	26	10.4	 0.0	11	    8086
  (xenix)

t ibm pc	b1	 2.1	 7.8	19	18.7	 0.2	19	    8088
  (coherent)	b2	 2.5	 6.5	20	81.4	 0.1	82

t ibm pc	b1	 8.4	 6.5	37	20.3	 0.2	22	 48 8088
  (venix)	b2	 8.9	 6.5	38     118.2	 0.2   119	102


------------------------------------------------------------------------------

The conclusions:

It is amazing that all the microcomputers seem to run at pretty much the
same speed even though they are based on different chips with different
architectures that have been developed from different proprietary flavors
of chip technology.  We can find certain machines that are absurdly slow
(Are you listening, Big Blue?) but we can't find a (modern 16 bit) chip
that is an order of magnitude faster than another.  Either the research
groups of the semiconductor companies are all plodding along making steady
engineering improvements in their technology without making any sudden
breakthroughs or they are all stealing each other's technology.
The speed at which a microcomputer runs may be limited mainly by memory
and I/O bus.  Everyone seems to be using pretty much the same memory
chips.

There are notable differences in architecture.  The Z8000 and 8086 chips
support smaller featureless (i.e. single segment) address spaces than the
68000 chips.  This is not reflected by any of the benchmarks.  The details
of the instruction sets are less important when one programs in a high level
language, but some general characteristics of an architecture can be
felt.  For example, if we arrange the systems in the order of second
benchmark run times, we see a pattern:

	SEL 32/67			  4.4
	VAX 11/780			  5.7
	masscomp workstation		  6.5		68k
	pixel 100/ap			  6.8		68k
	crds universe 68		  7.0		68k
	spectrix series 10		  7.7	 	68k
	callan unistar 200		  8.5		68k
	pacific pm400			  8.9		68k
	wicat sys 200			  9.2		68k
	ncr tower			  9.2		68k
	hp 9000	series 500		  9.3
	fortune 32:16			 10.6		68k
	Hawk 32				 10.7		68k
	onyx c8002a (6 MHz)		 11.9		z8k
	wicat sys 150			 12.3		68k
	apple lisa (uniplus)		 12.9		68k
	dual sys 83			 13.1		68k
	VAX 11/780 compat mode		 13.5
	zilog sys 8k model 11		 16.0		z8k
	plexus p/25 model 1025		 16.5		z8k
	altos 6800			 23.1		68k
	dec pro 350 (venix)		 29.0		11/23
	altos 586			 29.0		8086
	intel ?286			 34.6		80286
	pdp 11/40 (unix v6.5)		 38.0
	crds mb-211 (venix)		 59.2		11/23
	ibm pc (coherent)		 81.4		8088
	ibm pc (venix)			118.2		8088

The 68000 machines seem to handle 32 bit stuff better than the others.
The few Z8000 machines ended up in the middle of the pack while the 11/23
and 8086 family systems finished last.  I tend to regard the first
benchmark as more relevant to most applications than the second (an
offhand opinion), but I award first prize to the 68000 on the basis of the
second benchmark because it differentiates more strongly between the
architectures.
(These benchmarks use no 32 bit multiply or divide instructions and no floating
point instructions and therefore are not appropriate for number crunching
applications.)

The 16032 chip may eventually be a winner, but it is too new to enter into
the competition just yet.

-----------------------------------------------------------------------

The comments:

I refuse to nominate any machine for "best buy", partly because different
machines are better for different applications and partly because the answer
for any particular application depends less on speed and more on features
not considered here.  I chose my machine (a Sun Workstation) because it had
nice graphics, a nice version of UNIX, and a few other features I liked.
I will post benchmarks for my machine as soon as I get my SUN 2 processor
board.  I expect it to run about as fast as the Masscomp machine.

I would like to express considerable irritation at the extent to which
hype has replaced fact in recent advertising.  The Z8000, 8086, and 68000
instruction sets and addressing modes have been described by their
manufacturers as "orthogonal, just like the pdp11."  This is garbage.
The 68000 is probably the best of a bad lot.  Fortunately we don't have
to program in assembler language any more.  A common violation of good taste
and polite behavior is to compare not-yet-developed processor chips running
at hypothetical clock rates to currently available chips.  Other tricks are
to use favorable high level benchmarks (your XXX compiler produces more
efficient code than his) or benchmarks designed to fit a particular
architecture (just happens to correspond exactly to a sophisticated
microprogrammed instruction on your machine).

It is difficult to single out individual companies or products for awards,
perhaps because there are so many imaginative performances to choose from.
There may be one that stands out a little beyond the rest.  On page 67
of the February 1983 Mini-Micro Systems magazine:
	"The Intel, iAPX 286.  It gives you three times the performance
	of what you thought was the fastest chip in the market".
On page 252 of the June issue:
	"It offers three times the performance of any other microprocessor".
This claim was repeated almost endlessly during the netnews microprocessor
architecture wars waged earlier this year.  I got my hands on one and ran
benchmarks at the Toronto meeting (see above).  The system was some sort
of prototype and only ran at 5 MHz, but even at 7.7 MHz (available early
in 1984 according to the June advertisement) the system would be merely
competitive with the fastest Z8000 and 68000 systems.

The fabled HP 9000 series 500 was another disappointment.  This is the
32 bit machine with a proprietary stack architecture and a big price tag.
It ran fairly well, but it didn't blow away the competition as expected.

On the bright side, the LISA did much better than I expected as it
is said to run at 5 MHz (most 68k machines seem to run at 10 MHz
these days) and has an amazingly slow disk.  The PIXEL and WICAT machines
have improved a great deal since I first tried them out at earlier
conferences.  The DEC Professional 350 ran well for a machine with an
old architecture and no dma devices.

The new super micros are nice, but it is hard to beat the VAX 11/780
as a UNIX machine.  Take a look at the real compile times.  We run
4.0bsd with a 2k block size.

				Dan Strick
				University of Pittsburgh
				School of Library and Information Science
				decvax!idis!dan
				mcnc!idis!dan