[comp.arch] Will the real 10 mips machine please stand up?

jaw@aurora.UUCP (James A. Woods) (08/19/87)

# "Well done is quickly done." -- Augustus Caesar 

     Below please find a small scalar integer benchmark, easy to test on
UNIX machines.  It hashes around nearly randomly in a 3/4 MB address space.

		time compress /usr/dict/words

                u+s    user   sys

VAX 11/780     14.7    13.6   1.1   (4.3bsd, no asm assist ifdef)
Sun  3/180      5.9     5.5   0.4   (3.2 OS)
Mips M/500      3.2     2.7   0.5
Sun  3/2x0       ?                  (feel free to fill in this fig.)
Sun  4/260      2.8     2.6   0.2   (machine sun!plaid)
Mips M/800      1.9     1.7   0.2
Cray 2          1.8     1.81  0.06
Mips M/1000     1.8     1.5   0.3   (systime anomaly -- high load ave.?)

Figures for Mips are courtesy mips!dce, for the Sun 4 courtesy sun!chuq;
VAX 11/780, Sun 3/180, and Cray 2 runs were made at NASA Ames.  The stock
4.3 /usr/dict/words amounts to 198KB.  Disks all are Fujitsu Eagles except
on the Cray 2.

     Comment:  Results support the MIPS general integer benchmark claims.
Like much code written on Vaxen, 'compress' exhibits more than a few
instructions per function call.  I have a faster 'compress' algorithm
in mind which would index sparse data in 2-16MB -- at this point in the
time/space tradeoff curve, one would see better performance from designs
with the better TLB/virtual memory system.

     -- James A. Woods (ames!jaw)

dce@mips.UUCP (David Elliott) (08/19/87)

In article <905@aurora.UUCP> jaw@aurora.UUCP (James A. Woods) writes:
>
>		time compress /usr/dict/words
>
>	        u+s    user   sys
>
>Mips M/500  	3.2     2.7   0.5
...
>Mips M/800     1.9	1.7   0.2 
...
>Mips M/1000	1.8	1.5   0.3	(systime anomaly -- high load ave.?)

In reality, the machine I thought was an M/1000 was just an M/800. A
real M/1000 yields

		1.6     1.4   0.2-  (varies between 0.1 and 0.2)

>Figures for Mips are courtesy mips!dce, for the Sun 4 courtesy sun!chuq;
>VAX 11/780, Sun 3/180, and Cray 2 runs were made at NASA Ames.  The stock
>4.3 /usr/dict/words amounts to 198KB.  Disks all are Fujitsu Eagles except
>on the Cray 2.

The MIPS machines are all running untuned Fujitsu 2333 disks with 63
sectors. Using 64 sectors has been proven to be faster, and future
systems will be set up this way.

-- 
David Elliott		{decvax,ucbvax,ihnp4}!decwrl!mips!dce

jaw@aurora.UUCP (James A. Woods) (08/19/87)

# fools speed ahead

Here is an updated table, with new Sun figures provided by 
David DiGiacomo of Sun.

                u+s    user   sys

VAX 11/780     14.7    13.6   1.1   (4.3bsd, no asm assist ifdef)
Sun-3/180       5.9     5.5   0.4   (SunOS 3.2)
Sun-3/260       3.5     3.2   0.3   (SunOS 3.4)
Mips M/500      3.2     2.7   0.5
Sun-4/260       2.1     1.9   0.2   (SunOS Sys4BETA2, 65 ns clock)
Mips M/800      1.9     1.7   0.2
Cray 2          1.86    1.80  0.06
Mips M/1000     1.8     1.5   0.3   (measured w/high load ave.)

It's interesting to watch the in-progress compiler/clock/memory tuning. 
Awaiting another round in the Sun/Mips benchmark wars...

ames!jaw

jaw@aurora.UUCP (James A. Woods) (08/20/87)

# "Leave the factory, leave the forge, and dance to the new St. George!"
       -- Richard Thompson

Given that no vectorization is involved, picking on the poor
Cray 2 is a bit unfair, I admit.

Just about any junk C code would do as well on a modern RISC board.
It's the old word addressing problem (code for a simple loop like

	while (*s++ == *t++) ;

generates about 20-25 instructions), coupled with the abysmal 238 nsec
memory, which only fetches vectors at one 4.1 nsec clock tick per word.

Working over the code to expand chars to ints might help, or getting
the compiler to stick array buffers into register storage would
probably work wonders.

Aside from such desiderata, one new hope for Seymour (for scalar work) is
the "new improved 2S" with 40% faster RAM (static vs. dynamic).  Until
the "three" comes out, an Amdahl or an NEC would likely be the champ:

                  u+s   user  sys

Cray X-MP/14      0.94  0.91  0.03   (9.5 ns clock; -hcisl instead of -O)
Cray X-MP/416     0.84  0.82  0.02   (8.5 ns clock; -hcisl instead of -O)
Amdahl 5890-190   0.54  0.51  0.03   (== half of a model 5890-300 CPU)

Thanks to ucbvax!violet!luzmoor for the X-MP results.
(Bell Labs code 1127 people might show off their rumored new X-MP C
compiler if they dare.)
Also hats off to amdahl!{mat,chuck} for the 5890 figure.

ames!jaw