[comp.sys.dec] DECstation 3100 - is it fast?

cracraft@wheaties.ai.mit.edu (Stuart Cracraft) (04/28/89)

After benchmarking the new 3100 with a C dhrystone program
(25,000 dhrystones per second), I felt it would probably
do pretty well on an application, since a VAX 8810 scored
8,333 dhrystones per second with the same C program.

Imagine my surprise when the application ran only marginally
faster on the 3100 (unloaded) than on the 8810!

Do you think (dare say) these machines are optimized for
the typical benchmarks?

	Stuart

jg@jumbo.dec.com (Jim Gettys) (04/29/89)

In article <2056@gluteus.ai.mit.edu> cracraft@wheaties.ai.mit.edu (Stuart Cracraft) writes:
>
>After benchmarking the new 3100 with a C dhrystone program
>(25,000 dhrystones per second), I felt it would probably
>do pretty well on an application, since a VAX 8810 scored
>8,333 dhrystones per second with the same C program.
>
>Imagine my surprise when the application ran only marginally
>faster on the 3100 (unloaded) than on the 8810!

How about more information about the program, and how you compiled it....

Did you use higher levels of optimization, cache ordering, or what?

				- Jim

rwood@vajra.uucp (Richard Wood) (04/29/89)

cracraft@wheaties.ai.mit.edu (Stuart Cracraft) writes:
> 
> After benchmarking the new 3100 with a C dhrystone program
> (25,000 dhrystones per second), I felt it would probably
> do pretty well on an application, since a VAX 8810 scored
> 8,333 dhrystones per second with the same C program.
> 
> Imagine my surprise when the application ran only marginally
> faster on the 3100 (unloaded) than on the 8810!
> 
> Do you think (dare say) these machines are optimized for
> the typical benchmarks?
> 
> 	Stuart

If you compare that Dhrystone number to "MIPS" as defined by some of our
more aggressive competitors, you'd probably discover that our box is
rated at over 16 "Dhrystone MIPS".  This isn't a term Digital likes to
use, because it implies more than we (or anyone) can deliver.

To judge a machine by a toy program that will fit in the cache of just
about every machine that has one is silly.  It also ignores the fact that
the overwhelming majority of applications are not wholly compute intensive -
they usually have a good mix of compute and I/O.

DIGITAL DOES NOT DESIGN MACHINES TO RUN WELL IN BENCHMARKS.

If we did, we'd also be the kind of company that would claim as our
performance numbers some point way off the curve.  We do things
conservatively, as befits a corporation that other companies depend on
to state reasonable claims.

(Note that the VAX 8810 is rated by DEC at 7 VUPS (where a VUP is a
conservative estimate of the speedup factor over a VAX 11/780 running
the same operating system and compilers), while the DECstation 3100 is
rated at about 12 VUPS.  It is quite within reason that the DS3100
would only be "marginally" faster - there are some things that the
large VAXes are very good at, while the MIPS processor might not do
quite as well.)

Tell us more about the application, and perhaps the mystery will
evaporate.

-- ----------------------------------------------------------------------------
Does it need saying that I'm not speaking as an official representative of DEC?
===============================================================================
Richard Wood  !  U. S. Worksystems, Palo Alto  !  Digital Equipment Corporation

rec@dg.dg.com (Robert Cousins) (05/01/89)

In article <2056@gluteus.ai.mit.edu> cracraft@wheaties.ai.mit.edu (Stuart Cracraft) writes:
>
>After benchmarking the new 3100 with a C dhrystone program
>(25,000 dhrystones per second), I felt it would probably
>do pretty well on an application, since a VAX 8810 scored
>8,333 dhrystones per second with the same C program.
>
>Imagine my surprise when the application ran only marginally
>faster on the 3100 (unloaded) than on the 8810!
>
>Do you think (dare say) these machines are optimized for
>the typical benchmarks?
>
>	Stuart

You must remember that just because a processor does some things
well doesn't mean it does everything well.  There are several potential
problems with the Rxxxx series, such as the MMU.  Whenever the address
translation unit misses, a trap is taken and SOFTWARE has to reload the TLB.
This can take a large amount of time away from applications if memory access
patterns are sufficiently random.  Dhrystone is quite localized, so this
will not normally be seen there.  Also, I have noticed that MIPS has
some of the most sophisticated optimizing compilers around, yet if you
run another RISC at the same speed (such as the 88K), the MIPS loses
the Dhrystone race.  There is something non-obvious about predicting
the performance of MIPS-based machines in general which I don't know about.

As for the general case of RISCs versus CISCs, I can say that my experience
with the 88K products (AViiON in particular) tells me that for most 
applications a RISC will substantially outperform a CISC.  We have noticed
that DG/UX runs MUCH better on a multiple processor 88K than it does on
a Dual MV20000.

It is possible to optimize architectures and compilers for certain 
benchmarks, but this almost always gets caught.  Independent benchers
will usually change around the standard benchmarks enough to fool 
special recognition routines in the compilers.

Robert Cousins
Dept. Mgr, Workstation Dev't
Data General

Speaking for myself alone.