[comp.sys.m88k] Fastest 88k

cpca@iceman.jcu.oz (C Adams) (10/31/90)

I have recently heard about an advertisement claiming 60 MIPS for an
88k based machine (from motorola).  I find this figure way too high,
at least for a single CPU system.

So, what is the fastest 88k around? I'd like to know the clockspeed
and some meaningless MIPS figures.

I'm sure this has been discussed before, so reply by email if it
has please.

Thanks in advance.

Colin Adams        You need to have a woman, before you can have a Sun :-)
Email Address -    cpca@marlin.jcu.edu.au

mash@mips.COM (John Mashey) (11/01/90)

In article <1172@iceman.jcu.oz> cpca@iceman.jcu.oz (C Adams) writes:
>I have recently heard about an advertisement claiming 60 MIPS for an
>88k based machine (from motorola).  I find this figure way too high,
>at least for a single CPU system.
>
>So, what is the fastest 88k around? I'd like to know the clockspeed
>and some meaningless MIPS figures.

The fastest 88K for which SPEC numbers have been published
was the 33MHz Motorola 8612, with 2 88200 chips, which was
published Winter 1990.
(However, note that, unless something has changed recently, neither
the Motorola systems products nor any DG products are yet shipped at 33MHz.)

Regarding mips-ratings, there is a fairly confusing state.  In particular,
it is sad, but true, that mips-ratings are pretty arbitrary.
Here is a small table, taken from "Your Mileage May Vary", Issue 2.0,
which summarizes a lot of SPEC data.
SPEC Intgr = Integer Subset of SPEC, which corresponds pretty closely to
many people's idea of a real-VAX-mip.
SPEC Float = Float subset of SPEC
SPECmark = SPECmark, all 10 benchmarks, somewhat loaded towards FP.
Publ mips = rating the vendor assigns
Integer % = SPEC integer / (published mips)
Publ data: 2 = Winter 90, 3 = Spring 90, 4 = Summer 90, c = estimated
	(The 2 MPC numbers were estimated by giving them the same performance
	as the earlier 8864SP machines, even though those have 4X bigger
	caches.)

SPEC	SPEC 	SPEC	Publ	Integer	Date	System
Intgr	Float	mark	mips	%	Publ
11.3	8.3	9.4	16.0	71%	3	DG AV410, 20MHz, 32K cache
15.3	11.3	12.7	20.0	76%	2	DG AV6200, 25MHz, 32K

14.6	10.8	12.2	17.0	86%	2	Motorola 8864SP, 20MHz, 128K
18.3	13.5	15.2	21.0	87%	2	Motorola 8864SP, 25MHz, 128K
21.4	15.8	17.8	25.0	85%	2	Motorola 8612, 33MHz, 32K
14.6	10.8	12.2	27.0	54%	c	Motorola MPC-100, 20MHz, 32K
18.3	13.5	15.2	33.8	54%	c	Motorola MPC-200, 25MHz, 32K

19.4	16.8	17.8	20.0	97%	4	MIPS Magnum, 25MHz, 64K

Also, there's a new DG product, the AV100, which runs at 16.7MHz, and is
now labeled 17 mips, for a new low cost of $235/mips.
As illustrated above, when calibrated against realistic integer benchmarks,
the industry's idea of a mips varies by a factor of almost 2,
ranging from 54% to 97%. Above, I picked the two ends of the spectrum.
The number of product lines in each % group is as follows:
<60%	3
60-70%	3
70-80%	2
80-90%	4
90-100%	3

(There are no existing product lines, for which the SPEC integer number
is higher than the claimed vax-mips number.  surprise :-)
Of course, in such an environment, it is unclear what is meant by 3 digits
of accuracy in a cost/mips number....
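
The Integer % column can be recomputed from the SPEC integer and published
mips columns. What follows is only a minimal sketch in Python using the rows
of the table above; the names SYSTEMS and integer_percent are made up for
illustration, and a recomputed value may differ by a point from the posted
column because of rounding.

```python
# Rows from the table above: (system, SPEC integer, published mips).
SYSTEMS = [
    ("DG AV410, 20MHz", 11.3, 16.0),
    ("DG AV6200, 25MHz", 15.3, 20.0),
    ("Motorola 8864SP, 20MHz", 14.6, 17.0),
    ("Motorola 8864SP, 25MHz", 18.3, 21.0),
    ("Motorola 8612, 33MHz", 21.4, 25.0),
    ("Motorola MPC-100, 20MHz", 14.6, 27.0),
    ("Motorola MPC-200, 25MHz", 18.3, 33.8),
    ("MIPS Magnum, 25MHz", 19.4, 20.0),
]


def integer_percent(spec_int, publ_mips):
    """Integer % = SPEC integer / (published mips), as a percentage."""
    return 100.0 * spec_int / publ_mips


percents = {name: integer_percent(s, m) for name, s, m in SYSTEMS}
for name, pct in percents.items():
    print(f"{pct:5.1f}%  {name}")

# Ratio between the best- and worst-calibrated ratings in this table,
# i.e. how much "a mips" varies from one vendor rating to another.
spread = max(percents.values()) / min(percents.values())
print(f"spread: {spread:.2f}x")
```

The spread printed at the end is the "factor of almost 2" mentioned above,
restricted to the eight systems listed here.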

A little explanation might be needed to account for the difference between the
earlier Motorola products (whose integer % is around 86%),
and the later ones, which are around 54%.
The later ones apparently switched to Dhrystone 1.1, probably following
IBM, boosting the mips-rating by >50%, although the performance
probably stayed about the same (compiler improvements versus smaller
cache probably cancel, within a few percent).

To answer the original question, the 60MIPS number is probably for
a two-processor system (MPC-200 with 2 CPUs), gotten by adding
the MIPS together.
-- 
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	 mash@mips.com OR {ames,decwrl,prls,pyramid}!mips!mash 
DDD:  	408-524-7015, 524-8253 or (main number) 408-720-1700
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086

tom@ssd.csd.harris.com (Tom Horsley) (11/01/90)

OK, I have been reading these notes on SPEC numbers for the 88k, and my
fingers started twitching out of control with the desire to publish our
SPECtacular numbers, so here they are:

Machine:   25MHz Harris Night Hawk - 1 processor - 128K cache (64I/64D)
Compilers: Harris Common Code Generator (CCG) C and Fortran

SPEC benchmark run times (all times are in seconds):

    BENCHMARK     VAX time   wall time     SPEC  Ratios
                                         Motorola   Harris

   001.gcc1.35      1482.0        64.0     17.5      23.2
  008.espresso      2266.0       108.0     19.4      21.0
  013.spice2g6     23951.0      1804.0     12.5      13.3
    015.doduc       1863.0       158.0     10.1      11.8
    020.nasa7      20093.0      1258.0     15.2      16.0
     022.li         6206.0       315.0     20.7      19.7
   023.eqntott      1101.0        65.0     16.0      16.9
  030.matrix300     4525.0       218.0     18.4      20.8
    042.fpppp       3038.0       167.0     14.7      18.2
   047.tomcatv      2649.0       188.0     11.6      14.1
                                                    ******
 GEOMETRIC MEAN     3867.7       226.0     15.2     *17.1*
                                                    ******

Yes, that's 17.1 on a box that is fairly similar to the Motorola box
that got 15.2, and it's all done with compiler technology (not blue smoke and
mirrors :-). (The Motorola numbers were taken from the last published
numbers in a SPEC newsletter for a 25MHz box).
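
For anyone who wants to check the arithmetic: each SPEC ratio is the VAX
time divided by the measured wall time, and the bottom line is the geometric
mean of the ten ratios. A small sketch in Python using the Harris times from
the table above (the names RUNS, spec_ratio and geometric_mean are just
illustrative choices):

```python
import math

# (benchmark, VAX time, Harris wall time), all in seconds, from the table.
RUNS = [
    ("001.gcc1.35", 1482.0, 64.0),
    ("008.espresso", 2266.0, 108.0),
    ("013.spice2g6", 23951.0, 1804.0),
    ("015.doduc", 1863.0, 158.0),
    ("020.nasa7", 20093.0, 1258.0),
    ("022.li", 6206.0, 315.0),
    ("023.eqntott", 1101.0, 65.0),
    ("030.matrix300", 4525.0, 218.0),
    ("042.fpppp", 3038.0, 167.0),
    ("047.tomcatv", 2649.0, 188.0),
]


def spec_ratio(vax_time, wall_time):
    """SPEC ratio: how many times faster than the VAX reference time."""
    return vax_time / wall_time


def geometric_mean(values):
    """n-th root of the product, computed via logs to avoid overflow."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


ratios = [spec_ratio(vax, wall) for _, vax, wall in RUNS]
for (name, _, _), ratio in zip(RUNS, ratios):
    print(f"{name:>14}  {ratio:5.1f}")

specmark = geometric_mean(ratios)
print(f"geometric mean of ratios: {specmark:.1f}")
```

The geometric mean is used rather than the arithmetic mean so that no single
benchmark with a large ratio dominates the summary number.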

Harris CCG compilers are a family of highly optimizing compilers (C,
Fortran, and Ada) available for the entire line of Harris Night Hawk
realtime computer systems (OK, OK, I apologize for the blatant plug).
Besides standard optimizations we have also put a lot of work into things
that greatly benefit RISC architectures in general and the 88k in
particular.

It is also worth noting that many of the above times are better than the
best numbers for 25MHz MIPS workstations (most of the double precision
intensive benchmarks do much better on MIPS - this should not surprise
anyone, it certainly does not surprise me).

Disclaimer:

No, these are not the compilers we are shipping in our current release; they
are our in-house development versions, which will eventually filter out into
future releases. We certainly hope that the compilers we eventually release
will do even better.
--
======================================================================
domain: tahorsley@csd.harris.com       USMail: Tom Horsley
  uucp: ...!uunet!hcx1!tahorsley               511 Kingbird Circle
                                               Delray Beach, FL  33444
+==== Censorship is the only form of Obscenity ======================+
|     (Wait, I forgot government tobacco subsidies...)               |
+====================================================================+

mash@mips.COM (John Mashey) (11/01/90)

In article <TOM.90Oct31160947@hcx2.ssd.csd.harris.com> tom@ssd.csd.harris.com (Tom Horsley) writes:
>OK, I have been reading these notes on SPEC numbers for the 88k, and my
>fingers started twitching out of control with the desire to publish our
>SPECtacular numbers, so here they are:

In my opinion, this is a GOOD posting: it actually has some useful data,
as opposed to random rumor-mongering.

>Machine:   25MHz Harris Night Hawk - 1 processor - 128K cache (64I/64D)
>Compilers: Harris Common Code Generator (CCG) C and Fortran

>SPEC benchmark run times (all times are in seconds):
....


>It is also worth noting that many of the above times are better than the
>best numbers for 25MHz MIPS workstations (most of the double precision
>intensive benchmarks do much better on MIPS - this should not surprise
>anyone, it certainly does not surprise me).
Of these times: on 4/10 it beats a Magnum (64KB cache), and loses on 6;
	on 3/10 it beats a DS5000/200 (128KB cache), and loses on 7.

>Disclaimer:

>No, these are not the compilers we are shipping in our current release; they
>are our in-house development versions, which will eventually filter out into
>future releases. We certainly hope that the compilers we eventually release
>will do even better.

Good disclaimer.  Be warned (from past experience!): sometimes people put in
optimizations that break things when lots of programs are compiled,
and the optimizations have to be pulled back to meet a release schedule.
However, this is a fair, and properly-disclaimered posting,
and clearly shows the best numbers I've seen for an 88K.
Now, just out of curiosity:
	a) What are the numbers you get using the current production compilers?
	b) About how far apart (in time) are those two versions?
	c) Do you feel that tuneups done to improve SPEC numbers carry over
	into improvements on other programs ... or not?
Any comments on those that you'd be willing to make would be good...
especially item c) would be interesting to a lot of people.
-- 
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	 mash@mips.com OR {ames,decwrl,prls,pyramid}!mips!mash 
DDD:  	408-524-7015, 524-8253 or (main number) 408-720-1700
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086

mash@mips.COM (John Mashey) (11/01/90)

In article <205@coplex.UUCP> dean@coplex.UUCP (Dean Brooks) writes:

>Our Motorola 8864DP (Dual Processor 88100 RISC) clocks in around 39 MIPS
>per processor.  This is much lower than their claim of 60 MIPS; but
>technically, the DP would give in around 78 MIPS in ideal processor slicing
>situations.

Since you've posted this number, could you explain for us on what
basis the 39 MIPS number is computed? 

In the absence of a calibration of a MIPS-rating, the only thing that
can be said is "no meaningful performance information" :-)
-- 
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	 mash@mips.com OR {ames,decwrl,prls,pyramid}!mips!mash 
DDD:  	408-524-7015, 524-8253 or (main number) 408-720-1700
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086

tom@ssd.csd.harris.com (Tom Horsley) (11/01/90)

>>>>> Regarding Re: Fastest 88k; mash@mips.COM (John Mashey) adds:

mash> Now, just out of curiosity:
mash> 	a) What are the numbers you get using the current production compilers?
mash> 	b) About how far apart (in time) are those two versions?
mash> 	c) Do you feel that tuneups done to improve SPEC numbers carry over
mash> 	into improvements on other programs ... or not?
mash> Any comments on those that you'd be willing to make would be good...
mash> especially item c) would be interesting to a lot of people.

a) I didn't pay a lot of attention to the numbers for the released compiler
   because we already had many of the optimizations under development at
   that time and knew we would get a lot better. All I remember was that the
   number was better than 15.2, but the individual benchmarks results were
   much more mixed (and the number was a lot closer to 15.2 than the 17.1 we
   are getting today).

b) It is difficult to say how far apart in time the compilers are since the
   advanced development was going on at the same time as a different
   baseline was being stabilized and packaged up for the release. A
   ball-park figure would be "a few months".

c) I would say that all the improvements we made are generally useful.  We
   look at a lot more benchmarks than just SPEC (some of them are rather
   large real customer applications, or benchmarks derived from those
   applications). We like to pick which optimizations to work on based on
   cost/benefit analysis - if we don't see the need for something in a lot
   of places, we generally don't work on it.

   Some of the SPEC benchmarks reacted fairly dramatically to some of our
   optimizations, but the optimizations were not designed specifically to
   get that reaction from SPEC. For example: the biggest single improvement
   came from a combination of loop-unrolling, teaching the instruction
   scheduler how to safely shuffle some loads past some stores (to keep the
   data unit pipeline going), and teaching the register allocator to pick
   registers in such a way as to allow the instruction scheduler maximum
   flexibility (to keep the floating point pipeline going). All of this is
   great stuff and is useful in almost any program.

   The SPEC matrix300 benchmark, however, spends 99.9% of its time in a
   single matrix multiply-and-add loop. When the above set of optimizations
   hit the matrix300 benchmark, the performance skyrocketed. This does not
   mean our optimizations are not generally useful, but it does mean that
   real programs which do actual work may not see a similar performance
   boost (but they certainly should get better).
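
The matrix300 point can be put in numbers with Amdahl's law. This is only a
sketch: the 2x loop speedup below is a hypothetical figure chosen for
illustration (it is not a Harris measurement), and the 50% and 10% cases are
made-up stand-ins for programs that are less dominated by one loop. Only the
99.9% figure comes from the discussion above.

```python
def overall_speedup(loop_fraction, loop_speedup):
    """Amdahl's law: only `loop_fraction` of the run time gets faster."""
    return 1.0 / ((1.0 - loop_fraction) + loop_fraction / loop_speedup)


LOOP_SPEEDUP = 2.0  # hypothetical speedup of the optimized loop

# 0.999 is the matrix300 figure; 0.50 and 0.10 are invented comparisons.
for fraction in (0.999, 0.50, 0.10):
    total = overall_speedup(fraction, LOOP_SPEEDUP)
    print(f"{fraction:6.1%} of time in loop -> {total:.2f}x overall")
```

Under these assumptions the loop-dominated program gets nearly the whole 2x,
while a program spending a tenth of its time in such loops still improves,
just by a few percent.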
--
======================================================================
domain: tahorsley@csd.harris.com       USMail: Tom Horsley
  uucp: ...!uunet!hcx1!tahorsley               511 Kingbird Circle
                                               Delray Beach, FL  33444
+==== Censorship is the only form of Obscenity ======================+
|     (Wait, I forgot government tobacco subsidies...)               |
+====================================================================+

andrew@frip.WV.TEK.COM (Andrew Klossner) (11/02/90)

C Adams (cpca@iceman.jcu.oz) writes:

	"I have recently heard about an advertisement claiming 60 MIPS
	for an 88k based machine (from motorola).  I find this figure
	way too high, at least for a single CPU system."

Motorola's current line of systems (based on their VME188) come with
one, two, or four CPUs on a card.  The 60 MIPS number no doubt comes
from multiplying some per-CPU number by either two or four.

It's a nice card, but with a cache disadvantage: since it implements a
single M bus, there can be no more than eight 88200 (CMMU) chips.  As
you increase the number of CPUs, you decrease the number of CMMUs per
CPU, and the cache miss rate climbs noticeably for large application
programs.
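
A back-of-the-envelope sketch of that tradeoff in Python. It assumes 16K of
cache per 88200 (consistent with the 8612 figures elsewhere in this thread:
two 88200 chips, 32K cache) and an even split of the eight CMMUs among the
CPUs; the constant and function names are made up for illustration.

```python
MAX_CMMUS = 8           # limit imposed by the single M bus
CACHE_PER_CMMU_KB = 16  # assumption: 16K of cache per 88200


def cache_per_cpu_kb(n_cpus):
    """Cache available to each CPU if the CMMUs are divided evenly."""
    cmmus_per_cpu = MAX_CMMUS // n_cpus
    return cmmus_per_cpu * CACHE_PER_CMMU_KB


for n in (1, 2, 4):
    print(f"{n} CPU(s): {MAX_CMMUS // n} CMMUs each, "
          f"{cache_per_cpu_kb(n)}K cache per CPU")
```

With these assumptions, going from one CPU to four cuts the cache per CPU
from 128K to 32K, which is the 4X difference in cache size that separates the
8864SP and MPC entries in the SPEC table.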

John Mashey (mash@mips.COM) writes:

	"The fastest 88K for which SPEC numbers have been published was
	the 33MHz Motorola 8612, with 2 88200 chips, which was
	published Winter 1990."

We had a 33MHz system with eight 88200 chips, but never got a chance to
run SPEC, sigh.  Those systems are now powered down and are sitting in
a sealed building on the Tektronix Wilsonville campus.

  -=- Andrew Klossner   (uunet!tektronix!frip.WV.TEK!andrew)    [UUCP]
                        (andrew%frip.wv.tek.com@relay.cs.net)   [ARPA]

mash@mips.COM (John Mashey) (11/02/90)

This seemed worth reposting into comp.arch: the topic (is making your
compiler better for SPEC benchmarks SPEC-specific or not) has come up
now and then.  Here's an additional opinion:

In article <TOM.90Nov1072249@hcx2.ssd.csd.harris.com> tom@ssd.csd.harris.com (Tom Horsley) writes:
>>>>>> Regarding Re: Fastest 88k; mash@mips.COM (John Mashey) adds:
....
>mash> 	c) Do you feel that tuneups done to improve SPEC numbers carry over
>mash> 	into improvements on other programs ... or not?
....
>c) I would say that all the improvements we made are generally useful.  We
>   look at a lot more benchmarks than just SPEC (some of them are rather
>   large real customer applications, or benchmarks derived from those
>   applications). We like to pick which optimizations to work on based on
>   cost/benefit analysis - if we don't see the need for something in a lot
>   of places, we generally don't work on it.
>
>   Some of the SPEC benchmarks reacted fairly dramatically to some of our
>   optimizations, but the optimizations were not designed specifically to
>   get that reaction from SPEC. For example: the biggest single improvement
>   came from a combination of loop-unrolling, teaching the instruction
>   scheduler how to safely shuffle some loads past some stores (to keep the
>   data unit pipeline going), and teaching the register allocator to pick
>   registers in such a way as to allow the instruction scheduler maximum
>   flexibility (to keep the floating point pipeline going). All of this is
>   great stuff and is useful in almost any program.
>
>   The SPEC matrix300 benchmark, however, spends 99.9% of its time in a
>   single matrix multiply-and-add loop. When the above set of optimizations
>   hit the matrix300 benchmark, the performance skyrocketed. This does not
>   mean our optimizations are not generally useful, but it does mean that
>   real programs which do actual work may not see a similar performance
>   boost (but they certainly should get better).

Can anyone else add any more comments, or examples?
-- 
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	 mash@mips.com OR {ames,decwrl,prls,pyramid}!mips!mash 
DDD:  	408-524-7015, 524-8253 or (main number) 408-720-1700
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086

newton@ggumby.cs.caltech.edu (Mike Newton) (11/02/90)

In <TOM....cx2.ssd.csd.harris.com> tom@ssd.csd.harris.com (Tom Horsley) writes:
>>>>>> Regarding Re: Fastest 88k; mash@mips.COM (John Mashey) adds:
>mash> Now, just out of curiosity:
>mash> 	a) What are the numbers you get using the current production compilers?
>mash> 	b) About how far apart (in time) are those two versions?

I'd like to add:
() How do the results compare to the latest version of Tom Wood/DG's port 
of GCC?  (Compared to earlier versions it should do better, esp. for fp.)

- mike

(newton@gumby.cs.caltech.edu

        ^^^^^ ___ even newer address.... )

--
newton@csvax.cs.caltech.edu   Beach Bums Anonymous, Pasadena President
Caltech 256-80		      (Hilo -- it's not just another rainy day!)
Pasadena CA 91125	      Life's a beach.  Then you graduate.

malc@iconsys.icon.com (Malcolm Weir) (11/03/90)

In article <1172@iceman.jcu.oz> cpca@iceman.jcu.oz (C Adams) writes:
>I have recently heard about an advertisement claiming 60 MIPS for an
>88k based machine (from motorola).  I find this figure way too high,
>at least for a single CPU system.
>

Moto's 60 MIPS figure is allegedly derived from running the tests that
IBM used to "prove" that the RS/6000 is faster than a speeding bullet...
It's attributed to their dual 25MHz 88K MPC/300.

>Colin Adams        You need to have a woman, before you can have a Sun :-)
>Email Address -    cpca@marlin.jcu.edu.au

Malc.