[comp.arch] i860 benchmarks?

cca@newton.physics.purdue.edu (Charles C. Allen) (05/03/90)

Where can I get ahold of some benchmarks for the i860 processor?  Any
system configuration is OK (please mention it), but I'm looking at a
brochure for the "Mercury MC860" in particular.  They claim 80Mflops
for a single processor at 40Mhz.  What kind of MFlops are those?

Charles Allen			Internet: cca@newton.physics.purdue.edu
Department of Physics		HEPnet:   purdnu::allen, fnal::cca
Purdue University		talknet:  317/494-9776
West Lafayette, IN  47907

pjg@acsu.Buffalo.EDU (Paul Graham) (05/07/90)

cca@newton.physics.purdue.edu (Charles C. Allen) writes:

|Where can I get ahold of some benchmarks for the i860 processor?  Any
|system configuration is OK (please mention it), but I'm looking at a
|brochure for the "Mercury MC860" in particular.  They claim 80Mflops
|for a single processor at 40Mhz.  What kind of MFlops are those?

as i understand it they simply replay the numbers they got from intel.
i believe these have some basis in linpack.

however (if it matters) despite announcements to the contrary mercury
does *not* have a fortran compiler and the people who return phone
calls don't seem to know much about it and the people who know about
it don't return calls.  i decided to buy something else.

preston@titan.rice.edu (Preston Briggs) (05/07/90)

In article <24706@eerie.acsu.Buffalo.EDU> pjg@acsu.Buffalo.EDU (Paul Graham) writes:
>cca@newton.physics.purdue.edu (Charles C. Allen) writes:
>|  They claim 80Mflops
>|for a single processor at 40Mhz.  What kind of MFlops are those?
>
>as i understand it they simply replay the numbers they got from intel.
>i believe these have some basis in linpack.

The 80 MFlop at 40 MHz number is the peak performance.
That is, it's the number you will never surpass.
Actual performance will be a lot lower, depending on your
application and compiler.

The best I've achieved is 48+ MFlops at 33 MHz,
for a 400x400 single-precision matrix multiply.
This scales to about 64 MFlops if your memory will
keep up.  Note also that this was seriously restructured
code with the inner loop rewritten in assembly.
Note also that it's a factor of 21 faster than 
you'll get with a straightforward optimizing compiler.
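
To put those figures side by side, here is a rough back-of-the-envelope
sketch. It assumes "peak" means 2 floating-point results per clock (which
is just 80 MFlops divided by 40 MHz) and that an n x n matrix multiply
costs 2*n^3 operations. The function names are made up for illustration;
only the 48, 33, 40, 80, 400 and 21 come from the numbers above.

```python
# Back-of-the-envelope check of the figures quoted in this thread.
# Assumption: peak = 2 floating-point results per clock, which is simply
# 80 MFlops / 40 MHz; the thread itself gives only the totals.

FLOPS_PER_CLOCK = 2  # assumed, derived from 80 MFlops at 40 MHz


def peak_mflops(clock_mhz):
    """Peak rate in MFlops for a given clock in MHz (assumed model)."""
    return FLOPS_PER_CLOCK * clock_mhz


def matmul_flops(n):
    """Operation count for an n x n by n x n product: n^3 multiplies + n^3 adds."""
    return 2 * n ** 3


measured = 48.0   # MFlops at 33 MHz, hand-coded inner loop (from the post)
speedup = 21.0    # hand-coded vs. straightforward compiler (from the post)

fraction_of_peak = measured / peak_mflops(33)
compiled = measured / speedup
seconds_hand = matmul_flops(400) / (measured * 1e6)
seconds_compiled = matmul_flops(400) / (compiled * 1e6)

print(f"peak at 40 MHz:        {peak_mflops(40)} MFlops")
print(f"peak at 33 MHz:        {peak_mflops(33)} MFlops")
print(f"hand-coded / peak:     {fraction_of_peak:.2f}")
print(f"implied compiled rate: {compiled:.1f} MFlops")
print(f"400x400 multiply:      {seconds_hand:.1f} s hand-coded, "
      f"{seconds_compiled:.0f} s compiled")
```

Under those assumptions the hand-coded loop runs at roughly 73% of the
33 MHz peak, and the factor of 21 implies a little over 2 MFlops from
compiled code.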

--
Preston Briggs				looking for the great leap forward
preston@titan.rice.edu

mccoy@pixar.UUCP (Daniel McCoy) (05/08/90)

In article <3632@newton.physics.purdue.edu> cca@newton.physics.purdue.edu (Charles C. Allen) writes:
>Where can I get ahold of some benchmarks for the i860 processor?  Any
>system configuration is OK (please mention it), but I'm looking at a
>brochure for the "Mercury MC860" in particular.  They claim 80Mflops
>for a single processor at 40Mhz.  What kind of MFlops are those?

Peak hand-coded pipelined assembly language.  Guaranteed not to exceed. 

In my experience porting a heavy scalar floating-point
renderer (100,000+ lines of C), using a beta-release Green Hills compiler
from last September, with -O turned on only in the most critical areas
(the code wouldn't run otherwise, I did say beta), a 33 MHz i860 came in
with times faster than an R2000 (DECstation 3100) but slower than a
25 MHz R3000 (SGI-4D/2xx) (MIPS 1.31 compiler with -O2 everywhere).

Intel has said that they will release SPECmarks soon,
probably around the time that a supported, released compiler is available
(over a year after the announcement of the i860).
I think the SPECmarks will inject a much-needed dose of reality
into Intel's marketing hype.  Don't get me wrong, the i860 is a pretty
fast chip so the reality isn't that bad.  I'm sure a 40 MHz i860 with
a well-designed memory system will be competitive.  But that 80 MFlop
number is only meaningful for applications that have really tight main
loops that can be hand-coded, and that happen to fit the i860 pipeline
model well.

The tools have a way to go to catch up with MIPS. 
The chip above might actually be faster than the R3000,
but it doesn't matter to me until a compiler can push it faster.
Touting this as a RISC chip (whatever that means) and then
saying you should hand code your loops and use a machine level
debugger seems a little strange to me.

Dan McCoy	{ucbvax,sun}!pixar!mccoy

(Standard disclaimers apply. Personal opinion only.)

pjg@acsu.Buffalo.EDU (Paul Graham) (05/08/90)

preston@titan.rice.edu (Preston Briggs) writes:

|pjg@acsu.Buffalo.EDU (Paul Graham) writes:
|>cca@newton.physics.purdue.edu (Charles C. Allen) writes:
|>|  They claim 80Mflops
|>|for a single processor at 40Mhz.  What kind of MFlops are those?
|>
|>as i understand it they simply replay the numbers they got from intel.
|>i believe these have some basis in linpack.

|The 80 MFlop at 40 MHz number is the peak performance.

gahhhh, indeed the linpack value i heard tossed about was 10-13 for what i
assume was the 100x100 matrix.  80 is guaranteed . . . not to be exceeded.

sorry about any confusion.
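
for scale, a rough sketch of what 10-13 means against the 80 peak.  it
assumes the usual linpack operation count of 2/3*n^3 + 2*n^2 for an
n x n solve, and it assumes the 10-13 figure was for a 40 MHz part (the
clock isn't stated above).  the function names are made up for
illustration.

```python
# Rough comparison of the quoted LINPACK figure with the quoted peak.
# Assumptions: nominal LINPACK operation count 2/3*n^3 + 2*n^2, and that
# the 10-13 MFlops figure applies to the same 40 MHz part as the 80 MFlops peak.

PEAK_MFLOPS = 80.0  # quoted peak at 40 MHz


def linpack_flops(n):
    """Nominal operation count for solving an n x n linear system."""
    return 2 * n ** 3 / 3 + 2 * n ** 2


def fraction_of_peak(rate_mflops, peak_mflops=PEAK_MFLOPS):
    """Share of the peak rate that a measured rate represents."""
    return rate_mflops / peak_mflops


def solve_time_ms(n, rate_mflops):
    """Time in milliseconds for one n x n solve at the given rate."""
    return linpack_flops(n) / (rate_mflops * 1e6) * 1e3


for rate in (10.0, 13.0):
    print(f"{rate:4.0f} MFlops: {fraction_of_peak(rate):.1%} of peak, "
          f"{solve_time_ms(100, rate):.1f} ms per 100x100 solve")
```

under those assumptions the linpack figure is somewhere between an
eighth and a sixth of the peak number.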