[comp.benchmarks] IBM RS/6000 MFLOPS

mccalpin@perelandra.cms.udel.edu (John D. McCalpin) (11/22/90)
On 19 Nov 90 20:41:35 GMT, andyrose@batcomputer.tn.cornell.edu (Andy Rose) said:

	[... IBM 550 shown running LINPACK at 65 MFLOPS...]

On 20 Nov 90 02:02:49 GMT, I (mccalpin@perelandra.cms.udel.edu) replied:

JDM> Please note that this performance level is for the 1000x1000 "anything
JDM> goes" LINPACK test, not the more commonly quoted 100x100 "you can't
JDM> even modify the comments" version. 

On 21 Nov 90 16:05:08 GMT, grunwald@foobar.colorado.edu (Dirk Grunwald) said:

Dirk> One should also note that the ``anything goes'' doesn't mean that they
Dirk> coded it in assembler or anything, which is what you need to do to get
Dirk> speed out of the Intel i860. In fact, they just used block algorithms
Dirk> to increase cache locality, if I recall.
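
For anyone unfamiliar with the term, here is a minimal sketch of what a
"block algorithm" looks like, using matrix multiply as the example.
It is written in Python only to keep it short; it is not the IBM or
SCRI/FSU code, and the function names and the block size bs are
hypothetical choices for illustration.  The blocked version does the
same arithmetic as the naive one, but works on bs-by-bs tiles so that
each tile can be reused while it is still in cache.

```python
def matmul_naive(a, b):
    """Textbook triple loop: walks a whole column of b for every c[i][j]."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            s = 0.0
            for k in range(m):
                s += a[i][k] * b[k][j]
            c[i][j] = s
    return c


def matmul_blocked(a, b, bs=32):
    """Same product, computed tile by tile (bs is an illustrative block size)."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    # The three outer loops pick a tile of c, a, and b to work on.
    for ii in range(0, n, bs):
        for kk in range(0, m, bs):
            for jj in range(0, p, bs):
                # The three inner loops stay inside that tile; min() handles
                # ragged edges when the sizes are not multiples of bs.
                for i in range(ii, min(ii + bs, n)):
                    a_row, c_row = a[i], c[i]
                    for k in range(kk, min(kk + bs, m)):
                        a_ik, b_row = a_row[k], b[k]
                        for j in range(jj, min(jj + bs, p)):
                            c_row[j] += a_ik * b_row[j]
    return c
```

In Python the blocking will not buy any speed; the point is only the
loop structure, which carries over directly to FORTRAN.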

I should have been more explicit.  Yes, it is absolutely true that
there is no need to go to assembler to get such performance levels on
the IBM RS/6000 machines.  Researchers at the Supercomputer
Computations Research Institute (SCRI) at Florida State University
have attained LINPACK 1000x1000 performance of up to 36 MFLOPS on an
IBM RS/6000 Model 530 by using block-mode algorithms written in
ordinary FORTRAN.  Presumably IBM will release the full LAPACK package
with this sort of performance within the next few (many?) months....
If we assume that the machines' speeds scale linearly with the clock,
then the SCRI/FSU code should run at 59 MFLOPS on the Model 550 --
within 10% of what IBM is claiming....
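
The clock-scaling estimate can be checked with a few lines of
arithmetic.  This is only a sketch: the clock rates (25 MHz for the
Model 530, roughly 41 MHz for the Model 550) are assumed values that
are not stated anywhere in this thread, and the function name is made
up for the example.

```python
def scale_mflops(measured_mflops, measured_clock_mhz, target_clock_mhz):
    """Estimate performance at another clock rate, assuming linear scaling."""
    return measured_mflops * target_clock_mhz / measured_clock_mhz


# Assumed clock rates; neither figure appears in the posts above.
CLOCK_530_MHZ = 25.0
CLOCK_550_MHZ = 41.0

SCRI_530_MFLOPS = 36.0   # SCRI/FSU blocked FORTRAN result on the Model 530
IBM_550_MFLOPS = 65.0    # figure quoted for the Model 550

estimate = scale_mflops(SCRI_530_MFLOPS, CLOCK_530_MHZ, CLOCK_550_MHZ)
shortfall = 1.0 - estimate / IBM_550_MFLOPS

print(f"estimated 550 rate: {estimate:.0f} MFLOPS")
print(f"shortfall vs. IBM figure: {shortfall:.1%}")
```

Under those assumptions the estimate comes out a little under 10%
below the 65 MFLOPS figure.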
--
John D. McCalpin			mccalpin@perelandra.cms.udel.edu
Assistant Professor			mccalpin@brahms.udel.edu
College of Marine Studies, U. Del.	J.MCCALPIN/OMNET