mccalpin@perelandra.cms.udel.edu (John D. McCalpin) (11/22/90)
On 19 Nov 90 20:41:35 GMT, andyrose@batcomputer.tn.cornell.edu (Andy Rose) said:

[... IBM 550 shown running LINPACK at 65 MFLOPS ...]

On 20 Nov 90 02:02:49 GMT, mccalpin@perelandra.cms.udel.edu, I replied:

JDM> Please note that this performance level is for the 1000x1000 "anything
JDM> goes" LINPACK test, not the more commonly quoted 100x100 "you can't
JDM> even modify the comments" version.

On 21 Nov 90 16:05:08 GMT, grunwald@foobar.colorado.edu (Dirk Grunwald) said:

Dirk> One should also note that the ``anything goes'' doesn't mean that they
Dirk> coded it in assembler or anything, which is what you need to do to get
Dirk> speed out of the Intel i860. In fact, they just used block algorithms
Dirk> to increase cache locality, if I recall.

I should have been more explicit. Yes, it is absolutely true that there
is no need to go to assembler to get such performance levels on the IBM
RS/6000 machines.

Researchers at the Supercomputer Computations Research Institute at
Florida State University have attained LINPACK 1000x1000 performance of
up to 36 MFLOPS on an IBM RS/6000 Model 530 by using block-mode
algorithms written in ordinary FORTRAN. Presumably IBM will release the
full LAPACK package with these sorts of performance levels within the
next few (many?) months....

If we assume that the machines' speeds scale linearly with the clock,
then the SCRI/FSU code should run at 59 MFLOPS on the Model 550, which is
within 10% of the 65 MFLOPS that IBM is claiming....
--
John D. McCalpin                        mccalpin@perelandra.cms.udel.edu
Assistant Professor                     mccalpin@brahms.udel.edu
College of Marine Studies, U. Del.      J.MCCALPIN/OMNET
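
For readers unfamiliar with the "block algorithms" mentioned in the thread, here is a minimal sketch of the general idea, applied to matrix multiplication rather than to the LU factorization that LINPACK actually times. It is not the SCRI/FSU code or anything from IBM; the function names and the `block` parameter are hypothetical, and Python is used only to show the loop structure (the cache benefit itself shows up in compiled languages such as FORTRAN or C). The point is that the loops are tiled so each small sub-block of the operands is reused many times while it is presumably still in cache.

```python
def matmul_naive(a, b):
    """Reference triple loop: C = A * B for lists of lists."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            s = 0.0
            for k in range(m):
                s += a[i][k] * b[k][j]
            c[i][j] = s
    return c


def matmul_blocked(a, b, block=32):
    """Tiled version: same arithmetic, reordered to work on block x block tiles.

    `block` is an illustrative tuning parameter; a real code would pick it
    so that the tiles in use fit in the machine's cache.
    """
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    # Outer three loops walk over tiles of the iteration space.
    for i0 in range(0, n, block):
        for k0 in range(0, m, block):
            for j0 in range(0, p, block):
                # Inner three loops do the work inside one tile.
                for i in range(i0, min(i0 + block, n)):
                    row_a = a[i]
                    row_c = c[i]
                    for k in range(k0, min(k0 + block, m)):
                        aik = row_a[k]
                        row_b = b[k]
                        for j in range(j0, min(j0 + block, p)):
                            row_c[j] += aik * row_b[j]
    return c
```

Both functions compute the same product; only the order in which the elements are visited differs, which is why this kind of restructuring can be done in ordinary high-level code without resorting to assembler.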