[comp.sys.super] Records and I/O

lerici@super.ORG (Peter W. Brewer) (06/08/90)
>        ===QCDPAX attained 12.25 GFLOPS peak speed===
> 
> Parallel Computer QCDPAX has reached the world-fastest(probably)
> effective speed in scientific calculations.   If any computer can
> exceed the speed of QCDPAX, please let us know.
>
> ...
> consuming part, 3 by 3 unitary matrix product, QCDPAX with 432 PU's
> pp.2-9).  Single link update time for the subspace heat bath method
> recorded the speed nearly 4 times as fast as that of CM-2, (CM-2's

I have heard GAM's rumblings in the background somewhere here. Eugene, why
don't we hear from him more .. :-) even if we risk sexual harassment
suits :-)

First, I don't think this group should degrade into a HE_MAN bragging
session in MACHO FLOPS as above. Second, the CM-2 has a peak rate of
20GFlops, and 16 has been achieved in some applications. The individual mentioned
below may not have used the latest compiler technology, may not have used
the sprint node model, and may not have done everything which could be
done. You can send mail to rlk@think.com; he will happily send you the results
of his fast FFT benchmark, and he could point you to others in his organization,
such as Guy Steele and Danny Hillis, and consultants like Wolfram and Feynman.
Right now the CM-2 is the only commercially available supercomputer that I know
of which can deliver 16GFlops sustained.

> If any computer can exceed this speed, please let us know.
> We would like to know if our machine is really the world-fastest
> or not.

You've been informed.

Now getting back to I/O. I agree with GAM: I have found that big memory and
wide I/O paths make a big difference, and this is where Crays really shine.
One reason why the CM-2 achieves what it does is its effective memory
bandwidth, best seen with embarrassingly parallel problems. Its big weakness,
however, is I/O. This is probably because it was built with late-70s,
early-80s technology, with an unfortunate strong early slant toward the slow
VAX architecture. Which is why, about 2 years ago, when I was having problems
getting Suns to front-end it, I thought that a Cray, and specifically a Cray-2
with its great I/O characteristics, would make a great front end. The CMIOC
runs at 3-4MB/sec peak from a Sun4. What happens if you could raise that to
100MB/sec? That is the major bottleneck in the CM. Data can be transferred now
at 10MB/sec using VMEIO, but the instruction queue is so short and slow due
to living with a VAX for so long. A Cray-2 with a CM-2 back-end array processor,
or multiple CM-2s front-ended by the Foreground processor, could improve the Cray-2
performance to >50GFLOPS sustained for many embarrassingly parallel problems.
The Cray I/O channels are great! They just need something useful to feed besides
DD49s. I heard that Los Alamos may be trying to do this. I wonder what Eugene's
colleague Creon Leavitt is up to with his CM.
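
To put rough numbers on that bottleneck, here is a minimal back-of-envelope
sketch in Python. The three rates are the ones quoted above (with the CMIOC
taken at the top of its 3-4MB/sec range); the 512MB data set size and the
names in the code are hypothetical, picked only for illustration, and the
arithmetic ignores latency and setup costs.

```python
# Link rates quoted in the post, in MB/sec. The CMIOC figure uses the
# upper end of the quoted 3-4MB/sec range.
RATES_MB_PER_SEC = {
    "CMIOC from a Sun4 (peak)": 4.0,
    "VMEIO": 10.0,
    "hoped-for 100MB/sec path": 100.0,
}

# Hypothetical data set size in MB; not a figure from the post.
DATASET_MB = 512.0


def transfer_seconds(size_mb, rate_mb_per_sec):
    """Seconds to move size_mb megabytes at a steady rate, ignoring latency."""
    return size_mb / rate_mb_per_sec


if __name__ == "__main__":
    for name, rate in RATES_MB_PER_SEC.items():
        seconds = transfer_seconds(DATASET_MB, rate)
        print(f"{name:26s} {seconds:7.1f} s")
```

Under these assumptions, the same load that takes a couple of minutes over the
CMIOC would take a few seconds at 100MB/sec, which is the gap being argued
about.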

-- Peter
-- 
Peter Brewer             ||||     |||||  |||||||||  ||||||  //|||||\  ||||||
lerici@super.org	 ||       ||__   ||     ||    ||   ||           ||
THE Supercomputing       ||       ||     ||^^^^^^\\   ||   ||           ||
Research Center ~~~      |||||||| |||||  ||       || |||||  \\|||||/  ||||||