lerici@super.ORG (Peter W. Brewer) (06/08/90)
> ===QCDPAX attained 12.25 GFLOPS peak speed===
>
> Parallel Computer QCDPAX has reached the world-fastest(probably)
> effective speed in scientific calculations. If any computer can
> exceed the speed of QCDPAX, please let us know.
>
> ...
> consuming part, 3 by 3 unitary matrix product, QCDPAX with 432 PU's
> pp.2-9). Single link update time for the subspace heat bath method
> recorded the speed nearly 4 times as fast as that of CM-2, (CM-2's

I have heard GAM's rumblings in the background somewhere here. Eugene,
why don't we hear from him more .. :-) even if we risk sexual
harassment suits :-)

First, I don't think this group should degrade into a HE_MAN bragging
session in MACHO FLOPS as above.

Second, the CM-2 has a peak rate of 20 GFlops; 16 GFlops has been
achieved in some applications. The individual mentioned below may not
have used the latest compiler technology, may not have used the sprint
node model, and may not have done everything which could be done. You
can send mail to rlk@think.com; he will happily send you the results of
his fast FFT benchmark, and he could point you to others in his
organization, such as Guy Steele and Danny Hillis, and consultants like
Wolfram and Feynman. Right now the CM-2 is the only commercially
available supercomputer I know of that can deliver 16 GFlops sustained.

> If any computer can exceed this speed, please let us know.
> We would like to know if our machine is really the world-fastest
> or not.

You've been informed.

Now getting back to I/O. I agree with GAM: I have found that big memory
and wide I/O paths make a big difference, and this is where Crays
really shine. One reason why the CM-2 achieves what it does is its
effective memory bandwidth, best seen with embarrassingly parallel
problems. Its big weakness, however, is I/O. This is probably due to
the fact that it was built with late-70s/early-80s technology, with an
unfortunately strong early slant toward the slow VAX architecture.
Which is why, about 2 years ago, when I was having problems getting
Suns to front-end it, I thought that a Cray, and specifically a Cray-2
with its great I/O characteristics, would make a great front end. The
CMIOC runs at 3-4 MB/sec peak from a Sun4.. what happens if you could
raise that to 100 MB/sec? That is the major bottleneck in the CM. Data
can be transferred now at 10 MB/sec using VMEIO. But the instruction
queue is so short and slow due to living with a VAX for so long.

A Cray-2 with a CM-2 backend array processor.. or multiple CM-2s
front-ended by the Foreground processor could improve the Cray-2
performance to >50 GFLOPS sustained for many embarrassingly parallel
problems. The Cray I/O channels are great! They just need something
useful to feed besides DD49s. I heard that Los Alamos may be trying to
do this. I wonder what Eugene's colleague Creon Leavitt is up to with
his CM.

-- Petet
--
Peter Brewer
lerici@super.org
THE Supercomputing Research Center