[comp.parallel] LINPACK vs. parallel LINPACK

jet@karazm.math.uh.edu (J Eric Townsend) (06/06/91)

In article <1991Jun3.233741.8570@elroy.jpl.nasa.gov> stevo@elroy.jpl.nasa.gov (Steve Groom) writes:
>Can someone explain how "massively parallel LINPACK" is different from
>regular LINPACK?  What considerations for communication are made

Here's my guess, based on working with these machines, attending Intel's
iPSC/860 class, etc.  I'm not a numerical analyst, I'm a system
geek, so take this with a grain or two of salt.

If "parallel LINPACK" means the problem size is too large to fit
in the memory of a single node, then it carries a lot of meaning.
The problem becomes not just "how fast can I do LINPACK", but "how fast
can I do some part of LINPACK, shuffle some data from processor to
processor, and do some more LINPACK", etc.
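
To make that compute/shuffle/compute pattern concrete, here is a toy
sketch in Python.  Everything in it is an assumption made for
illustration: the function name distributed_lu, the cyclic dealing of
columns to nodes, and the lack of pivoting (real LINPACK uses partial
pivoting).  The "nodes" are simulated in one process and the broadcast
is only counted, not sent, so this says nothing about how any vendor's
parallel LINPACK is actually written.

```python
# Toy simulation: LU factorization with columns dealt out cyclically to
# "nodes". There is no real message passing; broadcasts are only counted.
# Hypothetical helper for illustration, not taken from any LINPACK code.

def distributed_lu(a, num_nodes):
    """Factor square matrix `a` (a list of rows) in place, without pivoting.

    Afterwards the strict lower triangle of `a` holds L (unit diagonal
    implied) and the upper triangle holds U.

    Returns (broadcasts, words_sent): how many times a node had to ship
    multipliers to the others, and how many numbers were shipped in total.
    """
    n = len(a)
    # owner[j] = simulated node that stores column j (cyclic layout)
    owner = [j % num_nodes for j in range(n)]
    broadcasts = 0
    words_sent = 0

    for k in range(n - 1):
        # Step 1 (local compute): the owner of column k forms the multipliers.
        pivot = a[k][k]  # assumed nonzero, since we do not pivot
        multipliers = [a[i][k] / pivot for i in range(k + 1, n)]
        for i in range(k + 1, n):
            a[i][k] = multipliers[i - k - 1]  # store L below the diagonal

        # Step 2 (communication): ship the multipliers to every other node.
        if num_nodes > 1:
            broadcasts += 1
            words_sent += len(multipliers) * (num_nodes - 1)

        # Step 3 (local compute): each node updates only the columns it owns.
        for node in range(num_nodes):
            for j in range(k + 1, n):
                if owner[j] != node:
                    continue
                for i in range(k + 1, n):
                    a[i][j] -= multipliers[i - k - 1] * a[k][j]

    return broadcasts, words_sent


if __name__ == "__main__":
    matrix = [[4.0, 1.0, 2.0],
              [1.0, 5.0, 1.0],
              [2.0, 1.0, 6.0]]
    print(distributed_lu(matrix, num_nodes=2))
```

Steps 1 and 3 are ordinary single-node arithmetic; step 2 is the part
a single-node LINPACK run never pays for.  On a real machine the time
spent there depends on the interconnect and the data layout, which is
presumably why the parallel number is reported separately.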

Intel has a product (that used to be) called "LOOCS", which stands
for "Large Out Of Core Solver".  This package solves not only systems
which are too large for a single node, but systems that are too
large for the *total RAM of the system*.  David Scott claims about
38 MFLOPS per node in double precision, which is pretty amazing
since the theoretical peak is 40 MFLOPS double precision. (Of
course, this speed is achieved by hand-coding i860
assembly.  Ouch.)

In a similar vein, a researcher here at UH doing work on molecular
dynamics has a 16-node iPSC/860 going faster than a Cray-2 using
the stock Fortran compiler.  (Look for a paper coming soon to a
journal near you. :-)

--
J. Eric Townsend - jet@uh.edu - bitnet: jet@UHOU - vox: (713) 749-2126
Skate UNIX! (curb fault: skater dumped)

   --  If you're hacking PowerGloves and Amigas, drop me a line. --