[comp.arch] massive linpack

jbs@WATSON.IBM.COM (06/07/91)

         Richard Lethin asks:
At what size N for massive linpack does the accumulated numerical fuzz
swamp the precision available in a double precision number?
         Very large.  Typical relative error (in the vector norm sense)
in the solution is n*c*e, where n is matrix size, c is condition number
and e is machine epsilon (2**-52 for IEEE double).  Therefore if you
want a solution accurate to 1 part in 1000 (10 bits) and your matrix
has condition number less than 1000 (10 bits) your n can be 2**30 or
10**9.  This would require 10**18 storage and 10**27 ops to solve so
is well beyond current machines.  Note once you have any bits in your
answer you can obtain more by iterative improvement, an order N**2 process.
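          As a rough check on the n*c*e estimate, here is a minimal
numpy sketch (purely illustrative: the sizes and the random test
matrices are arbitrary choices, not anything from the linpack
benchmark, and the observed error usually sits well below the bound):

import numpy as np

eps = np.finfo(np.float64).eps            # 2**-52 for IEEE double
for n in (100, 500, 1000):
    rng = np.random.default_rng(0)
    A = rng.standard_normal((n, n))       # arbitrary test matrix
    x_true = np.ones(n)
    b = A @ x_true
    x = np.linalg.solve(A, b)             # LAPACK GEPP underneath
    rel_err = np.linalg.norm(x - x_true) / np.linalg.norm(x_true)
    bound = n * np.linalg.cond(A) * eps   # the n*c*e estimate
    print(n, rel_err, bound)
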
         Richard Lethin asks:
Aren't really large problems of interest sparse matrices anyway?  Who
holds the record for massive sparse matrix solves?
         People should try to avoid solving problems in ways which
involve large dense matrix solves simply because they are so expensive.
I suspect in many cases where large dense matrices are solved, a
different method would be faster, but I have no real evidence for this.
         It is unreasonable to ask for record sparse solves since the
difficulty is strongly dependent on the structure of the problem.
                       James B. Shearer

chased@rbbb.Eng.Sun.COM (David Chase) (06/08/91)

jbs@WATSON.IBM.COM writes:
>
>         Richard Lethin asks:
>At what size N for massive linpack does the accumulated numerical fuzz
>swamp the precision available in a double precision number?

>         Very large.  Typical relative error (in the vector norm sense)
>in the solution is n*c*e, where n is matrix size, c is condition number
>and e is machine epsilon (2**-52 for IEEE double).  Therefore if you
>want a solution accurate to 1 part in 1000 (10 bits) and your matrix
>has condition number less than 1000 (10 bits) your n can be 2**30 or
>10**9.  This would require 10**18 storage and 10**27 ops to solve so
>is well beyond current machines.  Note once you have any bits in your
>answer you can obtain more by iterative improvement, an order N**2 process.

I don't have the time to prove that you're wrong here, but I think
that you are either wrong, or else being very misleading.  In general,
why should we believe you?  What do you usually do?  Goldberg and
Hough are experts, as are the people that I took numerical analysis
from too many years ago.  Are you an expert?  (I only know enough to
know when I should be suspicious, but I'm suspicious.)

First, you haven't said anything about worst case, which must be
worried about.  In addition, you haven't discussed a typical condition
number.

Second, I know that the NA people I once worked with were interested
in using exact arithmetic for error analysis. (There's a paper by
Boehm, O'Donnell, and Riggle in the 1986 Lisp & FP conference -- when
I say "exact", I mean an answer guaranteed to be within epsilon of the
true answer).  This means that double precision isn't good enough for
very accurate error analysis, though I don't know how they defined
"very accurate".  I suspect that the worst case bounds on error
estimation were not satisfactory.

Third, "LINPACK" is not a numerical analysis algorithm.  There is a
wide variety of algorithms in LINPACK, with varying stability and
cost.  If you don't specify the algorithm, how are we to know what you
are talking about?  Is it GE, GEPP, GEFP, QR, SVD, or Cholesky?  Band
matrix, or not?  The worst case on GEPP (which is what I think you
meant) is still pretty bad, though I've also heard (from an expert)
that it is almost never seen.  If (as I think you are saying) GEPP is
so good, then why do we bother with more stable algorithms like QR and
SVD?

David Chase
Sun

jbs@WATSON.IBM.COM (06/08/91)

          Richard Lethin asked:
>At what size N for massive linpack does the accumulated numerical fuzz
>swamp the precision available in a double precision number?
          I answered:
>         Very large.  Typical relative error (in the vector norm sense)
>in the solution is n*c*e, where n is matrix size, c is condition number
>and e is machine epsilon (2**-52 for IEEE double).  Therefore if you
>want a solution accurate to 1 part in 1000 (10 bits) and your matrix
>has condition number less than 1000 (10 bits) your n can be 2**30 or
>10**9.  This would require 10**18 storage and 10**27 ops to solve so
>is well beyond current machines.  Note once you have any bits in your
>answer you can obtain more by iterative improvement, an order N**2 process.
          David Chase objects:
I don't have the time to prove that you're wrong here, but I think
that you are either wrong, or else being very misleading.  In general,
why should we believe you?  What do you usually do?  Goldberg and
Hough are experts, as are the people that I took numerical analysis
from too many years ago.  Are you an expert?  (I only know enough to
know when I should be suspicious, but I'm suspicious.)
First, you haven't said anything about worst case, which must be
worried about.  In addition, you haven't discussed a typical condition
number.
Second, I know that the NA people I once worked with were interested
in using exact arithmetic for error analysis. (There's a paper by
Boehm, O'Donnell, and Riggle in the 1986 Lisp & FP conference -- when
I say "exact", I mean an answer guaranteed to be within epsilon of the
true answer).  This means that double precision isn't good enough for
very accurate error analysis, though I don't know how they defined
"very accurate".  I suspect that the worst case bounds on error
estimation were not satisfactory.
Third, "LINPACK" is not a numerical analysis algorithm.  There is a
wide variety of algorithms in LINPACK, with varying stability and
cost.  If you don't specify the algorithm, how are we to know what you
are talking about?  Is it GE, GEPP, GEFP, QR, SVD, or Cholesky?  Band
matrix, or not?  The worst case on GEPP (which is what I think you
meant) is still pretty bad, though I've also heard (from an expert)
that it is almost never seen.  If (as I think you are saying) GEPP is
so good, then why do we bother with more stable algorithms like QR and
SVD?

          To clarify I was referring to the numerical properties of
Gaussian elimination with partial pivoting (GEPP) when solving a large
dense system of linear equations.  This is what the TPP (towards peak
performance) linpack benchmark does (on a system of size 1000).  I
presume the massive linpack benchmark is the same thing with larger
matrices.  Someone queried whether the quoted results were for problems
which required pivoting, which I agree is a good question.
          Anyone desiring further authority for my statements about
GEPP should be able to find it in any good mathematics library.  One
reference is the book "Numerical Methods and Software" (David Kahaner,
Cleve Moler and Stephen Nash, Prentice Hall, 1989).  The accuracy of
GEPP is discussed in chapter 3 (esp. p.66-p.67).  It is well known
there are certain pathological matrices for which GEPP does not perform
well numerically. These matrices have the property that in the process
of computing the factorization certain intermediate terms are formed
which are much larger than the elements in the original matrix.  These
terms spoil the error analysis.  If this does not happen then you are
certain to be ok.  This problem is almost never seen in practice, hence
my statement about typical behavior.
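          The classic pathological example (usually attributed to
Wilkinson) has 1 on the diagonal, -1 below it and 1 in the last column;
during elimination the entries of the last column double at every step,
so the element growth is 2**(n-1).  A minimal sketch, assuming
numpy/scipy and an arbitrarily chosen size:

import numpy as np
import scipy.linalg as la

n = 60
A = np.eye(n) - np.tril(np.ones((n, n)), -1)   # 1 on diagonal, -1 below
A[:, -1] = 1.0                                 # 1 in the last column

p, l, u = la.lu(A)                             # LU with partial pivoting
growth = np.abs(u).max() / np.abs(A).max()
print(growth)                                  # about 2**(n-1) = 2**59
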
          If you are using an appropriate numerical method the condition
number generally will be determined by the stability of the physical
system you are modeling and not by the number of points in your
discretization.  If your problem uses real inexact data the condition
number had better not be too large or your answer is going to be
garbage regardless of how accurately you compute it.
          QR and SVD are used for solving least squares problems.  Least
squares problems can be formulated as linear equations (the normal
equations) but doing so unnecessarily increases the condition number.
This is not a fault with GEPP but with the formulation.  (This is
discussed on p. 202-203 of the above-referenced book.)
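          To see the effect, here is a small numpy sketch (the
ill-conditioned test matrix is an arbitrary illustration, and numpy's
lstsq is SVD-based rather than QR, but the point is the same: the
condition number gets squared, costing roughly half the available
digits):

import numpy as np

rng = np.random.default_rng(1)
m, n = 200, 50
U, _ = np.linalg.qr(rng.standard_normal((m, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
s = np.logspace(0, -7, n)                      # singular values 1 .. 1e-7
A = U @ np.diag(s) @ V.T                       # cond(A) about 1e7

print(np.linalg.cond(A), np.linalg.cond(A.T @ A))   # about 1e7 vs 1e14

x_true = np.ones(n)
b = A @ x_true
x_ne = np.linalg.solve(A.T @ A, A.T @ b)       # normal equations
x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)   # SVD-based least squares
print(np.linalg.norm(x_ne - x_true), np.linalg.norm(x_ls - x_true))
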
          As to credibility: I prefer to judge participants in the
newsgroups by the quality of their posts rather than by their
"credentials" (or lack thereof).  However, for those readers who care
about credentials I do have a few.  I have a PhD in mathematics from
MIT.  Numerical analysis is not my field of specialization but I have
been exposed to it over the years, I have access to a mathematics
library, and I hopefully have enough mathematical sophistication to
avoid gross misstatements.  As to practical experience, I have
contributed to IBM's VS Fortran (version 2) intrinsic function library
and to IBM's Engineering and Scientific Subroutine Library (ESSL).
                         James B. Shearer

csrdh@marlin.jcu.edu.au (Rowan Hughes) (06/08/91)

In <9106070135.AA02947@ucbvax.Berkeley.EDU> jbs@WATSON.IBM.COM writes:


>         Richard Lethin asks:
>At what size N for massive linpack does the accumulated numerical fuzz
>swamp the precision available in a double precision number?
>         Very large.  Typical relative error (in the vector norm sense)
>in the solution is n*c*e, where n is matrix size, c is condition number
>and e is machine epsilon (2**-52 for IEEE double).  Therefore if you
>want a solution accurate to 1 part in 1000 (10 bits) and your matrix
>has condition number less than 1000 (10 bits) your n can be 2**30 or

I would have thought  (N**2)*C*E was more appropriate.

>         Richard Lethin asks:
>Aren't really large problems of interest sparse matrices anyway?  Who
>holds the record for massive sparse matrix solves?

Yes, large problems (N > 10,000) are typically sparse, and with strong
diagonal banding. These can only be solved with iterative methods, and
only on vector, or large parallel machines. Iterative methods usually
have little sensitivity to round-off errors since the original matrix is
used continuously in the solution. Successive Over-Relaxation (SOR) was
the most commonly used method until a few years ago. It had some very
severe restrictions on the types of matrix problems it could solve,
notably the diagonal dominance condition. Biconjugate gradient methods,
using Chebychev orthogonalization, are now the rage. They're iterative,
but round-off errors can eventually destroy the solution. The largest
problem I've seen solved is for N=100,000. Bico doesn't have such a
severe restriction regarding diagonal dominance. Bico is well suited to
vector architectures; that's what it was designed for. It can also work
quite well on parallel machines.
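
A minimal sketch of the iterative approach on a sparse banded system,
using scipy's bicgstab (a later stabilized relative of the biconjugate
gradient methods above; the tridiagonal test matrix is an arbitrary
stand-in, not a Navier-Stokes matrix):

import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import bicgstab

n = 100_000
main = 4.0 * np.ones(n)
off = -1.0 * np.ones(n - 1)
A = sp.diags([off, main, off], offsets=[-1, 0, 1], format="csr")
b = np.ones(n)

x, info = bicgstab(A, b)                  # info == 0 means it converged
print(info, np.linalg.norm(A @ x - b))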

-- 
Rowan Hughes                                James Cook University
Marine Modelling Unit                       Townsville, Australia.
Dept. Civil and Systems Engineering         csrdh@marlin.jcu.edu.au

dik@cwi.nl (Dik T. Winter) (06/09/91)

(It would simplify matters a bit if you clearly distinguish what you quote and
what you write yourself.  On occasion your answer comes immediately after a
quoted paragraph, but you insert a blank line when quoting multiple paragraphs.
Is that correct?)

In article <9106080254.AA12624@ucbvax.Berkeley.EDU> jbs@WATSON.IBM.COM writes:
 >                   I suspect that the worst case bounds on error
 > estimation were not satisfactory.
They are reasonable if you do backward error analysis, they tend to be very
bad when doing forward error analysis.

 >                                  If this does not happen then you are
 > certain to be ok. This problem is almost never seen in practice.  There-
 > fore my statement about typical behavior.
This is again considering backward error analysis.  Yes, most numerical
algebra literature is concerned with the backward analysis.

 >           QR and SVD are used for solving least squares problems.
SVD is a very good tool to solve a set of linear equations if:
1.  the condition number is large.
2.  the matrix is too large to be stored in memory.
There may be more reasons.
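
A minimal sketch of the SVD route on an ill-conditioned (here exactly
singular) square system, assuming numpy; the matrix and the truncation
tolerance are arbitrary illustrations, and what comes back is the
minimum-norm least-squares solution:

import numpy as np

rng = np.random.default_rng(2)
n = 100
A = rng.standard_normal((n, n))
A[:, -1] = A[:, 0]                        # make A exactly singular
b = rng.standard_normal(n)

U, s, Vt = np.linalg.svd(A)
tol = n * np.finfo(float).eps * s[0]
keep = s > tol                            # drop negligible singular values
x = Vt.T[:, keep] @ ((U.T[keep] @ b) / s[keep])
print(int(np.sum(keep)), "of", n, "singular values kept")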

 >                                                                    Least
 > squares problems can be formulated as linear equations (the normal equa-
 > tions) but doing so unnecessarily increases the condition number.
True, the condition number is squared.

 >           As to credibility:
If you want my credibility:

I have been working in numerical algebra since 1968.
Linear equations, eigenvalues, singular value decomposition etc.
--
dik t. winter, cwi, amsterdam, nederland
dik@cwi.nl

carroll@ssc-vax (Jeff Carroll) (06/14/91)

In article <1991Jun8.055711.13457@marlin.jcu.edu.au> csrdh@marlin.jcu.edu.au (Rowan Hughes) writes:
>
>I would have thought  (N**2)*C*E was more appropriate.
>
>>         Richard Lethin asks:
>>Aren't really large problems of interest sparse matrices anyway?  Who
>>holds the record for massive sparse matrix solves?
>
>Yes, large problems (N > 10,000) are typically sparse, and with strong
>diagonal banding. These can only be solved with iterative methods, and
>only on vector, or large parallel machines. Iterative methods usually

There is at least one fallacy here. I have worked on at least one class
of problem involving boundary integral methods which is both large by
this definition and *dense*. We typically solve these by LU decomposition.

I submit that if we can solve large dense problems by direct methods, we
can certainly solve sparse problems of the same size by the same methods.
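
A minimal sketch of the direct route, assuming scipy and a random
placeholder matrix (not a real boundary-integral matrix): factor once
with LU and partial pivoting, then reuse the factorization for each
right-hand side.

import numpy as np
from scipy.linalg import lu_factor, lu_solve

n = 2000
rng = np.random.default_rng(3)
A = rng.standard_normal((n, n))          # placeholder dense matrix

lu, piv = lu_factor(A)                   # O(n**3), done once
for k in range(3):                       # e.g. several right-hand sides
    b = rng.standard_normal(n)
    x = lu_solve((lu, piv), b)           # O(n**2) per right-hand side
    print(k, np.linalg.norm(A @ x - b))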

One (here nameless) vendor of vector processors put himself at a
considerable disadvantage in a procurement here once by assuming that a
large problem had to be sparse, rather than reading the RFP.



-- 
Jeff Carroll		carroll@ssc-vax.boeing.com

"...and of their daughters it is written, 'Cursed be he who lies with 
any manner of animal.'" - Talmud

csrdh@marlin.jcu.edu.au (Rowan Hughes) (06/14/91)

In <4112@ssc-bee.ssc-vax.UUCP> carroll@ssc-vax (Jeff Carroll) writes:
>>  csrdh@marlin.jcu.edu.au (Rowan Hughes) writes:
>>Yes, large problems (N > 10,000) are typically sparse, and with strong
>>diagonal banding. These can only be solved with iterative methods, and
>>only on vector, or large parallel machines. Iterative methods usually
>There is at least one fallacy here. I have worked on at least one class
>of problem involving boundary integral methods which is both large by
>this definition and *dense*. We typically solve these by LU decomposition.
>I submit that if we can solve large dense problems by direct methods, we
>can certainly solve sparse problems of the same size by the same methods.

This is now only slightly related to architecture, but I'll give my 2c
worth.
If Jeff has to solve large (N>5000) dense problems, then he has a real
problem. In my area (fluid dynamics) the bulk of matrices, originating from
the Navier-Stokes eqns, are large (n=100,000 is common) and very sparse,
typically 25 non-zero elements per row. Unfortunately they often have
zeros on the diagonal (psi-zeta form of NS). These types of problems are
best solved by methods specifically designed for sparse banded matrices.
Direct methods (LU, Householder, etc.) will work quite well, but they
will never have the capabilities of routines like BICO. I'd be so bold
as to suggest that Jeff's type of problem is more the exception than
the rule. The largest problems capable of actually being solved will
always be sparse.
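
For scale, a back-of-the-envelope storage comparison for the n=100,000
case (8-byte reals and 4-byte column indices assumed):

n, nnz_per_row = 100_000, 25
dense_bytes  = n * n * 8                  # 8e10 bytes, about 80 GB
sparse_bytes = n * nnz_per_row * (8 + 4)  # values + indices, about 30 MB
print(dense_bytes, sparse_bytes)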

Philosophical note:
BICO is one of the most elegant solution methods I've ever seen; it's
far more powerful than SOR, and fits perfectly into vector and parallel
architectures with very little code change.

-- 
Rowan Hughes                                James Cook University
Marine Modelling Unit                       Townsville, Australia.
Dept. Civil and Systems Engineering         csrdh@marlin.jcu.edu.au

mcdonald@aries.scs.uiuc.edu (Doug McDonald) (06/14/91)

In article <4112@ssc-bee.ssc-vax.UUCP> carroll@ssc-vax (Jeff Carroll) writes:
>In article <1991Jun8.055711.13457@marlin.jcu.edu.au> csrdh@marlin.jcu.edu.au (Rowan Hughes) writes:
>>
>>I would have thought  (N**2)*C*E was more appropriate.
>>
>>>         Richard Lethin asks:
>>>Aren't really large problems of interest sparse matrices anyway?  Who
>>>holds the record for massive sparse matrix solves?
>>
>>Yes, large problems (N > 10,000) are typically sparse, and with strong
>>diagonal banding. These can only be solved with iterative methods, and
>>only on vector, or large parallel machines. 
>

Eigenvalue matrix problems in computational chemistry are often sparse,
of size 10000x10000 to 10 million by 10 million, not diagonally banded, and
are frequently done on NON-vector, NON-parallel machines (slowly). The
methods are indeed iterative.
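
A minimal sketch of such an iterative eigensolve, using scipy's
Lanczos-based eigsh on an arbitrary sparse symmetric stand-in (not a
real chemistry Hamiltonian; the size and structure are placeholders):

import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

n = 100_000
rng = np.random.default_rng(4)
diag = rng.standard_normal(n)
off = 0.01 * np.ones(n - 1000)
A = sp.diags([off, diag, off], offsets=[-1000, 0, 1000], format="csr")

vals, vecs = eigsh(A, k=5, which="SA")    # five smallest eigenvalues
print(vals)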


Doug McDonald

carroll@ssc-vax (Jeff Carroll) (06/15/91)

In article <1991Jun14.062703.20325@marlin.jcu.edu.au> csrdh@marlin.jcu.edu.au (Rowan Hughes) writes:
>This is now only slightly related to architecture, but I'll give my 2c
>worth.

Yup.

Probably belongs in sci.math.num-analysis, but I'm afraid to post there :^)

On the other hand, the IEEE floating point argument probably belongs there
too, but the guy from IBM would find that audience even less hospitable
than this one.

>If Jeff has to solve large (N>5000) dense problems, then he has a real
>problem. In my area (fluid dynamics) the bulk of matrices, originating from
>the Navier-Stokes eqns, are large (n=100,000 is common) and very sparse,
>typically 25 non-zero elements per row. Unfortunately they often have
>zeros on the diagonal (psi-zeta form of NS). These types of problems are
>best solved by methods specifically designed for sparse banded matrices.
>Direct methods (LU, Householder, etc.) will work quite well, but they
>will never have the capabilities of routines like BICO. 

You're right; we had a real problem. We had two FPS 264s running 168
hours a week, and we needed a special dispensation from upper management
to take the machines down for periodic maintenance. Then we bought a couple
of hypercubes, and suddenly we had more computer time than we knew what to
do with... for a few weeks.

My field is electromagnetics (not static magnetics), and I'll admit that
computational EM has not yet reached the level of sophistication of CFD,
but we haven't figured out a good way to use FEM on Maxwell's equations
yet, due to the hyperbolicity of the PDEs. There are some people who
use finite differences on the type of problems we're solving, but we've
found that the boundary integral methods we use give much more satisfactory
answers. A couple of years ago I worked with some guys who were trying to solve
Maxwell's equations by modifying one of their aero codes, but the answers
were just plain wrong.

I have nothing against using sparse algorithms (or iterative methods) on 
sparse problems. I'm just pointing out that there is a certain segment of
the aerospace industry that is interested in solving large dense problems,
sometimes in environments in which you can't devote $O(10^7) of machinery
to solving them.

>I'd be so bold
>as to suggest that Jeff's type of problem is more the exception than
>the rule. 

Probably. EM engineers are just learning how to use large computers, for
the most part; but they'll be using them more and more as time goes by.
There's a small group known as the Applied Computational Electromagnetics
Society (ACES) which I keep intending to join, which is largely composed
of people who do this sort of thing (as well as the people who use iterative
methods to solve finite difference formulations). It's about two years off
the ground now.

But I'm sure that EM people aren't the only ones in the world using boundary
integral methods that result in dense formulations...

>The largest problems capable of actually being solved will
>always be sparse.

This is probably true too in general, but the answer to this question in
any specific case is somewhat architecture dependent (have we been booted
out of comp.arch yet?); direct matrix methods parallelize *very* well.



-- 
Jeff Carroll		carroll@ssc-vax.boeing.com

"...and of their daughters it is written, 'Cursed be he who lies with 
any manner of animal.'" - Talmud