[comp.arch] More on condition numbers of uniform random matrices, especially Linpack

dgh@validgh.com (David G. Hough on validgh) (06/20/91)

A while back I wrote in comp.arch:

> IEEE 754 floating-point arithmetic was intended to increase the domain
> of problems for which good results could be obtained, and to increase the
> likelihood of getting error indications if good results could not be obtained,
> all without significantly degrading performance on "common" cases and
> without significantly increasing total system cost relative to sloppy
> arithmetic.
 
> Since none of these goals are very quantitative, 
> people will argue about how well they've been achieved.
 
> Part of the problem is that the benchmark programs in use measure only
> common cases.  

[for instance]

> Various versions of the Linpack benchmark all have in common
> that the data is taken from a uniform random distribution, producing problems
> of very good condition.   So the worst possible linear
> equation solver algorithm running on the dirtiest possible floating-point
> hardware should be able to produce a reasonably small residual, even for
> input problems of very large dimension.
 
[I should have said "the worst likely linear equation solver algorithm -
with no pivoting..." since Shearer pointed out that an aggressively bad
algorithm that sought the smallest available pivot would soon fail.]

Although I stand by my point - programs that run acceptably well on
all the common platforms can't prove much about superior robustness of one
arithmetic system over another - some of the Linpack condition
remarks merited further examination.

The test matrices A for the linpack benchmark are generated by
a nominally uniform random congruential linear method

      init = 1325
      do 30 j = 1,n
         do 20 i = 1,n
            init = mod(3125*init,65536)
            a(i,j) = (init - 32768.0)/16384.0
   20    continue
   30 continue

of period 2**14.  To attribute extremely good condition to such
matrices in general may be an exaggeration,
depending on how much the "random" numbers resemble a
truly random sample from a truly continuous interval.

In any event for the Linpack benchmark matrices, condition numbers are mostly in
the range from 300 to 1000, which is somewhat higher than I'd expected,
although not disastrous, and acceptable for some single-precision applications.

For certain dimensions n, the matrices generated are singular in exact
arithmetic and very ill conditioned in floating-point arithmetic.
These include obvious ones 256, 512, 1024, as well as others like
320 and 800, all of which have rank substantially
smaller than their dimension, as well as 128 itself, which is
more interesting.  A 128x128 matrix contains the entire period of the
random number generator exactly once.  I don't know if there is a theorem
that such matrices should be singular, but this particular one seems to have
rank 127.

Aside from those exactly singular cases, the worst turned up in my survey
was n=266 which has condition number 129540.

The condition number figures in a bound for the magnification of error in x
from errors in b or A, for general b.  I suspect that it overstates the 
susceptibility of x to changes in A or b for b of the special form generated
in the Linpack benchmark, where bi is computed as the sum of aji, corresponding
to an exact solution x consisting of a vector of 1's.   
In general terms, this method
of choosing b tends to immunize x against the ill condition of A by insuring
that ||b|| = ||Ax|| is not very much less than ||A|| ||x||, compared to choosing
an x like the last right singular vector of A, for which ||b|| would
be much smaller than ||A|| ||x||.   Greater insight than mine would be 
required to definitely refute or confirm the foregoing.
But this method of generating data does seem to maximize the chances of
getting acceptable numerical results even with short sloppy arithmetic,
and perhaps even with unstable algorithms: how much faster does a Linpack
1000x1000 benchmark run if no pivoting is employed?  Not much fortunately.
When I investigated gaussian elimination without pivoting on the Linpack
benchmark data, I found the effect on the residual to be smaller than I
might have guessed:

dimension       method  minimum normalized      error
  precision              pivot   residual        x-1
 
100 single      normal  2       1.6             1e-5
100 single      nopivot .13     70              3e-4
1000 single     normal  2       11              2e-4
1000 single     nopivot 7e-2    2600            8e-2
1000 double     normal  2       10              5e-13
1000 double     nopivot 6e-2    3900            6e-11
 
The information tabulated
below was computed with SSVDC from netlib, and indicates
dimension, computation time, the "info" error parameter value,
the largest singular value s1, the smallest singular value sn, the
condition number s1/sn, and the corresponding ratios for the next smaller
singular values, s1/s(n-1) and s1/s(n-2).
Following the SVD data is an anecdote I uncovered from the first time 
I noticed the periodicity of matgen.

           info
   n  time     s1      sn     s1/sn   s1/sn-1   s1/sn-2
   3   0.0 0  2.6 0.3E+00        10         2         1
   4   0.0 0  3.8 0.5E+00         7         3         2
   5   0.0 0  4.3 0.2E+00        18         3         2
   6   0.0 0  5.3 0.6E+00         9         4         3
   7   0.0 0  5.4 0.5E-01       120        15         3
   8   0.0 0  5.5 0.2E+00        26         5         4
   9   0.0 0  6.8 0.2E+00        45         9         9
  10   0.0 0  6.2 0.5E+00        13        11         5
  11   0.0 0  6.7 0.2E+00        30        13         7
  12   0.0 0  6.2 0.3E+00        20        10         6
  13   0.0 0  7.0 0.3E+00        27         7         5
  14   0.0 0  7.9 0.3E+00        26        13         7
  15   0.0 0  7.5 0.4E-01       182        20         6
  16   0.0 0  8.4 0.1E+00        57         8         8
  17   0.0 0  8.2 0.6E-01       147        13         8
  18   0.0 0  8.4 0.7E-01       127        13        12
  19   0.0 0  9.0 0.1E-01       916        64        12
  20   0.0 0  9.0 0.2E+00        46         9         7
  21   0.0 0  9.4 0.1E+00        64        23         8
  22   0.0 0  9.6 0.6E-01       164        23         7
  23   0.0 0 10.2 0.2E+00        49        24        13
  24   0.0 0 10.6 0.1E+00        89        38        14
  25   0.0 0 11.4 0.7E+00        16        11         9
  26   0.0 0 10.8 0.2E-01       518        24        10
  27   0.0 0 11.3 0.4E-01       313        35        22
  28   0.0 0 10.8 0.5E-01       203        36        22
  29   0.0 0 12.0 0.7E-02      1825        22        11
  30   0.0 0 11.9 0.8E-01       150        22        16
  31   0.0 0 12.6 0.2E+00        50        28        23
  32   0.0 0 12.5 0.2E+00        82        46        15
  33   0.1 0 13.0 0.4E-02      3370        17        16
  34   0.1 0 12.6 0.3E+00        49        24        15
  35   0.1 0 12.5 0.9E-01       144        35        27
  36   0.1 0 13.0 0.3E-01       406        39        24
  37   0.1 0 13.6 0.7E-01       196        39        28
  38   0.1 0 12.9 0.3E+00        39        26        15
  39   0.1 0 14.0 0.2E+00        56        32        25
  40   0.1 0 14.4 0.7E-01       193        30        23
  41   0.1 0 13.9 0.6E-01       241        42        34
  42   0.1 0 14.1 0.5E-01       280        31        20
  43   0.1 0 14.8 0.5E-01       300        61        35
  44   0.1 0 14.4 0.5E-01       291        59        36
  45   0.1 0 14.3 0.3E-01       562        43        22
  46   0.1 0 14.7 0.7E-02      2049        51        30
  47   0.1 0 15.1 0.5E-01       305        52        25
  48   0.1 0 15.0 0.2E+00        91        40        20
  49   0.1 0 15.0 0.8E-01       184        91        30
  50   0.1 0 15.8 0.3E-01       475        82        41
  51   0.1 0 15.4 0.6E-01       250        39        22
  52   0.1 0 15.4 0.8E-01       203        43        29
  53   0.2 0 16.0 0.3E-01       488        31        28
  54   0.2 0 15.5 0.5E-01       285        40        26
  55   0.2 0 15.6 0.1E+00       115        61        27
  56   0.2 0 16.1 0.2E+00        92        29        27
  57   0.2 0 16.5 0.2E-01       815       124        27
  58   0.2 0 17.0 0.7E-01       241        49        34
  59   0.2 0 18.0 0.1E+00       130        64        22
  60   0.2 0 17.6 0.9E-01       192        38        33
  61   0.2 0 17.6 0.5E-01       339        59        33
  62   0.2 0 18.2 0.6E-02      3142        75        51
  63   0.2 0 18.2 0.3E-01       695       101        69
  64   0.2 0 17.1 0.6E-01       304        34        21
  65   0.2 0 17.8 0.5E-01       389        51        44
  66   0.3 0 18.3 0.3E+00        62        54        30
  67   0.3 0 17.6 0.4E-01       402        52        37
  68   0.3 0 18.6 0.1E+00       155        45        37
  69   0.3 0 18.1 0.1E+00       142        68        46
  70   0.3 0 19.0 0.1E+00       159        67        24
  71   0.3 0 18.7 0.5E-01       372        87        47
  72   0.3 0 18.7 0.7E-01       267        47        27
  73   0.3 0 19.0 0.6E-01       307        64        41
  74   0.4 0 19.3 0.5E-01       375        55        30
  75   0.4 0 19.8 0.6E-01       331        44        30
  76   0.4 0 19.3 0.2E-01       969        67        43
  77   0.4 0 19.0 0.1E-01      1437        73        44
  78   0.4 0 20.1 0.6E-01       314        76        42
  79   0.4 0 20.2 0.2E+00        90        64        25
  80   0.4 0 18.9 0.1E+00       138       122        47
  81   0.4 0 21.1 0.6E-03     36617        77        48
  82   0.5 0 20.3 0.8E-01       259        60        46
  83   0.4 0 20.1 0.1E-01      1367       136        48
  84   0.5 0 20.6 0.5E-01       457        41        31
  85   0.5 0 20.4 0.2E-01      1273       111        44
  86   0.5 0 21.2 0.6E-01       367       117        58
  87   0.5 0 20.5 0.4E-01       498       167        48
  88   0.5 0 20.3 0.8E-01       266       113        65
  89   0.5 0 22.0 0.3E-01       662        96        51
  90   0.6 0 21.2 0.9E-01       240        90        49
  91   0.6 0 21.2 0.7E-01       325        97        53
  92   0.6 0 21.3 0.2E+00       116        61        58
  93   0.6 0 21.5 0.2E-01      1080       182        44
  94   0.6 0 22.0 0.3E-01       804       231        39
  95   0.7 0 21.2 0.1E+00       188        66        54
  96   0.7 0 21.6 0.1E+00       172        76        47
  97   0.7 0 22.7 0.4E-01       554        78        58
  98   0.7 0 22.1 0.5E-01       469       142        59
  99   0.7 0 22.5 0.4E-01       586       121        73
 100   0.7 0 22.9 0.2E-01      1257        54        40
 101   0.8 0 23.9 0.2E+00       130        91        57
 102   0.8 0 22.9 0.4E-01       594        58        35
 103   0.8 0 22.7 0.2E+00       139        58        38
 104   0.8 0 22.2 0.3E-01       752       121        52
 105   0.8 0 22.6 0.1E+00       193       112        85
 106   0.9 0 23.7 0.2E-01      1140       113        83
 107   0.9 0 22.9 0.8E-01       292        71        54
 108   0.9 0 23.0 0.1E-01      1917       172        51
 109   0.9 0 23.2 0.2E+00       121       108        77
 110   1.0 0 23.2 0.1E+00       232       112        65
 111   1.0 0 24.1 0.1E+00       184       107        64
 112   1.0 0 22.6 0.4E-02      5336        55        47
 113   1.0 0 24.4 0.1E-01      1715        63        52
 114   1.0 0 23.9 0.7E-02      3605       147        60
 115   1.1 0 24.0 0.7E-02      3365       159        59
 116   1.1 0 23.9 0.5E-01       489       218        64
 117   1.2 0 24.4 0.6E-01       380       213       112
 118   1.3 0 24.4 0.2E-02     10670       156        69
 119   1.2 0 24.4 0.8E-02      3092       110        63
 120   1.3 0 23.7 0.9E-01       251       161        80
 121   1.3 0 24.4 0.6E-02      3844       139        43
 122   1.3 0 25.3 0.3E-02      8732       129        59
 123   1.3 0 24.9 0.1E+00       184        72        52
 124   1.4 0 24.6 0.7E-03     36809       147        55
 125   1.4 0 25.3 0.6E-01       440       147       131
 126   1.5 0 25.0 0.1E+00       211       162        79
 127   1.4 0 25.1 0.1E+00       244       132        56
 128   1.4 0 25.3 0.4E-06  62712320       913       260
 129   1.5 0 25.8 0.1E+00       220       157        76
 130   1.5 0 26.1 0.9E-01       279       121        75
 131   1.5 0 25.9 0.9E-01       281       121        57
 132   1.6 0 25.5 0.7E-01       342        95        66
 133   1.6 0 26.0 0.9E-01       279       114        62
 134   1.7 0 25.8 0.8E-01       308       145        57
 135   1.7 0 26.0 0.2E-01      1053        80        59
 136   1.7 0 25.7 0.4E-02      5720       709       111
 137   1.9 0 26.1 0.8E-01       330        86        59
 138   1.8 0 26.3 0.5E-01       480        79        50
 139   1.8 0 26.6 0.2E+00       171        73        61
 140   1.9 0 26.0 0.2E+00       146        79        58
 141   1.9 0 26.5 0.9E-01       307       130        85
 142   1.9 0 27.0 0.2E-01      1737       426        69
 143   2.0 0 26.9 0.5E-01       587       134        64
 144   2.0 0 26.1 0.1E+00       190       108        69
 145   2.1 0 27.5 0.1E-01      2259       143        56
 146   2.1 0 27.5 0.1E-01      2173       197        90
 147   2.1 0 27.7 0.5E-01       609       135        99
 148   2.4 0 27.4 0.2E-01      1124       138        95
 149   2.4 0 27.9 0.4E-01       717       282        92
 150   2.6 0 27.2 0.1E+00       267       115        99
 151   2.6 0 27.6 0.7E-01       395       169        85
 152   2.6 0 26.7 0.8E-01       319       143        62
 153   2.5 0 28.1 0.3E-01       813       134        66
 154   2.5 0 28.0 0.9E-01       310       158       143
 155   2.5 0 27.8 0.9E-01       319       206       115
 156   2.6 0 27.7 0.6E-01       433        90        60
 157   2.6 0 28.1 0.4E-01       709        90        71
 158   2.6 0 28.3 0.1E-01      2144        85        68
 159   2.7 0 28.8 0.2E+00       189       111        83
 160   2.8 0 29.1 0.6E-01       510       200        94
 161   2.8 0 29.2 0.2E-01      1391       248        72
 162   2.8 0 28.3 0.1E+00       222       167        95
 163   2.9 0 28.7 0.1E+00       238       187        54
 164   3.0 0 28.6 0.8E-01       375       120        55
 165   3.0 0 28.7 0.1E+00       302       288        66
 166   3.0 0 29.2 0.4E-01       669       185       150
 167   3.1 0 28.8 0.9E-01       324       186       105
 168   3.1 0 28.5 0.7E-03     38553       153        96
 169   3.2 0 29.1 0.2E+00       180        99        82
 170   3.3 0 29.7 0.5E-01       596       130        67
 171   3.4 0 29.9 0.5E-01       552       216       134
 172   3.4 0 29.1 0.6E-02      5059       118       104
 173   3.5 0 29.3 0.5E-01       628       128        77
 174   3.6 0 29.6 0.3E-01      1021       148        92
 175   3.6 0 29.9 0.8E-01       382       263        96
 176   3.8 0 28.5 0.2E+00       160       123        84
 177   3.7 0 29.7 0.2E-01      1396       216        76
 178   3.8 0 30.0 0.8E-01       384       128        68
 179   3.9 0 30.9 0.1E+00       208       180        96
 180   4.0 0 29.1 0.5E-01       563       110        94
 181   4.0 0 30.9 0.3E-01      1066       272        87
 182   4.0 0 30.0 0.1E-01      2543       199       100
 183   4.1 0 30.0 0.1E+00       300       201        97
 184   4.2 0 29.2 0.1E+00       274       124       104
 185   4.2 0 30.2 0.1E-01      2859       396       154
 186   4.3 0 30.6 0.4E-01       687       365       129
 187   4.4 0 31.1 0.7E-01       468       299       176
 188   4.4 0 30.9 0.6E-01       548       123        91
 189   4.5 0 30.8 0.4E-01       746       184        89
 190   4.6 0 31.1 0.1E+00       247       129        93
 191   4.6 0 31.3 0.1E-01      2230       134       100
 192   4.7 0 34.3 0.2E-01      1746       277       117
 193   4.7 0 31.2 0.5E-01       580       190        80
 194   4.8 0 31.8 0.8E-03     39441       303       162
 195   4.9 0 31.1 0.8E-01       384       293        99
 196   5.0 0 31.2 0.1E-01      2643       150        97
 197   5.1 0 31.2 0.9E-01       355       138       107
 198   5.2 0 31.8 0.9E-02      3564       216        93
 199   5.2 0 31.7 0.7E-01       464       150       117
 200   5.3 0 30.4 0.2E-01      1382       390       109
 201   5.4 0 32.1 0.4E-01       902       213       100
 202   5.5 0 32.3 0.3E-01      1233       190        98
 203   5.5 0 31.7 0.1E+00       302       159       122
 204   5.6 0 31.9 0.2E-01      1346       104        85
 205   5.7 0 32.1 0.2E+00       191       128        99
 206   5.7 0 32.2 0.6E-03     57730       155       104
 207   5.9 0 32.3 0.4E-01       913       318       162
 208   6.0 0 30.6 0.6E-02      5264       245       102
 209   6.0 0 32.5 0.7E-01       434       214       192
 210   6.1 0 32.3 0.5E-01       672       197       132
 211   6.2 0 32.5 0.1E+00       250       150        79
 212   6.2 0 32.6 0.4E-01       883       456       122
 213   6.4 0 33.4 0.1E-01      2995       518       141
 214   6.4 0 33.1 0.8E-02      4069       298       204
 215   6.5 0 32.5 0.7E-01       449       374       112
 216   6.6 0 31.8 0.2E-01      1812       161        73
 217   6.7 0 33.4 0.4E-01       856       239        78
 218   6.8 0 33.2 0.9E-01       376       215        82
 219   6.9 0 33.2 0.7E-01       445       216       163
 220   7.4 0 32.6 0.6E-01       557       188       114
 221   7.3 0 33.6 0.5E-01       714       346       185
 222   7.2 0 33.0 0.2E-01      1665       392       153
 223   7.2 0 33.4 0.4E-02      7875       281       140
 224   7.4 0 33.7 0.6E-01       553       285       159
 225   7.6 0 33.2 0.4E-01       767       157       124
 226   7.6 0 34.0 0.3E-01      1117       137       125
 227   8.1 0 34.0 0.2E-01      2167       421       264
 228   8.2 0 33.4 0.3E-01      1327       470       109
 229   8.4 0 34.3 0.5E-01       691       337       131
 230   8.5 0 33.0 0.1E+00       333       138        96
 231   8.5 0 34.6 0.8E-01       455       233       143
 232   8.6 0 33.2 0.2E-01      2138       557       156
 233   8.6 0 34.0 0.5E-01       681       285       145
 234   8.7 0 34.3 0.1E-01      2375       217       152
 235   8.7 0 35.1 0.5E-01       658       243       151
 236   8.8 0 34.3 0.4E-01       843       419       168
 237   8.7 0 34.8 0.3E-01      1075       182       100
 238   8.8 0 35.3 0.1E+00       308       185       108
 239   9.0 0 34.8 0.2E-02     18905       130       113
 240   9.0 0 32.4 0.7E-01       487       160       116
 241   9.1 0 34.4 0.3E-01      1034       271        92
 242   9.2 0 34.7 0.1E-01      3626       278       119
 243   9.3 0 34.8 0.2E-01      1825       229       146
 244   9.4 0 34.9 0.7E-01       493       195       150
 245   9.5 0 35.5 0.1E-01      3237       134        96
 246   9.8 0 35.1 0.1E-01      2454       216       154
 247   9.8 0 35.4 0.8E-01       455       394       133
 248  10.0 0 33.8 0.7E-01       473       193        99
 249  10.0 0 36.3 0.5E-01       762       353       142
 250  10.1 0 35.7 0.6E-01       592       412       195
 251  10.3 0 35.8 0.2E+00       237       169       130
 252  10.4 0 34.9 0.1E+00       348       237       115
 253  10.5 0 36.9 0.6E-02      6626       439       172
 254  10.6 0 35.5 0.1E-01      3611       281       140
 255  10.7 0 36.1 0.1E+00       378       225       118
 256  10.7 0 59.1 0.1E-20 2147483647 2147483647 2147483647
 257  11.0 0 36.5 0.1E+00       255       208       144
 258  11.1 0 35.8 0.2E-01      1544       277        96
 259  11.3 0 36.0 0.2E-01      1443       288       205
 260  11.3 0 35.8 0.4E-01       818       324       127
 261  11.5 0 35.9 0.1E+00       261       218       115
 262  11.6 0 36.2 0.1E+00       298       165       138
 263  11.8 0 37.5 0.4E-01       888       352       238
 264  11.8 0 34.5 0.9E-01       369       170       128
 265  12.0 0 37.0 0.4E-02      9510       285       155
 266  12.2 0 36.2 0.3E-03    129540       276       118
 267  12.3 0 38.8 0.1E-01      3443       291       173
 268  12.4 0 36.3 0.7E-01       538       197       120
 269  12.5 0 36.8 0.4E-01       861       199       152
 270  13.4 0 37.2 0.5E-01       748       226       181
 271  13.3 0 36.9 0.3E-01      1353       570       134
 272  13.9 0 35.4 0.4E-01       995       160       112
 273  13.8 0 37.1 0.2E-01      1872       258       111
 274  13.8 0 37.6 0.2E-01      1609       338       115
 275  13.4 0 37.7 0.1E-02     26836       330       226
 276  13.5 0 37.1 0.6E-01       634       221       146
 277  13.6 0 38.7 0.5E-01       764       180       124
 278  14.4 0 38.1 0.1E+00       369       322       163
 279  14.1 0 37.1 0.4E-01       929       287       149
 280  14.6 0 35.8 0.8E-01       476       127       121
 281  14.3 0 37.8 0.9E-03     42590       173       153
 282  14.2 0 37.0 0.5E-01       720       236       167
 283  14.4 0 37.7 0.2E-01      1686       266       152
 284  14.6 0 38.2 0.4E-01       955       172       107
 285  15.0 0 37.9 0.2E-02     24267       246       127
 286  14.8 0 37.7 0.2E-01      2501       288       168
 287  15.0 0 38.4 0.5E-01       700       287       189
 288  15.1 0 41.6 0.5E-01       832       206       177
 289  15.3 0 38.4 0.4E-01       917       384       159
 290  15.5 0 38.6 0.5E-01       705       459       179
 291  15.6 0 38.5 0.2E-01      2315       214       135
 292  15.8 0 38.5 0.3E-01      1155       201       141
 293  15.9 0 38.2 0.4E-01      1086       474        92
 294  16.1 0 38.5 0.3E-01      1380       378       255
 295  16.2 0 38.8 0.6E-02      6553       447       178
 296  16.4 0 36.8 0.1E-01      3100       343       196
 297  16.6 0 38.8 0.3E-01      1211       174       136
 298  16.7 0 38.4 0.2E-01      1981       437       219
 299  16.9 0 39.4 0.7E-01       528       248       219
 300  17.1 0 38.4 0.4E-01       899       256       154
 310  18.7 0 39.1 0.6E-01       633       315       155
 320  20.0 0 54.1 0.5E-07 995625728 314481184 136633984
 330  22.4 0 40.5 0.4E-01      1062       481       210
 340  24.5 0 41.2 0.1E+00       322       237       151
 350  26.6 0 41.7 0.8E-01       497       250       208
 360  28.9 0 40.1 0.2E-01      1727       654       182
 370  31.2 0 43.5 0.2E-01      2518       263       210
 380  33.8 0 43.0 0.2E-01      1906      1525       270
 390  36.4 0 43.7 0.8E-01       565       431       221
 400  39.2 0 44.2 0.1E-01      3794       443       193
 410  42.1 0 44.6 0.9E-02      4826       370       269
 420  45.1 0 45.0 0.1E-01      3303       557       254
 430  48.3 0 45.7 0.1E-01      3997       412       268
 440  51.7 0 44.2 0.5E-01       886       359       290
 450  55.1 0 46.9 0.3E-01      1589       305       218
 460  59.9 0 46.9 0.1E+00       318       239       175
 470  63.8 0 47.7 0.1E+00       449       315       189
 480  67.3 0 61.8 0.5E-01      1277       777       445
 490  71.4 0 48.4 0.2E-01      2164       292       249
 500  75.3 0 49.4 0.1E+00       462       299       190
 550 100.2 0 52.7 0.2E-01      2637       498       208
 600 129.1 0 53.4 0.6E-01       839       521       401
 650 162.1 0 56.3 0.2E-01      2920      1342       284
 700 200.7 0 56.7 0.4E-01      1430       470       322
 750 245.9 0 60.2 0.3E-01      2399       591       368
 800 292.5 0 97.5 0.1E-07 2147483647 1187167360 666092416
 850 358.1 0 63.3 0.5E-01      1300       645       265
 900 421.6 0 63.2 0.1E+00       629       541       330
 950 495.3 0 67.1 0.5E-02     12392      1252       469
1000 575.9 0 70.5 0.2E-01      3624      2427       664

From 6 January 1989:

     It's not often that I apply what I learned at Berkeley to my daily work,
which primarily involves finding very low-level bugs in hardware and software,
mostly under development by other people.

     Since I have to test hardware with a wide performance range, benchmarks
have to be adjustable in size so that they don't run too quickly on fast
hardware to be timed accurately, nor too slowly on slow hardware to finish
before that hardware is obsolete.

     So I have added adjustable parameters to a number of benchmarks.  To
calibrate them I run a little measurement program derived from the infamous
Linpack benchmark.  (When I first came to Sun I was so ignorant that I thought
Linpack was a library of linear algebra subroutines rather than a benchmark
program.)

     This little Linpack starts off factoring a 32x32 matrix.  Even a Sun-2
can do that in acceptable time.  If the time is too fast, it then automatically
tries a larger matrix, up to 512x512.  Then it computes what the execution 
time would have been for a 512x512 on the particular system, scaling the
time for whatever matrix it settled on by (512/n)**3.

     Some Sun hardware currently under development has been getting fast
enough that the calibration program tries 512x512.  On both projects in ques-
tion, however, I noticed that the program was getting hung up and taking an
inordinate amount of time to finish the calibration program.  This of course
indicated a hardware bug, of which I informed the hardware guys and left them
to find it.  Oddly enough, the bug only affected IEEE single precision; double
precision was fine; most of our bugs tend to occur in double precision for
various reasons.  Furthermore the bug didn't seem to show up in the 100x100,
300x300, or 1000x1000 single precision Linpack benchmarks that I also run.

     In the first project the microcoder dutifully tracked things down with a
logic analyzer to where an underflow was occurring.   Now I knew that the Lin-
pack benchmark generates uniform random data over an innocuous interval, and
we all know that matrices composed of random data from a uniform distribution
are remarkably well conditioned.  I remember this particularly well because
one of my Berkeley qualifying examination questions - put to me a week ahead
of time to ponder fruitlessly - was to come up with a plausible explanation of
how an eminent mathematician had conducted an empirical investigation of
iterative improvement, studying published test matrices and matrices of
uniformly-distributed random data, and had come to the conclusion that itera-
tive improvement never improved the answer by more than one or two digits,
which seemed to argue against what's routinely taught in elementary numerical
analysis.

     Anyway I told the microcoder that there was still a hardware bug, but
shortly thereafter something else changed and the calibration benchmark went
back to 256x256 and there never was any other evidence of a single precision
problem, so I quit worrying about it.  Underflows, if they could have
occurred, would account for the program apparently hanging up, because the way
Sun obtains IEEE conformance on underflow in systems built upon hardware like
the Weitek 1164/5, is to trap on underflow and recompute the correct result
very slowly in software.  Such underflows are supposed to be rare.

     Then a totally unrelated but even more critical project ran into a simi-
lar problem.  This afternoon I went to a high level crisis meeting attended by
at least two vice presidents, at which I gloomily reported that a new, previ-
ously unknown hardware bug had appeared that affected single precision.

     Upon returning from that meeting one of my colleagues, who'd been helping
at the logic analyzer until he got suspicious,  informed me that he thought
there was no hardware or compiler bug, rather that the program had started to
underflow because the pivots had gotten too small and fallen off the end of
single precision.  What about the well-known wonderful condition of matrices
of uniform random data, I asked?  He suspected that the high dimensionality
(512x512) was just too much for single precision.  I wondered why the
1000x1000 benchmark worked in single precision.

     Since no progress was being made on the hardware bug, I started printing
out the pivots in the program.  They started out as normal numbers like 1 or
-10, then suddenly dropped to about 1e-7, then later to 1e-14, and then:

 k 82 pivot -1.8666e-20
 k 83 pivot -2.96595e-14
 k 84 pivot 2.46156e-14
 k 85 pivot 2.40541e-14
 k 86 pivot -4.99053e-14
 k 87 pivot 1.7579e-14
 k 88 pivot 1.69295e-14
 k 89 pivot -1.56396e-14
 k 90 pivot 1.37869e-14
 k 91 pivot -3.10221e-14
 k 92 pivot 2.35206e-14
 k 93 pivot 1.32175e-14
 k 94 pivot -7.77593e-15
 k 95 pivot 1.34815e-14
 k 96 pivot -1.02589e-21
 k 97 pivot 4.27131e-22
 k 98 pivot 1.22101e-21
 k 99 pivot -7.12407e-22
 k 100 pivot -1.75579e-21
 k 101 pivot 3.13343e-21
 k 102 pivot -6.99946e-22
 k 103 pivot 3.82048e-22
 k 104 pivot 8.05538e-22
 k 105 pivot -1.18164e-21
 k 106 pivot -6.349e-22
 k 107 pivot -2.48245e-21
 k 108 pivot -8.89452e-22
 k 109 pivot -8.23235e-22
 k 110 pivot 4.40549e-21
 k 111 pivot 1.12387e-21
 k 112 pivot -4.78853e-22
 k 113 pivot 4.38739e-22
 k 114 pivot 7.3868e-28
SIGFPE 8: numerical exception, CHK, or TRAP
stopped at       daxpy+0x18c:           movl    a4@(0xe10),a3@
                -

Those sudden drops were certainly perplexing - almost full word width, as if
the matrix were actually exactly singular.  Could there be a bug in the matrix
generator routine (matgen) causing some wild data to be thrown into the pot?
I looked and found a perfectly conventional portable linear congruential ran-
dom number generator...
-- 

David Hough

dgh@validgh.com		uunet!validgh!dgh	na.hough@na-net.ornl.gov