dgh@validgh.com (David G. Hough on validgh) (06/20/91)
A while back I wrote in comp.arch:

> IEEE 754 floating-point arithmetic was intended to increase the domain
> of problems for which good results could be obtained, and to increase the
> likelihood of getting error indications if good results could not be obtained,
> all without significantly degrading performance on "common" cases and
> without significantly increasing total system cost relative to sloppy
> arithmetic.
> Since none of these goals are very quantitative,
> people will argue about how well they've been achieved.
> Part of the problem is that the benchmark programs in use measure only
> common cases. [for instance]
> Various versions of the Linpack benchmark all have in common
> that the data is taken from a uniform random distribution, producing problems
> of very good condition. So the worst possible linear
> equation solver algorithm running on the dirtiest possible floating-point
> hardware should be able to produce a reasonably small residual, even for
> input problems of very large dimension.

[I should have said "the worst likely linear equation solver algorithm - with no pivoting..." since Shearer pointed out that an aggressively bad algorithm that sought the smallest available pivot would soon fail.]

Although I stand by my point - programs that run acceptably well on all the common platforms can't prove much about the superior robustness of one arithmetic system over another - some of the Linpack condition remarks merited further examination.

The test matrices A for the Linpack benchmark are generated by a nominally uniform linear congruential method of period 2**14:

      init = 1325
      do 30 j = 1,n
         do 20 i = 1,n
            init = mod(3125*init,65536)
            a(i,j) = (init - 32768.0)/16384.0
   20    continue
   30 continue

To attribute extremely good condition to such matrices in general may be an exaggeration, depending on how much the "random" numbers resemble a truly random sample from a truly continuous interval. In any event, for the Linpack benchmark matrices the condition numbers are mostly in the range from 300 to 1000, which is somewhat higher than I'd expected, although not disastrous, and acceptable for some single-precision applications.

For certain dimensions n, the matrices generated are singular in exact arithmetic and very ill conditioned in floating-point arithmetic. These include the obvious ones, 256, 512, and 1024, as well as others like 320 and 800, all of which have rank substantially smaller than their dimension; 128 itself is a more interesting case. A 128x128 matrix contains the entire period of the random number generator exactly once. I don't know whether there is a theorem that such matrices should be singular, but this particular one seems to have rank 127. Aside from those exactly singular cases, the worst that turned up in my survey was n=266, which has condition number 129540.

The condition number figures in a bound for the magnification of error in x from errors in b or A, for general b. I suspect that it overstates the susceptibility of x to changes in A or b when b has the special form generated in the Linpack benchmark, in which b(i) is computed as the sum over j of a(i,j), corresponding to an exact solution x consisting of a vector of 1's. In general terms, this method of choosing b tends to immunize x against the ill condition of A by ensuring that ||b|| = ||Ax|| is not very much less than ||A|| ||x||, compared to choosing an x like the last right singular vector of A, for which ||b|| would be much smaller than ||A|| ||x||.
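As an aside, the period claim above is easy to check directly; the little program below is just my sketch of such a check, not anything from the Linpack sources, and simply iterates the matgen recurrence until it comes back to its seed.

c     sketch: count how many steps the matgen recurrence takes to
c     return to its starting value 1325.  It should print 16384,
c     i.e. 2**14, so a 128x128 matrix consumes the period exactly once.
      program lcgper
      integer init, seed, count
      seed = 1325
      init = seed
      count = 0
   10 continue
      init = mod(3125*init,65536)
      count = count + 1
      if (init .ne. seed) goto 10
      write(*,*) 'period of matgen generator =', count
      end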
Greater insight than mine would be required to definitively refute or confirm the foregoing speculation about the special form of b. But this method of generating data does seem to maximize the chances of getting acceptable numerical results even with short, sloppy arithmetic, and perhaps even with unstable algorithms: how much faster does a Linpack 1000x1000 benchmark run if no pivoting is employed? Not much, fortunately. When I investigated Gaussian elimination without pivoting on the Linpack benchmark data, I found the effect on the residual to be smaller than I might have guessed:

dimension   method    minimum   normalized    error
precision              pivot     residual      x-1

 100 single  normal     2          1.6         1e-5
 100 single  nopivot    .13        70          3e-4
1000 single  normal     2          11          2e-4
1000 single  nopivot    7e-2       2600        8e-2
1000 double  normal     2          10          5e-13
1000 double  nopivot    6e-2       3900        6e-11

The information tabulated below was computed with SSVDC from netlib, and indicates the dimension n, the computation time, the "info" error parameter value, the largest singular value s1, the smallest singular value sn, the condition number s1/sn, and the corresponding ratios for the next smaller singular values, s1/s(n-1) and s1/s(n-2). Following the SVD data is an anecdote from the first time I noticed the periodicity of matgen.

   n   time  info  s1    sn       s1/sn  s1/s(n-1)  s1/s(n-2)
   3    0.0  0  2.6  0.3E+00  10  2  1
   4    0.0  0  3.8  0.5E+00  7  3  2
   5    0.0  0  4.3  0.2E+00  18  3  2
   6    0.0  0  5.3  0.6E+00  9  4  3
   7    0.0  0  5.4  0.5E-01  120  15  3
   8    0.0  0  5.5  0.2E+00  26  5  4
   9    0.0  0  6.8  0.2E+00  45  9  9
  10    0.0  0  6.2  0.5E+00  13  11  5
  11    0.0  0  6.7  0.2E+00  30  13  7
  12    0.0  0  6.2  0.3E+00  20  10  6
  13    0.0  0  7.0  0.3E+00  27  7  5
  14    0.0  0  7.9  0.3E+00  26  13  7
  15    0.0  0  7.5  0.4E-01  182  20  6
  16    0.0  0  8.4  0.1E+00  57  8  8
  17    0.0  0  8.2  0.6E-01  147  13  8
  18    0.0  0  8.4  0.7E-01  127  13  12
  19    0.0  0  9.0  0.1E-01  916  64  12
  20    0.0  0  9.0  0.2E+00  46  9  7
  21    0.0  0  9.4  0.1E+00  64  23  8
  22    0.0  0  9.6  0.6E-01  164  23  7
  23    0.0  0  10.2  0.2E+00  49  24  13
  24    0.0  0  10.6  0.1E+00  89  38  14
  25    0.0  0  11.4  0.7E+00  16  11  9
  26    0.0  0  10.8  0.2E-01  518  24  10
  27    0.0  0  11.3  0.4E-01  313  35  22
  28    0.0  0  10.8  0.5E-01  203  36  22
  29    0.0  0  12.0  0.7E-02  1825  22  11
  30    0.0  0  11.9  0.8E-01  150  22  16
  31    0.0  0  12.6  0.2E+00  50  28  23
  32    0.0  0  12.5  0.2E+00  82  46  15
  33    0.1  0  13.0  0.4E-02  3370  17  16
  34    0.1  0  12.6  0.3E+00  49  24  15
  35    0.1  0  12.5  0.9E-01  144  35  27
  36    0.1  0  13.0  0.3E-01  406  39  24
  37    0.1  0  13.6  0.7E-01  196  39  28
  38    0.1  0  12.9  0.3E+00  39  26  15
  39    0.1  0  14.0  0.2E+00  56  32  25
  40    0.1  0  14.4  0.7E-01  193  30  23
  41    0.1  0  13.9  0.6E-01  241  42  34
  42    0.1  0  14.1  0.5E-01  280  31  20
  43    0.1  0  14.8  0.5E-01  300  61  35
  44    0.1  0  14.4  0.5E-01  291  59  36
  45    0.1  0  14.3  0.3E-01  562  43  22
  46    0.1  0  14.7  0.7E-02  2049  51  30
  47    0.1  0  15.1  0.5E-01  305  52  25
  48    0.1  0  15.0  0.2E+00  91  40  20
  49    0.1  0  15.0  0.8E-01  184  91  30
  50    0.1  0  15.8  0.3E-01  475  82  41
  51    0.1  0  15.4  0.6E-01  250  39  22
  52    0.1  0  15.4  0.8E-01  203  43  29
  53    0.2  0  16.0  0.3E-01  488  31  28
  54    0.2  0  15.5  0.5E-01  285  40  26
  55    0.2  0  15.6  0.1E+00  115  61  27
  56    0.2  0  16.1  0.2E+00  92  29  27
  57    0.2  0  16.5  0.2E-01  815  124  27
  58    0.2  0  17.0  0.7E-01  241  49  34
  59    0.2  0  18.0  0.1E+00  130  64  22
  60    0.2  0  17.6  0.9E-01  192  38  33
  61    0.2  0  17.6  0.5E-01  339  59  33
  62    0.2  0  18.2  0.6E-02  3142  75  51
  63    0.2  0  18.2  0.3E-01  695  101  69
  64    0.2  0  17.1  0.6E-01  304  34  21
  65    0.2  0  17.8  0.5E-01  389  51  44
  66    0.3  0  18.3  0.3E+00  62  54  30
  67    0.3  0  17.6  0.4E-01  402  52  37
  68    0.3  0  18.6  0.1E+00  155  45  37
  69    0.3  0  18.1  0.1E+00  142  68  46
  70    0.3  0  19.0  0.1E+00  159  67  24
  71    0.3  0  18.7  0.5E-01  372  87  47
  72    0.3  0  18.7  0.7E-01  267  47  27
  73    0.3  0  19.0  0.6E-01  307  64  41
  74    0.4  0  19.3  0.5E-01  375  55  30
  75    0.4  0  19.8  0.6E-01  331  44  30
  76    0.4  0  19.3  0.2E-01  969  67  43
  77    0.4  0  19.0  0.1E-01  1437  73  44
  78    0.4  0  20.1  0.6E-01  314  76  42
  79    0.4  0  20.2  0.2E+00  90  64  25
  80    0.4  0  18.9  0.1E+00  138  122  47
  81    0.4  0  21.1  0.6E-03  36617  77  48
  82    0.5  0  20.3  0.8E-01  259  60  46
  83    0.4  0  20.1  0.1E-01  1367  136  48
  84    0.5  0  20.6  0.5E-01  457  41  31
  85    0.5  0  20.4  0.2E-01  1273  111  44
  86    0.5  0  21.2  0.6E-01  367  117  58
  87    0.5  0  20.5  0.4E-01  498  167  48
  88    0.5  0  20.3  0.8E-01  266  113  65
  89    0.5  0  22.0  0.3E-01  662  96  51
  90    0.6  0  21.2  0.9E-01  240  90  49
  91    0.6  0  21.2  0.7E-01  325  97  53
  92    0.6  0  21.3  0.2E+00  116  61  58
  93    0.6  0  21.5  0.2E-01  1080  182  44
  94    0.6  0  22.0  0.3E-01  804  231  39
  95    0.7  0  21.2  0.1E+00  188  66  54
  96    0.7  0  21.6  0.1E+00  172  76  47
  97    0.7  0  22.7  0.4E-01  554  78  58
  98    0.7  0  22.1  0.5E-01  469  142  59
  99    0.7  0  22.5  0.4E-01  586  121  73
 100    0.7  0  22.9  0.2E-01  1257  54  40
 101    0.8  0  23.9  0.2E+00  130  91  57
 102    0.8  0  22.9  0.4E-01  594  58  35
 103    0.8  0  22.7  0.2E+00  139  58  38
 104    0.8  0  22.2  0.3E-01  752  121  52
 105    0.8  0  22.6  0.1E+00  193  112  85
 106    0.9  0  23.7  0.2E-01  1140  113  83
 107    0.9  0  22.9  0.8E-01  292  71  54
 108    0.9  0  23.0  0.1E-01  1917  172  51
 109    0.9  0  23.2  0.2E+00  121  108  77
 110    1.0  0  23.2  0.1E+00  232  112  65
 111    1.0  0  24.1  0.1E+00  184  107  64
 112    1.0  0  22.6  0.4E-02  5336  55  47
 113    1.0  0  24.4  0.1E-01  1715  63  52
 114    1.0  0  23.9  0.7E-02  3605  147  60
 115    1.1  0  24.0  0.7E-02  3365  159  59
 116    1.1  0  23.9  0.5E-01  489  218  64
 117    1.2  0  24.4  0.6E-01  380  213  112
 118    1.3  0  24.4  0.2E-02  10670  156  69
 119    1.2  0  24.4  0.8E-02  3092  110  63
 120    1.3  0  23.7  0.9E-01  251  161  80
 121    1.3  0  24.4  0.6E-02  3844  139  43
 122    1.3  0  25.3  0.3E-02  8732  129  59
 123    1.3  0  24.9  0.1E+00  184  72  52
 124    1.4  0  24.6  0.7E-03  36809  147  55
 125    1.4  0  25.3  0.6E-01  440  147  131
 126    1.5  0  25.0  0.1E+00  211  162  79
 127    1.4  0  25.1  0.1E+00  244  132  56
 128    1.4  0  25.3  0.4E-06  62712320  913  260
 129    1.5  0  25.8  0.1E+00  220  157  76
 130    1.5  0  26.1  0.9E-01  279  121  75
 131    1.5  0  25.9  0.9E-01  281  121  57
 132    1.6  0  25.5  0.7E-01  342  95  66
 133    1.6  0  26.0  0.9E-01  279  114  62
 134    1.7  0  25.8  0.8E-01  308  145  57
 135    1.7  0  26.0  0.2E-01  1053  80  59
 136    1.7  0  25.7  0.4E-02  5720  709  111
 137    1.9  0  26.1  0.8E-01  330  86  59
 138    1.8  0  26.3  0.5E-01  480  79  50
 139    1.8  0  26.6  0.2E+00  171  73  61
 140    1.9  0  26.0  0.2E+00  146  79  58
 141    1.9  0  26.5  0.9E-01  307  130  85
 142    1.9  0  27.0  0.2E-01  1737  426  69
 143    2.0  0  26.9  0.5E-01  587  134  64
 144    2.0  0  26.1  0.1E+00  190  108  69
 145    2.1  0  27.5  0.1E-01  2259  143  56
 146    2.1  0  27.5  0.1E-01  2173  197  90
 147    2.1  0  27.7  0.5E-01  609  135  99
 148    2.4  0  27.4  0.2E-01  1124  138  95
 149    2.4  0  27.9  0.4E-01  717  282  92
 150    2.6  0  27.2  0.1E+00  267  115  99
 151    2.6  0  27.6  0.7E-01  395  169  85
 152    2.6  0  26.7  0.8E-01  319  143  62
 153    2.5  0  28.1  0.3E-01  813  134  66
 154    2.5  0  28.0  0.9E-01  310  158  143
 155    2.5  0  27.8  0.9E-01  319  206  115
 156    2.6  0  27.7  0.6E-01  433  90  60
 157    2.6  0  28.1  0.4E-01  709  90  71
 158    2.6  0  28.3  0.1E-01  2144  85  68
 159    2.7  0  28.8  0.2E+00  189  111  83
 160    2.8  0  29.1  0.6E-01  510  200  94
 161    2.8  0  29.2  0.2E-01  1391  248  72
 162    2.8  0  28.3  0.1E+00  222  167  95
 163    2.9  0  28.7  0.1E+00  238  187  54
 164    3.0  0  28.6  0.8E-01  375  120  55
 165    3.0  0  28.7  0.1E+00  302  288  66
 166    3.0  0  29.2  0.4E-01  669  185  150
 167    3.1  0  28.8  0.9E-01  324  186  105
 168    3.1  0  28.5  0.7E-03  38553  153  96
 169    3.2  0  29.1  0.2E+00  180  99  82
 170    3.3  0  29.7  0.5E-01  596  130  67
 171    3.4  0  29.9  0.5E-01  552  216  134
 172    3.4  0  29.1  0.6E-02  5059  118  104
 173    3.5  0  29.3  0.5E-01  628  128  77
 174    3.6  0  29.6  0.3E-01  1021  148  92
 175    3.6  0  29.9  0.8E-01  382  263  96
 176    3.8  0  28.5  0.2E+00  160  123  84
 177    3.7  0  29.7  0.2E-01  1396  216  76
 178    3.8  0  30.0  0.8E-01  384  128  68
 179    3.9  0  30.9  0.1E+00  208  180  96
 180    4.0  0  29.1  0.5E-01  563  110  94
 181    4.0  0  30.9  0.3E-01  1066  272  87
 182    4.0  0  30.0  0.1E-01  2543  199  100
 183    4.1  0  30.0  0.1E+00  300  201  97
 184    4.2  0  29.2  0.1E+00  274  124  104
 185    4.2  0  30.2  0.1E-01  2859  396  154
 186    4.3  0  30.6  0.4E-01  687  365  129
 187    4.4  0  31.1  0.7E-01  468  299  176
 188    4.4  0  30.9  0.6E-01  548  123  91
 189    4.5  0  30.8  0.4E-01  746  184  89
 190    4.6  0  31.1  0.1E+00  247  129  93
 191    4.6  0  31.3  0.1E-01  2230  134  100
 192    4.7  0  34.3  0.2E-01  1746  277  117
 193    4.7  0  31.2  0.5E-01  580  190  80
 194    4.8  0  31.8  0.8E-03  39441  303  162
 195    4.9  0  31.1  0.8E-01  384  293  99
 196    5.0  0  31.2  0.1E-01  2643  150  97
 197    5.1  0  31.2  0.9E-01  355  138  107
 198    5.2  0  31.8  0.9E-02  3564  216  93
 199    5.2  0  31.7  0.7E-01  464  150  117
 200    5.3  0  30.4  0.2E-01  1382  390  109
 201    5.4  0  32.1  0.4E-01  902  213  100
 202    5.5  0  32.3  0.3E-01  1233  190  98
 203    5.5  0  31.7  0.1E+00  302  159  122
 204    5.6  0  31.9  0.2E-01  1346  104  85
 205    5.7  0  32.1  0.2E+00  191  128  99
 206    5.7  0  32.2  0.6E-03  57730  155  104
 207    5.9  0  32.3  0.4E-01  913  318  162
 208    6.0  0  30.6  0.6E-02  5264  245  102
 209    6.0  0  32.5  0.7E-01  434  214  192
 210    6.1  0  32.3  0.5E-01  672  197  132
 211    6.2  0  32.5  0.1E+00  250  150  79
 212    6.2  0  32.6  0.4E-01  883  456  122
 213    6.4  0  33.4  0.1E-01  2995  518  141
 214    6.4  0  33.1  0.8E-02  4069  298  204
 215    6.5  0  32.5  0.7E-01  449  374  112
 216    6.6  0  31.8  0.2E-01  1812  161  73
 217    6.7  0  33.4  0.4E-01  856  239  78
 218    6.8  0  33.2  0.9E-01  376  215  82
 219    6.9  0  33.2  0.7E-01  445  216  163
 220    7.4  0  32.6  0.6E-01  557  188  114
 221    7.3  0  33.6  0.5E-01  714  346  185
 222    7.2  0  33.0  0.2E-01  1665  392  153
 223    7.2  0  33.4  0.4E-02  7875  281  140
 224    7.4  0  33.7  0.6E-01  553  285  159
 225    7.6  0  33.2  0.4E-01  767  157  124
 226    7.6  0  34.0  0.3E-01  1117  137  125
 227    8.1  0  34.0  0.2E-01  2167  421  264
 228    8.2  0  33.4  0.3E-01  1327  470  109
 229    8.4  0  34.3  0.5E-01  691  337  131
 230    8.5  0  33.0  0.1E+00  333  138  96
 231    8.5  0  34.6  0.8E-01  455  233  143
 232    8.6  0  33.2  0.2E-01  2138  557  156
 233    8.6  0  34.0  0.5E-01  681  285  145
 234    8.7  0  34.3  0.1E-01  2375  217  152
 235    8.7  0  35.1  0.5E-01  658  243  151
 236    8.8  0  34.3  0.4E-01  843  419  168
 237    8.7  0  34.8  0.3E-01  1075  182  100
 238    8.8  0  35.3  0.1E+00  308  185  108
 239    9.0  0  34.8  0.2E-02  18905  130  113
 240    9.0  0  32.4  0.7E-01  487  160  116
 241    9.1  0  34.4  0.3E-01  1034  271  92
 242    9.2  0  34.7  0.1E-01  3626  278  119
 243    9.3  0  34.8  0.2E-01  1825  229  146
 244    9.4  0  34.9  0.7E-01  493  195  150
 245    9.5  0  35.5  0.1E-01  3237  134  96
 246    9.8  0  35.1  0.1E-01  2454  216  154
 247    9.8  0  35.4  0.8E-01  455  394  133
 248   10.0  0  33.8  0.7E-01  473  193  99
 249   10.0  0  36.3  0.5E-01  762  353  142
 250   10.1  0  35.7  0.6E-01  592  412  195
 251   10.3  0  35.8  0.2E+00  237  169  130
 252   10.4  0  34.9  0.1E+00  348  237  115
 253   10.5  0  36.9  0.6E-02  6626  439  172
 254   10.6  0  35.5  0.1E-01  3611  281  140
 255   10.7  0  36.1  0.1E+00  378  225  118
 256   10.7  0  59.1  0.1E-20  2147483647  2147483647  2147483647
 257   11.0  0  36.5  0.1E+00  255  208  144
 258   11.1  0  35.8  0.2E-01  1544  277  96
 259   11.3  0  36.0  0.2E-01  1443  288  205
 260   11.3  0  35.8  0.4E-01  818  324  127
 261   11.5  0  35.9  0.1E+00  261  218  115
 262   11.6  0  36.2  0.1E+00  298  165  138
 263   11.8  0  37.5  0.4E-01  888  352  238
 264   11.8  0  34.5  0.9E-01  369  170  128
 265   12.0  0  37.0  0.4E-02  9510  285  155
 266   12.2  0  36.2  0.3E-03  129540  276  118
 267   12.3  0  38.8  0.1E-01  3443  291  173
 268   12.4  0  36.3  0.7E-01  538  197  120
 269   12.5  0  36.8  0.4E-01  861  199  152
 270   13.4  0  37.2  0.5E-01  748  226  181
 271   13.3  0  36.9  0.3E-01  1353  570  134
 272   13.9  0  35.4  0.4E-01  995  160  112
 273   13.8  0  37.1  0.2E-01  1872  258  111
 274   13.8  0  37.6  0.2E-01  1609  338  115
 275   13.4  0  37.7  0.1E-02  26836  330  226
 276   13.5  0  37.1  0.6E-01  634  221  146
 277   13.6  0  38.7  0.5E-01  764  180  124
 278   14.4  0  38.1  0.1E+00  369  322  163
 279   14.1  0  37.1  0.4E-01  929  287  149
 280   14.6  0  35.8  0.8E-01  476  127  121
 281   14.3  0  37.8  0.9E-03  42590  173  153
 282   14.2  0  37.0  0.5E-01  720  236  167
 283   14.4  0  37.7  0.2E-01  1686  266  152
 284   14.6  0  38.2  0.4E-01  955  172  107
 285   15.0  0  37.9  0.2E-02  24267  246  127
 286   14.8  0  37.7  0.2E-01  2501  288  168
 287   15.0  0  38.4  0.5E-01  700  287  189
 288   15.1  0  41.6  0.5E-01  832  206  177
 289   15.3  0  38.4  0.4E-01  917  384  159
 290   15.5  0  38.6  0.5E-01  705  459  179
 291   15.6  0  38.5  0.2E-01  2315  214  135
 292   15.8  0  38.5  0.3E-01  1155  201  141
 293   15.9  0  38.2  0.4E-01  1086  474  92
 294   16.1  0  38.5  0.3E-01  1380  378  255
 295   16.2  0  38.8  0.6E-02  6553  447  178
 296   16.4  0  36.8  0.1E-01  3100  343  196
 297   16.6  0  38.8  0.3E-01  1211  174  136
 298   16.7  0  38.4  0.2E-01  1981  437  219
 299   16.9  0  39.4  0.7E-01  528  248  219
 300   17.1  0  38.4  0.4E-01  899  256  154
 310   18.7  0  39.1  0.6E-01  633  315  155
 320   20.0  0  54.1  0.5E-07  995625728  314481184  136633984
 330   22.4  0  40.5  0.4E-01  1062  481  210
 340   24.5  0  41.2  0.1E+00  322  237  151
 350   26.6  0  41.7  0.8E-01  497  250  208
 360   28.9  0  40.1  0.2E-01  1727  654  182
 370   31.2  0  43.5  0.2E-01  2518  263  210
 380   33.8  0  43.0  0.2E-01  1906  1525  270
 390   36.4  0  43.7  0.8E-01  565  431  221
 400   39.2  0  44.2  0.1E-01  3794  443  193
 410   42.1  0  44.6  0.9E-02  4826  370  269
 420   45.1  0  45.0  0.1E-01  3303  557  254
 430   48.3  0  45.7  0.1E-01  3997  412  268
 440   51.7  0  44.2  0.5E-01  886  359  290
 450   55.1  0  46.9  0.3E-01  1589  305  218
 460   59.9  0  46.9  0.1E+00  318  239  175
 470   63.8  0  47.7  0.1E+00  449  315  189
 480   67.3  0  61.8  0.5E-01  1277  777  445
 490   71.4  0  48.4  0.2E-01  2164  292  249
 500   75.3  0  49.4  0.1E+00  462  299  190
 550  100.2  0  52.7  0.2E-01  2637  498  208
 600  129.1  0  53.4  0.6E-01  839  521  401
 650  162.1  0  56.3  0.2E-01  2920  1342  284
 700  200.7  0  56.7  0.4E-01  1430  470  322
 750  245.9  0  60.2  0.3E-01  2399  591  368
 800  292.5  0  97.5  0.1E-07  2147483647  1187167360  666092416
 850  358.1  0  63.3  0.5E-01  1300  645  265
 900  421.6  0  63.2  0.1E+00  629  541  330
 950  495.3  0  67.1  0.5E-02  12392  1252  469
1000  575.9  0  70.5  0.2E-01  3624  2427  664

From 6 January 1989:

It's not often that I apply what I learned at Berkeley to my daily work, which primarily involves finding very low-level bugs in hardware and software, mostly under development by other people. Since I have to test hardware with a wide performance range, benchmarks have to be adjustable in size so that they don't run too quickly on fast hardware to be timed accurately, nor too slowly on slow hardware to finish before that hardware is obsolete. So I have added adjustable parameters to a number of benchmarks.

To calibrate them I run a little measurement program derived from the infamous Linpack benchmark. (When I first came to Sun I was so ignorant that I thought Linpack was a library of linear algebra subroutines rather than a benchmark program.) This little Linpack starts off factoring a 32x32 matrix. Even a Sun-2 can do that in acceptable time. If that runs too fast, it then automatically tries a larger matrix, up to 512x512. Then it computes what the execution time would have been for a 512x512 matrix on the particular system, scaling the time for whatever size it settled on by (512/n)**3.

Some Sun hardware currently under development has been getting fast enough that the calibration program tries 512x512. On both projects in question, however, I noticed that the program was getting hung up and taking an inordinate amount of time to finish the calibration. This of course indicated a hardware bug, of which I informed the hardware guys and left them to find it.
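For concreteness, the calibration logic described above amounts to something like the following sketch. It is not the actual calibration program - the names are made up, and tfact is only a stand-in that returns a fabricated time so the skeleton will run - but it shows the doubling up to 512x512 and the (512/n)**3 scaling.

c     sketch only: not the actual calibration program.  tfact(n) is a
c     stand-in for timing an n-by-n LU factorization; here it just
c     returns a made-up figure so that the skeleton runs.
      program calib
      integer n
      real t, t512, tmin, tfact
      tmin = 1.0
      n = 32
   10 continue
      t = tfact(n)
      if (t .lt. tmin .and. n .lt. 512) then
         n = 2*n
         goto 10
      endif
c     scale whatever size it settled on to an equivalent 512x512 time
      t512 = t * (512.0/real(n))**3
      write(*,*) 'settled on n =', n, ' scaled 512x512 time =', t512
      end

      real function tfact(n)
c     stand-in timer: pretend an n-by-n factorization costs n**3 * 1e-7
      integer n
      tfact = real(n)**3 * 1.0e-7
      return
      end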
Oddly enough, the bug only affected IEEE single precision; double precision was fine; most of our bugs tend to occur in double precision for various reasons. Furthermore, the bug didn't seem to show up in the 100x100, 300x300, or 1000x1000 single precision Linpack benchmarks that I also run.

In the first project the microcoder dutifully tracked things down with a logic analyzer to where an underflow was occurring. Now I knew that the Linpack benchmark generates uniform random data over an innocuous interval, and we all know that matrices composed of random data from a uniform distribution are remarkably well conditioned. I remember this particularly well because one of my Berkeley qualifying examination questions - put to me a week ahead of time to ponder fruitlessly - was to come up with a plausible explanation of how an eminent mathematician had conducted an empirical investigation of iterative improvement, studying published test matrices and matrices of uniformly-distributed random data, and had come to the conclusion that iterative improvement never improved the answer by more than one or two digits, which seemed to argue against what's routinely taught in elementary numerical analysis.

Anyway, I told the microcoder that there was still a hardware bug, but shortly thereafter something else changed, the calibration benchmark went back to 256x256, and there never was any other evidence of a single precision problem, so I quit worrying about it. Underflows, if they could have occurred, would account for the program apparently hanging up, because the way Sun obtains IEEE conformance on underflow in systems built upon hardware like the Weitek 1164/5 is to trap on underflow and recompute the correct result very slowly in software. Such underflows are supposed to be rare.

Then a totally unrelated but even more critical project ran into a similar problem. This afternoon I went to a high-level crisis meeting attended by at least two vice presidents, at which I gloomily reported that a new, previously unknown hardware bug had appeared that affected single precision. Upon my return from that meeting, one of my colleagues, who'd been helping at the logic analyzer until he got suspicious, informed me that he thought there was no hardware or compiler bug; rather, the program had started to underflow because the pivots had gotten too small and fallen off the end of single precision. What about the well-known wonderful condition of matrices of uniform random data, I asked? He suspected that the high dimensionality (512x512) was just too much for single precision. I wondered why the 1000x1000 benchmark worked in single precision. Since no progress was being made on the hardware bug, I started printing out the pivots in the program.
They started out as normal numbers like 1 or -10, then suddenly dropped to about 1e-7, then later to 1e-14, and then:

k 82 pivot -1.8666e-20
k 83 pivot -2.96595e-14
k 84 pivot 2.46156e-14
k 85 pivot 2.40541e-14
k 86 pivot -4.99053e-14
k 87 pivot 1.7579e-14
k 88 pivot 1.69295e-14
k 89 pivot -1.56396e-14
k 90 pivot 1.37869e-14
k 91 pivot -3.10221e-14
k 92 pivot 2.35206e-14
k 93 pivot 1.32175e-14
k 94 pivot -7.77593e-15
k 95 pivot 1.34815e-14
k 96 pivot -1.02589e-21
k 97 pivot 4.27131e-22
k 98 pivot 1.22101e-21
k 99 pivot -7.12407e-22
k 100 pivot -1.75579e-21
k 101 pivot 3.13343e-21
k 102 pivot -6.99946e-22
k 103 pivot 3.82048e-22
k 104 pivot 8.05538e-22
k 105 pivot -1.18164e-21
k 106 pivot -6.349e-22
k 107 pivot -2.48245e-21
k 108 pivot -8.89452e-22
k 109 pivot -8.23235e-22
k 110 pivot 4.40549e-21
k 111 pivot 1.12387e-21
k 112 pivot -4.78853e-22
k 113 pivot 4.38739e-22
k 114 pivot 7.3868e-28
SIGFPE 8: numerical exception, CHK, or TRAP
stopped at daxpy+0x18c:  movl  a4@(0xe10),a3@-

Those sudden drops were certainly perplexing - almost full word width, as if the matrix were actually exactly singular. Could there be a bug in the matrix generator routine (matgen) causing some wild data to be thrown into the pot? I looked and found a perfectly conventional portable linear congruential random number generator...

-- 
David Hough

dgh@validgh.com   uunet!validgh!dgh   na.hough@na-net.ornl.gov