dgh@validgh.com (David G. Hough on validgh) (06/20/91)
Nelson Beebe (beebe@math.utah.edu) recollected the following message to
a colleague:
-------------------------------------------
The following little program can be used to illustrate the effect that
truncating arithmetic has on your larger program:

      real dt,t0,t1,t2,tend
      integer n
      n = 0
      dt = 0.018
      t0 = 4000.0
      tend = 5000.0
      t1 = t0
      t2 = t0
 10   n = n + 1
      t1 = t1 + dt
      t2 = t0 + float(n)*dt
      if (t2 .lt. tend) go to 10
      write (6,*) t1,t2,(t1 - t2)/t2
      end

On the IBM 3090, this single precision version prints:

      4879.89844  5000.00781  -0.240218341E-01

That is, the relative error is 2.4%.  On the Sun 4, it produces

      5003.70  5000.01  7.37889E-04

The effect of truncating arithmetic on the running sum is large.
The double precision version is:

      double precision dt,t0,t1,t2,tend
      integer n
      n = 0
      dt = 0.018D+00
      t0 = 4000.0D+00
      tend = 5000.0D+00
      t1 = t0
      t2 = t0
 10   n = n + 1
      t1 = t1 + dt
      t2 = t0 + dfloat(n)*dt
      if (t2 .lt. tend) go to 10
      write (6,*) t1,t2,(t1 - t2)/t2
      end

The IBM 3090 result is

      5000.00799995563648  5000.00799999999981  -0.887265231637227285E-11

The Sun 4 result is

      5000.0080000016  5000.0080000000  3.2341579848518D-13
-------------------------------------------
Note that satisfactory results are obtained if you use enough precision
or if you round rather than chop.  Also note that this is not the
program that failed, but rather a drastic simplification of the user's
actual application to reveal the essential problem.  It's a simple
example where the superior statistics of rounding rather than chopping
imply a broader domain of applicability for a particular program.

Correct rounding and chopping, and several other good paradigms, can be
characterized by the property:

      The rounded computed result is chosen from the two
      machine-representable numbers nearest the unrounded
      infinitely-precise exact result, according to a rule that depends
      only on the infinitely-precise exact result, and not on the
      operands or operation (or phase of moon).
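[An editorial aside, not part of the original message: the
single-precision experiment above can be reproduced today by emulating
24-bit chopped and rounded arithmetic inside double precision.  This
sketch assumes each single-precision add can be modeled as an exact
double add followed by truncating or rounding the significand to 24
bits; that is exact here, since the intermediate sums need fewer than
53 bits.]

```python
import math

def chop24(x, p=24):
    # truncate the significand of x to p bits (round toward zero)
    if x == 0.0:
        return 0.0
    m, e = math.frexp(x)                  # x = m * 2**e, 0.5 <= |m| < 1
    return math.ldexp(math.trunc(m * 2.0**p) / 2.0**p, e)

def round24(x, p=24):
    # round the significand of x to p bits (nearest, ties to even)
    if x == 0.0:
        return 0.0
    m, e = math.frexp(x)
    return math.ldexp(round(m * 2.0**p) / 2.0**p, e)

def run(quantize):
    dt = quantize(0.018)                  # working-precision increment
    t0, tend = 4000.0, 5000.0
    t1, n = t0, 0
    while True:
        n += 1
        t1 = quantize(t1 + dt)            # emulated single-precision add
        t2 = t0 + n * 0.018               # double-precision reference sum
        if t2 >= tend:
            return (t1 - t2) / t2

err_chop, err_round = run(chop24), run(round24)
print(err_chop, err_round)    # roughly -4.4e-03 and 7.4e-04
```

The rounded figure lands near the Sun 4's 7.37889E-04 above; the
chopped figure shows what IEEE-format-with-chopping would have done.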
Most "fast" sort-of-rounding or sort-of-chopping schemes invented by
hardware designers eventually frustrate error analysts because they
can't be so characterized.

As for the first IBM RS/6000 implementation, I have heard that the
original floating-point unit was designed to implement IBM 370
arithmetic, and was changed to IEEE 754 format relatively late in the
game.  If true, it would not be surprising that some aspects of 754
were problematic to add in.  The interesting question would then be
which aspects of IEEE arithmetic will be really problematic for a
high-performance RS/6000 implementation designed from scratch to
support 754.

--
David Hough

dgh@validgh.com        uunet!validgh!dgh        na.hough@na-net.ornl.gov
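[Another editorial aside, not from Hough's post: his characterization
can be checked mechanically for IEEE double-precision addition.  The
hardware sum should always equal the infinitely-precise rational sum
converted back to a double, i.e. it is a function of the exact result
alone.  The sketch below relies on the fact that `Fraction(float)` is
exact and that CPython's integer true division (and hence
`float(Fraction)`) is correctly rounded.]

```python
import random
from fractions import Fraction

# Round-to-nearest means the machine result of a + b is fully
# determined by the exact rational sum: converting that exact sum back
# to a double must reproduce what the hardware computed.
random.seed(12345)
for _ in range(10_000):
    a = random.uniform(-1e9, 1e9)
    b = random.uniform(-1e9, 1e9)
    exact = Fraction(a) + Fraction(b)    # infinitely-precise sum
    assert a + b == float(exact)         # correctly rounded, per IEEE 754
print("all sums correctly rounded")
```

A scheme that picked the result based on the operands rather than the
exact sum would fail this check for some pair of operands.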
jbs@watson.ibm.com (06/22/91)
David Hough posted the following:

>Nelson Beebe (beebe@math.utah.edu) recollected the following message
>to a colleague:
>-------------------------------------------
> The following little program can be used to illustrate the effect
> that truncating arithmetic has on your larger program:
>       real dt,t0,t1,t2,tend
>       integer n
>       n = 0
>       dt = 0.018
>       t0 = 4000.0
>       tend = 5000.0
>       t1 = t0
>       t2 = t0
>  10   n = n + 1
>       t1 = t1 + dt
>       t2 = t0 + float(n)*dt
>       if (t2 .lt. tend) go to 10
>       write (6,*) t1,t2,(t1 - t2)/t2
>       end
> On the IBM 3090, this single precision version prints:
>       4879.89844  5000.00781  -0.240218341E-01
> That is, the relative error is 2.4%.  On the Sun 4, it produces
>       5003.70  5000.01  7.37889E-04
> The effect of truncating arithmetic on the running sum is large.
> The double precision version is:
>       double precision dt,t0,t1,t2,tend
>       integer n
>       n = 0
>       dt = 0.018D+00
>       t0 = 4000.0D+00
>       tend = 5000.0D+00
>       t1 = t0
>       t2 = t0
>  10   n = n + 1
>       t1 = t1 + dt
>       t2 = t0 + dfloat(n)*dt
>       if (t2 .lt. tend) go to 10
>       write (6,*) t1,t2,(t1 - t2)/t2
>       end
> The IBM 3090 result is
>       5000.00799995563648  5000.00799999999981  -0.887265231637227285E-11
> The Sun 4 result is
>       5000.0080000016  5000.0080000000  3.2341579848518D-13
>-------------------------------------------
>Note that satisfactory results are obtained if you use enough precision
>or if you round rather than chop.  Also note that this is not the program
>that failed, but rather a drastic simplification of the user's actual
>application to reveal the essential problem.  It's a simple example where
>the superior statistics of rounding rather than chopping imply a broader
>domain of applicability for a particular program.

I believe this example is misleading for two reasons.

1.)  The main source of error in the hex results is the following.
Since 16**3 = 4096, and t0 and tend have been chosen so that the
running sum t1 goes from 4000 to 5000, most of the time t1 will have
leading hex digit 1.
As is well known, this causes a loss of accuracy in the hex
computation.  Running the sum from 3000 to 4000 instead improves the
relative error in the hex results by factors of 10 and 23.  While the
possibility of this sort of accuracy loss is undeniably a defect in the
hex representation, it says nothing about the relative merits of BINARY
chopped vs. binary rounded.

2.)  As pointed out above, we should run the program using IEEE chopped
for a fair comparison.  Now the relative error is -.44*10**-2 (vs.
.73*10**-3 for rounded) for single precision and -.92*10**-11 (vs.
.32*10**-12 for rounded) for double precision.  Hence the magnitude of
the chopped error is about 6 times the magnitude of the rounded error
in the single precision case and about 29 times in the double precision
case.

These ratios seem substantial, but they are not typical, for the
following reason.  As soon as the running sum surpasses 2**12 = 4096,
the binary representation of .018 is such that chopping makes an error
of .86 ulps per addition in the single precision case and .968 ulps per
addition in the double precision case.  Rounding reduces this error to
.14 ulps in the single precision case and to .032 ulps in the double
precision case.  The ratios .86/.14 and .968/.032 are consistent with
the observed difference in accuracy.

However, it is easy to see that the average error made by chopping is
.5 ulps, just twice the average error (.25 ulps) made by rounding.
Similarly, the worst case error made by chopping is 1 ulp, just twice
the worst case error (.5 ulps) made by rounding.  Hence on the average
rounding gains one bit of accuracy.  This is a consequence of the fact
that the uncertainty in the rounded answer is two-sided while the
uncertainty in the chopped answer is one-sided.  The width of the
interval of uncertainty is no less for rounding.
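[An editorial aside, not from the original post: the per-addition
figures quoted above can be checked directly.  Once the running sum
passes 4096, an ulp is 2**-11 for a 24-bit significand and 2**-40 for a
53-bit one, and the per-add error is just the fractional part of the
increment measured in ulps (chopping discards that fraction; rounding
errs by the fraction or its complement, whichever is smaller).]

```python
# Past 4096, the ulp is 2**(12-24+1) = 2**-11 for a 24-bit significand
# and 2**(12-53+1) = 2**-40 for a 53-bit one.  The double constant
# 0.018 stands in for the machine increment; the difference from the
# true single/double values is negligible at this scale.
dt = 0.018

frac_single = (dt / 2.0**-11) % 1.0    # fractional ulps, single precision
frac_double = (dt / 2.0**-40) % 1.0    # fractional ulps, double precision

print(frac_single, 1.0 - frac_single)  # ~0.864 chopped vs ~0.136 rounded
print(frac_double, 1.0 - frac_double)  # ~0.968 chopped vs ~0.032 rounded
```

These reproduce the .86/.14 and .968/.032 figures in the text.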
If the rounding errors were independent (which is not the case here),
then the answer would be more likely to lie near the center of the
uncertainty interval than near the ends.  This would decrease the
expected error in the rounded answer vs. the chopped answer by a factor
of about sqrt(n), where n is the number of rounds.  The worst case
error ratio, however, would remain 2.

David Hough also said:

>Correct rounding and chopping, and several other good paradigms, can be
>characterized by the property
>      The rounded computed result is chosen from the two
>      machine-representable numbers nearest the unrounded
>      infinitely-precise exact result, according to a rule that depends
>      only on the infinitely-precise exact result, and not on the
>      operands or operation (or phase of moon).
>Most "fast" sort-of-rounding or sort-of-chopping schemes invented by
>hardware designers eventually frustrate error analysts because they
>can't be so characterized.

The chopping scheme I mentioned, if properly interpreted, has this
property.  The machine numbers for this scheme lie exactly halfway in
between the IEEE machine numbers (i.e. they have an implicit 1 bit at
the end instead of an implicit 0 bit).  Chopping then corresponds to
rounding to nearest (with halfway cases always going up).  As far as I
can see, the main problem with this scheme is that small integers are
no longer machine numbers, which some would no doubt find unsettling.

David Hough also said:

>As for the first IBM RS/6000 implementation, I have heard that the
>original floating-point unit was designed to implement IBM 370
>arithmetic, and was changed to IEEE 754 format relatively late in the
>game.  If true then it would not be surprising that some aspects of
>754 were problematic to add in.  The interesting question would then
>be which aspects of IEEE arithmetic will be really problematic for a
>high-performance RS/6000 implementation designed from scratch to
>support 754.

There may be some truth to this.
On the other hand, the design may have benefited from the efforts of
the designers to lose as little performance as possible relative to
hex.  Certainly the performance of this chip seems to compare favorably
to competing designs that supported IEEE from the start.  IEEE
designers may take the standard for granted and not spend much time
thinking about how fast the alternatives would be, and thus may not
fully understand how much performance the IEEE standard is giving up.

     James B. Shearer
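[A closing editorial aside, not part of either posting: Shearer's
sqrt(n) remark above, that independent two-sided rounding errors tend
to cancel while one-sided chopping errors accumulate linearly, can be
sketched with a small Monte Carlo experiment.  It idealizes the
per-operation errors as independent and uniform in ulps, which, as he
notes, is not the case in the running-sum example.]

```python
import random

random.seed(2)
n, trials = 10_000, 200

mean_abs_round = mean_abs_chop = 0.0
for _ in range(trials):
    # rounding: two-sided errors, uniform in [-0.5, 0.5] ulp
    mean_abs_round += abs(sum(random.uniform(-0.5, 0.5) for _ in range(n)))
    # chopping: one-sided errors, uniform in [-1, 0] ulp
    mean_abs_chop += abs(sum(random.uniform(-1.0, 0.0) for _ in range(n)))
mean_abs_round /= trials
mean_abs_chop /= trials

# chopping drifts like n/2 ulps; independent rounding errors largely
# cancel, leaving a typical magnitude on the order of sqrt(n/12) ulps
print(mean_abs_chop, mean_abs_round)
```

With n = 10,000 the chopped drift is near 5,000 ulps while the rounded
error is a few tens of ulps, illustrating the sqrt(n) advantage; the
worst-case ratio, as Shearer says, is still only 2.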