amull@Morgan.COM (Andrew P. Mullhaupt) (05/25/90)
In article <411@yonder.UUCP>, michael@yonder.UUCP (Michael E. Haws) writes:
> Is it reasonable to expect that the following code should
> work the same on all platforms, and produce "good" results?
>
>     if ((45.0 / 100.0) == .45)
>         printf("good\n");
>     else
>         printf("bad\n");

No. Thou shalt not compare floating point results for equality. They
might depend on compiler options for optimization, etc., even on the
same machine.

> I remember seeing somewhere (possibly a different group) an article which
> made reference to some software which analyzed the accuracy of floating
> point arithmetic on whatever host it was running. Anyone know where
> this code can be found?

You may be referring to paranoia, which I recommend for investigating
your floating point reliability. However, it does not work by doing
computations and then simply comparing results to fixed values. It
usually computes various relationships which should be true among its
results and reports on discrepancies.

You might wonder if there are clever ways around this kind of problem,
like subtracting and comparing to zero, but if you are expecting the
code to agree bit for bit across "all platforms", you're going to have
to do something like resort to emulating the floating point arithmetic
in integer C code. Consider that IBM 360 style floating point is not
bit-normalized, and that Cray double precision is 64 bit, etc.

Later,
Andrew Mullhaupt
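P.S. To see just how fragile the comparison is, here is a minimal
sketch (untested on your machine; the point is that the output may
vary with hardware, compiler, and compiler options):

    #include <stdio.h>

    int main(void)
    {
        double quotient = 45.0 / 100.0;  /* may be folded at compile time */
        double literal = 0.45;           /* rounded when the constant is converted */

        /* print enough digits to expose any difference in the stored bits */
        printf("quotient = %.20g\n", quotient);
        printf("literal  = %.20g\n", literal);
        printf("%s\n", quotient == literal ? "good" : "bad");
        return 0;
    }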
john@newave.UUCP (John A. Weeks III) (05/25/90)
In article <411@yonder.UUCP> michael@yonder.UUCP (Michael E. Haws) writes:
>Is it reasonable to expect that the following code should
>work the same on all platforms, and produce "good" results?
>
>    if ((45.0 / 100.0) == .45)
>        printf("good\n");
>    else
>        printf("bad\n");

E-gads. Don't ever try "exact" comparisons of floating point numbers.
After all, floating point numbers in a computer are a binary
representation of an "approximation" to the real number value that you
imagined when you typed in the base 10 symbols. You can compare
integer representations, within the limits of the machine
representation, but floating point numbers should be range checked.
For example, try the following:

    double ratio;

    ratio = 45.0 / 100.0;
    if ( ratio > 0.44999 && ratio < 0.45001 )
        printf( "good\n" );
    else
        printf( "bad\n" );

Some people like to use a subtract and a "fabs" of some sort (macro
or a procedure):

    if ( fabs( (45.0 / 100.0) - 0.45 ) < 0.001 )
        printf( "good\n" );
    else
        printf( "bad\n" );

In this manner, you can substitute a define or a variable for the
0.001 in case you decide to globally change your tolerance level.

There are a few legit places where you can compare real numbers, but
this is best left for experts who know _when_ and _why_ it is legit.

-john-

--
===============================================================================
John A. Weeks III                 (612) 942-6969             john@newave.mn.org
NeWave Communications              ...uunet!rosevax!bungia!wd0gol!newave!john
===============================================================================
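P.S. A minimal sketch of the "define for the tolerance" suggestion
(the names TOLERANCE and almost_equal are made up for illustration):

    #include <math.h>
    #include <stdio.h>

    #define TOLERANCE 0.001   /* one place to change the tolerance globally */

    /* nonzero if a and b agree to within TOLERANCE */
    static int almost_equal(double a, double b)
    {
        return fabs(a - b) < TOLERANCE;
    }

    int main(void)
    {
        printf("%s\n", almost_equal(45.0 / 100.0, 0.45) ? "good" : "bad");
        return 0;
    }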
gwyn@smoke.BRL.MIL (Doug Gwyn) (05/25/90)
In article <1990May24.132423.3080@eddie.mit.edu> aryeh@eddie.MIT.EDU (Aryeh M. Weiss) writes:
>The moral of the story is, just don't do == between two FP numbers.
>Exactitude is simply unrealistic in floating point.

Although this is correct as far as it goes, you can still get into
trouble. To avoid trouble, you must first truly believe that
floating-point representations are APPROXIMATIONS to real numbers.
This mind set will enable you to be suspicious of any algorithmic
assumptions that rely on mathematical properties of the real-number
system being true of floating-point representations. Some such
properties ARE true for some implementations of floating point, but
not ALL such properties and not for ALL implementations. Textbooks on
numerical methods in computation should be consulted for more details.
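P.S. One concrete real-number property that fails: addition is
associative for reals, but floating-point addition generally is not.
A minimal sketch (the values are just one example that fails on
typical binary machines):

    #include <stdio.h>

    int main(void)
    {
        double a = 1.0e16, b = -1.0e16, c = 1.0;

        /* (a + b) + c gives 1.0; in a + (b + c), c is swallowed by b */
        printf("(a + b) + c = %g\n", (a + b) + c);
        printf("a + (b + c) = %g\n", a + (b + c));
        return 0;
    }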
amull@Morgan.COM (Andrew P. Mullhaupt) (05/25/90)
In article <1990May24.132423.3080@eddie.mit.edu>, aryeh@eddie.mit.edu (Aryeh M. Weiss) writes:
> In article <411@yonder.UUCP> michael@yonder.UUCP (Michael E. Haws) writes:
> >Is it reasonable to expect that the following code should
> >work the same on all platforms, and produce "good" results?
> >    if ((45.0 / 100.0) == .45)
> >        printf("good\n");
> >
> Only numbers that can be expressed exactly in base 2 will produce
> "good" results. 0.1 and 0.01 are prime examples of numbers that
> cannot. (A common textbook example is accumulating 0.01 100 times and
> not getting 1.0.)

Not so fast there! There are still computers around with BCD (binary
coded decimal) floating point, both in hardware and software. There
are even machines which do not have an ordinary radix, such as the
Wang 'logarithmic' floating point representation. What you really
intend to say here is that floating point numbers which are rational
fractions with denominators given by a power of the radix may escape
rounding.

Portable and careful use of floating point is not based on assumptions
about when rounding may occur or in what way numbers will be rounded.
A good example of this is "Moler's expression", found in various
places (e.g. LINPACK). This expression attempts to determine the
machine epsilon on the assumption that the radix of the representation
is not a power of three, and even though there are very few (if any)
machines for which this assumption is not valid, there is an explicit
comment indicating this in the sources. (A FORTRAN version of Moler's
expression might be

    ((4.0/3.0) - 1.0) * 3.0 - 1.0

in single precision.)

Now it is sometimes quite useful to assume that the radix of the
floating representation is known, but this assumption cannot be
regarded as portable. Two such cases: the first is balancing a matrix
before applying the QR algorithm to compute eigenvalues, where the
diagonal is scaled by a vector of powers of the base of the floating
point representation:

    "The diagonal matrix D is chosen to have the form
    diag(beta^i1, beta^i2, ..., beta^in) where beta is the floating
    point base. Note that D^-1 A D can be computed without roundoff."

    From: Golub and van Loan, _Matrix Computations_, 2d edition,
    p. 381, Johns Hopkins, Baltimore, (1989).

Then, with respect to computing the matrix exponential by scaling and
squaring, the same authors (in the same book, on p. 557) point out
that division by a power of two is particularly convenient because the
resultant power amounts to repeated squaring. They _do not_ make any
remark about the absence of roundoff. This is because the algorithm
is particularly suited to squaring, and the division is by a power of
two for this reason, not because it may be the floating radix. If
they had in mind only machines with radix 2, they would likely have
remarked on the roundoff advantage as well.

It becomes quite clear that this authoritative and very recent
reference is written with the point of view that you should _not_
assume the radix of the floating point representation is 2, even when
it might be mildly beneficial to take advantage of this. In any case
where "all platforms" are under consideration, or even "all UNIX
platforms", one should not _expect_ that the dyadic rationals are
exempt from rounding. (Golub and van Loan are in fact silent on the
subject of non-radix floating representations, since on p. 61 they
give a radix model for floating point representations.)

Later,
Andrew Mullhaupt
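P.S. A C rendering of Moler's expression might look like the sketch
below (an assumption-laden illustration: the volatile temporary is
there only to discourage compile-time folding, and a compiler that
evaluates in a wider format can still defeat it):

    #include <stdio.h>

    int main(void)
    {
        volatile double four_thirds = 4.0 / 3.0;  /* force a rounded store */
        double eps = (four_thirds - 1.0) * 3.0 - 1.0;

        if (eps < 0.0)      /* the residual can round either way */
            eps = -eps;
        printf("estimated machine epsilon: %g\n", eps);
        return 0;
    }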
gwyn@smoke.BRL.MIL (Doug Gwyn) (05/25/90)
In article <995@s8.Morgan.COM> amull@Morgan.COM (Andrew P. Mullhaupt) writes:
>It becomes quite clear that this authoritative and very recent
>reference is written with the point of view that you should _not_
>assume the radix of the floating point representation is 2, ...

It is even more obvious, when you consider the wide variation even
among binary-radix floating-point processors with respect to guard
bits, rounding modes, and so on, that it is insane to think that any
simple model of floating-point is going to apply accurately
universally. Rather than attempting to model exact behavior of
floating-point systems, I recommend devising algorithms that are
robust in the face of considerable cruft from the floating-point
unit. This is virtually always possible, if you undertake the task
with sufficient suspicion.
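P.S. One well-known example of such a robust algorithm (my own
illustration, not implied by anything above) is compensated
summation, which carries each addition's rounding error forward
instead of assuming additions are exact. (An aggressive optimizer
can, of course, "simplify" it away -- suspicion applies here too.)

    #include <stdio.h>

    static double compensated_sum(const double *x, int n)
    {
        double sum = 0.0, c = 0.0;  /* c holds the running rounding error */
        int i;

        for (i = 0; i < n; i++) {
            double y = x[i] - c;    /* corrected term */
            double t = sum + y;     /* low-order bits of y may be lost... */
            c = (t - sum) - y;      /* ...and are recovered here */
            sum = t;
        }
        return sum;
    }

    int main(void)
    {
        double x[100];
        int i;

        for (i = 0; i < 100; i++)
            x[i] = 0.01;            /* the textbook accumulation example */
        printf("%.17g\n", compensated_sum(x, 100));
        return 0;
    }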
dik@cwi.nl (Dik T. Winter) (05/26/90)
In article <995@s8.Morgan.COM> amull@Morgan.COM (Andrew P. Mullhaupt) writes:
 > Not so fast there!
Not so fast here!
 > There are still computers around with BCD (binary
 > coded decimal) floating point, both in hardware and software. There
 > are even machines which do not have an ordinary radix, such as the
 > Wang 'logarithmic' floating point representation. What you really
 > intend to say here is that floating point numbers which are
 > rational fractions with denominators given by a power of the radix
 > may escape rounding.
   ***
Very true. The keyword here is _may_. There is that infamous binary
machine where division or multiplication by two need not be exact.
Also there were (still are?) machines without a representation for
0.0; they had only plus or minus very small.
--
dik t. winter, cwi, amsterdam, nederland
dik@cwi.nl
brnstnd@stealth.acf.nyu.edu (05/28/90)
In article <13000@smoke.BRL.MIL> gwyn@smoke.BRL.MIL (Doug Gwyn) writes:
> Rather than attempting to model exact behavior of floating-point systems,
> I recommend devising algorithms that are robust in the face of
> considerable cruft from the floating-point unit.

This doesn't remove from FPU makers the responsibility to make chips
satisfying reasonably simple rules. You really can get a lot better
performance out of numerical algorithms when, e.g., you're working
under Knuth's model of floating-point computation. ANSI
floating-point is an excellent start, and it's worth the effort for
strict compliance.

Anyway, this discussion belongs in sci.math or alt.chips.poor-engrg.
Followups to the former.

---Dan
gwyn@smoke.BRL.MIL (Doug Gwyn) (05/28/90)
In article <1603:May2720:32:5190@stealth.acf.nyu.edu> brnstnd@stealth.acf.nyu.edu (Dan Bernstein) writes:
-In article <13000@smoke.BRL.MIL> gwyn@smoke.BRL.MIL (Doug Gwyn) writes:
-> Rather than attempting to model exact behavior of floating-point systems,
-> I recommend devising algorithms that are robust in the face of
-> considerable cruft from the floating-point unit.
-This doesn't remove from FPU makers the responsibility to make chips
-satisfying reasonably simple rules. You really can get a lot better
-performance out of numerical algorithms when, e.g., you're working under
-Knuth's model of floating-point computation. ANSI floating-point is an
-excellent start, and it's worth the effort for strict compliance.

Unfortunately, in the real world the programmer cannot specify the
properties of the floating-point hardware. There are numerous kinds
of FP architectures and implementations, each with its own set of
inaccuracies. I stand by what I said -- instead of relying on the FP
model, make the algorithms robust.
aryeh@eddie.mit.edu (Aryeh M. Weiss) (05/29/90)
In article <995@s8.Morgan.COM> amull@Morgan.COM (Andrew P. Mullhaupt) writes:
>In article <1990May24.132423.3080@eddie.mit.edu>, aryeh@eddie.mit.edu (Aryeh M. Weiss) writes:
>>
>> Only numbers that can be expressed exactly in base 2 will produce
>> "good" results. 0.1 and 0.01 are prime examples of numbers that
>> cannot. (A common textbook example is accumulating 0.01 100 times and
>> not getting 1.0.)
>
>Not so fast there! There are still computers around with BCD (binary
>coded decimal) floating point, both in hardware and software. There
...

Well, it IS a textbook example. I've seen it.

>are even machines which do not have an ordinary radix, such as the
>Wang 'logarithmic' floating point representation. What you really
>intend to say here is that floating point numbers which are
>rational fractions with denominators given by a power of the radix
>may escape rounding.

I have seen BCD FP representations in software, and I have also seen
BCD fixed point representations in hardware. I was not aware of the
Wang format. I should have qualified my remarks to IEEE-like (base 2)
FP formats. Certainly there have been as many FP formats as
manufacturers. It is always good practice to absolutely avoid
assumptions about the underlying hardware.

The other point is that computations that are algebraically identical
cannot be assumed to be numerically identical. Algebraic expressions
sometimes must be rearranged and performed in careful order to
preserve numerical precision. A typical problem is taking the
difference of two large but numerically close numbers, resulting in a
loss of precision in the small difference. This is what often screws
up *unsophisticated* linear algebra algorithms.

Anyway, this is certainly getting out of bandwidth.
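P.S. A minimal sketch of that cancellation problem, using the
standard quadratic-formula illustration (my example, not from the
discussion above): for x^2 - 1e8 x + 1 = 0 the small root is near
1e-8, and the textbook formula throws most of its digits away.

    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        double b = -1.0e8, c = 1.0;        /* x^2 + b x + c = 0 */
        double s = sqrt(b * b - 4.0 * c);

        /* naive: 1e8 minus (almost 1e8) cancels nearly every digit */
        double naive = (-b - s) / 2.0;

        /* rearranged: same root, no subtraction of close numbers */
        double better = (2.0 * c) / (-b + s);

        printf("naive:  %.17g\n", naive);
        printf("better: %.17g\n", better);  /* close to 1e-8 */
        return 0;
    }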
rcd@ico.isc.com (Dick Dunn) (05/31/90)
michael@yonder.UUCP (Michael E. Haws) writes:
> Is it reasonable to expect that the following code should
> work the same on all platforms, and produce "good" results?
>
>     if ((45.0 / 100.0) == .45)
>         printf("good\n");
>     else
>         printf("bad\n");

Leaving aside for a moment all the nasty things that can go wrong with
floating-point equality, you might note a couple of things here and
ask some simpler questions:

Note that the given expression contains a division of a couple of
integral values in float notation. On almost all architectures, these
should be represented exactly. The question could be interpreted as
whether the machine is able to do an accurate division for these
particular operands.

Another way to look at it is to think about the process of forming the
float constant ".45". This is often done by taking the string of
digits (45) and scaling by an appropriate power of ten, which is
remarkably close in semantics to 45.0/100.0.

Yet a third viewpoint is that the entire expression is constant; a
good compiler ought to evaluate it carefully. The likelihood that it
will come up true, while admittedly not a certainty, ought to be
higher than or equal to the likelihood that it comes up true if
calculated at run time.

For the particular example given, it's more surprising to get the
answer "false" than for just some random example of a floating-point
equality test.

_ _ _ _ _

Also, in passing I'll note that the business of equality comparison
with an error tolerance can be just as risky as exact equality.
Suppose we define this near equality as a function (or macro or
whatever) "almostequal(a,b)". The nasty is that this function is not
transitive--you can have three values such that "almostequal(a,b)" and
"almostequal(b,c)" are true, but "almostequal(a,c)" is false. This
can trap an unwary programmer.
--
Dick Dunn     rcd@ico.isc.com    uucp: {ncar,nbires}!ico!rcd     (303)449-2870
   ...Simpler is better.
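P.S. A minimal sketch of the non-transitivity (tolerance and values
chosen simply to exhibit it):

    #include <math.h>
    #include <stdio.h>

    #define almostequal(a, b) (fabs((a) - (b)) < 0.001)

    int main(void)
    {
        double a = 0.0000, b = 0.0009, c = 0.0018;

        printf("almostequal(a,b) = %d\n", almostequal(a, b));  /* 1 */
        printf("almostequal(b,c) = %d\n", almostequal(b, c));  /* 1 */
        printf("almostequal(a,c) = %d\n", almostequal(a, c));  /* 0 */
        return 0;
    }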
amull@Morgan.COM (Andrew P. Mullhaupt) (06/01/90)
In article <1990May31.001618.24274@ico.isc.com>, rcd@ico.isc.com (Dick Dunn) writes:
> michael@yonder.UUCP (Michael E. Haws) writes:
> > Is it reasonable to expect that the following code should
> > work the same on all platforms, and produce "good" results?
> >
> >     if ((45.0 / 100.0) == .45)
> >         printf("good\n");
> >     else
> >         printf("bad\n");
>
> Leaving aside for a moment all the nasty things that can go wrong with
> floating-point equality, you might note a couple of things here and ask
> some simpler questions:
>
> Note that the given expression contains a division of a couple of integral
> values in float notation. On almost all architectures, these should be
> represented exactly. The question could be interpreted as whether the
> machine is able to do an accurate division for these particular operands.
>
> Another way to look at it is to think about the process of forming the
> float constant ".45". This is often done by taking the string of digits
> (45) and scaling by an appropriate power of ten, which is remarkably close
> in semantics to 45.0/100.0.
>
> Yet a third viewpoint is that the entire expression is constant; a good
> compiler ought to evaluate it carefully. The likelihood that it will come
> up true, while admittedly not a certainty, ought to be higher than or equal
> to the likelihood that it comes up true if calculated at run time.

On the other hand, consider the possibility that the .45 is converted
to double, as are the exactly representable 45.0 and 100.0. Now if
the division is performed with guard digits, or perhaps in IEEE
extended, then we have to worry about whether or not the comparison is
done strictly as double, or in the wider form. This can depend on the
hardware (do you have a coprocessor or not?), on the compiler (does it
use extended or not?), or on the options given to the compiler. The
comparison, if done in the wider format, will come out unequal if the
extended and double representations of 45.0/100.0 are not the same.

You're probably right that a lot of machines will result in equality
for the given example, but the complexity of ensuring this makes it an
assumption I'd rather skip than justify.

One of the most interesting classes of bugs is based on the failure of
this assumption: a program is developed, but when optimization is
enabled, it breaks, because some result which is stored (and therefore
tacitly coerced back to double) is no longer coerced, because the
compiler optimized the store/recall sequence away. If you recompile
the program without optimization (as is often necessary) in order to
run the program under a debugger, the bug will go away! The hours of
enjoyment you might spend chasing this kind of bug down are reason
enough to steer well clear of code which can be very sensitive to the
floating representation. There are enough such possible hours that
studying numerical analysis and learning ways to safely exploit fixed
and floating point computations (and a lot of other useful stuff)
could be thought of as a time saver.

Later,
Andrew Mullhaupt
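P.S. The sort of code that exhibits the store/recall trap looks
something like this sketch (whether it prints "equal" genuinely
depends on the hardware, the compiler, and its options -- which is
the point):

    #include <stdio.h>

    int main(void)
    {
        double x = 45.0, y = 100.0;
        double stored = x / y;    /* (maybe) coerced out to 64-bit double */

        /* the right-hand side may be recomputed and compared in a wider
           register format, and an optimizer may elide the store above */
        if (stored == x / y)
            printf("equal\n");
        else
            printf("unequal\n");
        return 0;
    }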
moss@cs.umass.edu (Eliot Moss) (06/02/90)
Just a minor followup to the (45.0 / 100.0) == .45 question: While the
integers 45.0 and 100.0 are undoubtedly represented exactly, .45 is a
repeating form in binary (just as, say, 1/7 is in decimal notation).
One *would* expect (50.0 / 100.0) == .50, since .50 *does* have an
exact representation in binary. In general, when writing floating
point algorithms one must demand equality only within a tolerance;
choosing that tolerance is a complex issue, and one can have relative
and absolute tolerances. Hope this adds a little to the original
poster's understanding ....

								Eliot
--
J. Eliot B. Moss, Assistant Professor
Department of Computer and Information Science
Lederle Graduate Research Center
University of Massachusetts
Amherst, MA  01003
(413) 545-4206; Moss@cs.umass.edu
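P.S. A sketch of the relative/absolute distinction (function names
made up for illustration): an absolute tolerance treats all
magnitudes alike, while a relative tolerance scales with the larger
operand.

    #include <math.h>
    #include <stdio.h>

    /* absolute: |a - b| < tol; useless when a and b are huge or tiny */
    static int abs_close(double a, double b, double tol)
    {
        return fabs(a - b) < tol;
    }

    /* relative: |a - b| < tol * max(|a|,|b|); scales with magnitude,
       but can never succeed against an exact 0.0 */
    static int rel_close(double a, double b, double tol)
    {
        double m = fabs(a) > fabs(b) ? fabs(a) : fabs(b);
        return fabs(a - b) < tol * m;
    }

    int main(void)
    {
        /* 1e9 and 1e9+1 agree to nine significant digits, yet the
           absolute test with a small tolerance rejects them */
        printf("%d %d\n", abs_close(1.0e9, 1.0e9 + 1.0, 1.0e-6),
                          rel_close(1.0e9, 1.0e9 + 1.0, 1.0e-6));
        return 0;
    }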
dik@cwi.nl (Dik T. Winter) (06/06/90)
Just a minor followup:

In article <MOSS.90Jun1104004@ibis.cs.umass.edu> moss@cs.umass.edu writes:
> Just a minor followup to the (45.0 / 100.0) == .45 question:
...
> One
> *would* expect (50.0 / 100.0) == .50 since .50 *does* have an exact
> representation in binary.

I would not bet my money on it. If the evaluation of 50.0 / 100.0 is
done using the hardware operation(s) to do division, the result may
very well not be exact. Try on a Cray: 3.0/3.0, 5.0/5.0, 7.0/7.0,
etc. Some give exactly 1, others do not. (The Cray does not have
division, but inversion followed by multiplication, and all those
inverses are not exact.) Still worse, on a CDC 205 in double
precision, division by 2.0 need not be exact.
--
dik t. winter, cwi, amsterdam, nederland
dik@cwi.nl
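P.S. Even on hardware with a true divide, multiplying by a rounded
reciprocal occasionally misses. A sketch (my illustration, not from
the followup above) that emulates reciprocal-then-multiply and counts
the misses -- how many turn up depends entirely on the machine:

    #include <stdio.h>

    int main(void)
    {
        int x, misses = 0;

        for (x = 1; x <= 10000; x++) {
            volatile double r = 1.0 / (double)x;  /* rounded reciprocal */
            if ((double)x * r != 1.0)             /* "x/x" via multiply */
                misses++;
        }
        printf("%d of 10000 reciprocal-divides missed 1.0\n", misses);
        return 0;
    }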
hwt@.bnr.ca (Henry Troup) (06/06/90)
It occurs to me that even if the compiler does pre-evaluation of
floating point expressions, you could be compiling on a different
machine... More variables...
--
Henry Troup - BNR owns but does not share my opinions
..uunet!bnrgate!hwt%bwdlh490 or HWT@BNR.CA