garzione@convex.com (Michael Garzione) (06/26/90)
rond@pro-grouch.cts.com (Ron Dippold) writes:
>In-Reply-To: message from pariyana@hawk.ulowell.edu
>> I have a question regarding floating point variables.  I have a
>> program that requires exact precision in all computations.
>> Floating point variables have given me trouble.  For example, when
>> I enter 32.14 into a floating point variable, then print it out, it
>> prints 32.139998.  This problem is magnified in all computations.
>
>Check out the types that your compiler supports.  Many support a "double
>float" or something similar with much more precision.  Some also support
>IEEE floating-point numbers, which use 80 bits.
>UUCP: crash!pro-grouch!rond
>ARPA: crash!pro-grouch!rond@nosc.mil
>INET: rond@pro-grouch.cts.com

You will never get "exact" precision with any floating point computation
if any of the numbers used cannot be represented exactly in base 2
(assuming the exponent in the floating point number is represented as a
power of 2 and the mantissa is also in base 2 and of finite length).

The error you see above has three sources:

(1) 32.14 (base 10) is not representable as a finite binary fraction,
    so it has no exact representation as a floating point number.

(2) If you say something like "float f = 32.14" you get another
    potential error (in this case it's just a consequence of #1) as the
    input routines convert a base 10 _printed_ representation of a
    number to its binary equivalent (base 2 mantissa and exponent).

(3) Now that you've got an approximation of 32.14, printing it out
    introduces still another potential error in converting the binary
    representation back to base 10 for printing.

(From Knuth, Vol. II, p. 200: "Note: Since floating point arithmetic is
inherently approximate...")

If you _really_ need *exact* precision, you may want to consider using a
representation where non-integral numbers are represented as the ratio
of two integers, and use multiple precision arithmetic on them.  You'll
be guaranteed exact precision in all calculations, but it's probably not
worth the hassle.

Hope this helps.

Mike
Convex Computer Corporation
{uunet,sun}!convex!garzione
garzione@convex.com
dik@cwi.nl (Dik T. Winter) (06/27/90)
In article <44436@ism780c.isc.com> marv@ism780.UUCP (Marvin Rubenstein)
writes: (Replying to articles dealing with exact arithmetic using
floating point)
 > The problem has nothing to do with C or the precision of the machine's
 > floating point unit.  It has to do with the precision of the ASCII to
 > float and float to ASCII conversion routines.
The first sentence is correct; the second is bogus.  As has been noticed
before, it has to do with the impossibility of representing all (decimal)
numbers on machines that have a non-decimal representation for fp
numbers.
 > This problem was addressed at the SIGPLAN '90 conference on Program
 > Language Design and Implementation.
I did not read the papers from this conference, but if they address
anything, it is not this problem.  What they might have addressed is the
problem of how to write and read numbers such that a number once written
and read back will have exactly the same internal representation (note
the order: first write, next read back in).  I doubt this however,
because this problem was already solved some 20 years ago (Algol 68
required it).  There might have been improved algorithms.
--
dik t. winter, cwi, amsterdam, nederland
dik@cwi.nl
dylan@ibmpcug.co.uk (Matthew Farwell) (06/27/90)
In article <44436@ism780c.isc.com> marv@ism780.UUCP (Marvin Rubenstein)
writes:
>The problem has nothing to do with C or the precision of the machine's
>floating point unit.  It has to do with the precision of the ASCII to
>float and float to ASCII conversion routines.  This problem was
>addressed at the SIGPLAN '90 conference on Program Language Design and
>Implementation.  Two papers appear in the proceedings:

Consider the code segment:-

main()
{
	float f;

	f = 0.0;
	while (1) {
		if (f == 10.0) break;
		printf("%f\n", f);
		f += 0.1;
	}
	printf("Stopped\n");
}

If it's all to do with conversion routines, why doesn't this stop when f
reaches 10?

Dylan.
--
Matthew J Farwell                 | Email: dylan@ibmpcug.co.uk
The IBM PC User Group, PO Box 360,|        dylan%ibmpcug.CO.UK@ukc
Harrow HA1 4LQ England            |        ...!uunet!ukc!ibmpcug.co.uk!dylan
Phone: +44 81-863-1191            | Sun? Don't they make coffee machines?
kaleb@mars.jpl.nasa.gov (Kaleb Keithley) (06/28/90)
In article <b3f.2688bfce@ibmpcug.co.uk> dylan@ibmpcug.CO.UK (Matthew
Farwell) writes:
-In article <44436@ism780c.isc.com> marv@ism780.UUCP (Marvin Rubenstein) writes:
-
-Consider the code segment:-
-
-main()
-{
-	float f;
-
-	f = 0.0;
-	while (1) {
-		if (f == 10.0) break;
-		printf("%f\n", f);
-		f += 0.1;
-	}
-	printf("Stopped\n");
-}
-
-If it's all to do with conversion routines, why doesn't this stop when f
-reaches 10?
-
Because (10.0 - 0.1) + 0.1 will never be exactly equal to 10.0.

kaleb@thyme.jpl.nasa.gov            Jet Propeller Labs
Kaleb Keithley

"So that's what an invisible barrier looks like"
writer@me.utoronto.ca (Tim Writer) (06/28/90)
In article <b3f.2688bfce@ibmpcug.co.uk> dylan@ibmpcug.co.uk (Matthew
Farwell) writes:
>main()
>{
>	float f;
>	f = 0.0;
>	while (1) {
>		if (f == 10.0) break;
>		printf("%f\n", f);
>		f += 0.1;
>	}
>	printf("Stopped\n");
>}
>If it's all to do with conversion routines ...

As has been mentioned in several earlier postings, it is *not* all to do
with conversion routines.  It *is* to do with the fact that no floating
point system can represent every real number exactly.  Your
documentation will tell you that a float can represent real numbers
within a specific interval.  However, since there are infinitely many
reals in any given interval, your floating point system can represent
only a relative few of them exactly.  Thus, when you write `0.1' and
`10.0' in your program, your computer may not store exactly `0.1' or
`10.0'.  Then, when you repeatedly add `0.1', you are not actually
adding `0.1' but some value close to `0.1'.  The result is that you
never get `10.0', because you have accumulated some error in `f'.  Try
this program.

main()
{
	float f, stop;

	f = 0.0;
	stop = 10.0;
	while (1) {
		if (f == stop) break;
		f += 0.1;
		(void) printf("f=%.20f\tstop=%.20f\n", f, stop);
	}
}

Now try this program.

main()
{
	float eps, tst;

	eps = 1.0;
	do {
		eps = 0.5*eps;
		tst = eps+1.0;
	} while (tst > 1.0);
	(void) printf("%.10e\n", eps);
}

This gives an approximation of machine epsilon which differs from its
real value by at most a factor of 2.  Machine epsilon is defined to be
the smallest number which, when added to 1, produces a result greater
than 1.  It is a measure of the relative error in the floating point
representation of real numbers.  On my system, this program gives a
value of roughly 5.9e-8.  In other words, I can represent floats with
about 8, yes *only* 8, digits of accuracy.  Or, to rephrase, any digits
following the eighth digit of the mantissa are garbage.

Tim
ark@alice.UUCP (Andrew Koenig) (06/28/90)
In article <4186@jato.Jpl.Nasa.Gov>, kaleb@mars.jpl.nasa.gov (Kaleb
Keithley) writes:
> Because (10.0 - 0.1) + 0.1 will never be exactly equal to 10.0.

Beg pardon?

I agree that it won't always be exactly 10.0, but `never' is a bit
strong.
--
				--Andrew Koenig
				  ark@europa.att.com
dylan@ibmpcug.co.uk (Matthew Farwell) (06/28/90)
In article <4186@jato.Jpl.Nasa.Gov> kaleb@mars.UUCP (Kaleb Keithley)
writes:
>In article <b3f.2688bfce@ibmpcug.co.uk> dylan@ibmpcug.CO.UK (Matthew Farwell) writes:
>-[code segment deleted]
>-
>-If it's all to do with conversion routines, why doesn't this stop when f
>-reaches 10?
>Because (10.0 - 0.1) + 0.1 will never be exactly equal to 10.0.

This is exactly the point I was making.  The person I followed up to was
saying it was to do with ascii->float conversion routines.  I was making
the point that it wasn't anything to do with them.

Dylan.
--
Matthew J Farwell                 | Email: dylan@ibmpcug.co.uk
The IBM PC User Group, PO Box 360,|        dylan%ibmpcug.CO.UK@ukc
Harrow HA1 4LQ England            |        ...!uunet!ukc!ibmpcug.co.uk!dylan
Phone: +44 81-863-1191            | Sun? Don't they make coffee machines?
karl@haddock.ima.isc.com (Karl Heuer) (06/29/90)
In article <4169@jato.Jpl.Nasa.Gov> kaleb@mars.UUCP (Kaleb Keithley)
writes:
>I once wrote a rational number package...  All numbers are held in
>numerator/denominator form in a struct.

Certain operations, addition in particular, will quickly cause the
denominator to exceed the range of an integral type.  To avoid this, you
have to combine this with an expandable-int package.  And you still have
a problem if you apply a math function: sqrt(2) can only be
approximated, not represented exactly, by this mechanism.

Several people have posted to say that the exact-representation problem
is "unsolvable" because of this.  This is not necessarily true.  You
could use continued fraction representation, for instance.  sqrt(2)
still has no finite representation, but by using lazy evaluation you
don't *need* to have the whole thing in hand at once: it suffices to
have a generator object that will emit the next term on request.
There's a good discussion of this in HAKMEM, item #101.

Karl W. Z. Heuer (karl@kelp.ima.isc.com or ima!kelp!karl), The Walking Lint
tony@oha.UUCP (Tony Olekshy) (06/29/90)
In <b3f.2688bfce@ibmpcug.co.uk>, dylan@ibmpcug.co.uk (Matthew Farwell)
writes:
>
> [question about] while (f != 10.0) f += 0.1;
>
> If it's all to do with conversion routines, why doesn't this stop when
> f reaches 10?

[which was a rhetorical question, to the poster he had quoted]  I add...

It does not stop at (f == 10.0) simply because no-one said f = 10.0;
The converse (why did it stop, if it had) is more complex.

Next question.
--
Yours etc., Tony Olekshy.       Internet: tony%oha@CS.UAlberta.CA
BITNET: tony%oha@UALTAMTS.BITNET    uucp: alberta!oha!tony
jlg@lambda.UUCP (Jim Giles) (06/30/90)
From article <b3f.2688bfce@ibmpcug.co.uk>, by dylan@ibmpcug.co.uk
(Matthew Farwell):
> [...compressed example ...]
> main() {
>	float f; f = 0.0;
>	while (1) {
>		if (f == 10.0) break;
>		printf("%f\n", f);
>		f += 0.1;}
>	printf("Stopped\n");}
>
> If it's all to do with conversion routines, why doesn't this stop when
> f reaches 10?

It doesn't (or shouldn't) have anything to do with the conversion
routines.  The job of the conversion routines is to convert the decimal
into the internal representation and vice versa.  They _should_ do this
job as accurately as possible: if the number is exactly representable in
both bases, you have a right to expect exact conversion.

The above program fails (on some machines) to terminate because
0.1 (decimal) is not exactly representable in base 2 (or many other
bases, for that matter).  The value of the increment is therefore
approximated.  Depending upon the precision and rounding mode of the
machine in question, the above program may or may not terminate.  This
fault exists even if the conversion was carried out to the nearest
possible internal representation.

The value of 0.1 (decimal) is 0.000110011001100110011... (binary).  This
is a repeating fraction, in which the group 0011 repeats forever.  No
matter what precision your machine carries, it can't represent the whole
thing (not as a floating-point number, anyway).  A hundred repeated adds
of the approximate value _can't_ really equal 10.0, though it might
coincidentally match if the precision and rounding modes exactly cancel
each other out.

Solutions to this problem are to introduce new data types (like
fixed-point or rational) or to use a base 10 machine.  Both of these
solutions carry high price-tags (in terms of speed or hardware or both).

J. Giles
barmar@think.com (Barry Margolin) (07/06/90)
In article <4186@jato.Jpl.Nasa.Gov> kaleb@mars.UUCP (Kaleb Keithley)
writes:
>In article <b3f.2688bfce@ibmpcug.co.uk> dylan@ibmpcug.CO.UK (Matthew Farwell) writes:
[A for-loop iterating by 0.1]
>-If it's all to do with conversion routines, why doesn't this stop when f
>-reaches 10?
>Because (10.0 - 0.1) + 0.1 will never be exactly equal to 10.0.

Well, it does on my IEEE-compliant Symbolics Lisp Machine.  But other
floating point formats may not have this property.  And there are other
floating point computations that don't obey normal arithmetic axioms;
for instance,

	10.0 * .1 == 1.0
	.1+.1+.1+.1+.1+.1+.1+.1+.1+.1 == 1.0000001

This latter inequality is the reason the loop fails to terminate.  The
problem is that 1/10 is a repeating fraction in binary, so the internal
representation of .1 can't be exactly right.  The multiplication happens
to round in such a way that the error is cancelled out, but the repeated
additions accumulate the errors, because there's a rounding step after
each addition.  In fact, here's a smaller case that demonstrates this:

	.1 + .1 == .2
	.5 + .1 == .6
	.5 + (.1 + .1) == .7
	(.5 + .1) + .1 == .70000005
--
Barry Margolin, Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar
steve@groucho.ucar.edu (Steve Emmerson) (07/08/90)
In <14429@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes:
>...
>The job of the conversion routines is to convert the decimal into the
>internal representations and vice-versa.  They _should_ do this job as
>accurately as possible - if the number is exactly representable in both
>bases you have a right to expect exact conversion.

A right?

Granted by whom?

Steve Emmerson        steve@unidata.ucar.edu        ...!ncar!unidata!steve
ark@alice.UUCP (Andrew Koenig) (07/08/90)
In article <7906@ncar.ucar.edu>, steve@groucho.ucar.edu (Steve Emmerson)
writes:
> In <14429@lambda.UUCP> jlg@lambda.UUCP (Jim Giles) writes:
> >...
> >The job of the conversion routines is to convert the decimal into the
> >internal representations and vice-versa.  They _should_ do this job as
> >accurately as possible - if the number is exactly representable in both
> >bases you have a right to expect exact conversion.
> A right?
> Granted by whom?

Granted by the IEEE floating-point standard, for one thing.  If I am
using a system whose vendor claims that it supports IEEE floating point,
then I can expect that

	input conversion on a floating point number with an exact
	representation will be exact;

	input conversion at compile time will give precisely the same
	result as input conversion at run time;

	if I write out any number at all with enough significant digits,
	I can read back the result and get exactly the same value;

and a whole bunch of other useful properties.

Interestingly, this does not require that conversion be as accurate as
possible, and one can reasonably argue that conversions should generally
*not* be as accurate as possible, because it's quite expensive to get it
exactly right compared with what it costs to get it *almost* right.  In
particular, I do not know how to write an exact input conversion routine
without doing unbounded precision arithmetic.

For that reason, the IEEE standard does *not* require conversion to be
as accurate as possible: it allows an error of up to 0.47 times the
least significant bit of the number being converted, in addition to the
inevitable error implied by rounding.  The number 0.47 is carefully
chosen to guarantee all the properties I mentioned above; the IEEE
standard refers the reader to Jerome Coonen's PhD thesis for further
detail.
--
				--Andrew Koenig
				  ark@europa.att.com
steve@groucho.ucar.edu (Steve Emmerson) (07/09/90)
In <11035@alice.UUCP> ark@alice.UUCP (Andrew Koenig) writes:
>Granted by the IEEE floating-point standard, for one thing.
>If I am using a system whose vendor claims that it supports
>IEEE floating point, then I can expect that
>	input conversion on a floating point number with an
>	exact representation will be exact;

An earlier discussion of the IEEE standard indicated that it allows an
exactly-representable value to be off by one bit.

Is this, then, incorrect?

Steve Emmerson        steve@unidata.ucar.edu        ...!ncar!unidata!steve
oz@yunexus.yorku.ca (Ozan Yigit) (07/11/90)
In article <11035@alice.UUCP> ark@alice.UUCP (Andrew Koenig) writes:
>Interestingly, this does not require that conversion be as accurate
>as possible, and one can reasonably argue that conversions should
>generally *not* be as accurate as possible, because it's quite expensive
>to get it exactly right compared with what it costs to get it
>*almost* right.  In particular, I do not know how to write an
>exact input conversion routine without doing unbounded precision
>arithmetic.

Two excellent papers on this topic that may be of interest can be found
in "Proceedings of the ACM '90 Conference on Programming Language Design
and Implementation" [SIGPLAN NOTICES, v25, #6, June 1990]:

	William Clinger: "How to Read Floating Point Numbers Accurately"

	Guy L. Steele Jr. and Jon L White: "How to Print Floating-Point
	Numbers Accurately"

I suspect the essence of their work will be reflected in various Common
Lisp and Scheme implementations, and their respective standards.

oz
---
Sometimes when you fill a vacuum,       | oz@nexus.yorku.ca
it still sucks.     - Dennis Ritchie    | or (416) 736 5257
loren@tristan.llnl.gov (Loren Petrich) (07/14/90)
In article <b3f.2688bfce@ibmpcug.co.uk> dylan@ibmpcug.CO.UK (Matthew
Farwell) writes:
>In article <44436@ism780c.isc.com> marv@ism780.UUCP (Marvin Rubenstein) writes:
>Consider the code segment:-
>
>main()
>{
>	float f;
>
>	f = 0.0;
>	while (1) {
>		if (f == 10.0) break;
>		printf("%f\n", f);
>		f += 0.1;
>	}
>	printf("Stopped\n");
>}
>
>If it's all to do with conversion routines, why doesn't this stop when f
>reaches 10?

That is because 0.1 does not have a finite-length representation in
binary.  In binary form it is:

	0.000110011001100110011...

with an infinitely repeating sequence of digits.  Check it out by
multiplying it by 10 (binary: 1010).  Since a computer will only use
finite-length numbers, it will cut off this infinite-length
representation and get a number that is not quite 0.1.  But no integer
multiple of that can equal exactly 10.0, so the program gets stuck in an
infinite loop.

The moral: avoid this sort of test.  You won't get exact agreement.

Loren Petrich, the Master Blaster
loren@sunlight.llnl.gov

One may need to route through any of:

	lll-lcc.llnl.gov
	lll-crg.llnl.gov
	star.stanford.edu

For example, use: loren%sunlight.llnl.gov@star.stanford.edu

My sister is a Communist for Reagan
ark@alice.UUCP (Andrew Koenig) (07/15/90)
In article <7913@ncar.ucar.edu>, steve@groucho.ucar.edu (Steve Emmerson)
writes:
> In <11035@alice.UUCP> ark@alice.UUCP (Andrew Koenig) writes:
> >Granted by the IEEE floating-point standard, for one thing.
> >If I am using a system whose vendor claims that it supports
> >IEEE floating point, then I can expect that
> >	input conversion on a floating point number with an
> >	exact representation will be exact;
> An earlier discussion of the IEEE standard indicated that it allows an
> exactly-representable value to be off by one bit.
> Is this, then, incorrect?

It depends on what you mean by `off by one bit.'

Every decimal floating-point literal (such as 0.3 or 2.7e-28) is the
exact representation of some rational number.  When trying to fit that
number into an IEEE floating-point representation, some accuracy may
have to be lost by rounding.  The IEEE standard allows that rounding
error, of course, and 0.47 times the value of the low-order bit in
additional error.  Among other things, the number 0.47 implies that some
numbers will have the last bit wrong when converted, but if the input
exactly represents an IEEE floating-point value, then that is the value
you will get.  Moreover, if you write an IEEE value out with enough
significant digits and read it back in again, the accumulated error will
never be enough to change the value.
--
				--Andrew Koenig
				  ark@europa.att.com