jsturges@pldote.intel.com (~Jay Sturges) (07/12/89)
/*
 * Here is a routine written some time ago
 * which will perform ASCII to float conversions
 *
 * Enjoy,
 *
 * Jay Sturges	sun!sacto!pldote!jsturges
 */
#include <stdio.h>

#define MAXEXP	38		/* maximum exponent */
#define MINEXP	(-38)		/* minimum exponent */
#define EOL	'\0'		/* end of line */
#define DOT	'.'		/* decimal point */
#define TRUE	1
#define FALSE	0
#define CVALUE(c)	((c) - '0')	/* numerical value of char */

typedef char boolean;		/* flagging type */
typedef double real;		/* conversion result type */

static exponential();		/* forward declarations */
static mantissa();

/*
 * atof()
 * string to double conversion in C language;
 * returns TRUE on error, FALSE on success
 */
atof(t, value)
char *t;
real *value;
{
	boolean eflg = FALSE;	/* exponent flag */
	boolean dflg = FALSE;	/* decimal point flag */
	boolean sflg = FALSE;	/* sign (+/-) flag */
	boolean error = FALSE;	/* error case */
	real gtvalue = 0;	/* value of the string */
	real dsign = 1.0;	/* sign of the mantissa */
	int esign = 1;		/* sign of the exponent */
	int exp = 0;		/* exponent value */
	int didx = 0;		/* places after decimal pt. */
	int big = 0;		/* magnitude of the integer part */

	while (*t != EOL && !error)
	{
		switch (*t)
		{
		case 'e':	/* exponent flag; */
		case 'E':	/* reset sign flag */
			if (eflg)
				error = TRUE;
			else
			{
				eflg = TRUE;
				sflg = FALSE;
			}
			break;
		case DOT:	/* decimal point flag */
			if (dflg)
				error = TRUE;
			else
			{
				/*
				 * start counting digits
				 * after the decimal point
				 */
				dflg = TRUE;
				didx = 1;
			}
			break;
		case '-':	/* sign flag */
		case '+':
			if (sflg)
				error = TRUE;
			else
			{
				sflg = TRUE;
				if (!eflg && *t == '-')
					dsign = -1.0;
				else if (eflg && *t == '-')
					esign = -1;
			}
			break;
		case '0':	/* unsigned part of the */
		case '1':	/* exponent and the */
		case '2':	/* mantissa */
		case '3':
		case '4':
		case '5':
		case '6':
		case '7':
		case '8':
		case '9':
			if (eflg)
				exponential(*t, &exp);
			else if (big >= MAXEXP)
				error = TRUE;
			else
			{
				mantissa(*t, &gtvalue, didx, dflg, &big);
				if (dflg)
					didx++;
			}
			break;
		case ' ':
			break;
		default:
			error = TRUE;
		}
		t++;
	}

	/*
	 * having the exponent, the mantissa, and their signs,
	 * the value can be computed, if possible
	 */
	if (!error)
	{
		/*
		 * signed exponent and mantissa
		 */
		exp *= esign;
		gtvalue *= dsign;

		if (exp > MAXEXP || exp < MINEXP ||
		    (exp + big) > MAXEXP ||
		    (exp + big) < MINEXP)
			error = TRUE;
		else if (exp > 0)
			while (--exp >= 0)
				gtvalue *= 10.0;
		else if (exp < 0)
			while (++exp <= 0)
				gtvalue *= 0.1;

		*value = gtvalue;

		/*
		 * zero times ten to any power is still zero,
		 * and that is a legal case
		 */
		if (*value == 0.0)
			error = FALSE;
	}
	return (error);
}

/*
 * accumulate one exponent digit; basically, atoi()
 */
static exponential(c, exp)
char c;
int *exp;
{
	*exp *= 10;
	*exp += CVALUE(c);
}

/*
 * accumulate one mantissa digit, before or after the decimal point
 */
static mantissa(c, value, didx, decimal, big)
char c;
real *value;
int didx;
boolean decimal;
int *big;
{
	real dtmp;

	if (!decimal)
	{
		/*
		 * before the decimal point:
		 * keep scaling up the value
		 */
		*value *= 10.0;
		*value += (real) CVALUE(c);
		if (*value > 0.0)
			(*big)++;
	}
	else
	{
		/*
		 * after the decimal point
		 */
		dtmp = CVALUE(c);
		if (didx >= MAXEXP)
		{	/* too small, it's virtually zero */
			dtmp = 0.0;
			didx = 0;
		}
		while (--didx >= 0)	/* C is pass by value */
			dtmp *= 0.1;
		*value += dtmp;
	}
}
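[For illustration, a minimal driver along these lines (assuming it is
compiled in the same file as the routine above) exercises the calling
convention.  Note that the interface differs from the C library's
atof(): this routine stores its result through the second argument and
returns TRUE on error.]

main()
{
	real value;

	if (atof("3.14e2", &value) == FALSE)
		printf("%f\n", value);		/* prints 314.000000 */
	else
		printf("conversion error\n");
	return (0);
}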
meissner@tiktok.dg.com (Michael Meissner) (07/27/89)
In article <38@pldote.intel.com> jsturges@pldote.intel.com (~Jay Sturges) writes:
|
| /*
|  * Here is a routine written some time ago
|  * which will perform ASCII to float conversions
|  *
|  * Enjoy,
|  *
|  * Jay Sturges	sun!sacto!pldote!jsturges
|  */
	...
|	if (!error)
|	{
|		/*
|		 * signed exponent and mantissa
|		 */
|		exp *= esign;
|		gtvalue *= dsign;
|
|		if (exp > MAXEXP || exp < MINEXP ||
|		    (exp + big) > MAXEXP ||
|		    (exp + big) < MINEXP)
|			error = TRUE;
|		else if (exp > 0)
|			while (--exp >= 0)
|				gtvalue *= 10.0;
|		else if (exp < 0)
|			while (++exp <= 0)
|				gtvalue *= 0.1;
	...
|		/*
|		 * after the decimal point
|		 */
|		dtmp = CVALUE(c);
|		if (didx >= MAXEXP)
|		{	/* too small, it's virtually zero */
|			dtmp = 0.0;
|			didx = 0;
|		}
|		while (--didx >= 0)	/* C is pass by value */
|			dtmp *= 0.1;
|		*value += dtmp;
|	}
| }

This code is subject to round-off errors (particularly multiplying by
0.1, which is faster, but less accurate, than dividing by 10.0).  Each
fractional digit increases the accumulated error.  For example, the
following two numbers:

	1.0
	0.1e1

should be identical down to the last bit.

The best way to do this conversion is to accumulate the mantissa as an
integer, possibly dropping digits when the bits in the mantissa cannot
hold the number exactly.  You then calculate the scale factor (the
exponent minus the number of digits not dropped to the right of the
decimal point).  If this scale factor is 0, you are done; if it is
positive, you multiply by the appropriate power of ten (which you get
from a statically calculated array, not by repeated multiplications);
if it is negative, you divide by the appropriate power of ten (the
negative of the scale factor).  This way the round-off error is limited
to one multiply or divide.  A good course or textbook in numerical
analysis should give other means of preserving accuracy.
--
Michael Meissner, Data General.
Uucp:		...!mcnc!rti!xyzzy!meissner
Internet:	meissner@dg-rtp.DG.COM
Old Internet:	meissner%dg-rtp.DG.COM@relay.cs.net

"If compiles were much faster, when would we have time for netnews?"
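[A rough sketch of the approach Meissner describes might look like the
following.  The names sketch_atof and pow10tab are invented for the
example; sign and e/E exponent parsing are elided, and range checks on
the table index are omitted.]

#include <ctype.h>
#include <limits.h>

/* powers of ten, statically calculated, not built by repeated multiplies */
static const double pow10tab[] = {
	1e0, 1e1, 1e2, 1e3, 1e4, 1e5, 1e6, 1e7,
	1e8, 1e9, 1e10, 1e11, 1e12, 1e13, 1e14, 1e15
};

double
sketch_atof(const char *s)
{
	unsigned long mant = 0;	/* mantissa accumulated as an integer */
	int frac = 0;		/* kept digits right of the decimal point */
	int drops = 0;		/* dropped digits left of the decimal point */
	int seen_dot = 0;
	int scale;

	for (; *s != '\0'; s++) {
		if (*s == '.' && !seen_dot)
			seen_dot = 1;
		else if (isdigit((unsigned char) *s)) {
			if (mant <= (ULONG_MAX - 9) / 10) {
				mant = mant * 10 + (unsigned long) (*s - '0');
				if (seen_dot)
					frac++;	/* a kept fractional digit */
			} else if (!seen_dot)
				drops++;	/* digit will not fit; remember the magnitude */
			/* dropped digits right of the point simply vanish */
		} else
			break;			/* sign/exponent handling elided */
	}

	/* scale factor: (elided decimal exponent) + dropped digits - kept fraction digits */
	scale = drops - frac;

	/* NOTE: a real version must range-check scale against the table size */
	if (scale == 0)
		return (double) mant;
	else if (scale > 0)
		return (double) mant * pow10tab[scale];		/* one multiply */
	else
		return (double) mant / pow10tab[-scale];	/* one divide */
}

[With this scheme, "1.0" passes through the exact integer 10 and a
single division by 10, and "0.1e1" (once exponent handling is added)
through the exact integer 1 with no scaling at all, so both yield the
same bit pattern.]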
andrew@alice.UUCP (Andrew Hume) (07/29/89)
it seems to me the discussion of an atof is going in the wrong
direction.  it is nice to insist that 1.0 and 0.1e1 evaluate to the
same bit pattern, but really what we want, at least what i want, is an
accurate atof.  if you consider the real number line, the set of
possible values of double (or whatever precision) is just a bunch of
points on this line.  atof should then return the closest such point to
the real represented by the string.  (ties should probably be broken
systematically.)

in summary, get someone who really knows what is going on (not me,
certainly) to figure it out; buggering around with half-hearted hacks
to preserve some precious bits of accuracy is simply a waste of time,
however well intentioned.  (sorry, my spleen needed venting.)
ark@alice.UUCP (Andrew Koenig) (07/29/89)
In article <9699@alice.UUCP>, andrew@alice.UUCP (Andrew Hume) writes:
> it seems to me the discussion of an atof is going in the wrong
> direction.  it is nice to insist that 1.0 and 0.1e1 evaluate to the
> same bit pattern, but really what we want, at least what i want, is an
> accurate atof.  if you consider the real number line, the set of
> possible values of double (or whatever precision) is just a bunch of
> points on this line.  atof should then return the closest such point
> to the real represented by the string.  (ties should probably be
> broken systematically.)
> in summary, get someone who really knows what is going on (not me,
> certainly) to figure it out; buggering around with half-hearted hacks
> to preserve some precious bits of accuracy is simply a waste of time,
> however well intentioned.

Well, I sort of know what's going on, having written the
assembly-language atof() that's been in use on VAX machines for the
past ten years.

Ideally, atof would return the closest floating-point number to the
value given as input, with ties broken as specified by the machine's
arithmetic (IEEE says that ties are ordinarily broken by rounding to
even).  Unfortunately, an atof that good requires unbounded-precision
arithmetic -- I just don't see any other sensible way of doing it.

To convince yourself of this, pick an integer that's big enough to
completely fill the fraction of your floating-point format.  That is,
changing the least significant bit (LSB) changes the value by 1.  Now
add 0.5 to that integer.  You then have a number whose decimal
representation is of the form xxxxxx.5 and that represents an exact
tie.  If ties are rounded down, or the integral part of the number is
even and ties are rounded to even, then this number itself will be
rounded down.  However, if instead of xxxxxx.5 you have
xxxxxx.5000...0001, it will have to be rounded up.  Thus in general you
need to know about ALL the digits of the number to convert it exactly.

For this reason, IEEE does not require exact conversion.  Instead, it
requires an error of no more than 0.47*LSB in the conversion.  If I
remember correctly, the value 0.47 is a principal result of Jerome
Coonen's PhD thesis: it guarantees that you can write out a
floating-point number and read it back in again while preserving the
precise bit pattern, yet is still not too difficult to implement.  In
addition to this property, IEEE requires that input be converted
precisely the same way at compile time and run time.  Also, the
accuracy requirement implies that a decimal value that can be exactly
represented as a floating-point number must be converted to that
number.

Beyond this, it is not hard to guarantee that equivalent
representations such as 1.0 and 0.1e1 will produce exactly the same
result.  I forget whether IEEE requires that or not.  One way to do it
is to follow the strategy I used in my VAX atof(): accumulate digits in
a multiple-precision integer until further digits would cause that
integer to overflow.  Along the way, remember where the decimal point
is.  Then pick off the exponent, if any, offset it by the decimal point
position, and (finally) scale the multiple-precision integer
appropriately.  I used 64 bits for the integer; since a VAX
double-precision fraction is 56 bits (including a hidden bit), I could
lose 7 bits in the scaling and still be assured of getting it (nearly)
right.
I did not actually prove that I never lost more than 7 bits in the
scaling, but I did try a number of edge cases and believe that the
actual results are at least a full decimal digit better than the IEEE
requirement.

I completely agree with Andrew Hume that floating-point conversion is
not something that can be done well casually.
--
				--Andrew Koenig
				  ark@europa.att.com
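[Koenig's halfway case is easy to see concretely with IEEE doubles,
whose 53-bit fraction makes consecutive values exactly 1 apart in
[2^52, 2^53).  The demo below is not from the thread; it assumes a
correctly rounded strtod(), a facility that post-dates this discussion.
The digit buried far past the decimal point is what forces the second
conversion up, so a converter that examines only a bounded prefix of
the digits cannot always round correctly.]

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
	/* 2^52 = 4503599627370496; neighboring doubles differ by 1 here */
	double tie = strtod("4503599627370496.5", NULL);
	double up  = strtod("4503599627370496.5000000000000000000000001", NULL);

	printf("%.17g\n", tie);	/* 4503599627370496: exact tie, rounds to even */
	printf("%.17g\n", up);	/* 4503599627370497: the far digit forces rounding up */
	return 0;
}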
knighten@pinocchio (Bob Knighten) (07/31/89)
In article <9701@alice.UUCP>, ark@alice (Andrew Koenig) gives a nice
discussion of some of the issues in writing a good atof. Here I would like to
add a few references for those who want to do it. The best overall discussion
- in the context of the IEEE standard for binary floating point arithmetic - is
Chapter 7 (Accurate Yet Economical Binary-Decimal Conversions) of
Jerome T. Coonen, Contributions to a Proposed Standard for Binary
Floating-Point Arithmetic. Ph.D. Thesis, Department of Mathematics,
University of California, Berkeley. June, 1984.
This is available from University Microfilms. The order number is 8512788.
Also useful is Appendix D of the same thesis (Pascal Unit for Correctly
Rounded Binary-Decimal Conversions).
Other useful references:
J. T. Coonen, "An implementation guide to a proposed standard for
floating-point arithmetic," Computer, 13(1), Jan. 1980, pp. 68-79.
I. B. Goldberg, "27 bits are not enough for 8-digit accuracy," CACM,
10(2), Feb. 1967, pp. 105-106.
D. W. Matula, "In-and-out conversions," CACM, 11(1), Jan. 1968, pp. 47-50.
E. W. Phillips, "Binary calculation," in The Origins of Digital Computers,
2nd ed., B. Randell, ed., Springer-Verlag, 1975, p. 273.