rick@tetrauk.UUCP (Rick Jones) (08/06/90)
Thanks to all those who responded to my query about exactness (or lack of it) in FP numbers. It has confirmed my suspicion that there isn't a clean solution to this problem. The responses can be grouped into 3 broad categories: a. "The effective precision is a function of the algorithm and initial data, not just the representation" While this is true, the representation puts a limit on the precision even when the algorithm may suggest something higher. But more fundamentally, this is an issue of _precision_ as opposed to _exactness_, which are not quite the same. If the context is dealing with real-world values, they can all be assumed to be irrational, and known only to within some understood precision. When dealing with more artificial numbers (see below), they are presumed to be exact, and unexpected problems arise because there is not a 1-1 correspondence between exact decimal numbers and exact binary FP numbers. As my example showed, you don't _expect_ to get into precision issues when checking if .291/2 = .1455 b. "Analyse the bit-pattern of the IEEE format" This came from Stefan & Carlos at U. of Basel, and although possible, it feels a bit non-portable & hardware specific. We write portable packages which run on all sorts of hardware - just how universal IS the IEEE format? (that's not an entirely rhetorical question) c. The "constructive reals" solution This is interesting, and I will follow up the various references. It seems quite well-known, but sounds a bit computationally expensive, though I don't know enough yet to make a qualified comment. d. The "continued fractions" solution The reference to this came from Colin Plumb, and is also something I shall follow up. It appears to have an advantage that all rational numbers have a finite representation. For interest (if anyone's still with me :-), I'll explain the problem domain. I deliberately omitted this from my initial query to get as wide a response as possible. This is nothing more taxing (!) than business systems, which are not known for requiring complex mathematics. They don't, but they do require exactness. Auditors have this annoying view that accounts must balance to the penny, not 1 part in 10 to-the-something. There is a good case for using some form of BCD representation, but there are many programming advantages in using the embedded numerical types of the language (yes, I do know about OOP and building a BCD class with overloaded operators, that's one of my options). In fact, the only disadvantage of FP is the exactness issue. 15 significant digits is enough, the 8-byte repesentation is compact and fixed, the arithmetic operators are embedded and the computation done in an FPP in all except toy computers. Unfortunately, exactness cannot be controlled by applying a system-wide precision in terms of decimal places. To illustrate: My system may be handling currency values in units of Lira, Yen, or some other inconsiderate currency with a very small basic denomination, and need to go up to quantities of 10^9 or more. I may also have some stock kept in bulk, and need to account for quantities down to 4 decimal places. This total range is pushing the limits, but no one figure needs that total range of precision - I never account for less than 1 Yen, and I'm not going to have a billion tons of bulk stock. Andrew Koenig appreciated this problem; as he said: Floating point arithmetic is *hard*. Precision is dealt with in the appropriate places by rounding, but even this causes a problem. I notice that no one tried to provide a solution to the second, and in many ways more subtle, problem in my original posting. Given that the supposed exact but unrepresentable value of .1455 has been arrived at by two different means where the actual values are incrementally above and below, and we simply want to round the result to 3 decimal places: simplistic rounding will actually _increase_ the discrepancy by yielding .145 and .146 respectively. Comparison of the values before rounding would indicate them equal within 15 significant digits. After rounding, they show a difference in the 3rd digit. Accountants find this sort of thing difficult to grasp! I actually have a general solution for this, but it involves sprintf() and some sneaky string manipulation, and is hardly elegant. Thanks for the ideas, they have provided food for thought. I shall be on holiday for the next 3 weeks; if you have any Earth-shattering solutions which haven't come up so far, could you e-mail me please? -- Rick Jones You gotta stand for something Tetra Ltd. Maidenhead, Berks Or you'll fall for anything rick@tetrauk.uucp (...!ukc!tetrauk.uucp!rick) - John Cougar Mellencamp
henry@zoo.toronto.edu (Henry Spencer) (08/08/90)
In article <713@tetrauk.UUCP> rick@tetrauk.UUCP (Rick Jones) writes: >... This is nothing more taxing (!) than business systems, which are not >known for requiring complex mathematics. They don't, but they do require >exactness. Auditors have this annoying view that accounts must balance to the >penny, not 1 part in 10 to-the-something. There is a good case for using some >form of BCD representation, but there are many programming advantages in using >the embedded numerical types of the language... Our own experience, in building accounting systems for our own use, is that the problem can often be eliminated by ignoring the decimal point in dollar currencies and thinking of money as measured in pennies. Then integer arithmetic suffices, if amounts are not huge. If amounts *are* huge, note that most floating-point boxes will do precise integer arithmetic if you are careful to avoid operations that necessarily produce fractions. Yes, you get fractional amounts if you (say) buy 3.5 tons of stuff that is priced at $1.17 per ton... but there really *is* a fractional amount there, and the proper response is to explicitly round it in whatever way is appropriate to the problem (in this case, the standard business approach of rounding up). This isn't necessarily an adequate solution to tricky cases involving things like multiple currencies, but for simple stuff, it's amazing how much simpler the problem gets if you don't insist on representing money as a fraction just because it's written that way. -- The 486 is to a modern CPU as a Jules | Henry Spencer at U of Toronto Zoology Verne reprint is to a modern SF novel. | henry@zoo.toronto.edu utzoo!henry
johnb@srchtec.UUCP (John Baldwin) (08/09/90)
In article <1990Aug7.173030.2823@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes: >In article <713@tetrauk.UUCP> rick@tetrauk.UUCP (Rick Jones) writes: >> Auditors have this annoying view that accounts must balance to the >>penny, not 1 part in 10 to-the-something. There is a good case for using some >>form of BCD representation, but there are many programming advantages in using >>the embedded numerical types of the language... > >Our own experience, in building accounting systems for our own use, is that >the problem can often be eliminated by ignoring the decimal point in dollar >currencies and thinking of money as measured in pennies. Doesn't this push him back into the "continued sums" implementation? Remember, conversion rates between different currencies "slide" up and down at a fairly high frequency. (Does anybody know just HOW frequently the exchange rates change?) This would require an INTERNAL currency which would probably NOT be the same as a recognized world currency; one in which any unit of any other world currency can be expressed as an integer. This is probably tricky enough to find WITHOUT having the conversions changing on you all the time! Rick: how many currencies do you have to deal with and how often do they change? Maybe this *would* be a feasible approach: if the numbers are too big for a (long), store them in a struct. Have a multidimensional array of structs containing numerators and denominators for conversions. -- John T. Baldwin | johnb@srchtec.uucp Search Technology, Inc. | johnb%srchtec.uucp@mathcs.emory.edu standard disclaimer: | ...uunet!samsung!emory!stiatl!srchtec.. opinions and mistakes purely my own. | ...mailrus!gatech!stiatl!srchtec...
mayne@VSSERV.SCRI.FSU.EDU (William (Bill) Mayne) (08/09/90)
In article <168@srchtec.UUCP> johnb@srchtec.UUCP (John Baldwin) writes: >In article <1990Aug7.173030.2823@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes: >>In article <713@tetrauk.UUCP> rick@tetrauk.UUCP (Rick Jones) writes: >>> Auditors have this annoying view that accounts must balance to the >>>penny, not 1 part in 10 to-the-something. There is a good case for using some >>>form of BCD representation, but there are many programming advantages in using >>>the embedded numerical types of the language... >> I strongly agree both about BCD and using the embedded types (and operators) of the language. SUMMARY of my response: Seeing the original poster's description of his problem it seems to me that what is really needed is not floating point numbers but fixed point numbers with a moderately large number of digits and scaling for decimal point placement. The traditional approach would be to use BCD. Although the kind of arithmetic support needed is possible in C, and I describe the approach I'd take below, the point is well taken that it is more desirable to use the builtin operators, rather than have to use explicit function calls for every operation. Thus the best alternative is a language which provides either BCD or some other mechanism to support arithmetic in the scale and precision needed. If using a language which already has the needed arithmetic features is ruled out by considerations of hardware, cost, programmer skills, or what ever the best solution may be to use C++. With operator overloading once a class for BCD is provided programmers can use C syntax for their applications. The rest of this article just expands a little on this summary. You could write your own routines in C to support BCD or something functionally equivalent. One relatively simple alternative would be using long integers (which I think would give enough significant digits) plus a signed short integer for decimal scaling. Then you would have some interesting problems with division and handling overflows and such, but no more than floating point implementers face. On a machine without hardware support for either BCD or floating point I think this scheme of using binary for the significant part but decimal exponents would be the most satifactory. A number would be represented by a structure like this: /* Please no flames if the syntax isn't exactly right, */ /* I don't use this often and don't have a reference handy. */ typedef BCD struct { long digits; short exponent; } This would be normalized so that for all nonzero values digits%10!=0 and the value represented is digits * 10**exponent. Most of the arithmetic would be done in binary, but using a decimal exponent part avoids the problems of representing values like 0.2, which is a repeating fraction in binary and gives accountants so much heartburn. Although solutions in C are possible and even interesting I would suggest that C may not be the best language for this type of problem. Instead, consider: (1) COBOL (2) PL/1 (3) Turbo Pascal with the BCD option (4) C++ (5) Rexx COBOL and PL/1 both provide the kind of arithmetic support needed for this problem, plus business oriented output formatting. Of the two PL/1 is a much better language. However, you may not have these available or have programmers proficient in using them, especially if you are in a microcomputer environment. If you are using PCs you might look into Turbo Pascal. Support for BCD is (or was the last time I looked) an extra cost option, but quite reasonable. I haven't used it so I don't know how well it works. With C++ you might still have to write you own routines to support BCD (though I wouldn't be surprised if you could find a package somewhere). A big advantage over doing the same thing in C would be that you could then overload the regular arithmetic operators to use the new format. This would make coding and maintaining your appication much easier, and is probably the most portable of the solutions suggested here (short of using C, which as I said is really not well suited to this type of problem.) If performance is a problem and your hardware supports BCD (which I think math coprocessors do) you should try to use it, even though a small assembly language interface might to required to gain access to it and of course portability would suffer. Rexx is a fine interpreted language with support for arbitrary precision arithmetic, using fixed point when possible and only going to floating point when the exponents get too small or too large. You can control the digits setting so this won't happen. It is available on PCs (Personal Rexx from Mansfield (CT) software), IBM mainframes, amigas, and many unix platforms (from /wrk/group). Performance and space might be a problem, depending upon the demands of the application. Rexx stores all values as strings. Hence it doesn't take much advantage of hardware support for arithmetic. Its main use is actually as macro language. Good luck, and keep the newsgroup posted on your solutions. Thanks to the original poster for the interesting summary.
henry@zoo.toronto.edu (Henry Spencer) (08/09/90)
In article <168@srchtec.UUCP> johnb@srchtec.UUCP (John Baldwin) writes: >>Our own experience, in building accounting systems for our own use, is that >>the problem can often be eliminated by ignoring the decimal point in dollar >>currencies and thinking of money as measured in pennies. > >Doesn't this push him back into the "continued sums" implementation? >Remember, conversion rates between different currencies "slide" up and down >at a fairly high frequency... If you are doing tricky things with exchange rates, all bets are off. :-) In the simple case, though, if you go to Thomas Cook and buy (say) US$, they will *not* give you a fraction of a cent even if the mathematical conversion factor would indicate it. They will round the amount to an integer in some way (typically in their favor :-)). So if you're trying to keep track of that transaction by program, you don't need to keep the mathematically-exact amount, which is wrong anyway; it is both sufficient and correct to use the appropriate rounding function to give an integer, and store that. -- It is not possible to both understand | Henry Spencer at U of Toronto Zoology and appreciate Intel CPUs. -D.Wolfskill| henry@zoo.toronto.edu utzoo!henry
amull@Morgan.COM (Andrew P. Mullhaupt) (08/11/90)
In article <1990Aug9.154302.29976@zoo.toronto.edu>, henry@zoo.toronto.edu (Henry Spencer) writes: > In article <168@srchtec.UUCP> johnb@srchtec.UUCP (John Baldwin) writes: > > If you are doing tricky things with exchange rates, all bets are off. :-) > ... So if you're trying > to keep track of that transaction by program, you don't need to keep the > mathematically-exact amount, which is wrong anyway; it is both sufficient > and correct to use the appropriate rounding function to give an integer, > and store that. If you are marking the resultant of the transaction to market, you _do_ need to have the fraction. One of the most interesting difficulties in an "all-integer" or single precision financial package is split adjustment of stocks. I know directly of a case where due to a one-bit error in single precision rounding, a portfolio was found to be unhedged by a single share. The trading system in question was capable of ordering a trade to rebalance, and it did so. It turns out to be extremely unusual for such a trade to be generated for any really useful purpose, and so the traders operating the system noticed this trade, and became concerned that a potentially serious error was going undetected. After some effort, the source of the error was found, and (somewhat sheepishly) a second trade of a single share had to be ordered to reverse the effect of the first. Now it's kind of funny that a single share trade happens, and maybe the financial loss is at most 25 cents capital and some unmentionable transaction cost, but the hassle for the traders is not ignorable. There is another case I know of where tens of thousands of shares went round-trip but this was a combination of a database error and floating point mishap. The moral of the story? Do not make casual assumptions about the kind of arithmetic required in business applications. Expect that "getting it right to the penny", or any other fixed accuracy will not be enough in every situation. There will always be one more example than you can think of when you're trying to get away with a low ceiling in a financial application. Later, Andrew Mullhaupt
news@usc.edu (08/19/90)
In article <168@srchtec.UUCP> johnb@srchtec.UUCP (John Baldwin) writes: In article <1990Aug7.173030.2823@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes: >In article <713@tetrauk.UUCP> rick@tetrauk.UUCP (Rick Jones) >writes: >> Auditors have this annoying view that accounts must balance to >> the penny, not 1 part in 10 to-the-something. There is a good >> case for using some form of BCD representation, but there are >> many programming advantages in using the embedded numerical >> types of the language... > >Our own experience, in building accounting systems for our own >use, is that the problem can often be eliminated by ignoring the >decimal point in dollar currencies and thinking of money as >measured in pennies. This is very true. Units of money should never be stored as fractions. BCD is merely one technique of ensuring this. Doesn't this push him back into the "continued sums" implementation? Remember, conversion rates between different currencies "slide" up and down at a fairly high frequency. (Does anybody know just HOW frequently the exchange rates change?) This would require an INTERNAL currency which would probably NOT be the same as a recognized world currency; one in which any unit of any other world currency can be expressed as an integer. This is probably tricky enough to find WITHOUT having the conversions changing on you all the time! The only solution I know of to the multiple currency problem is to keep records in the same form as the real world entities they represent. If you have two apples and three oranges, and the exchange rate is currently 1 apple = 1 orange, that does not mean you have 5 apples nor does it mean you have 5 oranges. The exchange rate only exists when there is an exchange, and is only set for that exchange. (Not to be confused with a transaction, which might consiste of many exchanges.) wulkwa