[net.arch] Floating Point Rounding

jer@peora.UUCP (J. Eric Roskos) (07/22/85)

> Third, the 360's floating point unit truncates results rather than rounding
> (again in the interest of speed) which is equivalent to keeping one less
> bit than if you rounded.

This is not necessarily done in the interest of speed.  As I explained
to one of the people here privately (believing he was going to post a
summary), our 3200 series floating point unit does round on the low-order
bit.  It does this probabilistically.  Not being a number theorist, I can't
demonstrate that the underlying assumption of equal distribution involved
is valid*, but basically, it works like this.

Whenever a computation is completed, there are some guard digits left over
whose values can't be copied to the destination register (that being the
nature of guard digits in the first place).  Now, if the most-significant
(hexadecimal) guard digit is 7 or less, no change to the non-guard digits
(hereafter called just the "mantissa") is made; in essence, it
"rounds down".  If the most-significant guard digit is 9 or more, the
mantissa of the result is "rounded up" by adding 1 to the mantissa.

Furthermore, if the most-significant guard-digit is 8, and any other guard
digit is nonzero, the hardware still "rounds up" by adding 1.  (Intuitively,
this may seem wrong, but if you think carefully about real-numbers -- and
understanding real numbers does always require a certain "willing
suspension of disbelief" -- you'll see that it does make sense.)

So far, this is plain, ordinary rounding.  But now comes the problem.  If
the most-significant guard digit is 8, and all other guard digits are zero,
you have a dilemma.  If you always round down, the result will come out
"too small" (we're speaking here in terms of absolute values); if you
always round up, it will come out "too large".  So... if all other guard
digits are 0, the least significant bit of the low-order digit of the
mantissa is FORCED to 1.  The theory here is that there is an equal
probability that this low-order bit will previously have been a 1 or a zero
-- something that is much more probable, I believe, than the incorrect
high-order bit assumption mentioned earlier.  As a result, this R* rounding
will give a "More Accurate" rounding than the IBM rounding, according
to this reasoning.
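For concreteness, here is the rule as a little C sketch.  The C is purely
illustrative -- the real thing is hardware, the names are mine, and the
inputs are assumed to be already unpacked into plain integers; carry out
of the mantissa and renormalization are ignored here:

    /* Sketch of the R* rule described above.  'mantissa' is the
     * result's non-guard digits as an integer, 'msg' is the most-
     * significant hexadecimal guard digit (0-15), and 'rest_nonzero'
     * is true if any other guard digit is nonzero. */
    unsigned long rstar(unsigned long mantissa, unsigned msg,
                        int rest_nonzero)
    {
        if (msg <= 7)
            return mantissa;          /* "round down": truncate      */
        if (msg >= 9 || rest_nonzero)
            return mantissa + 1;      /* "round up"                  */
        return mantissa | 1;          /* exact halfway: force LSB=1  */
    }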

However, very sadly, in the real world, "better" is not always good enough.
Since IBM does it another way, the less-accurate results have come to be so
ingrained in the scientific community that more accurate results often
cause an uproar: "you didn't get the results our IBM got, so your machine
is broken!" they exclaim.  As a result, R* rounding is not that popular,
even though there is a large body of theory behind it showing that it is
a "better" rounding scheme than the IBM approach.

The moral of this very long story is that sometimes it is well known that
certain floating point algorithms are better, but they can't be used, even
though they are efficient, because machines that use them would get in
trouble in the marketplace.  As the theoretical physicist Richard Feynman
puts it, "in the real world, innovation is very hard."  This is very true.

---------

*Note that this statement is largely a scientific disclaimer: i.e., I am
not willing to state as fact a claim I can't prove, while giving a
semi-rigorous explanation of a well-defined algorithm.  I do believe,
personally, that the assumption is valid; it's just that this belief is
opinion.
-- 
Shyy-Anzr:  J. Eric Roskos
UUCP:       ..!{decvax,ucbvax,ihnp4}!vax135!petsd!peora!jer
US Mail:    MS 795; Perkin-Elmer SDC;
	    2486 Sand Lake Road, Orlando, FL 32809-7642

wfmans@ihuxb.UUCP (w. mansfield) (07/24/85)

> ... (discussion of plain ordinary rounding)
> So far, this is plain, ordinary rounding.  But now comes the problem.  If
> the most-significant guard digit is 8, and all other guard digits are zero,
> you have a dilemma.  If you always round down, the result will come out
> "too small" (we're speaking here in terms of absolute values); if you
> always round up, it will come out "too large".  So... if all other guard
> digits are 0, the least significant bit of the low-order digit of the
> mantissa is FORCED to 1.  The theory here is that there is an equal
> probability that this low-order bit will previously have been a 1 or a zero
> -- something that is much more probable, I believe, than the incorrect
> high-order bit assumption mentioned earlier.  As a result, this R* rounding
> will give a "More Accurate" rounding than the IBM rounding, according
> to this reasoning.

I'd like to point you folks toward the IEEE 754 floating point standard and
its supporting documentation for discussions of rounding.  The IEEE 754
committee spent years (seemingly) discussing rounding, and determined that
four rounding modes were required to support numerical algorithms in
computers:  round to nearest, round to plus infinity, round to minus
infinity, and round to zero.  Round to nearest is the mode discussed above.
In the case referred to above (in IEEE language, where the two nearest
representable values are equally near the infinitely precise value), the one
with the least significant bit zero will be returned.  In some previous
drafts the IEEE committee wanted to alternate the value returned in this
case; that is, alternately returning the lesser and greater model number.
Although alternation was mathematically pleasing, the hardware-oriented
members of the committee suffered apoplexy, and the committee settled on
always returning the value whose least significant bit is zero.
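As a sketch of the four modes, here they are driven through the C
library's <fenv.h> rounding-mode interface -- a facility standardized
much later than this discussion, but it exposes exactly the four modes
named above.  This assumes a machine that evaluates float arithmetic in
single precision:

    #include <fenv.h>
    #include <stdio.h>

    /* 1.0f + 1e-8f is inexact in single precision, so the rounding
     * mode in effect decides which neighbouring float is returned. */
    #pragma STDC FENV_ACCESS ON
    int main(void)
    {
        static const int   modes[] = { FE_TONEAREST, FE_UPWARD,
                                       FE_DOWNWARD,  FE_TOWARDZERO };
        static const char *names[] = { "to nearest", "to +infinity",
                                       "to -infinity", "to zero" };
        volatile float x = 1.0f, y = 1e-8f;   /* volatile: no folding */
        for (int i = 0; i < 4; i++) {
            fesetround(modes[i]);
            printf("round %-12s: %.9g\n", names[i], (double)(x + y));
        }
        return 0;
    }

Round to nearest, round to minus infinity, and round to zero all return
1.0 here; round to plus infinity returns the next float above 1.0.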

I always thought that the reason for IBM's truncation was that the
FORTRAN language was defined that way.  It is clearly difficult to
establish cause and effect here.
> 
> However, very sadly, in the real world, "better" is not always good enough.
> Since IBM does it another way, the less-accurate results have come to be so
> ingrained in the scientific community that more accurate results often
> cause an uproar: "you didn't get the results our IBM got, so your machine
> is broken!" they exclaim.  As a result, R* rounding is not that popular,
> even though there is a large body of theory behind it showing that it is
> a "better" rounding scheme than the IBM approach.
> 
A big part of the problem is that the real numerical analysts designing
programs for computers are very aware of the peculiarities of the hardware
their programs run on.  They have very carefully designed their software
to extract as much precision as possible from that hardware, and in doing
so have made their programs hardware-specific.
The purpose of the IEEE standard is to provide a machine independent system
of arithmetic, where the same algorithm will give the same result regardless
of the underlying hardware implementation.

Another part of the problem is that the folks who write specifications for
computer languages (Ada excluded) don't specify the type of arithmetic
to be performed.  Basically they wimp out and say "the answer you get
depends on the hardware".  This is especially true where floating point
is concerned.

(The preceding is my opinion -- I do not represent my employer or IEEE in
presenting these views)

ark@alice.UUCP (Andrew Koenig) (07/25/85)

Try using floating-point arithmetic to simulate multiple-precision
fixed-point arithmetic sometime.  It's much harder if the machine rounds.
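(A sketch of one such difficulty, with hypothetical names: extracting
the carry from a double used as a limb of a multi-precision number, in
a non-power-of-two base.  On a truncating machine the chopped quotient
is already the floor; with rounding, the quotient can land one too high
and must be corrected:

    #include <math.h>

    /* Illustrative only: carry out of a nonnegative integer-valued
     * double used as a base-1e6 limb.  limb / base can round upward
     * across the limb boundary, so the candidate must be checked;
     * a truncating divide would not need the correction step. */
    double limb_carry(double limb, double base)   /* e.g. base = 1.0e6 */
    {
        double q = floor(limb / base);   /* candidate carry */
        if (q * base > limb)             /* quotient rounded up */
            q -= 1.0;
        return q;
    }

With truncation, every intermediate chop falls on the "safe" side, so
such guard code can be omitted throughout.)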

sambo@ukma.UUCP (Father of micro-S) (08/06/85)

In article <1117@ihuxb.UUCP> wfmans@ihuxb.UUCP (w. mansfield) writes:
>> ... (discussion of plain ordinary rounding)
>> So far, this is plain, ordinary rounding.  But now comes the problem.  If
>> the most-significant guard digit is 8, and all other guard digits are zero,
>> you have a dilemma.  If you always round down, the result will come out
>> "too small" (we're speaking here in terms of absolute values); if you
>> always round up, it will come out "too large".  So... if all other guard
>> digits are 0, the least significant bit of the low-order digit of the
>> mantissa is FORCED to 1.

Volume 2 of Knuth's The Art of Computer Programming (1981) has a good
discussion of rounding on pp. 221-23.  Let me quote a bit from it:
     "No rounding rule can be best for every application....  But for most
     numerical calculations the best policy appears to be the rounding
     scheme ... which insists that the least significant digit should
     always be made even (or always odd) when an ambiguous value is rounded.
     This is not a trivial technicality, of interest only to nit-pickers;
     it is an important practical consideration, since the ambiguous case
     arises surprisingly often and a biased rounding rule produces
     significantly poor results.... For even radices, there is reason to prefer
     the following rule: `Round to even when b/2 is odd, round to odd when
     b/2 is even.'  [b is the radix.]  ... On the other hand, some people
     prefer rounding to even in all cases, so that the remainder will tend
     to be 0 more often.  Neither alternative conclusively dominates the
     other; fortunately the base is usually b = 2 or b = 10, when everyone
     agrees that round to even is best."
Well, almost everyone.  The scheme described above is a counterexample --
or was the radix 16?  I don't remember.  Anyway, IEEE does it right.  See
below.
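(For the binary case, where everyone agrees round to even wins, here is a
sketch of round-to-nearest-even on an integer significand.  The names are
mine, and carry out of the kept bits is ignored; 'extra' is the number of
low-order bits being discarded, assumed at least 1:

    /* Round a binary significand to nearest, ties to even. */
    unsigned long round_nearest_even(unsigned long sig, int extra)
    {
        unsigned long keep = sig >> extra;              /* bits kept    */
        unsigned long rem  = sig & ((1UL << extra) - 1);/* bits dropped */
        unsigned long half = 1UL << (extra - 1);        /* tie point    */

        if (rem > half || (rem == half && (keep & 1UL)))
            keep += 1;     /* round up; exact ties go to the even side */
        return keep;
    }

The tie branch is the whole point: when the discarded bits are exactly
half an ulp, the result goes to whichever neighbour has a zero low bit.)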

>I'd like to point you folks toward the IEEE 754 floating point standard and
>Round to nearest is the mode discussed above.
>In the case referred to above (in IEEE language, where the two nearest
>representable values are equally near the infinitely precise value), the one
>with the least significant bit zero will be returned.

>> However, very sadly, in the real world, "better" is not always good enough.
>> As a result, R* rounding is not that popular,
>> even though there is a large body of theory behind it showing that it is
>> a "better" rounding scheme than the IBM approach.

Knuth seems to really attack the IBM approach:
     "Many people feel that, since floating point arithmetic is inexact by
     nature, there is no harm in making it just a little bit less exact in
     certain rare cases, if it is convenient to do so.  This policy saves a
     few cents in the design of computer hardware, or a small percentage of
     the average running time of a subroutine.  But the above discussion
     shows that such a policy is mistaken....  The reason is not to glorify
     `bit chasing'; a more fundamental issue is at stake here: `Numerical
     subroutines should deliver results that satisfy simple, useful
     mathematical laws whenever possible.'  [The single quotes are mine; the
     original is italicized.]  The crucial formula u `+' v = round(u + v) is a
     `regularity' property that makes a great deal of difference between
     whether mathematical analysis of computational algorithms is worth doing
     or worth avoiding.  Without any underlying symmetry properties, the job
     of proving interesting results becomes extremely unpleasant.  `The
     enjoyment of one's tools is an essential ingredient of successful
     work.'"  [The single quotes are mine; the original is italicized.]  

>Another part of the problem is that the folks who write specifications for
>computer languages (Ada excluded) don't specify the type of arithmetic
>to be performed.  Basically they wimp out and say "the answer you get
>depends on the hardware".  This is especially true where floating point
>is concerned.

Another example of a language that avoids this problem is micro-S.  (If you
don't know what micro-S is, you can ask me.)
-----------------------------------------
Samuel A. Figueroa, Dept. of CS, Univ. of KY, Lexington, KY  40506-0027
ARPA: ukma!sambo<@ANL-MCS>, or sambo%ukma.uucp@anl-mcs.arpa,
      or even anlams!ukma!sambo@ucbvax.arpa
UUCP: {ucbvax,unmvax,boulder,oddjob}!anlams!ukma!sambo,
      or cbosgd!ukma!sambo

	"Micro-S is great, if only people would start using it."

jer@peora.UUCP (J. Eric Roskos) (08/08/85)

> Summary: Knuth has some good discussion on rounding
>
> In article <1117@ihuxb.UUCP> wfmans@ihuxb.UUCP (w. mansfield) writes:

Just a little grumble here before I comment... I wrote the text that
appears under this header, not w. mansfield, describing Perkin-Elmer
R* Rounding.

Anyway, to answer the question asked... the radix is indeed 16.  Thus,
according to Knuth's rule,

>    significantly poor results.... For even radices, there is reason to prefer
>    the following rule: `Round to even when b/2 is odd, round to odd when
>    b/2 is even.'  [b is the radix.]  ... On the other hand, some people
>    prefer rounding to even in all cases, so that the remainder will tend
>    to be 0 more often.  Neither alternative conclusively dominates the
>    other; fortunately the base is usually b = 2 or b = 10, when everyone
>    agrees that round to even is best."

the R* rounding scheme is right, because it is a round-to-odd scheme (if I
understand it all correctly -- forcing the low-order bit to 1 always leaves
the low-order hexadecimal digit odd), and 16/2 = 8, an even number.  This
is very reassuring!  In fact, I suspect the person who devised the rounding
scheme had read Knuth's comments, since the floating point unit on the
3200s was designed just shortly after Knuth's book came out.  I know
he had a substantial theoretical basis for the scheme, since one of our
local historians of sorts here has a large collection of papers in support
of the scheme left over from those days (CACM papers, etc.).

I think the person who said "some numerical analysts put corrections for
a machine's shortcomings into their programs" had a good point, one I hadn't
thought of.  I do know that a numerical analyst (fairly well-known) at
the university where I did my graduate work was originally of the "the
floating point unit is broken" school, then came to like it a great deal
once she sat down and figured out why the results were different, though I
also remember she changed her programs slightly as a result.  But the moral
of the story really was that market forces ultimately seem to determine the
acceptability of a product, much more than its technical correctness, in
many cases.  Though this doesn't mean you shouldn't try...

Disclaimer: I'm just an interested observer of R* rounding, et al.; I had
   nothing to do with inventing or implementing it.  I can't even remember
   the last time I wrote a program that used floating point operations!
-- 
Shyy-Anzr:  J. Eric Roskos
UUCP:       ..!{decvax,ucbvax,ihnp4}!vax135!petsd!peora!jer
US Mail:    MS 795; Perkin-Elmer SDC;
	    2486 Sand Lake Road, Orlando, FL 32809-7642