[net.arch] IEEE f.p. standard

ark@alice.UUCP (Andrew Koenig) (04/21/86)

> COS(X) is defined as SIN(X+PI/2).  Now for |X| <= PI, we require that
> SIN(X) = RND(sin(X)).  This is the correctly rounded value of the true
> function.

Unfortunately, even this fairly simple definition is quite hard to implement.

Consider the problem of rounding an infinite-precision real number
to its nearest representable value.  Conceptually, this process involves
cutting the fraction somewhere and manipulating the bits before the
cut in a way that depends on the value of the bits after the cut.
For instance, if you're rounding a positive value toward infinity,
you will increment the bits before the cut if any of the bits after
the cut are non-zero.  If all the bits after the cut are zero, you must
leave the bits before the cut alone.

This implies that it's not easy to place an upper bound on the precision
you need to get a correctly rounded value of a transcendental function.
Essentially, you need to be able to distinguish these cases:

	xxxxxxx0|111111111111...
	xxxxxxx1|000000000000...

where the bar represents the cut point.  If rounding toward infinity,
the first case must round up to xxxxxxx1, while the second case either
stays at xxxxxxx1 (if all the bits after the cut are 0) or rounds up
to yyyyyyy0 (if any bit after the cut is nonzero, where yyyyyyy=xxxxxxx+1).

In other words, when rounding toward infinity, you must generate bits
after the cut until you know there's a nonzero bit, and it's hard
to know how much precision you need to get it right.
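The case analysis above can be sketched directly on bit strings.  This is a
hypothetical illustration (the function name and string representation are
mine, not from any standard):

```python
def round_toward_pos_inf(bits_before, bits_after):
    """Round a positive binary fraction toward +infinity at the cut.

    bits_before and bits_after are strings of '0'/'1' on either side of
    the cut; returns the kept bits as an integer, bumped by one exactly
    when any discarded bit is nonzero.
    """
    kept = int(bits_before, 2)
    if '1' in bits_after:   # a single set bit anywhere after the cut forces round-up
        kept += 1           # may carry, producing yyyyyyy = xxxxxxx + 1
    return kept

# The two hard-to-distinguish cases from the text:
# round_toward_pos_inf('01100', '111')  ->  13  (bumped up)
# round_toward_pos_inf('01101', '000')  ->  13  (left alone)
```

The whole difficulty is that deciding between those two calls requires
examining the discarded bits to arbitrary depth.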

Similar situations occur in different places in other rounding modes.
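One way to cope with the unbounded-precision problem is to recompute at ever
higher precision until the rounded result stops changing.  A rough sketch
using Python's decimal module, with a plain Taylor series standing in for a
real evaluation scheme (all names here are mine, and this is an illustration
of the idea, not a production method):

```python
from decimal import Decimal, getcontext

def dec_sin(x, prec):
    """Taylor-series sin(x) for Decimal x, to about prec digits."""
    getcontext().prec = prec + 5                 # working precision + guard digits
    term, total, n = x, Decimal(0), 1
    eps = Decimal(10) ** -(prec + 5)
    while abs(term) > eps:
        total += term
        term = -term * x * x / ((n + 1) * (n + 2))
        n += 2
    return total

def sin_escalate(x, digits):
    """Raise the working precision until the digits-digit rounding stops moving."""
    prec, prev = digits + 5, None
    while True:
        approx = dec_sin(Decimal(x), prec)
        getcontext().prec = digits
        cur = +approx                            # unary + rounds to `digits` digits
        if cur == prev:
            return cur
        prev, prec = cur, prec + 10
```

The loop usually terminates after one extra round, but no a-priori bound on
the final precision is known in advance, which is exactly the difficulty
described above.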

weemba@brahms.BERKELEY.EDU (Matthew P. Wiener) (04/22/86)

In article <2048@lanl.ARPA> jlg@a.UUCP (Jim Giles) writes:
>Since PI/2 is not exactly representable, there is no representable value of
>X which will produce COS(X)=0 for all rounding modes.  Similarly, no value
>of X will produce SIN(X)=1 for rounding modes which truncate down.  This is
>an example of the type of problems that occur when dealing with
>transcendental numbers with only a subset of the rationals to calculate
>with.

This particular instance is trivially curable by using functions sinpi(),
etc., which return sin(pi*( )), etc.  It's a bad habit to implement a
function in one particular way solely out of respect for the classical
notation.  I prefer implementations that respect standard usage.
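Wiener's suggestion is easy to demonstrate: because sinpi() takes the
multiple of pi as its argument, the troublesome points fall on exactly
representable values.  A minimal sketch (the special-casing and reduction
here are my own illustration, not a library API):

```python
import math

def sinpi(x):
    """sin(pi * x), with the quarter-period points handled exactly."""
    r = x % 2.0                    # reduce by the period 2 of sin(pi*x)
    if r == 0.0 or r == 1.0:
        return 0.0                 # sinpi(integer) is exactly zero
    if r == 0.5:
        return 1.0                 # exact maximum, in every rounding mode
    if r == 1.5:
        return -1.0                # exact minimum
    return math.sin(math.pi * r)   # elsewhere, fall back to an approximation
```

With this interface sinpi(0.5) is exactly 1 under any rounding mode, which
is impossible for sin() at any representable argument.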

ucbvax!brahms!weemba	Matthew P Wiener/UCB Math Dept/Berkeley CA 94720

jlg@lanl.UUCP (04/22/86)

In article <5335@alice.UUCP> ark@alice.UUCP (Andrew Koenig) writes:
>> COS(X) is defined as SIN(X+PI/2).  Now for |X| <= PI, we require that
>> SIN(X) = RND(sin(X)).  This is the correctly rounded value of the true
>> function.
>
>Unfortunately, even this fairly simple definition is quite hard to implement.

It's not really HARD to implement so much as it's slow.  Since all values
of X are rational (of the form I*2^J, where I and J are integers), you can
implement the above requirements using CORDIC or some similar algorithm.
A side effect of the CORDIC algorithm is that it leaves a 'remainder' after
each iteration so that, after the last iteration, the remainder can be used
to tell you which way to round.
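The remainder Giles describes is the residual angle that rotation-mode
CORDIC carries through its iterations.  A floating-point sketch (a real
hardware unit would use fixed-point integers and a stored angle table; the
names here are mine):

```python
import math

def cordic_sin(theta, n=40):
    """Rotation-mode CORDIC: returns (sin approximation, residual angle).

    The residual angle z is the 'remainder' - its sign says which side of
    the computed value the true result lies on, so it can steer rounding.
    Converges for |theta| <= sum(atan(2**-i)) ~= 1.743.
    """
    angles = [math.atan(2.0 ** -i) for i in range(n)]
    gain = 1.0
    for a in angles:
        gain *= math.cos(a)            # pre-scale so the rotation gain cancels
    x, y, z = gain, 0.0, theta
    for i, a in enumerate(angles):
        d = 1.0 if z >= 0.0 else -1.0  # rotate to drive the residual toward zero
        x, y, z = (x - d * y * 2.0 ** -i,
                   y + d * x * 2.0 ** -i,
                   z - d * a)
    return y, z
```

Each iteration uses only shifts and adds, which is why the scheme maps so
directly onto hardware.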

The value X must be rational for two reasons.  First: the SIN of a rational
is always irrational (except zero - an easy case), therefore there is no
possibility of a 'tie' - you will always have a remainder and the direction
to round will always be unambiguous.  Second: the calculated remainder is
always 'accurate', assuming that the CORDIC coefficients are chosen
properly. ('Accurate' is in quotes here because it is usually NOT the real
remainder, but only one that is guaranteed to generate the correct result.
This is true because the CORDIC coefficients are only stored to finite
precision themselves.)

For all this to work, the CORDIC algorithm must be carried out with about
twice the precision of the target result.  CORDIC is slow anyway, and
double precision intermediate calculations make it worse.  However it is
*possible* to achieve the precision stated above.

As I said, I am designing hardware for the trig functions mainly for
recreation.  Of all the algorithms I've tried, only a few could give the
accuracy I wanted.  And of those, only things like CORDIC are easily
implemented in hardware.  For a double-extended SIN function, CORDIC
requires 64 iterations through the algorithm.  If a bit-sliced hardware
version could be made that cycles at about 100 ns per iteration, the SIN
function would take about 7 microseconds (not including argument reduction) -
hardware is available that could do it (a lot of expensive chips though).
I wonder how much trouble it would be to put it in VLSI, and how much it
would be slowed.

My only other experience with this issue is what can be found in
mathematical support libraries on mainframe computers.  Both the Cray
and CDC (NOS) support libraries use polynomial interpolants, the error
characteristics of which are arcane (to say the least).  Rational
polynomial interpolants are better but a rational polynomial requires a
divide - which, in hardware, has some of the same implementation
complexities as the trig functions (i.e., it must be done iteratively, the
true round is sometimes hard to find, or it is slow - like the shift-and-subtract
algorithm: VERY similar to CORDIC trig functions).
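For comparison with the library approach, here is the shape of a polynomial
evaluation of SIN on a reduced range, using Horner's rule.  The coefficients
below are plain Taylor terms rather than the tuned minimax coefficients a
real library would use, so treat it as a sketch:

```python
import math

def sin_poly(x):
    """Degree-13 Taylor polynomial for sin, Horner-evaluated in x*x.

    Adequate to near double precision only for |x| <= pi/4; a library
    routine would pair this with argument reduction and replace the
    Taylor terms with minimax coefficients.
    """
    c = [1.0, -1.0 / 6, 1.0 / 120, -1.0 / 5040,
         1.0 / 362880, -1.0 / 39916800, 1.0 / 6227020800]
    x2 = x * x
    acc = c[-1]
    for k in reversed(c[:-1]):
        acc = acc * x2 + k             # Horner's rule in the squared argument
    return x * acc
```

A polynomial like this evaluates in a handful of multiply-adds - far faster
than 64 CORDIC iterations - but its last-bit error behavior is exactly the
arcane part.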

Presumably, other people are working on this problem at a more professional
level.  What algorithm is presently used for this kind of thing?  What
error bounds are accepted?

J. Giles
Los Alamos