[comp.lang.c] more than you wanted to hear about pow

sesv@cbnewsc.att.com (steven.e.sommars) (08/12/90)

Here's my contribution to the pow discussion.  If you spot
an error, please send me corrections by email & I will summarize.

============================================================
What makes for a good pow(x,y)?  An unordered, partial (and
somewhat overlapping) list might include:
	a) Accurate enough to satisfy users
	b) monotonic
	c) fast
	d) result doesn't overflow/underflow prematurely
	e) reproducible (does pow(x,y) vary over time or
	   between machines?)
	f) function size (text+data)
	g) If pow(integer,integer) is exactly representable,
	   that value should be returned.  E.g.,
	   pow(2.,3.) should return exactly 8 (an easy property
	   to spot-check; see the snippet after this list).
	h) Acceptable interface ``glue''.  Returns
	   what the user expects when given 
		negative x and/or y,
		pow(0,0) (we all know the right answer to that one :-)  ),
		one or both arguments are NaN's,
		arguments which cause over/underflow, 
		...and so forth.
	   The list of special cases is long for pow().
	i) Possible interface to traceback / exception handling
	   mechanisms.
	j) adheres to whatever standard the user deems important
	   (ANSI C, X/Open, SVID, NCEG, ...)
	k) support of all applicable rounding modes.
	l) if applicable, sets IEEE flags ``correctly''.
Most pow users are willing to sacrifice one or more of these properties.
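For instance, property (g) is easy to spot-check.  (A throwaway
test of my own, not a serious accuracy test.)

	#include <math.h>
	#include <stdio.h>

	int main(void)
	{
	    /* 8 and 100 are exactly representable in IEEE dp, so a
	     * pow with property (g) must return them exactly */
	    printf("pow(2.,3.)  = %.17g (want 8)\n",   pow(2.0, 3.0));
	    printf("pow(10.,2.) = %.17g (want 100)\n", pow(10.0, 2.0));
	    return 0;
	}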

How do computers compute pow(x,y) in production code?  Here are some of
the methods I know of.  I'm sure that others exist.  See the references
if you want more details. 

1) exp(y*log(x)).  Inaccurate, as Andrew Koenig described.  I've seen
   errors of about 1000 ulps on IEEE dp.  However if you are on
   an architecture with enough bits of fast extended precision, 
   this method may provide acceptable results.  (If you have
   fast extra precision, other tricks can be used.)
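   To make the pitfall concrete, here is roughly what method 1
   looks like in C.  This is only a sketch (naive_pow is my name
   for it); real implementations add argument checking and the
   special cases listed under (h):

	#include <math.h>

	/* Method 1: x^y computed as e^(y*ln x); positive x only.
	 * The half-ulp rounding error in log(x) is multiplied by
	 * y, so the relative error of the result grows with
	 * |y*log(x)| and can reach hundreds of ulps when x^y is
	 * near the overflow threshold. */
	double naive_pow(double x, double y)
	{
	    return exp(y * log(x));
	}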

2) Cody&Waite.  More accurate than 1), but runs into
   trouble for some extreme cases, e.g., pow(1+epsilon,large)

   [Aside. I implemented this algorithm for IEEE dp.  Andrew 
    helped make my code portable (not an easy task).  Alas, he
    also provided test cases to show where the Cody&Waite algorithm
    is ill-behaved.   This is one reason why the code never
    made it into System V.  It's still better than exp(y*log(x)).]

3) 4.3 BSD.  A fairly sophisticated algorithm, implemented by
   K. C. Ng.  It carefully uses exp(a*log(b)) in some regions,
   but simulates extra precision in the computations.
   The maximum observed error is about 140 ulps, but only
   in a small region; the typical error is much smaller.  It is
   monotonic and has the pow(integer,integer) property (see (g)
   above).  Many implementations, including SVR4, are based on this.
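   One standard building block for such simulated extra precision
   (my illustration of the idea, not the actual 4.3 BSD code) is the
   Veltkamp split, which writes a double as hi + lo where hi has at
   most 26 significant bits, so that products of hi parts are exact:

	/* Split a into hi + lo; hi carries the upper ~26 bits,
	 * so products of two hi parts incur no rounding error in
	 * IEEE dp.  Assumes round-to-nearest and no overflow in
	 * the scaling. */
	void split(double a, double *hi, double *lo)
	{
	    double t = a * 134217729.0;   /* 2^27 + 1 */
	    *hi = t - (t - a);
	    *lo = a - *hi;                /* exact: a == *hi + *lo */
	}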

4) S. Gal's accurate table method.  This is used in
   the IBM RISC System/6000, in IBM VS II Fortran (S/370),
   and perhaps elsewhere.

5) Some implementations combine another technique with a
   special-case test for pow(x,integer).  As noted by others,
   monotonicity may be violated when combining
   approximation techniques, even if both techniques
   are individually good.

   Also note that while repeated multiplication
   may be fast, it can be less accurate than a good pow()
   due to cumulative roundoff errors; the usual
   repeated-squaring loop is sketched below.
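   The usual loop, hedged as my own sketch (real implementations
   also handle negative and very large integer exponents):

	double ipow(double x, unsigned n)   /* hypothetical helper */
	{
	    double r = 1.0;
	    while (n != 0) {
	        if (n & 1)
	            r *= x;   /* fold in this bit of the exponent */
	        x *= x;       /* square for the next bit */
	        n >>= 1;
	    }
	    /* each of the O(log n) multiplications can contribute
	     * up to half an ulp of roundoff */
	    return r;
	}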

6) Some microprocessors have built-in support for approximating
   transcendental functions.  Sometimes this can be used
   for building an accurate pow.

7) Others.  I have tested early versions of a table based pow designed
   by Peter Tang.  IEEE dp accuracy is better than one ulp,
   usually less than 0.55 ulps.  I don't know when the
   algorithm will be finalized/published.



   [Final aside. Here's one way to view the computation of exp(y*log(x)).
    let x+dx be the next IEEE dp number above positive x,
		x+dx=nextafter(x,HUGE)
    (Roughly, dx = x * 2^-53.) 

    The values computed for pow(x,y) and pow(x+dx,y) should differ.
    However, log(x) and log(x+dx)
    may be approximated by the same IEEE dp result!  Thus
    pow(x,y) and pow(x+dx,y) may return the
    same value from the math library.  To see how
    this varies with x, consider the following table.
	 x	number of neighboring dp numbers near x for which
		log(x) == log(neighbor) after rounding to dp
	2^10		 4
	2^20		 8
	2^40		16
	2^80		32
    For large x, log(x) loses the ability to distinguish arguments.
    For y~0, exp(y) does the same.  Thus exp(y*log(x)) has problems;
    the short experiment below reproduces the table.]
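    A quick way to check the table yourself (my own sketch, using
    the names scalbn() and nextafter(); the counts it prints may
    differ by one or two at region boundaries):

	#include <math.h>
	#include <stdio.h>

	int main(void)
	{
	    int exps[] = { 10, 20, 40, 80 };
	    int i;
	    for (i = 0; i < 4; i++) {
	        double x = scalbn(1.0, exps[i]);   /* x = 2^e */
	        double lx = log(x);
	        double t = x;
	        int count = 0;
	        /* step through adjacent doubles while log() still
	         * rounds to the same dp result */
	        while (log(t) == lx) {
	            count++;
	            t = nextafter(t, HUGE_VAL);
	        }
	        printf("2^%d: %d neighbors share log(x)\n",
	               exps[i], count);
	    }
	    return 0;
	}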

============================================================
References:

R. C. Agarwal, et al.,
``New Scalar and Vector Elementary Functions for the IBM System/370,''
IBM J. Res. Develop., vol. 30, March 1986.
**Recommended reading.

William J. Cody, Jr. and William Waite,
``Software Manual for the Elementary Functions,''
Prentice-Hall, 1980.
**Two distinct uses for this book: a) cookbook for implementing
**elementary functions, b) gives code (elefunt) for testing
**accuracy of implementations.  Elefunt is available on netlib.

S. Gal,
``Computing Elementary Functions: A New Approach for Achieving High
Accuracy and Good Performance,''
IBM Technical Report 88.153, March 1985.
**Techniques used for radix-16 floating point (see Agarwal paper).  

S. Gal and B. Bachelis,
``An Accurate Elementary Mathematical Library for the IEEE Floating
Point Standard,''
IBM Technical Report 88.223, June 1987.
**Good background, but doesn't cover pow.
**A revised version of this will go into ACM Transactions on
**Math. Software.  

J. F. Hart, et al.,
``Computer Approximations,''
John Wiley & Sons, 1968.
**One of the standard sources for approximations.  Somewhat dated.

``IEEE Standard for Binary Floating-Point Arithmetic,''
ANSI/IEEE Std. 754-1985, IEEE, 1985.

P. W. Markstein,
``Computation of elementary functions on the IBM RISC
System/6000 processor,''
IBM J. Res. Develop., vol. 34, January 1990.
**Describes how >>correctly rounded<< elementary functions
**can be implemented.  A table based (Gal/Bachelis) technique produces
**the properly rounded answer in 1023 out of 1024 cases.  
**The other 1 out of 1024 cases can be detected. 
**Without help, the approximation may round
**to the wrong value; extra precision can then be used to
**recompute a better approximation.
**This paper does not explicitly cover pow.

Joel Silverstein, Steve Sommars, Yio-Chian Tao,
``The UNIX® System Math Library, a Status Report,''
January 1990 Usenix Proceedings (Washington).
**Describes UNIX SVR3->SVR4 libm changes,
**gives test results, includes pretty ulps plots.

Eugene H. Spafford, John C. Flaspohler,
``A Report on the Accuracy of Some Floating Point Math Functions
on Selected Computers,''
Georgia Tech Technical Reports
GIT-SERC-86/02 and GIT-ICS-85/06.
**Survey of libm several years ago.  Accuracy was 
**pretty bad.  This was before 4.3BSD libm came out;
**the situation is better now.

============================================================
	Steve Sommars
	sesv@cbnewsc.att.com

skwu@boulder.Colorado.EDU (WU SHI-KUEI) (08/12/90)

For us ignorami, what does an error of "190 ulps" mean?  Thank you.

sesv@cbnewsc.att.com (steven.e.sommars) (08/13/90)

In article <24645@boulder.Colorado.EDU>, skwu@boulder.Colorado.EDU (WU SHI-KUEI) writes:
> For us ignorami, what does an error of "190 ulps" mean?  Thank you.


An ulp is a Unit in the Last Place, roughly the amount a floating point
number changes if the least significant bit is flipped.  For IEEE
double precision, this can be computed by
	ulp(x) = 2^(logb(x)-52)
where logb(x), an IEEE 754 recommended function, is the unbiased
exponent of x.  For normalized numbers, an IEEE dp ulp is
in the range:
	2^-52 * x  >=  ulp(x)  >=  2^-53 * x
[Attn. nitpickers: I have intentionally omitted details of denormalized 
numbers, ulp(powers of the radix), etc.]

If we have an approximation, F(y), to a function whose true
value at y is f(y), then 
	error in ulps = (f(y) - F(y)) / ulp(f(y))
[The choice of sign, (F-f) versus (f-F), is somewhat arbitrary.]
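In C, using the IEEE 754 recommended functions under the names
logb() and scalbn(), the two definitions might be coded as follows.
This is only a sketch; it keeps the simplifications noted above,
and in real testing the true value f would come from
higher-precision arithmetic rather than arrive as a double.

	#include <math.h>

	/* ulp(x) = 2^(logb(x)-52) for IEEE double precision */
	double ulp(double x)
	{
	    return scalbn(1.0, (int)logb(x) - 52);
	}

	/* error of approximation F, in ulps of the true value f */
	double ulp_err(double f, double F)
	{
	    return (f - F) / ulp(f);
	}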

An ulp is a measure of relative error, not absolute.  A
single precision function with a maximum error of 1 ulp is
not more accurate than a double precision function with a
maximum error of 3 ulps: one sp ulp is about 2^-23 of the
result, while three dp ulps are only about 3*2^-52 of it.

Ulps are not hard to use.  Let's say that the *true* result, T,
of a computation is about 3.3, or when expressed in IEEE dp hex,
	T = 400AA3D70A3D70A4 [4E2...]   
The extra hex digits in [] won't fit into an IEEE double.
The ulp error depends on the approximation.
		approx		   error in ulps
	    400AA3D70A3D70A6	 -1.B1E   (-1.694)
	    400AA3D70A3D70A5	 -0.B1E	  (-0.694)
	    400AA3D70A3D70A4	  0.4E2   ( 0.305)
	    400AA3D70A3D70A3	  1.4E2   ( 1.305)
The best we can hope for in computing transcendentals is
a maximum error of 0.5 ulps (in round-to-nearest mode).
The value 400AA3D70A3D70A4 is the best approximation to T;
it has the smallest roundoff error.


	Steve Sommars
	sesv@cbnewsc.att.com