[sci.math] Iterative square root function?

clutx.clarkson.edu (Steven Stadnicki,212 Reynolds,2684079,5186432664) (09/25/89)

In the Ray Tracing News, vol. 2 # 4 (I think), someone mentioned a cubically
convergent method for doing Pythagorean sums and square roots.  Does anyone
have info on this method, and any methods for computing quick trig and inverse
trig functions?  It would be much appreciated.

                                     Steven Stadnicki
                                     stadnism@{clutx,sandbox}.clarkson.edu
                                     "Neil and Buzz had been prepared for
                                      monsters and disruptions in reality
                                      and space paradoxes and time paradoxes
                                      and death by evanescence.  They hadn't
                                      been prepared for New Jersey."

turk@Apple.COM (Ken "Turk" Turkowski) (09/25/89)

In article <1989Sep25.022915.5886@sun.soe.clarkson.edu> stadnism@image.soe!clutx.clarkson.edu writes:
>In the Ray Tracing News, vol. 2 # 4 (I think), someone mentioned a cubically
>convergent method for doing Pythagorean sums and square roots.  Does anyone
>have info on this method, and any methods for computing quick trig and inverse
>trig functions?  It would be much appreciated.

Here's a reposting of an article of mine of a few years ago:


[Moler-Morrison, "Replacing Square Roots by Pythagorean Sums", IBM
Journal of Research and Development, vol. 27, no. 6, pp. 577-581, Nov.
1983] describes an algorithm for computing sqrt(a^2+b^2) which is fast,
robust and portable.  The naive approach to this problem (just writing
sqrt(a*a+b*b) ) has the problem that for many cases when the result is
a representable floating point number, the intermediate results may
overflow or underflow.  This seriously restricts the usable range of a
and b.  Moler and Morrison's method never causes harmful overflow or
underflow when the result is in range.  The algorithm has cubic
convergence, and depending on implementation details may be even faster
than sqrt(a*a+b*b).  Here is a C function that implements their
algorithm:
double hypot(p,q)
double p,q;
{
    double r, s;
    if (p < 0) p = -p;
    if (q < 0) q = -q;
    if (p < q) { r = p; p = q; q = r; }	/* now 0 <= q <= p */
    if (p == 0) return 0;
    for (;;) {
	r = q / p;		/* q/p <= 1, so this never overflows */
	r *= r;
	s = r + 4;
	if (s == 4)		/* q has become negligible */
	    return p;
	r /= s;
	p += 2 * r * p;		/* p grows toward sqrt(p*p + q*q) */
	q *= r;			/* q shrinks cubically toward 0 */
    }
}

----------------------------------------------------------------

And a relevant response:

> From decwrl!decvax!dartvax!chuck Mon Nov 19 23:15:02 1984
> Subject: Pythagorean Sums
> 
> 
> Ken --
> 
> I recently worked on our assembly language routines which compute
> the square root and absolute value of a complex number.  Thus, your 
> posting of this article caught my attention.  I went so far as to
> visit our library and read the paper by Moler and Morrison.  I then
> reread the code that we use to implement our routines in hopes of
> improving them.  
> 
> In their paper, Moler and Morrison write "The [algorithm] is
> robust, portable, short, and, we think, elegant.  It is also 
> potentially faster than a square root."  Unfortunately, I've come 
> to the conclusion that our existing algorithm is about as robust,
> and is shorter, faster, and more elegant.  (I'm not sure about portability.)
> 
> I thought I'd send this letter to you, hoping that if you got bored
> one afternoon, you could point out any flaws in my thinking.  And I
> thought that there might be a small chance you would be interested
> in a comparison of the Moler-Morrison algorithm with another algorithm.
> 
> The algorithm that we use is:
> 
> double hypot(p,q)
> double p,q;
> {
>   double r;
>   if (p < 0) p = -p;
>   if (q < 0) q = -q;
>   if (p == 0) return (q);
>   if (q == 0) return (p);
>   if (p < q) {r = p; p = q; q = r;}
>   r = q / p;
>   return p * sqrt (1 + r * r);
> }
> 
> Moler and Morrison claim two advantages:  their algorithm does
> not overflow very easily and it converges rapidly.  However, depending
> on just how "sqrt" is implemented, our algorithm will not overflow
> very easily, either.  This brings us to speed considerations.  The 
> standard method for computing the square root is the Newton-Raphson
> method, which converges quadratically.  The Moler-Morrison 
> algorithm converges cubically.  However, the amount of work needed for
> one iteration of the M&M algorithm will allow us to perform two iterations
> of the Newton-Raphson method.
> 
> Let me unroll these two methods side by side so we can see what is
> involved in computing each.  We assume that the arguments 'p' and 'q'
> have been normalized so that 0 < q <= p.  (Pardon my pseudo-code.)
> 
> Newton_Raphson(p,q)                  Moler_Morrison(p,q)
> r = q/p;                             r = q/p;
> r *= r;                              r *= r;
> r += 1;
> 
> /* linear approx. to sqrt(r)
> using magic constants 'c1' and
> 'c2'.  Note that 1 <= r <= 2. 
> See "Computer Evaluation of 
> Mathematical Functions" 
> by C.T. Fike. */
> 
> g = r * c1 + c2;    /* 7 bits */     s = r / (4 + r);
> g = (g + r/g) * .5; /* 15 bits */    p = p + 2 * s * p;
> 
> /* Here N-R has about 5 digits of accuracy, M-M has about 6,
> and N-R has performed an extra addition. */
> 
> g = (g + r/g) * .5; /* 30 bits */    q = s * q;
> g = (g + r/g) * .5; /* 60 bits */    r = q / p;
>                                      r = r * r;
>                                      s = r / (4 + r);
>                                      p = p + 2 * s * p;
> 
> /* Now, both algorithms have about 20 digits of accuracy.
> Note that (depending on how one counts) M-M uses from 2
> to 3 more floating-point multiplications than N-R on each
> iteration.  N-R used an extra addition to set up for the
> iterations and will need an extra multiply to return its
> value.  On each further iteration, the number of significant
> digits will quadruple for the N-R method and only triple
> for the M-M method. */
> 
> return p * g;                        return p;
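
For the record, the Newton-Raphson column above assembles into a
working routine.  Note that the letter's magic constants c1 and c2
are not given; the values below are merely a chord fit of sqrt on
[1,2] (exact at the endpoints, about 6 bits in between), used here as
illustrative stand-ins for Fike's constants:

```c
#include <assert.h>
#include <math.h>

/* Seed constants: chord of sqrt(r) on [1,2].  Illustrative stand-ins
 * for the (unstated) c1 and c2 from Fike's book. */
#define C1 0.41421356237309503	/* slope */
#define C2 0.58578643762690497	/* intercept */

double hypot_nr(double p, double q)
{
    double r, g;

    if (p < 0) p = -p;
    if (q < 0) q = -q;
    if (p < q) { r = p; p = q; q = r; }	/* now 0 <= q <= p */
    if (p == 0) return 0;

    r = q / p;
    r = r * r + 1;		/* 1 <= r <= 2 */
    g = C1 * r + C2;		/* ~6-bit seed for sqrt(r) */
    g = (g + r / g) * 0.5;	/* ~13 bits */
    g = (g + r / g) * 0.5;	/* ~27 bits */
    g = (g + r / g) * 0.5;	/* ~54 bits: full double precision */
    return p * g;
}
```

Each Newton step roughly doubles the number of correct bits, so three
steps after the linear seed suffice for IEEE doubles.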


----------------------------------------------------------------

For computing trigonometric and inverse trigonometric functions in
fixed point, there is the well-known CORDIC algorithm, which can, by
the way, be extended to exponentials, logarithms, square roots, and
other transcendental functions.  It has linear convergence (roughly
one bit per iteration), as opposed to cubic for Moler-Morrison or
quadratic for Newton-Raphson, but it uses only integer adds and
shifts, which were, at one time, faster than floating-point
multiplications and additions.  Applications
of CORDIC transformations to anti-aliasing lines and polygons, as well
as additional references, can be found in:

	Turkowski, Kenneth
	Anti-Aliasing Through the Use of Coordinate Transformations
	ACM Transactions on Graphics
	Vol. 1, No. 3, July 1982
	pp. 215-234

Numerically paranoid code for CORDIC computations follows.  Note that the
pre- and post-shifts can be deleted to speed up the algorithm, with a
resultant loss of precision of about 2 bits.

-------------------------------- cordic.c --------------------------------
# define RADIANS 1
# define COSCALE 0x22c2dd1c	/* 0.271572 */

static long arctantab[32] = {
# ifdef DEGREES			/* MS 10 integral bits */
# define QUARTER (90 << 22)
	266065460, 188743680, 111421900, 58872272, 29884485, 15000234, 7507429,
	3754631, 1877430, 938729, 469366, 234683, 117342, 58671, 29335, 14668,
	7334, 3667, 1833, 917, 458, 229, 115, 57, 29, 14, 7, 4, 2, 1, 0, 0,
# else
# ifdef RADIANS			/* MS 4 integral bits */
# define QUARTER ((int)(3.141592654 / 2.0 * (1 << 28)))
	297197971, 210828714, 124459457, 65760959, 33381290, 16755422, 8385879,
	4193963, 2097109, 1048571, 524287, 262144, 131072, 65536, 32768, 16384,
	8192, 4096, 2048, 1024, 512, 256, 128, 64, 32, 16, 8, 4, 2, 1, 0, 0,
# else
# define BRADS 1
# define QUARTER (1 << 30)
	756808418, 536870912, 316933406, 167458907, 85004756, 42667331,
	21354465, 10679838, 5340245, 2670163, 1335087, 667544, 333772, 166886,
	83443, 41722, 20861, 10430, 5215, 2608, 1304, 652, 326, 163, 81, 41,
	20, 10, 5, 3, 1, 1,
# endif
# endif
};


/* To rotate a vector through an angle of theta, we calculate:
 *
 *	x' = x cos(theta) - y sin(theta)
 *	y' = x sin(theta) + y cos(theta)
 *
 * The rotate() routine performs multiple rotations of the form:
 *
 *	x[i+1] = cos(theta[i]) { x[i] - y[i] tan(theta[i]) }
 *	y[i+1] = cos(theta[i]) { y[i] + x[i] tan(theta[i]) }
 *
 * with the constraint that tan(theta[i]) = pow(2, -i), which can be
 * implemented by shifting. We always shift by either a positive or
 * negative angle, so the convergence has the ringing property. Since the
 * cosine is always positive for positive and negative angles between -90
 * and 90 degrees, a predictable magnitude scaling occurs at each step,
 * and can be compensated for instead at the end of the iterations by a
 * composite scaling of the product of all the cos(theta[i])'s.
 */

static
pseudorotate(px, py, theta)
long *px, *py;
register long theta;
{
	register int i;
	register long x, y, xtemp;
	register long *arctanptr;

	x = *px;
	y = *py;

	/* Get angle between -90 and 90 degrees */
	while (theta < -QUARTER) {
		x = -x;
		y = -y;
		theta += 2 * QUARTER;
	}
	while (theta > QUARTER) {
		x = -x;
		y = -y;
		theta -= 2 * QUARTER;
	}

	/* Initial pseudorotation, with left shift */
	arctanptr = arctantab;
	if (theta < 0) {
		xtemp = x + (y << 1);
		y = y - (x << 1);
		x = xtemp;
		theta += *arctanptr++;
	} else {
		xtemp = x - (y << 1);
		y = y + (x << 1);
		x = xtemp;
		theta -= *arctanptr++;
	}

	/* Subsequent pseudorotations, with right shifts */
	for (i = 0; i <= 28; i++) {
		if (theta < 0) {
			xtemp = x + (y >> i);
			y = y - (x >> i);
			x = xtemp;
			theta += *arctanptr++;
		} else {
			xtemp = x - (y >> i);
			y = y + (x >> i);
			x = xtemp;
			theta -= *arctanptr++;
		}
	}

	*px = x;
	*py = y;
}


static
pseudopolarize(argx, argy)
long *argx, *argy;
{
	register long theta;
	register long yi, i;
	register long x, y;
	register long *arctanptr;

	x = *argx;
	y = *argy;

	/* Get the vector into the right half plane */
	theta = 0;
	if (x < 0) {
		x = -x;
		y = -y;
		theta = 2 * QUARTER;
	}
	if (y > 0)
		theta = -theta;

	arctanptr = arctantab;
	if (y < 0) {		/* Rotate positive */
		yi = y + (x << 1);
		x = x - (y << 1);
		y = yi;
		theta -= *arctanptr++;	/* Subtract angle */
	} else {		/* Rotate negative */
		yi = y - (x << 1);
		x = x + (y << 1);
		y = yi;
		theta += *arctanptr++;	/* Add angle */
	}

	for (i = 0; i <= 28; i++) {
		if (y < 0) {	/* Rotate positive */
			yi = y + (x >> i);
			x = x - (y >> i);
			y = yi;
			theta -= *arctanptr++;
		} else {	/* Rotate negative */
			yi = y - (x >> i);
			x = x + (y >> i);
			y = yi;
			theta += *arctanptr++;
		}
	}

	*argx = x;
	*argy = theta;
}


/* Fxprenorm() block normalizes the arguments to a magnitude suitable for
 * CORDIC pseudorotations.  The returned value is the block exponent.
 */
static int
fxprenorm(argx, argy)
long *argx, *argy;
{
	register long x, y;
	register int shiftexp;
	int signx, signy;

	shiftexp = 0;		/* Block normalization exponent */
	signx = signy = 1;

	if ((x = *argx) < 0) {
		x = -x;
		signx = -signx;
	}
	if ((y = *argy) < 0) {
		y = -y;
		signy = -signy;
	}
	/* Prenormalize vector for maximum precision */
	if (x < y) {		/* |y| > |x| */
		while (y < (1 << 27)) {
			x <<= 1;
			y <<= 1;
			shiftexp--;
		}
		while (y > (1 << 28)) {
			x >>= 1;
			y >>= 1;
			shiftexp++;
		}
	} else {		/* |x| > |y| */
		while (x < (1 << 27)) {
			x <<= 1;
			y <<= 1;
			shiftexp--;
		}
		while (x > (1 << 28)) {
			x >>= 1;
			y >>= 1;
			shiftexp++;
		}
	}

	*argx = (signx < 0) ? -x : x;
	*argy = (signy < 0) ? -y : y;
	return (shiftexp);
}


/* Return a unit vector corresponding to theta.
 * sin and cos are fixed-point fractions.
 */
fxunitvec(cos, sin, theta)
long *cos, *sin, theta;
{
	*cos = COSCALE >> 1;	/* Save a place for the integer bit */
	*sin = 0;
	pseudorotate(cos, sin, theta);

	/* Saturate results to fractions less than 1, shift out integer bit */
	if (*cos >= (1 << 30))
		*cos = 0x7FFFFFFF;	/* Just shy of 1 */
	else if (*cos <= -(1 << 30))
		*cos = 0x80000001;	/* Just shy of -1 */
	else
		*cos <<= 1;

	if (*sin >= (1 << 30))
		*sin = 0x7FFFFFFF;	/* Just shy of 1 */
	else if (*sin <= -(1 << 30))
		*sin = 0x80000001;	/* Just shy of -1 */
	else
		*sin <<= 1;
}


/* Fxrotate() rotates the vector (argx, argy) by the angle theta. */
fxrotate(argx, argy, theta)
long *argx, *argy;
long theta;
{
	register long x, y;
	int shiftexp;
	long frmul();

	if ((*argx == 0 && *argy == 0) || theta == 0)
		return;

	shiftexp = fxprenorm(argx, argy);	/* Prenormalize vector */
	pseudorotate(argx, argy, theta);	/* Perform CORDIC
						 * pseudorotation */
	x = frmul(*argx, COSCALE);	/* Compensate for CORDIC enlargement */
	y = frmul(*argy, COSCALE);
	if (shiftexp < 0) {	/* Denormalize vector */
		*argx = x >> -shiftexp;
		*argy = y >> -shiftexp;
	} else {
		*argx = x << shiftexp;
		*argy = y << shiftexp;
	}
}


fxatan2(x, y)
long x, y;
{
	if (x == 0 && y == 0)
		return (0);
	fxprenorm(&x, &y);	/* Prenormalize vector for maximum precision */
	pseudopolarize(&x, &y);	/* Convert to polar coordinates */
	return (y);
}


fxpolarize(argx, argy)
long *argx, *argy;
{
	int shiftexp;
	long frmul();

	if (*argx == 0 && *argy == 0) {
		*argx = 0;	/* Radius */
		*argy = 0;	/* Angle */
		return;
	}
	/* Prenormalize vector for maximum precision */
	shiftexp = fxprenorm(argx, argy);
	/* Perform CORDIC conversion to polar coordinates */
	pseudopolarize(argx, argy);
	/* Scale radius to undo pseudorotation enlargement factor */
	*argx = frmul(*argx, COSCALE);
	/* Denormalize radius */
	*argx = (shiftexp < 0) ? (*argx >> -shiftexp) : (*argx << shiftexp);
}


# ifdef vax
long
frmul(a, b)			/* Just for testing */
long a, b;
{

	/*
	 * This routine should be written in assembler to calculate the high
	 * part of the product, i.e. the product when the operands are
	 * considered fractions. 
	 */
	return ((a >> 16) * (b >> 15));
}

# endif
-- 
Ken Turkowski @ Apple Computer, Inc., Cupertino, CA
Internet: turk@apple.com
Applelink: TURKOWSKI1
UUCP: sun!apple!turk