[comp.lang.c] Wanted: fast, low-precision trig functions for C

tcamp@dukeac.UUCP (Ted A. Campbell) (02/28/89)

I am writing a graphics-based program that has to calculate 
several thousand map coordinates.  The slowest part of this 
program by far is its trigonometric functions--it takes 
3 minutes and 45 seconds to calculate 3000 sets ofd coordinates
on an XT clone at 10 mhz (no coprocessor), which is the lowest 
resolution map I can make.  

I wondered if there were any fast, low-precision approaches 
to the standard trig functions, since the compilers I use
(Microsoft QuickC and AT&T Unix C) boast of their high 
level of precision?  

Any suggestions will be appreciated.  Thanks.  

Ted A. Campbell
Duke Divinity School 
{ethos, ecsgate}!dukeac!numen!tcamp

gwyn@smoke.BRL.MIL (Doug Gwyn ) (02/28/89)

In article <1256@dukeac.UUCP> tcamp@dukeac.UUCP (Ted A. Campbell) writes:
>I am writing a graphics-based program that has to calculate 
>several thousand map coordinates.  ...
>I wondered if there were any fast, low-precision approaches 
>to the standard trig functions, ...

The cost of the "extra precision" over a less precise but
otherwise similar approximation is typically small.

However, there may be hope -- I know of two circumstances
under which one can substantially speed up such functions:

(1)  If all coordinates will be taken from a grid, as when
dealing with graphic display devices, then several methods
are available for quickly returning answers that are as
precise as the grid resolution allows.  Unfortunately the
ones I know of are implemented as proprietary code.  If
you have access to sources for graphic devices, e.g. DMD,
check there for fast approximate trig functions.

(2)  For more or less arbitrary floating-point coordinates,
if there are only a manageable number of distinct function
arguments, then you may be able to "cache" them, either
precomputed or computed the first time a value is needed,
stashing the value for immediate use the next time the
same argument is supplied.  For this to work well, there
must be a fast method of mapping the argument into the
correct cache entry.

	(a)  If the arguments are all regularly spaced,
	for example integral number of degrees, then
	the cache can be an array of return values indexed
	by the integer number of spacing intervals.  E.g.
		double cosval[360] = { 1.0, 0.999848, ... };

	(b)  If the arguments are arbitrary within some
	domain, but the function is nicely-behaved, then
	you can obtain good enough accuracy by converting
	the argument to some number of intervals, as above,
	rounding to the nearest integral number, and proceed
	as above.  For example, the slope of cos() is no
	worse than +-1, so for 0.1% accuracy it would suffice
	to divide the range [0,Pi/2] into on the order of a
	thousand bins.

Once you start thinking along these lines, you can probably
spot several areas for code speed improvement.  The key is
that the cost to access the cache must be less than the cost
of whatever approximation method the standard library is
using.  If your machine has a "polynomial evaluate" instruction,
it might be hard to beat the standard library.

bill@twwells.uucp (T. William Wells) (02/28/89)

In article <1256@dukeac.UUCP> tcamp@dukeac.UUCP (Ted A. Campbell) writes:
: I wondered if there were any fast, low-precision approaches
: to the standard trig functions, since the compilers I use
: (Microsoft QuickC and AT&T Unix C) boast of their high
: level of precision?

Try table lookup and linear interpolation.

---
Bill
{ uunet | novavax } !twwells!bill

henry@utzoo.uucp (Henry Spencer) (03/01/89)

In article <1256@dukeac.UUCP> tcamp@dukeac.UUCP (Ted A. Campbell) writes:
>...I wondered if there were any fast, low-precision approaches 
>to the standard trig functions...

Consider table lookup, followed (if necessary) by linear interpolation
between adjacent table values (most functions look fairly linear on a
small scale).
-- 
The Earth is our mother;       |     Henry Spencer at U of Toronto Zoology
our nine months are up.        | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

tcamp@dukeac.UUCP (Ted A. Campbell) (03/09/89)

Thanks to all who responded to my query about high-speed, low
precision trig functions for use in graphics applications.  
I received numerous replies, along the following lines: 

(a) Most suggested the development of fast look-up tables, 
perhaps with some variation of interpolation between near values
to increase precision.  

(b) Some suggested more elaborate numerical algorithms which, 
alas, I don't understand.  

(c) Some suggest the purchase of a co-processor, which perhaps 
someday I shall do.  

The attached code is based on a suggestion for a lookup table and 
sample code sent to me from Fraser Orr at the Department of Computer
Science, University of Glasgow.  (If he is listening--THANKS--I doubt
that I could find a path to his machine.)  The code below allows 
me to use the functions vpt_sin(), vpt_cos(), and vpt_tan() at either
low precision (high speed) or high precision (low speed).  Note:

(a) The global variable vpt_level selects precision level:  if set to 1,
high-speed look-up tables are used; if set to any other values, the 
routines will convert to radians and utilize the normal sin(), cos(), 
and tan() functions.  

(b) The function vpt_init() must be called before any of these routines 
are used at level 1.  

(c) I have not utilized interpolation here.  If those who understand 
mathematics can give me a model for an interpolation function, 
we could incorporate it as a second level between 1 and the slower 
routines.  

(d) These routines have been tested with AT&T Unix C compiler (7300) 
and with the Microsoft QuickC (tm) compiler.  

Thanks again to all who responded.  

Ted A. Campbell
Duke Divinity School 
{ethos, ecsgate}!dukeac!numen!tcamp

/****************************************************************

	VPT.C		Variable Precision Trigonometric Functions

****************************************************************/

#include "math.h"

extern double vpt_sin(), vpt_cos(), vpt_tan();

#define	DEG_RAD	1.745329252e-2
#define	RAD_DEG	5.729577951e1

double 	sin_table[ 91 ];
int	vpt_level = 1;

/* #define TEST */

#ifdef	TEST

#include "stdio.h"

char test_buf[ 64 ];

main()
	{
	int cont;
	static double x;
	vpt_init();
	cont = 1;
	while ( cont == 1 )
		{
		printf( "Enter precision level (1 or 2 ): " );
		fflush( stdout );
		gets( test_buf );
		if ( test_buf[ 0 ] == 0 )
			{
			return 0;
			}
		sscanf( test_buf, "%d", &vpt_level );
		printf( "Enter a number: " );
		fflush( stdout );
		gets( test_buf );
		if ( test_buf[ 0 ] == 0 )
			{
			return 0;
			}
		sscanf( test_buf, "%lf", &x );
		printf( "sin = %f, cos = %f, tan = %f \n", 
			vpt_sin( x ),
			vpt_cos( x ), 
			vpt_tan( x ) );
		}
	}

#endif

vpt_init()
	{	
	int i;
	for ( i = 0; i < 91; i++ )
		{
		sin_table[ i ] = sin( (double) DEG_RAD * i );
		}
	}

double 
vpt_sin( i )
	double i;
	{
	int sign, target, work;
	switch( vpt_level )
		{
		case 1:
			work = i;
			while ( work < 0 )
				{
				work += 360;
				}
			work = work % 360;
			if ( ( work >= 0 ) && ( work < 90 ))
				{
				sign = 1;
				target = work;
				}
			else if ( ( work >= 90 ) && ( work < 180 ))
				{
				sign = 1;
				target = 90 - ( work % 90 );
				}
			else if ( ( work >= 180 ) && ( work < 270 ))
				{
				sign = -1;
				target = work % 90;
				}
			else if ( ( work >= 270 ) && ( work < 360 ))
				{
				sign = -1;
				target = 90 - ( work % 90 );
				}
			else
				{
				return 0;
				}
			return sign * sin_table[ target ];
		default:
			return sin( DEG_RAD * i );
		}
	}

double
vpt_cos( i )
	double i;
	{
	int sign, target, work;
	switch( vpt_level )
		{
		case 1:
			work = i;
			while ( work < 0 )
				{
				work += 360;
				}
			work = work % 360;
			if ( ( work >= 0 ) && ( work < 90 ))
				{
				sign   = 1;
				target = 90 - work;
				}
			else if ( ( work >= 90 ) && ( work < 180 ))
				{
				sign   = -1;
				target = work % 90;
				}
			else if ( ( work >= 180 ) && ( work < 270 ))
				{
				sign   = -1;
				target = 90 - ( work % 90 );
				}
			else if ( ( work >= 270 ) && ( work < 360 ))
				{
				sign   = 1;
				target = work % 90;
				}
			else
				{
				return 0;
				}
			return sign * sin_table[ target ];
		default: 
			return cos( DEG_RAD * i );
		}
	}

double
vpt_tan( i )
	double i;
	{
	double c;
	switch( vpt_level )
		{
		case 1:
			c = vpt_cos( i );
			if ( c == 0 )
				{
				return 0;
				}
			else
				{
				return vpt_sin( i ) / vpt_cos( i );
				}
		default: 
			return tan( DEG_RAD * i );
		}
	}

leo@philmds.UUCP (Leo de Wit) (03/11/89)

In article <1274@dukeac.UUCP> tcamp@dukeac.UUCP (Ted A. Campbell) writes:
|
|
|Thanks to all who responded to my query about high-speed, low
|precision trig functions for use in graphics applications.  
|I received numerous replies, along the following lines: 
    []
|(c) I have not utilized interpolation here.  If those who understand 
|mathematics can give me a model for an interpolation function, 
|we could incorporate it as a second level between 1 and the slower 
|routines.  

Simple. Given x values x0, x1 (x1 >= x0) and their corresponding y
values f(x0) = y0, f(x1) = y1, linear interpolation between these two
points just says: replace the curve f(x) by the straight line between
(x0,y0) and (x1,y1) for x values in the interval [x0,x1].

       l(x) - y0    y1 - y0
       --------- = ----------
       x - x0       x1 - x0

or

       l(x) = y0 + (x - x0) * (y1 - y0) / (x1 - x0)

       where l(x) is a linear approximation for f(x) on the interval [x0,x1].

Alternatively one can use linear approximation using the derivative (?)
of the functions at hand, since they are easy available: for sin(x) it
is cos(x) and for cos(x) it is -sin(x). This is what I used in the
example below. This is also called a Taylor series:

    f(x) = f(x0) + (x - x0) * f'(x0) + ....

|(d) These routines have been tested with AT&T Unix C compiler (7300) 
|and with the Microsoft QuickC (tm) compiler.  
|
|Thanks again to all who responded.  

I hope you don't mind me critisizing (part of) this code, any criticism
on my code would also be appreciated.

    []
|vpt_init()
|	{	
|	int i;
|	for ( i = 0; i < 91; i++ )
|		{
|		sin_table[ i ] = sin( (double) DEG_RAD * i );
|		}
|	}

Since we're talking approximations here, there is no need to calculate 
all sinuses for your table using sin(). The example I'll give shows how
it can be done.

|double 
|vpt_sin( i )
|	double i;
|	{
|	int sign, target, work;
|	switch( vpt_level )
|		{
|		case 1:
|			work = i;
|			while ( work < 0 )
|				{
|				work += 360;
|				}
|			work = work % 360;

This can take a loooooong time if work (i if you want) is a large
negative number. work == 3600000 for instance would take 10000
iterations. I would suggest in this case work = 360 - (-work % 360).
Note also that for initially negative numbers work = work % 360 is not
needed (because work will already be >= 0 and < 360). Another point
is that C float to int conversion does truncation instead of rounding
(and if I understand the dpANS correctly, is now no longer
implementation defined for negative numbers). So you should preferably
use work = i + 0.5 for positive i and work = i - 0.5 for negative i.

    []
|			else if ( ( work >= 90 ) && ( work < 180 ))
|				{
|				sign = 1;
|				target = 90 - ( work % 90 );
|				}

Since work is >= 90 and < 180, work % 90 will amount to work - 90, which
is perhaps a preferable operation (subtraction is a more basic operation
for many processors), especially since this leads to

				target = 180 - work;

There are many other examples where this could be used in the code.

|			return sign * sin_table[ target ];

I'd suggest you use a boolean for sign, and avoid doing a multiplication:

         return (sign) ? -sin_table[target] : sin_table[target];

And below is how I would do it (perhaps). sintab[] and costab[] I would
probably generate by a C program, thus avoiding having to do the
initialization. In this case however, init_tab() initiates the sin and 
cos tables, using the gonio rules:

	 sin(a+b) = sin(a) * cos(b) + cos(a) * sin(b);
	 cos(a+b) = cos(a) * cos(b) - sin(a) * sin(b);

Although these may lead to cumulative errors, these errors are quite small
in respect to any approximations.
The dsin() and dcos() macros are the simple, table-driven macros, the
macros dsin2() and dcos2() are a first order approximation (the error in
them is typically a factor 100 smaller).
Perhaps dsin2() and dcos2() should better be functions, thus avoiding to
recalculate x % 360.
Be warned! dsin(), dcos, dsin2(), and dcos2() all use their argument
several times, so avoid using side effects or (complex) calculations,
or implement them as functions if you must.

--------- c u t   h e r e ----------
#include <math.h>

/* One degree in radians */
#define ONE_DEG (M_PI / 180)

/* sin, cos for pos. x and neg. x (x in degrees), to be used by the macros
 * that follow.
 */
#define dsinp(x) sintab[(int)((x) + 0.5) % 180]
#define dsinn(x) -sintab[(int)(-(x) + 0.5) % 180]
#define dcosp(x) costab[(int)((x) + 0.5) % 180]
#define dcosn(x) costab[(int)(-(x) + 0.5) % 180]

/* Zero order approximation: simple table lookup (x in degrees) */
#define dsin(x) (((x) >= 0) ? dsinp(x) : dsinn(x))
#define dcos(x) (((x) >= 0) ? dcosp(x) : dcosn(x))

/* First order approximation (x in degrees) */
#define dsin2(x) (((x) >= 0) \
 ? (dsinp(x) + ONE_DEG * ((x) - (int)((x) + 0.5)) * dcosp(x)) \
 : (dsinn(x) + ONE_DEG * ((x) - (int)((x) - 0.5)) * dcosn(x)))
#define dcos2(x) (((x) >= 0) \
 ? (dcosp(x) + ONE_DEG * ((x) - (int)((x) + 0.5)) * dsinp(x)) \
 : (dcosn(x) - ONE_DEG * ((x) - (int)((x) - 0.5)) * dsinn(x)))

static double sintab[180], costab[180];

static void init_tab(), demo();

main()
{
    int i;

    init_tab();
    demo();
}

/* Shows abberation for dsin(), dcos(), dsin2() and dcos2() in the interval
 * [-56,-55]
 */
static void demo()
{
    double d;

    for (d = -55.; d >= -56.; d = d - 0.1) {
        printf("%g: %g %g\n", d,sin(d * ONE_DEG) - dsin(d),
                                cos(d * ONE_DEG) - dcos(d));
        printf("%g: %g %g\n", d,sin(d * ONE_DEG) - dsin2(d),
                                cos(d * ONE_DEG) - dcos2(d));
    }
}

/* This builds sintab,costab with errors of magnitude e-16, which can
 * be ignored without problem (the approximations introduce much
 * larger errors).
 */
static void init_tab()
{
    int i;
    double sin1 = sin(ONE_DEG), cos1 = cos(ONE_DEG);

    sintab[0] = 0.;
    costab[0] = 1.;

    for (i = 1; i <= 45; i++) {
        sintab[i]       = sintab[180 - i] = costab[90 - i]
                        = cos1 * sintab[i - 1] + sin1 * costab[i - 1];
        costab[90 + i]  = -sintab[i];
        costab[i]       = sintab[90 + i] = sintab[90 - i]
                        = cos1 * costab[i - 1] - sin1 * sintab[i - 1];
        costab[180 - i] = -costab[i];
    }
}
------- e n d s   h e r e ----------

	 Leo.

B.T.W. News had objections against some of the newsgroups in the list,
so I had to strip it a bit to get it out. Well ...
(Newsgroups was: misc.wanted,comp.lang.c,triangle.general,micro.general,triangle.graphics,micro.ibm)

leo@philmds.UUCP (Leo de Wit) (03/12/89)

Sorry for the followup, but my original posting contained an error with
respect to the sin / cos periods. The following one should be ok.

And below is how I would do it (perhaps). sintab[] and costab[] I would
probably generate by a C program, thus avoiding having to do the
initialization. In this case however, init_tab() initiates the sin and 
cos tables, using the gonio rules:

	 sin(a+b) = sin(a) * cos(b) + cos(a) * sin(b);
	 cos(a+b) = cos(a) * cos(b) - sin(a) * sin(b);

Although these may lead to cumulative errors, these errors are quite small
in respect to the approximations used.
The dsin() and dcos() macros are the simple, table-driven macros, the
macros dsin2() and dcos2() are a first order approximation (the error in
them is typically a factor 100 smaller).
Perhaps dsin2() and dcos2() should better be functions, thus avoiding to
recalculate x % 360.
Be warned! dsin(), dcos, dsin2(), and dcos2() all use their argument
several times, so avoid using side effects or (complex) calculations,
or implement them as functions if you must.

--------- c u t   h e r e ----------
#include <math.h>

/* One degree in radians */
#define ONE_DEG (M_PI / 180)

/* sin, cos for pos. x and neg. x (x in degrees), to be used by the macros
 * that follow.
 */
#define dsinp(x) sintab[(int)((x) + 0.5) % 360]
#define dsinn(x) -sintab[(int)(-(x) + 0.5) % 360]
#define dcosp(x) costab[(int)((x) + 0.5) % 360]
#define dcosn(x) costab[(int)(-(x) + 0.5) % 360]

/* Zero order approximation: simple table lookup (x in degrees) */
#define dsin(x) (((x) >= 0) ? dsinp(x) : dsinn(x))
#define dcos(x) (((x) >= 0) ? dcosp(x) : dcosn(x))

/* First order approximation (x in degrees) */
#define dsin2(x) (((x) >= 0) \
 ? (dsinp(x) + ONE_DEG * ((x) - (int)((x) + 0.5)) * dcosp(x)) \
 : (dsinn(x) + ONE_DEG * ((x) - (int)((x) - 0.5)) * dcosn(x)))
#define dcos2(x) (((x) >= 0) \
 ? (dcosp(x) + ONE_DEG * ((x) - (int)((x) + 0.5)) * dsinp(x)) \
 : (dcosn(x) - ONE_DEG * ((x) - (int)((x) - 0.5)) * dsinn(x)))

static double sintab[360], costab[360];

static void init_tab(), demo();

main()
{
    init_tab();
    demo();
}

/* Shows abberation for dsin(), dcos(), dsin2() and dcos2() in the interval
 * [-56,-55]
 */
static void demo()
{
    double d;

    for (d = -55.; d >= -56.; d = d - 0.1) {
        printf("%g: %g %g\n", d,sin(d * ONE_DEG) - dsin(d),
                                cos(d * ONE_DEG) - dcos(d));
        printf("%g: %g %g\n", d,sin(d * ONE_DEG) - dsin2(d),
                                cos(d * ONE_DEG) - dcos2(d));
    }
}

/* This builds sintab,costab with errors of size 10^-16, which can
 * be ignored without problem (the approximations introduce much
 * larger errors).
 */
static void init_tab()
{
    int i;
    double sin1 = sin(ONE_DEG), cos1 = cos(ONE_DEG);

    sintab[0] = sintab[180] = costab[90] = costab[270] = 0.;
    costab[0] = sintab[90] = 1.;
    costab[180] = sintab[270] = -1.;

    for (i = 1; i <= 45; i++) {
        sintab[i]       = sintab[180 - i]
                        = costab[90 - i]
                        = costab[270 + i]
                        = cos1 * sintab[i - 1] + sin1 * costab[i - 1];
        sintab[180 + i] = sintab[360 - i]
                        = costab[90 + i]
                        = costab[270 - i]
                        = -sintab[i];
        costab[i]       = costab[360 - i]
                        = sintab[90 + i]
                        = sintab[90 - i]
                        = cos1 * costab[i - 1] - sin1 * sintab[i - 1];
        costab[180 + i] = costab[180 - i]
                        = sintab[270 - i]
                        = sintab[270 + i]
                        = -costab[i];
    }
}
------- e n d s   h e r e ----------

	 Leo.