tcamp@dukeac.UUCP (Ted A. Campbell) (02/28/89)
I am writing a graphics-based program that has to calculate several thousand map coordinates. The slowest part of this program by far is its trigonometric functions--it takes 3 minutes and 45 seconds to calculate 3000 sets ofd coordinates on an XT clone at 10 mhz (no coprocessor), which is the lowest resolution map I can make. I wondered if there were any fast, low-precision approaches to the standard trig functions, since the compilers I use (Microsoft QuickC and AT&T Unix C) boast of their high level of precision? Any suggestions will be appreciated. Thanks. Ted A. Campbell Duke Divinity School {ethos, ecsgate}!dukeac!numen!tcamp
gwyn@smoke.BRL.MIL (Doug Gwyn ) (02/28/89)
In article <1256@dukeac.UUCP> tcamp@dukeac.UUCP (Ted A. Campbell) writes: >I am writing a graphics-based program that has to calculate >several thousand map coordinates. ... >I wondered if there were any fast, low-precision approaches >to the standard trig functions, ... The cost of the "extra precision" over a less precise but otherwise similar approximation is typically small. However, there may be hope -- I know of two circumstances under which one can substantially speed up such functions: (1) If all coordinates will be taken from a grid, as when dealing with graphic display devices, then several methods are available for quickly returning answers that are as precise as the grid resolution allows. Unfortunately the ones I know of are implemented as proprietary code. If you have access to sources for graphic devices, e.g. DMD, check there for fast approximate trig functions. (2) For more or less arbitrary floating-point coordinates, if there are only a manageable number of distinct function arguments, then you may be able to "cache" them, either precomputed or computed the first time a value is needed, stashing the value for immediate use the next time the same argument is supplied. For this to work well, there must be a fast method of mapping the argument into the correct cache entry. (a) If the arguments are all regularly spaced, for example integral number of degrees, then the cache can be an array of return values indexed by the integer number of spacing intervals. E.g. double cosval[360] = { 1.0, 0.999848, ... }; (b) If the arguments are arbitrary within some domain, but the function is nicely-behaved, then you can obtain good enough accuracy by converting the argument to some number of intervals, as above, rounding to the nearest integral number, and proceed as above. For example, the slope of cos() is no worse than +-1, so for 0.1% accuracy it would suffice to divide the range [0,Pi/2] into on the order of a thousand bins. Once you start thinking along these lines, you can probably spot several areas for code speed improvement. The key is that the cost to access the cache must be less than the cost of whatever approximation method the standard library is using. If your machine has a "polynomial evaluate" instruction, it might be hard to beat the standard library.
bill@twwells.uucp (T. William Wells) (02/28/89)
In article <1256@dukeac.UUCP> tcamp@dukeac.UUCP (Ted A. Campbell) writes:
: I wondered if there were any fast, low-precision approaches
: to the standard trig functions, since the compilers I use
: (Microsoft QuickC and AT&T Unix C) boast of their high
: level of precision?
Try table lookup and linear interpolation.
---
Bill
{ uunet | novavax } !twwells!bill
henry@utzoo.uucp (Henry Spencer) (03/01/89)
In article <1256@dukeac.UUCP> tcamp@dukeac.UUCP (Ted A. Campbell) writes: >...I wondered if there were any fast, low-precision approaches >to the standard trig functions... Consider table lookup, followed (if necessary) by linear interpolation between adjacent table values (most functions look fairly linear on a small scale). -- The Earth is our mother; | Henry Spencer at U of Toronto Zoology our nine months are up. | uunet!attcan!utzoo!henry henry@zoo.toronto.edu
tcamp@dukeac.UUCP (Ted A. Campbell) (03/09/89)
Thanks to all who responded to my query about high-speed, low precision trig functions for use in graphics applications. I received numerous replies, along the following lines: (a) Most suggested the development of fast look-up tables, perhaps with some variation of interpolation between near values to increase precision. (b) Some suggested more elaborate numerical algorithms which, alas, I don't understand. (c) Some suggest the purchase of a co-processor, which perhaps someday I shall do. The attached code is based on a suggestion for a lookup table and sample code sent to me from Fraser Orr at the Department of Computer Science, University of Glasgow. (If he is listening--THANKS--I doubt that I could find a path to his machine.) The code below allows me to use the functions vpt_sin(), vpt_cos(), and vpt_tan() at either low precision (high speed) or high precision (low speed). Note: (a) The global variable vpt_level selects precision level: if set to 1, high-speed look-up tables are used; if set to any other values, the routines will convert to radians and utilize the normal sin(), cos(), and tan() functions. (b) The function vpt_init() must be called before any of these routines are used at level 1. (c) I have not utilized interpolation here. If those who understand mathematics can give me a model for an interpolation function, we could incorporate it as a second level between 1 and the slower routines. (d) These routines have been tested with AT&T Unix C compiler (7300) and with the Microsoft QuickC (tm) compiler. Thanks again to all who responded. Ted A. Campbell Duke Divinity School {ethos, ecsgate}!dukeac!numen!tcamp /**************************************************************** VPT.C Variable Precision Trigonometric Functions ****************************************************************/ #include "math.h" extern double vpt_sin(), vpt_cos(), vpt_tan(); #define DEG_RAD 1.745329252e-2 #define RAD_DEG 5.729577951e1 double sin_table[ 91 ]; int vpt_level = 1; /* #define TEST */ #ifdef TEST #include "stdio.h" char test_buf[ 64 ]; main() { int cont; static double x; vpt_init(); cont = 1; while ( cont == 1 ) { printf( "Enter precision level (1 or 2 ): " ); fflush( stdout ); gets( test_buf ); if ( test_buf[ 0 ] == 0 ) { return 0; } sscanf( test_buf, "%d", &vpt_level ); printf( "Enter a number: " ); fflush( stdout ); gets( test_buf ); if ( test_buf[ 0 ] == 0 ) { return 0; } sscanf( test_buf, "%lf", &x ); printf( "sin = %f, cos = %f, tan = %f \n", vpt_sin( x ), vpt_cos( x ), vpt_tan( x ) ); } } #endif vpt_init() { int i; for ( i = 0; i < 91; i++ ) { sin_table[ i ] = sin( (double) DEG_RAD * i ); } } double vpt_sin( i ) double i; { int sign, target, work; switch( vpt_level ) { case 1: work = i; while ( work < 0 ) { work += 360; } work = work % 360; if ( ( work >= 0 ) && ( work < 90 )) { sign = 1; target = work; } else if ( ( work >= 90 ) && ( work < 180 )) { sign = 1; target = 90 - ( work % 90 ); } else if ( ( work >= 180 ) && ( work < 270 )) { sign = -1; target = work % 90; } else if ( ( work >= 270 ) && ( work < 360 )) { sign = -1; target = 90 - ( work % 90 ); } else { return 0; } return sign * sin_table[ target ]; default: return sin( DEG_RAD * i ); } } double vpt_cos( i ) double i; { int sign, target, work; switch( vpt_level ) { case 1: work = i; while ( work < 0 ) { work += 360; } work = work % 360; if ( ( work >= 0 ) && ( work < 90 )) { sign = 1; target = 90 - work; } else if ( ( work >= 90 ) && ( work < 180 )) { sign = -1; target = work % 90; } else if ( ( work >= 180 ) && ( work < 270 )) { sign = -1; target = 90 - ( work % 90 ); } else if ( ( work >= 270 ) && ( work < 360 )) { sign = 1; target = work % 90; } else { return 0; } return sign * sin_table[ target ]; default: return cos( DEG_RAD * i ); } } double vpt_tan( i ) double i; { double c; switch( vpt_level ) { case 1: c = vpt_cos( i ); if ( c == 0 ) { return 0; } else { return vpt_sin( i ) / vpt_cos( i ); } default: return tan( DEG_RAD * i ); } }
leo@philmds.UUCP (Leo de Wit) (03/11/89)
In article <1274@dukeac.UUCP> tcamp@dukeac.UUCP (Ted A. Campbell) writes: | | |Thanks to all who responded to my query about high-speed, low |precision trig functions for use in graphics applications. |I received numerous replies, along the following lines: [] |(c) I have not utilized interpolation here. If those who understand |mathematics can give me a model for an interpolation function, |we could incorporate it as a second level between 1 and the slower |routines. Simple. Given x values x0, x1 (x1 >= x0) and their corresponding y values f(x0) = y0, f(x1) = y1, linear interpolation between these two points just says: replace the curve f(x) by the straight line between (x0,y0) and (x1,y1) for x values in the interval [x0,x1]. l(x) - y0 y1 - y0 --------- = ---------- x - x0 x1 - x0 or l(x) = y0 + (x - x0) * (y1 - y0) / (x1 - x0) where l(x) is a linear approximation for f(x) on the interval [x0,x1]. Alternatively one can use linear approximation using the derivative (?) of the functions at hand, since they are easy available: for sin(x) it is cos(x) and for cos(x) it is -sin(x). This is what I used in the example below. This is also called a Taylor series: f(x) = f(x0) + (x - x0) * f'(x0) + .... |(d) These routines have been tested with AT&T Unix C compiler (7300) |and with the Microsoft QuickC (tm) compiler. | |Thanks again to all who responded. I hope you don't mind me critisizing (part of) this code, any criticism on my code would also be appreciated. [] |vpt_init() | { | int i; | for ( i = 0; i < 91; i++ ) | { | sin_table[ i ] = sin( (double) DEG_RAD * i ); | } | } Since we're talking approximations here, there is no need to calculate all sinuses for your table using sin(). The example I'll give shows how it can be done. |double |vpt_sin( i ) | double i; | { | int sign, target, work; | switch( vpt_level ) | { | case 1: | work = i; | while ( work < 0 ) | { | work += 360; | } | work = work % 360; This can take a loooooong time if work (i if you want) is a large negative number. work == 3600000 for instance would take 10000 iterations. I would suggest in this case work = 360 - (-work % 360). Note also that for initially negative numbers work = work % 360 is not needed (because work will already be >= 0 and < 360). Another point is that C float to int conversion does truncation instead of rounding (and if I understand the dpANS correctly, is now no longer implementation defined for negative numbers). So you should preferably use work = i + 0.5 for positive i and work = i - 0.5 for negative i. [] | else if ( ( work >= 90 ) && ( work < 180 )) | { | sign = 1; | target = 90 - ( work % 90 ); | } Since work is >= 90 and < 180, work % 90 will amount to work - 90, which is perhaps a preferable operation (subtraction is a more basic operation for many processors), especially since this leads to target = 180 - work; There are many other examples where this could be used in the code. | return sign * sin_table[ target ]; I'd suggest you use a boolean for sign, and avoid doing a multiplication: return (sign) ? -sin_table[target] : sin_table[target]; And below is how I would do it (perhaps). sintab[] and costab[] I would probably generate by a C program, thus avoiding having to do the initialization. In this case however, init_tab() initiates the sin and cos tables, using the gonio rules: sin(a+b) = sin(a) * cos(b) + cos(a) * sin(b); cos(a+b) = cos(a) * cos(b) - sin(a) * sin(b); Although these may lead to cumulative errors, these errors are quite small in respect to any approximations. The dsin() and dcos() macros are the simple, table-driven macros, the macros dsin2() and dcos2() are a first order approximation (the error in them is typically a factor 100 smaller). Perhaps dsin2() and dcos2() should better be functions, thus avoiding to recalculate x % 360. Be warned! dsin(), dcos, dsin2(), and dcos2() all use their argument several times, so avoid using side effects or (complex) calculations, or implement them as functions if you must. --------- c u t h e r e ---------- #include <math.h> /* One degree in radians */ #define ONE_DEG (M_PI / 180) /* sin, cos for pos. x and neg. x (x in degrees), to be used by the macros * that follow. */ #define dsinp(x) sintab[(int)((x) + 0.5) % 180] #define dsinn(x) -sintab[(int)(-(x) + 0.5) % 180] #define dcosp(x) costab[(int)((x) + 0.5) % 180] #define dcosn(x) costab[(int)(-(x) + 0.5) % 180] /* Zero order approximation: simple table lookup (x in degrees) */ #define dsin(x) (((x) >= 0) ? dsinp(x) : dsinn(x)) #define dcos(x) (((x) >= 0) ? dcosp(x) : dcosn(x)) /* First order approximation (x in degrees) */ #define dsin2(x) (((x) >= 0) \ ? (dsinp(x) + ONE_DEG * ((x) - (int)((x) + 0.5)) * dcosp(x)) \ : (dsinn(x) + ONE_DEG * ((x) - (int)((x) - 0.5)) * dcosn(x))) #define dcos2(x) (((x) >= 0) \ ? (dcosp(x) + ONE_DEG * ((x) - (int)((x) + 0.5)) * dsinp(x)) \ : (dcosn(x) - ONE_DEG * ((x) - (int)((x) - 0.5)) * dsinn(x))) static double sintab[180], costab[180]; static void init_tab(), demo(); main() { int i; init_tab(); demo(); } /* Shows abberation for dsin(), dcos(), dsin2() and dcos2() in the interval * [-56,-55] */ static void demo() { double d; for (d = -55.; d >= -56.; d = d - 0.1) { printf("%g: %g %g\n", d,sin(d * ONE_DEG) - dsin(d), cos(d * ONE_DEG) - dcos(d)); printf("%g: %g %g\n", d,sin(d * ONE_DEG) - dsin2(d), cos(d * ONE_DEG) - dcos2(d)); } } /* This builds sintab,costab with errors of magnitude e-16, which can * be ignored without problem (the approximations introduce much * larger errors). */ static void init_tab() { int i; double sin1 = sin(ONE_DEG), cos1 = cos(ONE_DEG); sintab[0] = 0.; costab[0] = 1.; for (i = 1; i <= 45; i++) { sintab[i] = sintab[180 - i] = costab[90 - i] = cos1 * sintab[i - 1] + sin1 * costab[i - 1]; costab[90 + i] = -sintab[i]; costab[i] = sintab[90 + i] = sintab[90 - i] = cos1 * costab[i - 1] - sin1 * sintab[i - 1]; costab[180 - i] = -costab[i]; } } ------- e n d s h e r e ---------- Leo. B.T.W. News had objections against some of the newsgroups in the list, so I had to strip it a bit to get it out. Well ... (Newsgroups was: misc.wanted,comp.lang.c,triangle.general,micro.general,triangle.graphics,micro.ibm)
leo@philmds.UUCP (Leo de Wit) (03/12/89)
Sorry for the followup, but my original posting contained an error with respect to the sin / cos periods. The following one should be ok. And below is how I would do it (perhaps). sintab[] and costab[] I would probably generate by a C program, thus avoiding having to do the initialization. In this case however, init_tab() initiates the sin and cos tables, using the gonio rules: sin(a+b) = sin(a) * cos(b) + cos(a) * sin(b); cos(a+b) = cos(a) * cos(b) - sin(a) * sin(b); Although these may lead to cumulative errors, these errors are quite small in respect to the approximations used. The dsin() and dcos() macros are the simple, table-driven macros, the macros dsin2() and dcos2() are a first order approximation (the error in them is typically a factor 100 smaller). Perhaps dsin2() and dcos2() should better be functions, thus avoiding to recalculate x % 360. Be warned! dsin(), dcos, dsin2(), and dcos2() all use their argument several times, so avoid using side effects or (complex) calculations, or implement them as functions if you must. --------- c u t h e r e ---------- #include <math.h> /* One degree in radians */ #define ONE_DEG (M_PI / 180) /* sin, cos for pos. x and neg. x (x in degrees), to be used by the macros * that follow. */ #define dsinp(x) sintab[(int)((x) + 0.5) % 360] #define dsinn(x) -sintab[(int)(-(x) + 0.5) % 360] #define dcosp(x) costab[(int)((x) + 0.5) % 360] #define dcosn(x) costab[(int)(-(x) + 0.5) % 360] /* Zero order approximation: simple table lookup (x in degrees) */ #define dsin(x) (((x) >= 0) ? dsinp(x) : dsinn(x)) #define dcos(x) (((x) >= 0) ? dcosp(x) : dcosn(x)) /* First order approximation (x in degrees) */ #define dsin2(x) (((x) >= 0) \ ? (dsinp(x) + ONE_DEG * ((x) - (int)((x) + 0.5)) * dcosp(x)) \ : (dsinn(x) + ONE_DEG * ((x) - (int)((x) - 0.5)) * dcosn(x))) #define dcos2(x) (((x) >= 0) \ ? (dcosp(x) + ONE_DEG * ((x) - (int)((x) + 0.5)) * dsinp(x)) \ : (dcosn(x) - ONE_DEG * ((x) - (int)((x) - 0.5)) * dsinn(x))) static double sintab[360], costab[360]; static void init_tab(), demo(); main() { init_tab(); demo(); } /* Shows abberation for dsin(), dcos(), dsin2() and dcos2() in the interval * [-56,-55] */ static void demo() { double d; for (d = -55.; d >= -56.; d = d - 0.1) { printf("%g: %g %g\n", d,sin(d * ONE_DEG) - dsin(d), cos(d * ONE_DEG) - dcos(d)); printf("%g: %g %g\n", d,sin(d * ONE_DEG) - dsin2(d), cos(d * ONE_DEG) - dcos2(d)); } } /* This builds sintab,costab with errors of size 10^-16, which can * be ignored without problem (the approximations introduce much * larger errors). */ static void init_tab() { int i; double sin1 = sin(ONE_DEG), cos1 = cos(ONE_DEG); sintab[0] = sintab[180] = costab[90] = costab[270] = 0.; costab[0] = sintab[90] = 1.; costab[180] = sintab[270] = -1.; for (i = 1; i <= 45; i++) { sintab[i] = sintab[180 - i] = costab[90 - i] = costab[270 + i] = cos1 * sintab[i - 1] + sin1 * costab[i - 1]; sintab[180 + i] = sintab[360 - i] = costab[90 + i] = costab[270 - i] = -sintab[i]; costab[i] = costab[360 - i] = sintab[90 + i] = sintab[90 - i] = cos1 * costab[i - 1] - sin1 * sintab[i - 1]; costab[180 + i] = costab[180 - i] = sintab[270 - i] = sintab[270 + i] = -costab[i]; } } ------- e n d s h e r e ---------- Leo.