gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (11/11/84)
As long as we're proposing our favorite extensions to C (not that they should be in the forthcoming ANSI standard, but perhaps the standard should anticipate that these extensions might be made some day): I find that frequently I need both the quotient and remainder of two integers: a / b a % b Since most (maybe all) machine architectures compute these at the same time, it sure would be nice to be able to get both results rather than wastefully reexecuting precisely the same instructions a second time. This becomes particularly bothersome when "a" & "b" are fairly complicated expressions. I have no idea what a good syntax for such an operation would be. It would also be nice if sin( x ) and cos( x ) could be computed simultaneously with reduced cost. I doubt if this is possible but would like to know if it is.
guido@mcvax.UUCP (Guido van Rossum) (11/12/84)
>It would also be nice if sin( x ) and cos( x ) could be computed >simultaneously with reduced cost. I doubt if this is possible >but would like to know if it is. Can't you use (sin x)**2 + (cos x)**2 = 1 ? This suggests that abs cos x = root(1 - (sin x)**2); surely something can be said about the sign given the range x is in. This assumes computing a square root is faster than computing a sine. This is as close as you can g; the type polynomial approximations used for computing sine/cosine are not symmetrical around pi/4. (I doubt that they are symmetrical around any point, since their precision is only relevant between 0 and pi/2.) Anyway, I wouldn't want to add such a function to the standard library. A specialized numerical library could provide one, however. BTW, does anybody know whether the NAG library of numerical algorithms has been translated into C? This surely would make a great standard! Thesis: a language standard should not incorporate standards for numerical functions. It should only provide the raw material to allow a numerical library to be written. (I believe the ANSI Committee is well aware of this, and only defines the most basic numerical functions. If only compiler writers would provide decent implementations of these... (and of printf %e/%f/%g).) -- Guido van Rossum, "Stamp Out BASIC" Committee, CWI, Amsterdam, Holland guido@mcvax.UUCP "Life is like a sewer. What you get out of it, depends on what you put into it."
quiroz@rochester.UUCP (Cesar Quiroz) (11/12/84)
>(from a recent posting :) > I find that frequently I need both the quotient and remainder of > two integers: a / b a % b > Since most (maybe all) machine architectures compute these at the > same time, it sure would be nice to be able to get both results > rather than wastefully reexecuting precisely the same instructions > a second time. This becomes particularly bothersome when "a" & "b" > are fairly complicated expressions. I have no idea what a good > syntax for such an operation would be. > > It would also be nice if sin( x ) and cos( x ) could be computed > simultaneously with reduced cost. I doubt if this is possible > but would like to know if it is. Both are reasonable expectations, but seem to have a place in a standard library rather than in the syntax of the language. The sin/cos simultaneous computing is something you can program in C completely. Getting both quotient and remainder at the same time cannot (at least not without incurring a procedure call, which denies the advantage) and you may have a point if you expect the compiler to recognize something like: intdiv(a,b,&q,&r); and expand it in line. However, it may be too much additional overhead for the gains. In such a case, you may feel forced to write assem... (no I didn't mean to type that!). Cesar
alan@apollo.uucp (Alan Lehotsky) (11/14/84)
Regarding the suggestion of being able to get SIN and COS or (DIV and REM) at the same time - all you need is a good compiler! I believe that the VAX/VMS FORTRAN compiler looks for computations of SIN(X) and COS(X) and calls a special entrypoint into the run-time-library which returns BOTH values at once. Similarly, I designed (but never implemented) a modification to the BLISS compiler which would have done the same thing for DIV and REM - I don't know if this was actually finished after I left DEC. By the way, I'm unaware of the existance of any "good" C compilers. [ A "good C compiler" is definitely an oxymoron!] << Put down that flamethrower!!>> Al Lehotsky
andrew@orca.UUCP (Andrew Klossner) (11/14/84)
"It would also be nice if sin( x ) and cos( x ) could be computed simultaneously with reduced cost. I doubt if this is possible but would like to know if it is." Sure: sine = sin(x); cosine = sqrt(1.0 - sine*sine); Sqrt() is lots cheaper than cos(). :-) -- Andrew Klossner (decvax!tektronix!orca!andrew) [UUCP] (orca!andrew.tektronix@csnet-relay) [ARPA]
geoff@desint.UUCP (Geoff Kuenning) (11/15/84)
In article <5715@brl-tgr.ARPA> Doug Gwyn <gwyn@brl-tgr.ARPA> writes: >I find that frequently I need both the quotient and remainder of >two integers: a / b a % b >Since most (maybe all) machine architectures compute these at the >same time, it sure would be nice to be able to get both results >rather than wastefully reexecuting precisely the same instructions >a second time. This becomes particularly bothersome when "a" & "b" >are fairly complicated expressions. I have no idea what a good >syntax for such an operation would be. This is the business of the optimizer, not the programmer. Even a peephole optimizer should be able to handle: x = a / b; y = a % b; and any optimizer that can handle common subexpressions can take care of this even when a and b are complex expressions. C already has too many "features" that are basically ways for the programmer to compensate for poor optimizers. Let's move into the 20th century here. >It would also be nice if sin( x ) and cos( x ) could be computed >simultaneously with reduced cost. I doubt if this is possible >but would like to know if it is. It is. The obvious way is to make use of the identity sin (x) == sqrt (1 - cos (x) * cos (x)) which can be computed slightly faster than sin(x) on some architectures. I think that there are also numerical algorithms that generate both functions at once, though I am out of my field here. I know I have run into a routine (in the Evans and Sutherland Picture System library?) named 'sincos', which returned both values at once for use in rotation calculations, but it may have been fixed-point. -- Geoff Kuenning First Systems Corporation ...!ihnp4!trwrb!desint!geoff
gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (11/16/84)
>> It would also be nice if sin( x ) and cos( x ) could be >> computed simultaneously with reduced cost. I doubt if this is >> possible but would like to know if it is. > sine = sin(x); > cosine = sqrt(1.0 - sine*sine); I wish people would check things out before posting them. On our 4.2BSD system, the sqrt() approach is actually SLOWER than using cos(). (Under UNIX System V it is very slightly faster.) Now, if I wanted cos(x)^2, I would certainly use 1 - sin(x)^2, which I already knew about (I too went to high school). What I had in mind was more along the lines of a CORDIC algorithm or some other obscure but useful approach.
john@x.UUCP (John Woods) (11/16/84)
>Regarding the suggestion of being able to get SIN and COS or (DIV and REM) at >the same time - all you need is a good compiler! > > By the way, I'm unaware of the existance of any "good" C compilers. > [ A "good C compiler" is definitely an oxymoron!] When I worked at Lincoln Labs, the performance of the UNIX (4.1) f77 was compared against the VMS FORTRAN, in (I believe) an application for digital signal processing. VMS FORTRAN came out (as I recall) as twice as fast as the f77 compiler (resulting code, not speed of compilation). A 4.1BSD C version of the same program was 10% faster than the VMS FORTRAN. The result is left to the reader as an exercise. :-) -- John Woods, Charles River Data Systems, Framingham MA, (617) 626-1114 ...!decvax!frog!john, ...!mit-eddie!jfw, jfw%mit-ccc@MIT-XX.ARPA If your puppy goes off in the next room, is it because of the explosive charge? [y][n]
john@x.UUCP (John Woods) (11/16/84)
> > "It would also be nice if sin( x ) and cos( x ) could be > computed simultaneously with reduced cost. I doubt if this is > possible but would like to know if it is." > On a more serious note than my last message, how about: circular( value, sinptr, cosptr) double value, *sinptr, *cosptr; { /* compute both and store, saving function overhead and sharing some commonality of code */ } It's easy. Buggering the compiler to do it solves *one* case fast, at the expense of still having garbage code. Reminds me of compiler writers who detect the famous FORTRAN "Whetstone" benchmark and output hand-written assembly code for the whole thing, so as to achieve artificially high scores.. -- John Woods, Charles River Data Systems, Framingham MA, (617) 626-1114 ...!decvax!frog!john, ...!mit-eddie!jfw, jfw%mit-ccc@MIT-XX.ARPA If your puppy goes off in the next room, is it because of the explosive charge? [y][n]
ken@turtlevax.UUCP (Ken Turkowski) (11/16/84)
> >It would also be nice if sin( x ) and cos( x ) could be computed > >simultaneously with reduced cost. I doubt if this is possible > >but would like to know if it is. > > It is. The obvious way is to make use of the identity > > sin (x) == sqrt (1 - cos (x) * cos (x)) > > which can be computed slightly faster than sin(x) on some architectures. > I think that there are also numerical algorithms that generate both > functions at once, though I am out of my field here. I know I have run into > a routine (in the Evans and Sutherland Picture System library?) named > 'sincos', which returned both values at once for use in rotation > calculations, but it may have been fixed-point. > > Geoff Kuenning There is a class of algorithms, called CORDIC (for COordinate Rotation DIgital Computer), which calculates sine and cosine simultaneously with linear convergence in fixed point with no multiplications. There are one test, two shifts, and three adds per bit of precision. It is very easily written entirely in C (except for frmul(), a simple fractional multiplication). The original paper is: Volder, Jack E. "The CORDIC Trigonometric Computing Technique", IRE Trans. on Electronic Computers, Sept. 1959, pp.330-334 Extension to other functions (multiplication, division, sin, cos, tan, arctan, sinh, cosh, tanh, arctanh, ln, exp, and square root) is covered in: Walther, J.S., "A Unified Algorithm for Elementary Functions", Proc. AFIPS 1971 Spring Joint Computer Conference, pp. 379-385 Theory of CORDIC generalizations is discussed in: Chen, Tien Chi, "Automatic Computation of Exponentials, Logarithms, Ratios, and Sqaure Roots", IBM J. Res. Develop., July 1972, pp. 380-388 -- Ken Turkowski @ CADLINC, Palo Alto, CA UUCP: {amd,decwrl,flairvax,nsc}!turtlevax!ken ARPA: turtlevax!ken@DECWRL.ARPA
david@imd.UUCP (11/18/84)
> sine = sin(x); > cosine = sqrt(1.0 - sine*sine); >Sqrt() is lots cheaper than cos(). But is it the positive or negative root of the sqrt? By the time that was figured out, one probably could have called cos().
ken@turtlevax.UUCP (Ken Turkowski) (11/18/84)
> >> It would also be nice if sin( x ) and cos( x ) could be > >> computed simultaneously with reduced cost. I doubt if this is > >> possible but would like to know if it is. > > > sine = sin(x); > > cosine = sqrt(1.0 - sine*sine); > > I wish people would check things out before posting them. On our > 4.2BSD system, the sqrt() approach is actually SLOWER than using > cos(). (Under UNIX System V it is very slightly faster.) Now, > if I wanted cos(x)^2, I would certainly use 1 - sin(x)^2, which > I already knew about (I too went to high school). > > What I had in mind was more along the lines of a CORDIC algorithm > or some other obscure but useful approach. In Tom Duff's SIGGRAPH '84 tutorial on Numerical Methods for Computer Graphics, he writes (quote until end of article): Calculating Euclidean distance probably accounts for 90% of the square roots expended in computer graphics applications. Aside from the obvious geometric uses, most illumination models require that the unit normal to each surface be computed at each pixel. [Moler-Morrison, "Replacing Square Roots by Pythagorean Sums", IBM Journal of Research and Development", vol. 27, no. 6, pp. 577-581, Nov. 1983] describes an algorithm for computing sqrt(a^2+b^2) which is fast, robust and portable. The naive approach to this problem (just writing sqrt(a*a+b*b) ) has the problem that for many cases when the result is a representable floating point number, the intermediate results may overflow or underflow. This seriously restricts the usable range of a and b. Moler and Morrison's method never causes harmful overflow or underflow when the result is in range. The algorithm has _____cubic convergence, and depending on implementation details may be even faster than sqrt(a*a+b*b). Here is a C function that implements their algorithm: double hypot(p,q) double p,q; { double r, s; if (p < 0) p = -p; if (q < 0) q = -q; if (p < q) { r = p; p = q; q = r; } if (p == 0) return 0; for (;;) { r = q / p; r *= r; s = r + 4; if (s == 4) return p; r /= s; p += 2 * r * p; q *= r; } } This routine produces a result good to 6.5 digits after 2 iterations, to 20 digits after 3 iterations, and to 62 digits after 4 iterations. In normal use you would probably unwind the loop as many times as you need and omit the test altogether. Moler and Morrison's paper describes the generalization of this algorithm to n dimensions, and exhibits a related square root algorithm which also has cubic convergence. [Dubrelle, "A Class of Numerical Methods for the Computation of Pythagorean Sums", IBM Journal of Research and Development, vol. 27, no. 6, pp. 582-589, Nov. 1983] gives a geometric justification of the algorithm and presents a set of generalizations that have arbitrarily large order of convergence. -- Ken Turkowski @ CADLINC, Palo Alto (soon Menlo Park), CA UUCP: {amd,decwrl,flairvax,nsc}!turtlevax!ken ARPA: turtlevax!ken@DECWRL.ARPA
grunwald@uiucdcsb.UUCP (11/19/84)
Actually, one of the cute things about the CORDIC method is that it computes the sin/cos by rotating a vector. It turns out that you can use the CORDIC methods to rotate vectors for graphics without needing to compute the Sin/Cos at all. But this does mean 3 adds and shifts per bit for every vector rotated. A multiplication takes only 1 add and shift per bit, but you need to do four multiplies for rotation. A classmate made a VLSI chip to do just this in a VLSI design class I took.
gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (11/20/84)
> > "It would also be nice if sin( x ) and cos( x ) could be > > computed simultaneously with reduced cost. ... > > It's easy. Buggering the compiler to do it solves *one* case fast, at the > expense of still having garbage code. Sorry, John, but you missed the point of my inquiry. I was not suggesting that the compiler should have anything to do with this. I want a speedy algorithm that computes both functions simultaneously. The majority of the time spent is NOT in the function linkage but rather in the guts of the computation (which is usually range reduction followed by a rational approximation). The reason why this matters should be pretty obvious to anyone who does graphics programming for a living.
Mike Niswonger <CNISWONGER@SIMTEL20.ARPA> (11/21/84)
Doug,
I have run into the same situation as you in often needing both
sin and cos in doing coordinate transformations. However, after looking
at the algorithims presently used, only a small amount of time could be
saved.
Most commercial FORTRAN compilers use a 4 or 5 term Chebychev app-
roximation. Unfortunately, odd powers are used for sine and even terms
are used for the cosine (or is it the other way around?? never could re-
member). The only actual duplicated code would be where the input angle
is reduced ("unwrapped") to its reference angle. On the machine that I
normally use, a Gould 32/8780, an SIN or a COS take about 7.5 uS, with
a SQRT taking about 8 uS. These times are including the FORTRAN overhead.
>>sigh!!<< With something that simple, I decided not to shave any
more - it wasn't worth the effort. -- Mike
.
-------
dik@turing.UUCP (11/30/84)
>The reason why this matters should be pretty obvious to anyone who >does graphics for a living. In that case it is simple. You do not need large parts of the fancy stuff executed by standard sine and cosine routines; e.g. you will not need a 16 digit precision (as delivered by the standard Vax Unix sine and cosine). Moreover, you may make quite sure that the argument to these functions is within a range like [-pi,pi], so range reduction is also much simpler to do. It is in general a pain that all standard functions in C deliver a result in double precision (what?, oh yes, the attempt is made) although you need it in single precision, which results in a waste of time. -- dik t. winter centrum voor wiskunde en informatica postbus 4079 1009 AB amsterdam nederland +31 20 592 4102 (polish your dutch) UUCP: decvax!mcvax!turing!dik
dik@turing.UUCP (11/30/84)
> A multiplication takes only 1 add > and shift per bit, ..... Oh yes, but you can have a lot of parallelism there, so the time needed for a 32*32 bit multiply is comparable to about 5 adds. In contrast, the CORDIC methods allow also some parallelism, but the timing is still one bit in about one add time. -- dik t. winter centrum voor wiskunde en informatica postbus 4079 1009 AB amsterdam nederland +31 20 592 4102 (polish your dutch) UUCP: decvax!mcvax!turing!dik