dgh@validgh.com (David G. Hough on validgh) (05/16/91)
A couple of issues have been confused in recent comp.arch discussion. One is whether argument reduction for elementary transcendental functions involves simultaneous computation of a quotient (preferably in integer format) and a remainder; another is whether instructions for that purpose should be included in computer instruction set architectures.

Observe that the Motorola 68881 and 68882 implement exactly this operation in frem and fmod, but the quotient is probably never used, because these chips also implement fsin, fcos, and fetox. The integer quotient is returned as a signed bitfield in the fpsr status register.

The significance of the underlying mathematical operation can scarcely be doubted. As far as I know, all software that attempts to compute sin(x) for even modestly large x, or exp(x) over a full range, with any pretense of competitive accuracy and speed, has to exploit, directly or indirectly, formulas like

	sin(x) = sin(m * 2 * pi + n * pi/k + r) = {+ or -}{sin or cos}(r)

and

	exp(x) = exp(n * loge(b) + r) = b**n * exp(r)

where k is typically 2 and b is typically the base of the floating-point arithmetic, usually 2. For sin,

	n = [x / (pi/2)] % 4
	r = x % (pi/2)

where the [] denote rounding to the nearest signed integer value, so that in this case |r| <= pi/4. n is used to select one of four cases in a switch statement, and so is appropriately an integer. For exp,

	n = [x / loge(b)]
	r = x % loge(b)

Since n goes more or less directly into the exponent field of the result, it too is appropriately an integer, and a fairly small one: I'm not aware of any common floating-point format that needs more than 16 signed bits of n - larger x would overflow or underflow.

These operations of producing an integer quotient and a corresponding remainder are fairly fundamental, but since they have not been commonly implemented, people have learned to obtain equivalent, or sort of equivalent, results by indirect means.
If an integral approximation n to [x/y] can be obtained somehow, then a corresponding remainder r = x - y * n may be constructed, provided y * n can be computed exactly.

The computer architecture issues relate to language issues, to the latency of remainder operations, and to what precision to use for pi/2 and loge(b).

In principle compilers could recognize adjacent uses of x % y and x / y and exploit any available instructions appropriately, although not many could readily be adapted to recognize a situation where an integer quotient and a floating-point remainder are desired. ANSI C has some quotient-remainder functions, but they are not quite right for this application.

Since remainder operations for x of large magnitude may take hundreds or thousands of cycles, either interrupt latency suffers or the remainder hardware must be designed to be interruptible in mid-instruction. Neither idea sits well in the RISC universe. However, it is perfectly feasible to define a partial remainder, as in the original 8087, which computes for some maximum number of cycles and indicates whether the remainder operation is still incomplete. I had specified such a partial remainder in early drafts of the SPARC architecture, but removed it later when it became apparent that it would not be implemented any time soon, and because of the second issue: what precision for pi/2 and loge(b) is appropriate?

The 68881 and 68882 fsin uses an approximation to pi good to 66 significant bits to compute extended-precision results with 64 significant bits. That defers, but does not eliminate, the onset of gradual loss of significant bits in computed sin(x) for x close to even multiples of pi, as described in the March 1981 IEEE Computer.
More recently I have subscribed to the view that the standard elementary trigonometric, logarithmic, and exponential functions should suffer no greater uncertainty from argument reduction than from the subsequent approximation on a reduced range. This means that arguments should be reduced with values of pi/2 and loge(b) that are as accurate as necessary - how accurate depends on the size of the maximum exponent - which implies a somewhat different approach. I have sometimes advocated correctly-rounded transcendental functions, an even more demanding requirement, which needs both a reduced result and its uncertainty as input to the calculation on the reduced range.

The book by Cody and Waite summarizes the best thinking of the mid-1970's, when it was written, just prior to the advent of IEEE arithmetic. The ongoing series of papers by Cody's colleague Peter Tang, published in ACM TOMS, is the best example of current thinking about elementary transcendental functions that are almost, but for efficiency's sake not quite, correctly rounded.

-- 
David Hough

dgh@validgh.com   uunet!validgh!dgh   na.hough@na-net.ornl.gov