jimv@radix (Jim Valerio) (09/27/87)
I am probably one of the few people who reads this newsgroup that has
microcoded transcendental floating-point instructions.  I wrote significant
portions of the transcendental microcode for the 80387.  I also have written
significant portions of math libraries that implement transcendental
functions.  In this article, I speak as an implementor of the functions,
rather than as a user of the functions.  As a user of the functions, I
couldn't care less whether they are in software, microcode, or hardware,
just as long as they are provided and don't surprise me with their results.

In article <705@gumby.UUCP> earl@mips.UUCP (Earl Killian) writes:
>The 68881 transcendentals are not implemented in hardware; they are
>implemented in microcode.  I believe the extra 0.5-1.5ulp of accuracy
>of the 68881 is due to the use of extended precision calculations, not
>to either hardware or algorithm (simple rational approximations are
>very accurate too when evaluated in extended precision).

The advantages I found in microcoding transcendental instructions over
software implementations of the functions were twofold.

One advantage was that the mantissa computations were done using the
unrounded intermediate results, which had approximately 3 extra bits of
significance.  The extra bits in a few critical places allowed simpler
approximation functions to be used that, when rounded, delivered accurate
(i.e. correctly rounded) results.  Without the extra bits, a software
implementation would need to effectively compute double-precision
operations.

The other advantage I found with the microcoded implementation over a
software implementation was that it was more convenient to do non-standard
operations.  By this I mean that in a software implementation, I would be
obliged to do add and multiply types of operations (or whatever the
instruction set gave me).  The only alternative would be to break the
numbers apart and do integer operations on the mantissas and exponents
(a short C sketch below shows the flavor of this).  In the microcoded
implementation, the hardware support for manipulating pieces of
floating-point numbers was easily accessible, and I was not encouraged to
think of every operation I was performing as an arithmetic operation on a
floating-point data type.

The bottom line is that, given the sort of hardware support I had available
(this means area constraints, folks), I was able to get significantly
faster and better approximations using microcode than I could get from
software.

Earl goes on to say:
>		     implementing transcendentals in 68881 microcode did
>nothing to make them fast.  The cycle counts for sin, cos, tan, atan,
>log, exp, etc. average about 3.5 longer for 68881 instructions than
>for MIPS R2000 libm subroutines.

The MIPS implementation is laudable, but there are many more issues than
speed involved here.  One is accuracy.  Often more important than accuracy
is monotonicity.  (If the mathematical function is monotonic over a region,
is the approximated function monotonic over the same region?)  Polynomial
approximation techniques often have monotonicity problems.  Other issues
include working on making simple transcendental identities hold in
floating-point computations (e.g. sin(-x) = -sin(x), exp(-x) = 1/exp(x)).

Now, I suspect that MIPS used the 4.3bsd libm.  This is a very good math
library, with man-years of work put into it by floating-point experts to
give it many of the important attributes of a good library.  It boasts of
high accuracy and no observed monotonicity errors.
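An aside for those who have never had to do it: here is roughly what
"breaking the numbers apart" looks like when a software implementation gets
no help from the hardware.  This is purely illustrative -- it is nothing
like the 387 microcode, and it assumes an IEEE double format:

    /* Pick an IEEE double apart into sign, exponent, and mantissa
     * using only integer operations.  Illustrative only.
     */
    #include <stdio.h>
    #include <string.h>
    #include <stdint.h>

    int main(void)
    {
        double d = 0.7853981633974483;      /* roughly pi/4 */
        uint64_t bits;

        memcpy(&bits, &d, sizeof bits);     /* view the double as raw bits */

        unsigned sign     = (unsigned)(bits >> 63);
        int      exponent = (int)((bits >> 52) & 0x7ff) - 1023; /* unbias */
        uint64_t mantissa = bits & 0xfffffffffffffULL;  /* low 52 bits */

        printf("sign %u  exponent %d  mantissa %013llx\n",
               sign, exponent, (unsigned long long)mantissa);
        return 0;
    }

In microcode, the equivalent of those shifts and masks is essentially free,
and it operates on the wider, unrounded internal form I mentioned above.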
Earl expresses some doubt that a microcoded implementation of the
transcendental functions is the right way to gain extra accuracy, and
suggests that providing hardware extended precision would be the better
approach.  Unfortunately, the 4.3bsd library would need to be largely
rewritten in non-trivial ways to take advantage of the extended precision.
I believe that a company committed to excellent floating-point, given the
choice of implementing transcendental functions in software when no
excellent library is available, or implementing them with significant
hardware support, would be crazy to go the software route.  Of course, most
companies choose to forego the excellence, and hope that the users don't
notice.

I would like to make a few miscellaneous comments on the transcendental
functions.  The intent here is to indirectly say something about the
difficulty and intricacy of implementing these functions.

I said that the 4.3bsd libm has no observed monotonicity errors.  That
means that test programs running a few million points haven't found one.
That doesn't mean that the error doesn't exist.  A few million points is a
very small subset of all double-precision floating-point numbers.  Most
floating-point libraries haven't been tested even this well.

As a double-precision library, the 4.3bsd libm can do double duty as a
single-precision library, but one must check very carefully to verify that
the rounding of the double-precision result to single precision doesn't
introduce monotonicity errors.  This same concern does not apply to the
80387 and 68881, since the programmer has presumably set the rounding
controls first.

The double-precision libm would make a slow single-precision library.  A
similar observation holds for the 68881 and 80387: these chips compute an
extended-precision result, and consequently are not as fast as an optimized
double-precision implementation might be.  (Honestly, though, a
double-precision CORDIC implementation wouldn't make that much difference.)

The 68881 uses a CORDIC algorithm, based on some work done by Steve Walther
at HP.  Unfortunately, the 68881 doesn't carry around enough internal
precision to guarantee high accuracy or monotonicity.  My understanding is
that its results are accurate to about 56 bits (note double precision is 53
bits), and that monotonicity errors are rare but not unheard of.  I do not
recall if any monotonicity errors have been observed in single or double
precision.

The 80387 uses a different CORDIC algorithm (a modification of the one used
in the 8087).  This algorithm requires less precision than the one used in
the 68881, and is accurate to 62 or 63 bits (depending on the function).
In addition, this CORDIC algorithm has been proved monotonic.  However, the
microcoded instruction set that uses these CORDIC primitives has not been
proved monotonic, so it is not clear what the proof buys you.  The last I
heard, there have been no observed monotonicity errors in the single,
double, or extended precisions.
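For the curious, the sort of monotonicity spot-check I keep referring to is
easy to write, though running it over interesting ranges takes a long time.
Here is a rough C sketch; the choice of atan and of the interval is an
arbitrary placeholder, and real test rigs are considerably more thorough:

    /* Walk consecutive doubles across an interval where the true
     * function is nondecreasing, and complain if the library result
     * ever steps backward.  Illustrative only -- a serious test
     * covers many intervals and vastly more points.
     */
    #include <stdio.h>
    #include <math.h>

    int main(void)
    {
        double x = 1.0;             /* atan is nondecreasing everywhere */
        double prev = atan(x);
        long i, errors = 0;

        for (i = 0; i < 1000000L; i++) {
            x = nextafter(x, 2.0);  /* step to the adjacent double */
            double y = atan(x);
            if (y < prev) {         /* a monotonicity failure */
                printf("non-monotonic at x = %.17g\n", x);
                errors++;
            }
            prev = y;
        }
        printf("%ld monotonicity errors in 1000000 points\n", errors);
        return 0;
    }

A million points like these is quick to run; covering all 2^64 bit patterns
of a double is out of the question, which is why "no observed errors" is the
strongest claim anyone can honestly make.
--
Jim Valerio	{verdix,intelca!mipos3,intel-iwarp.arpa}!omepd!radix!jimv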
bct@its63b.ed.ac.uk (B Tompsett) (09/30/87)
In article <8@radix> jimv@radix.UUCP (Jim Valerio) writes:
>I am probably one of the few people who reads this newsgroup that has
>microcoded transcendental floating-point instructions.

The Computervision CDS 4000 has microcoded transcendental floating-point
instructions.  I was responsible for the Fortran compiler for this machine,
and it certainly helped with accuracy and performance to have these
functions in microcode.  The microcoders had access to more machine
facilities than I would have had if I had to write them in a regular
run-time library.  For example, intermediate computations could be
performed to more precision than the normally provided floating-point
operations allow.
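A trivial illustration of the point, in C rather than our Fortran (the
cubic here is a made-up stand-in, not anything from the CDS 4000
microcode): evaluating with wider intermediates and rounding once at the
end can only do better, and sometimes differs in the last bit from rounding
after every operation.

    /* The same cubic evaluated two ways: all-single intermediates
     * versus double intermediates with one final rounding.
     * Illustrative only.
     */
    #include <stdio.h>

    /* Single precision throughout: a rounding after every operation. */
    float poly_single(float x)
    {
        return 1.0f + x * (1.0f + x * (0.5f + x * (1.0f / 6.0f)));
    }

    /* Same cubic with double intermediates, rounded once at the end. */
    float poly_double(float x)
    {
        double xd = x;
        return (float)(1.0 + xd * (1.0 + xd * (0.5 + xd * (1.0 / 6.0))));
    }

    int main(void)
    {
        float x = 0.1f;
        printf("single intermediates: %.9g\n", poly_single(x));
        printf("double intermediates: %.9g\n", poly_double(x));
        return 0;
    }

   Brian.
--
> Brian Tompsett. Department of Computer Science, University of Edinburgh,
> JCMB, The King's Buildings, Mayfield Road, EDINBURGH, EH9 3JZ, Scotland, U.K.
> Telephone: +44 31 667 1081 x3332.
> JANET:  bct@uk.ac.ed.ecsvax      ARPA: bct%ecsvax.ed.ac.uk@cs.ucl.ac.uk
> USENET: bct@ecsvax.ed.ac.uk      UUCP: ...!mcvax!ukc!ecsvax.ed.ac.uk!bct
> BITNET: psuvax1!ecsvax.ed.ac.uk!bct or bct%ecsvax.ed.ac.uk@earn.rl.ac.uk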
lamaster@pioneer.arpa (Hugh LaMaster) (09/30/87)
In article <8@radix> jimv@radix.UUCP (Jim Valerio) writes:
>I am probably one of the few people who reads this newsgroup that has
>microcoded transcendental floating-point instructions.  I wrote significant
>portions of the transcendental microcode for the 80387.  I also have written
>significant portions of math libraries that implement transcendental
>functions.
 :
>The MIPS implementation is laudable, but there are many more issues than
>speed involved here.  One is accuracy.  Often more important than accuracy
>is monotonicity.  (If the mathematical function is monotonic over a region,
>is the approximated function monotonic over the same region?)  Polynomial
>approximation techniques often have monotonicity problems.  Other issues
>include working on making simple transcendental identities hold in
>floating-point computations (e.g. sin(-x) = -sin(x), exp(-x) = 1/exp(x)).
>
>Now, I suspect that MIPS used the 4.3bsd libm.  This is a very good math
>library, with man-years of work put into it by floating-point experts to
>give it many of the important attributes of a good library.  It boasts of
>high accuracy and no observed monotonicity errors.
 :
>I said that the 4.3bsd libm has no observed monotonicity errors.  That means
>that test programs running a few million points haven't found one.  That
>doesn't mean that the error doesn't exist.  A few million points is a very
 :
>Jim Valerio	{verdix,intelca!mipos3,intel-iwarp.arpa}!omepd!radix!jimv

Are people familiar with the Kahan et al. "paranoia" program (available
from netlib), and, if so, what do people think of:

1) The validity of the error test results that it provides (in other words,
   does it complain about things that are not valid complaints), and,

2) The completeness of the tests (how good it is at testing things which
   should be tested, such as monotonicity of certain functions)?

  Hugh LaMaster, m/s 233-9,   UUCP {topaz,lll-crg,ucbvax}!
  NASA Ames Research Center                  ames!pioneer!lamaster
  Moffett Field, CA 94035     ARPA lamaster@ames-pioneer.arpa
  Phone:  (415)694-6117       ARPA lamaster@pioneer.arc.nasa.gov

(Disclaimer: "All opinions solely the author's responsibility")