dgh@sun.com (David Hough) (04/05/89)
I just went through a visit with a customer who complained bitterly about the quality of our math library and then admitted that he hadn't read or even opened the shrink wrap on the Floating-Point Programmer's Guide in the SunOS 4.0 document crate. This is too bad; he might have found some useful information there. Some recent sunspots postings suggest this customer is not unique.

If you want to get the most out of a floating-point-intensive computation on a Sun, it would be an excellent idea to read the Floating-Point Programmer's Guide, part number 800-1552-10. It was written for SunOS 3.1, but most of the information is still relevant. And don't forget to read the addendum for SunOS 4.0 as well. It is contained in part number 800-1789-10, which has the somewhat misleading title "Software READ THIS FIRST Programmer's Guides Minibox". That addendum was kluged together at the last minute when it became clear that I wouldn't have time for a complete rewrite for 4.0. It's not as comprehensive, but the information density is high.

Once again I have promised a complete rewrite for 4.1. In the hope that I will fulfill the promise, I'd be glad to hear comments from anybody who's looked over the existing documentation.

David Hough
dgh@sun.com
prl@eiger.uucp (04/27/89)
In article <8903240124.AA03033@dgh.sun.com> you write:
>X-Sun-Spots-Digest: Volume 7, Issue 225, message 8 of 11
>
>I just went through a visit with a customer who complained bitterly about
>the quality of our math library and then admitted that he hadn't read or
>even opened the shrink wrap on the Floating-Point Programmer's Guide in
>the SunOS 4.0 document crate. This is too bad; he might have found some
>useful information there....
>Some recent sunspots postings suggest this customer is not unique....

As one of the people who has sharply criticised Sun's C maths library (most of the important functions in the C maths library run 3 to 10 times slower than they should), I would like to respond to this.

I agree that it is a good idea to read the Floating-Point Programmer's Guide, and I had done so before posting. Neither the guide nor the READ THIS FIRST in fact says anything relevant to my criticism of the C maths library (SunOS 4.0 and 4.0.1, but probably earlier releases as well).

The problem is that the C maths library doesn't use the assembly-language functions available in either the 68881 or the Sun3 FPA. Functions like sqrt(), sin() and cos() are simply the C versions of these functions, compiled with the appropriate compiler flag. If sqrt() is replaced by a function which uses the 68881 fsqrtx instruction, the square root evaluation is 10 times faster. I think this is a very good reason to complain (even bitterly) about the quality of the library.

The FP Programmer's Guide concentrates almost entirely on FORTRAN, and reading it would not yield any useful information about this problem.

"The /usr/lib/f*.il files' primary application is to accelerate calculations involving complex and doublecomplex data types in FORTRAN. ... intensive complex arithmetic may be twice as fast with inline expansion" p. 112

"With cc, use of almost any of the functions defined in <math.h> invokes switched floating point [not true in 4.0 and later, corrected in 4.0 Software Read Me First - prl] using [inlining] causes these calls to switched floating point to be replaced by inline code or calls to appropriate unswitched routines." p. 112

Quotes from Sun Floating Point Programmer's Guide, Part Number 800-1552-10, Revision A, of 19 September 1986. This rather old document was what was supplied with our SunOS 4.0 documentation set.

The only other place where the comparative performance of the library functions and inline code is mentioned is in some instructions about how to hand-inline one of the functions in the FORTRAN Whetstone benchmark (p. 56).

I have sent a copy of my kit for creating a faster maths library to the moderator; I have not checked the index at the archive server to see if it arrived. The kit speeds up most functions by a factor of roughly 2 to 10, going from Sun's -lm to my -lmfast. There is one minor incompatibility (the same one you get with the inlining facility): the SysV matherr() function can never be invoked.

>Once again I have promised a complete rewrite for 4.1. In the hope that I
>will fulfill the promise, I'd be glad to hear comments from anybody who's
>looked over the existing documentation.

Do you mean the documentation, or the library? Both could do with a good overhaul. Sony manages to run these functions 5-10 times faster on nearly the same hardware, by having a decent implementation of the maths library and not forcing the user to depend on obscure and poorly documented hacks.

My timings indicate that if libm is reasonably implemented, there is *little* speedup to be gained from using inlined code over calling the assembly instruction via a subroutine call!
                                                       VVVVVVVVVVVVV
             -lm  -lmfast    speedup  inline  speedup        speedup
             sec      sec  lm/lmfast     sec  lm/inln  lmfast/inline
cos()       3.45     0.83       4.16    0.82     4.21           1.01
sin()       3.23     0.75       4.31    0.73     4.42           1.03
tan()       4.60     1.63       2.82    1.57     2.93           1.04
acos()      3.55     2.00       1.77    1.97     1.80           1.02
asin()      3.95     2.05       1.93    1.90     2.08           1.08
atan()      2.97     1.23       2.41    1.23     2.41           1.00
log()       4.42     1.27       3.48    1.28     3.45           0.99
log10()     4.17     2.07       2.01    1.93     2.16           1.07
log2()      3.43     1.93       1.78    1.95     1.76           0.99
exp()       3.15     1.42       2.22    1.25     2.52           1.14
exp10()     4.90     1.75       2.80    1.67     2.93           1.05
exp2()      3.32     1.75       1.90    1.68     1.98           1.04
sqrt()     12.37     1.18      10.48    1.18    10.48           1.00
cosh()      2.75     1.95       1.41    1.82     1.51           1.07
sinh()      3.03     1.75       1.73    1.68     1.80           1.04
tanh()      3.33     2.00       1.67    1.93     1.73           1.04
atanh()     2.32     2.12       1.09    2.10     1.10           1.01

All times for 50000 calls, loop overhead subtracted, but subroutine call overhead (naturally) not subtracted.

Note that the 10* improvement is not idle talk, sqrt() really is that bad in -lm! The routines in -lmfast could be further optimised; they were created using the Sun inline library.

The timings are for SunOS4.0, Sun3/260, cc -O -f68881 ... {-lm, -lmfast, /usr/lib/f68881/libm.il}

Peter Lamb
uucp: seismo!mcvax!ethz!prl    Tel: (01) 256 5241 (Switzerland)
eunet: prl@iis.ethz.ch         +411 256 5241 (International)
Integrated Systems Laboratory
ETH-Zentrum 8092 Zurich
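
The measurement method described in the post above (a fixed number of calls, with the cost of the empty loop subtracted but the subroutine call overhead left in) can be sketched roughly as follows. This is not the harness that produced the table; the names time_loop, net_time, speedup and report are hypothetical, and the use of clock() and of arguments in the range (0, 0.5] are assumptions made only for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

typedef double (*math_fn)(double);

/* CPU seconds for n passes of the loop; f == NULL times the empty loop. */
static double time_loop(math_fn f, long n)
{
    volatile double sink = 0.0;  /* stops the compiler deleting the loop */
    clock_t start, stop;
    long i;

    assert(n > 0);
    start = clock();
    for (i = 0; i < n; i++) {
        /* Assumed argument range (0, 0.5]: valid for every function timed. */
        double x = 0.5 * (double)(i + 1) / (double)n;
        sink = f ? f(x) : x;
    }
    stop = clock();
    (void)sink;
    return (double)(stop - start) / (double)CLOCKS_PER_SEC;
}

/* Time for n calls of f with the empty-loop time subtracted.
 * The subroutine call overhead is deliberately left in. */
static double net_time(math_fn f, long n)
{
    double t = time_loop(f, n) - time_loop(NULL, n);
    return t > 0.0 ? t : 0.0;    /* clamp timer noise */
}

/* Ratio of two timings, e.g. the -lm time over the -lmfast time. */
static double speedup(double slow_sec, double fast_sec)
{
    return fast_sec > 0.0 ? slow_sec / fast_sec : 0.0;
}

/* Print one row: function name and net seconds for n calls. */
static void report(const char *name, math_fn f, long n)
{
    printf("%-8s %8.2f sec for %ld calls\n", name, net_time(f, n), n);
}
```

To use it, include <math.h>, call for example report("sqrt", sqrt, 50000L) from main(), and link once against each library being compared. On current hardware 50000 calls will usually fall below the resolution of clock(), so a much larger n would be needed to get usable numbers.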