colwell@mfci.UUCP (Robert Colwell) (11/03/88)
In article <19811@apple.Apple.COM> baum@apple.UUCP (Allen Baum) writes:
>Square root is the same category as divide. Hardware is slow, so algorithms
>tend to avoid them. The reason is fundamental. The hardware is slow, and it
>is exceedingly difficult to make it faster. Strangely enough, floating point
>divide can be made to run much faster, because of its normalized operands.

One of the biggest problems with hardware sqrt/divide is that their
hardware implementations want to be iterative, which makes these ops
non-pipeline-able. That's a very bad feature in machines where all
other arithmetic ops, esp. flt. pt. multiply/adds, are pipelined.

A software implementation of sqrt or div uses the pipelined ops, so
the net effect is that the latency of a single op will be higher, but
the net throughput is much better. Of course, the hardware can get you
the last bit correctly rounded to IEEE specifications; the software
could too, in principle, but I've not yet seen anyone do it.

Bob Colwell               mfci!colwell@uunet.uucp
Multiflow Computer
175 N. Main St.
Branford, CT 06405        203-488-6090
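
The post does not say which software algorithm it has in mind. As one
illustration of building sqrt and divide from multiplies and adds only,
here is a minimal sketch using Newton-Raphson iteration, a common choice
and an assumption here, not necessarily what any particular machine or
library does. The names recip_nr, div_nr and sqrt_nr, the linear seeds
and the iteration counts are all hypothetical. The results are close to
the exact values but are not guaranteed to be correctly rounded in the
last bit, which is the caveat raised above.

```c
#include <assert.h>
#include <math.h>   /* frexp, ldexp: used only to split off the exponent */

/* Approximate 1/d for finite d > 0 (assumed precondition).
 * Newton-Raphson step: x <- x * (2 - m*x), multiplies and adds only. */
double recip_nr(double d)
{
    int e;
    double m, x;
    int i;

    assert(d > 0.0);
    m = frexp(d, &e);                        /* d = m * 2^e, 0.5 <= m < 1 */
    x = 48.0 / 17.0 - (32.0 / 17.0) * m;     /* linear seed; constants fold
                                                at compile time */
    for (i = 0; i < 4; i++)                  /* error roughly squares each pass */
        x = x * (2.0 - m * x);
    return ldexp(x, -e);                     /* 1/d = (1/m) * 2^-e */
}

/* Approximate n/d as n * (1/d). */
double div_nr(double n, double d)
{
    return n * recip_nr(d);
}

/* Approximate sqrt(a) for finite a > 0 (assumed precondition).
 * Iterates on y ~ 1/sqrt(m): y <- y * (1.5 - 0.5*m*y*y),
 * then uses sqrt(m) = m * y. */
double sqrt_nr(double a)
{
    int e;
    double m, y;
    int i;

    assert(a > 0.0);
    m = frexp(a, &e);                        /* a = m * 2^e, 0.5 <= m < 1 */
    if (e & 1) {                             /* force an even exponent */
        m *= 2.0;                            /* now 0.5 <= m < 2 */
        e -= 1;
    }
    y = 1.3 - 0.3 * m;                       /* crude seed for 1/sqrt(m) */
    for (i = 0; i < 6; i++)
        y = y * (1.5 - 0.5 * m * y * y);
    return ldexp(m * y, e / 2);              /* sqrt(a) = sqrt(m) * 2^(e/2) */
}
```

Each pass through either loop is a short chain of dependent multiplies
and adds, so one result has a long latency. Independent operands can be
interleaved through the same pipelined units, which is where the
throughput advantage described above would come from.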