scottb@ogicse.cse.ogi.edu (Scott Baker) (02/02/91)
I am planning to implement a neural-net simulator using integer arithmetic. One of the motivations of integer vs. floating point is higher speed for an integer implementation. However, I have just been told that a Sparc 1 may actually be -slower- at integer multiplies than floating-point multiplies because of a lack of hardware support for integer multiplies. Is this true?!

Thanks:
Scott Baker
scottb@ogicse.cse.ogi.edu
henry@zoo.toronto.edu (Henry Spencer) (02/02/91)
In article <16864@ogicse.ogi.edu> scottb@ogicse.cse.ogi.edu (Scott Baker) writes:
>been told that a Sparc 1 may actually be -slower- at integer multiplies
>than floating-point multiplies because of a lack of hardware support
>for integer multiplies. Is this true?!

Yes. The original Sparcs are somewhat unbalanced machines. I don't think it is definitively a mistake not to have a fast general integer multiply (since an awful lot of integer multiplies are by small constants, which can be done better by shift-add sequences), but it is a mistake to put a lot of effort into fast floating-point and none into fast integer multiply. There are signs that the Sparc world now realizes this, but it comes too late to help a lot of the early machines.

Buy MIPS. :-)
-- 
"Maybe we should tell the truth?"      | Henry Spencer at U of Toronto Zoology
"Surely we aren't that desperate yet." | henry@zoo.toronto.edu   utzoo!henry
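
[Editorial sketch] The shift-add sequences mentioned in the post above can be illustrated roughly as follows. This is a minimal sketch in C, not code from any poster; the function names (mul10, mul7, mul_shift_add) are hypothetical and chosen only for illustration. It assumes 32-bit unsigned operands, so results wrap modulo 2^32.

```c
#include <stdint.h>

/* Multiply by the constant 10 with two shifts and one add:
 * x * 10 = x * 8 + x * 2. */
static uint32_t mul10(uint32_t x)
{
    return (x << 3) + (x << 1);
}

/* Multiply by the constant 7 with one shift and one subtract:
 * x * 7 = x * 8 - x. */
static uint32_t mul7(uint32_t x)
{
    return (x << 3) - x;
}

/* General shift-add multiply: examine the multiplier k one bit at a
 * time and add the correspondingly shifted multiplicand whenever the
 * bit is set. The result is the low 32 bits of the product. */
static uint32_t mul_shift_add(uint32_t x, uint32_t k)
{
    uint32_t acc = 0;
    while (k != 0) {
        if (k & 1u) {
            acc += x;
        }
        x <<= 1;
        k >>= 1;
    }
    return acc;
}
```

For a small constant, the fixed sequences need only a couple of instructions, whereas the general loop may take up to 32 iterations, which is why the lack of a fast general multiply hurts most when the multiplier is not known in advance.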
shand@prl.dec.com (Mark Shand) (02/13/91)
Integer multiply on SPARC is indeed poor. I recently added an assembler kernel for SPARC to our bignum package and found that the fastest way to do multiprecision integer multiply was through the FPU. The primitive I use is 32bit x 16bit -> 48bit, which can be computed exactly in double precision. I've only timed it on a SPARCstation 1, which has a rather slow 9-cycle DP mult. The overall performance for multiprecision integer multiplies is about 4 times lower than on a MIPS R2000, which has a 12-16 cycle (depending on how you count) 32x32->64 integer mult, but it is still faster than any other way of doing full-word integer mult on an early SPARC.

(Our bignum package is available by mail from librarian@prl.dec.com; we will be announcing an FTP server soon.)

Even on a more balanced machine like the MIPS R2000/R3000, floating mult, although more resource-intensive than integer mult, is a higher-priority operation and, through the devotion of more hardware, takes fewer cycles.

Moral: tradeoffs between integer and float are subtle; just because an operation CAN be implemented more efficiently doesn't mean it HAS BEEN. Of course, next year's CPU designers will benchmark your neural net code that you've finally decided to cast in floats even though ints would have served you equally well, and those designers will deprecate integer multiply even further.

Questions:
Does anyone know which SPARC implementations include integer multiply support beyond the multiply step instruction? What is the opcode? What happens if an early SPARC hits such an opcode? Have these SPARC implementations found their way into any product machines yet?

Another thing that bugged me about multiply step was that it doesn't seem to give any way to get the high-order part of the result. MIPS, by contrast, gives you lo and hi result registers. This is essential in multiprecision work. Am I missing something in multiply step? Do the newer instructions help here?

Mark Shand.
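
[Editorial sketch] The idea of doing a 32bit x 16bit -> 48bit multiply exactly in double precision, as described in the post above, can be illustrated in portable C. This is only a sketch under stated assumptions, not the assembler kernel from the bignum package: the names mul32x16_fp, mul32x32_fp, word_pair and mul32x32_hilo are hypothetical, and the code assumes IEEE 754 doubles with a 53-bit significand, so any product below 2^48 is represented exactly.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit x 16-bit -> 48-bit product through the floating-point unit.
 * Both operands convert to double exactly, and the product is below
 * 2^48, which fits in a 53-bit significand, so no rounding occurs. */
static uint64_t mul32x16_fp(uint32_t a, uint16_t b)
{
    double p = (double)a * (double)b;
    assert(p < 281474976710656.0);   /* 2^48 */
    return (uint64_t)p;
}

/* Full 32x32 -> 64 product assembled from two 32x16 partial products:
 * a * b = a * b_lo + (a * b_hi) * 2^16. */
static uint64_t mul32x32_fp(uint32_t a, uint32_t b)
{
    uint64_t lo_part = mul32x16_fp(a, (uint16_t)(b & 0xFFFFu));
    uint64_t hi_part = mul32x16_fp(a, (uint16_t)(b >> 16));
    return lo_part + (hi_part << 16);
}

/* High and low result words, in the spirit of the separate hi and lo
 * result registers that multiprecision code relies on. */
typedef struct {
    uint32_t hi;
    uint32_t lo;
} word_pair;

static word_pair mul32x32_hilo(uint32_t a, uint32_t b)
{
    uint64_t p = mul32x32_fp(a, b);
    word_pair r;
    r.hi = (uint32_t)(p >> 32);
    r.lo = (uint32_t)p;
    return r;
}
```

A multiprecision multiply would call something like mul32x32_hilo once per pair of words and propagate the hi word as a carry into the next position. Limiting one operand to 16 bits is what keeps each partial product within the range that a double holds exactly.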