lgy@pupthy2.PRINCETON.EDU (Larry Yaffe) (05/24/88)
In article <2231@gumby.mips.COM> earl@mips.COM (Earl Killian) writes:

[[ Much information about the R/3000 including: ]]

>> Integer Multiply/Divide
>
>How is multiply implemented [software, multiply step, hardware]?

	hardware

>How many cycles to perform 32x32->32 multiply?

	13
	^^

>> Floating Point
>
>What are the floating point operation latency/issue/repeats?
>
>          32-bit    64-bit    80-bit
>   mul    4/ 1/ 4   5/ 1/ 5   n.a.
                     ^^
>UUCP: {ames,decwrl,prls,pyramid}!mips!earl
>USPS: MIPS Computer Systems, 930 Arques Ave, Sunnyvale CA, 94086

    I'm curious and a little surprised about the relative speed of
integer and floating point multiplies.  Are integer multiplies
sufficiently less frequent than floating point mults (in some
selection of "real" programs) that an integer multiply being three
times slower than f.p. is not a significant bottleneck?  What
fraction of integer multiplies are not multiplications by small
compile-time constants (in that same set of programs)?  Since MIPS
Co. is reputed to have studied real programs when considering design
trade-offs, I'm hoping someone there can offer some insight.

    Also, despite having looked through the MIPS architecture book
quite carefully, I cannot find any mention of the latency of integer
multiply or divide operations.  Am I just getting sloppy, or is the
information really missing?

------------------------------------------------------------------------
Laurence G. Yaffe			Internet: lgy@pupthy.princeton.edu
Department of Physics			Bitnet:   lgy@pucc
Princeton University			UUCP:     ...!princeton!pupthy!lgy
PO Box 708, Princeton NJ 08544
earl@mips.COM (Earl Killian) (05/25/88)
In article <3006@phoenix.Princeton.EDU> lgy@pupthy2.PRINCETON.EDU (Larry Yaffe) writes:
>     I'm curious and a little surprised about the relative speed of
> integer and floating point multiplies.  Are integer multiplies
> sufficiently less frequent than floating point mults (in some
> selection of "real" programs) that an integer multiply being three
> times slower than f.p. is not a significant bottleneck?
> What fraction of integer multiplies are not multiplications
> by small compile-time constants (in that same set of programs)?
> Since MIPS Co. is reputed to have studied real programs when
> considering design trade-offs, I'm hoping someone there can
> offer some insight.
There are several answers to this. One would be some statistics to
show that integer multiplies are less common than floating point ones.
They indeed are in most applications. DSP often uses integer
multiply, but given relative integer/fp speeds in the r3000, you'd be
best off changing it to fp!
Another answer is a practical one. The cpu chip and fpu chip were
designed serially rather than in parallel (MIPS was a small company in
those days). There just wasn't time to develop that kind of
multiplier technology during the cpu chip design (those are pretty
fast fp times). This is not to say that the integer multiply is bad
(or a "bottleneck") -- it's several times faster than most micros. Of
course, now that we've developed the technology for the fp, it would
not be unreasonable to apply it to a future cpu chip.
--
UUCP: {ames,decwrl,prls,pyramid}!mips!earl
USPS: MIPS Computer Systems, 930 Arques Ave, Sunnyvale CA, 94086
mash@mips.COM (John Mashey) (05/25/88)
In article <2241@gumby.mips.COM> earl@mips.COM (Earl Killian) writes:
>In article <3006@phoenix.Princeton.EDU> lgy@pupthy2.PRINCETON.EDU (Larry Yaffe) writes:
>	I'm curious and a little surprised about the relative speed of
>	integer and floating point multiplies....

Here are the dynamic frequencies of integer and FP multiplies, as a
percentage of instructions (NOT instruction cycles, which is much more
complex to describe, given instruction overlap), in a quick sample of
programs.  No science presumed: I just grabbed a few programs believed
to be OK:

	Program		Integer	FP
	as1		 .02%	0
	ccom		 .08%	0
	espresso	<.01%	<.01%
	gnuchess	 .06%	0
	hspice		 .02%	2.4%
	linpackd*	 .13%	8.2%
	nroff		 .45%	0
	whetd*		1.42%	7.38%
	timberwolf	 .31%	.45%

As a gross cycle estimate, multiply the Integer # by 10, and the FP
one by 5.  (* = benchmark, above)

Conclusions:
1) If a program uses FP in any serious fashion, FP mult > integer mult.
2) A few programs (nroff, whetd, and timberwolf) use integer mult
   enough that it would be nice to be faster, i.e., a 10-cycle
   multiply causes >2% of the instruction cycles.
3) Many programs don't execute integer mult enough to be noticeable.

Of course, our compilers do strength reduction, common subexpression
elimination, etc., and convert multiplies by constants into
shifts/adds/subs; i.e., they've squished out a lot of actual
multiplies.

This is a perfect example where your choice of benchmarks gives
completely different answers to the question: "should we have an
integer mult instruction, and if so, how much silicon are we willing
to devote to making it X cycles?"
--
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	{ames,decwrl,prls,pyramid}!mips!mash  OR  mash@mips.com
DDD:  	408-991-0253 or 408-720-1700, x253
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086