[comp.sys.sgi] Benchmarking the SGI: Floating point faster than integer?

chooft@ruunsa.fys.ruu.nl (Rob Hooft) (09/11/90)

Hello,

As I am planning to buy a PC, I am benchmarking all the computers available to
me. On our PI this resulted in a 'pleasant' surprise: floating point
multiplications are about twice as fast as integer multiplications. Is there
anybody who can explain the results of the following session to me?

Rob Hooft, Chemistry Department, University of Utrecht, The Netherlands

-----CUT HERE---------------------------------------------------------------
cia:[101]:hooft>cat test.f
      program test
      integer i
      real a,f,g
      integer ia,if,ig
      f=28.0
      g=36.0
      if=28
      ig=36
c     time 3,000,000 integer multiplications
      t=cputime(0.0)
      do 100 i=1,3000000
100     ia=if*ig
      write(*,*) cputime(t)
c     time 3,000,000 floating point multiplications
      t=cputime(0.0)
      do 200 i=1,3000000
200     a=f*g
      write(*,*) cputime(t)
c     time the empty loop, to measure the loop overhead alone
      t=cputime(0.0)
      do 300 i=1,3000000
300     continue
      write(*,*) cputime(t)
      end
c     CPU time (user + system) elapsed since the reference value f
      function cputime(f)
      real g(2)
      call etime(g)
      cputime=g(1)+g(2)-f
      return
      end
cia:[102]:hooft>f77 -O0 -o test test.f
cia:[103]:hooft>test
    6.340000    
    4.200000    
    2.070000    
12.5u 0.1s 0:15 82%
cia:[104]:hooft>^D
-----CUT HERE---------------------------------------------------------------
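
Netting out the empty-loop time from the figures above: the integer loop takes
about 6.34 - 2.07 = 4.27 seconds and the floating point loop about
4.20 - 2.07 = 2.13 seconds for the 3,000,000 multiplications, i.e. roughly
1.4 vs. 0.7 microseconds per iteration, a ratio of almost exactly 2.0.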

bron@bronze.wpd.sgi.com (Bron Campbell Nelson) (09/12/90)

In article <1464@ruunsa.fys.ruu.nl>, chooft@ruunsa.fys.ruu.nl (Rob Hooft) writes:
> As I am planning to buy a PC, I am benchmarking all the computers available to
> me. On our PI this resulted in a 'pleasant' surprise: floating point
> multiplications are about twice as fast as integer multiplications. Is there
> anybody who can explain the results of the following session to me?
[session deleted]

The explanation is simple: on the MIPS R2000 and R3000 CPUs, floating
point multiplication *IS* about twice as fast as integer multiplication
(actually, a bit more than twice as fast).  The ratio for division is
even greater.

My understanding is that MIPS decided to throw a lot of silicon at the
floating point problem, while they found that the majority of integer
multiplies in "real" programs involved a constant factor and so could be
done with shifts and adds.  Thus, less silicon was thrown at the integer
multiply problem (and even less at the integer divide).
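
As a rough sketch of what that looks like at the source level (the variable
names below are made up for illustration, not taken from the program in the
original post): the constant 36 from that benchmark is 32 + 4, so a compiler
can replace the general multiply by two cheap shifts and an add, along these
lines:

      program shifts
      integer n, n36
      n = 28
c     36 = 32 + 4: n*32 and n*4 each compile to a single shift,
c     so the product costs two shifts and an add rather than a
c     full integer multiply
      n36 = n*32 + n*4
      write(*,*) n36
      end

A product of two run-time values, like the if*ig in the benchmark loop, gets
no such shortcut and has to go through the multiply unit.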

--
Bron Campbell Nelson
bron@sgi.com  or possibly  ..!ames!sgi!bron
These statements are my own, not those of Silicon Graphics.

news@helens.Stanford.EDU (news) (09/12/90)

The program is a rather extreme benchmark.  A more balanced program
representative of your application would make more sense as a
purchasing criterion.

But if you want raw multiplication speeds, each of the FORTRAN
multiplies is actually four machine instructions: two loads, a
multiply and a store.  If you edit the assembly code to discount the
loads and stores by repeating the mul instruction N times, the FP
multiply is actually 3X faster than the integer!!!
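
Rough arithmetic behind that: if the two loads, the store and the loop
overhead cost about the same number of cycles in both loops, a 3:1 gap on the
multiply instruction alone gets diluted to something close to the 2:1 ratio
that the original timings show.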

Jim Helman
Department of Applied Physics			Durand 012
Stanford University				FAX: (415) 725-3377
(jim@KAOS.stanford.edu) 			Work: (415) 723-9127

mark@mips.COM (Mark G. Johnson) (09/12/90)

In article <69040@sgi.sgi.com> bron@bronze.wpd.sgi.com (Bron Campbell Nelson) writes:
   >
   >The explanation is simple: on the MIPS R2000 and R3000 CPUs, floating
   >point multiplication *IS* about twice as fast as integer multiplication
   >(actually, a bit more than twice as fast).  The ratio for division is
   >even greater.
   >

Here are the cycle counts for the R20x0 / R30x0; note that Nelson's remark
above is a little inaccurate -- the integer/FP cycle ratio is greatest for
multiply, not divide.
            integer multiply:        12 cycles
            integer divide:          34 cycles
            IEEE 32b FP multiply:     4 cycles
            IEEE 32b FP divide:      12 cycles
            IEEE 64b FP multiply:     5 cycles
            IEEE 64b FP divide:      19 cycles
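
Taking ratios straight from that table: 12/4 = 3.0 for 32-bit multiply versus
34/12 = 2.8 for 32-bit divide, and 12/5 = 2.4 versus 34/19 = 1.8 in the 64-bit
case, so multiply shows the larger integer/FP gap at either precision.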

  >My understanding is that MIPS decided to throw a lot of silicon at the
  >floating point problem, while they found that the majority of integer ...

A misconception, actually.  The FP multiplier has 9,064 transistors total:
8,302 transistors in the datapath (a regular layout structure) and 762 in the
control logic (Booth encoders, etc.).  The entire FP chip contains only 76,451
transistors, so the FP multiplier is 11.9% of the total.  Especially today, in
the era of million-plus transistor CPUs, 9,000 transistors for FP
multiplication can hardly be considered "lavish"; that was true even in
Jan. 1987 when the FP chip hit first silicon.
-- 
 -- Mark Johnson	
 	MIPS Computer Systems, 930 E. Arques M/S 2-02, Sunnyvale, CA 94086
	(408) 524-8308    mark@mips.com  {or ...!decwrl!mips!mark}