[comp.arch] Cray divide speed

mccalpin@masig3.masig3.ocean.fsu.edu (John D. McCalpin) (10/30/89)

Does anyone have the figures handy for the peak speed of the Cray
X/MP, the Cray Y/MP, and the Cray 2 for vector divide operations?

I know that Cray uses an approximate reciprocal scheme, but I need to
know how many cycles per result are required for a plain vector
divide, i.e.

	DO 100 I=1,N
	    A(I) = B(I)/C(I)
  100	CONTINUE

On the Cyber 205 and ETA-10, the number is 6 cycles per element, and I
am trying to figure out if that is a representative number for vector
machines. I am using this information to modify the work estimates for
various implicit and explicit finite-difference schemes used in fluid
dynamics.
--
John D. McCalpin - mccalpin@masig1.ocean.fsu.edu
		   mccalpin@scri1.scri.fsu.edu
		   mccalpin@delocn.udel.edu

nelson@udel.edu (Mark Nelson) (10/31/89)

In article <MCCALPIN.89Oct29174054@masig3.masig3.ocean.fsu.edu> you write:
>Does anyone have the figures handy for the peak speed of the Cray
>X/MP, the Cray Y/MP, and the Cray 2 for vector divide operations?
>
>I know that Cray uses an approximate reciprocal scheme, but I need to
>know how many cycles per result are required for a plain vector
>divide, i.e.
>
>	DO 100 I=1,N
>	    A(I) = B(I)/C(I)
>  100	CONTINUE
>
For all Cray machines, a full precision divide requires one
pass through the reciprocal approximation functional unit and
three passes through the fp multiply functional unit.  For the
X/MP and Y/MP, these are distinct functional units and the
reciprocal can be chained with one of the multiplies, so the
multiply unit sets the pace and the peak speed is one result
per three clock periods.

For the Cray 2, both operations are handled by the same functional
unit, and there isn't any chaining, so all four passes go through
that one unit and the peak speed is one result per four clock periods.
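
To make the "one reciprocal pass plus three multiply passes" count concrete,
here is a minimal Python sketch of a Newton-Raphson style divide. It is an
illustration under assumptions, not Cray's actual instruction sequence: the
function names are made up, the reciprocal unit is imitated by truncating 1/c
to 26 mantissa bits, the correction 2 - c*r0 is counted as one multiply pass,
and the arithmetic is IEEE double rather than Cray floating point.

```python
import math

def approx_reciprocal(c, bits=26):
    """Stand-in for the reciprocal approximation unit (hypothetical model):
    1/c with its mantissa cut down to roughly `bits` bits."""
    m, e = math.frexp(1.0 / c)            # 1/c = m * 2**e with 0.5 <= |m| < 1
    scale = 2.0 ** bits
    return math.ldexp(math.floor(m * scale) / scale, e)

def half_divide(b, c, bits=26):
    """Reciprocal approximation followed by a single multiply."""
    return b * approx_reciprocal(c, bits)

def full_divide(b, c, bits=26):
    """Reciprocal approximation followed by three multiply-unit passes."""
    r0 = approx_reciprocal(c, bits)       # pass through the reciprocal unit
    q0 = b * r0                           # multiply pass 1: rough quotient
    corr = 2.0 - c * r0                   # multiply pass 2: Newton correction
    return q0 * corr                      # multiply pass 3: refined quotient

if __name__ == "__main__":
    b, c = 1.0, 3.0
    exact = b / c
    print("half precision error:", abs(half_divide(b, c) - exact))
    print("full precision error:", abs(full_divide(b, c) - exact))
```

If r0 = (1/c)(1 - e), then r0*(2 - c*r0) = (1/c)(1 - e*e), so one correction
step roughly doubles the number of good bits. That is why a single refinement
is enough to go from half to full precision.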

If you only need half precision (26 bits of accurate mantissa),
the divide reduces to the reciprocal approximation plus one multiply,
so peak speed becomes:
X/MP, Y/MP: one result/clock period
Cray 2:     one result/two clock periods

However, I don't think you can persuade the FORTRAN compilers
to generate half precision divides.
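
The peak rates quoted in this thread can be folded into a work estimate
along the following lines. This is only a rough Python sketch: the table and
function names are invented for illustration, and the model counts peak-rate
clock periods only, ignoring vector start-up time, memory traffic, and
short-vector effects, none of which are covered by the figures above.

```python
# Peak clock periods per vector divide result, as quoted in this thread.
FULL_PRECISION = {
    "Cyber 205": 6,
    "ETA-10": 6,
    "X/MP": 3,
    "Y/MP": 3,
    "Cray 2": 4,
}

# Half precision figures were given for the Cray machines only.
HALF_PRECISION = {
    "X/MP": 1,
    "Y/MP": 1,
    "Cray 2": 2,
}

def divide_clocks(machine, n, table=FULL_PRECISION):
    """Peak-rate clock periods for n vector divides on `machine`."""
    return table[machine] * n

if __name__ == "__main__":
    n = 1000  # arbitrary vector length for the example
    for machine in FULL_PRECISION:
        print(machine, divide_clocks(machine, n))
```

Comparing machines in wall-clock terms would also need each machine's clock
period, which is not given here.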
--

Mark Nelson                 ...!rutgers!udel!nelson or nelson@udel.edu
This function is occasionally useful as an argument to other functions
that require functions as arguments. -- Guy Steele