[comp.arch] Hardware considerations

hrubin@pop.stat.purdue.edu (Herman Rubin) (05/14/91)

In article <1991May13.211555.28824@rice.edu>, preston@ariel.rice.edu (Preston Briggs) writes:
> hrubin@pop.stat.purdue.edu (Herman Rubin) writes:

			.....................

> If you want to take advantage of new architectures and implementations,
> you're probably going to have to rethink some of your old assumptions.
> For example, in the old days, FP was much more expensive than integer
> arithmetic.  Now they are about the same.  In the old days, FP was
> more expensive than memory accesses.  Now FP is generally cheaper.

For the earliest machines with FP hardware, FP multiplication/division
was usually slightly faster than integer, because of the reduced 
accuracy, and addition/subtraction slower, because of the need for
shifts and normalizations.  Any FP unit is essentially an integer
unit with a few additional features needed to handle the exponent.
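
In rough C terms, the structural claim is this (an illustrative
sketch only: the fpnum layout is made up for the example, rounding
is omitted, and the 128-bit product is a gcc/clang extension):

	#include <stdint.h>

	/* FP multiply = integer multiply of the significands, plus a
	   little exponent bookkeeping and a one-bit renormalization. */
	typedef struct {
	    int      sign;              /* 0 or 1                     */
	    int      exp;               /* value = sig * 2^(exp-52)   */
	    uint64_t sig;               /* significand in [2^52,2^53) */
	} fpnum;

	static fpnum fpmul(fpnum a, fpnum b)
	{
	    /* 53x53 -> 106-bit product; keep the top 53 bits */
	    unsigned __int128 p = (unsigned __int128)a.sig * b.sig;
	    fpnum r;

	    r.sign = a.sign ^ b.sign;
	    r.exp  = a.exp + b.exp;
	    if (p >> 105) {             /* product in [2^105,2^106) */
	        r.sig = (uint64_t)(p >> 53);
	        r.exp += 1;
	    } else {                    /* product in [2^104,2^105) */
	        r.sig = (uint64_t)(p >> 52);
	    }
	    return r;
	}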

The only reason that integer arithmetic, especially multiplication
and division, is so expensive is that it is done in separate units,
and these units are not designed with the same care that goes into
the FP units.  Now division is, admittedly, a major headache, but is
there any good reason not to use essentially the same hardware for
integer and floating multiplication?  Some machines do not even have
separate integer arithmetic, but this can only be done if normalization
is not forced.  Is the one bit gained by forcing normalization worth
the problems caused?

I see no prospects whatever for speeding up memory accesses, except 
what can be gained by pipelining, multiple accesses, and burst mode.
-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet)   {purdue,pur-ee}!l.cc!hrubin(UUCP)

weaver@jetsun.weitek.COM (Mike Weaver) (05/15/91)

In article <12295@mentor.cc.purdue.edu> hrubin@pop.stat.purdue.edu (Herman Rubin) writes:
>.... Now division is, admittedly, a major headache, but is
>there any good reason not to use essentially the same hardware for
>integer and floating multiplication?  


One reason: the number of bits in the significand of a floating point
number is less than the number of bits in the corresponding integer.

For example, a full array multiplier (a common way to make a fast
multiplier) has a size that may scale as the product of the sizes in
bits of the two inputs.  Thus for IEEE floating point, I estimate
the increase in the size of the array (of adder cells) as follows:

			n	n**2	m	m**2	ratio
	Single		24	 576 	32	1024	1.78	
	Double		53	2809	64	4096	1.46

    n = significand bits, m = integer bits; the ratio m**2/n**2 is my
    estimate of the expansion in the size of the array needed to make
    a floating point multiplier into an integer multiplier.

Also, when you do a floating point multiply, you know you will throw
away the least significant half of the product. All you really need to
know (for IEEE) is the most significant half of the product, plus the
next three bits of the product, and whether or not the remaining bits
were zero.  This can lead to some hardware savings, as the individual
wires for these low-order bits do not need to be carried through to
the normalization stage, only a zero/non-zero indicator bit.
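
In code, that rounding step looks roughly like this (a sketch only,
for the single-precision case; the fixed-point layout and the name
round_sig24 are illustrative, and the possible one-bit normalization
shift is applied first, so one guard bit plus a sticky bit stand in
for the three extra bits and the zero indicator described above):

	#include <stdint.h>

	/* Round the 48-bit product of two 24-bit significands to 24
	   bits, IEEE round-to-nearest-even, using only the top bits,
	   one guard bit, and a sticky "rest was nonzero" bit.        */
	static uint32_t round_sig24(uint64_t prod48)
	{
	    /* significands lie in [2^23,2^24), so the product lies
	       in [2^46,2^48); normalize so the kept MSB is bit 23  */
	    int shift = (prod48 >> 47) ? 24 : 23;

	    uint32_t hi     = (uint32_t)(prod48 >> shift) & 0xFFFFFF;
	    uint32_t guard  = (uint32_t)(prod48 >> (shift - 1)) & 1;
	    uint32_t sticky =
	        (prod48 & (((uint64_t)1 << (shift - 1)) - 1)) != 0;

	    if (guard && (sticky || (hi & 1)))  /* nearest even */
	        hi++;      /* a carry out to 2^24 means one more
	                      normalization shift and an exponent bump */
	    return hi;
	}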

My point is that there is a significant cost here (but you can use
a 53x53->56 bit multiplier as a 32x32->32 bit multiplier).


Michael Weaver.

hrubin@pop.stat.purdue.edu (Herman Rubin) (05/15/91)

In article <1991May15.003712.5909@jetsun.weitek.COM>, weaver@jetsun.weitek.COM (Mike Weaver) writes:
> In article <12295@mentor.cc.purdue.edu> hrubin@pop.stat.purdue.edu (Herman Rubin) writes:
> >.... Now division is, admittedly, a major headache, but is
> >there any good reason not to use essentially the same hardware for
> >integer and floating multiplication?  
 
 
> One reason: the number of bits in the significand of a floating point
> number is less than the number of bits in the corresponding integer.
 
			.....................
 
> My point is that there is a significant cost here (but you can use
> a 53x53->56 bit multiplier as a 32x32->32 bit multiplier).

On a decent number-crunching problem, 24-bit accuracy is not too 
useful.  It would make far more sense to call IEEE single precision
half precision and IEEE double single precision.  Some architectures
recognize this by not even providing "single precision" hardware.
I know of at least one which uses 48 (really 47) for full precision
and 24 (really 23) for half precision, and provides access to the
least significant part of the product.  It also allows unnormalized
arithmetic, and has essentially no separate integer arithmetic.
The 11-bit IEEE exponent is also not always adequate.

If any more accuracy is needed, it is NECESSARY to go to integer arithmetic.
With unnormalized floating point, 32x32 -> 52 would not be too difficult,
but with IEEE, there is lots of overhead, so I am not convinced that 
26x26 -> 52 in the floating unit would be faster than 16x16 -> 32 in
the integer unit.  This is even more so if there are separate integer
and floating registers.  There are too many situations where forced
normalization is a major headache.
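
As a software illustration of the 26-bit-limb idea, here is a sketch
(the names and layout are made up for the example) of schoolbook
multiple-precision multiplication in base 2^26, where each limb
product is a 26x26 -> 52 multiply that an IEEE double carries out
exactly:

	#include <stdint.h>

	#define LIMB_BITS 26
	#define LIMB_MASK ((1u << LIMB_BITS) - 1)

	/* r (2n limbs) = a (n limbs) * b (n limbs), base 2^26.  Each
	   partial product has at most 52 bits, hence is exact in a
	   double; carries stay in 64-bit integers.  n must stay below
	   about 2^11 so the column accumulator cannot overflow.      */
	static void mpmul(const uint32_t *a, const uint32_t *b,
	                  uint32_t *r, int n)
	{
	    uint64_t acc = 0;
	    for (int k = 0; k < 2 * n; k++) {      /* result column k */
	        for (int i = 0; i < n; i++) {
	            int j = k - i;
	            if (j >= 0 && j < n)
	                acc += (uint64_t)((double)a[i] * (double)b[j]);
	        }
	        r[k] = (uint32_t)acc & LIMB_MASK;
	        acc >>= LIMB_BITS;
	    }
	}

With 16-bit limbs in the integer unit, the same job takes (26/16)**2,
roughly 2.6 times as many limb products, which is the comparison being
made above.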
-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet)   {purdue,pur-ee}!l.cc!hrubin(UUCP)