[net.arch] Integer Multiplier Question

geno@allegra.UUCP (E.Rice) (11/27/85)

I have a question about hardware integer parallel multipliers.
In many languages, the result of a integer multiply has the same length
as the operands.  Therefore it makes sense when building a parallel
multiplier to only generate the low order
half of the product.  A 32x32 multiplier
which only generates a 32 bit product has about half the hardware of
a multiplier generating a full 64 bit product.  Note that the low order
bit pattern of the product is identical whether the numbers are treated as
2's complement or unsigned.

My questions concerns overflow indication.  I wish to know when the product
is too big to fit in the low order half.  Note that overflow is different
in the 2's complement and unsigned cases.  Is there a simple way to
generate overflow indication for both cases?  By simple I mean much cheaper
than generating all the product bits and then testing them.

Thanks in advance for all advice and references.

gnu@l5.uucp (John Gilmore) (11/28/85)

In article <5456@allegra.UUCP>, geno@allegra.UUCP (E.Rice) writes:
> In many languages, the result of a integer multiply has the same length
> as the operands.  Therefore it makes sense when building a parallel
> multiplier to only generate the low order half of the product.

I proposed that the 68020 do this, and Motorola implemented it, but
in the process several people pointed out that if the arguments are 
considered as fractions (eg as in rational arithmetic, or in computing
floating point) then the relevant bits are the *high order* bits of the
double-length product.  Unless you are building something for a specific
application (it sounds like you aren't, if you care about hardware
detection of both signed and unsigned overflow) don't provide this as
the only multiply instruction.

PS:  Motorola didn't; they provide 32x32 --> 32 and 32x32 --> 64
multiply as well as the original 16x16 --> 32.  They punted the
signed/unsigned question by providing separate instructions for both.
In the 68010 it was faster to do it unsigned (40 vs 42 clocks); in the
68020 it doesn't seem to matter.

ken@turtlevax.UUCP (Ken Turkowski) (12/02/85)

In article <291@l5.uucp>, gnu@l5.uucp (John Gilmore) writes:
> In article <5456@allegra.UUCP>, geno@allegra.UUCP (E.Rice) writes:
> > In many languages, the result of a integer multiply has the same length
> > as the operands.  Therefore it makes sense when building a parallel
> > multiplier to only generate the low order half of the product.
> 
> I proposed that the 68020 do this, and Motorola implemented it, but
> in the process several people pointed out that if the arguments are 
> considered as fractions (eg as in rational arithmetic, or in computing
> floating point) then the relevant bits are the *high order* bits of the
> double-length product.

As long as you just want to compute addresses for arrays, the least
significant part of the product is all that is needed.  However, for
any kind of digital signal processing, the most significant part is
needed for digital filters, if you don't want to have to deal with
overflow.
-- 
Ken Turkowski @ CIMLINC (formerly CADLINC), Menlo Park, CA
UUCP: {amd,decwrl,hplabs,seismo,spar}!turtlevax!ken
ARPA: turtlevax!ken@DECWRL.DEC.COM