[comp.hypercube] Fast parallel integer multiply?

abbottj@csv.rpi.edu (John Abbott) (09/17/87)

[ This was on sci.math.  It seems an appropriate question for this
  group.
	Steve
]

Are there any articles about how to multiply large integers quickly on
parallel machines? Maybe related to some of the recentish calculations
of pi to lots of decimal places. Fast division would be nice too.

Thank you.

John Abbott.
Department of Computer Science, RPI, Troy, NY 12180, USA
Telephone number: USA 518 271 2726.

fpst@hubcap.clemson.edu (09/18/87)

[This response to the question of integer multiply was on sci.math]

From: bs@linus.UUCP (Robert D. Silverman)
]In article <67@rpicsb8> abbottj@csv.rpi.edu (John Abbott) writes:
]Are there any articles about how to multiply large integers quickly on
]parallel machines?

The basic question is: how large? If the numbers are, say, several THOUSAND
computer words long (e.g. 20000+ decimal digits), then FFT techniques
can be used, and these parallelize well. Below that, I'm afraid, the
old-fashioned algorithms in Knuth which run in O(N^2) time (N = # bits)
are inherently serial. Ditto for the recursive partitioning algorithms
that run in O(N^lg(3)) time.
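[Ed. note: for concreteness, here is a sketch of the recursive partitioning
scheme (usually credited to Karatsuba) that Knuth describes. Python is used
purely as executable pseudocode; the cutoff of 16 is arbitrary. Three
half-size recursive calls plus linear work give the O(N^lg(3)) bound.]

```python
def karatsuba(x, y):
    """Recursive partitioning multiply: T(N) = 3*T(N/2) + O(N),
    which solves to O(N^lg(3)) ~ O(N^1.585)."""
    if x < 16 or y < 16:            # small operands: use hardware multiply
        return x * y
    n = max(x.bit_length(), y.bit_length())
    h = n // 2
    xh, xl = x >> h, x & ((1 << h) - 1)   # split x into high/low halves
    yh, yl = y >> h, y & ((1 << h) - 1)
    p1 = karatsuba(xh, yh)                # high * high
    p2 = karatsuba(xl, yl)                # low * low
    p3 = karatsuba(xh + xl, yh + yl)      # (xh+xl)*(yh+yl) covers cross terms
    # cross terms = p3 - p1 - p2, so only THREE recursive multiplies are needed
    return (p1 << (2 * h)) + ((p3 - p1 - p2) << h) + p2
```

Note that the three recursive calls are independent, so a few of them can run
in parallel near the top of the recursion, but the available parallelism
shrinks rapidly as the subproblems get small.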

Bob Silverman

howard@cpocd2.UUCP (Howard A. Landman) (09/24/87)

[ This one was on sci.math.
  Steve
]

In article <67@rpicsb8> abbottj@csv.rpi.edu (John Abbott) writes:
>Are there any articles about how to multiply large integers quickly on
>parallel machines?

In article <13410@linus.UUCP>, bs@linus.UUCP (Robert D. Silverman) writes:
> If the numbers are, say, several THOUSAND computer words long (e.g. 20000+
> decimal digits), then FFT techniques can be used, and these parallelize well.
> Below that, I'm afraid, the old-fashioned algorithms in Knuth which run
> in O(N^2) time (N = # bits) are inherently serial. Ditto for the recursive
> partitioning algorithms that run in O(N^lg(3)) time.

In article <579@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes:
>If we have M^2 processors and M^2 memory available (M = # words, where
>multiplying 2 words to produce a double word is available) then the
>usual algorithm can be implemented in log M time.
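[Ed. note: a toy sketch of the shape of Herman's scheme, with Python standing
in for M^2 processors. The function name, the word-list representation, and
the serial tree loop are all illustrative choices, not Herman's: each of the
M^2 one-word products is formed in one conceptual parallel step, and the sum
is then collapsed by a binary tree in O(log M^2) = O(log M) rounds.]

```python
def tree_sum_multiply(a_words, b_words, base):
    """Multiply two M-word numbers (little-endian word lists in the
    given radix) by forming all M^2 partial products 'at once' and
    then summing them with a pairwise tree reduction."""
    # Step 1: one partial product per processor (conceptually parallel).
    partials = [ai * bj * base ** (i + j)
                for i, ai in enumerate(a_words)
                for j, bj in enumerate(b_words)]
    # Step 2: binary-tree reduction; each round halves the list,
    # so the number of rounds is O(log M).
    while len(partials) > 1:
        partials = [partials[k] + (partials[k + 1] if k + 1 < len(partials) else 0)
                    for k in range(0, len(partials), 2)]
    return partials[0]
```

Python's bignums hide the carry handling here; on a real machine each
processor would hold a double word and carries would be resolved during (or
after) the reduction, which is where the log M factor in the time comes from.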

Also, with only 2N bit-serial processors, linked in a linear array where
each processor can communicate only with its 2 nearest neighbors on each side,
we can multiply 2 N-bit numbers in O(N) time.  I forget who first observed
this, but Dick Lyon built such hardware while at Xerox PARC and described
it in his notes on digital signal processing, published internally by Xerox.
This has a better space-time product than Herman's "usual algorithm": O(N^2)
versus O(N^2*logN).
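[Ed. note: a software simulation of the linear-array schedule. This is a
sketch only; the function names and bit conventions are the editor's, not
Lyon's. At time step t, cell i contributes a_i * b_(t-i) to the running
column sum, one product bit leaves the array per step, and the carry stays
behind, so two N-bit inputs finish in O(N) steps.]

```python
def to_bits(x, n):
    """Little-endian bit list of length n."""
    return [(x >> k) & 1 for k in range(n)]

def from_bits(bits):
    return sum(b << k for k, b in enumerate(bits))

def linear_array_multiply(a_bits, b_bits):
    """Simulate the O(N)-step schedule of a linear array of bit-serial
    cells: the inner loop over i models all cells firing in parallel
    during one time step; one product bit is emitted per step."""
    out, carry = [], 0
    for t in range(len(a_bits) + len(b_bits) - 1):   # O(N) time steps
        s = carry
        for i, ab in enumerate(a_bits):              # all cells at once in hardware
            j = t - i
            if 0 <= j < len(b_bits):
                s += ab * b_bits[j]                  # cell i's contribution at step t
        out.append(s & 1)                            # bit t of the product leaves the array
        carry = s >> 1                               # carry stays for the next step
    while carry:                                     # flush the remaining carry bits
        out.append(carry & 1)
        carry >>= 1
    return out                                       # little-endian product bits
```

The simulation does O(N^2) total work, of course; the point is that the work
per time step is spread across the N cells, so the wall-clock schedule is
linear in N.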

It would be fairly easy to implement this on a Connection Machine for numbers
up to at least 32K bits, and possibly as high as 8M bits or more with clever
programming.  It doesn't do much good to get faster than linear, though,
because input and output are still linear.
-- 
	Howard A. Landman
	...!{oliveb,...}!intelca!mipos3!cpocd2!howard
	howard%cpocd2%sc.intel.com@RELAY.CS.NET