[comp.parallel] Parallel algorithms for integer multiplication: R.F.I.

fagin@eleazar.dartmouth.edu (Barry S. Fagin) (12/02/89)

Anyone out there know the best parallel algorithm for large
integer multiplication, and where I could find a reference
for it?  Please reply via e-mail, as I don't track this
newsgroup.  Thanks in advance.

--Barry Fagin

(barry.fagin@dartmouth.edu, ...!dartmouth!eleazar!fagin)

pratt@polya.Stanford.EDU (Vaughan R. Pratt) (12/04/89)

The best parallel algorithm for integer multiplication requires time
O(log n) on n^3 processors (whence this problem is in NC).  It is due
to the Australian computer scientist Chris Wallace, who found it in
1964 or so while working in the US on the Illiac I.  This gave rise to
the so-called Wallace tree, for which one can buy an integrated circuit
off the shelf; any number of these can be connected together to form an
O(log n) multiplier.

Multiplication of a*b can be described as adding together up to n
copies of a, one copy for each 1 bit of b, each copy shifted according
to the position of that 1 bit.  Forming these addends takes constant
time with unbounded fanout, or log n with fanout 2.  By balancing the n
additions they can be carried out in parallel in time log n times the
time for one addition.  By using one of various carry-save schemes
involving redundant representation (e.g. Avizienis notation, sum-carry
representation, etc.) each addition can be performed in constant time.
Conversion from the redundant form to binary at the end takes an
additional log n; if sum-carry representation is used this simply
amounts to adding the sum vector to the shifted carry vector as
ordinary binary numbers.
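The scheme above can be sketched in software. The following is a minimal illustration (not from the post, and using sum-carry representation specifically): each layer applies a 3:2 compressor (a bitwise full adder) to reduce three addends to two in constant depth, so the number of layers is O(log n), and the final conversion from redundant form is a single ordinary binary add of the sum vector and the shifted carry vector.

```python
def csa(a, b, c):
    """3:2 compressor (carry-save adder): reduce three addends to two.

    The sum bits are formed with no carry propagation, and the carries
    are collected into a separate word shifted one place left, so this
    step has constant depth regardless of word length.
    """
    s = a ^ b ^ c
    carry = ((a & b) | (a & c) | (b & c)) << 1
    return s, carry


def wallace_multiply(a, b):
    """Multiply nonnegative integers a*b by Wallace-tree-style reduction."""
    # Form the addends: one shifted copy of a for each 1 bit of b.
    addends = [a << i for i in range(b.bit_length()) if (b >> i) & 1]
    if not addends:
        return 0
    # Each pass is one constant-depth layer of 3:2 compressors;
    # 3 addends become 2, so the list shrinks geometrically.
    while len(addends) > 2:
        reduced = []
        while len(addends) >= 3:
            reduced.extend(csa(addends.pop(), addends.pop(), addends.pop()))
        reduced.extend(addends)  # leftovers (0, 1, or 2 addends) pass through
        addends = reduced
    # Conversion from redundant (sum, carry) form back to binary:
    # one ordinary addition, the log n tail of the computation.
    return sum(addends)
```

Of course, the point of the hardware scheme is that all compressors within a layer fire simultaneously; the sequential loop here only mirrors the layer structure.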

I don't know who first adapted Wallace's hardware method to SIMD
parallel computation.  An early paper doing so appears in STOC-74, the
journal version of which is Pratt and Stockmeyer, "A Characterization
of the Power of Vector Machines", JCSS, 12, 2, 1976, 198-221.
					Vaughan Pratt
-- 
		_______________________________________________________
		Vaughan Pratt   pratt@polya.stanford.edu   415-494-2545
		=======================================================

mohamed@uunet.UU.NET (Moataz Mohamed) (12/04/89)

fagin@eleazar.dartmouth.edu (Barry S. Fagin) writes:

>Anyone out there know the best parallel algorithm for large
>integer multiplication, and where I could find a reference
>for it?  

Shinji Nakamura, "Algorithms for iterative array multiplication",
IEEE Transactions on Computers, Aug 1986.

The above paper presents iterative array (systolic) parallel algorithms
for integer multiplication. One of the algorithms presented employs
folding to reduce the carry delay to "n". However, if you are interested
in building a chip, I would encourage you to think twice before doing it.
I actually designed and implemented (layout and simulations) a chip
for the algorithm. The problems arise from the fact that the folded
array is triangular, which creates lots of layout problems; the delays
encountered due to long wires actually exceed the predicted
theoretical result.

>--Barry Fagin

>(barry.fagin@dartmouth.edu, ...!dartmouth!eleazar!fagin)


.. Moataz Mohamed

mohamed@cs.uoregon.edu

CIS Dept. 
Univ. of Oregon
Eugene, OR 97403

=============================================================================
"People do different things out of the same motives, yet they sometimes do the
same things for dramatically different reasons" --Nobody you know!
=============================================================================