[net.lsi] Systolic Array On a Chip: Summary

sundar@cwruecmp.UUCP (05/03/83)
   Summary of the responses for the Systolic Array on a Chip Query:

	Thanks to all of you for responding to my query. Frankly,
	there weren't very many responses. I am sure more people are
	working in this area than the number of responses suggests.
	Here I have attempted to summarize what I received.

	I tried to find out if anyone had implemented the idea of
	systolic arrays on a chip and, if so, what problems they had.

	At MIT, as part of the 6.371 project, a systolic Stack/Queue
	was designed and fabricated. However, it hasn't been tested,
	and I have no idea what problems they encountered. Also,
	a systolic priority queue was designed last year. It is
	not known if this has been fabricated and tested.

	There are a couple of groups working on using the concept of
	systolic arrays to connect many chips externally (as opposed
	to many components within a chip). There is also some interest
	in chips that are specially designed to facilitate such
	an external interconnection.

	TRW makes an FIR (Finite Impulse Response Filter) chip called
	the TDC1028. It is not known if it is systolic.
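
	For readers unfamiliar with what a systolic FIR filter would
	look like, here is a minimal software sketch. It is NOT a model
	of the TDC1028 (whose internals are unknown to me); it assumes
	one common textbook arrangement in which the weights stay in
	the cells, partial sums move one cell per clock tick, and the
	input samples move at half that speed (two registers per cell).
	The function name and register layout are my own, for
	illustration only.

```python
def systolic_fir(weights, samples):
    """Simulate a linear systolic FIR array, one cell per weight.

    Assumed design (illustrative only): weights stay in place,
    partial sums y advance one cell per tick, inputs x advance
    one cell every two ticks. Cell j computes y_out = y_in + w[j] * x_in.
    """
    k = len(weights)
    x_pipe = [0] * (2 * (k - 1))   # two x registers between adjacent cells
    y_pipe = [0] * (k - 1)         # one y register between adjacent cells
    outputs = []

    # Run k-1 extra ticks with zero input to flush the pipeline.
    for t in range(len(samples) + k - 1):
        x_in = samples[t] if t < len(samples) else 0

        # Cell j sees the input delayed by 2*j ticks.
        x_at = [x_in] + [x_pipe[2 * j - 1] for j in range(1, k)]

        # Every cell fires in the same tick, using only its neighbor's
        # registered outputs from the previous tick.
        y_out = []
        for j in range(k):
            y_in = 0 if j == 0 else y_pipe[j - 1]
            y_out.append(y_in + weights[j] * x_at[j])

        # The last cell produces a valid result after k-1 ticks of latency.
        if t >= k - 1:
            outputs.append(y_out[k - 1])

        # Clock edge: shift the registers.
        x_pipe = ([x_in] + x_pipe)[: 2 * (k - 1)]
        y_pipe = y_out[: k - 1]

    return outputs
```

	Feeding an impulse, e.g. systolic_fir([1, 2, 3], [1, 0, 0, 0]),
	returns the weights followed by a zero, as one expects of an
	FIR filter. Note that no cell talks to anything but its
	immediate neighbor; only the clock is global.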

	The NYU Ultracomputer uses a systolic structure that performs
	2x2 switching, queues messages in a congested network and
	combines messages destined for the same memory address.
	The implementation is yet to be done. The interested reader
	is referred to the Feb 83 issue of IEEE Transactions on Computers.

	At CMU, 88-pin chips have been fabricated that contain 64x64
	microstore, some random logic and RAM. I am told that they had
	a 100% failure rate after the first fabrication attempt.
	However, all errors seem to have been detected and corrected,
	and a second fabrication order to MOSIS has already been sent.
	The idea is to use these chips in a systolic array for solving
	partial differential equations.

	The University of Waterloo has developed a Chess Legal Move
	Generator using the concept of systolic arrays.
	A description of the chip can be found in the proceedings of the
	3rd Caltech Conference on VLSI (1983).  A more readable version
	of the paper can be found in the May-June 1983 issue of VLSI Design.


	Though a lot of good work has been reported in these
	responses, I feel that my original question is still left
	unanswered. I was (and still am) looking for problems at the
	implementation stage (e.g. the well-known global clock
	distribution problem) and design alternatives to correct them
	(e.g. use of asynchronous modules in the chip, tree distribution
	of the clock, etc.) that have actually been proven to be
	feasible in practice.
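
	To make the "tree distribution" alternative concrete, here is a
	small sketch of one such layout, the H-tree, in which every
	leaf is the same wire length from the root. It is only a
	geometric illustration under idealized assumptions (wire delay
	proportional to length, no loading or process variation) and
	says nothing about feasibility in practice, which is exactly
	the question I am asking. The function name and parameters are
	hypothetical.

```python
def h_tree_leaves(cx, cy, size, levels, wire=0.0):
    """Return (x, y, wire_length_from_root) for every leaf of an H-tree.

    Each level draws an "H" centered at (cx, cy): a horizontal bar of
    length `size` with a vertical bar of length `size` at each end.
    The four tips of the H become centers of half-size H's.
    """
    if levels == 0:
        return [(cx, cy, wire)]
    half = size / 2
    leaves = []
    for dx in (-half, half):
        for dy in (-half, half):
            # Center to tip: half the horizontal bar plus half a vertical bar.
            leaves += h_tree_leaves(cx + dx, cy + dy, half,
                                    levels - 1, wire + 2 * half)
    return leaves


leaves = h_tree_leaves(0.0, 0.0, 8.0, 3)
lengths = {wire for (_, _, wire) in leaves}
```

	With three levels this reaches an 8x8 grid of 64 cells, and
	`lengths` contains a single value: in this idealized model the
	clock edge arrives at every cell at the same time.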

	If you would like to contribute to this list, please mail me
	the information. I will post a second summary to the net.

	Sundararavarathan R Iyengar (sundar)
	Case Western Reserve University
	decvax!cwruecmp!sundar,	sundar.Case@UDEL-RELAY