[comp.arch] New Barrel Processor Work

aglew@mcdurb.Urbana.Gould.COM (06/01/89)

Just a note that the latest IEEE Transactions on Computers
contains a paper by a Japanese group who are exploring
HEP-style barrel processing, saying that it is appropriate
for Josephson junction computers (i.e., for the tradeoffs
that come with JJ) and for Si machines with high-latency memory.

kleonard@gvlv2.GVL.Unisys.COM (Ken Leonard) (06/02/89)

* Just a note that the latest IEEE Transactions on Computers
* contains a paper by a Japanese group who are exploring
* HEP-style barrel processing, saying that it is appropriate
* for Josephson junction computers (i.e., for the tradeoffs
* that come with JJ) and for Si machines with high-latency memory.
Yes, hmmm. Is anyone else working in this area?

leichter@CS.YALE.EDU (Jerry Leichter (LEICHTER-JERRY@CS.YALE.EDU)) (06/03/89)

In article <229@gvlv2.GVL.Unisys.COM>, kleonard@gvlv2.GVL.Unisys.COM (Ken Leonard) writes...
>* Just a note that the latest IEEE Transactions on Computers
>* contains a paper by a Japanese group who are exploring
>* HEP-style barrel processing, saying that it is appropriate
>* for Josephson junction computers (i.e., for the tradeoffs
>* that come with JJ) and for Si machines with high-latency memory.
>Yes, hmmm. Is anyone else working in this area?

The Horizon project, of which Burton J. Smith, one of the fathers of the HEP,
is a principal member, is designing a supercomputer based on barrel-like
ideas, among others.  There are several papers describing the system in the
Proceedings of Supercomputing '88, which are available as an IEEE publication.
Quite a machine - to be composed of 256 processors and 512 memory modules on
a very fast interconnect providing a 2^48th bit shared address space.  The
processors should have a 5ns cycle time, with each instruction containing two
operations.  The ISP has some really interesting ideas.  For example, there
are no branch delay slots.  Instead, the icache is, in a sense, "exposed":  A
branch takes its address from one of a small number of "target" registers
which you must pre-load.  Once a target register is loaded, the processor can
begin speculatively filling the icache.
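
A toy model of why pre-loading a target register helps, as a sketch only.
The register count (8), the fill latency (60 cycles) and all of the names
below are made up for illustration; none of them come from the Horizon
papers.  The point is just that the earlier the target is loaded, the less
of the instruction fetch latency is still outstanding when the branch
actually executes.

```python
class TargetRegisters:
    """Toy model: loading a target register starts an icache fill."""

    def __init__(self, n_regs=8, fill_latency=60):
        # n_regs and fill_latency are hypothetical values, not Horizon's.
        self.addr = [None] * n_regs      # branch target held in each register
        self.ready_at = [None] * n_regs  # cycle at which its fill completes
        self.fill_latency = fill_latency

    def load(self, reg, addr, cycle):
        """Pre-load a target; the speculative fill starts at `cycle`."""
        self.addr[reg] = addr
        self.ready_at[reg] = cycle + self.fill_latency

    def branch(self, reg, cycle):
        """Branch via `reg`; return (target, cycles stalled on the fill)."""
        if self.addr[reg] is None:
            raise ValueError("target register not loaded")
        stall = max(0, self.ready_at[reg] - cycle)
        return self.addr[reg], stall
```

In this model a target loaded at cycle 0 and used at cycle 100 costs no
stall, while one loaded only 10 cycles before the branch still has 50
cycles of fill left to wait for.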

The barrel stuff shows up as support by the hardware for up to 128 threads
of execution.  The cost for shifting to another thread is zero - in fact, in
normal operation it's rare for two successive instructions to come from the
same thread.  Thread selection is done by the hardware based on the readiness
of the next instruction in the thread; a lot of latency is hidden here.  (They
assume a 20-cycle access time in each memory unit plus an average 40-60 cycles
in the routing network, so they have a lot to hide!  The functional units are
pipelined, but they're no one-cycle jobbies either.)
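
To get a feel for how many threads it takes to hide that kind of latency,
here is a minimal simulation sketch.  It is not the Horizon selection
logic: it assumes, pessimistically, that every instruction is a memory
reference that keeps its thread busy for the full round trip, and it picks
ready threads in simple round-robin order.  The function name and the
80-cycle figure (20 in the memory unit plus the upper 60 in the network)
are choices made for the example.

```python
def barrel_utilization(n_threads, latency, cycles=10_000):
    """Fraction of cycles in which some thread was ready to issue."""
    ready_at = [0] * n_threads  # cycle at which each thread may issue again
    issued = 0
    nxt = 0                     # round-robin starting point for the scan
    for cycle in range(cycles):
        for k in range(n_threads):
            t = (nxt + k) % n_threads
            if ready_at[t] <= cycle:
                # Issue from thread t; it is then busy for `latency` cycles.
                ready_at[t] = cycle + latency
                issued += 1
                nxt = (t + 1) % n_threads
                break
        # If no thread was ready, the cycle is simply lost.
    return issued / cycles


if __name__ == "__main__":
    for n in (16, 64, 128):
        print(n, "threads:", barrel_utilization(n, latency=80))
```

Under these assumptions the processor stays fully busy once the number of
threads reaches the latency in cycles, and utilization falls off roughly as
threads/latency below that, which is why 128 hardware threads is a sensible
number against 60-80 cycles of memory latency.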

Like the HEP, Horizon has support for data-flow-like multiprocessing in the
memory system, which carries (among other things) a "full/empty" bit with
each 64-bit word.  You can, for example, specify that a memory access is to
stall until the word is "full".  Interestingly, the memory also provides
unlimited indirection - there's an "indirect" bit associated with each word,
too.  I'm not sure what they want this for - none of the examples they give
use it.
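
A small sketch of how full/empty and indirect bits might behave, to make
the idea concrete.  The semantics here are assumptions in the HEP style (a
synchronized read waits for "full" and leaves the word "empty"; a
synchronized write waits for "empty" and leaves it "full"), the class and
method names are invented, and a stall is modelled by returning a failure
flag rather than actually blocking.  The hop limit exists only so that the
sketch catches accidental cycles.

```python
from dataclasses import dataclass


@dataclass
class Word:
    value: int = 0
    full: bool = False      # the full/empty bit
    indirect: bool = False  # if set, `value` is the address of another word


class Memory:
    def __init__(self, size):
        self.words = [Word() for _ in range(size)]

    def _resolve(self, addr, max_hops=64):
        """Follow indirect bits until a word that holds data is reached."""
        for _ in range(max_hops):
            word = self.words[addr]
            if not word.indirect:
                return addr
            addr = word.value
        raise RuntimeError("indirection chain too long (cycle?)")

    def read_when_full(self, addr):
        """Return (True, value) and mark the word empty, or (False, None)
        if the access would have to stall."""
        word = self.words[self._resolve(addr)]
        if not word.full:
            return False, None
        word.full = False
        return True, word.value

    def write_when_empty(self, addr, value):
        """Store and mark the word full; return False if it would stall."""
        word = self.words[self._resolve(addr)]
        if word.full:
            return False
        word.value, word.full = value, True
        return True
```

A consumer that reads a word before the producer has written it simply
fails (stalls) until the write arrives, which gives data-flow-style
synchronization without any separate lock.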

If you want to get away from all the RISC/CISC flaming, have a look at these
papers.  The problems and constraints in this design space are so different
from the microprocessor world that the solutions aren't even vaguely
classifiable in the same terms.

The whole thing should be good for 100 gigaflops.  It's not something I'll be
seeing on MY desk any time soon, but it's nice to see that there are still
people with big ideas out there....
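
The 100 gigaflops figure is consistent with the numbers quoted above, as a
back-of-the-envelope check shows.  This assumes the best case: every
processor issues an instruction every 5ns cycle and both operations in each
instruction are floating-point operations.  The function name is just for
the example.

```python
def peak_ops_per_second(processors, cycle_time_ns, ops_per_instruction):
    """Peak operation rate if every processor issues on every cycle."""
    instructions_per_second = 1e9 / cycle_time_ns  # per processor
    return processors * instructions_per_second * ops_per_instruction


if __name__ == "__main__":
    # 256 processors, 5ns cycle, two operations per instruction.
    print(peak_ops_per_second(256, 5, 2) / 1e9, "G operations/second")
```

That works out to 256 x 200 million x 2, or about 102 billion operations
per second at peak.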
							-- Jerry

hankd@pur-ee.UUCP (Hank Dietz) (06/06/89)

"Horizon" was the name for it when Burton Smith was at the SRC and
it was strictly a research effort; he now has a company called "Tera"
which will be marketing such a machine.  Watch for it....

						-hankd@ee.ecn.purdue.edu