[comp.arch] superconcurrency and the cpu with 3 brains

msp33327@uxa.cso.uiuc.edu (Michael S. Pereckas) (11/09/90)

In the October 1990 Supercomputing Review Richard Freund and Sunny
Conwell wrote about ``superconcurrency''---distributed heterogeneous
supercomputing.  The idea is that since specialized processors do best
on the tasks that they were designed for, what we need are collections
of different specialized processors.  

In each program there are parts that run best on vector machines,
parts that run best on MIMDs, SIMDs, etc., so the problem should
be spread around, they say.  

The machines can be widely distributed and networked, or all in the
same box.  Perhaps this is a good use for the CPU with 3 brains.  Two
RISC scalar units and a vector unit on one chip, anyone?  

A shared-memory heterogeneous multiprocessor might allow even small
sections of code to run on the best-suited processor.  I suppose this
could be done as a VLIW, but another (maybe better, or at least more
flexible) possibility would be a MIMD-style system, with each process
running on one CPU, possibly spawning sub-processes that run on
different CPUs.  (And you thought that programming was hard now :-))
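
To make the "run each section on the best-suited processor" idea a bit
more concrete, here is a rough sketch in Python.  Everything in it is
an assumption for illustration: the processor names, the section
kinds, and especially the relative run times, which are made up.  It
also ignores the cost of spawning work on another CPU and of moving
data around, which a real system could not.

```python
# Hypothetical relative run times (lower is better) for each kind of code
# section on each kind of processor.  These numbers are invented purely
# for illustration; they are not measurements of any real machine.
RELATIVE_TIME = {
    "vector": {"vector_unit": 1.0, "risc_scalar": 8.0},
    "scalar": {"vector_unit": 5.0, "risc_scalar": 1.0},
}


def best_processor(kind):
    """Return the processor with the lowest relative time for this kind."""
    times = RELATIVE_TIME[kind]
    return min(times, key=times.get)


def assign(sections):
    """Map each (name, kind) section to its best-suited processor."""
    return [(name, best_processor(kind)) for name, kind in sections]


def total_time(sections, processor=None):
    """Sum relative times, either on one fixed processor or on the best one
    for each section.  Communication and spawn overhead are ignored."""
    total = 0.0
    for _name, kind in sections:
        unit = processor or best_processor(kind)
        total += RELATIVE_TIME[kind][unit]
    return total


# A made-up program: scalar setup, a vectorizable inner loop, scalar output.
PROGRAM = [("setup", "scalar"), ("inner_loop", "vector"), ("output", "scalar")]
```

Calling assign(PROGRAM) pairs the two scalar sections with the scalar
unit and the inner loop with the vector unit, and total_time(PROGRAM)
can be compared against total_time(PROGRAM, "risc_scalar") to see what
the mixed system buys you under these assumed numbers.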

Of course, I'll be the first to admit that I don't really know what
I'm talking about.  Have at it, guys :-)

--
Michael Pereckas               * InterNet: m-pereckas@uiuc.edu *
just another student...          (CI$: 72311,3246)
*Jargon Dept.: Decoupled Architecture--sounds like the aftermath of a tornado*

jimf@idayton.field.intel.com (Jim Fister) (11/10/90)

msp33327@uxa.cso.uiuc.edu (Michael S. Pereckas) writes:

>                    ``superconcurrency''---distributed heterogeneous
>supercomputing.  The idea is that since specialized processors do best
>on the tasks that they were designed for, what we need are collections
>of different specialized processors.  

>The machines can be widely distributed and networked, or all in the
>same box.  Perhaps this is a good use for the CPU with 3 brains.  Two
>RISC scalar units and a vector unit on one chip, anyone?  

Something like the i860?  

Sorry, couldn't resist.  Anyway, I've heard talk of people counting all of 
the transistors in, say, a Cray and thinking something like, gee, my micro
budget in the year 2000 will be twice that big.  What's to say that some
company won't just sweep some parallel supercomputer onto sand in the not-so-
far future?  AMD and Intel both have onsey, twosey pc-on-a-chip parts now.
Somebody should be putting a real computer (define real: non-DOS) there
any day, right?

Disclaimer:  They never tell me anything, so I can't speak for them.  I'm
just lil' ol' me.

Greetings from the Rocking Metropolis
JimF

msp33327@uxa.cso.uiuc.edu (Michael S. Pereckas) (11/11/90)

In <1990Nov9.213205.8026@idayton.field.intel.com> jimf@idayton.field.intel.com (Jim Fister) writes:

>Something like the i860?  

>Sorry, couldn't resist.  Anyway, I've heard talk of people counting all of 
>the transistors in, say, a Cray and thinking something like, gee, my micro
>budget in the year 2000 will be twice that big.  What's to say that some
>company won't just sweep some parallel supercomputer onto sand in the not-so-
>far future?  AMD and Intel both have onsey, twosey pc-on-a-chip parts now.
>Somebody should be putting a real computer (define real: non-DOS) there
>any day, right?

I don't claim to have any expertise in this (or much of anything
else), but doesn't putting the cpu on a large number of chips instead
of just one help with the I/O pin problem?  If you put a Y/MP cpu on
one chip, how many pins would you need?  And don't forget power and
ground pins.  The Cray-3 and the latest NEC design are interconnect
nightmares as it is.  

The i860 is bottlenecked at the memory bus.  (So was the Cray-1.)
Multiple memory ports and a memory divided into many banks (like the
X/MP and Y/MP) can greatly increase performance.  To operate on long
vectors at ~1 FLOP/CLOCK you have to read 2 words and write 1 word
with each clock, and that takes a lot of wires.  The i860 reaches its
peak only when one operand is a constant.
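
For a feel for the numbers, here is a back-of-the-envelope sketch in
Python.  The 64-bit word size is my assumption, and the bank model is
a toy: one access issued per clock at most, each bank busy for a fixed
number of clocks after an access starts, and addresses interleaved
across banks word by word.  The bank counts and busy times used with
it are hypothetical, not figures for any real machine.

```python
def data_wires(reads_per_clock, writes_per_clock, word_bits=64):
    """Data lines needed to move that many words every clock.
    Address, control, power and ground lines are NOT counted."""
    return (reads_per_clock + writes_per_clock) * word_bits


def clocks_for_stream(n_accesses, n_banks, busy_cycles, stride=1):
    """Clocks needed to issue n_accesses through one port into an
    interleaved memory, stalling whenever the target bank is still busy."""
    free_at = [0] * n_banks          # clock at which each bank is free again
    clock = 0
    for i in range(n_accesses):
        bank = (i * stride) % n_banks
        clock = max(clock, free_at[bank])   # stall until the bank is free
        free_at[bank] = clock + busy_cycles
        clock += 1                           # at most one issue per clock
    return clock
```

With these assumptions, data_wires(2, 1) comes to 192 data lines before
a single address or power pin is added.  clocks_for_stream(64, 16, 4)
sustains one access per clock, while the same stream into a single
bank, or with a stride that always hits the same bank, stalls on
nearly every access.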

--
Michael Pereckas               * InterNet: m-pereckas@uiuc.edu *
just another student...          (CI$: 72311,3246)
*Jargon Dept.: Decoupled Architecture--sounds like the aftermath of a tornado*