[comp.arch] Multiple data buses

mccalpin@perelandra.cms.udel.edu (John D. McCalpin) (04/27/91)
In article <TH_A6-F....>, peter@ficc.ferranti.com (peter da silva) writes:
> Ah, the old Von Neumann bottleneck. Time to apply RISC design techniques to
> memory subsystems. How about multiported memory, or even banked memory?
> Multiple data and address busses? Who knows, but memory subsystems are the
> current bottleneck and something's gotta give.

Of course, "real" supercomputers have been using banked memory with
(effectively) multiple data busses for a decade or so....  For example
(since Herman brought it up), the ETA-10 could execute 4 64-bit loads,
2 64-bit stores, 2 64-bit FP adds, and 2 64-bit FP multiplies per cpu
per cycle.
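
To put those per-cycle figures in perspective, here is a rough sketch of
the arithmetic. The triad kernel a[i] = b[i] + s*c[i] is my own choice of
illustration, not something from the ETA-10 documentation, and the
function names are hypothetical; only the per-cycle counts come from the
figures quoted above.

```python
# Per-cycle resources quoted above for one ETA-10 CPU.
ETA10_PER_CYCLE = {"load": 4, "store": 2, "fadd": 2, "fmul": 2}

# Assumed kernel, for illustration only: a[i] = b[i] + s * c[i]
# Each element needs 2 loads (b, c), 1 store (a), 1 add, 1 multiply.
TRIAD_PER_ELEMENT = {"load": 2, "store": 1, "fadd": 1, "fmul": 1}

WORD_BYTES = 8  # 64-bit operands


def elements_per_cycle(machine, kernel):
    """Upper bound on elements per cycle: the scarcest resource sets the pace."""
    return min(machine[op] / need for op, need in kernel.items() if need > 0)


def memory_bytes_per_cycle(machine):
    """Peak bytes moved between memory and the CPU per cycle."""
    return (machine["load"] + machine["store"]) * WORD_BYTES


def flops_per_cycle(machine):
    """Peak floating-point operations per cycle."""
    return machine["fadd"] + machine["fmul"]


if __name__ == "__main__":
    print(elements_per_cycle(ETA10_PER_CYCLE, TRIAD_PER_ELEMENT))
    print(memory_bytes_per_cycle(ETA10_PER_CYCLE))
    print(flops_per_cycle(ETA10_PER_CYCLE))
```

For this particular kernel the quoted rates balance exactly: two elements
per cycle, with every load, store, add, and multiply slot in use. This is
a peak bound only; it ignores startup time and bank conflicts.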

With all the talk of "vector" processing on micros, it is helpful to
consider the following hierarchy of architectural features that lead
to "real" supercomputers:

	1. multiple functional units	(separate FP add and multiply)
	2. pipelined functional units	(independent stages for FP ops)
	3. multi-word transfer from cache/(vector registers) to FPU
	4. multi-word transfer from main memory to cache/(vector registers)/(FPU)

Examples of machines reaching to each level:
	1. MIPS R2000/3000
	2. Intel i860(*), IBM RIOS(*), Motorola 88000
	3. Cray 1, Cray 2, Convex C200, IBM 3090VF
	4. Cray X-MP, Cray Y-MP, Cyber 205/ETA-10

(*) These architectures sacrifice independent multiple functional
units in favor of one or more combined add/multiply instructions.
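
The step from category 3 to category 4 rests on banked (interleaved)
memory, so a toy model may help. The sketch below assumes simple
low-order interleaving, a single access stream issuing one word per
cycle, and a fixed bank busy time; the bank count and busy time in the
example calls are made-up numbers, not those of any machine listed
above.

```python
from math import gcd


def banks_touched(n_banks, stride):
    """Distinct banks visited by a constant-stride stream when
    bank = word_address mod n_banks (low-order interleaving)."""
    return n_banks // gcd(n_banks, stride)


def sustained_words_per_cycle(n_banks, busy_cycles, stride):
    """Steady-state rate for one stream that tries to issue one word per
    cycle. Each bank can deliver one word every busy_cycles, so the
    stream is limited by how many banks it spreads its accesses over."""
    return min(1.0, banks_touched(n_banks, stride) / busy_cycles)


if __name__ == "__main__":
    # Hypothetical memory: 16 banks, each busy for 4 cycles per access.
    for stride in (1, 2, 8, 16):
        print(stride, sustained_words_per_cycle(16, 4, stride))
```

In this model unit stride keeps all 16 banks in rotation and the stream
runs at full speed, while a stride of 16 hits the same bank every time
and drops to one word per bank busy time. Real machines add multiple
ports and more elaborate conflict resolution, which this sketch ignores.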

I reserve the term "vector processor" for the last two categories, and
apply it to category 3 only with reservations.  I reserve the term
"supercomputer" for machines that implement the feature of category 4.
(Of course, nomenclature gets more difficult for parallel machines....)
--
John D. McCalpin			mccalpin@perelandra.cms.udel.edu
Assistant Professor			mccalpin@brahms.udel.edu
College of Marine Studies, U. Del.	J.MCCALPIN/OMNET