[comp.arch] On chip caches

neideck@kaputt.dec.com (Burkhard Neidecker-Lutz) (11/10/89)

For those who believe they can get by without at least some on-chip
cache (never mind RISC or CISC, as long as things move fast I'll take it),
there is an interesting paper:

	Integration and Packaging Plateaus of Processor Performance
	Norman P. Jouppi
	International Conference on Computer Design
	IEEE, Cambridge, Massachusetts, October 2-4, 1989

He starts out with some of the exotic ideas used by the high-performance
folks (Cray, Prisma) and their packaging (well, the VAX 9000 falls somewhat
in the same category) and then dismisses all that for a very simple reason:

He doesn't count gate delays (go ahead, choose a 0ns technology) but only
counts the times it takes to get off-chip, off-board, etc. It is an interesting
thought experiment, and even better when you have a simulator to "design"
a processor and run programs on it, as Norman has. The simulator runs
variations of the MultiTitan research processor, a very MIPS-like RISC machine.

Makes for little guessing and relatively accurate numbers. It turns out that
there is a gap of a factor of about 10 to 20 between machines that have
to go for an external cache each cycle and those that do not have to. The
limiting factor is the interconnect delay and that cannot be arbitrarily 
lowered. The whole data is impossible to reproduce here, read the paper.
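
To make the thought experiment concrete, here is a minimal sketch of the
"0ns gate" argument: if logic is free, the cycle time is bounded below by the
interconnect level that has to be crossed every cycle. The delay values and
the names INTERCONNECT_NS, min_cycle_ns and gap are made up for illustration;
they are not figures or terms from the paper.

```python
# Hypothetical round-trip interconnect delays in nanoseconds.
# Illustrative assumptions only, NOT numbers from Jouppi's paper.
INTERCONNECT_NS = {
    "on-chip": 1.0,
    "off-chip": 10.0,
    "off-board": 30.0,
}


def min_cycle_ns(level, gate_delay_ns=0.0):
    """Lower bound on cycle time if every cycle must reach `level`.

    gate_delay_ns defaults to 0.0, mirroring the "choose a 0ns
    technology" premise: only the interconnect crossing is counted.
    """
    return gate_delay_ns + INTERCONNECT_NS[level]


def gap(slow_level, fast_level):
    """Ratio of minimum cycle times between two packaging levels."""
    return min_cycle_ns(slow_level) / min_cycle_ns(fast_level)


if __name__ == "__main__":
    for level in INTERCONNECT_NS:
        print(f"{level:10s} min cycle {min_cycle_ns(level):5.1f} ns")
    print("off-chip vs on-chip gap:", gap("off-chip", "on-chip"))
```

With these assumed delays the off-chip machine cannot cycle faster than ten
times slower than the on-chip one, no matter how fast the gates are. The
real ratios depend on the packaging and are in the paper.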

		Burkhard Neidecker-Lutz, Digital CEC Karlsruhe,
					 Project NESTOR

jesup@cbmvax.UUCP (Randell Jesup) (12/03/89)

In article <8911100815.AA00800@decwrl.dec.com> neideck@kaputt.dec.com (Burkhard Neidecker-Lutz) writes:
>He doesn't count gate delays (go ahead, choose a 0ns technology) but only
>counts the times it takes to get off-chip, off-board, etc.
...
>Makes for little guessing and relatively accurate numbers. It turns out that
>there is a gap of a factor of about 10 to 20 between machines that have
>to go for an external cache each cycle and those that do not have to. The
>limiting factor is the interconnect delay and that cannot be arbitrarily 
>lowered. The whole data is impossible to reproduce here, read the paper.

	As I (and others here on comp.arch) have been saying: interconnect
speed is rapidly becoming a limiting factor for microprocessors (as it
already is for mainframes/supers).  However, the statement that interconnect
delay cannot be arbitrarily lowered is misleading at best.  It can be lowered
in a number of ways (some of which require major changes in packaging/fab and
related changes in chip design - not just a drop-in).  Interconnect delay
cannot be lowered to 0, but then again neither can gate delays.
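
	A rough way to see why both terms matter is to treat the cycle
time as gate delay plus interconnect delay and ask what improving only one of
them buys. The additive model, the function names and the numbers below are
assumptions for illustration, not measurements of any real machine.

```python
def cycle_ns(gate_ns, interconnect_ns):
    """Assumed additive model: cycle time = gate delay + interconnect delay."""
    return gate_ns + interconnect_ns


def speedup(gate_ns, interconnect_ns, gate_factor=1.0, interconnect_factor=1.0):
    """Speedup from dividing each delay component by its improvement factor."""
    before = cycle_ns(gate_ns, interconnect_ns)
    after = cycle_ns(gate_ns / gate_factor, interconnect_ns / interconnect_factor)
    return before / after


if __name__ == "__main__":
    # Hypothetical machine: 2 ns of gate delay, 8 ns of interconnect delay.
    print("gates 2x faster:       ", speedup(2.0, 8.0, gate_factor=2.0))
    print("interconnect 2x faster:", speedup(2.0, 8.0, interconnect_factor=2.0))
```

	In this made-up case halving the gate delay helps far less than
halving the interconnect delay, and neither term can be driven to zero.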

	Sounds like an interesting paper.

-- 
Randell Jesup, Keeper of AmigaDos, Commodore Engineering.
{uunet|rutgers}!cbmvax!jesup, jesup@cbmvax.cbm.commodore.com  BIX: rjesup  
Common phrase heard at Amiga Devcon '89: "It's in there!"