[net.arch] Cache Revisited: Really, Data Caches are OK

mash@mips.UUCP (John Mashey) (09/12/85)

> In article <455@mtxinu.UUCP> ed@mtxinu.UUCP (Ed Gould) writes:
> >In article <170@mips.UUCP> mash@mips.UUCP (John Mashey) writes:
> >>                                                 Consider the ultimate
> >>case: a smart compiler and a machine with many registers, such that
> >>most code sequences fetch a variable just once, so that most data references
> >>are cache misses.  Passing arguments in registers also drives the hit
> >>rate down.
> >
> >With this ultimate machine/compiler combination it seems intuitively
> >that a data cache would then be a *bad* idea, since having a cache
> >can't be faster than an uncached memory reference (for what would
> >be a miss) and is often slower.  We can then use the real estate saved
> >for even more registers!

My original words must have caused some confusion.  Let me fix that.

1) There was never the slightest intent to downplay the use of data caches,
which are, after all, Good Things.  The original impetus for all of this
was an offhand comment about the danger of comparing cache hit rates
without understanding what else is going on in the system.  "Ultimate"
didn't mean that the best possible machine has zillions of registers and
no data cache; it meant the worst cache-hit rate you could get, which
might well have nothing to do with overall CPU performance.
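
To make the hit-rate point concrete, here is a toy model (mine, not
measured data).  Everything in it is an assumption for illustration: one
variable per cache line, a fully associative LRU cache, and an idealized
compiler that keeps each variable in a register after loading it once.
The names simulate, hit_rate, cache_lines and num_registers are
hypothetical.

```python
from collections import OrderedDict

def simulate(trace, cache_lines=64, num_registers=0):
    """Return (memory_refs, hits, misses) for a trace of variable names.

    Assumed model: one variable per cache line, fully associative LRU
    cache, and a compiler that keeps up to `num_registers` distinct
    variables in registers once each has been loaded from memory.
    """
    cache = OrderedDict()      # variable -> True, ordered by recency
    in_register = set()
    memory_refs = hits = misses = 0
    for var in trace:
        if var in in_register:
            continue           # served by a register; cache never sees it
        memory_refs += 1
        if var in cache:
            hits += 1
            cache.move_to_end(var)
        else:
            misses += 1
            cache[var] = True
            if len(cache) > cache_lines:
                cache.popitem(last=False)   # evict least recently used
        if len(in_register) < num_registers:
            in_register.add(var)
    return memory_refs, hits, misses

def hit_rate(hits, misses):
    total = hits + misses
    return hits / total if total else 0.0

trace = ["a", "b", "c", "d"] * 100    # 4 variables, 100 touches each
no_regs = simulate(trace, num_registers=0)
with_regs = simulate(trace, num_registers=32)

print("no registers:  ", no_regs, hit_rate(*no_regs[1:]))
print("with registers:", with_regs, hit_rate(*with_regs[1:]))
```

In this sketch both runs take the same 4 misses, but the run with
registers makes only 4 memory references instead of 400, so its hit rate
falls from 99% to 0% while the machine does strictly less memory work.
That is the sense in which a hit rate by itself tells you little.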

2) Tradeoff between registers and data caches:
Caches are good because:
	a) They are uniform; the compiler need not know what's going on.
	b) They adapt to real usage frequency and locality of reference.
	c) They can be much bigger than what you can put on a single chip.
Registers are good because:
	a) A reasonable number can go on a chip and be accessed very fast.
	b) A smart compiler can make good use of a bunch of them.
	c) It doesn't take many bits to address them fast.  [Note that to
	address memory on most machines, you need either all or most of
	a full virtual address, or (register) + offset; the first
	takes bits; the second needs an adder in the critical path.  Yes,
	I'll ignore certain kinds of multiple register sets, stack
	caches, and the like.]
BOTTOM LINE: they do different things, and they're both worthwhile; and all
of this is too general to be worth much!
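
The "not many bits" point about registers can be put in numbers with a
little arithmetic.  The sizes here are assumptions picked for
illustration (32 registers, a 32-bit byte-addressed virtual space), and
specifier_bits is a hypothetical helper, not anything from a real
instruction set.

```python
def specifier_bits(count):
    """Bits needed to name one of `count` distinct locations."""
    return max(1, (count - 1).bit_length())

REGISTERS = 32            # assumed register-file size
ADDRESS_SPACE = 2 ** 32   # assumed 32-bit byte-addressed virtual space

reg_bits = specifier_bits(REGISTERS)
addr_bits = specifier_bits(ADDRESS_SPACE)

# Naming three operands: registers versus full memory addresses.
three_reg_operands = 3 * reg_bits
three_mem_operands = 3 * addr_bits

print(reg_bits, addr_bits, three_reg_operands, three_mem_operands)
# prints: 5 32 15 96
```

Under these assumptions three register operands fit in 15 bits, while
three full addresses would need 96; the (register) + offset form saves
bits but pays for the add instead.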

3) It is very hard to debate any single feature in computer architecture
without looking at the others; intuition is always bad; the whole topic
is rife with hidden gotchas that grab you when you try to fix something that
looks simple, especially when you're aiming at high performance.

4) A good recent reference on cache behavior and what's known is:
"Cache Evaluation and the Impact of Workload Choice", Alan Jay Smith,
SIGARCH 13, 3 (June 85).
-- 
-john mashey
UUCP: 	{decvax,ucbvax,ihnp4}!decwrl!mips!mash
DDD:  	415-960-1200
USPS: 	MIPS Computer Systems, 1330 Charleston Rd, Mtn View, CA 94043