[comp.arch] I cache/D cache

my@dtg.nsc.com (Michael Yip) (12/17/90)

In article <44130@mips.mips.COM> mash@mips.COM (John Mashey) writes:
>5) Amongst the MIPS camp, there are chips with R3000+R3010 stuck
>together (Performance Semiconductor), which can help performance & cost;
>and embedded control chips
>(IDT, LSIL) which integrate up to 8KB I-cache & 2KB D-cache with
>an R3000 + buffering (i.e., 1,2,5,6,7, and sometimes 4, but not 3)
>together.   Intel 960s come in various combinations, some of
>which are similar to these.

Can someone tell me why IDT or LSI chose to integrate 8K I-cache and
2K D-cache?  From what I understand, a 2K I-cache and 8K D-cache may
yield a better overall cache hit rate.  I have always thought that a
smaller I-cache is "enough" to do the job (so that small loops and
sequential instructions get executed from the cache), and that since
data references are not as localized as instruction references, a big
D-cache will help hit rate more.  Can someone tell me if I am totally
wrong?  (Sorry, I work on FDDI and don't know much about processor
architecture.)

By the way, can someone tell me the I-cache size that is needed to
achieve an I-cache hit rate of X percent (say 90%) for a particular
CPU camp?  [This should not depend on secondary cache or main memory
latency, right?]

-- Mike Yip
   my@dtg.nsc.com

brett@cayman.amd.com (Brett Stewart) (12/18/90)

In article <1589@berlioz.nsc.com> my@berlioz.UUCP (Michael Yip) writes:
>
>By the way, can someone tell me the I-cache size that is needed to
>achieve an I-cache hit rate of X percent (say 90%) for a particular
>CPU camp?  [This should not depend on secondary cache or main memory
>latency, right?]

You might want to look at "Aspects of Cache Memory and Instruction Buffer
Performance"  by Mark Donald Hill, Report No. UCB/CSD 87/381 from
Berkeley CSD. (Mr. Hill's thesis)

There is a very nice quantitative analysis there.  One nice thing
about it is that it discusses the implications of Branch Target Caches
(of which our Am29000 has the only one in a single-chip commercial
RISC) and their impact on performance, and it contains quantitative
information to refute many of the conclusions that have been
authoritatively advanced in this news string.
Brett Stewart
Advanced Micro Devices, Inc.           +1 512 462 5051 FAX
5900 E. Ben White Blvd MS561           +1 512 462 4336 Telephone
Austin, Texas 78741      USA           brett@cayman.amd.com

mash@mips.COM (John Mashey) (12/18/90)

In article <1589@berlioz.nsc.com> my@berlioz.UUCP (Michael Yip) writes:
>In article <44130@mips.mips.COM> mash@mips.COM (John Mashey) writes:
>>5) Amongst the MIPS camp, there are chips with R3000+R3010 stuck
>>together (Performance Semiconductor), which can help performance & cost;
>>and embedded control chips
>>(IDT, LSIL) which integrate up to 8KB I-cache & 2KB D-cache with
>>an R3000 + buffering (i.e., 1,2,5,6,7, and sometimes 4, but not 3)
>>together.   Intel 960s come in various combinations, some of
>>which are similar to these.

>Can someone tell me why IDT or LSI chose to integrate 8K I-cache and
>2K D-cache?  From what I understand, a 2K I-cache and 8K D-cache may
>yield a better overall cache hit rate.  I have always thought that a
>smaller I-cache is "enough" to do the job (so that small loops and
>sequential instructions get executed from the cache), and that since
>data references are not as localized as instruction references, a big
>D-cache will help hit rate more.  Can someone tell me if I am totally
>wrong?  (Sorry, I work on FDDI and don't know much about processor
>architecture.)

As noted below, there's no one right answer.  However, observe that
the high-end Adobe RIP boards use R3000s, and that a really obvious use
for the IDT or LSIL parts is for cheap-but-fast laser printers,
and that executing Postscript or equivalent is a LOT of code,
and that 8K I-cache might help, whereas no reasonable cache would ever
hold all of an image anyway. 
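
To put a rough number on "no reasonable cache would ever hold all of an
image", here is a back-of-the-envelope sketch.  The page size, the
300 dpi resolution, and the 1 bit per pixel depth are illustrative
assumptions, not figures from this thread.

```python
# Size of one page bitmap versus the on-chip D-cache sizes discussed above.
# ASSUMPTIONS (not from the thread): US letter page, 300 dpi, 1 bit/pixel.
DPI = 300
WIDTH_IN, HEIGHT_IN = 8.5, 11.0
BITS_PER_PIXEL = 1
KB = 1024

pixels = int(WIDTH_IN * DPI) * int(HEIGHT_IN * DPI)
page_bytes = pixels * BITS_PER_PIXEL // 8

print(f"page bitmap: {page_bytes} bytes (~{page_bytes / KB:.0f} KB)")
for cache_kb in (2, 8):
    ratio = page_bytes / (cache_kb * KB)
    print(f"{cache_kb} KB D-cache holds 1/{ratio:.0f} of the page")
```

Under these assumptions the bitmap is about 1 MB, so moving from a 2KB
to an 8KB D-cache still leaves more than 99% of the page outside the
cache.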

>By the way, can someone tell me the I-cache size that is needed to
>achieve an I-cache hit rate of X percent (say 90%) for a particular
>CPU camp?  [This should not depend on secondary cache or main memory
>latency, right?]
This is impossible to answer in general: it depends heavily on the
type of application:
	a) Commercial environments like big I & D caches, because they
	execute OS & DBMS code and multi-task a lot, and often want
	good hit rates for DBMS buffers.
	b) Pure scientific machines get away with smaller I-caches,
	and would like either huge D-caches, or really high memory
	bandwidth, or both, since they spend their time in small loops
	crunching thru arrays.
	c) Embedded control varies all over the map.
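
The dependence on workload can be illustrated with a toy simulation.
This is a minimal sketch, not a model of any real part: it assumes a
direct-mapped cache with 16-byte lines and 4-byte words, and the three
synthetic traces (a 1KB loop, a 6KB code path run repeatedly, and a
single pass over 1MB of data) are hypothetical stand-ins for "small
loops", "a LOT of code", and "an image".

```python
KB = 1024

def hit_rate(trace, cache_bytes, line_bytes=16):
    """Fraction of byte addresses in trace that hit a direct-mapped cache."""
    num_lines = cache_bytes // line_bytes
    resident = [None] * num_lines        # memory line held by each cache slot
    hits = 0
    for addr in trace:
        line = addr // line_bytes        # which memory line is referenced
        slot = line % num_lines          # direct-mapped placement
        if resident[slot] == line:
            hits += 1
        else:
            resident[slot] = line        # miss: replace whatever was there
    return hits / len(trace)

def loop_trace(code_bytes, iterations, word_bytes=4):
    """Instruction fetches for straight-line code executed repeatedly."""
    body = range(0, code_bytes, word_bytes)
    return [addr for _ in range(iterations) for addr in body]

def stream_trace(data_bytes, word_bytes=4):
    """One sequential pass over a block of data."""
    return list(range(0, data_bytes, word_bytes))

small_loop = loop_trace(1 * KB, 100)     # fits in either cache
big_code = loop_trace(6 * KB, 100)       # fits in 8KB but not in 2KB
image = stream_trace(1024 * KB)          # touched once, never reused

for name, trace in (("1KB loop", small_loop),
                    ("6KB code", big_code),
                    ("1MB stream", image)):
    for size in (2 * KB, 8 * KB):
        print(f"{name:10s} {size // KB}KB cache: {hit_rate(trace, size):.4f}")
```

In this toy model the 1KB loop hits almost always at either size, the
6KB code path goes from a 75% hit rate at 2KB to over 99% at 8KB, and
the 1MB stream stays at 75% at both sizes, because its only hits come
from the remaining words of each freshly fetched line.  The cache size
needed for a given hit rate is therefore a property of the trace, not
of the CPU alone.
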
-- 
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	 mash@mips.com OR {ames,decwrl,prls,pyramid}!mips!mash 
DDD:  	408-524-7015, 524-8253 or (main number) 408-720-1700
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086