[net.arch] help choosing cache sizes

dfh@scirtp.UUCP (02/22/86)

Our hardware folks are starting to design a CPU board using the 80386 and
they asked me what size and type of on-board cache would be optimum for a 
UNIX environment.  I have no idea.

The box is (will be) an 80386 with 80186 I/O processors using multibus
I.  There is a high speed memory bus from the CPU to the expansion
memory card(s).  The current implementation uses a 10 MHz 80286 at 0
wait states in lieu of the '386.

Has anybody else looked at the '386 with this in mind yet?

What cache sizes are common in other microprocessor systems?

Pointers to literature on the subject appreciated.

-- 
				David Hinnant
				SCI Systems, Inc.
				...{decvax, akgua}!mcnc!rti-sel!scirtp!dfh

aglew@ccvaxa.UUCP (02/24/86)

There's only one optimum cache size - the largest you can afford.
That is, unless you are considering having several layers of different
speed caches, in which case there is an optimization problem for a
given cost. Convex says pretty much the same thing.
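
The trade-off between layers of caches can be sketched with the usual
average-access-time calculation.  This is only an illustration: the
function name and every timing and miss rate below are hypothetical
placeholders, not measurements of any real design.

```python
def average_access_time(levels, memory_time):
    """Average time per reference for a chain of caches.

    levels: list of (hit_time, miss_rate) pairs, fastest cache first.
    memory_time: time to reach main memory after missing every level.
    All times share one unit (cycles here); miss rates are fractions.
    """
    total = 0.0
    reach = 1.0  # fraction of references that get as far as this level
    for hit_time, miss_rate in levels:
        total += reach * hit_time
        reach *= miss_rate
    return total + reach * memory_time


# Hypothetical numbers: one cache versus the same cache backed by a
# larger, slower second-level cache.
one_level = average_access_time([(1, 0.10)], memory_time=10)
two_level = average_access_time([(1, 0.10), (3, 0.50)], memory_time=10)
print(one_level, two_level)
```

With a cost attached to each level, choosing the sizes becomes the
optimization problem mentioned above.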

clif@intelca.UUCP (Clif Purkiser) (03/06/86)

Subject: Re: help choosing cache sizes
Newsgroups: net.arch
Distribution: net
References: <565@scirtp.UUCP>

> Our hardware folks are starting to design a CPU board using the 80386 and
> they asked me what size and type of on-board cache would be optimum for a 
> UNIX environment.  I have no idea.
> 
> The box is (will be) an 80386 with 80186 I/O processors using multibus
> I.  There is a high speed memory bus from the CPU to the expansion
> memory card(s).  The current implementation uses a 10 MHz 80286 at 0
> wait states in lieu of the '386.
> 
> Has anybody else looked at the '386 with this in mind yet?
> 
> What cache sizes are common in other microprocessor systems?
> 
> Pointers to literature on the subject appreciated.
> 
> -- 
> 				David Hinnant
> 				SCI Systems, Inc.
> 				...{decvax, akgua}!mcnc!rti-sel!scirtp!dfh

Sorry for posting this, but mail didn't seem to work.  Hope it may
prove useful to other 386 designers.



Dave:

	I'm glad to see you are finally designing with a good microprocessor.

	In general you want as large a cache as possible.  Most of the
cache designs I have seen use a direct-mapped cache of either 16K or 64K
bytes with a 4-byte line size.  Most use one bank of 8 SRAMs for data
and 2 or 3 SRAMs for tags.  The SRAMs are either 4K x 4 or 16K x 4.
You need 35-45 nanosecond SRAMs for the tags and 45-60 nanosecond
SRAMs for the data.
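
The arithmetic behind those chip counts can be sketched as follows.  The
function names are made up for illustration, and ADDRESS_BITS = 24 is
purely an assumption about how much of the address space is cacheable;
the post does not say.  Valid bits are not counted.

```python
import math

ADDRESS_BITS = 24  # ASSUMPTION: cacheable address width, not from the post


def cache_bytes(num_chips, words_per_chip, bits_per_word):
    """Capacity in bytes of one bank of identical SRAMs."""
    return num_chips * words_per_chip * bits_per_word // 8


def log2_exact(n):
    """log2 of a power of two, computed with integers only."""
    return n.bit_length() - 1


def tag_bits(cache_size, line_size, address_bits=ADDRESS_BITS):
    """Tag width for a direct-mapped cache: address - index - offset."""
    index_bits = log2_exact(cache_size // line_size)
    offset_bits = log2_exact(line_size)
    return address_bits - index_bits - offset_bits


def tag_chips(bits, bits_per_chip=4):
    """Number of x4 SRAMs needed to hold a tag of the given width."""
    return math.ceil(bits / bits_per_chip)


for words in (4 * 1024, 16 * 1024):        # 4K x 4 and 16K x 4 parts
    size = cache_bytes(8, words, 4)        # one bank of 8 data SRAMs
    bits = tag_bits(size, line_size=4)
    print(size, bits, tag_chips(bits))
```

Eight 4K x 4 parts give 16K bytes and eight 16K x 4 parts give 64K
bytes, each bank being 32 bits (one 4-byte line) wide.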


	Glen Shires published an article in the Nov/Dec issue of Intel's
Solutions magazine which provides some more information on cache designs
for the 386.  Your local Intel salesman can get you a copy of Solutions,
or I can send you one.

	The following caching data is based on the address references
of a group of large programs (Vi, C compile, link and execute) run under
a Unix-like OS.  The numbers probably slightly understate the
effectiveness of a cache for a typical microprocessor Unix system (i.e.,
a smaller number of processes and smaller programs such as utilities).

		Unified Cache  (combined code and data)

Cache Configuration        | Miss Rate in percent (Hit rate = 100 - Miss rate)
Size  Assoc   Line (bytes) | Overall  Branches  Fetches  Reads  Writes
1K    4-way   8            | 35.8     56.2      36.1     35.8   34.1
2K    4-way   16           | 19       44.1      17.8     21     18.4
4K    4-way   16           | 13.9     30.7      12.7     15.7   14
8K    direct  4            | 26.7     26.3      28.6     24.2   25.3
8K    direct  8            | 17.4     26.6      17       18.4   16.6
8K    2-way   4            | 21.4     20.1      22.2     19.7   22.9
8K    2-way   8            | 14.2     20.3      13.4     15.1   14.7
32K   direct  16           | 6.4      11.6      4.8      8.6    7.1
64K   direct  4            | 12.3     9.8       10.9     13     16.1
64K   direct  8            | 7.8      9.2       6.2      9.7    9.8
64K   2-way   4            | 11       8.6       9.5      11.4   15.8
128K  direct  4            | 11.6     9         10.1     12.3   16
128K  direct  8            | 7.3      8.2       5.6      9      9.6
128K  2-way   4            | 10.8     8.4       9.4      11.3   15.7


		Code Only Cache

Cache Configuration        | Miss Rate in percent (Hit rate = 100 - Miss rate)
Size  Assoc   Line (bytes) | Overall  Branches  Fetches  Reads  Writes
256   direct  4            | 77.3     71.8      77.3     100    100
1K    4-way   8            | 30.2     46.4      30.2     100    100
8K    direct  4            | 21.6     19.4      21.6     100    100
16K   direct  4            | 13.9     12.5      13.9     100    100
64K   direct  4            | 9.7      8.7       9.7      100    100
 
	As can be seen from the data, increasing the line size helps the hit
rate a lot, as does increasing the cache size up to about 64K.  More
sophisticated write policies such as "buffered" or "deferred-write" also
increase the effectiveness of the cache compared to the write-through policy,
which is assumed for these numbers.
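
Miss rates of this kind are normally obtained by running an address
trace through a cache model.  The sketch below is a minimal one and is
not the tool used for the numbers above: the function name, the LRU
replacement policy, and the synthetic traces are all assumptions, and
reads and writes are treated alike (no write policy is modeled).

```python
from collections import OrderedDict


def miss_rate(trace, cache_bytes, line_bytes, ways=1):
    """Fraction of the byte addresses in `trace` that miss the cache.

    ways=1 models a direct-mapped cache; ways=2 a 2-way set-associative
    one.  Replacement within a set is LRU (an assumption).
    """
    num_sets = cache_bytes // (line_bytes * ways)
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for addr in trace:
        line = addr // line_bytes        # which memory line is touched
        index = line % num_sets          # set that line maps to
        tag = line // num_sets           # identifies the line in the set
        entries = sets[index]
        if tag in entries:
            entries.move_to_end(tag)     # hit: mark most recently used
        else:
            misses += 1
            if len(entries) >= ways:
                entries.popitem(last=False)  # evict least recently used
            entries[tag] = True
    return misses / len(trace)


# Synthetic trace: sequential 4-byte fetches across 4K bytes.
sequential = list(range(0, 4096, 4))
print(miss_rate(sequential, 8 * 1024, line_bytes=4))    # every fetch misses
print(miss_rate(sequential, 8 * 1024, line_bytes=16))   # one miss per line

# Two addresses 8K apart collide in an 8K direct-mapped cache.
ping_pong = [0, 8192] * 10
print(miss_rate(ping_pong, 8 * 1024, line_bytes=4, ways=1))
print(miss_rate(ping_pong, 8 * 1024, line_bytes=4, ways=2))
```

On the sequential trace a longer line turns three of every four fetches
into hits, which is the same effect the line-size columns show.  The
second trace shows the kind of conflict that 2-way associativity removes.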

Hope this was of some use.



-- 
Clif Purkiser, Intel, Santa Clara, Ca.
HIGH PERFORMANCE MICROPROCESSORS
{pur-ee,hplabs,amd,scgvaxd,dual,idi,omsvax}!intelca!clif
	
{standard disclaimer about how these views are mine and may not reflect
the views of Intel, my boss, or USENET goes here. }