keith@mips.COM (Keith Garrett) (09/26/90)
In article <10550@pt.cs.cmu.edu> lindsay@gandalf.cs.cmu.edu (Donald Lindsay) writes:
>
>It occurs to me that as chips get bigger, we have to cross an
>interesting discontinuity.
>
>Today, a million transistor chip (like the 68040 or i860) will have 8
>KB of onchip cache. Of course, that's not enough: a high end system
>needs 128 KB or even 1 MB per processor. When we get to 50 or 100
>MT/chip, then we can do that.
>
>The discontinuity is that for intermediate chip sizes, we _don't_
>want to just have an intermediate sized cache. The reason is that
>high speed is best served by having a fast primary cache, and a
>slightly slower secondary cache. But notice the following simulation
>results (High Performance Systems, Sept89 P. 76):
>
> Sustained Native MIPS for an idealized 100 MHz RISC CPU:
>
>Secondary Cache in KB =  64  128  256  512
>
>Primary =  8 KB          49   62   75   80
>Primary = 16 KB          51   65   79   85
>Primary = 32 KB          52   67   83   89
>
>...that is, the cache faults slow the CPU from 100 MIPS to 89 MIPS
>(best case) or 49 MIPS (worst case). The important thing here is to
>read these numbers across, and then read them down. Notice that
>throughput just isn't that sensitive to the size of the primary
>cache.

these numbers are interesting, but it would also be of interest to see
the effect of primary cache size a) without a secondary cache, and
b) with primary miss times only 2-3 times longer than primary hit
times, i.e. 2-3 cycle miss penalties.

>I conclude that between 1 MT and 100 MT, there is a region where we
>can't get the secondary cache on-chip, and a primary cache big enough
>to fill up the chip would have nil or negative benefit.

this seems rather pessimistic. current ram technology is 256Kb for
srams and 4Mb for drams; 1Mb and 16Mb are just around the corner.
assuming 30% overhead for checkbits and tags (that's 3 extra bits per
byte, or 11 bits/byte total), 128KB ~ 1.4Mb and 1MB ~ 11Mb.
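the short-penalty case in (b) is easy to play with in a toy two-level
CPI model. the hit rates and penalties below are illustrative guesses
of mine, not the parameters behind the simulation table above:

```python
# toy two-level cache model for a 100 MHz CPU; clock in MHz, penalties
# in cycles. assumes one memory reference per instruction for
# simplicity. hit rates and penalties are illustrative guesses, not
# the High Performance Systems simulation parameters.

def sustained_mips(clock_mhz, l1_hit, l2_hit, l1_penalty, l2_penalty):
    # average cycles per instruction: 1 base cycle, plus the primary
    # miss penalty on primary misses, plus the secondary miss penalty
    # on references that miss both levels
    cpi = (1.0
           + (1.0 - l1_hit) * l1_penalty
           + (1.0 - l1_hit) * (1.0 - l2_hit) * l2_penalty)
    return clock_mhz / cpi

# 95% primary hit rate, 3-cycle primary miss penalty, no secondary:
print(sustained_mips(100, 0.95, 1.0, 3, 0))
# same primary, 90% secondary hit rate, 40-cycle memory penalty:
print(sustained_mips(100, 0.95, 0.90, 3, 40))
```

read it the same way as the table: with only a 2-3 cycle primary miss
penalty, even a modest primary hit rate costs surprisingly little
throughput; the long trip to memory is what hurts.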
i don't know the relative sizes, but the technology should be similar
for the '040, a 256Kb sram, and a 1Mb dram. assuming they are all the
same die size, we can calculate relative sizes for new chips providing
sram or dram secondary cache:

for 256Kb srams: 128KB cache uses 5.6 rams + processor = 6.6 ram equivalents
                 1MB cache uses   44 rams + processor =  45 ram equivalents
for 1Mb srams:   128KB cache uses 1.4 rams + processor = 2.4 ram equivalents
                 1MB cache uses   11 rams + processor =  12 ram equivalents
for 4Mb drams:   128KB cache uses  .4 rams + processor = 1.4 ram equivalents
                 1MB cache uses     3 rams + processor =   4 ram equivalents
for 16Mb drams:  128KB cache uses  .1 rams + processor = 1.1 ram equivalents
                 1MB cache uses    .7 rams + processor = 1.7 ram equivalents

these numbers are very crude. perhaps someone can share info about the
relative die sizes and process CD's for the various processors and
rams. the worst number here suggests that you are only off by a factor
of 2, but i suspect that 2-5x the '040 is a more reasonable estimate
of the die area needed to include a reasonable secondary cache.

BTW, dram isn't bad for secondary cache if you can avoid address
muxing. there was a startup that flared up a couple of years ago
offering sram equivalents built in dram technology; perhaps someone
else remembers their name. their worst-case access times were ~30ns.
refresh got in the way, but i think you could schedule that reasonably
well for an integrated cache.
-- 
Keith Garrett        "This is *MY* opinion, OBVIOUSLY"
Mips Computer Systems, 930 Arques Ave, Sunnyvale, Ca. 94086
(408) 524-8110     keith@mips.com or {ames,decwrl,prls}!mips!keith
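P.S. the ram-equivalent arithmetic above packs into a few lines. it
keeps the same crude assumptions as the post (11 stored bits per data
byte, and a processor die equal in area to one ram die); the rounding
differs slightly from the hand-computed figures above:

```python
# crude die-area estimate for an integrated secondary cache:
# cache bits divided by bits per ram die, plus one ram-die
# equivalent for the processor core itself. assumes 11 bits stored
# per data byte (8 data + ~3 for checkbits and tags) and that the
# processor die is the same size as one ram die -- both are the
# back-of-envelope assumptions from the post, not measured data.

BITS_PER_BYTE = 11  # 8 data bits + ~30% overhead for checkbits/tags

def ram_equivalents(cache_bytes, ram_bits):
    return (cache_bytes * BITS_PER_BYTE) / ram_bits + 1.0  # +1 = CPU core

KB = 1024
Kb = 1024
Mb = 1024 * 1024

for ram_bits, label in [(256 * Kb, "256Kb sram"), (1 * Mb, "1Mb sram"),
                        (4 * Mb, "4Mb dram"), (16 * Mb, "16Mb dram")]:
    print("%-10s 128KB cache = %4.1f ram equivalents"
          % (label, ram_equivalents(128 * KB, ram_bits)))
```

the qualitative conclusion survives the rounding: with 1Mb srams or
any density of dram, a 128KB secondary cache plus processor fits in a
very small number of ram-die equivalents.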