lindsay@gandalf.cs.cmu.edu (Donald Lindsay) (09/24/90)
It occurs to me that as chips get bigger, we have to cross an
interesting discontinuity.

Today, a million transistor chip (like the 68040 or i860) will have
8 KB of onchip cache. Of course, that's not enough: a high end system
needs 128 KB or even 1 MB per processor. When we get to 50 or 100
MT/chip, then we can do that.

The discontinuity is that for intermediate chip sizes, we _don't_
want to just have an intermediate sized cache. The reason is that
high speed is best served by having a fast primary cache, and a
slightly slower secondary cache. But notice the following simulation
results (High Performance Systems, Sept89 P. 76):

    Sustained Native MIPS for an idealized 100 MHz RISC CPU:

Secondary Cache in KB =   64   128   256   512

Primary =  8 KB           49    62    75    80
Primary = 16 KB           51    65    79    85
Primary = 32 KB           52    67    83    89

...that is, the cache faults slow the CPU from 100 MIPS to 89 MIPS
(best case) or 49 MIPS (worst case). The important thing here is to
read these numbers across, and then read them down. Notice that
throughput just isn't that sensitive to the size of the primary
cache.

I conclude that between 1 MT and 100 MT, there is a region where we
can't get the secondary cache on-chip, and a primary cache big enough
to fill up the chip would have nil or negative benefit.

--
Don             D.C.Lindsay
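To make "read these numbers across, and then read them down" concrete, here is a minimal Python sketch that tabulates the quoted figures and computes the gain along each axis. The variable and function names are illustrative and do not come from the article. Note that the secondary axis spans three doublings (64 to 512 KB) while the primary axis spans only two (8 to 32 KB), so the sketch also reports the gain per doubling.

```python
# Simulated sustained MIPS from the quoted table
# (idealized 100 MHz RISC CPU; keys = primary KB, list index = secondary KB).
SECONDARY_KB = [64, 128, 256, 512]
MIPS = {
    8:  [49, 62, 75, 80],
    16: [51, 65, 79, 85],
    32: [52, 67, 83, 89],
}


def gain_across(primary_kb):
    """MIPS gained by growing the secondary cache from 64 KB to 512 KB."""
    row = MIPS[primary_kb]
    return row[-1] - row[0]


def gain_down(secondary_kb):
    """MIPS gained by growing the primary cache from 8 KB to 32 KB."""
    col = SECONDARY_KB.index(secondary_kb)
    return MIPS[32][col] - MIPS[8][col]


# Reading across: one entry per primary size.
across = {p: gain_across(p) for p in MIPS}
# Reading down: one entry per secondary size.
down = {s: gain_down(s) for s in SECONDARY_KB}

if __name__ == "__main__":
    for p, g in across.items():
        # 64 -> 512 KB is three doublings.
        print(f"primary {p:2d} KB: +{g} MIPS across ({g / 3:.1f} per doubling)")
    for s, g in down.items():
        # 8 -> 32 KB is two doublings.
        print(f"secondary {s:3d} KB: +{g} MIPS down ({g / 2:.1f} per doubling)")
```

With these figures, growing the secondary cache buys 31 to 37 MIPS depending on the row, while growing the primary cache buys 3 to 9 MIPS depending on the column.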
usenet@nlm.nih.gov (usenet news poster) (09/24/90)
In article <10550@pt.cs.cmu.edu> lindsay@gandalf.cs.cmu.edu (Donald Lindsay) writes:
>[...]
>high speed is best served by having a fast primary cache, and a
>slightly slower secondary cache. But notice the following simulation
 ^^^^^^^^^^^^^^^
>results (High Performance Systems, Sept89 P. 76):
>
>    Sustained Native MIPS for an idealized 100 MHz RISC CPU:
>
>Secondary Cache in KB =   64   128   256   512
>
>Primary =  8 KB           49    62    75    80
>Primary = 16 KB           51    65    79    85
>Primary = 32 KB           52    67    83    89
>
>...that is, the cache faults slow the CPU from 100 MIPS to 89 MIPS
>(best case) or 49 MIPS (worst case). The important thing here is to
>read these numbers across, and then read them down. Notice that
>throughput just isn't that sensitive to the size of the primary
>cache.

Isn't the key here the use of a secondary cache which is only slightly
slower?

>I conclude that between 1 MT and 100 MT, there is a region where we
>can't get the secondary cache on-chip, and a primary cache big enough
>to fill up the chip would have nil or negative benefit.

What would the results be in a system configured without a secondary
cache? Maybe the high end system will require secondary cache. Might
not a big chip which didn't require secondary cache be an overall win
on lower end systems?

>--
>Don             D.C.Lindsay

David States
my@dtg.nsc.com (Michael Yip) (09/24/90)
Instead of just expanding the on-chip cache size, there are many
things that we can do to use up those 49 million transistors (;-).

We can probably integrate some specialized functional units on-chip:
for example, units for vector processing and units for offloading the
routing of messages (for Transputers). Or even put many processors on
the same Si and have parallel processing on one chip.

I am not an expert in computer architecture, but it is interesting to
see what we can do in a few years (hopefully).

--
Mike
my@dtg.nsc.com
chrism@mips.COM (Christopher Marino) (09/27/90)
In article <10550@pt.cs.cmu.edu> lindsay@gandalf.cs.cmu.edu (Donald Lindsay) writes:
>
>It occurs to me that as chips get bigger, we have to cross an
>interesting discontinuity.

I don't dispute your observation that there is something of a
discontinuity here. However, there are implementation issues that
complicate the problem a little.

The MIPS 6280 is a machine that confirms the data that you present.
Here is a machine that has a relatively small primary cache and a
large secondary cache (16 KB primary data, 64 KB primary instruction,
512 KB shared secondary).

The penalty of a primary cache miss depends a lot on how you build it
(speed of the parts, datapaths, etc.). Given the constraints of this
design, the benefits of larger primary caches could not be justified.

My guess is that the design of the secondary cache would be optimized
to take full advantage of the primary cache that does get integrated
on the chip. What comes to mind is that faster parts and wider
datapaths would be used for the secondary cache. This would alter the
performance characteristics of the machine as the secondary cache
gets smaller (the penalty of a miss would be smaller).

Getting the primary cache on chip will change many of the working
assumptions that were used to generate the data that you present.

>
>Secondary Cache in KB =   64   128   256   512
>
>Primary =  8 KB           49    62    75    80
>Primary = 16 KB           51    65    79    85
>Primary = 32 KB           52    67    83    89
>
>...that is, the cache faults slow the CPU from 100 MIPS to 89 MIPS
>(best case) or 49 MIPS (worst case). The important thing here is to
>read these numbers across, and then read them down. Notice that
>throughput just isn't that sensitive to the size of the primary
>cache.
>
>I conclude that between 1 MT and 100 MT, there is a region where we
>can't get the secondary cache on-chip, and a primary cache big enough
>to fill up the chip would have nil or negative benefit.
>
>--
>Don             D.C.Lindsay
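The point about the primary miss penalty can be illustrated with a textbook first-order stall model for a two-level cache. This is only a sketch: the function name, the model, and every parameter value below are hypothetical and chosen for illustration. None of them come from the thread, the cited article, or the MIPS 6280.

```python
def sustained_mips(clock_mhz, refs_per_instr, l1_miss_rate,
                   l2_local_miss_rate, l2_hit_cycles, memory_cycles,
                   base_cpi=1.0):
    """First-order estimate of sustained MIPS for a two-level cache.

    Every primary (L1) miss pays l2_hit_cycles to reach the secondary
    cache; the fraction that also misses there pays memory_cycles on top.
    """
    stall_per_ref = l1_miss_rate * (
        l2_hit_cycles + l2_local_miss_rate * memory_cycles
    )
    cpi = base_cpi + refs_per_instr * stall_per_ref
    return clock_mhz / cpi


# Hypothetical workload: 1.3 memory references per instruction,
# 5% primary miss rate, 20% local secondary miss rate, 60-cycle memory.
# Only the secondary access time differs between the two cases.
slow_l2 = sustained_mips(100, 1.3, 0.05, 0.20,
                         l2_hit_cycles=10, memory_cycles=60)
fast_l2 = sustained_mips(100, 1.3, 0.05, 0.20,
                         l2_hit_cycles=4, memory_cycles=60)

if __name__ == "__main__":
    print(f"slow secondary: {slow_l2:.1f} MIPS")
    print(f"fast secondary: {fast_l2:.1f} MIPS")
```

Under this model, a faster secondary cache shrinks the cost of each primary miss, so the same primary miss rate costs less throughput. That is one way the table's sensitivities could shift once the secondary cache is designed around an on-chip primary.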