grenley@nsc.nsc.com (George Grenley) (03/12/88)
In article <1071@PT.CS.CMU.EDU> lindsay@K.GP.CS.CMU.EDU (Donald Lindsay) writes:
>In article <3460011@hpsrla.HP.COM> brucek@hpsrla.HP.COM (Bruce Kleinman)
> writes about the 68030:
>>Ahh, those massive 256-Byte caches are really going to speed this puppy up :-)

I'm sure they will. Heaven knows it needs it...

>Plus, the two caches access in parallel (versus the one cache of the 68020).

As do the two caches on the 32532.

>Plus, the caches now take one clock (versus 2 on the 68020).

Likewise on the 532.

>Plus, the caches now have burst refill (if the board designer supports it,
>of course.)

So does the 532. We also have larger caches (512 instruction, 1K 2 way
set associative data). I have seen the studies on hit rate vs size for
the 532; since the 030 is roughly similar architecture I expect they have
the same tradeoffs. 256 bytes is better than no bytes, but it is still
pretty small.

George Grenley
NSC
mash@mips.COM (John Mashey) (03/12/88)
In article <5009@nsc.nsc.com> grenley@nsc.UUCP (George Grenley) writes:
>So does the 532. We also have larger caches (512 instruction, 1K 2 way
>set associative data). I have seen the studies on hit rate vs size for
>the 532; since the 030 is roughly similar architecture I expect they have
>the same tradeoffs. 256 bytes is better than no bytes, but it is still
>pretty small.

I recall there was speculation when the 68030 was announced that the
D-cache might actually cost you performance in general applications,
and that people would end up turning it off [unlike the I-cache, where
even a small cache is almost always useful]. However, I've seen no data
published one way or another on this yet, and I don't have any.

Do you (or anybody else) have any good data on a 256-byte cache with
16 16-byte lines? (i.e., the 68030 D-cache)
--
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	{ames,decwrl,prls,pyramid}!mips!mash  OR  mash@mips.com
DDD:  	408-991-0253 or 408-720-1700, x253
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086
north@Apple.COM (Donald N. North) (03/15/88)
In article <1853@winchester.mips.COM> mash@winchester.UUCP (John Mashey) writes:
>I recall there was speculation when the 68030 was announced that the
>D-cache might actually cost you performance in general applications,

>Do you (or anybody else) have any good data on a 256-byte cache with
>16 16-byte lines? (i.e., the 68030 D-cache)

Having had some '030 experience as of late, I have found that (in real working
hardware) the D-cache is always a 'win', even though it is small by most
standards. I have yet to find a benchmark (Dhrystone, any others of the small
integer class) or some real application code in which the performance is less
when the cache is enabled than disabled. The typical performance improvement
ranges from a low of 5% to a high of 25%; the 'average' for larger applications
appears to be about 20%.

In looking at the cache organization (direct mapped, 16 lines of 16 bytes each)
one could construct a sequence of references (accessing data locations exactly
256 bytes apart, for example) that causes thrashing in particular cache lines.
This is a problem in all direct mapped caches, and one would think it to be
especially severe with so few entries (16 in the '030's case). I suspect this
is one reason for the relatively low performance improvement figures; the other
is that the cache is just too small to be 'really' useful except in limited
situations. Two that come immediately to mind are pushing stack arguments that
are then accessed relatively soon, and accesses to local stack storage.

Don North ----- Apple Computer, Inc. ----- Advanced Technology Group
UUCP: {voder,nsc,dual,sun}!apple!north    CSNET: north@apple.com
{{ Facts are facts, but any opinions expressed are my own, and *do not* }}
{{ represent any viewpoint, official or otherwise, of Apple Computer, Inc.}}
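
The thrashing case described above (references exactly 256 bytes apart in a direct-mapped cache of 16 lines of 16 bytes) can be illustrated with a small simulation. This is only a sketch of the mapping, not a model of the real 68030: the function name `simulate`, the example addresses, and the simplifications (reads only, whole-line refill, no timing) are all assumptions made for illustration.

```python
# Sketch of a direct-mapped cache with the geometry discussed in the thread:
# 16 lines of 16 bytes each (256 bytes total).
# Assumptions (not from the thread): every access is a read, a miss refills
# the whole line, and no timing is modelled.

LINE_BYTES = 16
NUM_LINES = 16


def simulate(addresses):
    """Return (hits, misses) for a sequence of byte addresses."""
    tags = [None] * NUM_LINES          # tag held by each line; None = invalid
    hits = misses = 0
    for addr in addresses:
        block = addr // LINE_BYTES     # which 16-byte block of memory
        index = block % NUM_LINES      # the one cache line that block may use
        tag = block // NUM_LINES       # distinguishes blocks sharing that line
        if tags[index] == tag:
            hits += 1
        else:
            misses += 1
            tags[index] = tag          # evict whatever was there before
    return hits, misses


# Hypothetical addresses: two locations exactly 256 bytes apart map to the
# same line and keep evicting each other.
thrash = [0x1000, 0x1100] * 8
# Two locations 16 bytes apart map to neighbouring lines and coexist.
friendly = [0x1000, 0x1010] * 8

print(simulate(thrash))    # (0, 16)
print(simulate(friendly))  # (14, 2)
```

With only 16 lines, any two hot data items whose addresses agree in bits 4 through 7 collide this way, which is the concern raised in the post.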
lindsay@K.GP.CS.CMU.EDU (Donald Lindsay) (03/17/88)
In article <7672@apple.Apple.Com> north@apple.UUCP (Donald N. North) writes:
>Having had some '030 experience as of late, I have found that (in real working
>hardware) the D-cache is always a 'win', even though it is small by most
>standards. I have yet to find a benchmark (Dhrystone, any others of the small
>integer class) or some real application code in which the performance is less
>when the cache is enabled than disabled. The typical performance improvement
>ranges from a low of 5% to a high of 25%, 'average' for larger applications
>appears to be about 20%.

You didn't mention how many wait states on a memory access. Also, you
didn't mention if the board supports burst-fill. I assume that the 20% is
for a hot board.

I would expect that the cache gets more useful as boards get slower. In
particular, the 68030 should be much more useful than a 68020 when given
a slow 8-bit-wide memory - i.e. a minimum configuration.

I was recently surprised to learn that 68020 minimum configurations
weren't just showing up in minimum-cost systems. Apparently, some are
embedded in other systems, doing the sort of thing that an 8-bitter could
hack (like, hardware diagnostics). I assume that the 68030 will show up
eventually in this role.

Of course, 68020's have also been used as IO controllers and the like.
Does anyone have insight into the minimum/maximum aspects of these uses,
or the likelihood of SPARC/MIPS/etc pushing into these roles?
--
Don		lindsay@k.gp.cs.cmu.edu    CMU Computer Science
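
The expectation that a cache helps more as memory gets slower follows from a simple average-access-time model. The sketch below is illustrative only: the function names, the 50% hit rate, the one-clock hit time and the memory cycle counts are hypothetical placeholders, not measurements of any 68030 board. It also assumes a miss costs the same as an uncached access (no burst fill, no extra miss penalty) and covers data accesses only, so the ratios are not comparable to whole-program figures such as the 5% to 25% quoted above.

```python
# Average data-access time with and without a cache.
# All numeric inputs below are hypothetical placeholders.


def avg_access_cycles(hit_rate, hit_cycles, mem_cycles):
    """Expected clocks per data access when a cache is present.

    Assumes a miss costs exactly one uncached memory access.
    """
    return hit_rate * hit_cycles + (1.0 - hit_rate) * mem_cycles


def access_speedup(hit_rate, hit_cycles, mem_cycles):
    """Ratio of uncached to cached average data-access time."""
    return mem_cycles / avg_access_cycles(hit_rate, hit_cycles, mem_cycles)


# Same assumed hit rate and hit time, progressively slower memory.
for mem_cycles in (2, 4, 8):
    ratio = access_speedup(hit_rate=0.5, hit_cycles=1, mem_cycles=mem_cycles)
    print(f"memory = {mem_cycles} clocks: data accesses {ratio:.2f}x faster")
```

Under these assumptions the ratio grows with the memory access time, which is the direction suggested in the post; how large the effect is on a real board depends on wait states, bus width and burst-fill support, none of which are modelled here.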