bcase@uiucdcs.UUCP (11/01/84)
["...she's a line eater..." -- Hall and Oates]

Let me add my two cents to the discussion by agreeing that any sane 68020 implementation will have a data cache. The Signetics MAC makes the inclusion of one rather painless, I think. And the software time spent invalidating the cache is absolutely of no concern here: one cycle (instruction?). Even a more expensive cache built out of TI's 2150 tag buffer chips would still only require one cycle to invalidate (assuming, as has been stated, that the cache keys on virtual addresses).

And regarding Falcone's comments: It seems to me that you are saying that one VAX MIP is roughly equivalent to 3-4 68000 MIPS (my conclusion is based on your final table, which shows, at 200 ns memory, a 68020 performance of roughly 0.5 VAX MIPS, while the IEEE paper claims roughly 2.0 68020 MIPS). This conclusion (and please correct me if I have gone astray here) seems a little bogus. The reason: the average VAX instruction that gets executed in the average program is not, in my opinion, 3-4 times more powerful than the average 68020 instruction for the same program, and the VAX is certainly not operating at 3-4 times the raw speed of the 68020.

bcase
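
For anyone who wants to check the arithmetic behind the "3-4" figure, here is a minimal sketch. The two inputs are the rough numbers quoted above (about 0.5 VAX MIPS from Falcone's table at 200 ns memory, about 2.0 native 68020 MIPS from the IEEE paper); the function name is made up purely for illustration.

```python
def native_mips_per_vax_mips(native_mips: float, vax_mips: float) -> float:
    """Return how many native MIPS correspond to one VAX MIPS."""
    if vax_mips <= 0:
        raise ValueError("vax_mips must be positive")
    return native_mips / vax_mips


# Rough figures quoted in the post, not measurements of my own.
ratio = native_mips_per_vax_mips(native_mips=2.0, vax_mips=0.5)
print(f"1 VAX MIPS ~ {ratio:.1f} 68020 MIPS")
```

With those two inputs the ratio comes out at 4, which is the top of the 3-4 range being questioned.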
kissell@flairvax.UUCP (Kevin Kissell) (11/06/84)
> And the software time spent invalidating the cache is absolutely of no
> concern here: one cycle (instruction?). Even a more expensive cache built
> out of TI's 2150 tag buffer chips would still only require one cycle to
> invalidate (assuming, as has been stated, that the cache keys on virtual
> addresses).

The price paid for purging a virtual-address cache when the virtual map changes is not so much in the overhead of the act of purging itself, but in the loss of those cache lines that otherwise would have remained valid and thus been available for use when the pre-purge context was restored. This can be significant in systems with large caches and frequent context switches.
bcase@uiucdcs.UUCP (11/06/84)
> > Falcone is setting up something of a "straw man" here. A conscientious
> > 68020 design MUST use a data cache, and the data cache keys must be
> > VIRTUAL addresses.
>
> Whether the cache should be virtual or real-address based would seem to
> me to depend on a tradeoff between burning an MMU cycle for every access
> (real-address cache) and purging the cache every time the virtual map
> changes (virtual-address cache).
>
> Kevin D. Kissell

That is correct, and it is generally agreed (at least among people I know, and apparently among a lot of system designers) that burning an MMU cycle on every memory reference is much worse for performance in general than purging the (possibly only user partition of the) cache tags on process switches. Slowing cache accesses down from one to two cycles is a 100% increase, while causing a little extra cache refilling due to cold-start conditions is only a marginal percentage overhead on average (since process switches occur (much) less than 200 times a second most of the time, especially in single-user systems).

bcase
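
To put rough numbers on this tradeoff, here is a back-of-the-envelope sketch that expresses both costs in the same unit, extra cycles per second. Only the 200 switches per second ceiling and the one extra cycle per access come from the discussion above. The lines refilled per switch, the miss penalty, and the memory reference rate are assumptions picked for illustration, and the function names are hypothetical.

```python
def purge_cycles_per_sec(switches_per_sec: int,
                         lines_refilled_per_switch: int,
                         miss_penalty_cycles: int) -> int:
    """Extra cycles per second spent refilling lines lost to purges
    (virtual-address cache)."""
    return switches_per_sec * lines_refilled_per_switch * miss_penalty_cycles


def translate_cycles_per_sec(refs_per_sec: int,
                             extra_cycles_per_ref: int = 1) -> int:
    """Extra cycles per second spent translating before every cache
    access (real-address cache)."""
    return refs_per_sec * extra_cycles_per_ref


# 200 switches/s is the upper bound mentioned in the post.
# 256 lines, a 4-cycle miss penalty, and 2 million references per
# second are ASSUMED values, not figures for any real machine.
virtual_cost = purge_cycles_per_sec(200, 256, 4)
real_cost = translate_cycles_per_sec(2_000_000)

print("virtual-address cache:", virtual_cost, "extra cycles/s")
print("real-address cache:   ", real_cost, "extra cycles/s")
```

Under these assumptions the purge cost is 204,800 cycles per second against 2,000,000 for per-access translation. A larger cache or a higher switch rate moves the balance toward Kissell's point, so the inputs matter more than the formula.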
bcase@uiucdcs.UUCP (11/08/84)
/* Written 7:50 am Nov 7, 1984 by kissell@flairvax in uiucdcs:net.arch */
> And the software time spent invalidating the cache is absolutely of no
> concern here: one cycle (instruction?). Even a more expensive cache built
> out of TI's 2150 tag buffer chips would still only require one cycle to
> invalidate (assuming, as has been stated, that the cache keys on virtual
> addresses).

The price paid for purging a virtual-address cache when the virtual map changes is not so much in the overhead of the act of purging itself, but in the loss of those cache lines that otherwise would have remained valid and thus been available for use when the pre-purge context was restored. This can be significant in systems with large caches and frequent context switches.
/* End of text from uiucdcs:net.arch */

Perfectly true, but this should not be considered a *SOFTWARE* (look at the original wording) overhead; to me, software overhead consists of such things as deciding when to invalidate caches or page table entries in the TLB, and then actually performing the operation. Accordingly, I was pointing out that in modern systems a full invalidation costs only one cycle, as opposed to one cycle per entry as in older systems.
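
The one-cycle versus one-cycle-per-entry distinction can be sketched with a toy model of a tag store. This is not a description of the TI 2150 or of any particular part: the class, its method names, and the 512-entry size are all hypothetical, and the "cycles" counter simply charges one cycle per operation the software has to issue.

```python
class TagStore:
    """Toy model of cache tag valid bits with a cycle counter."""

    def __init__(self, n_entries: int) -> None:
        self.valid = [False] * n_entries
        self.cycles = 0

    def fill(self) -> None:
        """Mark every entry valid, as after a long-running process."""
        self.valid = [True] * len(self.valid)

    def invalidate_per_entry(self) -> None:
        """Older style: software clears each valid bit in turn."""
        for i in range(len(self.valid)):
            self.valid[i] = False
            self.cycles += 1  # one cycle charged per entry

    def invalidate_flash(self) -> None:
        """Assumed common reset line: all valid bits cleared at once."""
        self.valid = [False] * len(self.valid)
        self.cycles += 1  # one cycle charged for the whole store


old = TagStore(512)
old.fill()
old.invalidate_per_entry()

new = TagStore(512)
new.fill()
new.invalidate_flash()

print("per-entry:", old.cycles, "cycles; flash:", new.cycles, "cycle")
```

Both stores end up fully invalid; the model only counts the cost of getting there. It says nothing about the refill cost afterwards, which is Kissell's separate point.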