zs01+@andrew.cmu.edu (Zalman Stern) (04/07/90)
[Discussion of address aliasing problems on the IBM RISC System/6000]

In the Mach port to the RIOS, we get around this problem by almost always running in virtual mode. The wired kernel memory (text segment and unpageable data mapped at boot time) is mapped virtual=real. The only time the machine goes into real mode is on interrupts. (System calls and traps do not have to go into real mode on this machine.) Memory accessed in an interrupt handler has to be wired anyway (and is usually statically allocated as well). Besides, device drivers already have to deal with some cache flushing, since there is no hardware to provide consistency between the cache and I/O space.

Since the RIOS has an inverted page table (IPT), aliases between virtual addresses require taking page faults to move the correct virtual address into the IPT. (That is, if one alias is in the IPT, accessing that memory through a different alias will take a page fault.) When this happens, the fault handler flushes that page from the cache as well.

Sincerely,
Zalman Stern
Internet: zs01+@andrew.cmu.edu
Usenet: I'm soooo confused...
Information Technology Center, Carnegie Mellon, Pittsburgh, PA 15213-3890
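[Editorial illustration] The alias handling described in the post above can be modelled with a toy inverted page table. This is only a sketch under stated assumptions, not the Mach or RIOS code: the class `InvertedPageTable`, its methods, and the `aliases` dictionary are hypothetical names invented here, and the "cache flush" is just recorded in a list. The one property it tries to capture is that each real frame has a single virtual owner in the IPT at a time, so touching the frame through a different alias faults and triggers a flush.

```python
class InvertedPageTable:
    """Toy IPT: one entry per real frame, so a frame has one virtual owner at a time."""

    def __init__(self):
        self.frame_to_vpage = {}  # real frame -> virtual page currently in the IPT
        self.vpage_to_frame = {}  # reverse lookup used for translation
        self.flushed = []         # frames flushed from the (imaginary) cache, in order

    def install(self, vpage, frame):
        """Make vpage the single owner of frame, evicting any previous alias."""
        old = self.frame_to_vpage.get(frame)
        if old is not None:
            del self.vpage_to_frame[old]
        self.frame_to_vpage[frame] = vpage
        self.vpage_to_frame[vpage] = frame

    def access(self, vpage, aliases):
        """Return (frame, faulted). aliases maps each virtual page to its real frame."""
        if vpage in self.vpage_to_frame:
            return self.vpage_to_frame[vpage], False  # translation hit, no fault
        frame = aliases[vpage]                        # fault handler finds the real frame
        if frame in self.frame_to_vpage:
            self.flushed.append(frame)                # another alias owned it: flush the page
        self.install(vpage, frame)
        return frame, True


# Two virtual pages (hypothetical numbers) aliased to the same real frame 7.
aliases = {0x10: 7, 0x20: 7}
ipt = InvertedPageTable()
first = ipt.access(0x10, aliases)   # first touch faults, nothing to flush
again = ipt.access(0x10, aliases)   # same alias, no fault
other = ipt.access(0x20, aliases)   # different alias, faults and flushes frame 7
```

Alternating between the two aliases in this model faults on every switch, which is the cost the post accepts in exchange for never having two live aliases of one frame in the cache.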
JOSH@IBM.COM ("Josh Knight") (04/11/90)
In <1830@gannet.cl.cam.ac.uk> cet1@cl.cam.ac.uk (C.E. Thompson) writes:
> This brought to mind a question that has been niggling me for some
> years: how is the trick worked when the software is *not* so
> constrained? The IBM 308x and 3090 mainframes have (mostly) 64K caches
> (per processor) which are 4-way set associative; and again only the
> bottom 12 bits of the address are invariant under the virtual-to-real
> mapping. However, the software is allowed to (and IBM operating systems
> in fact do) reference a page of storage at different times by both
> virtual and real addresses, whose low-order 14 bits will not, usually,
> be equal.
>
> How is it done? I have never found an answer in the review articles in
> the IBM journals (R&D, Systems). Is it, perhaps, a trade secret? In
> earlier models, such as the IBM 3033, as the cache increased in size
> so did the multiplicity of the associative lookup.
>

The answer for the 3090 is in the article referenced in the appended refer format citation, in this quote from page 10 of the cited article:

An interesting complexity in cache design that has been given special treatment in the 3090 cache has to do with synonyms. Virtual storage in System/370-XA architecture allows relocation of 4K-byte pages. This means that the low-order 12 address bits that address a byte within a page are the same for both a virtual and a real address. Architecture, however, allows different virtual addresses to map to the same real address. Thus the cache is managed by real addresses, despite the fact that it is accessed by virtual address. Since it takes 16 bits to address a 64K-byte cache and there are only 12 real bits available, we lack four bits. There are thus 16 places in the cache where an operand might reside. Four of these locations are read out of the cache simultaneously on the initial cache read operation. The directory, however, is built to read out all 16 entries simultaneously. Thus, if there is a miss on all of the primary four locations but a hit on one of the other 12, the cache can be read correctly with a minimum delay.

%T The IBM 3090 System: An Overview
%A S.G. Tucker
%J IBM Systems Journal
%V 25
%N 1
%P 4-19
%D 1986

Josh Knight
josh@ibm.com
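[Editorial illustration] The arithmetic in the quoted passage can be checked with a few lines of Python. This is a sketch, not a description of the 3090 hardware: the function names are made up here, and the split of the 16 places into four page-sized classes of four ways each is an inference from Thompson's "4-way set associative" and "low-order 14 bits", not something the article states. Addresses are treated as plain integers with bit 0 as the least significant bit, which is not IBM's bit numbering.

```python
def synonym_geometry(cache_bytes=64 * 1024, ways=4, page_bytes=4 * 1024):
    """Count the places one real line may occupy when the cache is indexed virtually."""
    cache_addr_bits = cache_bytes.bit_length() - 1   # 16 bits address a 64K-byte cache
    invariant_bits = page_bytes.bit_length() - 1     # 12 bits are unchanged by translation
    missing_bits = cache_addr_bits - invariant_bits  # the four bits "we lack"
    places = 1 << missing_bits                       # 16 candidate locations
    return {
        "missing_bits": missing_bits,
        "places": places,
        "primary": ways,               # locations read out on the initial access
        "secondary": places - ways,    # locations found only via the directory
        "classes": places // ways,     # assumed: page-sized classes per way
    }


def lookup_order(virtual_addr, cache_bytes=64 * 1024, ways=4, page_bytes=4 * 1024):
    """Return (primary_class, other_classes) for an access by virtual address."""
    way_bytes = cache_bytes // ways                   # 16K bytes per way
    classes = way_bytes // page_bytes                 # 4 classes under the assumed split
    primary = (virtual_addr // page_bytes) % classes  # picked by the translated index bits
    others = [c for c in range(classes) if c != primary]
    return primary, others


geometry = synonym_geometry()
order = lookup_order(0x3000)
```

With the default parameters, `geometry` reproduces the article's figures (4 missing bits, 16 places, 4 primary, 12 secondary), and `order` shows which class the virtual address selects first and which three the directory must also cover.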