[comp.arch] RS/6000 cache

mark@hubcap.clemson.edu (Mark Smotherman) (02/07/91)

Thanks to those who emailed responses.  I had not understood that the
RS/6000 cache is a **hybrid** organization.  That is, the set is
selected using bits from the virtual address, while memory transfers
use bits from the physical address.

The key concept that I was missing was that the tag is the full PFN
(page frame number) -- and not merely the high PA (physical address)
bits beyond the index and line offset (as in figure 8.8 on p. 411 in
Hennessy and Patterson).

Most overlapped lookup schemes for physically addressed caches only
use the 12-bit page offset for the index and line offset fields (since
these 12 bits are identical in both the VA and the PA).  (See H&P p.
438.)

     20 (or 30) bit VPN   12 bit offset
VA  [____________________|____________]
    >>>>>>>>>map<<<<<<<<< VVVVVVVVVVVV
PA  [____________________|____________]
     20 bit PFN           12 bit offset

    /                    / subdivide  /
    / PFN becomes tag    / into index /
    /                    / and line   /
    /                    / offset     /
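
The constraint in the diagram above can be sketched in a few lines of
Python.  This is only an illustration, assuming 4K pages; the names
split_physical and max_overlapped_cache_bytes are mine and not from
H&P or any IBM document.

```python
PAGE_OFFSET_BITS = 12  # assumes 4K pages, as in the diagram above


def split_physical(pa, line_bits, index_bits):
    """Split a PA into (tag, index, line offset) for overlapped lookup."""
    # Index and line offset must both fit inside the untranslated
    # page offset, otherwise the lookup cannot start before the TLB.
    if line_bits + index_bits > PAGE_OFFSET_BITS:
        raise ValueError("index + line offset exceed the page offset")
    line_offset = pa & ((1 << line_bits) - 1)
    index = (pa >> line_bits) & ((1 << index_bits) - 1)
    tag = pa >> (line_bits + index_bits)
    return tag, index, line_offset


def max_overlapped_cache_bytes(ways):
    """Largest cache under this scheme: each way covers at most one page."""
    return ways * (1 << PAGE_OFFSET_BITS)
```

With 128-byte lines this leaves only 5 index bits, so a 4-way cache
built this way tops out at 16K bytes.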

However, the current models of the RS/6000 send the low 14 bits of the
VA to the cache for lookup (the architecture reserves the right to send
up to 20 bits).  Thus, the high two bits of the index come from the VA
(they are the low two bits of the VPN) and may differ from the bottom
two bits of the PFN.

                       14 bits to the cache
                       (++ +++++ +++++++)
VA  [__________________.__|_____._______]   128 bytes/line => 7 bit offset
                       / 7 bit  /7 bit  /   14-7=7 bit index => 128 sets
                       / line   /offset /   4 way set assoc => 4 lines/set
                       / index  /in line/

128 sets * 4 lines/set * 128 bytes/line = 64K bytes
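
The field split and the size arithmetic above can be checked with a
short sketch.  The constant and function names are my own, chosen for
illustration; only the bit widths come from the description above.

```python
LINE_BITS = 7    # 128 bytes/line
INDEX_BITS = 7   # 14 - 7 bits of index => 128 sets
WAYS = 4         # 4 way set associative


def cache_lookup_fields(va):
    """Return (set index, offset in line) from the low 14 bits of a VA."""
    line_offset = va & ((1 << LINE_BITS) - 1)
    set_index = (va >> LINE_BITS) & ((1 << INDEX_BITS) - 1)
    return set_index, line_offset


# sets * lines/set * bytes/line
CACHE_BYTES = (1 << INDEX_BITS) * WAYS * (1 << LINE_BITS)
```

For example, cache_lookup_fields(0x12345ABC) gives set 53 and byte 60
within the line, and CACHE_BYTES is 65536.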

Yet you keep the full 20 bit PFN as the tag (and not just the high 18 bits).

PA  [__________________.__|_____._______]
    / tag                 /
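
A small sketch of why the high 18 bits would not be enough as a tag.
Because the top two index bits come from the VA, two physical lines
whose PFNs differ only in their low two bits can land in the same set.
The addresses and the VA-to-PA pairing below are made up for the
example.

```python
def set_index(va):
    """7 bit set index taken from VA bits 7..13."""
    return (va >> 7) & 0x7F


def full_tag(pa):
    """Full 20 bit PFN, as kept in the tag."""
    return pa >> 12


def short_tag(pa):
    """Only the PA bits above a 14 bit index + offset (18 bits)."""
    return pa >> 14


# Hypothetical mappings: two VAs whose VPNs end in the same two bits,
# mapped to PFNs that differ only in their low two bits.
va_a, pa_a = 0x12345ABC, 0xABCDCABC
va_b, pa_b = 0x6789DABC, 0xABCDEABC

same_set = set_index(va_a) == set_index(va_b)
short_tags_collide = short_tag(pa_a) == short_tag(pa_b)
full_tags_differ = full_tag(pa_a) != full_tag(pa_b)
```

Here both lines select set 53 and have identical 18 bit tags, so only
the full 20 bit PFN tells them apart.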

Thus, on a cache miss you keep the 7 bit index to know which set will
receive the reloaded line (which is fetched from memory using the PA).
And on write back you generate the line's PA by taking the 20 bit tag
and appending the low 5 bits of the index, followed by 7 zero bits for
the line offset (ignoring the top two bits of the index, which came
from the VA and may differ from the low two bits of the PFN).
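
The write back address generation can be sketched as follows.  The
function name and the example PFN are mine; the bit widths are the
ones described above.

```python
def writeback_line_address(tag_pfn, set_index):
    """Rebuild the 32 bit line PA from the 20 bit tag and 7 bit index."""
    # Keep only index bits that lie inside the 12 bit page offset;
    # the top two index bits came from the VA and are dropped.
    low_index = set_index & 0x1F
    # 20 bit PFN | 5 bit low index | 7 zero bits of line offset
    return (tag_pfn << 12) | (low_index << 7)


# Hypothetical case: the line sits in set 53 (top index bits 01) and
# its tag is PFN 0xABCDE (low PFN bits 10), so the two really differ.
line_pa = writeback_line_address(0xABCDE, 53)
```

Here line_pa is 0xABCDEA80, which is the PA 0xABCDEABC rounded down
to a 128 byte boundary.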

Somehow I didn't see that the tag overlaps the index.  Very interesting
approach to increasing cache size while retaining the advantages of a
physically-addressed cache with overlapped lookup.

-- 
Mark Smotherman, Comp. Sci. Dept., Clemson University, Clemson, SC 29634
INTERNET: mark@hubcap.clemson.edu    UUCP: gatech!hubcap!mark