[comp.arch] Inverted Tables

lindsay@MATHOM.GANDALF.CS.CMU.EDU (Donald Lindsay) (02/26/90)

Someone asked via email how you look up an inverted page table. Since
attempts to reply directly didn't work, I'm posting the answer.
Please skip NOW to the next message if you already knew. 


From the CPU's viewpoint, it works like this on a typical machine:

- a virtual address is presented by the program.
- the Translation Lookaside Buffer consults all its entries in parallel, 
  for a recent virtual-to-physical mapping that applies.
- the TLB indicates that it has no mapping: omigod a TLB fault.
- the hardware takes the virtual page# (the high bits of the VA),
  prepends any address space (process) identifier, and hashes.
  (The hash function is defined in the architecture spec.)
- the hashed number is used as a subscript into a table in memory.
  (The CPU keeps the address of this table in a register, or else
  the architecture spec says where the table is.)
- the table entry gives you a physical page #. However, hash
  functions have a probability of collisions. So, the CPU
  uses that physical page# as a subscript, and fetches an entry
  from that famous inverted table.
- that table entry gives a virtual page#. If it matches, you are
  done. Shove the entry into the TLB and things are back to normal.
- if it doesn't match, then the entry also contains a pointer.
  Follow this: you are now going down a single-linked-list
  which threads through further entries in the inverted table.
- eventually, you find an entry (in the inverted table) which
  matches, or, you run off the end of the linked list. In that
  case, there is no valid mapping, and you take a page fault.
- Total: the CPU fetches one word from the hash table, and then
  fetches N entries from the inverted page table. If everything
  is well tuned, then N should on the average be 1.1 or 1.2 or
  so. On the IBM RS, the entries are 16 bytes, which conveniently
  happens to be the width of the bus to memory.
- so, IBM is expecting a TLB fault to take two or three memory reads.

The 88200 uses the direct (tree) scheme, and takes exactly two
one-word reads. The MIPS R3000 takes more, because they handle TLB
faults in software; they claim the faults are too rare to be worth
the hardware.

Other fun issues: keeping the tables in virtual memory: sharing:
multiprocessors.
-- 
Don		D.C.Lindsay 	Carnegie Mellon Computer Science