[net.arch] TLB design, nostalgia, & the PDP-8

rpw3@redwood.UUCP (Rob Warnock) (11/13/85)

Original-Subject: Re: 386 info
+---------------
| ...One certainly doesn't want the TLB to have a different idea of these bits
| than those kept in memory, not only for the reasons above, but because:
| 	2) Even worse, you have to keep track of where the TLB entry CAME
| 	FROM, so you can write it back, which adds a lot more state per entry.
| -john mashey | UUCP: 	{decvax,ucbvax,ihnp4}!decwrl!mips!mash
+---------------

Ah, nostalgia time! (See below.) Yes, indeedy, John, you don't want to track
where the entry came from, because you might want it to come from something
other than a page-table fetch triggered by a TLB fault!

First, you might want a diagnostic instruction which (deliberately) writes
an arbitrary TLB cache line into the TLB. In this case, the entry did not
come from ANY entry in the page table.

A more important (and practical) reason: you want the TLB to be "listening"
to memory-bus writes to the page table, so that if a page table line (entry)
is written and the TLB contains a corresponding line, the TLB line is
automatically updated (or at LEAST invalidated). This is so that if you
are using the page table to manipulate data structures (as is often done
in real-time systems or with various IPC message-passing mechanisms), when
you diddle with a page-table entry you don't have to flush the entire TLB.
(Yes, Virginia, a few systems out there have *large* TLBs, which are loaded
by faulting but flushed by a single hardware command.)
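
A minimal C sketch of the "listening" idea may help here. It is an
illustration only, not any particular machine: the table sizes, the
descriptor type, and every name in it (mmu_lookup, bus_write_pte, the
snoop_policy values) are assumptions made up for the example. A write to a
page-table slot either updates or invalidates the matching TLB line, and
lines caching other pages are left alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PT_ENTRIES  64   /* hypothetical page-table size */
#define TLB_ENTRIES 8    /* hypothetical TLB size, direct-mapped with tags */

typedef uint16_t pte_t;  /* a page descriptor; its format is not modelled */

enum snoop_policy { SNOOP_UPDATE, SNOOP_INVALIDATE };

struct tlb_line { bool valid; unsigned vpn; pte_t pte; };

struct mmu {
    pte_t page_table[PT_ENTRIES];      /* lives in ordinary memory */
    struct tlb_line tlb[TLB_ENTRIES];
    unsigned fills;                    /* TLB faults serviced so far */
};

/* Translate: on a miss, fault the descriptor in from the page table. */
static pte_t mmu_lookup(struct mmu *m, unsigned vpn)
{
    assert(vpn < PT_ENTRIES);
    struct tlb_line *l = &m->tlb[vpn % TLB_ENTRIES];
    if (!l->valid || l->vpn != vpn) {
        l->valid = true;
        l->vpn = vpn;
        l->pte = m->page_table[vpn];
        m->fills++;
    }
    return l->pte;
}

/* A memory-bus write to page-table slot vpn, with the TLB snooping it. */
static void bus_write_pte(struct mmu *m, unsigned vpn, pte_t new_pte,
                          enum snoop_policy policy)
{
    assert(vpn < PT_ENTRIES);
    m->page_table[vpn] = new_pte;
    struct tlb_line *l = &m->tlb[vpn % TLB_ENTRIES];
    if (l->valid && l->vpn == vpn) {   /* only the corresponding line */
        if (policy == SNOOP_UPDATE)
            l->pte = new_pte;          /* copy the new descriptor in */
        else
            l->valid = false;          /* at LEAST invalidate it */
    }
}
```

With SNOOP_UPDATE the next reference to the page costs no refill at all;
with SNOOP_INVALIDATE it costs one refill for that page only, rather than a
refill for every page after a whole-TLB flush.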

void nostalgia(interested)
BOOL interested;
{	if(!interested)		/* some won't be ;-} */ 
	    return push('n');
	else
	    read(what_follows);

John Alderman and I built a paging box for a DEC PDP-8 quite a few years
ago, and I designed the kernel of a communications operating system around
it. (The pager was known as the "Core Window" and the O/S was "WSCHED".)
It used a TLB as big as the page table (obviously, direct-mapped), which was
flushed each time you loaded a new page-table address (base register/pointer).

Because the TLB "listened" for writes to the currently active page table,
it was not necessary to flush the TLB when you touched the page table. It
just copied the new descriptor into the TLB whenever it saw it "go by" on
the bus. (Note that it was possible that the FIRST loading of a TLB entry
after a context switch might be via such "fly-by" monitoring. No matter.
It worked.)
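
The behaviour just described can be modelled roughly in C. This is a sketch
under assumptions, not the Core Window hardware: the page-table size
(PT_ENTRIES) is invented, descriptors are treated as opaque words, and all
function names are hypothetical. It shows the three properties from the
text: one TLB line per page-table entry (so no tags are needed), a full
flush on loading the base register, and "fly-by" loading of a line from a
bus write, even if that line has never faulted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MEM_WORDS  32768u  /* 32K words of physical memory */
#define PT_ENTRIES 32u     /* hypothetical: the real table size is not given */

struct core_window {
    uint16_t mem[MEM_WORDS];       /* 12-bit words held in 16-bit cells */
    unsigned pt_base;              /* page-table base register */
    bool     valid[PT_ENTRIES];    /* one TLB line per page-table entry, */
    uint16_t line[PT_ENTRIES];     /* so the index alone identifies it   */
};

/* Loading a new page-table base flushes the whole TLB. */
static void load_pt_base(struct core_window *cw, unsigned base)
{
    assert(base + PT_ENTRIES <= MEM_WORDS);
    cw->pt_base = base;
    memset(cw->valid, 0, sizeof cw->valid);
}

/* Every memory write goes by on the bus; writes that land in the
 * currently active page table are copied into the TLB as they pass. */
static void bus_write(struct core_window *cw, unsigned addr, uint16_t word)
{
    assert(addr < MEM_WORDS);
    cw->mem[addr] = word;
    if (addr >= cw->pt_base && addr - cw->pt_base < PT_ENTRIES) {
        unsigned i = addr - cw->pt_base;
        cw->line[i] = word;
        cw->valid[i] = true;       /* may be the FIRST load of this line */
    }
}

/* Fetch the descriptor for a virtual page, faulting it in if needed. */
static uint16_t descriptor_for(struct core_window *cw, unsigned vpn)
{
    assert(vpn < PT_ENTRIES);
    if (!cw->valid[vpn]) {
        cw->line[vpn] = cw->mem[cw->pt_base + vpn];
        cw->valid[vpn] = true;
    }
    return cw->line[vpn];
}
```

Note that writes to a page table other than the active one are not
snooped; they do not need to be, since the flush in load_pt_base() discards
anything stale when that table is later activated.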

Thus, there was no speed penalty for changing a page-table entry. This made
it VERY convenient to use "clists" (in the Unix TTY-line-discipline sense),
with one full page per "clist" chunk, with the linked-list "next" field
actually being a page-table descriptor for the next page in the "clist".
(The pages were quite small, so this was not unreasonable.)

Further convenience was obtained by a software convention that required
that the current page table always be mapped into the virtual address
space (that is, the page table always had certain entries which mapped
itself into a known absolute virtual address). So all you had to do was
jam a new descriptor into the absolute address which was (mapped to) the
page-table entry for a page, and that virtual page now correctly mapped
the new physical page. This made following a linked list blindingly fast
(for a PDP-8), even across "data fields" (a PDP-8 name for its clunky
4096-word "segments" -- and you thought the 8086 was bad!), because the page
table entries could point anywhere in the full 32K. (Yup! 32K max phys. addr.)

A code sample will say it all (for those who remember PDP-8 assembler!):

	/C is the absolute virtual address of the page table itself
	/    CX is the offset within the page table of the C-segment descriptor
	/    WX is the offset within the page table of the W-segment descriptor
	/W is the absolute virtual address of a "scratch" or "work" page
	/    WLNK is an offset within a W page for the forward link to "next"

	/Code fragment to map the next page in a clist. Roughly equivalent
	/in C to "ptr = ptr->next". Note that "TAD" is PDP-8 "add", which
	/is used also (with AC == 0) as "load"; "DCA" is "store, then clear AC".

			/On entry, W is mapped to the "current" clist page:
	TAD	W+WLNK	/get descriptor of next page
	DCA	C+WX	/store into page table
			/Addresses W, W+1, W+2, ... now "are" the next page.

That's it! Two instructions. With the "Core Window" installed, one might
almost believe a PDP-8 had index registers, instead of only a single
accumulator register!
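
For readers without PDP-8 assembler, here is a rough C simulation of those
two instructions. Everything concrete in it is an assumption for
illustration: the page size (PAGE_WORDS), the number of virtual pages, the
slot numbers chosen for CX and WX, and a descriptor format that is simply a
physical page number. Translation reads the page table directly, with no
TLB, to keep the sketch short. The point is only that storing the link word
through the self-mapped page table re-aims the W window.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_WORDS 32u      /* hypothetical page size ("quite small") */
#define MEM_WORDS  32768u   /* 32K words of physical memory */
#define NPAGES     8u       /* hypothetical virtual pages per context */

enum { CX = 0, WX = 1 };    /* page-table slots of the C and W descriptors */
enum { C = CX * PAGE_WORDS, /* virtual address of the page table itself */
       W = WX * PAGE_WORDS  /* virtual address of the "work" page */ };
enum { WLNK = 0 };          /* offset of the "next" link in a clist page */

struct machine {
    uint16_t mem[MEM_WORDS];
    unsigned pt_phys_page;  /* physical page holding the current page table */
};

/* Virtual-to-physical translation via the in-memory page table. */
static unsigned translate(const struct machine *m, unsigned vaddr)
{
    unsigned vpn = vaddr / PAGE_WORDS;
    assert(vpn < NPAGES);
    unsigned desc = m->mem[m->pt_phys_page * PAGE_WORDS + vpn];
    assert(desc < MEM_WORDS / PAGE_WORDS);
    return desc * PAGE_WORDS + vaddr % PAGE_WORDS;
}

static uint16_t load(const struct machine *m, unsigned vaddr)
{
    return m->mem[translate(m, vaddr)];
}

static void store(struct machine *m, unsigned vaddr, uint16_t word)
{
    m->mem[translate(m, vaddr)] = word;
}

/* "ptr = ptr->next" for a clist whose link field is a page descriptor. */
static void clist_next(struct machine *m)
{
    uint16_t next = load(m, W + WLNK);  /* TAD W+WLNK */
    store(m, C + WX, next);             /* DCA C+WX   */
    /* Addresses W, W+1, W+2, ... now "are" the next page. */
}
```

The self-mapping convention is what makes the store work: slot CX of the
page table holds the descriptor of the page table's own physical page, so
virtual address C+WX lands on the W descriptor.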

In a communications operating system, this was a real winner. There was
one "context" (a "process" or page table) per source/destination connection
(actually two -- one in each direction). It took about three instructions
to set up the context for the connection, once the source of an interrupt
had been identified. The page tables were stored sequentially, so adding
the base of the page-table array to the unit number of the interrupting
device gave you the descriptor to be loaded into the MMU's page table base
register.  Poof! All of the context of the connection was then available
AT ABSOLUTE VIRTUAL ADDRESSES. (Remember: The PDP-8 had NO index registers,
or address registers of ANY sort!) This architecture was able to handle over
3000 user data characters per second, with an interrupt per character per
port (i.e., more than 6000 int./sec., one on the way in and one heading out).
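
As a worked illustration of the two figures above, here is a small C
sketch. The base value PT_ARRAY_BASE, the limit MAX_UNITS, and both
function names are hypothetical; only the "base plus unit number" rule and
the 6000 interrupts per second come from the text. At that rate each
interrupt has a budget of roughly 1,000,000 / 6000, or about 167
microseconds, for everything it does.

```c
#include <assert.h>

#define PT_ARRAY_BASE 0100u  /* hypothetical descriptor of the first page table */
#define MAX_UNITS     16u    /* hypothetical number of device units */

/* Page tables are stored sequentially, so the descriptor to load into
 * the MMU's page-table base register is just base + unit number. */
static unsigned context_for_unit(unsigned unit)
{
    assert(unit < MAX_UNITS);
    return PT_ARRAY_BASE + unit;
}

/* Time available per interrupt, in microseconds, at a given rate. */
static double usec_per_interrupt(double interrupts_per_sec)
{
    assert(interrupts_per_sec > 0.0);
    return 1e6 / interrupts_per_sec;
}
```

The single add in context_for_unit() is why context setup could be so
short: there is no table search, only arithmetic on the unit number.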

Footnote: The "WSCHED" operating system was easily ported several years
later from the "PDP-8 + Core Window" to a vanilla Z-80, by TRANSLITERATING
the assembly language and converting absolute virtual addressing (mapped)
to indexed addressing using the Z-80's "IX" and "IY" registers. The key was
that no algorithms were changed; the code was just "hand compiled" from
one machine to the other. (Took a couple of weeks.) 'Course it DID help
that the code had originally been written in BLISS and hand-compiled to
the PDP-8 in the first place... ;-}  (The BLISS stayed as comments in the
assembler.)

} /* end "nostalgia" */


Rob Warnock
Systems Architecture Consultant

UUCP:	{ihnp4,ucbvax!dual}!fortune!redwood!rpw3
DDD:	(415)572-2607
USPS:	627 26th Ave, San Mateo, CA  94403

mat@amdahl.UUCP (Mike Taylor) (11/14/85)

> Original-Subject: Re: 386 info
> A more important (and practical) reason: you want the TLB to be "listening"
> to memory-bus writes to the page table, so that if a page table line (entry)
> is written and the TLB contains a corresponding line, the TLB line is
> automatically updated (or at LEAST invalidated). This is so that if you
> are using the page table to manipulate data structures (as is often done
> in real-time systems or with various IPC message-passing mechanisms), when
> you diddle with a page-table entry you don't have to flush the entire TLB.
> (Yes, Virginia, a few systems out there have *large* TLBs, which are loaded
> by faulting but flushed by a single hardware command.)

You don't, by any chance, mean System/370 and PTLB? IBM did fix that
(in the 3033, I think) with the IPTE (Invalidate Page Table Entry)
instruction.  Listening to memory-bus writes doesn't help much in,
for example, a multiprocessor system where each CPU has its own TLB
and non-store-through cache, and the TLBs may contain entries for
address spaces that are not currently being dispatched.  For further
complication, page tables are of variable format (segment/page hierarchy),
in order to keep their size reasonable, and the memory system is
multiported.
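
The multiprocessor case can be sketched loosely in C. This is a toy model
and not System/370 semantics: the TLB layout, the address-space tag
(asid), and the functions purge_all and invalidate_everywhere are all
assumptions for illustration. It contrasts purging a whole TLB with
removing one translation from every CPU's TLB, including entries that
belong to address spaces not currently dispatched on that CPU.

```c
#include <assert.h>
#include <stdbool.h>

#define NCPUS       2u   /* hypothetical configuration */
#define TLB_ENTRIES 4u   /* hypothetical per-CPU TLB size */

struct tlb_entry {
    bool     valid;
    unsigned asid;       /* which address space the entry belongs to */
    unsigned vpn;        /* virtual page number */
    unsigned frame;      /* physical frame number */
};

struct cpu { struct tlb_entry tlb[TLB_ENTRIES]; };

/* Blunt instrument: throw away every entry in one CPU's TLB. */
static void purge_all(struct cpu *c)
{
    for (unsigned i = 0; i < TLB_ENTRIES; i++)
        c->tlb[i].valid = false;
}

/* Selective: remove one (asid, vpn) translation from every CPU's TLB.
 * Returns the number of entries removed. */
static unsigned invalidate_everywhere(struct cpu cpus[], unsigned ncpus,
                                      unsigned asid, unsigned vpn)
{
    unsigned removed = 0;
    for (unsigned c = 0; c < ncpus; c++) {
        for (unsigned i = 0; i < TLB_ENTRIES; i++) {
            struct tlb_entry *e = &cpus[c].tlb[i];
            if (e->valid && e->asid == asid && e->vpn == vpn) {
                e->valid = false;
                removed++;
            }
        }
    }
    return removed;
}

static unsigned count_valid(const struct cpu *c)
{
    unsigned n = 0;
    for (unsigned i = 0; i < TLB_ENTRIES; i++)
        n += c->tlb[i].valid ? 1u : 0u;
    return n;
}
```

A bus-snooping TLB would need the equivalent of invalidate_everywhere()
to happen implicitly on each store, which is hard to arrange when the
store may sit in another CPU's non-store-through cache.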
-- 
Mike Taylor                        ...!{ihnp4,hplabs,amd,sun}!amdahl!mat

[ This may not reflect my opinion, let alone anyone else's.  ]