[comp.arch] AM29000 memory management

stuart@rochester.UUCP (04/22/87)

In article <67@bernina.UUCP>, tve@ethz.UUCP (Th. von Eicken) writes:
> When reading the data sheet I noticed that the TLB entries
> do not have any "page used" flag nor any "page modified"
> flag. Does that mean that the AM29000 memory management is even
> more crippled than on a VAX (which doesn't have a "page used" flag)?

A translation lookaside buffer (TLB) is not the same as page tables (PT).
The TLB serves as a cache of recently used address translations, while
the PT serves as the source of translation information.  Reference
(page used) and dirty (page modified) flags belong in the PT.

Stu Friedberg  {seismo, allegra}!rochester!stuart  stuart@cs.rochester.edu

tim@amdcad.AMD.COM (Tim Olson) (04/23/87)

In article <27207@rochester.ARPA>, stuart@rochester.ARPA (Stuart Friedberg) writes:
> In article <67@bernina.UUCP>, tve@ethz.UUCP (Th. von Eicken) writes:
> > When reading the data sheet I noticed that the TLB entries
> > do not have any "page used" flag nor any "page modified"
> > flag. Does that mean that the AM29000 memory management is even
> > more crippled than on a VAX (which doesn't have a "page used" flag)?
> 
> A translation lookaside buffer (TLB) is not the same as page tables (PT).
> The TLB serves as a cache of recently used address translations, while
> the PT serves as the source of translation information.  Reference
> (page used) and dirty (page modified) flags belong in the PT.
> 
> Stu Friedberg  {seismo, allegra}!rochester!stuart  stuart@cs.rochester.edu

Actually, the referenced and changed bits are for use by the page replacement
algorithm, and are associated with physical addresses.  The PTs, on the other
hand, are (usually) searched using virtual addresses.  Unless you use an
inverted page table (IPT) structure, the R & C bits should be placed in a
separate structure which can be "searched" with physical addresses.

The "best" place for the referenced and changed bits, however, is in an
external memory array, which "watches" the bus and automatically updates the
R & C bits.  This array can also be read from or written to via I/O space
to read or clear the bits.
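
A minimal sketch of such a bus-watching array, modelled in C rather than taken from any real design. The page size, the frame count, and the names rc_snoop, rc_read and rc_clear are all assumptions made for illustration; real hardware would do the rc_snoop step in logic on every bus cycle, and the OS would reach rc_read and rc_clear through I/O space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed parameters, for illustration only: 4 KB pages, 1024 frames. */
#define PAGE_SHIFT 12u
#define NUM_FRAMES 1024u

#define RC_REFERENCED 0x1u
#define RC_CHANGED    0x2u

/* One entry per physical page frame, modelling the external array. */
static uint8_t rc_array[NUM_FRAMES];

/* What the snooping logic does for every bus cycle it observes.
 * Addresses beyond the modelled range simply wrap in this sketch. */
void rc_snoop(uint32_t phys_addr, bool is_write)
{
    uint32_t frame = (phys_addr >> PAGE_SHIFT) % NUM_FRAMES;

    rc_array[frame] |= RC_REFERENCED;
    if (is_write)
        rc_array[frame] |= RC_CHANGED;
}

/* OS side: read the R & C bits for one frame. */
uint8_t rc_read(uint32_t frame)
{
    assert(frame < NUM_FRAMES);
    return rc_array[frame];
}

/* OS side: clear the bits, e.g. at the start of a replacement scan. */
void rc_clear(uint32_t frame)
{
    assert(frame < NUM_FRAMES);
    rc_array[frame] = 0;
}
```

Because the array is indexed by physical frame number, the processor's TLB never has to be consulted or flushed to learn the bits, and any bus master (including an I/O device) updates them the same way.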

Benefits:

	1) Speed -- R & C bits do not need to explicitly be written to memory,
	   nor do they need to be periodically cleared from the TLB.

	2) single copy -- R & C bits exist in only one place and are always
	   up-to-date.  If they were to exist in the TLB, they would have to
	   be "flushed" from the TLB into the correct memory locations every
	   time the page replacement algorithm runs.  This would *really* be
	   a headache for multiprocessor systems which use shared memory.
	   Works with I/O, too.

	3) Fairly inexpensive & easy solution (heck, even the IBM RT-PC does
	   it this way!)

If you *really* want to keep the R & C bits in a software structure in main
memory, it can be done using the standard tricks.  EXERCISE FOR READER:

given a tlb entry which holds the following information:

	VTAG	-- virtual address for this translation
	  UR	-- page has user read permission
	  UW	-- page has user write permission
	  UE	-- page has user execute permission
	 RPN	-- real (physical) page number
	 PGM	-- two user-programmable bits (which also appear on the pins)
	   F	-- a user-programmable bit (doesn't appear on the pins)

devise a method to collect page reference and change statistics.
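
One possible answer, sketched as a small C model and not taken from the post: assume that TLB misses and protection violations both trap to software, keep R and C in a software page table, record R in the miss handler, and withhold UW from the TLB entry until the first write has trapped and set C. The structures, the function names, and the one-TLB-slot-per-page layout are hypothetical simplifications; the F and PGM bits are not used here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_PAGES 8u   /* hypothetical tiny address space */

/* Software page table entry: the "true" state, including R and C. */
struct pte {
    uint32_t rpn;
    bool writable;     /* page really has user write permission */
    bool referenced;   /* R, set by the TLB miss handler         */
    bool changed;      /* C, set by the write-protection trap    */
};

/* Subset of the TLB fields listed above, plus a valid flag. */
struct tlb_entry {
    bool valid;
    uint32_t vtag;
    uint32_t rpn;
    bool uw;
};

static struct pte page_table[NUM_PAGES];
static struct tlb_entry tlb[NUM_PAGES];   /* one slot per page, for simplicity */

/* TLB miss trap: first touch since the last invalidation marks R. */
static void tlb_miss(uint32_t vpn)
{
    struct pte *p = &page_table[vpn];

    p->referenced = true;
    tlb[vpn].valid = true;
    tlb[vpn].vtag = vpn;
    tlb[vpn].rpn = p->rpn;
    /* Grant UW only if C is already recorded, so the first write traps. */
    tlb[vpn].uw = p->writable && p->changed;
}

/* Write-protection trap: returns false for a genuine violation. */
static bool write_protect_trap(uint32_t vpn)
{
    struct pte *p = &page_table[vpn];

    if (!p->writable)
        return false;
    p->changed = true;
    tlb[vpn].uw = true;    /* later writes run at full speed */
    return true;
}

/* Model one user access; returns false on a real protection fault. */
bool access_page(uint32_t vpn, bool is_write)
{
    assert(vpn < NUM_PAGES);
    if (!tlb[vpn].valid)
        tlb_miss(vpn);
    if (is_write && !tlb[vpn].uw)
        return write_protect_trap(vpn);
    return true;
}

/* Replacement scan: clear R and drop the translation to re-arm the miss. */
void clear_referenced(uint32_t vpn)
{
    assert(vpn < NUM_PAGES);
    page_table[vpn].referenced = false;
    tlb[vpn].valid = false;
}
```

The cost is one extra trap per page per clean-to-dirty transition, plus one extra miss per page each time R is cleared.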

	-- Tim Olson
	Advanced Micro Devices
	Processor Strategic Development
	(tim@amdcad.AMD.COM)

rich@motsj1.UUCP (Rich Goss) (04/24/87)

In article <27207@rochester.ARPA> stuart@rochester.ARPA (Stuart Friedberg) writes:
>In article <67@bernina.UUCP>, tve@ethz.UUCP (Th. von Eicken) writes:
>> When reading the data sheet I noticed that the TLB entries
>> do not have any "page used" flag nor any "page modified"
>> flag. Does that mean that the AM29000 memory management is even
>> more crippled than on a VAX (which doesn't have a "page used" flag)?
>
>A translation lookaside buffer (TLB) is not the same as page tables (PT).
>The TLB serves as a cache of recently used address translations, while
>the PT serves as the source of translation information.  Reference
>(page used) and dirty (page modified) flags belong in the PT.
>
>Stu Friedberg  {seismo, allegra}!rochester!stuart  stuart@cs.rochester.edu

The used and modified flags should be stored along with the page
address in the TLB. Otherwise, the MMU will always have to check
the page table descriptor in main memory to see if these two
flags have been updated on every access to the page being
referenced. For example, a page is read accessed for the first time
and the modify bit is brought into the TLB but not set. If the
page is read accessed again the entry for the page in the TLB is
correct and the page table descriptor in main memory need not be referenced.
However, the next access is a write access to the page. The modify
bit in the TLB is checked, then set, then the MMU should go out
to the page table descriptor in main memory to set the modify bit
in the page table descriptor to agree with the copy of the modify
bit in the TLB. The next access is a write to the page. The TLB
indicates the modify bit has already been set. Therefore the page
table descriptor is correct (the modify bit having already been
set) and no further action is required by the MMU. One can see
that if the modify bit was not cached in the TLB, the MMU would
have to go out to main memory every time the page is referenced 
in order to check and/or set the modify bit.
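
The read, read, write, write sequence above can be modelled in a few lines of C. This is only a sketch of the scheme as described in this post, not of the 68851 itself; the structure names and the pt_accesses counter are invented for illustration, and a single page is modelled.

```c
#include <assert.h>
#include <stdbool.h>

struct pt_desc { bool used, modified; };   /* descriptor in main memory */
struct tlb_ent { bool valid, modified; };  /* cached copy in the TLB    */

static struct pt_desc desc;
static struct tlb_ent ent;
static unsigned pt_accesses;   /* trips to the descriptor in main memory */

/* One access to the page by a hardware-managed MMU. */
void mmu_access(bool is_write)
{
    if (!ent.valid) {
        /* TLB miss: fetch the descriptor and mark the page used. */
        pt_accesses++;
        desc.used = true;
        ent.valid = true;
        ent.modified = desc.modified;
    }
    if (is_write && !ent.modified) {
        /* First write: set the bit in the TLB and write it through. */
        ent.modified = true;
        desc.modified = true;
        pt_accesses++;
    }
    /* Otherwise the cached bits already agree with the descriptor. */
}
```

Running a read, a read, a write and a write through mmu_access leaves pt_accesses at 2: one for the initial miss and one for the first write.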

I do not know how the 29000 MMU operates but the scenario I have
described is used in many demand paged MMU schemes including the
Motorola 68851 PMMU.

-- Rich Goss,
   Motorola Western Regional Field Applications Engineer for 68000 Family

mash@mips.UUCP (John Mashey) (04/25/87)

In article <121@motsj1.UUCP> rich@motsj1.UUCP (Rich Goss) writes:
...discussions of TLBs that don't have hardware-set modify and access
bits...
>
>The used and modified flags should be stored along with the page
>address in the TLB. Otherwise, the MMU will always have to check
>the page table descriptor in main memory to see if these two
>flags have been updated on every access to the page being
>referenced......
>I do not know how the 29000 MMU operates but the scenario I have
>described is used in many demand paged MMU schemes including the
>Motorola 68851 PMMU.

1) It is true that many do work this way.  That is NOT a reason to
claim that they "should" work this way, or would have trouble otherwise.
2) I guarantee you that you don't
need to have hardware-set modify and reference bits in the TLB.
Machines work perfectly well without them.  The one I'm writing this on
works without them, and its performance is fine, and both System V and
4.3BSD ports came up quickly on it.
3) With some operating systems, the LAST thing in the world that
the OS wants is for the hardware to set modify bits anywhere without
trapping to the OS. For example, System V does copy-on-write handling,
and it ends up making pages look read-only that are really writable,
so it can trap writes, copy the page, and then give it a modifiable
page.  I.e., it has to fake out the hardware to get done what it wants.
4) All the statistics say that "change writable, but clean page to
dirty page" is an infrequent event, that happens on the order of
page-in rates, not on the order of memory access rates.  Hence it is
quite reasonable to do it in software.
5) Others have already discussed the rationale for doing TLBs this way.
See also "Operating System Support on a RISC", IEEE COMPCON,
San Francisco, March 1986, pp. 138-143: we talked about this a year ago.
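
The copy-on-write trick in point 3 can be sketched as a small C model. This is an illustration under assumed names (struct mapping, share_cow, write_fault, store_byte) and a toy page size, not System V code: a shared page is made to look read-only to the hardware, and the write trap either copies it or, if it is no longer shared, simply makes it writable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SIZE 64u   /* tiny page, for illustration only */

struct frame {
    unsigned refcount;
    unsigned char data[PAGE_SIZE];
};

struct mapping {
    struct frame *frame;
    bool hw_writable;   /* what the hardware is told              */
    bool cow;           /* really writable, but shared until a write */
};

/* Share one frame between two mappings; both look read-only to hardware. */
void share_cow(struct mapping *parent, struct mapping *child)
{
    parent->hw_writable = false;
    parent->cow = true;
    *child = *parent;
    parent->frame->refcount++;
}

/* Write trap. Returns false for a genuine violation or allocation failure. */
bool write_fault(struct mapping *m)
{
    if (!m->cow)
        return false;
    if (m->frame->refcount > 1) {
        struct frame *copy = malloc(sizeof *copy);
        if (copy == NULL)
            return false;
        memcpy(copy->data, m->frame->data, PAGE_SIZE);
        copy->refcount = 1;
        m->frame->refcount--;
        m->frame = copy;        /* give the writer its own page */
    }
    m->cow = false;
    m->hw_writable = true;      /* later writes no longer trap */
    return true;
}

/* Model one store through a mapping. */
bool store_byte(struct mapping *m, unsigned off, unsigned char v)
{
    assert(off < PAGE_SIZE);
    if (!m->hw_writable && !write_fault(m))
        return false;
    m->frame->data[off] = v;
    return true;
}
```

A hardware-set modify bit would not help here: the OS needs the trap itself, and once it is taking that trap it can record "dirty" in software at the same time.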
-- 
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	{decvax,ucbvax,ihnp4}!decwrl!mips!mash, DDD:  	408-720-1700, x253
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086

ned@john.ksr.com (Kittlitz) (04/28/87)

Another advantage of putting the M bits in the memory controller is
powerfail recovery. If you keep your memory board backed up, the
modified information stays for free. If you have substantial
modularization, there is no longer a need to keep the processor on
battery power. (All this assumes that you have some small amount of time
to save register state, etc. when powerfail is first detected.)

ned@john.ksr.com (Kittlitz) (04/28/87)

How does the MIPS TLB load deal with used/modified bits? I assume that
part of the path always executed determines that the bits are in the
correct state, i.e., some of the instructions included in your count
are for is-used-set?, is-this-a-write-well-then-is-modified-set?.

Assuming that these are coded with branch-not-taken being common
(used/modified already has correct value), a multiprocessor still has
to do some kind of exclusivity locking when it diddles the bits.
Got any multi implementation or plans you can talk about? e.g.
how long would THAT sequence be?
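
To make the question concrete, here is a sketch of what such a sequence might look like, using C11 atomics as a stand-in for whatever interlocked operation the hardware offers. It says nothing about the actual MIPS implementation; the bit layout and the name pte_set_bits are assumptions. The common case (bits already correct) is a plain load and a not-taken branch; only the rare transition pays for an atomic read-modify-write.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit positions within a page table entry. */
#define PTE_USED     0x1u
#define PTE_MODIFIED 0x2u

/* Set bits in a shared PTE. Returns true only if this call changed it. */
bool pte_set_bits(_Atomic uint32_t *pte, uint32_t bits)
{
    /* Fast path: the bits are already set, so no interlocked access. */
    if ((atomic_load_explicit(pte, memory_order_relaxed) & bits) == bits)
        return false;

    /* Slow path: atomic OR, so a concurrent update of the other bit
     * by another processor cannot be lost. */
    uint32_t old = atomic_fetch_or_explicit(pte, bits, memory_order_acq_rel);
    return (old & bits) != bits;
}
```

Since the bits only ever go from clear to set on this path, an atomic OR is enough; clearing them during a replacement scan would need the same care plus TLB invalidation on the other processors.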

Can you describe any special hardware characteristics of the TLB in
greater detail (or is it published)? How about the way you use the TLB
in your UN*X implementation?



-----
disclaimer: generic disclaimer *

* - have you seen advertisements with '*'s, and no footnote?
It's the same with this disclaimer...