[comp.arch] memory management implementations

jonathan@cs.pitt.edu (Jonathan Eunice) (04/02/91)

My impression is that most workstations, and perhaps a fair number of
other systems as well, use some low-cost approximation of the LRU
algorithm--such as a variation on the Clock algorithm--for page
replacement.  (If this is wrong, let me know.)
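
[For reference, the Clock ("second chance") scheme mentioned above can
be sketched in a few lines.  This is a minimal illustration, not any
particular system's pager; the frame count and the reference string
are made up.]

```python
# Minimal sketch of the Clock (second-chance) page-replacement
# algorithm -- the kind of low-cost LRU approximation discussed in
# the post.  All sizes and page numbers are illustrative.

class ClockReplacer:
    def __init__(self, nframes):
        self.frames = [None] * nframes   # resident page numbers
        self.refbit = [False] * nframes  # per-frame "referenced" bits
        self.hand = 0                    # clock hand position

    def access(self, page):
        """Touch a page; return True on a hit, False on a fault."""
        if page in self.frames:
            self.refbit[self.frames.index(page)] = True
            return True
        # Fault: sweep the hand, clearing reference bits as it goes,
        # until a frame with a clear bit is found; evict that frame.
        while self.refbit[self.hand]:
            self.refbit[self.hand] = False
            self.hand = (self.hand + 1) % len(self.frames)
        self.frames[self.hand] = page
        self.refbit[self.hand] = True
        self.hand = (self.hand + 1) % len(self.frames)
        return False

clk = ClockReplacer(3)
faults = sum(0 if clk.access(p) else 1
             for p in [1, 2, 3, 1, 4, 1, 2, 5])
```

[On a fault the hand gives each resident page a "second chance" (one
cleared reference bit) before evicting it, which is why Clock tracks
recency only approximately -- and why it is so cheap.]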

Are the LRU approximations used "good enough"?  That is, with all of
the transistors we can now put on a chip or multi-chip CPU, does it
make sense to use more expensive, closer-to-true-LRU algorithms?
Would this help VM performance by some measurable amount, and is it
reasonable to do?  What is the real-estate/transistor-count cost?
What are the design/opportunity costs?

mash@mips.com (John Mashey) (04/02/91)

In article <JONATHAN.91Apr1213714@speedy.cs.pitt.edu> jonathan@cs.pitt.edu (Jonathan Eunice) writes:
>My impression is that most workstations, and perhaps a fair number of
>other systems as well, use some low-cost approximation of the LRU
>algorithm--such as a variation on the Clock algorithm--for page
>replacement.  (If this is wrong, let me know.)

>Are the LRU approximations used "good enough"?  That is, with all of
>the transistors we can now put on a chip or multi-chip CPU, does it
>make sense to use more expensive, closer-to-true-LRU algorithms?
>Would this help VM performance by some measurable amount, and is it
>reasonable to do?  What is the real-estate/transistor-count cost?
>What are the design/opportunity costs?

I don't recall the exact numbers, but I recall that the effect was:
1) If you have a 2-way set-associative TLB, LRU is important: it has a
better miss rate than random replacement, and is cheap to do.
2) As the degree of associativity rises, the difference between
LRU and random mostly disappears; the cost of implementing LRU may go
up, while the cost of implementing random is minimal.
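
[Mashey's point can be demonstrated with a toy simulation.  Everything
below is an assumption for illustration -- the TLB geometry, the
synthetic address stream, and the seed -- but the qualitative trend it
exhibits (LRU's edge over random shrinking as associativity grows) is
the effect he describes.]

```python
# Toy comparison of LRU vs. random replacement in a small
# set-associative TLB, at 2-way and 8-way associativity with the
# same total number of entries.  All parameters are made up.
import random
from collections import OrderedDict

def miss_rate(addrs, nsets, ways, policy):
    sets = [OrderedDict() for _ in range(nsets)]  # per-set resident tags
    misses = 0
    for a in addrs:
        s, tag = a % nsets, a // nsets
        tlbset = sets[s]
        if tag in tlbset:
            tlbset.move_to_end(tag)         # track recency for LRU
            continue
        misses += 1
        if len(tlbset) >= ways:
            if policy == "lru":
                tlbset.popitem(last=False)  # evict least recently used
            else:
                victim = random.choice(list(tlbset))
                del tlbset[victim]          # evict a random way
        tlbset[tag] = True
    return misses / len(addrs)

random.seed(0)
# Synthetic page-number stream with some locality: a random walk.
page, addrs = 0, []
for _ in range(20000):
    page = max(0, page + random.randint(-3, 3))
    addrs.append(page)

for ways, nsets in [(2, 16), (8, 4)]:       # same 32 total entries
    lru = miss_rate(addrs, nsets, ways, "lru")
    rnd = miss_rate(addrs, nsets, ways, "random")
    print(f"{ways}-way: lru={lru:.3f} random={rnd:.3f}")
```

[For streams with reasonable locality, the gap between the two
policies typically narrows at the higher associativity, matching the
effect described above.]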

Maybe somebody can cite some references.
-- 
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	 mash@mips.com OR {ames,decwrl,prls,pyramid}!mips!mash 
DDD:  	408-524-7015, 524-8253 or (main number) 408-720-1700
USPS: 	MIPS Computer Systems MS 1/05, 930 E. Arques, Sunnyvale, CA 94086

renglish@cello.hpl.hp.com (Bob English) (04/03/91)

mash@mips.com (John Mashey) writes:
> In article <JONATHAN.91Apr1213714@speedy.cs.pitt.edu> jonathan@cs.pitt.edu (Jonathan Eunice) writes:
> >My impression is that most workstations...use some low-cost
> >approximation of the LRU algorithm... Are the LRU approximations
> >used "good enough"?

> I don't recall the exact numbers, but I recall that the effect was:
> 1) If you have a 2-way set-associative TLB, LRU is important: it has a
> better miss rate than random replacement, and is cheap to do.
> 2) As the degree of associativity rises, the difference between
> LRU and random mostly disappears; the cost of implementing LRU may go
> up, while the cost of implementing random is minimal.

For caches and TLBs, this is probably right.  It's not clear to me,
however, whether the question refers to the hardware replacement
policies used by the uPs, or the software replacement policies used in
paging VM systems.  In the latter case, the miss penalty for choosing
the wrong page can be 50 MHz * ~20 msec = ~1 million cycles, and it can
be worth spending a lot of CPU time to avoid a single miss, particularly
if, as is often the case, the workstation doesn't have other important
work to do.
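
[The arithmetic behind that penalty estimate is worth making explicit:]

```python
# Back-of-the-envelope check of the figure quoted above: a ~20 ms
# page-fault service time on a 50 MHz CPU costs about a million lost
# cycles, which is why a software pager can afford to spend real CPU
# time picking a victim page.
cpu_hz = 50e6        # 50 MHz clock
fault_s = 20e-3      # ~20 msec to service a page fault from disk
lost_cycles = cpu_hz * fault_s
```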

--bob--

mmm@cup.portal.com (Mark Robert Thorson) (04/07/91)

Someone once told me (hence I could be all wrong in repeating it) that
the clock or "scheduled sweep" algorithms are preferred to LRU
because, although they don't have quite the best average performance,
they also avoid LRU's worst-case behavior.  For example, say somebody
is repeatedly scanning the same large array: under LRU, each new
array element you move to is the one you visited longest ago, and
hence the one most likely to have been paged out to disk.
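
[That pathological case is easy to reproduce.  The following is an
illustrative simulation (frame and page counts are made up): cycling
through N+1 pages with only N frames makes true LRU evict exactly the
page that will be needed next, so every reference faults, while random
replacement still gets some hits.]

```python
# LRU's worst case: a cyclic scan of 5 pages with only 4 frames.
# True LRU faults on every reference; random replacement does not.
import random
from collections import OrderedDict

def faults_lru(refs, nframes):
    frames, faults = OrderedDict(), 0
    for p in refs:
        if p in frames:
            frames.move_to_end(p)        # mark most recently used
            continue
        faults += 1
        if len(frames) >= nframes:
            frames.popitem(last=False)   # evict least recently used
        frames[p] = True
    return faults

def faults_random(refs, nframes, seed=1):
    rng = random.Random(seed)            # fixed seed for repeatability
    frames, faults = set(), 0
    for p in refs:
        if p in frames:
            continue
        faults += 1
        if len(frames) >= nframes:
            frames.discard(rng.choice(sorted(frames)))  # random victim
        frames.add(p)
    return faults

refs = list(range(5)) * 100   # scan pages 0..4 repeatedly, 500 refs
```

[With 4 frames, faults_lru(refs, 4) faults on all 500 references,
while faults_random(refs, 4) faults on fewer -- the worst-case gap the
post describes.]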