[comp.arch] Working Set replacement, use bits

wilson@carcoar.Stanford.EDU (Paul Wilson) (08/29/89)

(This is in response to some rather uninformed traffic in comp.arch.
Since I'm comparatively uninformed myself, OS folk may want to
correct mistakes or point to other interesting stuff.)

   First some references, then some explanations.

People who are interested in the Working Set virtual memory page
replacement policy should probably read Denning's article
"Virtual Memory," ACM Computing Surveys, vol. 2, no. 3, September 1970.
(By the way, Denning invented WS.)

If you want a pretty good explanation why you shouldn't expect
any other policy to work dramatically better than WS, you should check
out Denning's "Working Sets Past and Present," IEEE Trans. on
Software Engineering, vol. SE-6, no. 1, January 1980.

If you want to know how to implement good approximations of
Working Set on hardware without use bits, check out Ozalp
Babouglu and Domenico Ferrari, "Two-Level Replacement
Decisions in Paging Stores," IEEE Trans. on Computers,,
vol. 32, #12, December 1983.

   An explanation of WS:

The basic idea of the Working Set policy is that each process gets
to keep in memory all of the pages that it has referenced within
some period of time T.  Within any given process, the page evicted
is always the Least Recently Used (LRU) one.  What makes it different
from normal LRU is that the amount of memory allocated to a process
is variable -- it is however many pages were referenced within the
last T units of time.  (That is, T units of the given process's own
execution time.)


This works better than setting aside a fixed number of pages for
each process, because the working set size for a process varies
over time.  When a process goes through a phase of having a large
working set, it uses more pages.  As long as there aren't too
many processes with large working sets at the same time, all is
well.  If the total memory requirements become too large, some
process is selected and de-activated.  (This may also happen
by attrition, I think:  a process may terminate and the system may not
replace it with another.)
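
Here's a rough sketch of the load-control side, just to show the shape
of the decision.  It assumes the per-process working-set sizes are
already known (they're hard-coded here), and the memory size and
victim-selection rule are made up; real systems differ:

/* Working-set load control: when the sum of the resident sets no
 * longer fits in memory, a whole process is deactivated rather than
 * squeezing everyone's allocation.  All numbers are hypothetical.
 */
#include <stdio.h>

#define MEM_PAGES 100               /* hypothetical physical memory    */

int main(void)
{
    int ws[]     = {30, 25, 40, 35};   /* |W| for four processes       */
    int active[] = { 1,  1,  1,  1};
    int n = 4;

    for (;;) {
        int demand = 0;
        for (int i = 0; i < n; i++)
            if (active[i]) demand += ws[i];
        if (demand <= MEM_PAGES)
            break;                     /* every active WS fits: done   */

        /* Pick a victim to deactivate.  Choosing the largest working
         * set is just one of many possible rules, made up here.       */
        int victim = -1;
        for (int i = 0; i < n; i++)
            if (active[i] && (victim < 0 || ws[i] > ws[victim]))
                victim = i;
        active[victim] = 0;
        printf("deactivating process %d (|W| = %d)\n", victim, ws[victim]);
    }
    return 0;
}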

This avoids thrashing by guaranteeing that active processes get to
keep their working sets in memory, so that they make significant
progress.  There's little sense in scheduling more processes than
memory can hold working sets for, because they'd spend most of their
time fighting over pages of main memory.


The Working Set policy uses Denning's notion of "working set," which
he operationally defines as the pages touched within the last T units
of time.  This of course is only an approximation of the intuitive
notion of a working set, but it is a pretty good approximation.

One problem with this approximation is that it includes too many
pages when a program switches from one phase of operation to another.
When a program changes working sets (in the intuitive sense), its
"working set" actually includes *both* the working set it just stopped
referencing and the working set it has begun to reference.  This
causes peaks in the size of the "working set" as determined by
Denning's algorithm.
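
For a concrete (made-up) case: with a window of six references, a
program that loops over pages {1,2,3} and then switches to looping
over {7,8,9} briefly measures a "working set" of all six pages just
after the switch -- double its steady-state size -- until the old
pages age out of the window.  (The toy trace in the first sketch
above shows exactly this bump.)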

This isn't so much of a problem on a large time-sharing system,
where you can average over a bunch of processes, so transitory
peaks are usually easily absorbed.  With a single dominant
process (as is often the case on workstations), however, there
simply may not be enough slack to absorb the peak.  In that
case, the WS policy degenerates to a (fixed-space) LRU policy.

LRU seems to work about as well as anything anybody's tried, but
my own suspicion is that a smarter adaptive algorithm could work
significantly better.  (Not dramatically better, but significantly.)


   An explanation of how to fake WS without use bits:

The reason why you don't need use bits to implement a good replacement
policy is that you can essentially fake them with protection exceptions.

Suppose you simply access-protect all of your pages.  Whenever a page
is referenced, you get a protection exception, and your exception
handler can move the page to the head of the queue (and re-protect it
once the access completes, so that the next reference gets caught too).
_Voila_, a perfect LRU ordering without use bits.  The problem is that
the exceptions cost something, so this would be very inefficient.
They don't cost a whole lot individually though, because there's no
disk-seek time or anything like that -- it all happens in memory.
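
Here's a user-level sketch of the trap mechanism, assuming POSIX
mprotect()/sigaction() with SA_SIGINFO, plus MAP_ANONYMOUS (which isn't
strictly POSIX).  A real implementation would live in the kernel's
fault path, and this leaves out the re-protect step needed to catch
*every* reference -- it only shows how a protection fault turns into a
software use bit:

/* Turning protection faults into software "use bits": all pages start
 * access-protected; the SIGSEGV handler records the touch and then
 * unprotects the page so the faulting access can complete.
 */
#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define NPG 4

static char *region;
static size_t page_size;
static volatile sig_atomic_t used[NPG];    /* software use bits        */

static void on_fault(int sig, siginfo_t *si, void *ctx)
{
    (void)sig; (void)ctx;
    char *page = (char *)((uintptr_t)si->si_addr & ~(uintptr_t)(page_size - 1));
    if (page < region || page >= region + NPG * page_size)
        _exit(1);                          /* a genuine stray fault     */
    used[(page - region) / page_size] = 1; /* record the reference      */
    mprotect(page, page_size, PROT_READ | PROT_WRITE);  /* let it run   */
}

int main(void)
{
    page_size = (size_t)sysconf(_SC_PAGESIZE);

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_flags = SA_SIGINFO;
    sa.sa_sigaction = on_fault;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, NULL);

    /* NPG access-protected pages standing in for a resident set.      */
    region = mmap(NULL, NPG * page_size, PROT_NONE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED) return 1;

    region[0] = 1;                 /* faults once, sets used[0]         */
    region[2 * page_size] = 2;     /* faults once, sets used[2]         */
    region[1] = 3;                 /* no fault: page 0 now unprotected  */

    for (int i = 0; i < NPG; i++)
        printf("page %d use bit = %d\n", i, (int)used[i]);
    return 0;
}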

To get rid of most of the rest of the cost, you can give pages a "grace
period" after you move them to the head of the queue.  For a while,
you don't access-protect them.  Thus the head of the LRU queue is 
really a FIFO queue, since you're not detecting references and
reordering things.  Only when a page gets into the other part of
the queue do you access-protect it.
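
Here's a toy simulation of that two-segment queue for a single process
with a fixed resident set (sizes, trace, and names made up): the first
GRACE slots are the unprotected FIFO head, the rest are protected, and
only references that land in the protected part cost a trap and get
reordered:

/* Two-segment queue: queue[0..GRACE-1] is the unprotected FIFO head
 * (references there go unnoticed), the rest is access-protected (a
 * reference there costs one trap and moves the page to the front).
 */
#include <stdio.h>
#include <string.h>

#define FRAMES 6     /* resident pages for one process (hypothetical) */
#define GRACE  2     /* unprotected head of the queue                 */

static int queue[FRAMES];   /* queue[0] = most recently faulted-on    */
static int nresident = 0;
static int traps = 0;       /* protection faults (reference traps)    */
static int misses = 0;      /* real page faults (page not resident)   */

static int find(int page)
{
    for (int i = 0; i < nresident; i++)
        if (queue[i] == page) return i;
    return -1;
}

/* Put 'page' at the front, shifting the first nshift entries down.
 * Whatever slides past slot GRACE-1 would get access-protected again;
 * whatever falls off the end is the (approximate LRU) eviction victim. */
static void push_front(int page, int nshift)
{
    memmove(&queue[1], &queue[0], (size_t)nshift * sizeof queue[0]);
    queue[0] = page;
}

static void reference(int page)
{
    int pos = find(page);

    if (pos >= 0 && pos < GRACE)
        return;                       /* unprotected head: no trap      */

    if (pos >= 0) {                   /* protected tail: trap, reorder  */
        traps++;
        push_front(page, pos);
        return;
    }

    misses++;                         /* not resident: load (and evict) */
    if (nresident < FRAMES)
        nresident++;
    push_front(page, nresident - 1);
}

int main(void)
{
    int trace[] = {1,2,3,4,5,6, 1,2,3, 7, 1,2,3};
    for (size_t i = 0; i < sizeof trace / sizeof trace[0]; i++)
        reference(trace[i]);
    printf("reference traps: %d, real page faults: %d\n", traps, misses);
    return 0;
}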

This sort of scheme can give you a very good approximation of LRU without
very many exceptions and without requiring use bits.



Paul R. Wilson                         
Software Systems Laboratory               lab ph.: (312) 996-9216
U. of Illin. at C. EECS Dept. (M/C 154)   wilson@bert.eecs.uic.edu
Box 4348   Chicago,IL 60680 

hascall@atanasoff.cs.iastate.edu (John Hascall) (08/29/89)

In article <11564> wilson@carcoar.Stanford.EDU (Paul Wilson) writes:
 
}   An explanation of WS:
 
}The basic idea of the Working Set policy is that each process gets
}to keep in memory all of the pages that it has referenced within
}some period of time T.  Within any given process, the page evicted
}is always the Least Recently Used (LRU) one.  What makes it different
}from normal LRU is that the amount of memory allocated to a process
}is variable -- it is however many pages were referenced within the
}last T units of time.  (That is, T units of the given process's own
}execution time.)
 
    It has been a while since I looked at this, so my memory could be
    a little moldy...

    I believe that when designing the WS scheme for VMS they studied
    various replacement algorithms (LRU, FIFO and RANDOM) and found
    that RANDOM was just about as effective as the other two.  It also
    has the virtue of being easier to implement.

    Does VMS, in fact, still use random replacement?

 John Hascall