xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) (01/16/91)
daveh@cbmvax.commodore.com (Dave Haynie) writes:

> Actually, the same kind of thing seems to be true of Sun SPARC
> machines. Though the SPARCs seems to have kind of a plateau effect --
> they drop off linearly for CPU hog tasks 1..N, then all of a sudden
> take a nose dive. I don't know if this is a Sun 4 implementation
> detail, or an expected effect of the SPARC architecture, though.

Nope, that is a well known and easily explained characteristic of _any_
virtual memory system, usually called the "working set" phenomenon.

It is a characteristic of any well written, modular software that it
can execute along quite happily with only a small subset of its total
executable's virtual pages in real memory, because execution is focused
for a "substantial" period of time on a limited number of virtual
memory pages called the "working set". It is also characteristic that
the task executes some instructions out of _each_ page of its working
set in every normal preemptive time slice.

So, as long as

  (number of CPU intensive tasks) * (average working set of pages)
      < (total available real memory for virtual pages)

page faults will be relatively rare, and CPU bound jobs will mull
happily away, their execution traces limited to pages already in
memory, for "substantial" (many time slices) periods of time, each
getting a substantial portion of 1/Nth of its stand-alone performance
out of the CPU. When a process does page fault, there is a page not
currently in demand that can be swapped out, and the other jobs use the
extra time to their benefit.

But let the needed virtual page working sets exceed the total real
memory available to hold them, and merry hell breaks loose. Since there
aren't enough real memory pages to hold all the working sets, some job
will page fault, and the page that gets swapped out to make room is one
that another job is going to need _immediately_ when its time slice
comes around in turn, so it will page fault, causing a page to be
swapped out that yet another job is going to need in its next time
slice, which causes a page fault, ... ad infinitum.

The process is called thrashing. Physically, it means the read heads
are moving so fast across the swap area that your computer is trying to
walk off your desk, and _no_ job gets any work done, since every job
that finally comes up to execute in its newly swapped-in page promptly
hits another page of its working set that has been swapped out to
satisfy some other job, so it page faults and goes back to sleep. The
net result is that all N+1 jobs are sleeping waiting for page fault I/O
almost all the time, the swap area I/O is maxed out, and performance
drops into a black hole.

It's really instructive to work through the numbers on this one, but I
don't have the input data available, so you'll have to live with the
qualitative description above, plus the made-up-numbers sketch tacked
on after the signature.

And, this having become a tutorial, a copy goes to .introduction.
Snaffle it, Ferry, for the FAQs, please. Followups back to .advocacy.

Kent, the man from xanth.
<xanthian@Zorch.SF-Bay.ORG> <xanthian@well.sf.ca.us>
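
The promised sketch: a tiny C simulation of the effect, with every
number in it made up purely for illustration (8 pages per working set,
128 real page frames, 100 time slices, a global LRU pager under strict
round-robin scheduling -- none of which is Sun 4 data or how a real
pager actually behaves). It just shows the shape of the curve: the
fault count sits at the cold-start minimum while ntasks * WSET fits in
FRAMES, then blows up by a couple of orders of magnitude the instant it
doesn't. That cliff is the nose dive Dave is seeing.

/*
 * Back-of-the-envelope thrashing sketch.  ALL NUMBERS ARE MADE UP for
 * illustration -- this is not Sun 4 data, and a real pager is smarter
 * than the global-LRU, strict-round-robin toy modeled here.
 *
 * Each of ntasks CPU-bound tasks touches every page of its own
 * WSET-page working set once per time slice.  While ntasks * WSET fits
 * in FRAMES, only the initial cold faults occur; once it doesn't, every
 * page a task needs has just been stolen to satisfy some other task,
 * and the fault count explodes.
 */
#include <stdio.h>

#define MAX_TASKS  24          /* how many CPU hogs to try             */
#define WSET        8          /* assumed pages per working set        */
#define FRAMES    128          /* assumed real page frames available   */
#define SLICES    100          /* time slices simulated per task count */

static int  owner[FRAMES];     /* which task owns the resident page    */
static int  pageno[FRAMES];    /* which of its pages is resident       */
static long stamp[FRAMES];     /* last-touched time, for LRU choice    */

int main(void)
{
    int ntasks, f, t, p, s;

    for (ntasks = 1; ntasks <= MAX_TASKS; ntasks++) {
        long faults = 0, now = 0;

        for (f = 0; f < FRAMES; f++) {          /* start with empty RAM */
            owner[f] = -1;
            stamp[f] = 0;
        }

        for (s = 0; s < SLICES; s++)            /* each time slice...   */
            for (t = 0; t < ntasks; t++)        /* ...each task in turn */
                for (p = 0; p < WSET; p++) {    /* touches its wkg. set */
                    int hit = -1, victim = 0;

                    for (f = 0; f < FRAMES; f++) {
                        if (owner[f] == t && pageno[f] == p)
                            hit = f;            /* page is resident     */
                        if (stamp[f] < stamp[victim])
                            victim = f;         /* least recently used  */
                    }
                    if (hit < 0) {              /* page fault: evict    */
                        faults++;
                        hit = victim;
                        owner[hit]  = t;
                        pageno[hit] = p;
                    }
                    stamp[hit] = ++now;         /* mark as just touched */
                }

        printf("%2d tasks want %3d pages in %3d frames: %6ld faults\n",
               ntasks, ntasks * WSET, FRAMES, faults);
    }
    return 0;
}

Global LRU is used only because it is about the simplest replacement
policy that reproduces the effect; a fancier pager moves the cliff
around a bit, but cannot make it go away once the working sets
genuinely no longer fit in real memory.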