[comp.unix.i386] Is VHANDFRAC --> VHANDL dynamic?

jr@oglvee.UUCP (Jim Rosenberg) (07/07/90)

I have seen paging behavior under AT&T UNIX V.3.2 on a 6386 where the number
of free memory pages as reported by sar seems to sit forever drastically
below what I had *thought* was the low-water mark.  This always seems to
happen after paging orgies, which continue long after processes that caused
all the paging have terminated.  According to the AT&T documentation:

"VHANDFRAC determines the initial value for the system variable VHANDL.
VHANDL is set to the maximum user-available memory divided by VHNDFRAC or
the value of GPGSHI, whichever is larger." (Operations/System Administration
Guide, under Paging Parameters.)

This sort of implies but doesn't really say that VHANDL is computed at boot
time and thereafter left alone.  Is this really how it happens?  Can VHANDL
get recalculated?  How do I find out what the "real" low-water mark is?  How
else can I explain a quiet system just sitting there with far fewer free
pages than the low-water mark?
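
As a rough illustration of the rule quoted from the guide, here is a minimal
sketch of the boot-time calculation. The function name, the use of integer
division, and the sample numbers (page counts and tunable values) are
assumptions made for the example, not values taken from any real kernel.

```python
def initial_vhandl(maxmem_pages, vhandfrac, gpgshi):
    """Boot-time VHANDL as the quoted guide describes it: the larger of
    (user-available memory / VHANDFRAC) and GPGSHI. All units are pages.
    Integer division is an assumption of this sketch."""
    return max(maxmem_pages // vhandfrac, gpgshi)


# Hypothetical machine with 1024 user-available pages:
# the memory-based term (1024 // 16 = 64) is larger than GPGSHI (40).
print(initial_vhandl(1024, 16, 40))

# Hypothetical small machine with 256 pages:
# the memory-based term (256 // 16 = 16) is smaller, so GPGSHI (40) is used.
print(initial_vhandl(256, 16, 40))
```

If the guide's wording is accurate, nothing in this formula depends on
run-time state, which fits the reading that the value is fixed at boot.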

The V.3 paging parameters mystify me.  I get more JUNK in my mail about
training seminars, but I would *beg* my management to go to a good one-day
tutorial on V.3 paging parameters.  Help!
-- 
Jim Rosenberg             #include <disclaimer.h>      --cgh!amanue!oglvee!jr
Oglevee Computer Systems                                        /      /
151 Oglevee Lane, Connellsville, PA 15425                    pitt!  ditka!
INTERNET:  cgh!amanue!oglvee!jr@dsi.com                      /      /

pcg@cs.aber.ac.uk (Piercarlo Grandi) (07/07/90)

In article <562@oglvee.UUCP> jr@oglvee.UUCP (Jim Rosenberg) writes:

   This sort of implies but doesn't really say that VHANDL is computed at boot
   time and thereafter left alone.  Is this really how it happens?  Can VHANDL
   get recalculated?

From what I understand VHANDL *is* a constant.

   How do I find out what the "real" low-water mark is?  How else can I
   explain a quiet system just sitting there with far fewer free pages
   than the low-water mark?

You probably are thinking of the wrong definition of 'free' page. In
theory in a machine where the total of process virtual memory sizes is
larger than the physical memory available, there should not ever be any
really free pages. What happens is that the pager will make a distinction
between active and *inactive* pages; and will put the *inactive* pages
on the to-be-free list, ready to be reused, but it will not actually
clean them out and reuse them unless there is demand for actually-free
memory.

   The V.3 paging parameters mystify me.  I get more JUNK in my mail
   about training seminars,

What about reading the book "The design of the UNIX operating system" by
Bach, Prentice-Hall, ISBN 0-13-201757-1, that describes in some detail
the algorithms used by the System V pager? I have a copy before me just
now, and it has Chapter 9 "Memory management policies" which is quite
explicit, e.g. subsection 9.2.2, "The Page-Stealer Process", page 294.
It might also help to read the article on the 386 port that appears in
"Unix Papers" published by SAMS.

   but I would *beg* my management to go to a good one-day tutorial on
   V.3 paging parameters.  Help!

Unfortunately the System V paging and swapping policies *stink* as Bach
regretfully hints, and the only advice you will get from AT&T is to buy
more memory so that they will never get exercised. Hope that S5.4 is not
that bad -- after all it is largely influenced by SunOS, and Sun
recently corrected at least one of the most glaring mistakes they had
made in the SunOS paging/swapping algorithms...

Probably helping Toshiba and Hitachi clear the memory chip glut is the
easiest way out. This offends my aesthetics, but hey, USA operating
system designers that care/know about virtual memory management are
probably rarer than Japanese VLSI process engineers :-).

--
Piercarlo "Peter" Grandi           | ARPA: pcg%cs.aber.ac.uk@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcsun!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk

rick@pcrat.uucp (Rick Richardson) (07/10/90)

In article <PCG.90Jul7174558@odin.cs.aber.ac.uk> pcg@cs.aber.ac.uk (Piercarlo Grandi) writes:

>Probably helping Toshiba and Hitachi clear the memory chip glut is the
>easiest way out. This offends my aesthetics, but hey, USA operating
>system designers that care/know about virtual memory management are
>probably rarer than Japanese VLSI process engineers :-).

I'll give 'em a little slack - it wasn't until recently that they
had a decent test suite for the policies.  Thanks go to the OSF,
which came out with the "Motif Brand System V.3 Virtual Memory
T[h]rasher".

	 20K	HelloWorld, Text Version
	100K	VGA Graphics Version
	300K	X11 Athena Widgets Version
	700K	X11 OSF Motif Version

-Rick

-- 
Rick Richardson | Looking for FAX software for UNIX/386 ??? Ask About: |Mention
PC Research,Inc.| FaxiX - UNIX Facsimile System (tm)                   |FAX# for
uunet!pcrat!rick| FaxJet - HP LJ PCL to FAX (Send WP,Word,Pagemaker...)|Sample
(201) 389-8963  | JetRoff - troff postprocessor for HP LaserJet and FAX|Output

stripes@eng.umd.edu (Joshua Osborne) (07/10/90)

In article <1990Jul9.192912.2001@pcrat.uucp> rick@pcrat.UUCP writes:

  [...Size of different "Hello World" programs]
>	300K	X11 Athena Widgets Version
Try that on a Sun (or any other place with shared libs).

jolt: cd /usr/local/X11R4/bin
jolt: ls -s
total 3240
   1 X@                    2 startx*              24 xlogo*
 800 Xsun*               176 twm*                 24 xlsatoms*
  16 appres*              32 xauth*               24 xlsclients*
  16 atobm*               40 xbiff*               24 xlsfonts*
  32 bdftosnf*            80 xcalc*               16 xlswins*
  72 bitmap*              24 xclipboard*          24 xmag*
  16 bmtoa*               40 xclock*              56 xman*
  16 constype*            24 xconsole*           120 xmh*
  64 ico*                 24 xcutsel*              1 xmkmf*
  24 imake*               88 xditview*            32 xmodmap*
  16 kbd_mode*            96 xdm*                 72 xpr*
  40 listres*              3 xdpr*                40 xprop*
  24 makedepend*          24 xdpyinfo*            32 xrdb*
  24 maze*                40 xedit*               16 xrefresh*
   1 mkdirhier*           24 xev*                 32 xset*
  16 mkfontdir*           48 xeyes*               24 xsetroot*
  16 muncher*             24 xfd*                 32 xstdcmap*
  40 oclock*              32 xfontsel*           168 xterm*
  16 plaid*              120 xgc*                 24 xwd*
  40 puzzle*              24 xhost*               32 xwininfo*
  24 resize*              24 xinit*               24 xwud*
  24 showrgb*             24 xkill*
  24 showsnf*             24 xload*
jolt:

Of course I don't have all the contrib stuff built for 4.1 yet...
-- 
           stripes@eng.umd.edu          "Security for Unix is like
      Josh_Osborne@Real_World,The          Mutitasking for MS-DOS"
      "The dyslexic porgramer"                  - Kevin Lockwood
"Don't try to change C into some nice, safe, portable programming language
 with all sharp edges removed, pick another language."  - John Limpert

ssb@quest.UUCP (Scott S. Bertilson) (07/11/90)

> jr@oglvee.UUCP (Jim Rosenberg) wrote:
> I have seen paging behavior under AT&T UNIX V.3.2 on a 6386 where the number
> of free memory pages as reported by sar seems to sit forever drastically
> below what I had *thought* was the low-water mark.  This always seems to
> ...
> This sort of implies but doesn't really say that VHANDL is computed at boot
> time and thereafter left alone.  Is this really how it happens?  Can VHANDL
> get recalculated?  How do I find out what the "real" low-water mark is?  How
> ...
> The V.3 paging parameters mystify me.  I get more JUNK in my mail about
> training seminars, but I would *beg* my management to go to a good one-day
> tutorial on V.3 paging parameters.  Help!

  I suppose this is old hat, but since I haven't seen a reply to Jim's
query, I'm posting my reply in hopes that we'll both get barraged
with pages of useful information. :-)

  I've played with this a fair amount under SVR3.2 on Altos machines.
"VHANDL" doesn't change once the system is up.  You can verify this
using "crash":
	crash
	od -d tune 11
(my system has eleven 4-byte integers in the structure defined in
"/usr/include/sys/tuneable.h")
  I've also noticed the behavior you described - not necessarily
during heavy paging, but it seems that free memory as listed
by "sar -r" often goes below GPGSLO without causing paging.
I've adjusted the numbers on my system fairly substantially (the
Altos defines these and other values in "/usr/sys/master.d/kernel"):
	VHNDFRAC=12
	GPGSLO=40
	GPGSHI=100
	GPGSMSK=0x0420
	VHANDR=3
	VHANDL=10
	MAXSC=64
	MAXFC=64
Several values draw heavily on previous versions of UNIX
from Altos.  I have looked at SVR2 on a 68020 and XENIX/SVR2 on a 386.
They don't have as complex a paging system, but do have several
parameters in common.
  My changes did seem to improve things, but I still can't figure
out why free pages should go so low.  Perhaps it is because the
pager will only steal pages if they have been unused for VHANDR * 2
seconds (this is mentioned in "<sys/tuneable.h>" - it's also worth
looking at "<sys/getpages.h>").  I suppose a situation like this
could mean that you're running a very large application, that you
are short of physical memory for your application mix, or both.
I sure wish someone would write about the design goals of the SVR3
pager and describe how the parameters are supposed to interact.
I hoped at one point to find something in the Bach book, but
that was probably foolish considering that this is a fine
point that is somewhat implementation dependent.
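
  One way to picture the "unused for VHANDR * 2 seconds" rule is the
sketch below. It assumes the page stealer makes one pass every VHANDR
seconds and takes a page only after two passes in a row find it
unreferenced. The data layout and names are hypothetical, and this is
not the SVR3 code, only an illustration of why recently touched pages
can survive even when free memory is low.

```python
def pager_pass(pages):
    """One pass of a hypothetical page stealer, assumed to run every
    VHANDR seconds. `pages` maps a page id to a dict with the keys
    "referenced" (bool) and "idle_passes" (int).
    Returns the ids of the pages stolen on this pass."""
    stolen = []
    for pid, page in pages.items():
        if page["referenced"]:
            # Recently used: clear the bit and restart the idle count.
            page["referenced"] = False
            page["idle_passes"] = 0
        else:
            page["idle_passes"] += 1
            if page["idle_passes"] >= 2:  # idle for about VHANDR * 2 seconds
                stolen.append(pid)
    for pid in stolen:
        del pages[pid]
    return stolen


pages = {
    "a": {"referenced": True, "idle_passes": 0},
    "b": {"referenced": False, "idle_passes": 0},
}
first = pager_pass(pages)         # nothing stolen: "b" has idled only one pass
pages["a"]["referenced"] = True   # "a" is touched again between passes
second = pager_pass(pages)        # "b" has idled two passes and is stolen
print(first, second)
```

Under these assumptions a page that is touched at least once per pass
is never stolen, however far free memory has fallen.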
-- 

Scott S. Bertilson   ...uunet!rosevax!rose3!quest!ssb
			scott@poincare.geom.umn.edu

jr@oglvee.UUCP (Jim Rosenberg) (07/12/90)

In <PCG.90Jul7174558@odin.cs.aber.ac.uk> pcg@cs.aber.ac.uk (Piercarlo Grandi) writes:

>In article <562@oglvee.UUCP> jr@oglvee.UUCP (Jim Rosenberg) writes:
>   How do I find out what the "real" low-water mark is?  How else can I
>   explain a quiet system just sitting there with far fewer free pages
>   than the low-water mark?

>You probably are thinking of the wrong definition of 'free' page. In
>theory in a machine where the total of process virtual memory sizes is
>larger than the physical memory available, there should not ever be any
>really free pages. What happens is that the pager will make a distinction
>between active and *inactive* pages; and will put the *inactive* pages
>on the to-be-free list, ready to be reused, but it will not actually
>clean them out and reuse them unless there is demand for actually-free
>memory.

Hello?  Let me see if I understand this.  The page-stealing daemon will not
*really* move a page out to swap space unless a process actually *asks* for
more memory.  Ah, but a process asking for more memory might *not* allow
sleep, and since the page-stealing daemon is asynchronous there's no
guarantee exactly when it will run.  So I could just sit there with free
pages as shown by sar well below what I think the low-water mark should be,
and then if a burst of activity (lots of forks, say) happens very rapidly,
free memory could fall to 0 before the paging daemon could catch up.

Now if this is how it works, I have to say *WHY*, for crying out loud!  Why
doesn't the page-stealing daemon *steal pages* when the number of free pages
falls below the low-water mark???  Were they worried about thrashing?

>Unfortunately the System V paging and swapping policies *stink* as Bach
>regretfully hints, 

You can say that again!  BTW I have read Bach.  I actually reread the paging
stuff when I began having these problems, and found no enlightenment (other
than the obvious mantra, "Buy Them Chips!  Buy Them Chips! ...").  I guess I
should go read it again.

I've also observed what appears to be a kind of *deadlock*.  I have a batch
database job running -- *extremely* disk intensive.  All of a sudden the
hard disk light goes out, even though the job has not finished.  The system
is just "stuck"!  If I toggle to another virtual terminal with a getty
running on it and press RETURN, voila, the batch job comes suddenly unstuck.
Most disconcerting.

At home I have an AT&T 3b1.  It has a curious bastardized version of UNIX:
SVr0 with a patchwork of V.2 stuff and a few BSD utilities and Convergent's
various enhancements.  I believe its VM system is completely Convergent
homebrew.  The system has 2M, and I have *NEVER* seen any of the kinds of
problems I see all the time with V.3.2.  The machine is quite slow by
today's standards, but it sure has a *solid* feel.  "Fragile" is more than
kind as a description of the V.3 paging system.  I sure hope the new VM
system in V.4 is solid; what we have now is a mess.
-- 
Jim Rosenberg             #include <disclaimer.h>      --cgh!amanue!oglvee!jr
Oglevee Computer Systems                                        /      /
151 Oglevee Lane, Connellsville, PA 15425                    pitt!  ditka!
INTERNET:  cgh!amanue!oglvee!jr@dsi.com                      /      /

wsinpdb@lso.win.tue.nl (Paul de Bra) (07/13/90)

In article <565@oglvee.UUCP> jr@oglvee.UUCP (Jim Rosenberg) writes:
>In <PCG.90Jul7174558@odin.cs.aber.ac.uk> pcg@cs.aber.ac.uk (Piercarlo Grandi) writes:
>>... What happens is that the pager will make a distinction
>>between active and *inactive* pages; and will put the *inactive* pages
>>on the to-be-free list, ready to be reused, but it will not actually
>>clean them out and reuse them unless there is demand for actually-free
>>memory.
>
>Hello?  Let me see if I understand this.  The page-stealing daemon will not
>*really* move a page out to swap space unless a process actually *asks* for
>more memory...

I think this is wrong. The page-stealing daemon will copy a page to swap
space and mark it as 'free'. It does not zero the page or anything, so
if the process wants the page back and the page has not been acquired by
another process in the meantime the original process can get its original
page back. It need not be paged-in from the swap space.

Is this correct?

The problem which remains is what happens when a process suddenly needs
more pages than are currently marked free.

Paul.
(debra@research.att.com)

pcg@cs.aber.ac.uk (Piercarlo Grandi) (07/17/90)

In article <565@oglvee.UUCP> jr@oglvee.UUCP (Jim Rosenberg) writes:

   In <PCG.90Jul7174558@odin.cs.aber.ac.uk> pcg@cs.aber.ac.uk (Piercarlo
   Grandi) writes:

   >Unfortunately the System V paging and swapping policies *stink* as Bach
   >regretfully hints, 

   I've also observed what appears to be a kind of *deadlock*.  I have
   a batch database job running -- *extremely* disk intensive.

I guess most of that is paging. Use vsar or the recently posted mon or
u386mon programs (THANKS! they are both great) to see the swap rates
and the system time expended by the swapper/pager, and be horrified.

   All of a sudden the hard disk light goes out, even though the job
   has not finished.  The system is just "stuck"!  If I toggle to
   another virtual terminal with a getty running on it and press
   RETURN, voila, the batch job comes suddenly unstuck.  Most
   disconcerting.

Oh no. This is actually very common -- happens to me all the time. It
is the stinkiest problem with the swapper. The swapper goes nuts, even
if the working set of the application is smaller than "available" real
memory. We have discussed it to death, and Chen from AT&T insists that
my most horrible suspicion (expansion swapping!) is unfounded.  If it
is not like that, I wonder what it can be.

Switching to another vt and typing return will cause the process
attached to it (shell, getty, anything) to be rescheduled, memory to
get shuffled, the swapper to be called, and since the memory layout
will have changed, the deadlock will cease to exist.

The only "solution" is to make sure that there is around 2-3 times more
real memory than the combined size of all active working sets, e.g. to
let memory lie around 70 percent unused on average.
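
For scale, a quick piece of arithmetic on that rule of thumb. The function
name is made up for the example, and the only assumption is that "N times
more real memory" means real memory equals N times the combined active
working sets.

```python
def unused_fraction(overprovision_factor):
    """Fraction of real memory left idle when real memory is
    `overprovision_factor` times the combined active working sets."""
    return 1.0 - 1.0 / overprovision_factor


for factor in (2, 3):
    # Prints the factor and the idle share of memory as a whole percentage.
    print(factor, round(unused_fraction(factor) * 100))
```

A factor of 2 leaves half of memory idle and a factor of 3 leaves about
two thirds idle, which is in the neighbourhood of the 70 percent figure.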
--
Piercarlo "Peter" Grandi           | ARPA: pcg%cs.aber.ac.uk@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcsun!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk

pcg@cs.aber.ac.uk (Piercarlo Grandi) (07/17/90)

In article <1289@tuewsd.win.tue.nl> wsinpdb@lso.win.tue.nl (Paul de Bra)
writes:

   The page-stealing daemon will copy a page to swap space and mark
   it as 'free'.

Different versions do different things, but most will not save a page
to swap until it is required, just in case it is reused and modified
again, so avoiding some IO traffic.

Some pagers will even (and it is a big mistake) preferentially select as
victims pages that have not been modified, to avoid having to save them
prior to reuse.

   It does not zero the page or anything, so if the process wants the
   page back and the page has not been acquired by another process in
   the meantime the original process can get its original page back. It
   need not be paged-in from the swap space.

   Is this correct?

Yes.
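
The reclaim behaviour just confirmed can be pictured with the toy model
below. It is only a sketch: the class, its method names, and the
oldest-first reuse order are assumptions for illustration, and it follows
the quoted description in treating a freed page as already saved to swap
(as noted above, real pagers differ on when that write happens).

```python
class FreeList:
    """Toy free list: freed frames keep their old contents until reused."""

    def __init__(self):
        # (owner, virtual page) -> contents. Dicts keep insertion order,
        # so the first key is the frame that has been free the longest.
        self.frames = {}

    def release(self, owner, vpage, contents):
        """Page stealer marks the frame free without zeroing it."""
        self.frames[(owner, vpage)] = contents

    def allocate(self):
        """Give the oldest free frame to another process.
        The frame's link to its previous owner is lost."""
        key = next(iter(self.frames))
        del self.frames[key]
        return key

    def fault(self, owner, vpage):
        """The original owner touches the page again."""
        key = (owner, vpage)
        if key in self.frames:
            return self.frames.pop(key), "reclaimed from free list"
        return None, "must page in from swap"


free = FreeList()
free.release("p1", 7, "contents of page 7")
free.release("p1", 8, "contents of page 8")
free.allocate()              # another process takes the frame that held page 7
print(free.fault("p1", 8))   # still on the free list, so no disk read
print(free.fault("p1", 7))   # frame was reused, so swap must be read
```

In this model a page counted as "free" may still be one fault away from
being back in use, which is one reason a raw free-page count can mislead.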

   The problem which remains is what happens when a process suddenly
   needs more pages than are currently marked free.

The page stealer (clock hand) would be invoked; and/or the swapper would
be invoked and some process (possibly the one that requested the extra
page) will be expansion swapped. I suspect that in the current System
V/386 the latter course is taken, with the outswap candidate being the
process requesting the extra page (which is often a poor choice for
obvious reasons). Again, Bach hints that the algorithms used to select
outswap (especially if outswap was because of expansion of memory
allocation) and inswap candidates should be rewritten.

In practice there is very poor interaction between the balance set
manager (swapper) and the working set manager (pager, the clock
algorithm), because their functions (clock is a *global* policy) do
overlap somehow, and their logic has not been well integrated and
designed.

Again, the solution recommended by AT&T is to avoid exercising the
swapper and pager, by allocating 2-3 times more real memory than the
expected worst case usage (e.g. 512KB to 1MB per user, when the average
working set size of a user command is well below 100-300KB).
--
Piercarlo "Peter" Grandi           | ARPA: pcg%cs.aber.ac.uk@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcsun!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk