[comp.sys.dec] Performance Tuning a DEC 5000 Ultrix 4.0 Risc Workstation

corey@milton.u.washington.edu (Corey Satten) (09/21/90)

: ----- cut here ----- cut here ----- cut here ----- cut here -----
: This is a "shell archive".  Save everything after the cut mark
: in a file called thisstuff, then feed it to sh by typing sh thisstuff.
: SHAR archive format.  Archive created Thu Sep 20 09:13:16 PDT 1990
echo x - READ_ME
echo '-rw-r--r--  2 corey       6125 Sep 20 09:12 READ_ME    (as sent)'
sed 's/^-//' >READ_ME <<'+FUNKY+STUFF+'
-        Performance Tuning a DEC 5000 Ultrix 4.0 Risc Workstation
-
-		Corey Satten, corey@cac.washington.edu
-				 and
-	      Laurence Lundblade, lgl@cac.washington.edu
-
-		  Networks and Distributed Computing
-		       University of Washington
-			  Seattle, Washington
-			    September 1990
-
-
-
-History:
-
-      Until August 1990, our department was using a rather maximally
-  configured pmax (DEC 3100 running Ultrix 3.1) as a time-sharing host.
-  It had five disks, mostly Maxtor 660 megs.  It served /usr/local/bin
-  via NFS to about a dozen workstations; was the departmental electronic
-  mail machine; host to some campus wide mailing lists; our anonymous FTP
-  server; one of two campus default domain nameservers; and also
-  time-sharing host for about 16 X-terminals plus about a dozen other users
-  connected via telnet.  We were supporting about 150 megs of swap space
-  on some small portion of the 24 megabyte physical memory.  A 'ps aux'
-  listing usually had 250-300 lines in it.
-
-      As you might guess, the machine wasn't always snappy, but it did
-  admirably.  It was clearly disk i/o limited -- mostly, we assumed,
-  because it was usually thrashing.  Still, the load average was usually
-  between 1 and 2, and it was mostly the spikes which were annoying.
-
-      In mid-August we upgraded to a 3max (DEC 5000) running Ultrix 4.0.  We
-  doubled our RAM to 48 megs, increased our MIPS rating by 50-80%, and felt
-  that the system was slower than ever.  According to the `ps' program we
-  were still thrashing even though our active virtual memory was less than
-  the physical memory available to support it.  As we looked more closely,
-  we discovered that the system wasn't even paging, it was swapping, and
-  making stupid choices of what to swap, at that!
-
-Analysis:
-
-      Eventually we decided that the constants involved in the 2-handed
-  clock paging algorithm are no longer appropriate.  The defaults were:
-
-      lotsfree = 128 (512k)
-      desfree  = 64  (256k)
-      minfree  = 24  (96k)
-      maxpgio  = 60  (4k pages per second)
-      slowscan = 94  (computed)
-      fastscan = 47  (computed/2)
-
-      In the old days, programs were small and the extra memory needed
-  to start several could be obtained from a 512k-byte free list.  Today,
-  programs are bloated with X libraries, etc.  Our average process is
-  about 500k.  At the scan rates we were seeing (100-200 4k pages/second),
-  scanning simply couldn't keep up with the demand.  Our free list hovered
-  right around the minimum threshold which triggered swapping.
-
-      We examined some old source code and discovered what factors can
-  trigger swapping.  Several of these, such as load>2, are compiled into
-  the kernel as constants or are computed into local variables -- these
-  can only be changed by recompiling the kernel -- something we can't do
-  until DEC releases the current source.  Fortunately a significant number
-  of the terms in the equation are stored in global variables which can
-  be fiddled on a running system.  By changing a few values, we believe
-  we have virtually eliminated swapping on our system and raised the
-  interactive performance level substantially.
-  
-      On our system we have made the following changes:
-
-      lotsfree = 1280 (5 meg)
-      desfree  = 256  (1 meg)
-      minfree  = 64   (256k)
-      maxpgio  = 125  (4k pages per second)
-      slowscan = 30
-      fastscan = 10
-
-      In this way, we try to keep 5 megs on the free list so that programs
-  can absorb transient loads, we can replenish the free list 5 times faster
-  than the default, and we've increased the allowable page-in plus page-out
-  rate to 125.  I can easily make our system burst to 150 and sustain 125,
-  so I don't think 125 indicates that the paging system is in distress.
-  Also, when choosing your own numbers, remember that vmstat displays `pi'
-  and `po' in 1k pages.
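-
-      As a rough cross-check of the figures above, the sketch below converts
-  page counts to kilobytes.  It assumes the 4k page size quoted in the
-  tables; the names PAGE_KB and pages_to_kb are ours for illustration and
-  are not part of Ultrix or kmem.
-

```shell
#!/bin/sh
# Sketch only: assumes 4k pages, as quoted in the tables above.
PAGE_KB=4

# pages_to_kb N: print the size of N pages in kilobytes.
pages_to_kb() {
    expr "$1" \* "$PAGE_KB"
}

echo "lotsfree 1280 pages = `pages_to_kb 1280`k"
echo "desfree   256 pages = `pages_to_kb 256`k"
echo "minfree    64 pages = `pages_to_kb 64`k"

# maxpgio counts 4k pages, while vmstat reports pi and po in 1k units,
# so the same ceiling expressed in vmstat's units is four times larger.
echo "maxpgio   125 pages/sec = `pages_to_kb 125` as pi+po in vmstat"
```

-
-      The same multiplication applies to whatever values you choose for
-  your own system.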
-
-      Since DEC has phased out adb, I wrote a program to allow us to make
-  these changes.  I've called it `kmem' and it works like this:
-
-      prompt% kmem lotsfree desfree		# to read values
-      lotsfree(0x8014ba40)        1280
-      desfree(0x8014ba48)         256
-
-      prompt% kmem -w lotsfree=1281 desfree=257	# to write values
-      lotsfree(0x8014ba40)        1280 -> 1281
-      desfree(0x8014ba48)         256 -> 257
-
-      Once you find values you're happy with, stick the `kmem -w' command
-  in /etc/rc.local and be happy.  The source to `kmem' is included in this
-  directory.
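-
-      For example, an /etc/rc.local fragment along these lines would apply
-  the values listed earlier.  The install path /usr/local/etc/kmem and the
-  variable names are only assumptions; use wherever you put the binary.
-

```shell
# Hypothetical /etc/rc.local fragment; the install path is an assumption.
KMEM=/usr/local/etc/kmem
KMEM_ARGS="lotsfree=1280 desfree=256 minfree=64 maxpgio=125 slowscan=30 fastscan=10"

# Only try to tune if the binary is present.  kmem -w prints each
# old value followed by the new one as it writes.
if [ -f "$KMEM" ]; then
    "$KMEM" -w $KMEM_ARGS
fi
```

-
-      Note that `kmem -w' exits at the first symbol it cannot find, so any
-  assignments after that one on the command line are not made.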
-
-Final Disclaimers:
-
-      By re-compiling the kernel, we expect we can do still better.  We
-  believe the clock paging algorithm still isn't working very well.  Even
-  though we see better performance when paging than when swapping, we
-  suspect we aren't making very good use of physical memory, because the
-  "global page replacement" algorithm is making its decisions on very
-  local page-use data (a 2 megabyte spread between the hands).  To support
-  this claim, we notice that our cpu usually shows substantial idle time
-  even when the load is greater than 1, and the "active real memory" field
-  we print in our "vmstat" listing (from t_arm) usually shows that lots of
-  our physical memory is "inactive" when we think it shouldn't be.
-
-      By increasing desfree to 640 (2.5meg) we can partially re-enable
-  swapping of only "deadwood" (jobs sleeping for longer than 20 seconds).
-  We find this helps increase our active real memory and decrease our idle
-  cpu, but at the cost of an unacceptable degradation in interactive
-  response time.
-
-      Before I finish, I should probably point out that in addition to the
-  load you might expect on our system, we have 3 anomalies:  first, we
-  have about 60-80 processes such as xclock, which wake up every now and
-  then to check/update something and then sleep for a short while longer.
-  Second, we have an unusually large number of very popular shell scripts
-  which start dozens of little awks, seds, greps, etc.  Third, we have
-  3 swap disks configured and we think we've done a good job of spreading
-  all disk requests across all the drives.
-
---------
-Corey Satten, corey@cac.washington.edu
-Networks and Distributed Computing
-University of Washington
+FUNKY+STUFF+
chmod u=rw,g=r,o=r READ_ME
ls -l READ_ME
echo x - kmem.c
echo '-rw-r--r--  2 corey       2910 Sep 10 20:20 kmem.c    (as sent)'
sed 's/^-//' >kmem.c <<'+FUNKY+STUFF+'
-/*
- * a tool to use in place of adb (on systems without adb) which lets you
- * peek and poke at the values of kernel variables in /dev/kmem
- *
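- * usage:	kmem [-f namelist] var1 var2 ... varN
- *  or
- * usage:	kmem [-f namelist] -w var1=val1 var2=val2 ... varN=valN
- *
- * -f names the kernel image to read symbols from (default /vmunix)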
- *
- * Corey Satten, corey@cac.washington.edu, 9/6/90 - Ultrix 4.0 version
- */
-#include<stdio.h>
-#include<stdlib.h>
-#include<string.h>
-#include<unistd.h>
-#include<nlist.h>
-#include<sys/file.h>
-
-struct nlist *nl;		/* how we find locations of names */
-int *nv;			/* the new values for each name */
-int w_flag = 0;			/* write new values? */
-char *file = "/vmunix";		/* default file to read symbols from */
-int kmem;
-
-main(argc, argv)
-    int argc;
-    char *argv[];
-{
-    int f;			/* walks argv up to index of first non-flag */
-    int i;			/* walks through remaining arguments */
-    int value = 0;
-    int rc = 0;
-
-    /*
-     * flag parsing
-     */
-    for (f=1; f<argc && *(argv[f]) == '-'; ++f) {
-	switch(argv[f][1]) {
-	default:
-	    fprintf(stderr, "%s: unknown flag -%c\n", argv[0], argv[f][1]);
-	    exit(1);
-	case 'w':
-	    w_flag = 1;
-	    break;
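-	case 'f':
-	    /* -f needs a file name after it; don't run off the end of argv */
-	    if (f+1 >= argc) {
-		fprintf(stderr, "%s: -f requires a file name\n", argv[0]);
-		exit(1);
-		}
-	    file = argv[++f];
-	    break;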
-	}
-    }
-
-    /*
-     * handle the remaining arguments as either symname or symname=value
-     * depending on whether -w (w_flag) was specified.
-     */
-
-    nl = (struct nlist *) malloc( sizeof(*nl) * (argc-f+1) );
-    nv = (int *) malloc( sizeof(int) * (argc-f+1) );
-    if (!nv || !nl) {perror("malloc"); exit(1);}
-
-    for (i=0; i<argc-f; ++i) {
-	/* room for the whole argument plus its terminating NUL */
-	char *name = (char *)malloc(strlen(argv[i+f])+1);
-
-	if (!name) {perror("malloc"); exit(1);}
-	rc = sscanf(argv[i+f], "%[^=]=%d", name, &value);
-	if (rc - w_flag != 1) {
-	    fprintf(stderr, "%s: bad argument: %s\n", argv[0], argv[i+f]);
-	    exit(1);
-	    }
-	nl[i].n_name = name;
-	nv[i] = value;
-	}
-    nl[i].n_name = "";
-
-    /*
-     * now figure out where to read/write in /dev/kmem and do it
-     */
-    
-    nlist(file, nl);
-
-    kmem = open("/dev/kmem", w_flag ? O_RDWR : O_RDONLY);
-    if (kmem < 0) {
-	perror("/dev/kmem open");
-	exit(1);
-	}
-
-    for (i=0; i<argc-f; ++i) {
-	long seekto = (long)nl[i].n_value;
-
-	if (nl[i].n_type == 0) {
-	    fprintf(stderr, "%s: symbol `%s' not found in namelist of %s\n",
-		argv[0], nl[i].n_name, file);
-	/*
-	 *  We promise to do all writes in command line order, so if one
-	 *  is going to fail, we'd best bail out rather than continue.
-	 */
-	    if (w_flag) exit(2);
-	    else	continue;
-	    }
-	if ( lseek(kmem, seekto, 0) != seekto ) {
-	    perror("/dev/kmem lseek"); exit(2);
-	    }
-	if ( read(kmem, &value, sizeof(int)) != sizeof(int) ) {
-	    perror("/dev/kmem read"); exit(2);
-	    }
-
-	printf("%s(0x%x)\t%d", nl[i].n_name, nl[i].n_value, value);
-
-	if (w_flag) {
-	    if ( lseek(kmem, seekto, 0) != seekto ) {
-		perror("/dev/kmem lseek"); exit(2);
-		}
-	    value = nv[i];
-	    printf(" -> %d", value);
-	    if ( write(kmem, &value, sizeof(int)) != sizeof(int) ) {
-		perror("/dev/kmem write"); exit(2);
-		}
-	    }
-	putchar('\n');
-	}
-    exit(0);
-}
+FUNKY+STUFF+
chmod u=rw,g=r,o=r kmem.c
ls -l kmem.c
echo x - Makefile
echo '-rw-r--r--  2 corey         28 Sep 10 17:56 Makefile    (as sent)'
sed 's/^-//' >Makefile <<'+FUNKY+STUFF+'
-kmem:	kmem.o
-	cc -o $@ $@.o
+FUNKY+STUFF+
chmod u=rw,g=r,o=r Makefile
ls -l Makefile
exit 0