[net.unix-wizards] OS performance measurements

allbery@ncoast.UUCP (Brandon Allbery) (12/10/85)

Quoted from <319@polaris.UUCP> ["Re: 4.2 scheduler (and CPU utilization)"], by herbie@polaris.UUCP (Herb Chong)...
+---------------
| .........the 15 minute load average rising above 10 is a warning that
| system limits are being reached.  any time the 1 minute load average is
| over 14 for more than 30 seconds, the system is beginning to thrash.
+---------------

And you discuss system time as well.  Foo.

I wrote a program to average out the System III struct sysinfo (the source
of timex's information) values for system, user, and idle time.  On a system
with one user active, the figure U/(U+S+I)*100 (``utilization''?) ranges from
1% to 97%, depending on the programs running.  (ld caused the 97% figure;
an idle shell was 1%.)  With multiple users running COBOL and Informix-SQL
the ``utilization'' varies from 0.7% to 90%.  And *none* of it indicates
when the system is thrashing.  (Strangely enough, I have a program which
does give a reliable index, in our situation, of the system load.  It counts
the number of file table entries(!).  When it reaches 180, the system begins
to thrash; at 200, it's on its knees and echo at 9600 baud can be 15 seconds
out of sync.)
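
For concreteness, here is a rough sketch of the two figures described above,
written in Python purely for illustration (the program itself is not shown
in this post).  It assumes the user, system, and idle counters are cumulative
ticks sampled twice; the function names and the tuple layout are made up for
the example.  The 180 and 200 thresholds are the ones observed on tdi2 and
should not be expected to carry over to other machines.

```python
def utilization(prev, curr):
    """Percent of elapsed ticks spent in user mode between two samples.

    prev and curr are (user, system, idle) cumulative tick counts.
    This tuple layout is an assumption for the example, not the
    actual layout of struct sysinfo.
    """
    du = curr[0] - prev[0]
    ds = curr[1] - prev[1]
    di = curr[2] - prev[2]
    total = du + ds + di
    if total <= 0:
        # No ticks elapsed between samples; avoid dividing by zero.
        return 0.0
    return 100.0 * du / total


def file_table_state(open_entries, thrash_at=180, knees_at=200):
    """Classify load from the number of file table entries in use.

    The default thresholds are the figures reported for tdi2 only.
    """
    if open_entries >= knees_at:
        return "on its knees"
    if open_entries >= thrash_at:
        return "thrashing"
    return "ok"
```

Sampling deltas rather than raw counters matters here: the raw values only
say how the machine has behaved since boot, not what it is doing now.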

This program doesn't give the same results on ncoast.  It works on tdi2
because the processes which eat the most memory, causing a swap-thrash
whiplash (say it 5x fast :-), all open 10-15 files apiece while they are
crunching.  On ncoast, if the system thrashes it's because someone on
tty15 is running hack and the sysop and I are in phantasia . . .  (N.B.
The resident time for phantasia in 1/2 hour is greater than that for processes
which had been constantly active over four hours!  It's also higher than
that for init, which may/may not be reasonable in sys3 (what is that blasted
process doing now?  Why can't it be a well-behaved v7 init?!))

I have it in mind to try to write a reliable sys3 load heuristic; it'll
check the system times, the file and inode tables, the p_cpu field in
the proc table (weighted by flags and status), etc.  Problem is, it's
more than likely going to be a perfect example of the uncertainty
principle; it'll drive the load way up just by running. . .
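
One possible shape for such a heuristic, sketched in Python for illustration
only.  Nothing here has been written or tuned yet: the weights, the
pressure cap, the doubling for swapped-out processes, and the
(p_cpu, runnable, swapped_out) tuple are all hypothetical placeholders
standing in for whatever the real proc table flags turn out to warrant.

```python
def proc_pressure(procs):
    """Sum p_cpu over runnable processes, weighted by state.

    Each entry is (p_cpu, runnable, swapped_out).  Doubling the
    weight of swapped-out processes is an arbitrary assumption.
    """
    total = 0.0
    for p_cpu, runnable, swapped_out in procs:
        if not runnable:
            continue  # sleeping processes add no CPU pressure
        weight = 2.0 if swapped_out else 1.0
        total += weight * p_cpu
    return total


def load_index(busy_pct, files_used, files_max,
               inodes_used, inodes_max, procs,
               weights=(0.4, 0.2, 0.1, 0.3), pressure_cap=200.0):
    """Combine the signals into one figure between 0 and 1.

    weights and pressure_cap are hypothetical and would need to be
    tuned per machine.
    """
    w_busy, w_files, w_inodes, w_proc = weights
    pressure = min(1.0, proc_pressure(procs) / pressure_cap)
    return (w_busy * busy_pct / 100.0
            + w_files * files_used / files_max
            + w_inodes * inodes_used / inodes_max
            + w_proc * pressure)
```

Walking the proc, file, and inode tables on every sample is exactly the
cost worried about above, so a real version would want a long sampling
interval.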

--Brandon
-- 

			Lord Charteris (thurb)

ncoast!allbery@Case.CSNet (ncoast!allbery%Case.CSNet@CSNet-Relay.ARPA)
..decvax!cwruecmp!ncoast!allbery (..ncoast!tdi2!root for business)
6615 Center St., Mentor, OH 44060 (I moved) --Phone: +01 216 974 9210
CIS 74106,1032 -- MCI MAIL BALLBERY (WARNING: I am only a part-time denizen...)