[net.bugs.4bsd] Accounting Errors

vilot@gsg.UUCP (Mike Vilot) (12/16/85)

		UNIX 4.2BSD Accounting Log Errors

This message reports one major error and several lesser inaccuracies in
process resource usage accounting on 4.2BSD (and also on 4.2-based
ULTRIX).  The comments below refer to resource usage (average memory and
CPU time) as recorded in the /usr/adm/acct file.  See acct(5) for the
accounting structures.

Average Memory:

The average memory amount reported is too high by a factor of 100.  For
actual average memory sizes greater than about 320 kbytes, this error
results in negative memory being reported (the inflated value overflows
into the sign bit of the *short* field allocated to receive it).

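The above error can be fixed by editing kern_acct.c to adjust the
denominator *i* which normalizes the memory integral, and then
rebuilding the kernel.  The correct term to divide by is (100*i),
presumably because the integral is accumulated once per 10-msec clock
tick while *i* is in seconds.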

Note that if the actual (not logged--see below) sum of user  and  system
CPU time is less than one second, then the average memory logged will be
zero!
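
The arithmetic above can be sketched in user-level C.  This is not the
kernel source: the function names, the HZ value of 100 ticks per second
(inferred from the 10-msec tick described under CPU Time) and the
assumption that the memory integral is in kbyte-ticks are all
illustrative.

```c
#include <assert.h>

#define HZ 100L  /* assumed clock rate: one tick per 10 msec */

/* Behaviour described in the report: the memory integral is divided by
 * whole seconds of CPU time, so the result is HZ times too large.
 * Zero seconds of CPU time yields a logged average of zero. */
long avg_mem_buggy(long mem_integral, long cpu_secs)
{
    assert(mem_integral >= 0 && cpu_secs >= 0);
    return cpu_secs ? mem_integral / cpu_secs : 0;
}

/* Proposed fix: divide by (100*i), i.e. by clock ticks, not seconds. */
long avg_mem_fixed(long mem_integral, long cpu_secs)
{
    assert(mem_integral >= 0 && cpu_secs >= 0);
    return cpu_secs ? mem_integral / (HZ * cpu_secs) : 0;
}

/* Does the value fit in a signed 16-bit field such as a *short*? */
int fits_in_short(long v)
{
    return v >= -32768L && v <= 32767L;
}
```

Under these assumptions, a process averaging 400 kbytes over 2 seconds
of CPU time has an integral of 80000 kbyte-ticks.  The unpatched
division gives 40000, which does not fit in the short field; the patched
division gives 400.  The overflow threshold of 32767/100, or about 327
kbytes, is consistent with the "about 320 kbytes" figure above.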

CPU Time:

In BSD 4.2 both user and system CPU time, as recorded in the  accounting
log,  now  have units of integral seconds (not sixtieths of a second, as
was the case in 4.1).

Procedure kern_clock accumulates CPU times in two parts: whole seconds
and microseconds (with a resolution of 10000 microseconds, or one
10-msec clock tick).  Unfortunately, kern_acct ignores the *tv_usec*
portion when logging CPU time.

In  particular,  processes  having user or system CPU times of less than
one second will have these times logged as zero (and the logged  average
memory usage will also be zero).

The  above  inaccuracy  can  be  fixed  by adding the *tv_usec* portion,
appropriately scaled to units of seconds, to the  user  and  system  CPU
times when they are logged in kern_acct.
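
As a rough illustration of what is lost, the sketch below compares the
logged value with two ways of using the *tv_usec* part.  The struct and
function names are hypothetical stand-ins (the kernel uses struct
timeval), and rounding to the nearest second is only one possible
reading of "appropriately scaled"; the report does not specify the
exact scaling.

```c
#include <assert.h>

#define USEC_PER_TICK 10000L  /* one 10-msec clock tick */
#define TICKS_PER_SEC 100L

/* Hypothetical stand-in for the kernel's struct timeval. */
struct cpu_time {
    long tv_sec;   /* whole seconds */
    long tv_usec;  /* microseconds, 0..999999 */
};

/* What the report says is logged: the seconds part only. */
long logged_secs(struct cpu_time t)
{
    return t.tv_sec;
}

/* Full CPU time expressed in 10-msec ticks, keeping tv_usec. */
long cpu_ticks(struct cpu_time t)
{
    assert(t.tv_sec >= 0 && t.tv_usec >= 0 && t.tv_usec < 1000000L);
    return t.tv_sec * TICKS_PER_SEC + t.tv_usec / USEC_PER_TICK;
}

/* One possible scaling back to seconds: round to the nearest second. */
long rounded_secs(struct cpu_time t)
{
    return (cpu_ticks(t) + TICKS_PER_SEC / 2) / TICKS_PER_SEC;
}
```

For a process with 0.75 seconds of user time, logged_secs gives 0 while
cpu_ticks gives 75 and rounded_secs gives 1.  Using ticks as the divisor
for the memory integral would also avoid the zero-memory case for
sub-second processes.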

Further Notes:

1. The TIME command does report CPU times (to resolution of one-tenth of
a second) and code+data average memory (in  units  of  kilobytes).   See
Note 4 for a comment on accuracy.

2.  The  SA  utility is apparently still geared to 4.1.  It thus assumes
that CPU times are in integral units of sixtieths of a second.  Thus the
*j*  option will actually yield minutes (not seconds) and the SA command
without the *j* option will yield CPU  times  in  hours  (not  minutes).
Also, the memory reported by SA will be too high by a factor of 50.

3.  Even with the patch to kern_acct mentioned above, the memory logged
in /usr/adm/acct will be too high by a fraction of roughly:

	(decimal part of CPU time)/(actual CPU time)

since the divisor still counts whole seconds only.  These discrepancies
can be seen by comparing memory usages given by /usr/adm/acct and those
given by the TIME command.  The potential memory error, even with the
patch, is especially acute for total CPU times of less than about 10
seconds.  For CPU times of about 10 seconds, the maximum memory value
error is about 20%.
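
The size of this residual error can be estimated with a small
calculation.  The sketch below is illustrative only: the function name
is hypothetical, times are given in 10-msec ticks, and it assumes that
user and system times are each truncated to whole seconds before being
summed into the divisor.  It reports the overestimate relative to the
logged time, which is slightly larger than the same difference taken
relative to the actual time.

```c
#include <assert.h>

#define TICKS_PER_SEC 100L  /* assumed: one tick per 10 msec */

/* Percentage by which the logged average memory exceeds the true
 * average when the divisor uses truncated seconds.  Returns -1 when
 * the truncated total is zero, in which case the logged memory is
 * simply zero and no ratio exists. */
long overestimate_percent(long user_ticks, long sys_ticks)
{
    long actual, logged;

    assert(user_ticks >= 0 && sys_ticks >= 0);
    actual = user_ticks + sys_ticks;
    /* Each time is truncated separately, then converted back to ticks. */
    logged = (user_ticks / TICKS_PER_SEC + sys_ticks / TICKS_PER_SEC)
             * TICKS_PER_SEC;
    if (logged == 0)
        return -1;
    return (actual - logged) * 100 / logged;
}
```

With 5.99 seconds each of user and system time, the divisor sees 10
seconds instead of 11.98, so the logged memory is about 19% too high,
in line with the 20% figure quoted above.  With 1.99 seconds of user
time and no system time the overestimate is 99%.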

4.  For  both  /usr/adm/acct  and  the  TIME  command,  CPU  times   are
accumulated  in  10-msec chunks, for whichever process has the processor
at each interval timer interrupt.  Memory integrals are also computed in
kbyte-click chunks.  These tally methods are approximate.

						James Bouhana

Route comments or discussion to:

-- 
Michael J. Vilot			decvax!gsg!vilot	(UUCP)
General Systems Group			vilot@wang-inst		(CSNET)
51 Main Street  			MVilot@USC-ISIF		(ARPA)
Salem, NH  03079			(603) 893-1000		(DDD)