[comp.unix.internals] Meaning of %CPU in top/ps

barmar@think.com (Barry Margolin) (04/19/91)
I've redirected followups to comp.unix.internals, since there is nothing in
the question that is specific to Ultrix.  In fact, much of my response here
is based on a very non-Unix-like OS (Genera on Symbolics Lisp Machines),
but I strongly believe that my answer is essentially correct for Unix as well.

In article <THOMSON.91Apr18122955@zazen.macc.wisc.edu> thomson@zazen.macc.wisc.edu (Don Thomson) writes:
>My manager is trying to analyze performance on our DECstation 3100 running
>Ultrix 4.0 and has asked me to clarify the meaning of the percentage CPU time
>that is displayed.  We're looking at percentage idle and guessing that 
>percentage CPU time used + idle time should add up to 100%, but are rarely 
>seeing CPU percentages wander over 1 or 2%.  I have a feeling we're comparing
>apples and oranges here

>	%CPU      CPU utilization of the process.  This is a decaying
>                  average over a minute or less of previous (real) time.
>                  Because the time base over which this is computed
>                  varies since processes may be very young, it is possible
>                  for the sum of all %CPU fields to exceed 100%.

The problem is that on a single-CPU computer, at any instant at most one
process can be running, so an instantaneous CPU time percentage would show
one process having 100% CPU time and all the rest having 0%.  On the other
hand, if the CPU percentage were simply the amount of CPU time the process
has used during the last real minute divided by 1 minute, processes that
were started less than a minute ago would have artificially low CPU
utilization, and processes that became idle in the middle of the minute
would have misleadingly high CPU utilization; for instance, a process
that started monopolizing the system 30 seconds ago would only show 50%
usage, even though it has actually taken over the system.
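
To make the arithmetic concrete, here is a small Python sketch of that naive
"CPU time in the last minute divided by one minute" calculation.  The
function name and the interval representation are made up for illustration;
no real kernel or ps implementation is being quoted.

```python
def windowed_cpu_percent(busy_intervals, now, window=60.0):
    """Percent of the last `window` real seconds spent on the CPU.

    busy_intervals is a list of (start, end) times, in seconds, during
    which the process was running.  Hypothetical representation, chosen
    only to keep the example short.
    """
    window_start = now - window
    busy = 0.0
    for start, end in busy_intervals:
        # Count only the part of each interval that falls inside the window.
        overlap = min(end, now) - max(start, window_start)
        if overlap > 0:
            busy += overlap
    return 100.0 * busy / window


# A process that grabbed the whole CPU 30 seconds ago and still has it.
hog = windowed_cpu_percent([(30.0, 60.0)], now=60.0)

# A process that ran flat out for 30 seconds and then went idle.
stopped = windowed_cpu_percent([(0.0, 30.0)], now=60.0)

print(hog, stopped)
```

Both processes come out at 50%, even though one currently owns the machine
and the other is doing nothing at all, which is exactly the ambiguity
described above.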

The common solution to this is to use a weighted average, with decaying
weights.  Averages are computed more often than just once a minute, and the
more recent averages are multiplied by weighting factors to make them more
significant in the final number.  Thus, if only two processes ran during
the last minute, and process A was the only process during the first 30
seconds and process B was the only process during the second 30 seconds,
process B would have a higher %CPU than A.
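
The A-versus-B case can be sketched in Python as follows.  The one-second
sampling period and the 0.95 decay factor are assumptions picked for the
example, not values taken from any particular system.

```python
def weighted_cpu_percent(samples, decay=0.95):
    """Weighted average of per-period CPU fractions, oldest sample first.

    The newest sample gets weight 1, the one before it gets `decay`,
    the one before that `decay ** 2`, and so on.  Both the sampling
    scheme and the decay value are illustrative assumptions.
    """
    n = len(samples)
    weights = [decay ** (n - 1 - i) for i in range(n)]
    total = sum(w * s for w, s in zip(weights, samples))
    return 100.0 * total / sum(weights)


# Sixty one-second samples: A ran alone for the first 30 seconds,
# B ran alone for the last 30 seconds.
a = [1.0] * 30 + [0.0] * 30
b = [0.0] * 30 + [1.0] * 30

pct_a = weighted_cpu_percent(a)
pct_b = weighted_cpu_percent(b)
print(pct_a, pct_b)
```

With these assumed weights B comes out well ahead of A, although each used
exactly 30 seconds of CPU.  Setting decay=1.0 weights every sample equally
and gives both processes 50%.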

However, the algorithms for computing this generally don't actually keep
around all the little per-period averages.  Instead, running weighted
averages are computed at process switch time.  Furthermore, the weighting
factors are generally arbitrary, and decay exponentially rather than
linearly, with very recent activity having a very high weight.  It's
basically a heuristic, and it's a white lie to call these things
percentages.  Like car mileage, the numbers should be used for comparison
purposes only.
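
A running average of this kind needs only one stored number per process.
The Python sketch below updates it on a fixed tick rather than at process
switch time, purely to keep it short; the class name, the tick, and the 0.9
decay factor are all assumptions for illustration, not any real scheduler's
code.

```python
class DecayingCpu:
    """Exponentially decaying CPU estimate kept as a single running value."""

    def __init__(self, decay=0.9):
        self.decay = decay    # assumed weighting factor, 0 < decay < 1
        self.value = 0.0      # running estimate, between 0.0 and 1.0

    def update(self, fraction_used):
        # Old history is scaled down by `decay`; the newest period is
        # blended in with weight (1 - decay).  No per-period history
        # is stored.
        self.value = self.decay * self.value + (1.0 - self.decay) * fraction_used
        return self.value

    def percent(self):
        return 100.0 * self.value


proc = DecayingCpu()

# The process monopolizes the CPU for 30 ticks...
for _ in range(30):
    proc.update(1.0)
after_busy = proc.percent()

# ...then sits idle for 30 ticks.
for _ in range(30):
    proc.update(0.0)
after_idle = proc.percent()

print(after_busy, after_idle)
```

The estimate climbs toward 100% while the process is hogging the CPU and
falls back toward 0% once it goes idle, but it never reaches either value
exactly, which is one more reason to treat these figures as relative
indicators rather than true percentages.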
--
Barry Margolin, Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar