[comp.windows.x] What does xload measure ??

wengland@stephsf.stephsf.com (Bill England) (10/05/90)

 I got sunclock to run yesterday.  (The imake worked right out of the box;
 I did not have to tweak a thing. :-)

 When sunclock is running, xload shows the load going from 1/2 to just 
 over 2.  That's more than a fourfold increase in 'load'.  The cpu usage
 from ps shows only 12 sec per hour, though.  (stephsf has 9 MB of memory.)

 What does xload measure anyway?

 +--------
 |  Bill England
 |  wengland@stephsf.COM
 |
  * *      H -> He +24Mev
 * * * ... Oooo, we're having so much fun making itty bitty suns *
  * *

msm@condo.cis.ufl.edu (Michael S. McLean) (10/05/90)

In article <372@stephsf.stephsf.com>, wengland@stephsf.stephsf.com (Bill
England) writes:

|> When sunclock is running, xload shows the load going from 1/2 to just 
|> over 2.  That's more than a fourfold increase in 'load'.  The cpu usage
|> from ps shows only 12 sec per hour, though.  (stephsf has 9 MB of memory.)
|>
|> What does xload measure anyway?
|>

Xload displays the sampled average of the length of the run queue
over the last minute.  Hence, unless sunclock is forking other
processes, it can only drive up the load average by one process.
Even to do that, it would have to run continuously without ever
sleeping or blocking for I/O, and would very probably get more
than 12 sec/hour of cpu time.
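
As a rough illustration of what a "sampled average of the length of the
run queue" could look like, here is a minimal Python sketch.  It assumes
the kernel counts runnable processes every 5 seconds and folds each
count into an exponentially decayed one-minute average.  The interval,
the function names, and the decay formula are assumptions made for
illustration, not taken from any particular kernel source.

```python
import math

def update_load(load, runnable, interval=5.0, window=60.0):
    """Fold one run-queue sample into an exponentially decayed average.

    interval and window are assumed values (seconds), not kernel constants.
    """
    decay = math.exp(-interval / window)
    return load * decay + runnable * (1.0 - decay)

def simulate(samples, interval=5.0, window=60.0):
    """Feed a sequence of run-queue lengths through the average, starting at 0."""
    load = 0.0
    for runnable in samples:
        load = update_load(load, runnable, interval, window)
    return load

# One process that is runnable at every sample for ten minutes:
# the average creeps up toward 1 but never passes it.
print(simulate([1] * 120))
```

In this model a single process can contribute at most 1 to the figure,
and only if it is seen as runnable at every sample.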


Michael S. McLean (msm@ufl.edu)    
Computer and Information Sciences  "We are all the Swedish Chef."
College of Engineering                   -- Harry Berryman
University of Florida              

jef@well.sf.ca.us (Jef Poskanzer) (10/05/90)

In the referenced message, msm@ufl.edu (Michael S. McLean) wrote:
}In article <372@stephsf.stephsf.com>, wengland@stephsf.stephsf.com (Bill
}England) writes:
}> When sunclock is running, xload shows the load going from 1/2 to just 
}> over 2.  That's more than a fourfold increase in 'load'.  The cpu usage
}> from ps shows only 12 sec per hour, though.  (stephsf has 9 MB of memory.)
}>
}> What does xload measure anyway?
}
}Xload displays the sampled average of the length of the run queue
}over the last minute.

Yes.

}                       Hence, unless sunclock is forking other
}processes, it can only drive up the load average by one process.
}Even to do that, it would have to run continuously without ever
}sleeping or blocking for I/O, and would very probably get more
}than 12 sec/hour of cpu time.

Unfortunately, this turns out not to be the case.  The problem is that
the kernel samples the length of the run queue at regular intervals,
and meanwhile xload and sunclock and other similar programs are
sleeping and running at similar regular intervals.  The intervals can
start beating against each other, going in and out of sync, and making
the load average number meaningless.

For example, on a quiescent Mac II my xload used to do this:

                 ___-------------------
                /                      |
               /                        \
              |                          \
______________|                           ---____________

It would be zero for a long time, then all of a sudden would climb to
one, stay there for a minute or so, then go back to zero.  Turns out
that A/UX was sampling the run queue every two seconds, and xload was
running every two seconds, and they were drifting in and out of sync.

Given this, it seems quite possible that adding in a sunclock would
make both the xload and the sunclock go into sync with the kernel,
raising the alleged load to two.
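
The beating effect can be sketched with a toy model in Python.  It
assumes a process that wakes every `period_ms` milliseconds and runs
for `burst_ms`, and a sampler that looks at the run queue every
`sample_ms`, starting at `offset_ms`.  All names and numbers here are
made up for illustration; only the two-second intervals come from the
A/UX example above.

```python
def fraction_seen_runnable(period_ms, burst_ms, sample_ms, offset_ms, n_samples):
    """Fraction of samples that land while the periodic process is running.

    The process wakes at t = 0, period_ms, 2*period_ms, ... and runs for
    burst_ms each time.  The sampler looks at t = offset_ms + k*sample_ms.
    """
    hits = 0
    for k in range(n_samples):
        t = offset_ms + k * sample_ms
        if t % period_ms < burst_ms:
            hits += 1
    return hits / n_samples

# A process using 10 ms of cpu every 2 s really uses 0.5% of the machine.
in_sync   = fraction_seen_runnable(2000, 10, 2000, 5, 100)     # 1.0
out_sync  = fraction_seen_runnable(2000, 10, 2000, 1000, 100)  # 0.0
drifting  = fraction_seen_runnable(2000, 10, 2001, 0, 2000)    # 0.005
print(in_sync, out_sync, drifting)
```

With identical intervals the sampler sees the process either always or
never, depending only on the phase.  Only when the sampler sweeps
through every phase does the sampled fraction match the real 0.5% cpu
use.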

Moral: dump xload in the toilet.  Use xcpu or xmeter or something like
them instead.
---
Jef

  Jef Poskanzer  jef@well.sf.ca.us  {ucbvax, apple, hplabs}!well!jef
                  CONCENTRATED!!! DILUTE!!! DILUTE!!!

mouse@LIGHTNING.MCRCIM.MCGILL.EDU (10/05/90)

>>> When sunclock is running, xload shows the load going from 1/2 to
>>> just over 2.  The cpu usage from ps shows only 12 sec per hour,
>>> though.

> The problem is that the kernel samples the length of the run queue at
> regular intervals, and meanwhile xload and sunclock and other similar
> programs are sleeping and running at similar regular intervals.  The
> intervals can start beating against each other, going in and out of
> sync, and making the load average number meaningless.

As another data point, I had a program that asked the kernel for a
SIGALRM 100 times a second, which amounted to every hardware clock
tick.  Every clock tick, therefore, this process would go runnable and
be counted in the kernel's load average figure.  However, the vast
majority of these clock ticks caused very little computation - perhaps
several dozen instructions plus signal delivery overhead.  Result: an
increase in the load figure of 1, but no real cycle sink.

The load average figure is nothing but a rather crude approximation.
It is useful for some things but there are many things that can cause
it to be nearly worthless.  (One machine here - a slow machine, a
VAX-11/750 - has been seen with the load figure over 60 but with
response time not noticeably degraded compared to the usual value of
around 2.  Granted, this was an extreme case.)

> Moral: dump xload in the toilet.  Use xcpu or xmeter or something
> like them instead.

What are xcpu and/or xmeter, and how do they avoid the sort of problems
under discussion?

					der Mouse

			old: mcgill-vision!mouse
			new: mouse@larry.mcrcim.mcgill.edu

jef@well.sf.ca.us (Jef Poskanzer) (10/06/90)

In the referenced message, mouse@LIGHTNING.MCRCIM.MCGILL.EDU wrote:
}> Moral: dump xload in the toilet.  Use xcpu or xmeter or something
}> like them instead.
}
}What are xcpu and/or xmeter, and how do they avoid the sort of problems
}under discussion?

xmeter was posted to comp.sources.x a week ago, v9i59.  It's a generalization
of xload to handle any statistic that can be rstatted, including non-idle
cpu percentage, which is what you want to measure to get a load estimate.
Cpu percentage doesn't suffer from the synchronization/beating problems
discussed, because it is a direct measure, not sampled.
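
A minimal Python sketch of the idea, assuming the statistics arrive as
four cumulative tick counters in the order user, nice, system, idle.
That layout and the function name are assumptions for illustration, not
xmeter's actual code.  Because the counters accumulate, the difference
between two readings accounts for the whole interval between them.

```python
# Assumed order of the cumulative tick counters.
USER, NICE, SYSTEM, IDLE = range(4)

def non_idle_percent(before, after):
    """Percentage of ticks between two snapshots that were not idle."""
    deltas = [b - a for a, b in zip(before, after)]
    total = sum(deltas)
    if total == 0:
        return 0.0  # no ticks elapsed between the two readings
    return 100.0 * (total - deltas[IDLE]) / total

# Two hypothetical readings, 100 ticks apart, 50 of them idle.
print(non_idle_percent((100, 0, 50, 850), (130, 10, 60, 900)))  # 50.0
```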

xcpu is a program that has been kicking around since X10, originator
unknown.  It graphs cpu percentage, and is otherwise just like xload.
Craig Leres and I posted a version for X11R2 I believe, and he has been
working on an up to date version.

The only quibble I have with cpu percentage meters is that they don't
differentiate between niced and normal percentage, so if you have a
niced background process computing mandelbrots or guessing passwords or
something, your cpu meter becomes useless.  The right way to fix this
is to graph the niced and normal percentages on top of each other, but
I don't think the StripChart widget lets you do that.
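
To make the niced/normal split concrete, here is a small Python sketch
of the two values such a stacked graph would plot.  It assumes
cumulative counters in the order user, nice, system, idle, and it
charges all system time to the "normal" band.  Both are assumptions for
illustration, as is the function name.

```python
def stacked_cpu_shares(before, after):
    """Return (normal_percent, niced_percent) for the interval between
    two snapshots of cumulative (user, nice, system, idle) tick counters."""
    user, nice, system, idle = (b - a for a, b in zip(before, after))
    total = user + nice + system + idle
    if total == 0:
        return 0.0, 0.0  # no ticks elapsed
    normal = 100.0 * (user + system) / total  # system time counted as normal
    niced = 100.0 * nice / total
    return normal, niced

# A niced background job soaking up the cpu: a plain meter would read 90%,
# but only 10% of that is normal-priority work.
print(stacked_cpu_shares((0, 0, 0, 0), (5, 80, 5, 10)))  # (10.0, 80.0)
```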

By the way, my opinion of the run queue length measure is that it dates
from the dark ages of huge timesharing systems, when the assumption was
that the cpu would never be idle.  It's not so much a measure of load
as of *overload*.  But modern workstations don't get overloaded in the
cpu area, they are more likely to be I/O-limited.
---
Jef

  Jef Poskanzer  jef@well.sf.ca.us  {ucbvax, apple, hplabs}!well!jef
            ...lucky for me morning only comes once a day.

leres@ace.ee.lbl.gov (Craig Leres) (10/06/90)

Jef Poskanzer writes:
> xcpu is a program that has been kicking around since X10, originator
> unknown.  It graphs cpu percentage, and is otherwise just like xload.
> Craig Leres and I posted a version for X11R2 I believe, and he has been
> working on an up to date version.

Actually, I wrote the original version (for X10R4 in 1987).

When I made the switch from suntools to X, I wanted something like the
cpu perfmeter. I looked at xload but the idea of graphing an
exponential average was "just too funny" so I turned a copy of xload
into xcpu.

Jef correctly states that I have been working on an X11R4 version. In
fact, it's been fully functional since July. I've also added a
"LineGraph" option to the StripChart widget (by coincidence, I just
finished debugging it last night). I'll try to get the manual entry
tweaked up and make it all available in the near future.

		Craig

wengland@stephsf.stephsf.com (Bill England) (10/07/90)

In article <21008@well.sf.ca.us> Jef Poskanzer <jef@well.sf.ca.us> writes:
>In the referenced message, mouse@LIGHTNING.MCRCIM.MCGILL.EDU wrote:

  After reading this thread and talking to John about sunclock, it
  seems that the xload 'problem' is related to the way sunclock's 
  event loop is structured.  Apparently sunclock calls 'sleep(1)'
  before updating its time variables.  This causes it to almost
  always be in the run queue when xload takes its sample.
>
>xmeter was posted to comp.sources.x a week ago, v9i59.  It's a generalization
>of xload to handle any statistic that can be rstatted, including non-idle
>cpu percentage, which is what you want to measure to get a load estimate.
>Cpu percentage doesn't suffer from the synchronization/beating problems
>discussed, because it is a direct measure, not sampled.

---Xmeter---
I'm getting a makefile error when I try to run Imake to create xmeter.
Probably the Imake file here is not configured correctly.  The man page 
for Imake on SCO's ODT is incomplete.  Could someone please forward me 
a copy of the nroff-able Imake manual page?


Thanks,

 +--------
 |  Bill England
 |  wengland@stephsf.COM
 |
  * *      H -> He +24Mev
 * * * ... Oooo, we're having so much fun making itty bitty suns *
  * *