shani@GENIUS.TAU.AC.IL (Oren Shani) (05/30/91)
Can anyone tell me why the load average graph shows definite patterns
of exponential decay?  It seems that most (by far most) of the points
on the LA graph lie on curves of the form c*exp(-a*(t-t0))+b, in which
'a' is some cosmic constant (I sampled the LA on several computers,
over several time periods, and got the same 'a' every time).  Is this
due to some policy of the system's time-sharing mechanism, or did I
discover a new cosmic law?

O.S.
--
Oren Shani (shani@genius.tau.ac.il)
Faculty of Engineering, Tel Aviv Univ., Israel
mycroft@goldman.gnu.ai.mit.edu (Charles Hannum) (05/31/91)
In article <2155@ccsg.tau.ac.il> shani@GENIUS.TAU.AC.IL (Oren Shani) writes:
>Can anyone tell me why the load average graph shows definite
>patterns of exponential decay?  It seems that most (by far most) of
>the points on the LA graph lie on curves of the form
>c*exp(-a*(t-t0))+b, in which 'a' is some cosmic constant (I sampled
>the LA on several computers, over several time periods, and got the
>same 'a' every time).  Is this due to some policy of the system's
>time-sharing mechanism, or did I discover a new cosmic law?
I noticed this a long time ago, while running xload. For some reason,
every 30 or 60 seconds, the load will suddenly jump and slowly decay
on an otherwise idle machine.
Note that the load average that xload displays is the average over the
past minute -- which explains the slow decay.  But why the sudden
jump?  I've always attributed it to 'update' ('syncer' on some
systems), and ignored it.
shani@GENIUS.TAU.AC.IL (Oren Shani) (06/05/91)
I think I already know the reason.  It's simply that averaging a
staircase function (in this case, the length of the ready queue) over
a constant period gives a seemingly exponential pattern.  In fact,
this is simply derived from a geometric series...  This is quite easy
to show...  Geez, I didn't think I would get so much response to that
:-)
--
Oren Shani (shani@genius.tau.ac.il)
Faculty of Engineering, Tel Aviv Univ., Israel

"And that's the last time I trust a woman!"
torek@elf.ee.lbl.gov (Chris Torek) (06/09/91)
In article <2155@ccsg.tau.ac.il> shani@GENIUS.TAU.AC.IL (Oren Shani)
asks:
>Can anyone tell me why the load avarge graph shows definite patterns
>of exponential decay?  It seems that most (by far most) of the points
>of the LA graph are on lines of the form c*exp(-a*(t-t0))+b, in which
>'a' is some cosmic constant ...

Surprisingly, I have seen no answers to this at all, when the reason
is trivial.  This exponential decay is there because it was designed
to be there.  The load average is computed by iterations of the
formula:

	avg(t) = avg(t-1) * exp(-1/k)  +  n * (1 - exp(-1/k))

where 't' is time, 'n' is the instantaneous 'number of runnable
jobs', and 'k' is the number of discrete t's that occur per 'load
average time'.  Since the load average sample interval is 5 seconds,
the one-minute average has k=12 (12*5 = 60 seconds), the 5-minute
average has k=60 (60*5 = 300 s = 5 min), and the 15-minute average
has k=180 (180*5 = 900 s = 15 min).

When n is zero, as it typically is on workstations, this reduces to

	avg(t) = avg(0) * exp(-t/k)

i.e., exponential decay.  The reason this is consistent across many
systems is that it was done at Berkeley for 4BSD and then copied into
those systems.

In article <MYCROFT.91May31031208@goldman.gnu.ai.mit.edu>
mycroft@goldman.gnu.ai.mit.edu (Charles Hannum) writes:
>I noticed this a long time ago, while running xload.  For some
>reason, every 30 or 60 seconds, the load will suddenly jump and
>slowly decay on an otherwise idle machine. ... I've always
>attributed [these spikes] to 'update' ('syncer' on some systems),
>and ignored it.

This is almost certainly the correct explanation (/etc/update is
counted as runnable while waiting for the sync() system call, which
it typically issues once every 30 seconds).

In article <MEISSNER.91May31111801@curley.osf.org> meissner@osf.org
(Michael Meissner) writes:
>Another thing could be the activity to run the various xclock
>programs, and such.
>I would imagine that on timesharing systems with lots of xterms,
>this could be significant.

xclock is particularly unlikely to add to the load average (although
it does add to the machine load!) because of a design misfeature in
most Unix systems.  The problem is that the system metering---the
code that computes the load average, CPU utilization for each
process, and so on---is run off the same clock as the scheduler.
Thus, at every clock tick (or every n'th tick), we first see what is
going on---nothing---then we schedule the clock program, which runs
for a short while and goes back to sleep.

In particular, given the usage-sensitive CPU scheduling found in most
BSD-derived schedulers (which is to say every SunOS system through at
least SunOS 3.5, and probably 4.x as well), it is possible for a
program to use the clock to drive itself just after it is sampled as
sleeping, work until just before the next sample, and then go to
sleep waiting for the next clock tick.  By doing this it appears to
use no CPU time, hence gets fairly high priority (the kernel believes
that it has not got its fair share of CPU yet) and runs immediately
on the next clock tick, and thus is asleep again by the time the
clock ticks again.  This perpetuates the cycle.  Such a process can
starve out other processes.

The solution is simple but requires relatively precise clocks.
Fortunately such clocks exist on Sun SparcStations (unlike Sun-3s).
The 4BSD Sparc kernel will use them, once I get around to fixing that
part of the system.  (First I have to get it running multi-user, now
that single-user boots work, and write such minor [ahem] things as a
frame buffer driver and get enough going to make X run....  Sorry,
Masataka, but I intend to run X windows on *my* workstation, at least
until something better comes along. :-) )
--
In-Real-Life: Chris Torek, Lawrence Berkeley Lab CSE/EE (+1 415 486 5427)
Berkeley, CA		Domain: torek@ee.lbl.gov
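Torek's recurrence is easy to check numerically.  Here is a minimal
sketch in Python (my own illustration, not the kernel code; the names
`load_average`, `samples`, and `trace` are mine): it iterates the
exponentially-decayed average and shows that once the run queue goes
idle, the average decays as exp(-t/k).

```python
import math

def load_average(samples, k):
    """Exponentially-decayed average of run-queue lengths.

    samples -- instantaneous number of runnable jobs n at each
               5-second tick
    k       -- ticks per averaging interval (12 for the 1-minute
               average, 60 for 5 minutes, 180 for 15 minutes)
    """
    decay = math.exp(-1.0 / k)
    avg = 0.0
    history = []
    for n in samples:
        # avg(t) = avg(t-1) * exp(-1/k) + n * (1 - exp(-1/k))
        avg = avg * decay + n * (1.0 - decay)
        history.append(avg)
    return history

# A 30-second burst of 4 runnable jobs, then an idle machine:
# the tail is pure exponential decay with time constant k ticks.
trace = load_average([4] * 6 + [0] * 30, k=12)
```

After the burst ends (index 5), each further sample multiplies the
average by exp(-1/12), so 12 idle ticks (one minute) shrink it by
exactly a factor of e -- the "cosmic constant" Shani measured.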
shani@GENIUS.TAU.AC.IL (Oren Shani) (06/11/91)
In article <MEISSNER.91May31111801@curley.osf.org>, meissner@osf.org
(Michael Meissner) writes:
|> Another thing could be the activity to run the various xclock
|> programs, and such.  I would imagine that on timesharing systems
|> with lots of xterms, this could be significant.

Yes, I think so too.  In fact, this is why I started analyzing the LA
graph in the first place: I intend to use the graph's "roughness" as
a measure of the volume of activity on the computer, i.e. the number
of little vi's and xterm's, etc., running around.

|> Michael Meissner      email: meissner@osf.org    phone: 617-621-8861
|> Open Software Foundation, 11 Cambridge Center, Cambridge, MA, 02142
|>
|> You are in a twisty little passage of standards, all conflicting.

Geez!  Someone still remembers ol' Zork! :-)
--
Oren Shani (shani@genius.tau.ac.il)
Faculty of Engineering, Tel Aviv University, Israel

"Hold your temper" -- The caterpillar to Alice
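Shani never says how he would quantify "roughness"; one plausible
sketch (my own assumption, not his method) is the mean absolute
difference between consecutive load-average samples -- a busy machine
full of short-lived processes produces more spikes, so adjacent
samples differ more:

```python
def roughness(load_samples):
    """Mean absolute change between consecutive load-average samples.

    Hypothetical activity measure: more short-lived vi/xterm activity
    means more spikes, hence larger sample-to-sample differences.
    """
    diffs = [abs(b - a) for a, b in zip(load_samples, load_samples[1:])]
    return sum(diffs) / len(diffs)

# Made-up traces for illustration: a nearly idle machine decaying
# smoothly vs. a machine with bursty interactive activity.
idle = [0.10, 0.09, 0.08, 0.08, 0.07, 0.07]
busy = [0.30, 1.10, 0.70, 1.50, 0.60, 1.20]
```

With these traces, roughness(busy) comes out well above
roughness(idle), matching the intuition in the post.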
dhesi@cirrus.com (Rahul Dhesi) (06/12/91)
In <14081@dog.ee.lbl.gov> torek@elf.ee.lbl.gov (Chris Torek) writes:
>The load average is computed by iterations of the formula:
>
>	avg(t) = avg(t-1) * exp(-1/k)  +  n * (1 - exp(-1/k))

Just to give credit where it is apparently due, we should note that
the concept of an exponentially-decaying measure of the number of
jobs in the ready queue was probably invented in the TENEX operating
system, which ran on DECsystem-10 machines.  BSD seems to have
borrowed the idea from there and given it a life of its own.
--
Rahul Dhesi <dhesi@cirrus.COM>
UUCP:  oliveb!cirrusl!dhesi
stodola@orion.fccc.edu (Robert K. Stodola) (06/12/91)
In article <14081@dog.ee.lbl.gov> torek@elf.ee.lbl.gov (Chris Torek)
writes:
>In article <MEISSNER.91May31111801@curley.osf.org> meissner@osf.org
>(Michael Meissner) writes:
>>Another thing could be the activity to run the various xclock
>>programs, and such.  I would imagine that on timesharing systems
>>with lots of xterms, this could be significant.

[Much very interesting text deleted here.  Thanks Chris!]

>The solution is simple but requires relatively precise clocks. ...

One of my associates and I did a study of this a number of years ago
(actually it was with a PDP-11/70 running IAS).  We found that there
was substantial clock-synchronized usage on the system.  The solution
we found didn't require very precise clocks at all -- simply one
whose rate was relatively prime to the system clock.  We got very
good results using a clock at 5x.y Hz (I don't remember the exact
speed, but it was a strange one in the 50's) on a system driven off a
60Hz clock.  This was adequate to desynchronize the sampling rate
from the system rhythms.  Because it was a slow clock, it didn't add
much load to the system, but it gave an adequate statistical picture
of individual usage and load.
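The desynchronization trick is easy to demonstrate.  A sketch in
Python (my own numbers, not the IAS study's): a process that is busy
only during a fixed slice of every 60-tick system-clock cycle is
invisible to a sampler locked to that cycle, but a sampler whose
period is relatively prime to 60 walks through every phase of the
cycle and sees roughly the true busy fraction.

```python
def sample_busy_fraction(period, total_ticks=6000):
    """Fraction of samples that catch a hypothetical process which is
    busy only during ticks 1..9 of every 60-tick cycle -- i.e. idle
    at tick 0, exactly when a synchronized sampler looks."""
    hits = 0
    samples = 0
    for t in range(0, total_ticks, period):
        samples += 1
        if 1 <= t % 60 <= 9:
            hits += 1
    return hits / samples

# Sampling at the system clock's own rate never sees the process;
# a period relatively prime to 60 sees close to the true 9/60 = 0.15.
synchronized = sample_busy_fraction(60)
desynchronized = sample_busy_fraction(7)
```

Because gcd(7, 60) = 1, the 7-tick sampler eventually lands on every
residue mod 60, which is the same effect the ~50-odd Hz clock had
against the 60 Hz system clock.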