jjb@sequent.UUCP (Jeff Berkowitz) (04/28/89)
In article <67727@pyramid.pyramid.com>, csg@pyramid.pyramid.com (Carl S. Gutekunst) writes:
>
>There's a reason for that. Dynix divides the load average by the number of
>CPUs you have. If uptime(1) displays 1.6, and you have four CPUs, then the
>load average is really 6.4.
>

"really"? :-)

Very early in the history of DYNIX, Sequent experimented with both
alternative implementations of load average. The existing one was
selected because it more accurately described the behavior that
users perceived.

In addition, some daemons refuse to run if the "load average" is very high.
Since customers can also write code that checks this, the computed load
average should reflect reality; each processor can simultaneously run a program.

How does a four processor 9845 handle load average? I presume
from your comment that Pyramid does not divide by the number of
processors? Does this mean performance does not scale linearly?
--
Jeff Berkowitz N6QOM            uunet!sequent!jjb
Sequent Computer Systems        Custom Systems Group
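The arithmetic behind the quoted claim (1.6 displayed on four CPUs corresponds to 6.4 jobs) can be sketched as below. This is only an illustration of the two conventions under discussion; the function names are hypothetical and nothing here is taken from DYNIX source.

```python
def per_cpu_load(run_queue_average: float, cpus_online: int) -> float:
    """Normalized convention: run-queue average divided by CPU count."""
    if cpus_online < 1:
        raise ValueError("need at least one online CPU")
    return run_queue_average / cpus_online


def raw_run_queue_load(displayed_load: float, cpus_online: int) -> float:
    """Undo the normalization to recover the plain run-queue average."""
    if cpus_online < 1:
        raise ValueError("need at least one online CPU")
    return displayed_load * cpus_online


if __name__ == "__main__":
    # The example from the quoted text: 1.6 displayed, four CPUs.
    print(raw_run_queue_load(1.6, 4))
```

Converting between the two is trivial once the CPU count is known; the disagreement in the thread is about which number a tool such as uptime(1) should show by default.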
steve@polyslo.CalPoly.EDU (Steve DeJarnett) (04/28/89)
In article <15248@sequent.UUCP> jjb@sequent.UUCP (Jeff Berkowitz) writes:
>In article <67727@pyramid.pyramid.com>, csg@pyramid.pyramid.com (Carl S. Gutekunst) writes:
>>There's a reason for that. Dynix divides the load average by the number of
>>CPUs you have. If uptime(1) displays 1.6, and you have four CPUs, then the
>>load average is really 6.4.
>"really"? :-)
>
>Very early in the history of DYNIX, Sequent experimented with both
>alternative implementations of load average. The existing one was
>selected because it more accurately described the behavior that
>users perceived.

Load average was (long ago) defined to be "the average number of jobs in
the Run queue over the last 1,5,15 minutes". To quote directly from the Dynix
Version 3.0.4 man page for 'w':

    The load average numbers give the number of jobs in the run queue
    averaged over 1, 5 and 15 minutes.

Are there multiple run queues on a Balance 8000?? I've never studied the
implementation of Dynix (lack of source makes it more difficult also :-), but
I'd suspect there's one run queue, and processors grab the next job eligible
when they're free. Am I correct here?? If so, then the notion of dividing
the # of jobs in the run queue by the number of processors to obtain load
average is in conflict with what the manuals say you're doing.

Of course, as has been pointed out, load averages are merely subjective
measurements of your system "response". As we all know, system "response"
depends on a great number of things.
So, the question boils down to this: Do you want to generate load
averages "like the rest of the world" that reflect how many jobs are in your
run queue, and then have some added caveat of "but we have N processors to run
these M jobs on, so the effective load (or some such term as that) is really
M/N", or do you generate load averages the way you currently do with "load
average is dependent on the number of processors AND the number of processes
(and, oh, therefore our load averages MAY or MAY NOT compare directly with
those of machine X)".

Personally, I prefer the former. This gives you a way of comparing
Apples to Apples (figuratively, not literally). If there's a load average of
7.5 on my Pyramid, and a load average of 7.5 on my Sequent, I would know that
they are measuring the same thing. Then if I log into my Sequent and find that
response time is faster (or slower), I would have a means of direct comparison
that is quantifiable (sp??).

I realize that in the end, this whole thing boils down to a religious
issue over what you believe is "right". I personally (if it wasn't already
apparent) believe that "number of jobs in the run queue" is the appropriate
measure. That's just me, though.

One last question. When Sequent computes their load average, do they
take into account the possibility that some of the processors might not have
been available during the last 1,5,15 minutes?? If I have 2 processors running
user processes, but the Sequent is basing its calculation of load average on
10 processors (or how many there actually are in my system), then a load
average based on that premise is not a truly representative number.

>How does a four processor 9845 handle load average? I presume
>from your comment that Pyramid does not divide by the number of
>processors? Does this mean performance does not scale linearly?
I don't think they do on a 2 processor 98x, so I doubt things are that
different on a 9845 (our machine's kernel actually believes that it's a 9810,
but that's a totally different story). Load average on a Pyramid (correct me
if I'm wrong, Carl) is "Average # of jobs in the run queue over the last 1, 5,
and 15 minutes". The fact that you have 4 processors there to keep things
going makes it all the better.

One other question springs to mind here (sorry this is getting very
long): Given more processors to run jobs, won't the jobs that are there finish
(hopefully) sooner than they would on a system with fewer of the same
processors, and therefore result in there being fewer jobs in the run queue at
any given moment in time overall?? This would seem to be another argument (if
it is indeed true) against Sequent's method of load average computation.

>Jeff Berkowitz N6QOM            uunet!sequent!jjb
>Sequent Computer Systems        Custom Systems Group

-------------------------------------------------------------------------------
| Steve DeJarnett            | Smart Mailers -> steve@polyslo.CalPoly.EDU     |
| Computer Systems Lab       | Dumb Mailers  -> ..!ucbvax!voder!polyslo!steve |
| Cal Poly State Univ.       |------------------------------------------------|
| San Luis Obispo, CA 93407  | BITNET = Because Idiots Type NETwork           |
-------------------------------------------------------------------------------
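For readers unfamiliar with how "jobs in the run queue averaged over 1, 5 and 15 minutes" is usually computed, here is a minimal sketch. It assumes an exponentially damped average fed by periodic samples of the run-queue length, which is how BSD-derived kernels are commonly described; the 5-second sample interval and all names are assumptions, not details confirmed for Dynix or Pyramid in this thread.

```python
import math

SAMPLE_SECONDS = 5                 # assumed sampling interval
WINDOWS_SECONDS = (60, 300, 900)   # the 1, 5 and 15 minute windows


def update(averages, runnable_jobs):
    """Fold one run-queue sample into the three damped averages."""
    updated = []
    for average, window in zip(averages, WINDOWS_SECONDS):
        decay = math.exp(-SAMPLE_SECONDS / window)
        updated.append(average * decay + runnable_jobs * (1.0 - decay))
    return tuple(updated)


def simulate(samples, start=(0.0, 0.0, 0.0)):
    """Apply a sequence of run-queue samples, returning the final averages."""
    averages = start
    for runnable_jobs in samples:
        averages = update(averages, runnable_jobs)
    return averages


if __name__ == "__main__":
    # One minute (12 samples) with 4 runnable jobs on an idle machine.
    print(simulate([4] * 12))
```

Under this scheme the three figures approach the true run-queue length at different speeds, so a short burst of work shows up mostly in the 1-minute number. Dividing each sample by a processor count would rescale all three figures without changing that behavior.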
csg@pyramid.pyramid.com (Carl S. Gutekunst) (04/28/89)
Hi Jeff!

In article <15248@sequent.UUCP> jjb@sequent.UUCP (Jeff Berkowitz) writes:
>How does a four processor 9845 handle load average?

The total number of processes in run state or in uninterruptible sleep.

>In addition, some daemons refuse to run if the "load average" is very high.

Anything besides sendmail?

>Since customers can also write code that checks this, the computed load
>average should reflect reality; each processor can simultaneously run a program.

Any program that decides to alter its behavior based on load average *better*
make that value run-time selectable. Sendmail does.

I understand that you are trying to make this relatively meaningless number
more useful and intuitive. But given per-process multiprocessing, I don't see
how "more processors" differs from "faster processors." Taken to its logical
extreme, it would seem that every vendor should divide their load average by
their VUPS rating. :-)

It's up to the system administration to determine what an "acceptable" load
average is. This is going to vary based on the needs of the site, and the type
of machine they are using. If I add more horsepower to my machine, then in my
mind I've increased the allowable load average, regardless of whether I did it
by adding bigger processors (9805 to 9815, or Balance to Symmetry) or by adding
more processors. If the load average is divided by the number of CPUs, then the
calculation is distorted; I end up mentally multiplying the number I see by a
magic factor to turn it back into something I can use.

On the other hand, there *is* the warm fuzzy of installing more processors and
seeing the load average drop. Me, I'm not real wild about warm fuzzies. (A warm
tribble, perhaps....)

>Does this mean performance does not scale linearly?

How is this relevant to the discussion?

To answer the question, though -- How linear is linear? :-) The Pyramid 9000
is within 5% of linear. I gather that it's not quite as flat as a Balance,
Symmetry, Multimax, or Elxsi; but it seems to be considerably more linear than
a VAX 8800.

<csg>
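The advice that a load-sensitive program should make its threshold run-time selectable could look something like the sketch below. The `--max-load=` flag, the default value, and the function names are all hypothetical; this is not how sendmail implements its own setting. The only library call relied on is Python's `os.getloadavg()`, which is available on Unix-like systems.

```python
import os
import sys

DEFAULT_MAX_LOAD = 8.0  # hypothetical default; each site should pick its own


def max_load_from_args(argv, default=DEFAULT_MAX_LOAD):
    """Return the threshold given as '--max-load=N', else the default."""
    for arg in argv:
        if arg.startswith("--max-load="):
            return float(arg.split("=", 1)[1])
    return default


def should_defer(one_minute_load, max_load):
    """True when the reported load exceeds the site's chosen limit."""
    return one_minute_load > max_load


def main(argv):
    """Check the current 1-minute load against the selected threshold."""
    limit = max_load_from_args(argv)
    one_minute_load, _, _ = os.getloadavg()
    if should_defer(one_minute_load, limit):
        print("load %.2f exceeds %.2f; deferring work" % (one_minute_load, limit))
        return 1
    print("load %.2f within %.2f; proceeding" % (one_minute_load, limit))
    return 0


# Example invocation (not run automatically): sys.exit(main(sys.argv[1:]))
```

Because the limit is supplied at run time, the same program can be used on a machine that reports raw run-queue length and on one that divides by the CPU count; the administrator simply chooses a different number.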
arosen@hen.ulowell.edu (MFHorn) (04/30/89)
In article <10847@polyslo.CalPoly.EDU> steve@polyslo.CalPoly.EDU (Steve DeJarnett) writes:
> Are there multiple run queues on a Balance 8000?? I've never studied the
> implementation of Dynix (lack of source makes it more difficult also :-), but
> I'd suspect there's one run queue, and processors grab the next job eligible
> when they're free. Am I correct here??

I believe the BSD kernel maintains something like 30 run queues (Chris
Torek could probably give more accurate information). The Dynix kernel
is a pretty close clone of the 4.2 kernel, so they probably have just as
many.

I do know that Dynix maintains a queue (one per processor?) of jobs
that have been 'affinitied' to a processor. The scheduler checks this
queue before the 'normal' run queues.

> One last question. When Sequent computes their load average, do they
> take into account the possibility that some of the processors might not have
> been available during the last 1,5,15 minutes?? If I have 2 processors running
> user processes, but the Sequent is basing its calculation of load average on
> 10 processors (or how many there actually are in my system), then a load
> average based on that premise is not a truly representative number.

There are system calls available to find out how many processors are
online at the time of the call. The kernel function that computes
the load averages (loadav in vm_sched.c) divides by 'nonline'.

If a processor goes offline, the load averages won't be accurate, but
after 15 minutes they will be.

--
Andy Rosen           | arosen@hawk.ulowell.edu | "I got this guitar and I
ULowell, Box #3031   | ulowell!arosen          |  learned how to make it
Lowell, Ma 01854     |                         |  talk" -Thunder Road
                RD in '88 - The way it should've been
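How long a stale value lingers after the online processor count changes can be estimated with a small sketch. It assumes the averages are exponentially damped and that the division by the online count happens on each sample; neither detail is confirmed in this thread, and the function names are hypothetical.

```python
import math


def stale_fraction(minutes_elapsed, window_minutes):
    """Share of the pre-change value still present in a damped average."""
    return math.exp(-minutes_elapsed / window_minutes)


def displayed_after_change(old_value, new_value, minutes_elapsed, window_minutes):
    """Displayed average some minutes after the steady value shifts."""
    weight = stale_fraction(minutes_elapsed, window_minutes)
    return new_value + (old_value - new_value) * weight


if __name__ == "__main__":
    # 8 runnable jobs: 4 CPUs online gives 2.0; dropping to 3 gives 8/3.
    for window in (1, 5, 15):
        print(window, displayed_after_change(8 / 4, 8 / 3, 15, window))
```

Under these assumptions the 1-minute and 5-minute figures have essentially caught up after 15 minutes, while the 15-minute figure still carries roughly 37% (e^-1) of the gap between the old and new values, so "accurate after 15 minutes" is an approximation for that last number.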