[comp.unix.wizards] Record High Load Average

brian@ucsd.EDU (Brian Kantor) (05/11/89)

A few years ago our Vax-780 'sdcc3' set the netwide load average record
high of 128 (and may still hold it for a single-processor Unix system).

Yesterday we got our network connections unjammed and our mail gateway
'UCSD' set what I think may be the record for a Vax-750: 85.3.  And this
is on a machine which has NO users, just sendmails.  Thank goodness
we're getting some faster hardware in soon.

Truly a frightful experience.

	Brian Kantor	UCSD Office of Academic Computing
			Academic Network Operations Group  
			UCSD C-010, La Jolla, CA 92093 USA
			brian@ucsd.edu ucsd!brian BRIAN@UCSD

bin@primate.wisc.edu (Brain in Neutral) (05/12/89)

From article <1704@ucsd.EDU>, by brian@ucsd.EDU (Brian Kantor):
> 
> Yesterday we got our network connections unjammed and our mail gateway
> 'UCSD' set what I think may be the record for a Vax-750: 85.3.  And this
> is on a machine which has NO users, just sendmails.

I once got up over 50 for a short while on my VAXstation 2000 under
similar circumstances.  It made my surface plot of the month's activity
look a bit strange; large areas of slightly ripply plains surrounding
the tower of Babylon.

> Truly a frightful experience.

That's for sure...

Paul DuBois
dubois@primate.wisc.edu		rhesus!dubois
bin@primate.wisc.edu		rhesus!bin

seindal@skinfaxe.diku.dk (Rene' Seindal) (05/13/89)

brian@ucsd.EDU (Brian Kantor) writes:

> A few years ago our Vax-780 'sdcc3' set the netwide load average record
> high of 128 (and may still hold it for a single-processor Unix system).

> Yesterday we got our network connections unjammed and our mail gateway
> 'UCSD' set what I think may be the record for a Vax-750: 85.3.  And this
> is on a machine which has NO users, just sendmails.  Thank goodness
> we're getting some faster hardware in soon.

I have seen a load of more than 200 on a Vax-785.  A hard error on a disk
made every process using the disk hang in disk i/o.  Because disk i/o waits
are normally brief, processes sleeping in them are counted as runnable when
the load average is computed, so the stuck processes kept adding to the
load even though they didn't consume any cpu.
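
To make that concrete, here is a toy simulation (my own sketch, not kernel
code; it assumes the usual BSD scheme of sampling every 5 seconds and
folding the sample into an exponentially decayed 1-minute average):

#include <stdio.h>
#include <math.h>       /* link with -lm */

int main()
{
    double avg = 0.0;                   /* the 1-minute load average */
    double decay = exp(-5.0 / 60.0);    /* one sample every 5 seconds */
    int t, nrun;

    for (t = 0; t < 240; t++) {         /* 240 samples = 20 minutes */
        /* 200 processes wedged in disk wait for the first 10 minutes;
         * they are in short-term sleep, so they count as runnable
         * even though they use no cpu at all. */
        nrun = (t < 120) ? 200 : 0;
        avg = avg * decay + nrun * (1.0 - decay);
        if ((t + 1) % 24 == 0)          /* report every 2 minutes */
            printf("t = %2d min   load average: %6.2f\n", (t + 1) / 12, avg);
    }
    return 0;
}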

Talking to the machine became more and more difficult, as more and more
daemons got stuck too.  At one point we could only do rsh's to it, and only
run certain programs (those that didn't touch the sick disk, which had /tmp
on it!).  Every time you made a mistake, one or two more processes would
get stuck, increasing the load further.


Rene' Seindal (seindal@diku.dk).

pl@etana.tut.fi (Lehtinen Pertti) (05/13/89)

From article <4657@freja.diku.dk>, by seindal@skinfaxe.diku.dk (Rene' Seindal):
> brian@ucsd.EDU (Brian Kantor) writes:
> 
> I have seen a load of more than 200 on a Vax-785.  A hard error on a disk made

	On a SUN 3/260 (with SunOS 4.0) the load easily goes somewhere
	between 50 and 100 on occasions when one of the network daemons
	dies (or hangs), but in normal situations even 10 is
	quite rare.

						Pertti Lehtinen
						pl@tut.fi
pl@tut.fi				! -------------------------------- !
Pertti Lehtinen				!  Alone at the edge of the world  !
Tampere University of Technology	! -------------------------------- !
Software Systems Laboratory

rayan@AI.TORONTO.EDU (Rayan Zachariassen) (05/13/89)

From my archives:

> Date:    Sun, 4 Dec 88 20:43:55 EST
> From:    Rayan Zachariassen <rayan>
> To:      lab
> Subject: 380 i mean...
> 
>   8:28pm  up 5 days, 22:16,  16 users,  load average: 378.75, 245.91, 162.11
> 
> the problem was with infinite psroffs. i told balint to send us mail.
> 

I think I later saw ~470, but didn't manage to capture that.
This was a Sun 4/280, slightly busy.

badri@valhalla.ee.rochester.edu (Badri Lokanathan) (05/13/89)

In article <89May13.024652edt.11593@neat.ai.toronto.edu>,
rayan@AI.TORONTO.EDU (Rayan Zachariassen) writes:
>>   8:28pm  up 5 days, 22:16,  16 users,  load average: 378.75, 245.91, 162.11
>> 

A warning here: you could also get ridiculous numbers if `ps U' was not run
to create /etc/psdatabase (assuming you run 4.3 BSD), or if you boot off a
kernel that is not /vmunix (so that ps doesn't nlist the right kernel).
-- 
"I care about my fellow man              {) badri@ee.rochester.edu
 Being taken for a ride,                //\\ {ames,cmcl2,columbia,cornell,
 I care that things start changing     ///\\\ garp,harvard,ll-xn,rutgers}!
 But there's no one on my side."-UB40   _||_   rochester!ur-valhalla!badri

carlo@merlin.cvs.rochester.edu (Carlo Tiana) (05/13/89)

> [a lot of interesting discussion about high load averages deleted]

Yeah, us too - my 3/50 yesterday had:

 12:51pm  up 17 days, 18:41,  2 users,  load average: NaN, NaN, NaN

and it was mostly from furiously reading news! Really!
Carlo.

carlo@cvs.rochester.edu

terryl@tekcrl.LABS.TEK.COM (05/14/89)

brian@ucsd.EDU (Brian Kantor) writes:
> A few years ago our Vax-780 'sdcc3' set the netwide load average record
> high of 128 (and may still hold it for a single-processor Unix system).
>
> Yesterday we got our network connections unjammed and our mail gateway
> 'UCSD' set what I think may be the record for a Vax-750: 85.3.  And this
> is on a machine which has NO users, just sendmails.  Thank goodness
> we're getting some faster hardware in soon.

     OK, it's reminiscing time again. WAY BACK (that's way back, Sherman!!!),
in medieval times, back when Berkeley's EECS Cory machine was a lowly 11/70,
the load average was regularly in the low 50's!!! Pretty much impossible to
get anything done with that kind of load average (and sometimes the login
itself would time out, so one couldn't even get on to try....).

     I wonder how a load average of 50 on an 11/70 compares with a load
average of 128 on an 11/780.......

chris@softway.oz (Chris Maltby) (05/16/89)

terryl@tekcrl.LABS.TEK.COM writes:
<      OK, it's reminiscing time again. WAY BACK (that's way back, Sherman!!!),
< in medieval times, back when Berkeley's EECS Cory machine was a lowly 11/70,
< the load average was regularly in the low 50's!!!

Memories, memories. 60 users on an 11/70 with 640kb of core. Times were
tough then...

<      I wonder how a load average of 50 on an 11/70 compares with a load
< average of 128 on an 11/780.......

Load averages seem to be pretty much an absolute, regardless of the hardware.
Anyway, an 11/70 was faster than a 780. Who would bother with a machine like
that? So here's the plug: the nice thing about machines like our Sequent
is that you get to divide the load average by the number of processors. To
get a load average of 128 you'd need 768 processes on the run queue (6 CPUs).
Given that we only have table space for 245 processes, we can never see
a load average greater than about 41. Ho Hum.
-- 
Chris Maltby - Softway Pty Ltd	(chris@softway.sw.oz)

PHONE:	+61-2-698-2322		UUCP:		uunet!softway.sw.oz.au!chris
FAX:	+61-2-699-9174		INTERNET:	chris@softway.sw.oz.au

romain@pyramid.pyramid.com (Romain Kang) (05/16/89)

Let's take an absurd example:

#include <stdlib.h>
#include <unistd.h>
#include <sys/time.h>
#include <sys/resource.h>

int main(int argc, char **argv)
{
    int nproc = (argc > 1) ? atoi(argv[1]) : 1;     /* how many spinners */

    while (nproc-- > 0) {
        switch (fork()) {
        case 0:
            /* child: sink to the lowest priority and spin forever */
            (void)setpriority(PRIO_PROCESS, getpid(), 19);
            for (;;)
                ;
        case -1:
            pause();        /* fork failed; wait with what we have */
        default:
            break;          /* parent: go start the next one */
        }
    }
    pause();                /* parent sleeps while the children spin */
    return 0;
}

Can you hear the sales rep now?
"This box gives excellent response even with load averages over 500..."

Conversely, I'm sure much of the comp.unix.wizards readership has seen
page thrashers that render vanilla 4.2 BSD systems useless with load
averages under 1.0.

I can hear you screaming, "But that isn't what I'm running here!"
Is it any wonder that system benchmarking is considered a black art?

rbk@sequent.UUCP (Bob Beck) (05/18/89)

A simple way to generate a *very high* load average without consuming cycles
is to create a program that makes a "vfork-chain".  I.e., the process does a
vfork, the child does a vfork, etc., until some number of processes exist, and
the last one does (e.g.) sleep(3600).  While a parent process waits for its
vfork child to exec or exit, it sleeps at "high" priority and *is* counted as
part of the load average.  Thus you can add 1 to the load average per process
in the chain.  This is only true on BSD systems with vfork.  Kinda neat to
drive the load average way up and not actually impact things much.
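
A sketch of that chain as described (my reconstruction, not Bob's program;
the depth argument is my own, and it leans on BSD vfork semantics -- POSIX
makes touching variables after vfork undefined):

#include <stdlib.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    int depth = (argc > 1) ? atoi(argv[1]) : 50;    /* chain length */

    /* BSD vfork() suspends the parent until the child exits or execs,
     * and the suspended parent is counted in the load average.  The
     * child borrows the parent's address space, so "depth" keeps
     * counting down along the chain. */
    while (--depth > 0)
        if (vfork() != 0)
            _exit(0);   /* parent: runs again only when its child exits */
    sleep(3600);        /* the last link just sleeps for an hour */
    _exit(0);
}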

-- 
Bob Beck			uunet!sequent!rbk
Sequent Computer Systems	(503)626-5700

bh@brunix (Bent Hagemark) (05/18/89)

In article <1704@ucsd.EDU> brian@ucsd.EDU (Brian Kantor) writes:
>A few years ago our Vax-780 'sdcc3' set the netwide load average record
>high of 128 (and may still hold it for a single-processor Unix system).
>
>Yesterday we got our network connections unjammed and our mail gateway
>'UCSD' set what I think may be the record for a Vax-750: 85.3.  And this
>is on a machine which has NO users, just sendmails.  Thank goodness
>we're getting some faster hardware in soon.
>
>Truly a frightful experience.
>
>	Brian Kantor	UCSD Office of Academic Computing
>			Academic Network Operations Group  
>			UCSD C-010, La Jolla, CA 92093 USA
>			brian@ucsd.edu ucsd!brian BRIAN@UCSD


A few years ago we were testing some mail system software
which put the load on the 11/750 we were using at the time
(since retired) well over 100.  I don't recall the exact
load average, but I do remember that it had more than 2 digits
(to the left of the ".") because we used a load average utility
that assumed the load would never be greater than or equal
to 100.
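
(For illustration, this is the sort of fixed-width assumption that bites
at 100 -- hypothetical, since the actual utility isn't named above:)

#include <stdio.h>

int main()
{
    double loads[] = { 8.51, 85.30, 104.27 };   /* made-up samples */
    int i;

    /* A "%5.2f" field holds anything up to 99.99; at 100.00 and
     * beyond it silently widens and the columns break. */
    for (i = 0; i < 3; i++)
        printf("load average: |%5.2f|\n", loads[i]);
    return 0;
}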

The test involved having 20 different machines each send 20 messages through
this 750 simultaneously (or as simultaneously as one can do something
like that).

Lots of sendmails seem to like to find their way into the run
queue! :-)

jc@minya.UUCP (John Chambers) (05/18/89)

In article <273@indri.primate.wisc.edu>, bin@primate.wisc.edu (Brain in Neutral) writes:
> From article <1704@ucsd.EDU>, by brian@ucsd.EDU (Brian Kantor):
> > 
> > Yesterday we got our network connections unjammed and our mail gateway
> > 'UCSD' set what I think may be the record for a Vax-750: 85.3.  And this
> > is on a machine which has NO users, just sendmails.
> 
> I once got up over 50 for a short while on my VAXstation 2000 under
> similar circumstances.  It made my surface plot of the month's activity
> look a bit strange; large areas of slightly ripply plains surrounding
> the tower of Babylon.
> 
> > Truly a frightful experience.
> 
> That's for sure...

Not necessarily.  A couple of years back, I wrote a little program 
that added N to the load average without affecting performance.  The 
point was to explain to people why the load average was not necessarily 
a good measure of anything.

The program wasn't very tricky.  It just called nice(40), forked off 
N-1 copies, and each went into an infinite loop.  The result was N
cpu-hog processes, all running at the lowest priority, and using close
to 0 memory.  As a normal user I was limited to 20 processes, but on
machines where I knew the root password, I could (and did) demonstrate
load averages of 50 or 60 without any detectable impact on response
time for vi or emacs users.  The end-of-month statistics on those
machines drew a bit of attention....
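
A minimal reconstruction of that sort of program (my sketch, not John's
original; it's essentially Romain Kang's example above, but using nice()):

#include <stdlib.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    int n = (argc > 1) ? atoi(argv[1]) : 8;     /* N spinners (my default) */

    (void)nice(40);     /* the increment is clamped to the lowest priority */
    while (--n > 0 && fork() > 0)
        ;               /* each parent forks the next copy; children drop out */
    for (;;)
        ;               /* all N spin: runnable, but always preempted */
}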

-- 
John Chambers <{adelie,ima,mit-eddie}!minya!{jc,root}> (617/484-6393)

[Any errors in the above are due to failures in the logic of the keyboard,
not in the fingers that did the typing.]

tale@pawl.rpi.edu (David C Lawrence) (05/19/89)

In article <16202@sequent.UUCP> rbk@sequent.UUCP (Bob Beck) offers a
method for driving up the load average with vfork.  I tried a modified
version of his suggestion which is very simple, drives the load
average up and then steps it back down.  Simple perhaps isn't a good
enough word -- the header is as long as the programme.  Quite an
amusing thing to spring on an unsuspecting soul.  It of course shows
up in ps, but a couple of minutes of work could make it very
mysterious.  Load meters love it, too. :-)

/*                               -*- Mode: C -*- 
 * upload.c --- Tom-foolery for the load average
 * Author          : David C Lawrence
 * Created On      : Thu May 18 14:52:26 1989
 * Last Modified By: David C Lawrence
 * Last Modified On: Thu May 18 14:56:52 1989
 * Update Count    : 1
 * Status          : sinister
 */

#include <vfork.h>      /* some BSD systems want this header to use vfork() */

int main()
{
  int forks = 0;

  /* Each vfork() child borrows the parent's address space while the
   * parent sleeps in vfork-wait (where BSD counts it toward the load
   * average), so this one "forks" variable is shared down the whole
   * chain and the loop stops after the 32nd child. */
  while (forks < 32) {
    (void)vfork();
    forks++;
  }
  sleep(20);    /* each process in turn sleeps, exits, and wakes its parent */
  _exit(0);
}
--
 (setq mail '("tale@pawl.rpi.edu" "tale@itsgw.rpi.edu" "tale@rpitsmts.bitnet"))
 (error "UUCP not spoken here.  Long bang paths unlikely to get mail replies.")