[comp.sys.att] Help with system lockup

friedl@vsi.UUCP (Stephen J. Friedl) (08/20/88)

Hi folks,

     I've a customer with a 3B2/300 running Sys V Rel 3.  We just
upgraded their machine to use 4MB of RAM and we're having some
strangeness, the source of which I'm not sure of.

     The system runs much better with the new RAM, but on
occasion it just locks up for several seconds at a time.  This
is, as you might imagine, annoying, and the customer has politely
implied that this is somehow our doing.

     Running pmon shows that the system is very busy, with `cpu
split information' showing just User and Kernel time (no Idle at
all).  At lockup, however, the system goes into `Wait' state, and
the `Wait time breakdown' shows 100% I/O.

     Turning to the other screens, the system memory usage is
approaching 100% (4MB).  I never saw a lockup while looking at
the memory screen, but I'm guessing that hitting 100% was causing
this.  We also occasionally get `cannot fork' errors on various
terminals when memory is tight.

     Questions: First, what exactly is `wait' time?  I have a
general idea but not the specific definition.  Second, in a
virtual-memory machine, why should hitting the wall on physical
memory be such a big hit?  We have plenty of swap space split on
two spindles, and I would have assumed that paging activity would
rise slowly, not this suddenly.  I believe this may be related
to the virtual memory parameters, but I'm not that smart...
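
     On the second question, a toy simulation shows one way the
fault rate can jump rather than creep up.  It assumes strict LRU
replacement and a workload that cycles through its pages in order.
Both are simplifications: the real pager is not pure LRU and real
programs are not purely cyclic.  The function name and the page
counts are made up for the example.

```python
from collections import OrderedDict

def lru_fault_rate(num_frames, num_pages, passes=20):
    """Fault rate for cyclic access to num_pages pages, given
    num_frames physical frames and strict LRU replacement."""
    frames = OrderedDict()   # keys are resident pages, oldest first
    faults = refs = 0
    for _ in range(passes):
        for page in range(num_pages):
            refs += 1
            if page in frames:
                frames.move_to_end(page)        # mark as most recently used
            else:
                faults += 1
                if len(frames) >= num_frames:
                    frames.popitem(last=False)  # evict least recently used
                frames[page] = True
    return faults / refs

# 100 frames of physical memory; vary the size of the working set.
rates = {pages: lru_fault_rate(100, pages) for pages in (90, 100, 101, 110)}
```

     In this model a working set that fits faults only on first
touch, while one page too many makes every reference fault, since
LRU always evicts the page that is needed next.  It is a worst
case, but it suggests why paging need not rise slowly.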

     Please, any insights into this would be really helpful.

     Steve

-- 
Steve Friedl    V-Systems, Inc.  +1 714 545 6442    3B2-kind-of-guy
friedl@vsi.com     {backbones}!vsi.com!friedl    attmail!vsi!friedl
---------Nancy Reagan on the Three Stooges: "Just say Moe"---------

randy@chinet.UUCP (Randy Suess) (08/22/88)

In article <813@vsi.UUCP> friedl@vsi.UUCP (Stephen J. Friedl) writes:
>Hi folks,
>
>     I've a customer with a 3B2/300 running Sys V Rel 3.  We just
>upgraded their machine to use 4MB of RAM and we're having some
>strangeness, the source of which I'm not sure of.
>     The system runs much better with the new RAM, but on
>occasion it just locks up for several seconds at a time.  This
>is, as you might imagine, annoying, and the customer has politely
>implied that this is somehow our doing.

	Seems to be a problem with the 300.  Symptoms are that
	the system just *stops* for anywhere from 2 to 15 minutes.
	Then, all of a sudden, it just starts back up.  Seems to
	happen more often when a couple of uucico's are running.
	It doesn't happen with the 310/400 motherboard.  There
	is a fix, called IDISK, that I can send you.  Until then,
	the cure seems to be to reduce your NBUF parameter to less
	than 350.  Also, remove the sticky bit on all programs
	(it's not much use on a paging system, anyway).  With 4 megs
	of memory on chinet, until I upgraded to a 310 motherboard,
	an NBUF parameter of 250 seemed to be the highest I could go.
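
	For scale: NBUF sets the number of buffers in the kernel's
	disk buffer cache.  The sketch below works out how much of a
	4 meg machine those settings would pin down, assuming 1 KB
	per buffer and ignoring buffer headers.  The buffer size is
	an assumption, so check it against your own system's
	tunables before relying on the numbers.

```python
BUF_SIZE_BYTES = 1024          # assumption: 1 KB per buffer, headers ignored
RAM_BYTES = 4 * 1024 * 1024    # the 4 MB machines discussed in this thread

def buffer_cache_bytes(nbuf, buf_size=BUF_SIZE_BYTES):
    """Memory taken by nbuf buffers of buf_size bytes each."""
    return nbuf * buf_size

for nbuf in (250, 350):
    used = buffer_cache_bytes(nbuf)
    share = 100.0 * used / RAM_BYTES
    print(f"NBUF={nbuf}: {used // 1024} KB, {share:.1f}% of RAM")
```

	This only sizes the cache.  It says nothing about why a
	large NBUF triggers the stalls on the 300.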
-- 

Randy Suess                 * But don't underestimate raw, frothing,  *
randy@chinet                * manic hardware.           -barry shein  *

len@netsys.COM (Len Rose) (08/22/88)

I would suggest trying the IDISK patch, for one thing.
The problem may not be memory at all.  On dual-disk
systems this used to happen a lot.

Len


-- 
Len Rose - Netsys,Inc. 
len@ames.arc.nasa.gov  or len@netsys (soon to be netsys.COM)