[net.unix-wizards] Intermittent lockup of VAX

wayne@cylixd.UUCP (Wayne Steinmetz) (09/24/85)

Here's the problem.  On very intermittent occasions, our 785 will hang for
anywhere from 5 to 15 minutes.  During this time, nothing entered at
either a terminal or the console will be processed even though characters
are echoed back showing what you typed.  After the elapsed time has passed,
all processing continues as if nothing had ever happened (including a
flood of responses to the commands you typed in while the system was hung).

We are running 4.1BSD on a VAX 11/780 converted to a 785.  Our root drive
is an RM03.  Both the root drive and the tape drive (a TU16 - ugh) share MBA0.
We are using SI 9751's on a SI 9900 controller as our peripheral drives.
We are also running with 12M of memory.

This problem started when we were just a 780 with 3 RM03's and the tape drive
on its own bus, so I don't think the hardware configuration is the problem.

If this type of problem has ever been experienced by anyone out there, we
sure would like to hear about it.  Since it's so intermittent, it's very
difficult to get our local FE's in on the problem.


						STUMPED!

acheng@uiucdcs.CS.UIUC.EDU (09/25/85)

>/* Written  9:56 am  Sep 24, 1985 by wayne@cylixd.UUCP in uiucdcs:net.unix-wizar */
>/* ---------- "Intermittent lockup of VAX" ---------- */
>Here's the problem.  On very intermittent occasions, our 785 will hang for
>anywhere from 5 to 15 minutes.  During this time, nothing entered at
>either a terminal or the console will be processed even though characters
>are echoed back showing what you typed.  After the elapsed time has passed,
>all processing continues as if nothing had ever happened (including a
>flood of responses to the commands you typed in while the system was hung).
>

We experienced something like that when we installed a tri-density
tape drive on one of our 750's.  Whenever a tape access finished and
the tape was rewinding, the whole system froze until the rewind had
completed.  The bug was a hardware one, in the tape controller.
The manufacturer later sent us a revision and it is all okay now.
Your problem may not be in the tape drive, but it is most likely
hardware.

----------------------------------------------------------------------
Albert Cheng
acheng@UIUC.ARPA	acheng@UIUC.CSNET	{ihnp4,pur-ee}!uiucdcs!acheng
Dept. of Computer Science, Univ. of Illinois-Urbana,
Rm. 240, 1304 W. Springfield, Urbana, IL 61801

%%% The above is the opinion of my own %%%
%%% and not necessarily that of the management. %%%

gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (09/27/85)

> Here's the problem.  On very intermittent occasions, our 785 will hang for
> anywhere from 5 to 15 minutes.  During this time, nothing entered at
> either a terminal or the console will be processed even though characters
> are echoed back showing what you typed.  After the elapsed time has passed,
> all processing continues as if nothing had ever happened (including a
> flood of responses to the commands you typed in while the system was hung).

I don't know if this is it, but we have had trouble with FREEMEM
delays when some core hog (SmallTalk in our case) causes the paging
disk to be rearranged.  This hasn't lasted more than a minute or so
when I've noticed it here.

Down with kernel garbage collection!

gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (09/28/85)

> We experienced something like that when we installed a tri-density
> tape drive on one of our 750's.  Whenever a tape access finished and
> the tape was rewinding, the whole system froze until the rewind had
> completed.  The bug was a hardware one, in the tape controller.
> The manufacturer later sent us a revision and it is all okay now.
> Your problem may not be in the tape drive, but it is most likely
> hardware.

Older DEC magtape interfaces would generate two interrupts on rewind,
one when the controller had accepted the command and one when the
rewind was complete.  There have been many magtape device drivers that
don't handle this very well; I recall in the first release of VMS
one could take a TE-16 off-line manually while it was rewinding and
cause the whole operating system to freeze until the rewind completed.
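
To make the two-interrupt behavior concrete, here is a toy model in
Python.  It is only a sketch: it is not taken from any real driver, and
the class, state, and method names are invented for illustration.  The
point it tries to show is that the driver has to remember which of the
two rewind interrupts it is waiting for, and must return to the rest of
the system after the first one instead of waiting for the second.

```python
from enum import Enum, auto


class TapeState(Enum):
    IDLE = auto()
    REWIND_ISSUED = auto()  # command sent, no interrupt seen yet
    REWINDING = auto()      # controller accepted the command, tape moving


class ToyTapeDriver:
    """Toy model of a driver for a controller that interrupts twice per rewind.

    All names here are hypothetical; this is not any real driver's code.
    """

    def __init__(self):
        self.state = TapeState.IDLE
        self.completed = 0  # rewinds reported complete
        self.spurious = 0   # interrupts that arrived with nothing pending

    def start_rewind(self):
        if self.state is not TapeState.IDLE:
            raise RuntimeError("drive busy")
        self.state = TapeState.REWIND_ISSUED

    def interrupt(self):
        if self.state is TapeState.REWIND_ISSUED:
            # First interrupt: command accepted.  Record that and return
            # at once, so nothing else is held up while the tape spins.
            self.state = TapeState.REWINDING
        elif self.state is TapeState.REWINDING:
            # Second interrupt: the rewind has actually finished.
            self.state = TapeState.IDLE
            self.completed += 1
        else:
            # Nothing outstanding: count it instead of acting on it.
            self.spurious += 1


driver = ToyTapeDriver()
driver.start_rewind()
driver.interrupt()  # controller accepted the command
driver.interrupt()  # rewind complete
print(driver.state.name, driver.completed, driver.spurious)
```

A driver that treated the first interrupt as completion, or that spun
at high priority waiting for the second, would misbehave in roughly the
ways described above.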

mikel@codas.UUCP (Mikel Manitius) (10/06/85)

> > Here's the problem.  On very intermittent occasions, our 785 will hang for
> > anywhere from 5 to 15 minutes.  During this time, nothing entered at
> > either a terminal or the console will be processed even though characters
> > are echoed back showing what you typed.  After the elapsed time has passed,
> > all processing continues as if nothing had ever happened (including a
> > flood of responses to the commands you typed in while the system was hung).
> 
> I don't know if this is it, but we have had trouble with FREEMEM
> delays when some core hog (SmallTalk in our case) causes the paging
> disk to be rearranged.  This hasn't lasted more than a minute or so
> when I've noticed it here.
> 
> Down with kernel garbage collection!

Check to see if you have any high speed lines connected to an
interrupt driven port; even a 1200 baud line could do it.
We once had the problem at RPI with a VAX 11/780: an IBM 3081D
was spitting junk at our interrupt driven port and would completely
monopolize the CPU.
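
A rough way to see how a noisy line can eat the CPU is to multiply the
interrupt rate by the cost of servicing each interrupt.  The Python
sketch below does only that arithmetic.  It assumes one interrupt per
received character, 10 bits on the wire per character, and a line that
is saturated with input.  The per-interrupt costs are made-up
placeholders, not measurements from a VAX or any other machine.

```python
def interrupt_load(baud, cost_per_interrupt_s, bits_per_char=10):
    """Fraction of CPU time spent servicing per-character interrupts.

    Assumes one interrupt per received character and a saturated line.
    bits_per_char=10 models 8 data bits plus start and stop bits.
    """
    chars_per_second = baud / bits_per_char
    return min(1.0, chars_per_second * cost_per_interrupt_s)


# Hypothetical cost figures, chosen only to show the shape of the problem.
for baud in (1200, 9600):
    for cost_ms in (0.1, 1.0, 5.0):
        load = interrupt_load(baud, cost_ms / 1000.0)
        print(f"{baud:5d} baud, {cost_ms:3.1f} ms/interrupt: {load:6.1%} of CPU")
```

With these placeholder numbers, a saturated 1200 baud line only becomes
a heavy load when each interrupt is expensive, while a 9600 baud line
gets there at a much lower cost per character.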
-- 
                                        =======
     Mikel Manitius                   ==----=====    AT&T
     (305) 869-2462 RNX: 755         ==------=====   Information Systems 
     ...{akguc|ihnp4}!codas!mikel    ===----======   SDSS Regional Support
     ...attmail!mmanitius             ===========    Altamonte Springs, FL
     My opinions are my own.            =======