jmr@motown.UUCP (John M. Ritter) (08/21/86)
[ *** ]
Well, here's my tale of woe...
I have a VAX 11/785, running System V 2.0v2, 3 RA-81 drives and 3 RA-60
drives. Also, there are 10 DZ-11 boards (count 'em) for 80 terminals and/or
printers. As there are RA drives, there's also the whole DEC UDA-50 mess.
The problem I have is, at irregular intervals, the system decides to lock
up some terminals. Whenever the terminals go, they always go in groups of 8,
so I can identify a single DZ board. Well, it's never the same board. The
ones that go may have anywhere from one to eight users at the time.
Naturally, all hardware passes all diagnostic tests.
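To illustrate how a group of 8 hung terminals points at one DZ board, here is
a minimal sketch. It assumes terminal lines are numbered consecutively from 0
across the boards, 8 lines per board; the names dz_board and boards_for are
hypothetical and not part of any System V tool.

```python
LINES_PER_DZ = 8  # assumption: each DZ-11 board carries 8 serial lines


def dz_board(line):
    """Return the 0-based DZ board index for a 0-based terminal line number."""
    return line // LINES_PER_DZ


def boards_for(hung_lines):
    """Group hung terminal line numbers by the board they belong to."""
    groups = {}
    for line in sorted(hung_lines):
        groups.setdefault(dz_board(line), []).append(line)
    return groups
```

If every hung line lands under a single key, one board is implicated; with 10
boards the valid line numbers under this numbering would be 0 through 79.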
The processes on the system continue: even the users on the locked up
terminals are shown as active, although ALL their jobs just hang. It is
impossible to kill any of these user processes.
The only (fairly) consistent thing throughout this problem is system usage.
Although the number of users varies (10 - 25 on average through the day), the
system maintains a consistent level of usage: normally 70% usr, 30% sys,
0% waiting for i/o, 0% idle. We manage to keep it pretty busy all day
long.
Any ideas what could be causing this problem? If it were an intermittent
problem, I wouldn't be overly concerned, but I find myself re-booting the
system AT LEAST every other day.
All help and suggestions are greatly appreciated. Please reply via mail,
and if I ever get this thing figured out, I'll post the results.
------------------------------------------------------------------------------
"I enjoy working with human beings, and                         John M. Ritter
have stimulating relationships with them."                  Allied-Signal Inc.
                            - HAL 9000                Corporate Tax Department
 {bellcore,bunker,cuuxb,harpo,ihnp4,infopro,mergvax,princeton,sys1}!motown!jmr
------------------------------------------------------------------------------
thk@uxrd1.UUCP (Tom Kiermaier ) (09/09/86)
The problem may be insufficient cooling. Voltage is reduced on the unibus
when the temperature of the processor exceeds a set limit. Unfortunately,
nothing tells you that is what's happening. There is no way diagnostics will
catch this one. An easy test would be to direct more cool air to the machine
and see if that clears up the problem.
Tom Kiermaier
newsadm@uxrd1.UUCP (Netnews Administrator ) (09/09/86)
The problem may be caused by insufficient cooling. Voltage is reduced on the
unibus when the temperature of the processor exceeds a set limit.
Unfortunately, nothing tells you that is what's happening. There is no way
that diagnostics will catch this one. Try directing more cool air to the
machine. Vaxes are quite sensitive to high temperatures.
Tom Kiermaier