lane@DUPHY4.DREXEL.EDU (Charles Lane) (03/16/88)
We've been having some problems lately with system hangups, and there
is some suspicion that an async decnet (DDCMP) link could be the cause
of it.
            VAX#1                        VAX#2    <-sync decnet->[vax]
             750      DDCMP 9600bps       750                      +->(many nodes)
<=ether=   VMS 4.5 <------------------> VMS 4.4   <=ethernet====> (~5 nodes)
            DZ11                         DZ11
Vax#2 hangs up when the DDCMP line is connected, about once every 1-2
weeks, typically on a Friday afternoon. (why this is, no one
knows.... "the TGIF syndrome?") The line is not totally noise free,
and will lose sync a couple of times a day, typically. But when it
is going, the decnet link performs quite well.
The modems have been replaced by better, less error-prone modems, but
this seems to have no effect. VAX#1 is heavily loaded, but has never
crashed or hung without the cause being obvious, and non-decnet
related. VAX#2 is less heavily loaded.
VAX#2 *seems* to be stable when the line is just a `terminal line'
(although LOGINOUT is going constantly, since the other end is trying
to start up the DDCMP line). This indicates that the hardware is
probably not (completely) to blame.
One thing that comes to mind is that perhaps VAX#2 is running short of
some system resource, and adding the DDCMP is just enough to make
something fail. I've looked at memory & page- & swapfile usage, and
the margins seem to be quite adequate. Are there other
resources/quotas that could cause the problem?
So, is there anyone out there with some ideas or experience that could
help track down the problem? This link is fairly important to us, and
we really want to get it going. Even if the system hangups are caused by
something else, then eliminating the decnet link as a possible source
of problems is very desirable.
--Chuck Lane
cel@cithex.caltech.edu
cel@cithex.bitnet

jeh@crash.cts.com (Jamie Hanrahan) (03/23/88)
In article <880315171454.001@DUPHY4.Drexel.Edu> cel@cithex.caltech.edu writes:
> ...
>Vax#2 hangs up when the [async] DDCMP line is connected, about once every 1-2
>weeks, typically on a Friday afternoon.

You said you checked "memory", but I'm not sure if this means you've
checked your nonpaged pool. Since you're running Ethernet your LRPSIZE
should be set to 1504, and if any of your pool regions have expanded
past their initial allocation into their "virtual" regions, set the
initial allocation to whatever the expanded size is, plus maybe 25 to
50%. Do this in MODPARAMS.DAT and let AUTOGEN make any other required
changes.
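
The "expanded size plus 25 to 50%" arithmetic can be sketched in a few
lines of Python. This is only an illustration under stated assumptions:
LRPSIZE = 1504 comes from the advice above, but the parameter names
NPAGEDYN, IRPCOUNT, LRPCOUNT and SRPCOUNT are my guess at which pool
regions are meant, and the "observed" sizes are made-up placeholders,
not measurements from either VAX. Substitute the expanded sizes your
own system reports, and review the output before putting anything in
MODPARAMS.DAT.

```python
def padded_allocation(expanded_size, margin_percent=25):
    """Return the expanded size plus a margin, rounded up (integer math only)."""
    return (expanded_size * (100 + margin_percent) + 99) // 100


def modparams_lines(observed, margin_percent=25, lrpsize=1504):
    """Format MODPARAMS.DAT-style assignments from observed expanded sizes.

    `observed` maps a parameter name to the size the region has grown to.
    The names are supplied by the caller; nothing here checks that they
    are valid parameters on a given VMS version.
    """
    lines = [f"LRPSIZE = {lrpsize}"]
    for name, expanded in observed.items():
        lines.append(f"{name} = {padded_allocation(expanded, margin_percent)}")
    return lines


if __name__ == "__main__":
    # Hypothetical expanded sizes, for illustration only.
    observed = {
        "NPAGEDYN": 400000,
        "IRPCOUNT": 200,
        "LRPCOUNT": 40,
        "SRPCOUNT": 600,
    }
    for line in modparams_lines(observed, margin_percent=50):
        print(line)
```

The script only generates candidate assignment lines; as the reply says,
AUTOGEN is still what should work out any dependent changes.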