lane@DUPHY4.DREXEL.EDU (Charles Lane) (03/16/88)
We've been having some problems lately with system hangups, and there is some
suspicion that an async decnet (DDCMP) link could be the cause of it.

            VAX#1                          VAX#2  <-sync decnet->[vax]
             750       DDCMP 9600bps        750                   +->(many nodes)
<=ether=   VMS 4.5  <------------------>  VMS 4.4  <=ethernet====> (~5 nodes)
            DZ11                           DZ11

Vax#2 hangs up when the DDCMP line is connected, about once every 1-2 weeks,
typically on a Friday afternoon. (why this is, no one knows.... "the TGIF
syndrome?") The line is not totally noise free, and will lose sync a couple of
times a day, typically. But when it is going, the decnet link performs quite
well. The modems have been replaced by better, less error-prone modems, but
this seems to have no effect.

VAX#1 is heavily loaded, but has never crashed or hung without the cause being
obvious, and non-decnet related. VAX#2 is less heavily loaded.

VAX#2 *seems* to be stable when the line is just a `terminal line' (although
LOGINOUT is going constantly, since the other end is trying to start up the
DDCMP line). This indicates that the hardware is probably not (completely) to
blame.

One thing that comes to mind is that perhaps VAX#2 is running short of some
system resource, and adding the DDCMP is just enough to make something fail.
I've looked at memory & page- & swapfile usage, and the margins seem to be
quite adequate. Are there other resources/quotas that could cause the problem?

So, is there anyone out there with some ideas or experience that could help
track down the problem? This link is fairly important to us, and we really
want to get it going....if the system hangups are caused by something else,
then eliminating the decnet link as a possible source of problems is very
desirable.

--Chuck Lane
cel@cithex.caltech.edu
cel@cithex.bitnet
jeh@crash.cts.com (Jamie Hanrahan) (03/23/88)
In article <880315171454.001@DUPHY4.Drexel.Edu> cel@cithex.caltech.edu writes:
> ...
>Vax#2 hangs up when the [async] DDCMP line is connected, about once every 1-2
>weeks, typically on a Friday afternoon.

You said you checked "memory", but I'm not sure if this means you've checked
your nonpaged pool. Since you're running Ethernet your LRPSIZE should be set
to 1504, and if any of your pool regions have expanded past their initial
allocation into their "virtual" regions, set the initial allocation to
whatever the expanded size is, plus maybe 25 to 50%. Do this in MODPARAMS.DAT
and let AUTOGEN make any other required changes.
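
The "expanded size plus maybe 25 to 50%" arithmetic can be sketched as below.
This is only an illustration: the observed sizes are made-up numbers, not taken
from either system in this thread, and the helper name suggested_initial is
hypothetical. The parameter names are the usual SYSGEN pool counts; check them
and the resulting values against your own system before putting anything in
MODPARAMS.DAT.

```python
def suggested_initial(expanded_size, headroom_pct=25):
    """Return the expanded size plus headroom_pct percent, rounded up."""
    if expanded_size < 0 or headroom_pct < 0:
        raise ValueError("sizes and headroom must be non-negative")
    # Integer arithmetic: add 99 before dividing by 100 to round up.
    return (expanded_size * (100 + headroom_pct) + 99) // 100


# Hypothetical current (already expanded) sizes for each pool region.
# These figures are invented for the example.
observed = {
    "LRPCOUNT": 40,
    "IRPCOUNT": 601,
    "SRPCOUNT": 1200,
}

if __name__ == "__main__":
    # LRPSIZE = 1504 is the value recommended in the reply for Ethernet.
    print("LRPSIZE = 1504")
    for name, size in observed.items():
        low = suggested_initial(size, 25)
        high = suggested_initial(size, 50)
        print(f"{name} = {low}    ! or up to {high} for 50% headroom")
```

The printed lines follow the "NAME = value" form used in MODPARAMS.DAT, but
treat them as a starting point and let AUTOGEN work out the rest.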