dave@rosevax.Rosemount.COM (Dave Marquardt) (11/20/86)
I've got a problem. We have a VAX-11/785 running 4.2BSD, and every two weeks or so, telnet and rlogin stop working. This is a BIG problem, because many of our users log in via Bridge terminal servers. The Bridge box is also no longer able to talk to the VAX, and disconnects all sessions to that machine.

In checking out the problem, I tried using telnet from the VAX to another machine, and got this message:

    Out of buffer space.

This message occurs even after the Bridge box disconnects most of the sessions in progress because it can't talk to the VAX any more. This makes me suspect that some networking code is not freeing up memory correctly (or perhaps not at all).

Has anyone else seen this and come up with a fix or work-around? Any help would be appreciated. Please send mail to me.

Also, if it helps any, we're using Excelan's EXOS 204 Ethernet controller and the driver provided by Excelan. If you need any further information, send me mail. Also, if there is interest, I'll post a summary later.

Thanks in advance,
Dave Marquardt
--
Dave Marquardt    Mail: dave@rosevax.Rosemount.COM
Rosemount, Inc.   Telephone: 612/828-3057
"It's a multipurpose shape -- a box."
sweeney@rust.dec.com (Glenn Sweeney) (11/26/86)
In article <742@rosevax.Rosemount.COM> dave@rosevax.Rosemount.COM (Dave Marquardt) writes:
>I've got a problem. We have a VAX-11/785 running 4.2BSD, and every two weeks
>or so, telnet and rlogin stop working. This is a BIG problem, because many of
>our users log in via Bridge terminal servers. The Bridge box is also no
>longer able to talk to the VAX, and disconnects all sessions to that
>machine.
>
>In checking out the problem, I tried using telnet from the VAX to another
>machine, and got this message:
>
>	Out of buffer space.

This problem exists because the system is failing to properly release mbufs after they are used. A workaround to this problem is to edit the file /sys/h/mbuf.h and change the value of NMBCLUSTERS from 256 to 1024, re-sysgen the kernel, and reboot.

Glenn Sweeney
DECwest Engineering
Bellevue, WA., 98007
(206) 865-8738
sweeney%decwet.DEC@decwrl.DEC.COM
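Editor's note on the NMBCLUSTERS workaround above: the sketch below works out roughly how much memory the network buffer pool may claim at each setting. It assumes 1024-byte clusters, which is the usual CLBYTES on a VAX but is not stated anywhere in the thread. The macro and function names are made up for illustration and are not kernel identifiers.

```c
#include <stddef.h>

/* ASSUMPTION: cluster size in bytes. The thread does not give it;
 * 1024 is the usual value on a VAX (two 512-byte pages). */
#define ASSUMED_CLBYTES 1024

/* Hypothetical helper: approximate upper bound, in bytes, on the
 * memory the buffer pool can occupy for a given NMBCLUSTERS value. */
size_t cluster_pool_bytes(size_t nmbclusters)
{
    return nmbclusters * ASSUMED_CLBYTES;
}
```

Under that assumption, cluster_pool_bytes(256) is 262144 (256 KB) and cluster_pool_bytes(1024) is 1048576 (1 MB). If buffers really are leaking, a larger limit would only postpone the failure rather than cure it.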
chris@mimsy.UUCP (Chris Torek) (11/30/86)
>In article <742@rosevax.Rosemount.COM> dave@rosevax.Rosemount.COM (Dave Marquardt) writes:
>>I've got a problem. We have a VAX-11/785 running 4.2BSD, and every
>>two weeks or so, telnet and rlogin stop working. ... In checking out
>>the problem, I tried using telnet from the VAX to another machine,
>>and got this message: Out of buffer space.

In article <235@rust.dec.com>, sweeney@rust.dec.com (Glenn Sweeney) writes:
>This problem exists because the system is failing to properly release
>mbufs after they are used.

I doubt this: A stock 4.2BSD system will panic with `panic: exit: m_getclr' very soon after running out of mbufs. There is at least one other place in a 4.2 kernel that generates `ENOBUFS' errors, and that is in the PSN (nee IMP) code in /sys/netimp.

The workaround (increasing NMBCLUSTERS) may still be useful on a very busy system, where it is indeed possible to run out of space (and soon panic). But if you have a connection to a PSN, see if perhaps your routing tables have become confused. `netstat -r' prints the kernel's routing tables.

(Better yet, convert to 4.3BSD and fix all those lurking bugs at once. It seems significant that only a handful of fixes have been posted for 4.3BSD since its release, whereas 4.2 generated a veritable flood. . . .)
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690)
UUCP: seismo!mimsy!chris    ARPA/CSNet: chris@mimsy.umd.edu
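Editor's note: the point of the post above is that ENOBUFS by itself does not say which part of the kernel ran short, so it is worth checking more than one cause. A minimal sketch of how a client program might turn that errno value into the two checks suggested in the thread. The function name and the message wording are hypothetical; only ENOBUFS from <errno.h> is a real identifier.

```c
#include <errno.h>

/* Hypothetical helper: map an errno value from a failed network call
 * to a hint based on the suggestions in this thread. */
const char *enobufs_hint(int err)
{
    if (err != ENOBUFS)
        return "not a buffer-space error";
    /* Two candidate causes named in the thread: mbuf exhaustion,
     * or the PSN code reacting to confused routing tables. */
    return "ENOBUFS: check mbuf usage; with a PSN link, check netstat -r";
}
```

A caller would pass the value of errno saved immediately after the failing connect() or send() call.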