root@aut.UUCP (root) (04/09/90)
HELP!

We have been having some problems with our Tower 32/650 running release 020101. Optional software installed includes INFORMIX, X.25, and ethernet. The symptoms of the problem (maybe more than one problem) are as follows.

A. `sar` shows 0 idle time (usually because the machine is buried running `rnews`). getty processes are not running on unused ports (those that needed to be respawned). `ps` shows INITSH processes, which are the attempts to respawn the gettys. Console messages say "process respawning too rapidly on ttyXX". `sar -A` shows fork/s = exec/s = 0.00.

B. Like scenario A, except that the gettys do start, but the ksh invocation fails and dumps core upon login. If /bin/sh is the login shell we have no problem logging in. We use an old version of ksh; we are getting ksh-88.

C. Our ksh suddenly gets sick and everyone using it starts getting messages like "No space". Sometimes the ksh dumps core and throws us out of the system.

Rebooting the machine and running fsck (which almost always finds filesystem errors) clears up the problems. This happens only on the machine that is configured as described above; none of our customers has this problem.

I don't think our problems are caused just by our old version of ksh; there is probably something mistuned in the kernel. I have the "Tower Tuning Guide" but I'm still not sure where my problem is.

Has anybody else had these problems? Any tips on how to debug them would be appreciated. The next time this happens, I will try to poke around with `crash`. Since this is possibly a kernel memory problem (the ksh "No space" message), what might I look at with `crash`?

Thanks in advance.