tml@hemuli.UUCP (Tor Lillqvist) (03/05/89)
We are having strange problems on our HP9000/840 running HP-UX 2.1. Sometimes init suddenly starts ignoring users logging out, and doesn't start new getty processes, i.e. you cannot log in again. The utmp file says that the user is still logged in. This spreads to all lines as users log out. The only remedy is to find a line that hasn't been jammed, log in as root, and /etc/shutdown -r ...

Somehow I suspect that the hpdbdaemon or hpimage processes have something to do with it, as this `feature' has appeared often lately when our HPIMAGE usage has increased as deadlines come closer and we keep testing our software ... HPIMAGE causes lots of grief in other ways, too, but that's another story.

By the way, what is this "Fetch data for Tony Lukes" etc you see if you do "strings </usr/bin/hpimage"?
--
Tor Lillqvist
Technical Research Centre of Finland, Computing Services
tml%hemuli.uucp@santra.hut.fi
tml@hemuli.UUCP (Tor Lillqvist) (03/08/89)
The problem I described in the referenced article is getting out of hand. Lately we have had to reboot our system once a day because of missing gettys. After the last boot I started this script so that I would notice when it happens again:

#!/bin/sh
# Kill the getty on an idle line every $SLEEP seconds and check
# that init respawns it; send mail if it does not.
TTY=ttyd21
SLEEP=120
while true; do
    PSLINE="`ps -ef | grep getty\ $TTY | grep -v grep`"
    if [ "$PSLINE" ]; then
        # Second field of the ps line is the PID of the getty
        set $PSLINE
        kill $2
        sleep $SLEEP
    else
        WHOLINE="`who | grep $TTY`"
        if [ -z "$WHOLINE" ]; then
            mail tml <<!EOF!
No getty on line $TTY!
!EOF!
            exit
        else
            sleep $SLEEP
        fi
    fi
done

I didn't have to wait long. After a couple of hours, it fired. When I went to the console, I noticed this message:

HPDBdaemon (/usr/vdx ): One user process terminated abnormally
HPDBdaemon (/usr/vdx ): while in critical section.
HPDBdaemon (/usr/vdx ): Processes of this hpdb will be killed.

Maybe I should describe the background a little: We are developing a software system consisting of 20+ processes, many using HPIMAGE (written in Pascal, ported from the HP1000...). We have noticed that often all the processes suddenly die of a SIGKILL (according to /usr/lib/acct/acctcom -f) without any reasonable cause, leaving only a couple of hpimages and hpdbdaemon, which soon disappear. Sometimes when I start the system from an xterm window, running ksh, even the ksh and xterm die!
--
Tor Lillqvist
Technical Research Centre of Finland, Computing Services
tml%hemuli.uucp@santra.hut.fi
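Since the symptom is that utmp still lists a user after init has stopped noticing logouts, a less intrusive check than killing a getty might be to look for lines that "who" reports as in use but that have no process attached. The following is only a sketch under assumptions: the function name stale_logins is made up for illustration, and it assumes that "ps -e" prints the terminal in its second column using the same name that "who" prints in its second column. That may not hold on every system.

```sh
#!/bin/sh
# Hypothetical helper, not part of HP-UX: print every line that `who`
# lists as logged in but that has no process attached in `ps -e` output.
#   $1 = file holding saved `who` output
#   $2 = file holding saved `ps -e` output (first line is the header)
# Assumption: the tty name is column 2 in both files and spelled the same.
stale_logins() {
    awk '
        # First file read (ps -e): remember every tty that owns a process.
        NR == FNR { if (FNR > 1) busy[$2] = 1; next }
        # Second file read (who): report lines with no process behind them.
        !($2 in busy) { print $2 }
    ' "$2" "$1"
}
```

It could be run from cron or a loop along the lines of "who >/tmp/who.$$; ps -e >/tmp/ps.$$; stale_logins /tmp/who.$$ /tmp/ps.$$". Any line it prints would be a candidate for a jammed line, although entries whose tty names differ between the two commands would show up as false alarms.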