[comp.sys.hp] Lines jamming on 9000/840, HP-UX 2.1

tml@hemuli.UUCP (Tor Lillqvist) (03/05/89)

We are having strange problems on our HP9000/840 running HP-UX 2.1.
Sometimes init suddenly starts ignoring users logging out, and doesn't
start new getty processes, i.e. you cannot log in again.  The utmp
file says that the user is still logged in.  This spreads to all lines
as users log out.  The only remedy is to find a line that hasn't been
jammed, log in as root, and /etc/shutdown -r ...

Somehow I suspect that the hpdbdaemon or hpimage processes have
something to do with it, as this `feature' has appeared often lately,
now that our HPIMAGE usage has increased as deadlines come closer and
we keep testing our software ...  HPIMAGE causes lots of grief in other
ways, too, but that's another story.  By the way, what is this "Fetch
data for Tony Lukes" etc. that you see if you do "strings
</usr/bin/hpimage"?
-- 
Tor Lillqvist
Technical Research Centre of Finland, Computing Services
tml%hemuli.uucp@santra.hut.fi

tml@hemuli.UUCP (Tor Lillqvist) (03/08/89)

The problem I described in the referenced article is getting out of
hand.  Lately we have had to reboot our system once a day because of
missing gettys.  After the last boot I started this script so that I
would notice when it happens again:

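#!/bin/sh
# Watch one terminal line and send mail when init stops respawning getty.

TTY=ttyd21	# line to watch
SLEEP=120	# seconds between checks
while true; do
	# Look for a getty waiting on the line.
	PSLINE="`ps -ef | grep getty\ $TTY | grep -v grep`"
	if [ "$PSLINE" ]; then
		# Found one: split the ps line into words ($2 is the PID)
		# and kill it; a healthy init should start a new getty.
		set $PSLINE
		kill $2
		sleep $SLEEP
	else
		# No getty: fine if somebody is logged in on the line.
		WHOLINE="`who | grep $TTY`"
		if [ -z "$WHOLINE" ]; then
			# Neither a getty nor a user: report it and stop.
			mail tml <<!EOF!
No getty on line $TTY!
!EOF!
			exit
		else
			sleep $SLEEP
		fi
	fi
done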
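
The script above probes a single line by killing its getty.  A passive
variant that only classifies lines, without killing anything, might
look like the sketch below.  It is not part of the original script: the
function names line_state and dead_lines are made up for illustration,
and it assumes a POSIX shell and awk, the usual "ps -ef" layout, and
that the second column of "who" output is the line name.  It works on
saved output files so that both listings come from the same moment.

```shell
#!/bin/sh
# Hypothetical helpers, not from the original post.

# line_state TTY PSFILE WHOFILE
# Prints "getty" if a getty process mentions the line in the saved
# `ps -ef` output, "user" if the saved `who` output lists the line,
# and "dead" if neither is the case.
line_state() {
	if awk -v t="$1" '
		/getty/ { for (i = 1; i <= NF; i++) if ($i == t) found = 1 }
		END { exit !found }' "$2"; then
		echo getty
	elif awk -v t="$1" '$2 == t { found = 1 } END { exit !found }' "$3"; then
		echo user
	else
		echo dead
	fi
}

# dead_lines PSFILE WHOFILE TTY...
# Prints the name of every listed line that has neither a getty
# nor a logged-in user.
dead_lines() {
	psfile=$1
	whofile=$2
	shift 2
	for tty in "$@"; do
		state=$(line_state "$tty" "$psfile" "$whofile")
		[ "$state" = dead ] && echo "$tty"
	done
	return 0
}
```

One way to use it would be to save "ps -ef" and "who" output to two
temporary files and then run dead_lines on those files with the names
of the lines of interest.  Unlike the original script, this only
notices a line after it has already jammed, and it cannot tell a stale
utmp entry from a real login.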

I didn't have to wait long.  After a couple of hours, it fired.  When
I went to the console, I noticed this message:

HPDBdaemon (/usr/vdx ): One user process terminated abnormally
HPDBdaemon (/usr/vdx ): while in critical section.
HPDBdaemon (/usr/vdx ): Processes of this hpdb will be killed.

Maybe I should describe the background a little: We are developing a
software system consisting of 20+ processes, many using HPIMAGE
(written in Pascal, ported from the HP1000...).

We have noticed that often all the processes suddenly die of a SIGKILL
(according to /usr/lib/acct/acctcom -f) without any reasonable cause,
leaving only a couple of hpimages and hpdbdaemon, which soon
disappear.

Sometimes when I start the system from an xterm window, running ksh,
even the ksh and xterm die!
-- 
Tor Lillqvist
Technical Research Centre of Finland, Computing Services
tml%hemuli.uucp@santra.hut.fi