[comp.unix.questions] question - 4bsd ^Z handling and stopping a pipe

mkhaw@teknowledge-vaxc.UUCP (06/21/87)

I've found that when I display a message in elm and ^Z at the "more" prompt,
I'm hung.  This seems to happen because elm is using a pipe to run the pager
program.  If I use "ps" from another tty to look at the hung login, I see
a parent elm process in device-wait state (with GARBAGE in the "ps" "COMMAND"
field), plus a child elm, whose child is a "sh -c more", whose child is the
"more" process that I stopped.  The "sh -c" and "more" processes are both in
the stopped state.

The only way I've found to recover is to kill both the "more" and the "sh -c".

Are there any 4bsd gurus who can explain what a SIGTSTP handler needs to do
to prevent this?  Currently Elm's SIGTSTP handler just restores default
handling (SIG_DFL) and then does a kill(0, SIGTSTP).

Mike Khaw
-- 
internet:  mkhaw@teknowledge-vaxc.arpa
usenet:	   {hplabs|sun|ucbvax|decwrl|sri-unix}!mkhaw%teknowledge-vaxc.arpa
USnail:	   Teknowledge Inc, 1850 Embarcadero Rd, POB 10119, Palo Alto, CA 94303

mouse@mcgill-vision.UUCP (der Mouse) (06/26/87)

In article <13907@teknowledge-vaxc.ARPA>, mkhaw@teknowledge-vaxc.ARPA (Michael Khaw) writes:
> I've found that when I display a message in elm and ^Z at the "more"
> prompt, I'm hung.  [analysis]

> Are there any 4bsd gurus who can explain what a SIGTSTP handler needs
> to do to prevent this?  Currently Elm's SIGTSTP handler just restores
> default handling (SIG_DFL) and then does a kill(0, SIGTSTP).

Then I would say that elm is broken.  It should be doing a
kill(getpid(),SIGTSTP) instead of kill(0,SIGTSTP).  The latter form
sends the signal to the entire process group, all of which have already
received it from the user's keystroke and may misinterpret another
SIGTSTP.

But that's not why I think your elm is hanging.  The failure mode I
suspect is that more, or the shell you saw running with the -c option,
takes over the process group of the terminal.  Then
when you type ^Z, the more and its shell get stopped.  The elm,
however, never sees the signal, 'cause its process group is different
(remember, someone stole the tty).  Now nobody is listening to your
keystrokes!  More and its shell are suspended, elm is waiting for the
shell it forked to exit (as opposed to just stop), and your top-level
shell is waiting for elm to do something (anything).  Killing the more
and its shell wakes elm back up and everybody recovers.
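
The difference between waiting for a child to exit and noticing that it has
merely stopped can be sketched as follows.  This is an illustration only,
not a description of what elm does: it assumes POSIX waitpid() with the
WUNTRACED flag (4BSD offers the same flag through wait3()), and
wait_child_or_stop() is a hypothetical helper name.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Wait for child `pid` to change state.
   Returns 1 if the child stopped, 0 if it exited or was killed,
   and -1 on error.  The raw wait status is stored in *status. */
int wait_child_or_stop(pid_t pid, int *status)
{
    pid_t r;

    /* WUNTRACED makes waitpid() report stopped children too;
       without it the call returns only when the child terminates. */
    do {
        r = waitpid(pid, status, WUNTRACED);
    } while (r == -1 && errno == EINTR);

    if (r == -1)
        return -1;
    return WIFSTOPPED(*status) ? 1 : 0;
}
```

A parent that gets 1 back knows its pager has been suspended and is free to
react, for example by stopping itself and sending the child a SIGCONT when
it is resumed.  What the right reaction is depends on how the process
groups are set up, which this sketch does not address.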

					der Mouse

				(mouse@mcgill-vision.uucp)