[comp.unix.wizards] stopped jobs don't always disappear

berke@csd2.UUCP (Wayne Berke) (11/13/86)

Most UNIX kernels that support job control have a provision
for cleaning up stopped jobs when the user insists on logging
out without taking care of them.  4.2 does this within exit()
by checking if any of the exit'er's children have been suspended.
If so, it sends each such child a SIGHUP followed by a SIGCONT.

A problem occurs in cases when programs fork/exec with the
parent ignoring SIGHUP during its wait().  If this job is suspended,
peculiar things happen when the user logs out.  When the login
shell exits, the kernel notices the stopped child and sends it a
SIGHUP which is ignored.  The subsequent SIGCONT simply changes the
process state from stopped to wait/blocked.  The child's child is never
sent any signal.  This results in both processes hanging around with
the parent blocked and waiting for a child which is suspended.  The
simple program below will exhibit this behavior:

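	#include <signal.h>
	#include <stdlib.h>
	#include <sys/wait.h>
	#include <unistd.h>

	int main(void)
	{
		int status;

		if (fork() == 0) {
			/* child: cat with no arguments reads the terminal until EOF */
			execl("/bin/cat", "cat", (char *)0);
			exit(1);	/* reached only if the exec fails */
		} else {
			/* parent: ignore SIGHUP while waiting, as the f77 driver does */
			signal(SIGHUP, SIG_IGN);
			wait(&status);
		}
		return 0;
	}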

Unfortunately, a commonly used utility does the same thing, namely
f77.  The f77 driver for our system (maybe this has been
changed in the 4.3 distribution?) has the parent ignore SIGHUP, SIGQUIT,
SIGINT, and SIGTERM each time it does a wait().  Thus, when a careless
user CTRL-Z's an f77 job and logs out after ignoring the csh's
"You have stopped jobs" message, the f77 and whatever child has been
spun off remain and take up space in the proc table until the system
crashes or someone explicitly sends them a SIGKILL.  I guess the answer
is not to ignore SIGHUP.  By contrast, cc ignores only SIGTERM and SIGINT
when it waits, so the problem doesn't occur.
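
For illustration, here is a minimal sketch of a driver-style wait that
leaves SIGHUP alone, in the spirit of what cc is described as doing.  It
is not taken from the cc or f77 sources; the helper name
run_shell_and_wait() and the use of /bin/sh to run the command are
assumptions made for the example.  Because SIGHUP keeps its existing
disposition, the hangup sent at logout can still terminate the waiting
parent, and by the exit() behavior described above its own stopped child
should then get a SIGHUP and SIGCONT in turn.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from cc or f77): run `cmd` via /bin/sh and
 * wait for it, ignoring only SIGINT and SIGTERM while waiting.
 * SIGHUP is deliberately left at whatever disposition it already had.
 * Returns the command's exit status, or -1 on error or abnormal exit.
 */
int run_shell_and_wait(const char *cmd)
{
	void (*old_int)(int);
	void (*old_term)(int);
	int status = 0;
	pid_t pid, w;

	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		/* child: dispositions are still the inherited ones here */
		execl("/bin/sh", "sh", "-c", cmd, (char *)0);
		_exit(127);	/* reached only if the exec fails */
	}

	/* parent: ignore interrupt and terminate, but not hangup */
	old_int = signal(SIGINT, SIG_IGN);
	old_term = signal(SIGTERM, SIG_IGN);

	/* retry the wait if a caught signal interrupts it */
	while ((w = waitpid(pid, &status, 0)) < 0 && errno == EINTR)
		;

	/* put the previous dispositions back before returning */
	signal(SIGINT, old_int);
	signal(SIGTERM, old_term);

	if (w < 0 || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

A caller would use it as, say, run_shell_and_wait("exit 3") and get the
child's exit status back.  The signals are ignored only after the fork,
so the child does not inherit the ignored dispositions.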

davel@hpisoa1.HP.COM (Dave Lennert) (11/24/86)

> Most UNIX kernels that support job control have a provision
> for cleaning up stopped jobs when the user insists on logging
> out without taking care of them.  4.2 does this within exit()
> by checking if any of the exit'er's children have been suspended.
> If so, it sends each such child a SIGHUP followed by a SIGCONT.
> 
> A problem occurs in cases when programs fork/exec with the
> parent ignoring SIGHUP during its wait().  If this job is suspended,
> peculiar things happen when the user logs out.  When the login
> shell exits, the kernel notices the stopped child and sends it a
> SIGHUP which is ignored.  The subsequent SIGCONT simply changes the
> process state from stopped to wait/blocked.  The child's child is never
> sent any signal.  This results in both processes hanging around with
> the parent blocked and waiting for a child which is suspended.

You're right.  exit() should probably send SIGHUP & SIGCONT to the
entire process group rather than just the immediate stopped children.
However, if some children (other processes in the process group) were
not currently stopped, this would kill them unexpectedly.  (Remember
that in 4.2-based systems, SIGHUP is usually sent to only *some*
processes when a user logs off.)
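
A user-level analogue of signalling the whole group might look like the
sketch below.  This is not kernel code and not how any particular exit()
is implemented; hangup_group() and demo_stopped_group() are hypothetical
names used only for illustration.  The demo puts a child in its own
process group, lets it stop itself, hangs up the group, and reports the
signal that killed the child.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: send SIGHUP and then SIGCONT to every process in
 * process group `pgrp`.  Note the caveat above: members of the group
 * that are not stopped receive the SIGHUP as well.
 */
int hangup_group(pid_t pgrp)
{
	if (killpg(pgrp, SIGHUP) < 0)
		return -1;
	return killpg(pgrp, SIGCONT);
}

/*
 * Hypothetical demo: returns the number of the signal that terminated
 * the stopped child, 0 if it exited normally, or -1 on error.
 */
int demo_stopped_group(void)
{
	int status;
	pid_t pid;

	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		signal(SIGHUP, SIG_DFL);	/* make sure hangup is not ignored */
		setpgid(0, 0);			/* become leader of a new group */
		raise(SIGSTOP);			/* stop, like a suspended job */
		pause();			/* not reached if SIGHUP kills us */
		_exit(0);
	}
	setpgid(pid, pid);	/* also set in the parent to avoid a race */

	/* wait until the child has actually stopped */
	if (waitpid(pid, &status, WUNTRACED) < 0 || !WIFSTOPPED(status))
		return -1;

	if (hangup_group(pid) < 0)
		return -1;

	/* the continued child should now die of the pending SIGHUP */
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Signalling the group rather than a single pid is what would reach the
child's child in the f77 case, at the cost already noted for group
members that were still running.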

The bottom line is:  If a process ignores SIGHUP, it had better know what
it's doing.


    Dave Lennert                ucbvax!hpda!davel               [UUCP]
    Hewlett-Packard - 47UX      ihnp4!hplabs!hpda!davel         [UUCP]
    19447 Pruneridge Ave.       
    Cupertino, CA  95014        (408) 447-6325                  [AT&T]