[net.unix-wizards] background processes under 4.2bsd

jas@druxy.UUCP (ShanklandJA) (02/19/84)

Question:  How do you start a background process at login time
that is guaranteed to terminate when the user logs out (or is
logged out)?

Recently, I was assisting my neighbor -- not a computer person --
who is writing a thesis on a 4.2bsd UNIX system.  He wanted his work
periodically backed up to a separate directory.  I wrote him a little
4 or 5 line csh script, invoked from his .login file, that would sleep
for 20 minutes, and then copy all files that had been modified in the
last 20 minutes into a backup directory.  The idea was that when he
logged off, the shell executing this script would be sent a SIGHUP
and die.  (That's the way it works under 5.0 USG, which is what I
work with.)

Wasn't long before he started getting nasty mail from the system
programmers about leaving lots of background processes going when
he logged off -- seems that was somehow tying up tty lines that were still
associated with processes of his, not to mention cluttering up the process
table, using a little CPU time, etc.  So we found out that csh nohup's
background processes, and wrote him a .logout file that just said
"kill %1".

Soon, he started getting more nasty mail from the system programmers.
As it turned out, it worked great when he logged himself out; but if
the phone line got dropped for some reason, or if csh logged him out
for sitting idle for too long, csh exited without executing .logout.
(I'd call this a bug, myself.)

So we started looking into ways to make a csh script die when it gets
a SIGHUP (the csh and 4.2bsd equivalent of 'trap 1').  We found 'onintr',
which resets interrupt handling to the default action.  Once I realized
that "interrupt" meant not just SIGINT, but all signals, we seemed to
be home free.  Sure enough, when we sent the background processes
a "kill -1" to test it, csh said something like:

[1] makebackup -- Hangup

Turns out that still didn't work.  For some reason, when the phone line
gets dropped, those processes STILL keep running; apparently, they
don't get a SIGHUP.  So my neighbor has had to abandon the backup script
I wrote for him, and wonders aloud what's so great about this UNIX system
I keep praising, anyway, when it can't even do a simple thing like kill
all his processes when he logs out.

As an aside, a few more apparent bugs:  if he types 'logout', and
his .logout file contains only 'kill %1', and there is no background
job running, csh complains, "%1: No such job" (or something to that
effect), AND WON'T LOG HIM OUT!  He has to start some sort of a
background job running so that csh can kill it and log him out.
(Just great, when the user involved knows very little about computers.)
Furthermore, if I invoke a csh script called "junk" that does a
"while (1) sleep 30" so that .logout will have a job to kill,
and remove the file "junk" before logging out, .logout apparently
never gets executed, and THOSE processes keep running forever.

Now:  WHAT KIND OF CHEESY CRAP IS THIS?  Is there a hardware problem
preventing delivery of SIGHUP when the line gets dropped (though
the login csh dies, all right -- just the background processes live
on)?  Is there some rational explanation for the two apparent bugs
I described in the preceding paragraph?  Is there some easy way
to do what I've been trying to do?  Or is something that is simple
to the point of triviality under USG 5.0 impossible under 4.2bsd?

Don't misunderstand me:  I'm not going out of my way to dump on Berkeley
UNIX -- I'm writing this article in the hope that someone will come
up with a rational explanation for all this, and set my mind at ease.
But the way it looks right now, Berkeley UNIX is full of genuinely
inspired features, half of which work if the phase of the moon is right.
If that's the way it really is, I'll take plain, vanilla, reliable
5.0 USG any day.

Hoping to be set straight....

Jim Shankland
..!ihnp4!druxy!jas

chris@umcp-cs.UUCP (02/21/84)

[Recap: problem is to kill a background process automatically when
logged out.]

The problem is that those background processes are assigned separate
process groups.  This is a very nice feature (but seems to be hard
for people to understand).  The hangup signal sent when you get
logged out is sent to the process group of the terminal.  This is
not the same as the group for the background command (if it's still
in the background), so the background command never sees the signal.

Possible solutions:

- Use getlogin() to see if you've been logged out.  This is how the
leave(1) program works.

- Use the Bourne shell as your login shell.  The Bourne shell doesn't
manipulate process groups (at least in 4.1), so merely trapping the
hangup signal will do the trick.
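
A minimal C sketch of the first suggestion, assuming (as described above
for 4.2bsd) that getlogin() stops returning the original login name once
the user has been logged out; other systems may behave differently.  The
helper names same_login and still_logged_in are made up for illustration.

```c
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: 1 if `current` names the same user as `original`.
 * A NULL `current` (no login recorded for the terminal) counts as
 * "logged out". */
int same_login(const char *current, const char *original)
{
    if (current == NULL || original == NULL)
        return 0;
    return strcmp(current, original) == 0;
}

/* Hypothetical helper: ask who is logged in on our terminal right now
 * and compare with the name saved when the job started. */
int still_logged_in(const char *original)
{
    return same_login(getlogin(), original);
}
```

The background job would copy the result of getlogin() into its own
buffer at startup (the returned string lives in static storage), then
call still_logged_in() between sleeps and exit when it returns 0.
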
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci
UUCP:	{seismo,allegra,brl-bmd}!umcp-cs!chris
CSNet:	chris@umcp-cs		ARPA:	chris.umcp-cs@CSNet-Relay

jonab@sdcrdcf.UUCP (Jonathan Biggar) (02/21/84)

In article <989@druxy.UUCP> jas@druxy.UUCP (ShanklandJA) writes:
>Question:  How do you start a background process at login time
>that is guaranteed to terminate when the user logs out (or is
>logged out)?

You could have a line like:

if ("`tty`" == "not a tty") exit

in your loop.  This will test stderr to see if it is still
attached to a terminal, and if not, will cause the program to exit.

This is only tested on 4.1, but I would assume it would work for
4.2 also.

Jon Biggar
{allegra,burdvax,cbosgd,hplabs,ihnp4,sdccsu3,trw-unix}!sdcrdcf!jonab

martin@nosccod.UUCP (02/24/84)

     In most cases, it is an advantage that background processes are not
interrupted when the login shell dies.  One very simple way to kill a
background job at logout is to pass the process id of the parent shell
(the login shell) to the job as an argument, e.g. run 'prog $$ &', so
that $argv[1] in the job contains the process id of the parent.  If the
background job doesn't have much to do, it can periodically check whether
the parent process is still around and, if it has been orphaned, kill
itself.  Having the background job monitor the status of the login
process is time consuming, but it works.
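
A rough C sketch of this polling idea, assuming a kill() that accepts
signal 0 as a pure existence check (as POSIX specifies).  The names
process_alive and run_while_parent_lives are hypothetical, not from any
posted program.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: 1 if a process with this id exists, else 0.
 * kill() with signal 0 performs only the existence and permission
 * checks; no signal is delivered. */
int process_alive(pid_t pid)
{
    if (pid <= 0)            /* 0 and negative ids mean process groups */
        return 0;
    if (kill(pid, 0) == 0)
        return 1;
    return errno == EPERM;   /* exists, but we may not signal it */
}

/* Hypothetical helper: call work() every `interval` seconds until the
 * process `parent` (the login shell's id, passed in argv[1]) is gone. */
void run_while_parent_lives(pid_t parent, unsigned interval,
                            void (*work)(void))
{
    while (process_alive(parent)) {
        work();
        sleep(interval);
    }
}
```

A main() would convert argv[1] to a pid_t and pass it in along with the
backup routine.  Process ids get reused eventually, so a long polling
interval could in principle mistake a new process for the old shell.
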
Doug Martin        martin@nosc

ss@wivax.UUCP (Sid Shapiro) (02/25/84)

[Recap: problem is to kill a background process automatically when
logged out.]

Another solution (I have just been reading Kernighan and Pike's book,
and I have become inspired to write shell scripts!)
Use a shell script to do various pipes with ps, grep, sed, tail, etc.
to isolate the PID of the process you wish to kill and use kill to
kill it.  I use one and it works just fine.  I have the script 
called by .logout.  It is a Bourne shell script.  It took me less
than 10 minutes to write!


Sid Shapiro -- Wang Institute of Graduate Studies
    [cadmus, decvax, linus, apollo, bbncca]!wivax!ss
    ss.Wang-Inst@Csnet-Relay 
	  (617)649-9731

jas@druxy.UUCP (02/29/84)

This is a summary of answers to my question:  how do you start a
background process under 4.2bsd that is guaranteed to terminate
when you log out?

Recap:  any solution having to do with .logout is no good, because
csh inexplicably fails to look at .logout if the line is dropped.
(Actually, it's pretty explicable:  csh probably doesn't catch SIGHUP,
and never knows what hit it when the line drops.)  The background
process doesn't get SIGHUP'ed for two reasons:  first, csh nohup's
background processes.  Second, the background process is in a different
process group, and is never sent a SIGHUP when the line drops.

The plausible solutions fell into two categories.  The first involved
putting the process into the background "by hand", more or less as follows:

signal( SIGINT, SIG_IGN );
signal( SIGQUIT, SIG_IGN );
if ( fork() )
    exit( 0 );
/* Invoke the "background" process here */

csh thinks of this as a foreground job that has terminated.  The child
process is in the same process group as the login shell, and thus gets
the SIGHUP when the line is dropped.
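
For anyone who wants to try it, here is a slightly fuller sketch of the
same idea.  It is an illustration, not the code any respondent sent:
start_hangup_killable and backup_job are hypothetical names, the check
for a failed fork() and the explicit reset of SIGHUP to its default are
my additions, and whether the child really receives the SIGHUP depends
on how the shell and kernel assign process groups, as discussed above.

```c
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical placeholder for the real work (e.g. the backup loop). */
void backup_job(void)
{
}

/* Fork a child that runs job() without changing its process group.
 * Returns the child's pid in the parent, or -1 if fork() failed.
 * The child never returns from this function. */
pid_t start_hangup_killable(void (*job)(void))
{
    pid_t pid = fork();

    if (pid != 0)
        return pid;              /* parent, or fork() failure (-1) */

    signal(SIGINT, SIG_IGN);     /* keyboard interrupt: ignore */
    signal(SIGQUIT, SIG_IGN);    /* keyboard quit: ignore */
    signal(SIGHUP, SIG_DFL);     /* hangup: default action, i.e. die */
    job();
    _exit(0);
}
```

The caller should exit as soon as this returns a positive pid, so that
csh sees the foreground command finish, just as in the fragment above.
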

The second category involved polling the parent process id periodically.
This can even be done entirely from a shell, by doing a ps <ppid>, where
<ppid> is the process id of the login shell, and checking the exit
status of the ps.
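
From C, a related check needs no ps at all, provided the job is a direct
child of the login shell: when a parent exits, the orphan is handed to
another process (traditionally init, process 1), so getppid() changes.
This swaps ps for getppid() and is not what the respondents proposed;
parent_gone is a hypothetical name.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: 1 once our parent is no longer the process
 * whose id was recorded in `original_parent` at startup. */
int parent_gone(pid_t original_parent)
{
    return getppid() != original_parent;
}
```

The job records getppid() once at startup and tests parent_gone()
between sleeps.  It does not combine with the fork-and-exit trick from
the first category, since there the immediate parent exits right away.
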

Oh, yes -- a third solution is to make the Bourne shell your login shell.

Conclusions:  in my opinion, best summed up by the respondent who wrote,
"there is ... confusion between process groups and terminal groups."
When a job is put into the background, the connection between it and
the control terminal is broken to the extent of ignoring terminal signals.
But some connection remains, for having the background process continue
running interferes in some unspecified way with users wishing to log
onto the terminal whose line was dropped.

It is fine for background jobs to be put into a different process group.
It is even fine for background jobs not to die when the user logs out,
AS A DEFAULT.  It is NOT fine for there to be no way to override that
default (short of polling for existence of the login shell, which strikes
me as ugly) without resorting to C.  The problem is further aggravated
by csh's buggy handling of .logout.

Many thanks to all those who responded; I appreciate the information
you've provided.

Jim Shankland
..!ihnp4!druxy!jas