[comp.sys.sun] csh is nicer to its family than sh

simon%robots.oxford.ac.uk@nss.cs.ucl.ac.uk (Simon Turner) (03/11/89)

I recently ran into a property of the Bourne shell that confuses me
somewhat.  The scenario is this:

Run a process as a great-grandchild of a shelltool, and have it kill its
parent (the grandchild of the shelltool).  Before the kill, this "reaper"
process has its parent's PID as its PPID -- afterwards it has PPID 1.  So
far, so good.
The "reaper" now goes on to murder its former grandparent (nasty process,
this).  The death of the grandparent causes the shelltool to die (since
its child has just been killed), with the friendly message we all know so
well.  Now comes the bit that confuses me.

Once it has killed its parent, the reaper process has PPID 1.  As I
understand it, it should be a free agent -- the death of the shelltool
(just after the reaper murders its former grandparent) should have no
effect on the reaper.  If the grandparent was an interactive csh or tcsh,
this is the case -- after the shelltool dies, the reaper can carry on
raping and pillaging as long as it likes.  However, if the grandparent was
an interactive *sh*, the reaper dies when the shelltool exits.  Why?

This behaviour can be observed with the /bin/sh scripts at the end of the
article: "reaper" is the reaper (surprise!) and "reaper_p" is its parent.
"reaper" keeps a log of what it is doing in the file "./reaper_log".  Try
them two ways from within Suntools (or SunView I suppose 8-):

	a) Set $SHELL to "/bin/csh" and type "shelltool".  When the
		/bin/csh shelltool appears, enter it and type
		"reaper_p" to run the parent of the reaper.  Sit back
		and watch the fireworks.  After the shelltool has
		died and a few seconds have elapsed, look at the
		reaper's log file.  You should find that the reaper
		kept going quite happily after the shelltool died
		(indicated by the presence of the "I'm still here..."
		and "Finished" messages). 

	b) Set $SHELL to "/bin/sh" and perform the same operation.
		This time, when the dust has settled you should find
		that the reaper is dead (check it with ps x), but it
		never managed to write the "I'm still here..." or
		"Finished" messages -- it died just after it killed
		its former grandparent. 

I assume that this is something to do with sh remembering the PIDs of its
descendants, and killing them off if it is killed itself -- which csh
doesn't do.  However, it is possible to run a background job in sh and
then exit sh without killing the job (which *is* like csh).  I am sure
there is a perfectly sensible reason for it.  I couldn't find any mention
of it in TFM, but I must confess I didn't read them all from cover to
cover.  BTW, this is on various machines running 4.0.1.

Any ideas?  This is more out of interest than anything else, as I have
found a way round the problem that prompted this discovery.

Simon

Simon Turner
Robotics Research Group, Dept. of Engineering Science,
University of Oxford, 19 Parks Road, Oxford, England.
                                  ( JANET: simon@uk.ac.oxford.robots )
                  ( ARPA: simon%uk.ac.oxford.robots@uk.ac.ucl.cs.nss )

------ Cut here for "reaper" -------
#! /bin/sh

logfile=./reaper_log

# Find the PIDs of my parent and grandparent -- my grandparent is the
# child of the shelltool window
ppid=`ps l$$ | awk 'NR == 2 {print $4}'`
gppid=`ps l$ppid | awk 'NR == 2 {print $4}'`
echo "My PID is $$" >$logfile
echo "My parent is PID $ppid" >>$logfile
echo "My grandparent is PID $gppid" >>$logfile

# Kill my parent
kill -9 $ppid
echo "Killed my parent (PID $ppid)" >>$logfile

# What do I look like now?
new_ppid=`ps l$$ | awk 'NR == 2 {print $4}'`
echo "My PID is still $$, but my new PPID is $new_ppid" >>$logfile

# Kill my erstwhile grandparent (the shelltool's child process)
kill -9 $gppid
echo "Killed my former grandparent (PID $gppid)" >>$logfile

# Let that sink in
sleep 2
echo "I'm still here..." >>$logfile
sleep 5

# Finish
echo "Finished" >>$logfile

exit 0
---------- End of "reaper" ----------

------ Cut here for "reaper_p" ------
#! /bin/sh

# Run the reaper -- expect to die shortly!
reaper
exit 0
---------- End of "reaper_p" ---------

djk@uunet.uu.net (David James Keegel) (03/31/89)

by simon%robots.oxford.ac.uk@nss.cs.ucl.ac.uk (Simon Turner): 
] ------ Cut here for "reaper_p" ------
] #! /bin/sh
] 
] # Run the reaper -- expect to die shortly!
] reaper
] exit 0
] ---------- End of "reaper_p" ---------

If you change `reaper' to `nohup reaper', you should find that the
shelltool running sh will let the reaper survive.  Alternatively, you may
trap signals 1 (HUP) and 15 (TERM) inside reaper, to prevent its death.
Here is a new log file with most signals (but not CHLD) trapped:

My PID is 6496
My parent is PID 6495
My grandparent is PID 6494
Killed my parent (PID 6495)
My PID is still 6496, but my new PPID is 1
Killed my former grandparent (PID 6494)
Reaper got signal 15
Reaper got signal 1
I'm still here...
Reaper got signal 1
Finished
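
For anyone who wants to reproduce "Reaper got signal" lines like those in
the log above, here is a minimal sketch of trapping HUP and TERM.  It is
not the exact change made to reaper: the handler text, the demo log file
name (./reaper_trap_demo_log) and the self-signalling with kill are all
assumptions made for illustration.  The script sends the signals to itself
instead of waiting for a shelltool to die.

```shell
#! /bin/sh
# Hypothetical demo log; not the reaper_log from the original scripts.
logfile=./reaper_trap_demo_log

# Log HUP (1) and TERM (15) instead of taking their default action (death).
trap 'echo "Reaper got signal 1" >>$logfile' 1
trap 'echo "Reaper got signal 15" >>$logfile' 15

echo "My PID is $$" >$logfile

# Stand-in for the signals that arrive when the shelltool goes away:
# here the script simply signals itself.
kill -15 $$
kill -1 $$

# If the traps work, execution reaches this point.
echo "I'm still here..." >>$logfile
cat $logfile
```

Without the two trap lines the script would die at the first kill and
never write the "I'm still here..." line.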

I expect that the shell is sending signals to everyone in its process
group when it gets killed; this would include reaper even though it has
been `orphaned'. It was these signals which caused reaper to die.
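
The process group guess can be checked roughly as follows.  This is only a
sketch, and it assumes a POSIX-style ps that accepts -o and -p, which the
ps used in the reaper script may not.  The variable names are made up for
the example.  As far as I know, a shell without job control leaves its
children in its own process group, while csh with job control puts each
job in a separate one, which would fit the difference reported.

```shell
#! /bin/sh
# Assumes a POSIX-style ps that accepts -o and -p.

# Process group of this shell.
my_pgid=`ps -o pgid= -p $$ | tr -d ' '`

# Process group of a child started without job control.
# The inner $$ is expanded by the child sh, not by this script.
child_pgid=`sh -c 'ps -o pgid= -p $$' | tr -d ' '`

echo "shell process group: $my_pgid"
echo "child process group: $child_pgid"
```

If both numbers match, a signal sent to that process group reaches the
child as well, whatever its PPID happens to be at the time.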

			David Keegel    (djk@munnari.oz)