simon%robots.oxford.ac.uk@nss.cs.ucl.ac.uk (Simon Turner) (03/11/89)
I recently ran into a property of the Bourne shell that confuses me somewhat.

The scenario is this: Run a process as a great-grandchild of a shelltool, that kills its parent (the grandchild of the shelltool). The "reaper" process used to have its parent as PPID -- after it has killed it, it has PPID 1. So far, so good. The "reaper" now goes on to murder its former grandparent (nasty process, this). The death of the grandparent causes the shelltool to die (since its child has just been killed), with the friendly message we all know so well.

Now comes the bit that confuses me. Once it has killed its parent, the reaper process has PPID 1. As I understand it, it should be a free agent -- the death of the shelltool (just after the reaper murders its former grandparent) should have no effect on the reaper. If the grandparent was an interactive csh or tcsh, this is the case -- after the shelltool dies, the reaper can carry on raping and pillaging as long as it likes. However, if the grandparent was an interactive *sh*, the reaper dies when the shelltool exits. Why?

This behaviour can be observed with the /bin/sh scripts at the end of the article: "reaper" is the reaper (surprise!) and "reaper_p" is its parent. "reaper" keeps a log of what it is doing in the file "./reaper_log". Try them two ways from within Suntools (or SunView I suppose 8-):

a) Set $SHELL to "/bin/csh" and type "shelltool". When the /bin/csh shelltool appears, enter it and type "reaper_p" to run the parent of the reaper. Sit back and watch the fireworks. After the shelltool has died and a few seconds have elapsed, look at the reaper's log file. You should find that the reaper kept going quite happily after the shelltool died (indicated by the presence of the "I'm still here..." and "Finished" messages).

b) Set $SHELL to "/bin/sh" and perform the same operation.
This time, when the dust has settled you should find that the reaper is dead (check it with ps x), but it never managed to write the "I'm still here..." or "Finished" messages -- it died just after it killed its former grandparent.

I assume that this is something to do with sh remembering the PIDs of its descendants, and killing them off if it is killed itself -- which csh doesn't do. However, it is possible to run a background job in sh and then exit sh without killing the job (which *is* like csh).

I am sure there is a perfectly sensible reason for it. I couldn't find any mention of it in TFM, but I must confess I didn't read them all from cover to cover. BTW, this is on various machines running 4.0.1.

Any ideas? This is more out of interest than anything else, as I have found a way round the problem that prompted this discovery.

Simon

Simon Turner
Robotics Research Group, Dept. of Engineering Science,
University of Oxford, 19 Parks Road, Oxford, England.
( JANET: simon@uk.ac.oxford.robots )
( ARPA:  simon%uk.ac.oxford.robots@uk.ac.ucl.cs.nss )

------ Cut here for "reaper" -------
#! /bin/sh

logfile=./reaper_log

# Find the PIDs of my parent and grandparent -- my grandparent is the
# child of the shelltool window
ppid=`ps l$$ | awk 'NR == 2 {print $4}'`
gppid=`ps l$ppid | awk 'NR == 2 {print $4}'`

echo "My PID is $$" >$logfile
echo "My parent is PID $ppid" >>$logfile
echo "My grandparent is PID $gppid" >>$logfile

# Kill my parent
kill -9 $ppid
echo "Killed my parent (PID $ppid)" >>$logfile

# What do I look like now?
new_ppid=`ps l$$ | awk 'NR == 2 {print $4}'`
echo "My PID is still $$, but my new PPID is $new_ppid" >>$logfile

# Kill my erstwhile grandparent (the shelltool's child process)
kill -9 $gppid
echo "Killed my former grandparent (PID $gppid)" >>$logfile

# Let that sink in
sleep 2
echo "I'm still here..." >>$logfile
sleep 5

# Finish
echo "Finished" >>$logfile
exit 0
---------- End of "reaper" ----------

------ Cut here for "reaper_p" ------
#! /bin/sh

# Run the reaper -- expect to die shortly!
reaper
exit 0
---------- End of "reaper_p" ---------
djk@uunet.uu.net (David James Keegel) (03/31/89)
by simon%robots.oxford.ac.uk@nss.cs.ucl.ac.uk (Simon Turner):
] ------ Cut here for "reaper_p" ------
] #! /bin/sh
]
] # Run the reaper -- expect to die shortly!
] reaper
] exit 0
] ---------- End of "reaper_p" ---------

If you change `reaper' to `nohup reaper', you should find that the shelltool running sh will let the reaper survive. Alternatively, you may trap signals 1 (HUP) and 15 (TERM) inside reaper, to prevent its death.

Here is a new log file with most signals (but not CHLD) trapped:

My PID is 6496
My parent is PID 6495
My grandparent is PID 6494
Killed my parent (PID 6495)
My PID is still 6496, but my new PPID is 1
Killed my former grandparent (PID 6494)
Reaper got signal 15
Reaper got signal 1
I'm still here...
Reaper got signal 1
Finished

I expect that the shell is sending signals to everyone in its process group when it gets killed; this would include reaper even though it has been `orphaned'. It was these signals which caused reaper to die.

David Keegel (djk@munnari.oz)
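
For anyone who wants to try the trap approach without killing a shelltool, here is a minimal sketch of what the traps might look like. It is not the modified reaper that produced the log above (that script was not posted); the log file name "./reaper_trap_log" is made up for this example, and the two `kill' commands aimed at the script itself are only a stand-in for whatever signals the dying shell actually sends.

```shell
#! /bin/sh

# Hypothetical log file name, chosen so as not to clobber ./reaper_log
logfile=./reaper_trap_log

# Log HUP (1) and TERM (15) instead of letting them kill the script
trap 'echo "Reaper got signal 1" >>$logfile' 1
trap 'echo "Reaper got signal 15" >>$logfile' 15

echo "My PID is $$" >$logfile

# Stand-in for the signals from the dying shell (an assumption made
# for this demonstration): send TERM and then HUP to ourselves
kill -15 $$
kill -1 $$

# Without the traps above, the script would never get this far
echo "I'm still here..." >>$logfile
echo "Finished" >>$logfile
```

In the real reaper the two `trap' lines would go near the top, before the first `kill -9'. Trapping the signals lets the script record what it received, whereas `nohup reaper' only arranges for HUP to be ignored.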