dsr@stl.stc.co.uk (David Riches) (02/14/90)
I have a program which spawns off a new program via the use of execl
and communicates via pipes created using dup().

When this new program dies it seems to leave a Zombied process
with <defunct> as its name in the process table.

Q: Is the new program not dying properly?
Q: How do I clean up the Z process?

When the caller eventually dies then all the defunct processes are cleaned up. Since this is part of a design tool, there will be many defunct processes lying around before the tool is killed.

Any answers will be much appreciated.

Dave Riches
PSS: dsr@stl.stc.co.uk
ARPA: dsr%stl.stc.co.uk@earn-relay.ac.uk
Smail: Software Design Centre, (Dept. 103, T2 West), STC Technology Ltd., London Road, Harlow, Essex. CM17 9NA. England
Phone: +44 (0)279-29531 x2496
bogatko@lzga.ATT.COM (George Bogatko) (02/16/90)
I wrote this in earlier days, and it probably has screwed up some fine points, but I think it will explain what's happening:

________________________________________________________________________

1. Overview_of_SIGCLD

The SIGCLD or death of child signal is used to notify a parent that a child process has died. That is simple enough. What is not apparent from the manual pages is how death of child can be used to affect the status of the child process after it has died, or how, along with the wait system call, useful information can be obtained about that child process.

Under normal circumstances, when the child process dies, it goes into the zombie state, which is to say that everything has been cleaned up except the useful information located in the process table, which remains until the parent calls wait() or the parent itself dies. At that point, the child process information is cleaned up from the process table.

If death of child is set by the parent to be ignored, then when the child process dies, the child does NOT enter this zombie state. Instead, all the information in the process table is cleaned up.

2. Overview_of_WAIT

The wait() system call gathers information about zombie processes from the process table. If the caller has no children at all, then the call returns -1. If a child exists but has not entered the zombie state, then the call blocks. If a child exists and has entered the zombie state, then two things are returned:

1. The process id of the zombie.
2. Information about why the process terminated.

Along with this, the zombie is cleaned out of the process table. If more than one zombie exists, then successive calls to wait() will retrieve the information for those zombies.

3. SIGCLD_and_WAIT

In light of the above, consider what happens when death of child is being ignored, and the wait call has been invoked on one live child process, which subsequently dies and becomes a zombie.
The result is that when the signal arrives during the blocked wait(), the kernel first cleans up that zombie, and then wakes up the wait() call. Wait() now sees that there are no children to wait for, and returns -1.

In the case of two live children, one of which dies and enters the zombie state, when the signal arrives for the blocked wait call, the kernel cleans up that one zombie, and then wakes up the wait() call, which now sees that there is still a live child being waited for, so it continues to block.

The result of this is that if death of child is being ignored, the wait() system call will block until ALL the child processes are dead, and after that will return -1 (except on the 3b4000, running 5.3.1, where it releases no matter what).

The case where death of child is elected to be caught is even more handy.

EX: signal(SIGCLD, function_ptr)

When a child dies, a zombie appears in the process table, death of child is sent by the kernel, and the signal handling function is invoked. Presumably, in that signal handling function, a call to wait() can be made to see what process died, and why. In this way, child processes may be monitored and restarted if necessary.

Notice that you don't have to be ready, or synchronize this in any way. You can take your time in issuing the wait() call. The zombie will hang around until you are ready to process it.

4. REFERENCES

1. UNIX Programmer's Reference Manual
2. The Design of the UNIX Operating System - Maurice J. Bach, p. 210, top of page, and pp. 213-216

*******

I have a utility function that uses this to enable a sort of private 'inittab'. It will fork/exec a process (or just a plain function, with a 'fork' only) and monitor these children, respawning them if they die. (It also makes the children effectively immune to kill -9: the parent learns of the death from the kernel and respawns them, so they don't have to notify the parent when they are dying.)

If you're interested, I'll send a copy.

GB
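
The wait() behaviour described in section 2 can be sketched roughly as follows. This is only an illustration, not code from the post: spawn_and_reap() is a hypothetical helper name, and the sketch uses the POSIX spelling SIGCHLD rather than the System V name SIGCLD used in this thread. It assumes death of child is at its default disposition, so dead children stay in the process table as zombies until wait() collects them.

```c
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not from the thread): fork n children that exit
 * with codes 0..n-1, then collect every one of them with wait().
 * Returns the number of children reaped, or -1 on error. */
int spawn_and_reap(int n)
{
    int i, status, reaped = 0;
    pid_t pid;

    /* Assumption: use the default disposition so that dead children
     * remain as zombies until the parent waits for them. */
    signal(SIGCHLD, SIG_DFL);

    for (i = 0; i < n; i++) {
        pid = fork();
        if (pid < 0)
            return -1;
        if (pid == 0)
            _exit(i);               /* child: die immediately */
    }

    for (;;) {
        pid = wait(&status);        /* blocks while live children exist */
        if (pid < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            break;                  /* ECHILD: no children are left */
        }
        /* Each successful wait() removes one zombie from the table. */
        if (WIFEXITED(status))
            printf("child %ld exited with code %d\n",
                   (long)pid, WEXITSTATUS(status));
        else if (WIFSIGNALED(status))
            printf("child %ld killed by signal %d\n",
                   (long)pid, WTERMSIG(status));
        reaped++;
    }
    return (errno == ECHILD) ? reaped : -1;
}
```

Calling spawn_and_reap(3) from a main() should report three children; the order in which they are reported is not guaranteed.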
btrue@emdeng.Dayton.NCR.COM (Barry.True) (02/16/90)
In article <2647@stl.stc.co.uk> dsr@stl.stc.co.uk (David Riches) writes:
>I have a program which spawns off a new program via the use of execl
>and communicates via pipes created using dup().
>
>When this new program dies it seems to leave a Zombied process
>with <defunct> as its name in the process table.
>
>Q: Is the new program not dying properly?
>

No, the program is dying properly. But in order to avoid a zombie process the parent must wait on the child. You might get around this by trapping the death of child signal with a signal handler which issues a wait(). When the wait() is executed the zombie process created by the child's death will go away.
ravi@pds3 (Gorur R. Ravi) (02/17/90)
In article <2647@stl.stc.co.uk> dsr@stl.stc.co.uk (David Riches) writes:
>I have a program which spawns off a new program via the use of execl
>and communicates via pipes created using dup().
>
>When this new program dies it seems to leave a Zombied process
>with <defunct> as its name in the process table.
>
>Q: Is the new program not dying properly?

Try this. It might work!

signal(SIGCLD, SIG_IGN);

--
=================================
Gorur Ravi (ravi)
Project Design Systems, Inc.,
2231 Crystal Drive, Suite 1114,
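
For anyone who wants to check what ignoring the signal does on their own system, here is a small sketch. The helper name children_autoreaped() is hypothetical, and SIGCHLD is the POSIX spelling of SIGCLD. It relies on the System V behaviour discussed in this thread, which POSIX.1-2001 later specified; older BSD-derived systems may behave differently.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical check (not from the thread): ignore death of child,
 * fork n children that exit at once, then call wait().
 * Returns 1 if no zombie was ever handed back and wait() failed with
 * ECHILD, 0 if the system behaved differently, -1 on a setup error. */
int children_autoreaped(int n)
{
    void (*old)(int);
    int i, status, zombies_seen = 0, saved_errno;
    pid_t pid;

    old = signal(SIGCHLD, SIG_IGN);     /* the fix suggested above */
    if (old == SIG_ERR)
        return -1;

    for (i = 0; i < n; i++) {
        pid = fork();
        if (pid < 0) {
            signal(SIGCHLD, old);
            return -1;
        }
        if (pid == 0)
            _exit(0);                   /* child: die immediately */
    }

    for (;;) {
        pid = wait(&status);            /* blocks until all children die */
        if (pid > 0) {
            zombies_seen++;             /* a zombie existed after all */
            continue;
        }
        if (errno == EINTR)
            continue;                   /* interrupted: retry */
        break;
    }
    saved_errno = errno;

    signal(SIGCHLD, old);               /* restore earlier disposition */
    return (saved_errno == ECHILD && zombies_seen == 0) ? 1 : 0;
}
```

The price of this approach is that the parent can no longer learn why a child died; if the exit status matters, catching the signal and calling wait() is the better fit.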