venkat@sun.com (02/14/91)
I am sure this question has been asked several times, but I did not keep track of it. I have a daemon process which forks to do various tasks and also notices the termination of its children by doing a

    signal(SIGCHLD, function)

However, when I do a ps, I find several defunct processes, apparently children of my daemon process. How do I prevent such processes from lingering in the system? What exactly is a defunct process? I am running my processes on a Sun 4.

Thanks.

Venkat Rao
International Imaging Systems
sun!iis!venkat@uunet.uu.net
jik@athena.mit.edu (Jonathan I. Kamens) (02/14/91)
Here's something I wrote a while back to deal with the question of reaping child processes:

Unfortunately, it's impossible to generalize how the death of child processes should behave, because the exact mechanism varies over the various flavors of Unix. Perhaps someone who's "in the know" (or at least more so than I am) about POSIX can tell us what the POSIX standard behavior (if there is any) for this is.

First of all, by default, you have to do a wait() for child processes under ALL flavors of Unix. That is, there is no flavor of Unix that I know of that will automatically flush child processes that exit unless you do something to tell it to do so.

Second, allegedly, under some SysV-derived systems, if you do "signal(SIGCHLD, SIG_IGN)", then child processes will be cleaned up automatically, with no further effort on your part. However, people have told me that they've never seen this actually work; the best way to find out if it works at your site is to try it, although if you are trying to write portable code, it's a bad idea to rely on this in any case.

If you can't use SIG_IGN to force automatic clean-up, then you've got to write a signal handler to do it. It isn't easy at all to write a signal handler that does things right on all flavors of Unix, because of the following inconsistencies:

On some flavors of Unix, the SIGCHLD signal handler is called if one *or more* children have died. This means that if your signal handler only does one wait() call, then it won't clean up all of the children. Fortunately, I believe that all Unix flavors for which this is the case have available to the programmer the wait3() call, which accepts the WNOHANG option so that you can check, without blocking, whether or not there are any children waiting to be cleaned up. Therefore, on any system that has wait3(), your signal handler should call wait3() over and over again with the WNOHANG option until there are no children left to clean up.
On SysV-derived systems, SIGCHLD signals are regenerated if there are child processes still waiting to be cleaned up after you exit the SIGCHLD signal handler. Therefore, it's safe on most SysV systems to assume when the signal handler gets called that you only have to clean up one child, and assume that the handler will get called again if there are more to clean up after it exits.

On older systems, signal handlers are automatically reset to SIG_DFL when the signal handler gets called. On such systems, you have to put "signal(SIGCHLD, catcher_func)" (where "catcher_func" is the name of the handler function) as the first thing in the signal handler, so that it gets reset. Unfortunately, there is a race condition: a SIGCHLD that arrives between the time your handler gets called and the time you reset the signal is ignored. Fortunately, newer implementations of signal() don't reset the handler to SIG_DFL when the handler function is called.

The summary of all this is that on systems that have wait3(), you should use that and your signal handler should loop, and on systems that don't, you should have one call to wait() per invocation of the signal handler. Also, if you want to be 100% safe, the first thing your handler should do is reset the handler for SIGCHLD, even though it isn't necessary to do this on most systems nowadays.

-- 
Jonathan Kamens                     USnail:
MIT Project Athena                  11 Ashford Terrace
jik@Athena.MIT.EDU                  Allston, MA  02134
Office: 617-253-8085                Home: 617-782-0710
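For systems without wait3(), the advice above (re-install the handler first, then one wait() per invocation) could be sketched as below. The name catcher_func comes from the post; last_reaped_pid is a hypothetical variable added only so the effect can be observed. One assumption worth checking locally: some System V implementations are reported to re-raise the signal as soon as the handler is re-installed while an unreaped child exists, in which case the signal() call would have to follow the wait() instead.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical, for illustration only: pid of the child most
   recently reaped by the handler (0 if none yet). */
static volatile sig_atomic_t last_reaped_pid = 0;

/* SIGCHLD handler for systems whose signal() resets the
   disposition to SIG_DFL on delivery. */
static void catcher_func(int signo)
{
    int saved_errno = errno;    /* wait() may overwrite errno */
    int status;
    pid_t pid;

    (void)signo;

    /* Re-install first, as the post recommends, so that later
       SIGCHLD signals are caught again. */
    signal(SIGCHLD, catcher_func);

    /* Exactly one wait() per invocation: this relies on the system
       calling the handler again for each remaining child. */
    pid = wait(&status);
    if (pid > 0)
        last_reaped_pid = (sig_atomic_t)pid;

    errno = saved_errno;
}
```

On systems that merge several child exits into one signal, this single wait() leaves the remaining children defunct, which is why the looping form is preferred wherever a non-blocking wait is available.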
mike (02/14/91)
In an article, sun.com!iis!venkat writes:
|I am sure this question was asked several times but I did not keep track
|of it. I have a daemon process which forks out to do various tasks and
|also notices the termination of its children by doing a
|	signal(SIGCHLD, function)
|However when I do a ps, I find several defunct processes, apparently
|children of my daemon process. How do I prevent such processes being in the
|system? What exactly is a defunct process. I am running my processes on a
|Sun 4. Thanks.

A defunct process is a process that has exited but whose parent has not yet called wait() to collect its exit status. Somewhere in your function, you need to call wait().

-- 
Michael Stefanik                       | Opinions stated are not even my own.
Systems Engineer, Briareus Corporation | UUCP: ...!uunet!bria!mike
-------------------------------------------------------------------------------
technoignorami (tek'no-ig'no-ram`i) a group of individuals that are constantly
found to be saying things like "Well, it works on my DOS machine ..."
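To make the definition above concrete, here is a small sketch of a parent collecting a child's exit status. The helper name run_and_reap is hypothetical, and the sketch uses waitpid() on the specific pid, which is the targeted form of wait(), so that it cannot pick up some unrelated child by accident.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: start a child that exits with `code`, then
   reap it. Returns the child's exit status, or -1 on failure. */
static int run_and_reap(int code)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;              /* fork failed */
    if (pid == 0)
        _exit(code);            /* child: exit immediately */

    /* From the child's exit until this call returns, ps would list
       the child as defunct; collecting the status removes the entry
       from the process table. */
    if (waitpid(pid, &status, 0) != pid)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A daemon normally cannot block like this for every child, which is why the collecting call usually lives in the SIGCHLD handler instead.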