rajarar@hubcap.clemson.edu (Bala Rajaraman) (04/13/90)
Hi,

I have a question regarding <defunct> processes. The context is as follows. I have a device simulator that is essentially used as a base for writing operating systems. The operating system makes calls to procedures that simulate devices. The program uses a fork() call to create a background process that simulates the delay and other features of different devices.

The problem I'm running into is that there may be a whole bunch of calls to devices, causing a lot of forked processes. Not all the child processes are active all the time. However, when a child process terminates it leaves something behind, which is called <defunct>. I can see that when I do a "ps -ax". These <defunct> processes cause the maximum quota of processes allowed to be exceeded. The program then fails since it can fork() no more. When child processes complete, they use exit() to terminate.

I'm not sure if I've gotten down all of the info needed for one of you UNIX wizards to see the problem. But in any case, I would be real grateful for an answer or any additional info. I also need this in a real hurry.

Thanks,
Bala
rajarar@hubcap.clemson.edu

Please e-mail all responses to: rajarar@hubcap.clemson.edu
penneyj@servio.UUCP (D. Jason Penney) (04/13/90)
In article <8716@hubcap.clemson.edu> rajarar@hubcap.clemson.edu (Bala Rajaraman) writes:
[snip]
>systems makes calls to procedures which simulate devices. The program
>uses a fork call to create a background process which simulates the
>delay and other features of different devices.
> The problem, I'm running into is that there may be a whole bunch
>of calls to devices causing a lot of forked processes. Not all the
>child processes are active all the time. However when a child process
>terminates it leaves something behind, which is called <defunct>.
>I can see that when I do a "ps -ax". These <defunct> processes cause the
>maximum quota of processes allowed to be exceeded. The program then
>fails since it can fork() no more.
[snip]

Oh, the astonishing things that happen in pUnyx. What you're struggling with is the infamous "SIGCHLD" problem. To allow a process to complete its death, you must trap the SIGCHLD signal and then do a wait() to clear it.

The specifics differ between BSD and SYSV:

In BSD, SIGCHLD is generated whenever one OR MORE children need to be cleared. Thus, you should clear the children by using wait3() with the WNOHANG (and possibly WUNTRACED) flag(s). You should loop until the call to wait3() returns a non-positive value, meaning that all the children have been cleared.

In SYSV, SIGCLD is "regenerated" at the time of signal handler exit if there are still children waiting to be cleared. In this environment it's sufficient to call wait() once and then return.

Note that using signal() to install a handler should be shunned in favor of sigvec() if available, especially in the BSD case. The reason is that the signal handling for a trapped signal is set to SIG_IGN (I think, possibly SIG_DFL?) for the duration of the signal handler. This means that dying children may not be detected if they send their signal during a critical part of your signal handler.

By contrast, sigvec() can "BLOCK" signals (queue them up without delivering them) until your signal handler exits.

--
D. Jason Penney
Ph: (503) 629-8383
Beaverton, OR 97006
uucp: ...uunet!servio!penneyj (penneyj@slc.com)
"Talking about music is like dancing about architecture." -- Steve Martin
gwyn@smoke.BRL.MIL (Doug Gwyn) (04/14/90)
In article <413@servio.UUCP> penneyj@servio.UUCP (D. Jason Penney) writes:
>Oh, the astonishing things that happen in pUnyx. What you're struggling with
>is the infamous "SIGCHLD" problem. To allow a process to complete its
>death, you must trap the SIGCHLD signal and then do a wait() to clear it.

Wrong. It is not necessary to trap SIGCHLD to reap zombies. In fact I don't recommend relying on weird SIGCLD/SIGCHLD semantics for any purpose whatsoever.

I emailed the original requestor a proper explanation and solutions.
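
The emailed explanation is not part of this thread, so the following is only an assumption about what a signal-free solution could look like, not a reconstruction of it. One widely used technique is the "double fork": the parent forks an intermediate child, which forks the real worker and exits at once. The parent reaps the intermediate child immediately with waitpid(), and the orphaned worker is adopted and later reaped by init (or whatever reaper process the system designates). The names spawn_detached and simulate_device are hypothetical.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the simulator's per-device work:
 * sleeps for the number of seconds pointed to by arg. */
void simulate_device(void *arg)
{
    unsigned int seconds = *(unsigned int *)arg;

    sleep(seconds);
}

/* Run work(arg) in a detached process without leaving a zombie behind.
 * Returns 0 if the worker was started, -1 on error. */
int spawn_detached(void (*work)(void *), void *arg)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Intermediate child: start the worker, then exit immediately. */
        pid_t worker = fork();

        if (worker < 0)
            _exit(1);           /* report the failed fork to the parent */
        if (worker == 0) {
            work(arg);          /* the actual background job */
            _exit(0);
        }
        _exit(0);               /* orphans the worker on purpose */
    }

    /* Parent: reap the short-lived intermediate child right away,
     * retrying if the wait is interrupted by a signal. */
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

The trade-off in this sketch is that the parent can no longer collect the worker's exit status, and each device call costs two fork() calls instead of one. If the status matters, polling with waitpid(-1, &status, WNOHANG) before each new fork() is another option that needs no signal handler.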