kannan@cerc.wvu.wvnet.edu (R. Kannan) (09/20/89)
Hi,
We are faced with a strange problem. Even though we have the following
int ChildInt ()
{
	wait((union wait *)0);
}
....
....
signal(SIGCHLD, ChildInt);
....
....
when we fork at a very rapid rate, we end up with
some defunct (zombie) processes.
This is a Sun-4 environment. Could someone explain to us what
could be wrong, and suggest possible solutions?
Thanks very much,
kannan
cpcahil@virtech.UUCP (Conor P. Cahill) (09/20/89)
In article <230@cerc.wvu.wvnet.edu.edu>, kannan@cerc.wvu.wvnet.edu (R. Kannan) writes:
> We are faced with a strange problem. Even though we have the
> int ChildInt () { wait((union wait *)0); }
>
> signal(SIGCHLD, ChildInt);
>
> when we fork at a very rapid rate, we end up with
> some defunct (zombie) processes.
> This is a Sun-4 environment. Could someone explain to us what
> could be wrong, and suggest possible solutions?

It *sounds like* you are losing some of the SIGCHLDs. Since you are
running under SunOS, I would recommend using the BSD signal handling
mechanisms, which should handle the problem.
--
+-----------------------------------------------------------------------+
| Conor P. Cahill     uunet!virtech!cpcahil      703-430-9247           !
| Virtual Technologies Inc., P. O. Box 876, Sterling, VA 22170          |
+-----------------------------------------------------------------------+
guy@auspex.auspex.com (Guy Harris) (09/23/89)
>It *sounds like* you are losing some of the SIGCHLDs. Since you are
>running under SunOS, I would recommend using the BSD signal handling
>mechanisms, which should handle the problem.

Since he's running under SunOS, unless he's building his code in the
System V environment he *is* using the BSD signal handling mechanisms,
even if he's using "signal()". (Yes, "signal()" in BSD, and the BSD
environment of SunOS, has BSD rather than V7 semantics.)

The problem is that there isn't any guarantee in BSD that one SIGCHLD is
delivered for each child process. The SIGCHLD handler should loop until
there are no zombies to be picked up. For example, this is the SIGCHLD
handler used in the "script" command (simplified a bit):

	#include <sys/wait.h>

	finish()
	{
		union wait status;

		while (wait3(&status, WNOHANG, 0) > 0)
			;
	}

The WNOHANG makes sure it doesn't block waiting for children that
haven't exited yet.