[comp.unix.questions] SYS V SIGCLD Handling

scjones@sdrc.UUCP (Larry Jones) (11/20/89)

Consider the following simple program:

#include <signal.h>

handler()
   {
   signal(SIGCLD, handler);
   }

main()
   {
   signal(SIGCLD, handler);
   if (fork() == 0) exit(0);
   }

When I run this on my Sys Vr3.0 system, it goes into recursive
death.  As soon as the handler calls signal to reestablish
itself, it is immediately reinvoked.  Am I doing something wrong,
is my Sys V broken, or is Sys V just generally broken?
----
Larry Jones                         UUCP: uunet!sdrc!scjones
SDRC                                      scjones@SDRC.UU.NET
2000 Eastman Dr.                    BIX:  ltl
Milford, OH  45150-2789             AT&T: (513) 576-2070
"You know how Einstein got bad grades as a kid?  Well MINE are even WORSE!"
-Calvin

gwyn@smoke.BRL.MIL (Doug Gwyn) (11/20/89)

In article <957@sdrc.UUCP> scjones@sdrc.UUCP (Larry Jones) writes:
>When I run this on my Sys Vr3.0 system, it goes into recursive
>death.  As soon as the handler calls signal to reestablish
>itself, it is immediately reinvoked.  Am I doing something wrong,
>is my Sys V broken, or is Sys V just generally broken?

I think whoever designed the SIGCLD facility wanted you to wait()
before reenabling the SIGCLD catcher.

I think the whole notion of signals is broken, but there's not
much point in getting into that..

cpcahil@virtech.uucp (Conor P. Cahill) (11/20/89)

In article <957@sdrc.UUCP>, scjones@sdrc.UUCP (Larry Jones) writes:
> #include <signal.h>
> handler()
>    {    signal(SIGCLD, handler);    }
> 
> main()
>    {  signal(SIGCLD, handler);  if (fork() == 0) exit(0);   }
> 
> When I run this on my Sys Vr3.0 system, it goes into recursive
> death.  As soon as the handler calls signal to reestablish
> itself, it is immediately reinvoked.  Am I doing something wrong,

Yes.  You need to wait for the child in the handler.  If you have more
than one possible child, you need to wait for however many of them
there could be (usually you use alarm() to time out the wait).
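
A minimal sketch of that fix for the single-child case, assuming a
POSIX-style environment.  The names `reap_and_rearm`, `reaped` and
`run_once` are hypothetical, and the fallback definition of SIGCLD is an
assumption for systems that only provide SIGCHLD.  The handler reaps the
child before it re-establishes itself, so re-arming finds nothing left
to report.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for the POSIX declarations used below */
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef SIGCLD
#define SIGCLD SIGCHLD           /* assumption: only the POSIX name exists */
#endif

static volatile sig_atomic_t reaped;   /* hypothetical counter */

static void reap_and_rearm(int sig)
{
    int status;

    (void)sig;
    if (wait(&status) != -1)           /* reap the dead child first ... */
        reaped++;
    signal(SIGCLD, reap_and_rearm);    /* ... then re-arm the catcher */
}

/* Hypothetical driver: fork one child that exits at once and block
   until the handler has reaped it.  Returns the number of children
   reaped, or -1 if fork() fails. */
int run_once(void)
{
    sigset_t block, old, waitmask;
    void (*old_handler)(int);
    pid_t pid;

    reaped = 0;
    old_handler = signal(SIGCLD, reap_and_rearm);

    /* Hold SIGCLD back until we are ready to sleep for it. */
    sigemptyset(&block);
    sigaddset(&block, SIGCLD);
    sigprocmask(SIG_BLOCK, &block, &old);
    waitmask = old;
    sigdelset(&waitmask, SIGCLD);

    pid = fork();
    if (pid == 0)
        _exit(0);                      /* child: exit immediately */
    if (pid != -1) {
        while (reaped == 0)
            sigsuspend(&waitmask);     /* atomically unblock and wait */
    }

    sigprocmask(SIG_SETMASK, &old, NULL);
    signal(SIGCLD, old_handler);
    return pid == -1 ? -1 : (int)reaped;
}
```

Calling run_once() from main() should return 1.  On systems where a
handler stays installed after delivery, the signal() call inside the
handler is redundant but harmless.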


-- 
+-----------------------------------------------------------------------+
| Conor P. Cahill     uunet!virtech!cpcahil      	703-430-9247	!
| Virtual Technologies Inc.,    P. O. Box 876,   Sterling, VA 22170     |
+-----------------------------------------------------------------------+

dougm@queso.ico.isc.com (Doug McCallum) (11/21/89)

In article <1989Nov20.124459.26023@virtech.uucp> cpcahil@virtech.uucp (Conor P. Cahill) writes:
   In article <957@sdrc.UUCP>, scjones@sdrc.UUCP (Larry Jones) writes:
...
   > When I run this on my Sys Vr3.0 system, it goes into recursive
   > death.  As soon as the handler calls signal to reestablish
   > itself, it is immediately reinvoked.  Am I doing something wrong,

   Yes.  You need to wait for the child in the handler.  If you have more
   than one possible child, you need to wait for however many of them
   there could be (usually you use alarm() to time out the wait).

You only need to wait for one child in the signal handler.  With SYSV,
the SIGCLD handler will be called once for each child process that
dies.  By waiting for just one and returning, you will get called for
each exiting child with no need to timeout with an alarm.

Another difference between SYSV and BSD handling of SIGCLD/SIGCHLD is
that with SYSV if you set the signal handler to SIG_IGN, the zombie
will be handled without being waited for.
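
A small sketch of the SIG_IGN behaviour, under the assumption that the
system follows the System V rule described here (POSIX.1-2001 specifies
the same rule for SIGCHLD).  The function name
`ignored_children_vanish` is hypothetical.  With the signal ignored, the
exited child leaves no zombie, so a later waitpid() for it has nothing
to report and fails with ECHILD.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for the POSIX declarations used below */
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef SIGCLD
#define SIGCLD SIGCHLD           /* assumption: only the POSIX name exists */
#endif

/* Returns 1 if, with SIGCLD ignored, waitpid() finds no child to report
   (fails with ECHILD), 0 if the child was reported after all, and -1 if
   fork() fails. */
int ignored_children_vanish(void)
{
    void (*old_handler)(int);
    pid_t pid, w;
    int status, result;

    old_handler = signal(SIGCLD, SIG_IGN);

    pid = fork();
    if (pid == -1) {
        signal(SIGCLD, old_handler);
        return -1;
    }
    if (pid == 0)
        _exit(0);                      /* child: exit immediately */

    /* Blocks until the child is gone, then fails: no zombie was kept. */
    do {
        w = waitpid(pid, &status, 0);
    } while (w == -1 && errno == EINTR);
    result = (w == -1 && errno == ECHILD);

    signal(SIGCLD, old_handler);
    return result;
}
```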

caf@omen.UUCP (Chuck Forsberg WA7KGX) (11/21/89)

In article <11656@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
:I think the whole notion of signals is broken, but there's not
:much point in getting into that..

OK, what should replace the notion of signals??

Chuck Forsberg WA7KGX          ...!tektronix!reed!omen!caf 
Author of YMODEM, ZMODEM, Professional-YAM, ZCOMM, and DSZ
  Omen Technology Inc    "The High Reliability Software"
17505-V NW Sauvie IS RD   Portland OR 97231   503-621-3406
TeleGodzilla:621-3746 FAX:621-3735 CIS:70007,2304 Genie:CAF

CCDN@levels.sait.edu.au (david newall) (11/21/89)

In article <957@sdrc.UUCP>, scjones@sdrc.UUCP (Larry Jones) writes:
> #include <signal.h>
>
> handler()
>    {
>    signal(SIGCLD, handler);
>    }
>
> main()
>    {
>    signal(SIGCLD, handler);
>    if (fork() == 0) exit(0);
>    }
>
> When I run this on my Sys Vr3.0 system, it goes into recursive death.

A similar program which caught SIGINT, running under SVR1, gave me similar
results.  I don't understand why.  It seems to contradict the manual:

         "A call to signal cancels a pending signal sig except for a
          pending SIGKILL signal."


Could someone please explain this apparent contradiction?


David Newall                     Phone:  +61 8 343 3160
Unix Systems Programmer          Fax:    +61 8 349 6939
Academic Computing Service       E-mail: ccdn@levels.sait.oz.au
SA Institute of Technology       Post:   The Levels, South Australia, 5095

cpcahil@virtech.uucp (Conor P. Cahill) (11/21/89)

In article <DOUGM.89Nov20182829@queso.ico.isc.com>, dougm@queso.ico.isc.com (Doug McCallum) writes:
> In article <1989Nov20.124459.26023@virtech.uucp> cpcahil@virtech.uucp (Conor P. Cahill) writes:
>    Yes.  You need to wait for the child in the handler.  If you have more
>    than one possible child, you need to wait for however many of them
>    there could be (usually you use alarm() to time out the wait).
> 
> You only need to wait for one child in the signal handler.  With SYSV,
> the SIGCLD handler will be called once for each child process that
> dies.  By waiting for just one and returning, you will get called for
> each exiting child with no need to timeout with an alarm.

The documentation states that SIGCLDs received while a process is in the
SIGCLD signal handler will be ignored.  That sounds like you could miss 
a SIGCLD or two.

In testing, I found that while the SIGCLD is ignored in the signal handler,
once the handler executes signal(SIGCLD,handler), if there are any child
processes that have exited and have not yet been waited for, another SIGCLD
is generated.  So you can get away with a single wait() under System V.

The following program demonstrates the handling of SIGCLDs under System V.

#include	<stdio.h>
#include	<signal.h>

main()
{
	int 	pid;
	void	signal_handler();
	signal(SIGCLD,signal_handler);

	if( (pid=fork()) == 0 )
	{
		sleep(2);
		printf("1st child exiting\n");
		exit(10);
	}
	fprintf(stdout,"Process %d started...\n", pid);
	fflush(stdout);
	if( (pid=fork()) == 0 )
	{
		sleep(2);
		printf("2nd child exiting\n");
		exit(10);
	}
	fprintf(stdout,"Process %d started...\n", pid);
	fflush(stdout);
	if( (pid=fork()) == 0 )
	{
		sleep(5);
		printf("3rd child exiting\n");
		exit(10);
	}
	fprintf(stdout,"Process %d started...\n", pid);
	fflush(stdout);
	sleep(20);

}

void
signal_handler()
{
	int pid;
	printf("in signal handler...\n");
	sleep(10);
	pid = wait((int*)0);
	printf("process %d exited\n",pid);
	signal(SIGCLD,signal_handler);
	printf("leaving signal handler...\n");
}


OUTPUT from this program:

	Process 29771 started...
	Process 29772 started...
	Process 29773 started...
	1st child exiting
	2nd child exiting
	in signal handler...
	3rd child exiting
	process 29773 exited
	in signal handler...
	process 29772 exited
	in signal handler...
	process 29771 exited
	leaving signal handler...
	leaving signal handler...
	leaving signal handler...

Note the multiple layers of the signal handler.  Apparently the call to
signal() enables the receipt of further SIGCLDs.

If you change the wait code to look like:

	while( (pid = wait((int*)0)) != -1)
		fprintf(...

The output will be as follows:

	Process 29784 started...
	Process 29785 started...
	Process 29786 started...
	1st child exiting
	2nd child exiting
	in signal handler...
	3rd child exiting
	process 29786 exited
	process 29785 exited
	process 29784 exited
	leaving signal handler...

Note that all three processes are handled by the wait() loop and no further
SIGCLDs are received.
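
For comparison, a sketch of the same drain-everything idea using POSIX
calls instead of the System V signal() interface tested above.  This
swaps in sigaction() and waitpid() with WNOHANG, which are not what the
program above uses.  The names `drain_children`, `reaped` and
`reap_children` are hypothetical.  Unlike a plain wait() loop, WNOHANG
keeps the handler from blocking while some children are still running.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for the POSIX declarations used below */
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t reaped;   /* hypothetical counter */

/* Reap every child that has already exited, without blocking. */
static void drain_children(int sig)
{
    int saved = errno;                 /* waitpid() may change errno */
    int status;

    (void)sig;
    while (waitpid(-1, &status, WNOHANG) > 0)
        reaped++;
    errno = saved;
}

/* Hypothetical driver: start n children that exit at once and return
   how many the handler reaped (-1 if the handler cannot be installed). */
int reap_children(int n)
{
    struct sigaction sa, old_sa;
    sigset_t block, old, waitmask;
    int i;

    reaped = 0;
    sa.sa_handler = drain_children;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;                   /* handler stays installed */
    if (sigaction(SIGCHLD, &sa, &old_sa) == -1)
        return -1;

    /* Hold SIGCHLD back except while sleeping in sigsuspend(). */
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    sigprocmask(SIG_BLOCK, &block, &old);
    waitmask = old;
    sigdelset(&waitmask, SIGCHLD);

    for (i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid == 0)
            _exit(0);                  /* child: exit immediately */
        if (pid == -1) {
            n = i;                     /* wait only for those started */
            break;
        }
    }
    while (reaped < n)
        sigsuspend(&waitmask);         /* atomically unblock and wait */

    sigprocmask(SIG_SETMASK, &old, NULL);
    sigaction(SIGCHLD, &old_sa, NULL);
    return (int)reaped;
}
```

Several exits may be reported by a single signal, which is why the
handler loops until waitpid() has nothing more to return.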
 
-- 
+-----------------------------------------------------------------------+
| Conor P. Cahill     uunet!virtech!cpcahil      	703-430-9247	!
| Virtual Technologies Inc.,    P. O. Box 876,   Sterling, VA 22170     |
+-----------------------------------------------------------------------+

chris@mimsy.umd.edu (Chris Torek) (11/22/89)

In article <1989Nov21.132942.29972@virtech.uucp> cpcahil@virtech.uucp
(Conor P. Cahill) writes:
>The documentation states that SIGCLDs received while a process is in the
>SIGCLD signal handler will be ignored.  That sounds like you could miss 
>a SIGCLD or two.

The documentation (almost) lies.

SIGCLD is, in System V, not a signal.  Oh, it has a signal number, and
can be generated by kill(2), and can be caught by signal(2), and so
forth.  But it is not a signal.  It is a weird thing that happens to
use some of the existing signal code so as to avoid inventing a new
facility, while at the same time being very un-signal-like in its
behaviour.  (Just like SIGCONT in BSD, actually....)

>In testing, I found that while the SIGCLD is ignored in the signal handler,
>once the handler executes signal(SIGCLD,handler), if there are any child
>processes that have exited and have not yet been waited for, another SIGCLD
>is generated.

Right.  Unlike all other signals, SIGCLD is `regenerative.'  (It is NOT
queued, despite reports of claims to that effect in other parts of
System V documentation.)  Instead, there is special code in the
signal() system call to check to see if the action for SIGCLD is being
altered.  If so, one of the following is true:

	a) SIGCLD is being set to SIG_IGN.
In this case, all zombie (exited) children of the current process are
flushed as if by a series of calls to wait(2).

	b) SIGCLD is being set to SIG_DFL.
In this case, do nothing.

	c) SIGCLD is being set to a catch function.
In this case, one (1) SIGCLD signal is immediately posted (via psignal(),
if the kernel function has the same name as the one in BSD).

Meanwhile, there is a second special case in the kernel, in exit():  If
u.u_signal[SIGCLD]==SIG_IGN, the newly dead process is discarded
immediately.  Otherwise, a SIGCLD signal is sent (possibly optional on
u.u_signal[SIGCLD]!=SIG_DFL---this would simply be an optimisation).

There is no third special case.  *All* signals are reset to SIG_DFL on
actual delivery, including SIGCLD.

Thus, the `proper' way to catch SIGCLD in SysV is:

	void catch(int sig) {
		int w, status;
		do { w = wait(&status); } while (w==-1 && errno==EINTR);
		<<< do something with w and status >>>
		(void) signal(SIGCLD, catch);/* recurses if necessary */
	}
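
The special cases described above can be sketched as a toy user-space
model.  This is not kernel code: every name in it is hypothetical, and
nested handler calls are flattened into a loop.  Posting a SIGCLD in
case (c) only when a zombie exists is an assumption taken from the test
results quoted earlier in this article.  The model shows why a handler
that re-arms without waiting never stops, while one that waits first
runs once per dead child.

```c
/* Toy model of System V SIGCLD handling; not kernel code. */
enum disp { DISP_DFL, DISP_IGN, DISP_CATCH };

struct proc {
    enum disp sigcld;   /* current SIGCLD action */
    int zombies;        /* exited children not yet waited for */
    int pending;        /* 1 if a SIGCLD is posted; a flag, not a count */
};

/* signal(SIGCLD, ...) with special cases (a), (b) and (c). */
static void model_signal(struct proc *p, enum disp d)
{
    p->sigcld = d;
    if (d == DISP_IGN)
        p->zombies = 0;                  /* (a) flush all zombies */
    else if (d == DISP_CATCH && p->zombies > 0)
        p->pending = 1;                  /* (c) post one SIGCLD */
    /* (b) DISP_DFL: do nothing */
}

/* A child calls exit(). */
static void model_child_exit(struct proc *p)
{
    if (p->sigcld == DISP_IGN)
        return;                          /* discarded immediately */
    p->zombies++;
    p->pending = 1;
}

/* wait(): reap one zombie; 0 on success, -1 if there is none. */
static int model_wait(struct proc *p)
{
    if (p->zombies == 0)
        return -1;
    p->zombies--;
    return 0;
}

/* Let `children` children exit, then deliver SIGCLD until none is
   posted or `limit` deliveries have happened.  If wait_first is
   nonzero the handler waits before re-arming; otherwise it only
   re-arms.  Returns the number of handler invocations. */
int model_run(int children, int wait_first, int limit)
{
    struct proc p = { DISP_DFL, 0, 0 };
    int calls = 0, i;

    model_signal(&p, DISP_CATCH);
    for (i = 0; i < children; i++)
        model_child_exit(&p);

    while (p.pending && p.sigcld == DISP_CATCH && calls < limit) {
        p.pending = 0;
        p.sigcld = DISP_DFL;             /* reset to SIG_DFL on delivery */
        calls++;
        if (wait_first)
            (void)model_wait(&p);
        model_signal(&p, DISP_CATCH);    /* the handler re-arms itself */
    }
    return calls;
}
```

With three children and a waiting handler, model_run(3, 1, 100) returns
3.  With one child and a handler that only re-arms, model_run(1, 0, 100)
runs into the limit and returns 100.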
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@cs.umd.edu	Path:	uunet!mimsy!chris

gwyn@smoke.BRL.MIL (Doug Gwyn) (11/23/89)

In article <870@omen.UUCP> caf@omen.UUCP (Chuck Forsberg WA7KGX) writes:
-In article <11656@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
-:I think the whole notion of signals is broken, but there's not
-:much point in getting into that..
-OK, what should replace the notion of signals??

There are lots of possibilities; e.g. CSP.

bvs@light.uucp (Bakul Shah) (11/30/89)

In article <11667@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>In article <870@omen.UUCP> caf@omen.UUCP (Chuck Forsberg WA7KGX) writes:
>-In article <11656@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>-:I think the whole notion of signals is broken, but there's not
>-:much point in getting into that..
>-OK, what should replace the notion of signals??
>
>There are lots of possibilities; e.g. CSP.

CSP (Communicating Sequential Processes) to replace signals?
Care to elaborate?

At one time software interrupts seemed like a good replacement
for signals but now I think a cleaner solution is to just use a
separate thread to handle signals.  A `kill()' is a message to
this thread and the signal gets `handled' by receiving this
message.  A `signal()' is also a message to the same thread
indicating what to do when a specific signal is received.  You
wouldn't get nested signals (at least not easily) but maybe that
is a favor!  If you insist on longjumping out of a signal
handler, things get more complicated.  Still, almost all of the
signal functionality can be emulated with threads, messages and
shared variables.
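
One way to approximate the idea with POSIX threads is sketched below.
It is an analogy to the scheme described above, not that scheme itself:
it relies on pthread_sigmask() and sigwait(), and the names `sig_ctx`,
`signal_thread` and `handle_in_thread` are hypothetical.  The signal is
blocked everywhere and a dedicated thread receives it synchronously, so
the handling code runs in ordinary thread context rather than in a
handler.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for the POSIX declarations used below */
#include <pthread.h>
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

struct sig_ctx {
    sigset_t set;       /* signals the dedicated thread receives */
    int received;       /* signal taken by the thread, 0 if none */
};

/* The dedicated thread: receive one signal as if it were a message. */
static void *signal_thread(void *arg)
{
    struct sig_ctx *ctx = arg;
    int sig;

    if (sigwait(&ctx->set, &sig) == 0)
        ctx->received = sig;    /* "handle" it in normal thread context */
    return NULL;
}

/* Hypothetical driver: send SIGUSR1 to this process and return the
   signal number the dedicated thread received (-1 on setup failure). */
int handle_in_thread(void)
{
    struct sig_ctx ctx;
    sigset_t old;
    pthread_t tid;

    ctx.received = 0;
    sigemptyset(&ctx.set);
    sigaddset(&ctx.set, SIGUSR1);

    /* Block SIGUSR1 here; threads created afterwards inherit the mask. */
    if (pthread_sigmask(SIG_BLOCK, &ctx.set, &old) != 0)
        return -1;
    if (pthread_create(&tid, NULL, signal_thread, &ctx) != 0) {
        pthread_sigmask(SIG_SETMASK, &old, NULL);
        return -1;
    }

    kill(getpid(), SIGUSR1);    /* the "message" to the signal thread */
    pthread_join(tid, NULL);    /* joining makes ctx.received visible */

    pthread_sigmask(SIG_SETMASK, &old, NULL);
    return ctx.received;
}
```

This assumes no other thread in the process has SIGUSR1 unblocked.
Older toolchains may need the -pthread option when compiling.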

-- Bakul Shah
   bvs@BitBlocks.COM
   ..!{ames,sun,ucbvax,uunet}!amdcad!light!bvs