[comp.unix.wizards] How do you catch a signal without terminating the process?

manohar@csl.dl.nec.com (mun o her) (06/06/91)

Server1 = Original Server (something similar to inetd)
Server2 = Effective current Server.


Assume that we are writing a program to implement a concurrent server.
Assume that Server1 spawns a Server2 when a client wants a connection,
and Server1 then goes back to the accept state. When Server2 terminates as a
result of the client terminating, Server1 must be notified of its
death so that it can execute wait() to get rid of the Server2 zombie. If a
signal handler is used to catch SIGCHLD, the accept() in Server1 is
interrupted, and the PROBLEM is that the 'whole program' dies. The intention
is to resume the accept state of Server1. The question is: how do you PREVENT
Server1 from dying and have it resume the accept state?


Any suggestion will be greatly appreciated.


-- 
My Life is like a boat in a dry river. I did not want to drag the boat out, I 
wanted to give it a river.
          ----Boris Pasternak's(of Dr. Zhivago's fame) letter to Mayor Kowaski
Manohar S. Gudavalli              ||Internet: manohar@csl.dl.nec.com

mouse@thunder.mcrcim.mcgill.edu (der Mouse) (06/07/91)

In article <1991Jun5.215644.20581@csl.dl.nec.com>, manohar@csl.dl.nec.com (mun o her) writes:

> Server1 = Original Server (something similar to inetd)
> Server2 = Effective current Server.

> Assume that we are writing a program to implement a concurrent
> server.  Assume that Server1 spawns a Server2 when the client wants
> the connection and Server1 goes back to the accept state.  When
> Server2 terminates as a result of the termination of the client,
> Server1 must be notified of its death so that it can execute wait() to
> get rid of the zombie of Server2.  If a signal handler is used to catch
> SIGCHLD, the accept() (Server1) is interrupted

This is normal, or at least not abnormal, depending on your system.

> and the PROBLEM is that the 'whole program' dies.

This is a bug in the server.

> The intention is to resume accept state of Server1.  The question is
> How do you PREVENT Server1 from [dying] and resume accept state.

Fix the bug.

There is nothing about SIGCHLD that should kill off the server.  I
assume you have set up a handler for SIGCHLD; the handler should wait
for the dead child.  (The details of this differ from system to system,
and you don't say what your system is, so I can't elaborate.)

Most likely the code assumes that any error return from accept() should
be fatal.  This is the bug; if accept() returns with errno indicating
EINTR, you should simply go back and retry the accept().

If that's not it I'd need more information.

					der Mouse

			old: mcgill-vision!mouse
			new: mouse@larry.mcrcim.mcgill.edu

klm@gozer.UUCP (Kevin L. McBride) (06/13/91)

In article <1991Jun7.145102.24125@thunder.mcrcim.mcgill.edu> mouse@thunder.mcrcim.mcgill.edu (der Mouse) writes:
>In article <1991Jun5.215644.20581@csl.dl.nec.com>, manohar@csl.dl.nec.com (mun o her) writes:
>
>> Server1 = Original Server (something similar to inetd)
>> Server2 = Effective current Server.
>
>> Assume that we are writing a program to implement a concurrent
>> server.  Assume that Server1 spawns a Server2 when the client wants
>> the connection and Server1 goes back to the accept state.  When
>> Server2 terminates as a result of the termination of the client,
>> Server1 must be notified of its death so that it can execute wait() to
>> get rid of the zombie of Server2.  If a signal handler is used to catch
>> SIGCHLD, the accept() (Server1) is interrupted
>
>This is normal, or at least not abnormal, depending on your system.
>
>> and the PROBLEM is that the 'whole program' dies.
>
>This is a bug in the server.
>
>> The intention is to resume accept state of Server1.  The question is
>> How do you PREVENT Server1 from [dying] and resume accept state.
>
>Fix the bug.
>
>There is nothing about SIGCHLD that should kill off the server.  I
>assume you have set up a handler for SIGCHLD; the handler should wait
>for the dead child.  (The details of this differ from system to system,
>and you don't say what your system is, so I can't elaborate.)
>
>Most likely the code assumes that any error return from accept() should
>be fatal.  This is the bug; if accept() returns with errno indicating
>EINTR, you should simply go back and retry the accept().

Another likely possibility (that bit me in the ass recently), is that
the signal handler is doing a signal() (or something similar) to
re-instate itself as the signal handler BEFORE doing the wait() on the
child corpse.

This will, of course, cause the immediate re-issuance of the interrupt
and your process will die from a memory fault or segment violation due
to a stack blasting infinite recursion.

Remember, never re-enable an interrupt until the condition that caused
the interrupt has been cleared.

After 14 years one would think I'd have learned, but we all @#$% up
occasionally :-)

--
Kevin L. McBride    DoD      // Just say NO to the war on your freedom which,
President          #0348    //  by the way, is being fought with YOUR money.
MSCG, Inc.              \\ //   Let them know you've had enough.
uunet!wang!gozer!klm     \X/    Vote Libertarian.

gwyn@smoke.brl.mil (Doug Gwyn) (06/14/91)

In article <1991Jun13.160901.3715@gozer.UUCP> klm@gozer.UUCP (Kevin L. McBride) writes:
>Remember, never re-enable an interrupt until the condition that caused
>the interrupt has been cleared.

UNIX signals are not interrupts.
Worse, SIGCHLD/SIGCLD is not even a UNIX signal, it's an abomination.

klm@gozer.UUCP (Kevin L. McBride) (06/25/91)

In article <16410@smoke.brl.mil> gwyn@smoke.brl.mil (Doug Gwyn) writes:
>In article <1991Jun13.160901.3715@gozer.UUCP> klm@gozer.UUCP (Kevin L. McBride) writes:
>>Remember, never re-enable an interrupt until the condition that caused
>>the interrupt has been cleared.
>UNIX signals are not interrupts.

Yes, I am aware of this.  I was generalizing.  However, SIGCLD/SIGCHLD
behaves in this context as if it were an interrupt in that it will be
reissued immediately if not cleared before being re-enabled.

>Worse, SIGCHLD/SIGCLD is not even a UNIX signal, it's an abomination.

Agreed, albeit a useful one.

--
Kevin L. McBride    DoD      // Just say NO to the war on your freedom which,
President          #0348    //  by the way, is being fought with YOUR money.
MSCG, Inc.              \\ //   Let them know you've had enough.
uunet!wang!gozer!klm     \X/    Vote Libertarian.
