[comp.sys.sgi] How to kill tasks after a user abort?

lansd@dgp.toronto.edu (Robert Lansdale) (03/05/90)

	I have been running my 3D renderer successfully in parallel
on our 4D/140 for several months now, but I have yet to get the ^C
signal handling (SIGINT) to terminate the tasks gracefully.

	My renderer is set up so the user may abort the display process
(which runs in parallel using tasks distributed with taskcreate()) by
typing ^C, at which point the signal handler shuts the display code down
and kills any stray tasks (I've noticed that random tasks get killed when
SIGINT is seen - is this documented anywhere??). The signal handler then
does a long jump back to the command line parser. The parent process is
usually asleep while the tasks are running (it distributes work to the
tasks and goes to sleep in a uspsema() statement).

	So how do I ensure that the parent process gets woken and becomes
the first process to enter the signal handler? I've (carefully) read through
the signal man page several times and tried a number of variations of the
SIGCLD signal catcher, but I still find that the parent process occasionally
gets stuck in the printf() routine when it is about to print that the
display code is shut down (printf() -> Psema() -> blockproc() ...). This
printf() is part of the signal handling code.

	Any pointers would be most appreciated. Thanks.

--> Rob Lansdale

----------------------------------------------------------------------------
Robert Lansdale - (416) 978-6619       Dynamic Graphics Project	
Internet: lansd@dgp.toronto.edu        Computer Systems Research Institute
UUCP:   ..!uunet!dgp.toronto.edu!lansd University of Toronto
Bitnet:	  lansd@dgp.utoronto           Toronto, Ontario M5S 1A4, CANADA

moss@BRL.MIL ("Gary S. Moss", VLD/VMB) (03/05/90)

Rob Lansdale writes:
<	So how do I ensure that the parent process gets woken and becomes
< the first process to enter the signal handler? I've (carefully) read through
< the signal man page several times and tried a number of variations of the
< SIGCLD signal catcher, but I still find that the parent process occasionally
< gets stuck in the printf() routine when it is about to print that the
< display code is shut down (printf() -> Psema() -> blockproc() ...). This
< printf() is part of the signal handling code.

Never ever call printf() or any of the <stdio.h> functions from a signal
handler, unless perhaps that is the only use of stdio in the program.  Doing
so can lead to weird and potentially fatal behavior if the signal interrupts
a stdio function, because the handler then re-enters printf() while the
buffering mechanism is not in a stable state.  Signal handlers should do as
little as possible, and should avoid library routines that maintain internal
state which the signal could have interrupted; malloc() is another good
example.

If possible, set a global variable from your handler which can be detected
and acted on from the main runstream.  This can be difficult, depending on
the structure of your program, and therefore is best built into the design
from the start.
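
As a rough illustration of the flag approach, here is a minimal sketch in
ISO C.  It is not taken from the renderer: the names abort_requested,
on_sigint(), install_handler() and render_items() are made up for the
example, and render_items() merely stands in for whatever loop hands out
the work.  The handler only sets a flag; the printf() happens later, from
the main flow of control.

```c
#include <assert.h>
#include <signal.h>
#include <stdio.h>

/* Set by the handler, polled by the main flow of control. */
static volatile sig_atomic_t abort_requested = 0;

/* The handler does nothing except record that ^C was seen. */
static void on_sigint(int signo)
{
    (void)signo;
    abort_requested = 1;
}

/* Install the handler; returns 0 on success, -1 on failure. */
static int install_handler(void)
{
    return signal(SIGINT, on_sigint) == SIG_ERR ? -1 : 0;
}

/*
 * Hypothetical stand-in for the display loop: do up to max_items units
 * of work, stopping early once the flag is set.  Returns the number of
 * units completed.
 */
static int render_items(int max_items)
{
    int done = 0;

    assert(max_items >= 0);
    while (done < max_items && !abort_requested) {
        /* ... one unit of rendering work would go here ... */
        done++;
    }
    if (abort_requested) {
        /* Safe here: we are no longer inside the signal handler. */
        printf("display aborted after %d of %d items\n", done, max_items);
    }
    return done;
}
```

A caller would run install_handler() once at startup and clear
abort_requested before starting each new display command.  How often the
loop checks the flag decides how quickly ^C takes effect.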

jwag@moose.sgi.com (Chris Wagner) (03/06/90)

In article <9003051013.aa18526@VMB.BRL.MIL>, moss@BRL.MIL ("Gary S. Moss",
VLD/VMB) writes:
> Rob Lansdale writes:
> <	So how do I ensure that the parent process gets woken and becomes
> < the first process to enter the signal handler? I've (carefully) read through
> < the signal man page several times and tried a number of variations of the
> < SIGCLD signal catcher, but I still find that the parent process occasionally
> < gets stuck in the printf() routine when it is about to print that the
> < display code is shut down (printf() -> Psema() -> blockproc() ...). This
> < printf() is part of the signal handling code.
> 
> Never ever call printf() or any of the <stdio.h> functions from a signal
> handler, unless perhaps that is the only use of stdio in the program.  Doing
> so can lead to weird and potentially fatal behavior if the signal interrupts
> a stdio function, because the handler then re-enters printf() while the
> buffering mechanism is not in a stable state.  Signal handlers should do as
> little as possible, and should avoid library routines that maintain internal
> state which the signal could have interrupted; malloc() is another good
> example.
> 
> If possible, set a global variable from your handler which can be detected
> and acted on from the main runstream.  This can be difficult, depending on
> the structure of your program, and therefore is best built into the design
> from the start.

This is always good advice. In addition, since printf and friends are
automatically protected from multiple shared processes accessing them
simultaneously, a signal can easily interrupt a printf while it holds the
lock protecting the printf buffers.

As for the more general question, using shared processes is no different
from using normal Unix processes: only the parent will get SIGCLD if a child
dies, and the children will not get signaled if the parent dies (unless the
parent is a process group leader). Since most application writers seem to
prefer that all the children/slave threads die if the 'master' thread dies,
in an upcoming release we are adding a function that binds share group
members together more tightly.

As for SIGINT, which threads get the signal depends on what the disposition
of the signal was when you did the sproc() - at that point each slave gets
its own copy (just like fork()) of the signal mask and disposition.  Thus
signal handling is not currently a shared resource...

If you wish that the master/parent handle a signal differently than the
slaves, then first set up the handler as you want the slaves to receive it,
spawn the slaves, then in the parent change the handler to the one you want
for just the parent...
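
A minimal sketch of that ordering follows.  It uses fork() as a stand-in
for sproc(), on the assumption stated above that the disposition is copied
the same way, and it has not been tried on IRIX.  The handler names, the
exit status 42 and spawn_and_signal() are made up for the example.  Each
process raises SIGINT on itself only to show which handler it ended up
with.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t parent_saw_sigint = 0;

/* What the slaves should do on SIGINT: exit at once, quietly. */
static void slave_on_sigint(int signo)
{
    (void)signo;
    _exit(42);               /* arbitrary status, recognizable by the parent */
}

/* What only the master should do on SIGINT: note it for later. */
static void master_on_sigint(int signo)
{
    (void)signo;
    parent_saw_sigint = 1;
}

/* Returns the child's exit status, or -1 on error. */
static int spawn_and_signal(void)
{
    int status = 0;
    pid_t pid;

    /* 1. Install the disposition the slaves should inherit. */
    if (signal(SIGINT, slave_on_sigint) == SIG_ERR)
        return -1;

    /* 2. Spawn the slave; it receives its own copy of the disposition. */
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        raise(SIGINT);       /* runs slave_on_sigint() in the child */
        _exit(0);            /* only reached if the handler did not run */
    }

    /* 3. In the parent only, switch to the master's handler. */
    if (signal(SIGINT, master_on_sigint) == SIG_ERR)
        return -1;
    raise(SIGINT);           /* runs master_on_sigint() in the parent */

    if (waitpid(pid, &status, 0) != pid)
        return -1;
    assert(WIFEXITED(status));
    return WEXITSTATUS(status);
}
```

After spawn_and_signal() returns, the child has exited with the slave
handler's status while the parent has only set parent_saw_sigint, so the
two ended up with different dispositions even though both were set from
the same code path.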

Chris Wagner (jwag@sgi.com)