coleman@cam.nist.gov (Sean Sheridan Coleman X5672) (02/12/91)
Please explain to me why a SIGCONT is sent to a process after
SIGTERM is sent to my process. It doesn't compute because TERM
means to terminate the process. I catch SIGCONT because I
do some reconnecting for serial drivers after my process is
stopped from a cntl-Z. Below is a piece of the code and
some output from the program.

Here I stop the program with a ^Z and restart using fg %1. SIGCONT
is sent in this situation correctly.

    <deputy /home/central/coleman/real_prog/net.dir/net_log> % net l logfile
    ^Z
     Signal caught is 18

    Stopped (signal)
    <deputy /home/central/coleman/real_prog/net.dir/net_log> % jobs
    [1]  + Stopped (signal)     net l logfile
    <deputy /home/central/coleman/real_prog/net.dir/net_log> % fg %1
    net l logfile
     Signal caught is 19
    ^C
     Signal caught is 2

From another window, I used kill -TERM to kill this process. SIGTERM
is received first but then SIGCONT is sent for no known reason.

    <deputy /home/central/coleman/real_prog/net.dir/net_log> % !ne
    net l logfile
     Signal caught is 15
     Signal caught is 19
    No devices are available to use for logging

Here is the signal handler:

Note: device, device_file and device_name are global

    sig_handler(sig)
    int sig;
    {
        extern int device;
        extern FILE *device_file;
        extern char *device_name;
        char *strip_add_dev_name();

        printf(" Signal caught is %d\n",sig);
        switch(sig)
        {
        case SIGINT:
        case SIGTERM:
            unlock_dev(strip_add_dev_name(ttyname(device),0));
            exit(1);
        case SIGTSTP:
            unlock_dev(strip_add_dev_name(ttyname(device),0));
            close(device);
            kill(0,SIGSTOP);
            break;
        case SIGCONT:
            if(device_file != NULL)
            {
                rewind(device_file);
                device = get_device(device_file);
            }
            else
            {
                if((device = chk_device(device_name)) < 0)
                {
                    printf("No devices are available to use for logging\n");
                    exit(1);
                }
            }
        default:
            break;
        }
    }

Thanks

Sean Coleman
coleman@bldrdoc.gov
NIST
Boulder, CO
richard@locus.com (Richard M. Mathews) (02/12/91)
coleman@cam.nist.gov (Sean Sheridan Coleman X5672) writes:
>Please explain to me why a SIGCONT is sent to a process after
>SIGTERM is sent to my process. It doesn't compute because TERM
>means to terminate the process. I catch SIGCONT because I
>do some reconnecting for serial drivers after my process is
>stopped from a cntl-Z. Below is a piece of the code and
>some output from the program.

This is yet another example of C Shell brain damage. The shell thinks
it is going to do you a favor. When it sends SIGTERM or SIGHUP it
follows it with a SIGCONT. The "problem" that the shell is trying to
solve is that a signal sent to a stopped process won't get processed
until the process resumes -- since you apparently wanted the process to
die, the shell sends a SIGCONT just to make sure the process will be
able to get to your signal right away. Wrong answer. It doesn't solve
the problem in general for all signals, and it creates about as much
confusion as it tries to avoid.

The solution is: don't catch SIGCONT. Your SIGTSTP handler knows when
the program resumes anyway, because the line after the "kill" which
caused it to suspend itself will not be reached until the program is
resumed. Taking the code you have for SIGCONT and putting it there is
the "normal" way to do things.

Richard M. Mathews                      Freedom for Lithuania
richard@locus.com                             Laisve!
lcc!richard@seas.ucla.edu
...!{uunet|ucla-se|turnkey}!lcc!richard
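A minimal sketch of the arrangement Richard describes, in which the resume work sits on the line after the self-stop inside the SIGTSTP handler instead of in a SIGCONT handler. The names release_device(), reacquire_device(), install_tstp_handler() and demo_suspend_resume() are hypothetical stand-ins, not the poster's functions, and the sketch stops only the calling process with kill(getpid(), SIGSTOP) rather than the whole process group as the original kill(0, SIGSTOP) does.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-ins for the poster's unlock_dev()/close() and
 * get_device()/chk_device() logic; they only count calls here. */
static volatile sig_atomic_t released;
static volatile sig_atomic_t reacquired;

static void release_device(void)   { released++; }
static void reacquire_device(void) { reacquired++; }

static void on_tstp(int sig)
{
    (void)sig;
    release_device();
    /* Stop this process only. SIGSTOP cannot be caught, so the
     * process really does stop inside this call. */
    kill(getpid(), SIGSTOP);
    /* Not reached until some SIGCONT has resumed the process. */
    reacquire_device();
}

int install_tstp_handler(void)
{
    struct sigaction sa;

    sa.sa_handler = on_tstp;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGTSTP, &sa, NULL);
}

/* Forks a child that suspends itself through the handler, resumes it
 * with SIGCONT, and returns 1 if the child saw exactly one release and
 * one reacquire, 0 if not, -1 on setup failure. */
int demo_suspend_resume(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (install_tstp_handler() != 0)
            _exit(2);
        raise(SIGTSTP);             /* child stops inside on_tstp() */
        _exit(released == 1 && reacquired == 1 ? 0 : 1);
    }
    /* Parent: wait for the stop, then continue the child. */
    if (waitpid(pid, &status, WUNTRACED) != pid || !WIFSTOPPED(status))
        return -1;
    kill(pid, SIGCONT);
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

With no SIGCONT handler installed, the extra SIGCONT that csh sends after SIGTERM has nothing to trigger. The sketch does not cover a SIGSTOP sent from outside the process, which never runs the handler at all.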
conger@hpcupt1.cup.hp.com (Edward Conger) (02/13/91)
/ hpcupt1:comp.unix.wizards / coleman@cam.nist.gov (Sean Sheridan Coleman X5672) / 9:06 am Feb 11, 1991 /
>Please explain to me why a SIGCONT is sent to a process after
>SIGTERM is sent to my process. It doesn't compute because TERM
>means to terminate the process.

The distinction is that sending a signal to a process (usually|often)
is implemented by setting a bit in a flag word associated with the
"victim process". The action of *send*ing the signal doesn't terminate
the process; rather, it says, "when next you run in the kernel (either
via a system call or a timeslice (usually ~ 1/100 sec)), you should go
handle this signal." In the case of SIGTERM, the default behaviour is
to TERMinate.

Now suppose the victim process is stopped (either by job control,
SIGSTOP, or via debugging): it will NOT see the bit set in the flag
word until it runs again. The SIGCONT gets it unstopped and it runs
long enough to terminate.

Your mileage (and implementation) may vary, but this is the general
gist of the problem.

>Thanks
>Sean Coleman
>coleman@bldrdoc.gov
>NIST
>Boulder, CO
>----------

Hope this helps,
-Ed.
===========================================================================
The above is an official statement of MeMyself & I Inc. It should not be
interpreted to be an official statement of any other likely targets,
including, but not limited to, Hewlett-Packard Co., ACME Rockets, ACME
Rubber Bands, ACME Consolidated Mining Engineering, or the Home for
Damaged Coyotes.
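The behaviour Ed describes can be observed from user space. The sketch below is an illustration rather than anything from the thread: the function name term_waits_for_cont() and the 100 ms pause are assumptions. It stops a child, sends it SIGTERM, checks that the child is still alive, and only then sends SIGCONT.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Returns 1 if a SIGTERM sent to a stopped child takes effect only
 * after SIGCONT, 0 if the child died early or died some other way,
 * -1 if fork() failed. */
int term_waits_for_cont(void)
{
    struct timespec pause_100ms = { 0, 100 * 1000 * 1000 };
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        for (;;)
            pause();            /* child: default dispositions, just wait */
    }

    /* Stop the child and wait until the stop has been reported. */
    if (kill(pid, SIGSTOP) != 0 ||
        waitpid(pid, &status, WUNTRACED) != pid || !WIFSTOPPED(status))
        goto fail;

    /* SIGTERM is now pending on a stopped process. */
    kill(pid, SIGTERM);
    nanosleep(&pause_100ms, NULL);
    if (waitpid(pid, &status, WNOHANG) != 0)
        goto fail;              /* child changed state before SIGCONT */

    /* Continuing the child lets it act on the pending SIGTERM. */
    kill(pid, SIGCONT);
    if (waitpid(pid, &status, 0) != pid)
        return 0;
    return WIFSIGNALED(status) && WTERMSIG(status) == SIGTERM;

fail:
    kill(pid, SIGKILL);
    waitpid(pid, &status, 0);
    return 0;
}
```

A caller would expect a return value of 1 on systems that behave as Ed describes; as he says, implementations may vary.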
src@scuzzy.in-berlin.de (Heiko Blume) (02/14/91)
richard@locus.com (Richard M. Mathews) writes:
>The solution is don't catch SIGCONT. Your SIGTSTP handler knows when
>the program resumes anyway because the line after the "kill" which
>caused it to suspend itself will not be reached until the program is
>resumed.

which fails miserably when you get the uncatchable SIGSTOP. *yes*, i do
use SIGSTOP, there are programs that disable the SIGTSTP feature, and i
won't let those go unsuspended.
-- 
Heiko Blume <-+-> src@scuzzy.in-berlin.de <-+-> (+49 30) 691 88 93
public source archive [HST V.42bis]:
scuzzy Any ACU,f 38400 6919520 gin:--gin: nuucp sword: nuucp
uucp scuzzy!/src/README /your/home
bhoughto@pima.intel.com (Blair P. Houghton) (02/14/91)
In article <67880001@hpcupt1.cup.hp.com> conger@hpcupt1.cup.hp.com (Edward Conger) writes:
>The SIGCONT gets it unstopped and it runs long enough to
>terminate.
>Your mileage (and implementation) may vary, but this is the general gist of
>the problem.

Not the least of those variances is that signals may be queued,
so that the SIGCONT may simply be waking the process up only
to watch it go to sleep again (unless the SIGTERM can somehow
butt into the queue).

				--Blair
				  "Dave?  Dave's not here..."
allbery@NCoast.ORG (Brandon S. Allbery KB8JRR) (02/15/91)
As quoted from <7103@fs1.cam.nist.gov> by coleman@cam.nist.gov (Sean Sheridan Coleman X5672):
+---------------
| Please explain to me why a SIGCONT is sent to a process after
| SIGTERM is sent to my process. It doesn't compute because TERM
+---------------

Being suspended, it wouldn't execute the signal handler unless it were
continued. Also, I think the exit processing in the kernel needs this.

(So why does the SIGCONT handler run after the SIGTERM handler? Because
signal handlers are invoked in signal-number order.)

++Brandon
(BSD folks feel free to correct me.)
-- 
Me: Brandon S. Allbery                      VHF/UHF: KB8JRR on 220, 2m, 440
Internet: allbery@NCoast.ORG                Packet: KB8JRR @ WA8BXN
America OnLine: KB8JRR                      AMPR: KB8JRR.AmPR.ORG [44.70.4.88]
uunet!usenet.ins.cwru.edu!ncoast!allbery    Delphi: ALLBERY
torek@elf.ee.lbl.gov (Chris Torek) (02/18/91)
In article <2519@inews.intel.com> bhoughto@pima.intel.com
(Blair P. Houghton) writes:
>Not the least of those variances is that signals may be queued ....

Signals are not queued. As far as I know, there is only one piece of
one Unix manual that claims otherwise (that being the System V SIGCLD
documentation), and it lies. Signals are never queued. System V SIGCLD
signals use a different trick that causes properly-coded wait routines
to be called once per exited child, but which causes improperly-coded
wait routines to recurse indefinitely.
-- 
In-Real-Life: Chris Torek, Lawrence Berkeley Lab EE div (+1 415 486 5427)
Berkeley, CA		Domain:	torek@ee.lbl.gov
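One way to see the "not queued" behaviour from user space is to block a signal, send it several times, and count handler invocations after unblocking. This is a sketch under assumptions: the function name count_coalesced_deliveries() is hypothetical, SIGUSR1 is an arbitrary choice, and the process is assumed to be single-threaded. POSIX leaves it implementation-defined whether repeated instances of an ordinary pending signal are delivered more than once, so the count of one is what the systems discussed in this thread do rather than a guarantee.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

static volatile sig_atomic_t hits;

static void on_usr1(int sig)
{
    (void)sig;
    hits++;
}

/* Blocks SIGUSR1, sends it `sends` times to this process, unblocks it,
 * and returns how many times the handler actually ran (-1 on error). */
int count_coalesced_deliveries(int sends)
{
    struct sigaction sa, old;
    sigset_t block, prev;
    int i;

    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGUSR1, &sa, &old) != 0)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    sigprocmask(SIG_BLOCK, &block, &prev);

    hits = 0;
    for (i = 0; i < sends; i++)
        kill(getpid(), SIGUSR1);    /* each send sets the same pending bit */

    /* Restoring the old mask delivers whatever is pending. */
    sigprocmask(SIG_SETMASK, &prev, NULL);

    sigaction(SIGUSR1, &old, NULL);
    return hits;
}
```

If signals were queued, three sends would produce three handler calls; with a single pending bit per signal, the handler runs once.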
bhoughto@hopi.intel.com (Blair P. Houghton) (02/19/91)
In article <10007@dog.ee.lbl.gov> torek@elf.ee.lbl.gov (Chris Torek) writes:
>In article <2519@inews.intel.com> bhoughto@pima.intel.com
>(Blair P. Houghton) writes:
>>Not the least of those variances is that signals may be queued ....
>
>Signals are not queued.

Something's stacking them up.

I've run into situations more than once where I've tried to
stop a process and the stop has hung, usually due to
something else's being stuck (an NFS access, e.g.)

I've sent the stop again, and when the block clears I see the
process stop. When I tell the process to continue, the
first thing it does is stop itself again.

Who's doing it? The kernel or csh(1)? The tty driver? Or
is it just a matter of a stuck process queue? I can't
imagine all the kills not being done by the time I've typed
in the command to continue...

				--Blair
				  "It also happens under VMS, but
				   I'll keep mention of that 'Fine'
				   system to a minimum..."
torek@elf.ee.lbl.gov (Chris Torek) (02/22/91)
>In article <10007@dog.ee.lbl.gov> I wrote:
>>Signals are not queued.

In article <2588@inews.intel.com> bhoughto@hopi.intel.com
(Blair P. Houghton) writes:
>Something's stacking them up.

Well, not really:

>I've run into situations more than once where I've tried to
>stop a process and the stop has hung, usually due to
>something else's being stuck (an NFS access, e.g.)

I have no idea what `the stop has hung' means.

>I've sent the stop again, and when the block clears I see the
>process stop. When I tell the process to continue, the
>first thing it does is stop itself again.
>Who's doing it? The kernel or csh(1)? The tty driver? Or
>is it just a matter of a stuck process queue?

The most likely cause is the program itself. (This also depends on
which stop signal you use.) A number of programs contain code
resembling the following:

	/* broken function to catch SIGTSTP (^Z) */
	void
	catch_stop()
	{
		/* put the terminal modes back */
		clean_tty();
		/* reenable stops */
		signal(SIGTSTP, SIG_DFL);	/* bug */
		/* stop ourselves */
		kill(0, SIGTSTP);		/* bug */
		/* resumes here */
		signal(SIGTSTP, catch_stop);
		dirty_tty();
	}

This particular catcher is full of race conditions. The most
interesting problem, however, is the

	kill(0, SIGTSTP);

This sends another stop signal to the entire process group. If there
are two processes in a pipe, both using code like this, they end up
sending each other barrages of stop signals. With the code shown above,
it is possible (though unlikely) to have two processes in a pipeline
`trade off' stops, so that you run:

	% foo | bar
	^Z
	Stopped
	% fg
	Stopped
	% fg
	Stopped
	% fg
	Stopped
	%

In each case either foo or bar `wins the race' and stops both; then,
when you foreground, either foo or bar wins again, and stops both,
and . . . .

This does not happen with only one process, though a different sort of
race can lead to two stops from two different signals, despite the
SIGCONT description below: If the process takes a SIGTSTP, and then
stops on a SIGSTOP during, e.g., the clean_tty() call above, it can
then send itself a SIGTSTP as soon as it is resumed.

The (4.3++) kernel implementation of signals is simply four bit vectors:

 - signals currently pending (p_sig)
 - signals currently held (p_sigmask)
 - signals being caught (p_sigcatch)
 - signals being ignored (p_sigignore)

Some of these fields exist only to optimise signal dispatching. The
most important thing is that a signal is delivered to a process with:

	p->p_sig |= mask;

and a process takes a signal if:

	(p->p_sig & ~p->p_sigmask) != 0

The signal it takes is the lowest-numbered one that is pending. When
a signal is taken the corresponding bit is removed from p->p_sig.
Typically, the same bit is set in p->p_sigmask, blocking further
delivery of that particular signal. (The new mask comes about from
the signal mask in the [4.2-style] sigvec or [POSIX-style] sigaction.)

When a SIGCONT signal is sent (not delivered, merely sent!), p->p_sig
has `stopsigmask' (the masks for SIGTSTP, SIGSTOP, SIGTTIN, and SIGTTOU)
removed, so one SIGCONT clears up to four stops. This can clear stops
that have never been taken.
-- 
In-Real-Life: Chris Torek, Lawrence Berkeley Lab EE div (+1 415 486 5427)
Berkeley, CA		Domain:	torek@ee.lbl.gov
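The bit-vector description above can be turned into a small user-space toy. This is a sketch of the description only, not kernel code: struct proc_model, model_send() and model_take() are hypothetical names, the signal numbers are the 4.3BSD ones (matching the 15, 18 and 19 printed in the original post), and the step that adds the taken signal to p_sigmask is left out for brevity.

```c
#include <assert.h>
#include <stdint.h>

/* 4.3BSD signal numbers, prefixed to avoid clashing with <signal.h>. */
enum {
    M_SIGTERM = 15, M_SIGSTOP = 17, M_SIGTSTP = 18,
    M_SIGCONT = 19, M_SIGTTIN = 21, M_SIGTTOU = 22
};

#define SIGBIT(n)   ((uint32_t)1 << ((n) - 1))
#define STOPSIGMASK (SIGBIT(M_SIGSTOP) | SIGBIT(M_SIGTSTP) | \
                     SIGBIT(M_SIGTTIN) | SIGBIT(M_SIGTTOU))

/* Only the two vectors needed for dispatch are modelled. */
struct proc_model {
    uint32_t p_sig;      /* signals currently pending */
    uint32_t p_sigmask;  /* signals currently held (blocked) */
};

/* "Sending" a signal only sets a bit, so repeated sends collapse into
 * one. Sending SIGCONT also discards any pending stop signals. */
void model_send(struct proc_model *p, int sig)
{
    assert(sig >= 1 && sig <= 32);
    if (sig == M_SIGCONT)
        p->p_sig &= ~STOPSIGMASK;
    p->p_sig |= SIGBIT(sig);
}

/* Takes the lowest-numbered pending, unblocked signal and clears its
 * bit. Returns 0 when nothing can be taken. */
int model_take(struct proc_model *p)
{
    uint32_t ready = p->p_sig & ~p->p_sigmask;
    int sig;

    for (sig = 1; sig <= 32; sig++) {
        if (ready & SIGBIT(sig)) {
            p->p_sig &= ~SIGBIT(sig);
            return sig;
        }
    }
    return 0;
}
```

Sending SIGTERM and then SIGCONT to an empty model and taking twice yields 15 and then 19, the order shown in the original post's output. Sending two SIGTSTPs and a SIGSTOP followed by one SIGCONT leaves only the SIGCONT to take.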