[comp.unix.wizards] signals like interrupts?

edler@cmcl2.NYU.EDU (Jan Edler) (07/13/87)

Often the comment has been made that "Berkeley signals operate just
like hardware interrupts", or something like that.  I would like to
discuss two particular ways in which this is not really true.  The
problems are the poor handling of synchronous signals and the lack of
signal queuing.

It seems to me that signals should be considered to fall into one of
three categories, depending on whether they are sent
	- Synchronously, as a result of program execution
	  (e.g. SIGSEGV, SIGILL),
	- Asynchronously by the system, as a service to the user
	  (e.g. SIGALRM, SIGINT, SIGCHLD), or
	- Asynchronously by another process
	  (e.g. SIGTERM, SIGUSR1, SIGUSR2).
The problem (not Berkeley's; it goes all the way back to early UNIX) is
that the kernel tries to consider all signals in the same way.  Except
for the silent prohibition on ignoring or blocking SIGKILL and SIGSTOP,
you can ignore or block any signal.  You can also send any signal with
the kill system call.
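
A minimal sketch of the uniform treatment described above, assuming a
POSIX system (the function names are made up for illustration).  It
shows that SIGKILL cannot be ignored, while SIGSEGV can be ignored
and then sent with kill() like any other signal:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <unistd.h>

/* Returns 1 if the kernel refuses to let SIGKILL be ignored. */
int sigkill_ignore_refused(void)
{
    return signal(SIGKILL, SIG_IGN) == SIG_ERR;
}

/* Ignores SIGSEGV, sends it to ourselves with kill(), and returns 1
 * if the process is still running afterwards.  The previous
 * disposition is restored before returning. */
int survives_ignored_sigsegv(void)
{
    void (*old)(int) = signal(SIGSEGV, SIG_IGN);

    if (old == SIG_ERR)
        return 0;
    kill(getpid(), SIGSEGV);   /* discarded: the signal is ignored */
    signal(SIGSEGV, old);
    return 1;
}
```

Note that only the kill()-sent case is well defined here.  POSIX leaves
the behavior undefined when a process ignores a SIGSEGV that was
generated by a real memory fault.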

On machines like the pdp11, vax, and 68000, one of the distinctions
between interrupts and exceptions is that the latter can't be blocked.
I believe that, in general, signals in the first category above should
not be ignorable or blockable.  I don't even know what the semantics
should BE when they are ignored!  The only counter-example I can think
of is a signal generated by an exception that can be reasonably
ignored, such as integer overflow on the vax.

Another characteristic of these machines is that peripherals don't
generate exceptions, so when you get a trap through the
illegal-memory-reference vector, you know that you really did get one,
and it wasn't caused by some peripheral board interrupting at the wrong
level or something like that.  I believe that signals of the first
category above (and probably those in the second category as well)
should not be sendable by kill() or killpg().  I know someone will say
"but I like to test my program's SIGSEGV handler by sending SIGSEGV
from the shell", but while this may be a convenience when debugging
relatively small programs I think it is a hindrance to construction of
reliable large systems.

Now let us consider queuing of signals.  It is true that none of the
machines listed above has specific hardware to queue interrupts, but if
the same kind of interrupt is pending from more than one device at the
same time they are effectively queued by the interrupt priority
mechanism.  These machines also have a handshaking protocol between the
sender of an interrupt (i.e. a device controller) and the receiver
(i.e. the cpu).  When a controller asserts a pending interrupt, it
remains asserted until taken by the cpu.  It is up to the controller to
handle the case where the interrupt has not been taken by the cpu
before the next interrupt needs to be sent.  The controller may lose a
second cause for interrupt, or it may just lose data instead, or it may
set a status bit of some kind indicating that an interrupt was lost, or
it may keep an internal count and ensure that the proper number of
interrupts is taken by the cpu.  So the problem is dealt with only
indirectly by the architecture, by putting it off onto the device
controllers.  The result is that many controllers probably don't handle
it very well, although it is usually not a big problem given the nature
of the devices.

The point is that while it wouldn't be accurate to say that "interrupts
are queued" on these machines, it is also not true that they share the
UNIX problem of lost signals, because they have a priority system
and support a kind of handshaking between the sender and receiver that
is not really available to the UNIX programmer.  So the statement that
signals should be "just like interrupts" is not a very good argument
against queuing signals.
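
The lost-signal problem can be sketched as follows, assuming a POSIX
system with sigaction() and sigprocmask().  The function name is
hypothetical.  SIGUSR1 is sent several times while blocked, and the
handler is counted when it is unblocked:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t usr1_deliveries;

static void count_usr1(int signo)
{
    (void)signo;
    usr1_deliveries++;
}

/* Sends SIGUSR1 to ourselves `sends` times while it is blocked, then
 * unblocks it and returns how many times the handler actually ran. */
int count_coalesced_deliveries(int sends)
{
    struct sigaction sa, old_sa;
    sigset_t block, old_mask;
    int i;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = count_usr1;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGUSR1, &sa, &old_sa);

    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    sigprocmask(SIG_BLOCK, &block, &old_mask);

    usr1_deliveries = 0;
    for (i = 0; i < sends; i++)
        kill(getpid(), SIGUSR1);   /* later sends find it already pending */

    sigprocmask(SIG_UNBLOCK, &block, NULL);   /* pending signal delivered */

    /* Put the caller's mask and handler back. */
    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    sigaction(SIGUSR1, &old_sa, NULL);
    return usr1_deliveries;
}
```

On Linux, count_coalesced_deliveries(3) returns 1: the pending state is
a single bit per signal, so the second and third sends are lost.  POSIX
leaves it implementation-defined whether other systems deliver more.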

Incidentally, the main reason I want to queue signals is that I want to
use them for asynchronous I/O completions.  The fact that they also
come in handy for SIGCHLD and as a more general reliable ipc mechanism
is just fine.

Jan Edler
NYU Ultracomputer project
edler@nyu.edu
cmcl2!edler
(212) 998-3353

allbery@ncoast.UUCP (Brandon Allbery) (07/19/87)

As quoted from <17691@cmcl2.NYU.EDU> by edler@cmcl2.UUCP:
(grouped hardware signals (SIGILL, SIGSEGV, etc.), then kernel notifications
(SIGALRM, etc.), then software notifications (SIGTERM, SIGUSR1, SIGUSR2)...)
+---------------
| level or something like that.  I believe that signals of the first
| category above (and probably those in the second category as well)
| should not be sendable by kill() or killpg().  I know someone will say
| "but I like to test my program's SIGSEGV handler by sending SIGSEGV
| from the shell", but while this may be a convenience when debugging
| relatively small programs I think it is a hindrance to construction of
| reliable large systems.
+---------------

Cases where trapping hardware signals is desirable:

(1) my malloc() debugger package (soon in comp.sources.misc) dumps the malloc
tables when an invalid memory access (SIGBUS, SIGSEGV) occurs; this allows
the programmer to determine if it was a bogus pointer reference.

(2) I often send SIGBUS to a process which traps SIGQUIT and then goes into
an infinite loop.  I want the core dump, else I would use SIGKILL.  Remember:
_no_ program is ever totally bug-free!

As for queueing:  what's wrong with the System V technique?  Send a message,
then a SIGUSR1; it doesn't matter how many SIGUSR1's are lost as long as you
get one of them, and the IPC msg mechanism does the queueing.  You can send
data asynchronously this way as well.  The SIGUSR1 handler can deal with the
messages.  (To forestall sV/BSD wars:  it's my considered opinion that each
version of UNIX has good and bad features.  And some things were botched by
both (sockets vs. so-called "streams" vs. Ritchie's streams in V8).)
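
A rough sketch of the message-then-signal technique.  To keep it
self-contained, an ordinary pipe stands in for the System V message
queue (that substitution, and all names below, are assumptions of this
sketch).  The handler only records that something arrived; the channel
does the queueing:

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t work_pending;

static void note_work(int signo)
{
    (void)signo;
    work_pending = 1;          /* only a hint: "go look at the queue" */
}

/* Posts `n` messages, each followed by SIGUSR1, while the signal is
 * blocked; then unblocks it and drains the channel.  Returns the
 * number of messages recovered, or -1 if the pipe cannot be made. */
int drain_after_signal(int n)
{
    int fds[2], i, msg, got = 0;
    struct sigaction sa, old_sa;
    sigset_t block, old_mask;

    if (pipe(fds) < 0)
        return -1;
    fcntl(fds[0], F_SETFL, O_NONBLOCK);   /* lets the drain loop stop */

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = note_work;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGUSR1, &sa, &old_sa);

    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    sigprocmask(SIG_BLOCK, &block, &old_mask);

    work_pending = 0;
    for (i = 0; i < n; i++) {
        if (write(fds[1], &i, sizeof i) != (ssize_t)sizeof i)
            break;
        kill(getpid(), SIGUSR1);          /* most of these coalesce */
    }
    sigprocmask(SIG_UNBLOCK, &block, NULL);

    /* One signal is enough: read until the channel is empty. */
    if (work_pending)
        while (read(fds[0], &msg, sizeof msg) == (ssize_t)sizeof msg)
            got++;

    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    sigaction(SIGUSR1, &old_sa, NULL);
    close(fds[0]);
    close(fds[1]);
    return got;
}
```

The drain loop runs outside the handler because few calls are safe to
make from inside one; a real System V version would use msgrcv() with
IPC_NOWAIT in the same position.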
-- 
[Copyright 1987 Brandon S. Allbery, all rights reserved] \ ncoast 216 781 6201
[Redistributable only if redistribution is subsequently permitted.] \ 2400 bd.
Brandon S. Allbery, moderator of comp.sources.misc and comp.binaries.ibm.pc
{{ames,harvard,mit-eddie}!necntc,{well,ihnp4}!hoptoad,cbosgd}!ncoast!allbery
<<The opinions herein are those of my cat, therefore they must be correct!>>

edler@cmcl2.NYU.EDU (Jan Edler) (07/20/87)

In article <3254@ncoast.UUCP> allbery@ncoast.UUCP (Brandon Allbery) writes:
>Cases where trapping hardware signals is desirable:
>
>(1) my malloc() debugger package (soon in comp.sources.misc) dumps the malloc
>tables when an invalid memory access (SIGBUS, SIGSEGV) occurs; this allows
>the programmer to determine if it was a bogus pointer reference.

Clearly there is a need to trap hardware-generated signals; what I am
denying is that there is any need to block or ignore them.

>(2) I often send SIGBUS to a process which traps SIGQUIT and then goes into
>an infinite loop.  I want the core dump, else I would use SIGKILL.  Remember:
>_no_ program is ever totally bug-free!

Perhaps what is needed is a non-catchable terminate-with-coredump signal.
Remember: the program might be catching SIGBUS too.  Also, the existence
of a user-sendable SIGBUS makes it hard to do implementation-specific
extensions such as passing extra arguments to the handler in a reliable way.
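
For comparison, later POSIX systems address the extra-arguments problem
with the SA_SIGINFO flag: the handler receives a siginfo_t whose
si_code says how the signal arose.  A small sketch, with hypothetical
function names, that checks a kill()-sent SIGBUS is marked as such:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t sigbus_seen;
static volatile sig_atomic_t sigbus_from_user;

static void on_sigbus(int signo, siginfo_t *info, void *context)
{
    (void)signo;
    (void)context;
    /* SI_USER means "sent by kill()", not a real bus error. */
    sigbus_from_user = (info->si_code == SI_USER);
    sigbus_seen = 1;
}

/* Returns 1 if a SIGBUS sent with kill() reaches the handler and is
 * reported as user-sent. */
int kill_sent_sigbus_is_marked(void)
{
    struct sigaction sa, old_sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_sigbus;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGBUS, &sa, &old_sa) < 0)
        return 0;

    sigbus_seen = 0;
    sigbus_from_user = 0;
    kill(getpid(), SIGBUS);    /* handled before kill() returns */

    sigaction(SIGBUS, &old_sa, NULL);
    return sigbus_seen && sigbus_from_user;
}
```

A handler written this way can refuse to trust fault details such as
si_addr unless si_code indicates a genuine hardware fault.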

>As for queueing:  what's wrong with the System V technique?  Send a message,
>then a SIGUSR1; it doesn't matter how many SIGUSR1's are lost as long as you
>get one of them, and the IPC msg mechanism does the queueing.  You can send
>data asynchronously this way as well.  The SIGUSR1 handler can deal with the
>messages.

It all depends on what your application is.  For most traditional uses of
signals, it doesn't matter whether they are queued or not.  For message
passing over a separate ipc channel, it is also unnecessary to queue signals.
But if you need to be able to count them, or if the signals are accompanied
by extra information (e.g. asynchronous i/o completion info), then requiring
use of a separate ipc channel is at least awkward.
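
Queued signals that carry extra information did later appear in POSIX
as the real-time signals.  The sketch below assumes a system that
provides sigqueue() and sigtimedwait() (Linux does; not every UNIX
does).  The function name and the use of small integers as request
ids are illustrative only:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <time.h>
#include <unistd.h>

/* Queues `n` real-time signals to ourselves, each tagged with a
 * made-up request id 1..n, then collects them one at a time.
 * Returns the sum of the ids received, or -1 if queuing failed. */
int sum_queued_completions(int n)
{
    sigset_t set, old_mask;
    struct timespec no_wait = {0, 0};
    siginfo_t info;
    union sigval tag;
    int signo = SIGRTMIN, i, sum = 0;

    sigemptyset(&set);
    sigaddset(&set, signo);
    sigprocmask(SIG_BLOCK, &set, &old_mask);   /* hold them pending */

    for (i = 1; i <= n; i++) {
        tag.sival_int = i;
        if (sigqueue(getpid(), signo, tag) < 0) {
            sum = -1;
            break;
        }
    }

    /* Each call removes one queued signal; it fails once none remain. */
    while (sigtimedwait(&set, &info, &no_wait) == signo)
        if (sum >= 0)
            sum += info.si_value.sival_int;

    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    return sum;
}
```

Unlike SIGUSR1, every sigqueue() call here is counted, and each one
arrives with its own value, which is the shape an asynchronous i/o
completion notice needs.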

Jan Edler
NYU Ultracomputer project
edler@nyu.edu
...!cmcl2!edler
(212) 998-3353

jal@ausmelb.OZ (Joe Longo) (07/21/87)

On the subject of signal handling under UNIX: We've recently been
considering a scheme that allows multiple handlers to be called for every
signal. This allows independent library routines to process signals, such
as SIGINT, without knowing of, or interfering with, other signal handlers.

E.g.: an I/O library might want to catch signals so that it can clean up its
files, while a screen library might want to catch signals to
reposition the cursor in a known place.

The method used would be relatively simple. We would create a function
that would look like:
	
	int usignal(signo, action, function)
	int         signo;
	int                action;
	int                       (*function)();

where
	signo      is the signal number,
	action     is either "add to queue" or "delete from queue",
	function   is a pointer to the user function that's added or removed.

When a signal comes in, each of the functions requested is called in turn.
[There are other details I won't go into (e.g., rules on
returning from the function; default action on signals; etc.)]
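
One possible shape for such a usignal(), as a sketch only.  The action
codes, table sizes, and the demo functions are all assumptions, and the
omitted details (return rules, default actions) are not addressed:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

#define USIG_ADD   1    /* hypothetical action codes */
#define USIG_DEL   2
#define USIG_NSIG  65   /* assumed bound on signal numbers */
#define USIG_SLOTS 8    /* assumed limit on handlers per signal */

typedef void (*usig_fn)(int);
static usig_fn usig_table[USIG_NSIG][USIG_SLOTS];

/* The only handler the kernel sees: calls each registered function. */
static void usig_dispatch(int signo)
{
    int i;
    for (i = 0; i < USIG_SLOTS; i++)
        if (usig_table[signo][i] != (usig_fn)0)
            usig_table[signo][i](signo);
}

int usignal(int signo, int action, usig_fn function)
{
    sigset_t block, old_mask;
    struct sigaction sa;
    usig_fn wanted;
    int i, slot = -1, live = 0;

    if (signo < 1 || signo >= USIG_NSIG || function == (usig_fn)0 ||
        (action != USIG_ADD && action != USIG_DEL))
        return -1;
    /* ADD looks for an empty slot, DEL for the function itself. */
    wanted = (action == USIG_ADD) ? (usig_fn)0 : function;

    /* Keep the dispatcher from running while the table changes. */
    sigemptyset(&block);
    sigaddset(&block, signo);
    sigprocmask(SIG_BLOCK, &block, &old_mask);

    for (i = 0; i < USIG_SLOTS; i++) {
        if (slot < 0 && usig_table[signo][i] == wanted)
            slot = i;
        if (usig_table[signo][i] != (usig_fn)0)
            live++;
    }
    if (slot >= 0) {
        usig_table[signo][slot] =
            (action == USIG_ADD) ? function : (usig_fn)0;
        live += (action == USIG_ADD) ? 1 : -1;

        memset(&sa, 0, sizeof sa);
        sigemptyset(&sa.sa_mask);
        sa.sa_handler = (live > 0) ? usig_dispatch : SIG_DFL;
        if (sigaction(signo, &sa, NULL) < 0) {
            usig_table[signo][slot] = wanted;   /* undo the change */
            slot = -1;
        }
    }
    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    return (slot >= 0) ? 0 : -1;
}

/* Demo: two independent "libraries" share SIGUSR1. */
static volatile sig_atomic_t io_calls, screen_calls;
static void io_cleanup(int signo)   { (void)signo; io_calls++; }
static void screen_reset(int signo) { (void)signo; screen_calls++; }

int usignal_demo(void)
{
    io_calls = screen_calls = 0;
    usignal(SIGUSR1, USIG_ADD, io_cleanup);
    usignal(SIGUSR1, USIG_ADD, screen_reset);
    raise(SIGUSR1);                        /* both functions run */
    usignal(SIGUSR1, USIG_DEL, screen_reset);
    raise(SIGUSR1);                        /* only io_cleanup runs */
    usignal(SIGUSR1, USIG_DEL, io_cleanup);
    return io_calls * 10 + screen_calls;   /* 2 calls and 1 call */
}
```

In this sketch the functions run in slot order, which can differ from
registration order once entries have been deleted and re-added.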

The advantage to this approach is that it eliminates the excessive system
calls involved in setting your own handler, saving the old one, then
resetting the old handler every time the library function is called:

	oldfunc = signal(signo, newfunc);
	<library code>
	signal(signo, oldfunc);

This represents a significant overhead in the case of numerous
calls to the library function.

Question: Is there a "standard" or preferred approach to this problem?
	  I can't believe I'm the first or only person to have tried to
	  solve it. If so, could someone suggest the preferred approach?

If others have done it, can they tell me what "traps"
(no pun intended) they know of with this type of scheme?

Thanks.
-- 
 ---------------------------------------------- -----=-----
Regards,					----===----
						---=====---
Joe Longo,					--=== ===--
Melbourne.					-==== ====-
						a u s t e c

					ACS:  jal@ausmelb.oz
Austec International Ltd,		UUCP: ...!seismo!munnari!ausmelb!jal
344 St Kilda Rd,			ARPA: jal%ausmelb.oz.au
Melbourne, Victoria, 3004.		Fax: 699 9870	Telex: 38559
AUSTRALIA				Phone: +61 3 699 4511

amh@cheviot.newcastle.ac.uk (Andrew Hilborne) (07/22/87)

> In article <3254@ncoast.UUCP> allbery@ncoast.UUCP (Brandon Allbery) writes:
>
> I often send SIGBUS to a process which traps SIGQUIT and then goes into
> an infinite loop.  I want the core dump, else I would use SIGKILL.  Remember:
> _no_ program is ever totally bug-free!
>
> Perhaps what is needed is a non-catchable terminate-with-coredump signal.
>
Programs should not normally trap SIGQUIT - this was originally
designed to do just what you want.  Unfortunately a bug in one version
of UN*X meant that a SETUID program owned by root could core-dump a
file owned by root and writeable to another.  This was a security flaw
and programs took to trapping SIGQUIT as well.

allbery@ncoast.UUCP (Brandon Allbery) (07/28/87)

As quoted from <2267@cheviot.newcastle.ac.uk> by amh@cheviot.newcastle.ac.uk (Andrew Hilborne):
+---------------
| > In article <3254@ncoast.UUCP> allbery@ncoast.UUCP (Brandon Allbery) writes:
| > I often send SIGBUS to a process which traps SIGQUIT and then goes into
| > an infinite loop.  I want the core dump, else I would use SIGKILL.  Remember:
| >
| > Perhaps what is needed is a non-catchable terminate-with-coredump signal.
| >
| Programs should not normally trap SIGQUIT - this was originally
| designed to do just what you want.  Unfortunately a bug in one version
| of UN*X meant that a SETUID program owned by root could core-dump a
| file owned by root and writeable to another.  This was a security flaw
| and programs took to trapping SIGQUIT as well.
+---------------

Actually, I was trapping SIGQUIT because I needed two keyboard signals to
trigger various actions.  (This is System III, System V, and Xenix; don't
tell me about SIGIO, and _especially_ not about that hog of hogs, O_NDELAY
(or select() or FIONREAD-driven, for BSD) reads!)  The usual arrangement was
terminate on SIGINT and write a log file on SIGQUIT.  Needless to say, this
makes catching a runaway looper difficult unless one can kill -11 the program.
-- 
 Brandon S. Allbery, moderator of comp.sources.misc and comp.binaries.ibm.pc
  {{harvard,mit-eddie}!necntc,well!hoptoad,sun!cwruecmp!hal}!ncoast!allbery
ARPA: necntc!ncoast!allbery@harvard.harvard.edu  Fido: 157/502  MCI: BALLBERY
   <<ncoast Public Access UNIX: +1 216 781 6201 24hrs. 300/1200/2400 baud>>