[net.unix-wizards] improved 4.2BSD signal

gwyn%brl-vld@sri-unix.UUCP (12/20/83)

From:      Doug Gwyn (VLD/VMB) <gwyn@brl-vld>

/*
	signal -- old system call emulation for 4.2BSD (VAX version)
		(adapted from BRL UNIX System V emulation for 4.2BSD)

	last edit:	20-Dec-1983	D A Gwyn

	NOTE:  Although this module is VAX-specific, it should be
	possible to adapt it to other fairly clean implementations of
	4.2BSD.  The difficulty lies in avoiding the automatic restart
	of certain system calls when the signal handler returns.  I use
	here a trick first described by Donn Seeley of UCSD Chem. Dept.
*/

#include	<errno.h>
#include	<signal.h>

extern int	sigvec();
extern int	sigsetmask();

extern		etext;
extern int	errno;

/* # bytes to skip at the beginning of C ret_eintr() function code: */
#define	OFFSET	2			/* for VAX .word reg_mask */

/* PC will be pointing at a syscall if it is to be restarted: */
typedef unsigned char	opcode;		/* one byte long */
#define	SYSCALL	0xBC			/* VAX CHMK instruction */

static int	(*handler[NSIG])() =	/* "current handler" memory */
	{
	BADSIG				/* unknown state; NOTE: this sets
					   only handler[0], the remaining
					   entries start as 0 (SIG_DFL) */
	};

static int	catchsig();
static int	ret_eintr();

int	(*
signal( sig, func )			/* returns previous handler */
	)()
	register int	sig;		/* signal affected */
	register int	(*func)();	/* new handler */
	{
	register int	(*retval)();	/* previous handler value */
	struct sigvec	oldsv;		/* previous state */
	struct sigvec	newsv;		/* state being set */

	if ( func >= (int (*)())&etext )	/* "lint" hates this */
		{
		errno = EFAULT;
		return BADSIG;		/* error */
		}

	/* cancel pending signals */
	newsv.sv_handler = SIG_IGN;
	newsv.sv_mask = newsv.sv_onstack = 0;
	if ( sigvec( sig, &newsv, &oldsv ) != 0 )
		return BADSIG;		/* error */

	/* the first time for this sig, get state from the system */
	if ( (retval = handler[sig-1]) == BADSIG )
		retval = oldsv.sv_handler;

	handler[sig-1] = func;	/* keep track of state */

	if ( func == SIG_DFL )
		newsv.sv_handler = SIG_DFL;
	else if ( func != SIG_IGN )
		newsv.sv_handler = catchsig;	/* actual sig catcher */

	if ( func != SIG_IGN		/* sig already being ignored */
	  && sigvec( sig, &newsv, (struct sigvec *)0 ) != 0
	   )
		return BADSIG;		/* error */

	return retval;			/* previous handler */
	}

/*ARGSUSED*/
static int
catchsig( sig, code, scp )		/* signal interceptor */
	register int	sig;		/* signal number */
	int		code;		/* code for SIGILL, SIGFPE */
	register struct sigcontext	*scp;	/* -> interrupted context */
	{
	struct sigvec	oldsv;		/* previous state */
	struct sigvec	newsv;		/* state being set */

	/* at this point, sig is blocked */

	/* most UNIXes usually want the state reset to SIG_DFL */
	if ( sig != SIGILL && sig != SIGTRAP )
		{
		newsv.sv_handler = SIG_DFL;
		newsv.sv_mask = newsv.sv_onstack = 0;
		(void)sigvec( sig, &newsv, &oldsv );
		}

	(void)sigsetmask( scp->sc_mask );	/* restore old mask */

	/* at this point, sig is not blocked, usually have SIG_DFL;
	   a longjmp may safely be taken by the user signal handler */

	(void)(*handler[sig-1])( sig );	/* user signal handler */

	/* must now avoid restarting certain system calls */
	if ( *(opcode *)scp->sc_pc == (opcode)SYSCALL )
		scp->sc_pc = (int)ret_eintr + OFFSET;

	/* return here restores interrupted context */
	}


static int
ret_eintr()				/* substitute for system call */
	{
	errno = EINTR;
	return -1;
	}

gwyn%brl-vld@sri-unix.UUCP (12/21/83)

From:      Doug Gwyn (VLD/VMB) <gwyn@brl-vld>

	Received: From Ucla-Locus.ARPA by BRL-VLD via smtp;  20 Dec 83 13:34 EST
	Date:           Tue, 20 Dec 83 09:39:00 PST
	From:           Bob English <bob@UCLA-LOCUS>
	To:             Doug Gwyn (VLD/VMB) <gwyn@brl-vld>
	Subject:        improved 4.2BSD signal(2) library routine
	In-reply-to:    Your message of Tue, 20 Dec 83 5:31:28 EST

	It seems to me that a signal that arrived just before the system call
	would cause the system call to be skipped.  In particular, system calls
	such as time (not in 4.2, I know) could return EINTR.  If the system call
	were truly interruptible, this wouldn't be a problem, but your routine
	doesn't check.  I don't, however, know of a better solution that doesn't
	involve changing the kernel (or making all your i/o's go through select???).

	The real problem is that the kernel doesn't show the user the
	true state of things: if a system call was interrupted and is
	about to be re-invoked, it should give the signal handler an
	explicit indication of it.  Otherwise you get quantum
	programming (no one knows the true state of the universe).

	--bob--

Yes, it is true that the system call pointed at by the PC may not be
one that was interrupted and had the PC backed up to restart it
deliberately.  The way I wrote the routine takes that small but nonzero
chance.
It could be improved considerably by looking just past the CHMK for
the numeric syscall code (3 for read, and so forth), since there are
only a few syscalls that are restarted (READ/WRITE/OPEN on slow device,
WAIT when it is really waiting, IOCTL I believe).  Unfortunately, just
because one is about to execute one of these does not mean that it
would've been interruptible, so even with extra checking you cannot be
sure that EINTR is appropriate.  However, at least this would ensure
that EINTR is only returned from the TYPES of syscalls that one expects.
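
The extra check described above can be sketched in portable C.  This is
only an illustration, not the amended routine: the name
looks_restartable and the table are hypothetical, the syscall numbers
other than 3 for read are assumed 4.2BSD values that should be checked
against <syscall.h>, and the code assumes the library stub encodes the
syscall number as a VAX short literal (a single operand byte, which
holds only for numbers below 64).

```c
#include <stddef.h>

#define CHMK_OPCODE	0xBC	/* VAX CHMK, as in the posted routine */

/* Assumed 4.2BSD numbers for read, write, open, wait, ioctl.
   Only "3 for read" comes from the post; verify the rest locally. */
static const unsigned char restartable[] = { 3, 4, 5, 7, 54 };

/* Return 1 if pc points at a CHMK whose short-literal operand is one of
   the syscalls that the kernel may restart, 0 otherwise. */
int
looks_restartable(const unsigned char *pc)
{
	size_t i;

	if (pc[0] != CHMK_OPCODE)
		return 0;		/* not a system call at all */

	/* pc[1] is the operand specifier; for values 0..63 the byte
	   is the literal itself, so it can be compared directly */
	for (i = 0; i < sizeof restartable; i++)
		if (pc[1] == restartable[i])
			return 1;

	return 0;			/* some other system call */
}
```

In catchsig() the test on scp->sc_pc would then call something like
looks_restartable((opcode *)scp->sc_pc) in place of the bare opcode
comparison.  It still cannot tell whether the call had really been
interrupted.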

You are correct in observing that what is really needed is a way to
tell for sure that one is about to restart the syscall upon return from
the user interrupt handler.  I browsed through the 4.2BSD kernel
sources for quite a while but I did not succeed in finding a sure-fire
solution that did not involve modifying the kernel.  I would be happy
to hear of a reasonably efficient method (no, I cannot read /dev/mem).

kre@mulga.SUN (Robert Elz) (12/26/83)

A warning to anyone who uses either of the routines posted
to the net to simulate the old signal handling style on 4.2.

Those routines assume that whenever a signal is received with
the pc pointing at a CHMK (trap) instruction, that a system call
must have been interrupted.  It is possible (though not likely)
that the process was just about to execute the sys call when the
signal occurred, and the system call could be one of those
which is not supposed to be interrupted (EINTR could never
have normally occurred).  Thus, some system calls might end
up returning EINTR when they should not.

This is not likely, and is certainly no worse than many other
race conditions.  I guarantee that any program attempting to
catch and handle signals in pre-4.2BSD UNIX will have other
problems, more likely to occur, and with worse effects.
(Excluding programs using the 4.1 jobs library, with which it was
almost possible to survive in simple cases, with a lot of care.)

As an example of the type of problem that the old signal mechanism
causes, which can't be avoided ...

We have a rather sluggish, local Australian 68K system.
On that, it's easy to log yourself out by pressing the 'DEL'
(interrupt) key twice in succession.  The reason, of course:
the SIGINT has been delivered to your shell (Bourne shell, but
that is irrelevant) but it hasn't been swtch'd to yet (or perhaps
it has, but hasn't had time to reset the handling of SIGINT).
The second interrupt signal finds that the handler for
SIGINT is SIG_DFL, and the shell is killed.  Bye bye!
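
The window can be reproduced on a present-day system.  The sketch below
does not use the old signal() itself; it assumes a POSIX sigaction()
with the SA_RESETHAND flag, which gives the same reset-to-SIG_DFL
behaviour on delivery.  The helper names (install_oneshot, is_default,
caught_count) are made up for the illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t caught = 0;

/* An old-style handler would call signal(sig, on_signal) here to
   re-arm itself; until it does, the disposition is SIG_DFL. */
static void
on_signal(int sig)
{
	(void)sig;
	caught++;
}

/* Install on_signal with reset-on-delivery semantics; 0 on success. */
int
install_oneshot(int sig)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof sa);
	sa.sa_handler = on_signal;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = SA_RESETHAND;	/* back to SIG_DFL once caught */
	return sigaction(sig, &sa, NULL);
}

/* Return 1 if sig is currently SIG_DFL, 0 if not, -1 on error. */
int
is_default(int sig)
{
	struct sigaction cur;

	if (sigaction(sig, NULL, &cur) != 0)
		return -1;
	return cur.sa_handler == SIG_DFL;
}

int
caught_count(void)
{
	return (int)caught;
}
```

After one delivery is_default() reports 1: a second signal arriving
before the handler is re-armed takes the default action, which for
SIGINT kills the process.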

While some of you may be able to accommodate such effects,
and explain them to your users, without being thrown out of
the room, bodily, I cannot.  Since this is a problem of
definition of the signal routines, ONLY incompatibility
with existing programs can fix it.  (I do admit that a
way could have been found to allow the old handling in
parallel with the new, for ease of transition, but that
tends to mean lack of transition, & I'm glad it wasn't done.)
And like the filesystems, once you are going to make
something incompatible, it's best to make it VERY incompatible,
and attempt to get it right, once and for all.

Note, there are still some problems with the signal handling.
For example, a sys call that does a slow write won't be restarted
if some data has been transferred; and there's no way to determine
how much data was written before the sys call was interrupted.
(This problem occurs in the old signal handling routines too.)
However, this is now basically a problem of implementation; it
should be fixable sometime, without changing the definitions
again.
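
On systems where an interrupted write() that has already transferred
some data returns the short count (the behaviour POSIX specifies; this
is an assumption here, not a statement about 4.2BSD), the caller can
recover with a loop along these lines.  write_all is a hypothetical
helper name.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Write all len bytes of buf to fd, resuming after short writes and
   retrying calls that were interrupted before any data moved.
   Returns len on success, -1 on any error other than EINTR. */
ssize_t
write_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;
	size_t left = len;

	while (left > 0) {
		ssize_t n = write(fd, p, left);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* nothing written; retry */
			return -1;		/* real error */
		}
		p += n;				/* skip what was written */
		left -= (size_t)n;
	}
	return (ssize_t)len;
}
```

The loop relies entirely on the returned count to know how much data
went out, which is the information the paragraph above says is missing.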

Thank heaven that there is someone out there with the
bravery to correct the problem, and try to get it done right.

Robert Elz,	Comp Sci, Univ of Melbourne.		decvax!mulga!kre

A merry Christmas and a happy new year to you all.

gwyn%brl-vld@sri-unix.UUCP (01/03/84)

From:      Doug Gwyn (VLD/VMB) <gwyn@brl-vld>

It is easy enough to patch the previously posted signal(2) emulation
to at least make sure that EINTR is only returned for system calls
that are supposed to be able to return this indication; I can send the
amended routine to anyone who wants it.  It is true that some reads
or writes that should not be interruptible (disk file i/o) may return
EINTR on rare occasion, although that should not bother stdio users.

The real botch in the new signal handling, as I see it, is that there
is no reliable way (short of snooping in one's u_ area on architectures
that allow that) to determine if one is in the middle of a system call
when the signal is fielded.  It appears that the new facilities were
designed with a specific model of how programs do i/o and use signals,
and that model does not fit every application.  There are many times
when I DO NOT WANT terminal i/o restarted after a signal, especially
after a keyboard-generated signal.  This is difficult to arrange when
terminal i/o gets automatically restarted by the system!
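
For comparison, later POSIX systems make restart a per-signal choice: a
handler installed with sigaction() without the SA_RESTART flag
interrupts a blocked read(), which then fails with EINTR.  That
interface is not part of 4.2BSD; the sketch below only illustrates the
behaviour being asked for, and install_interrupting and read_once are
hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static void
wake(int sig)
{
	(void)sig;		/* nothing to do; just interrupt the call */
}

/* Install a handler for sig that does NOT restart system calls. */
int
install_interrupting(int sig)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof sa);
	sa.sa_handler = wake;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = 0;	/* deliberately no SA_RESTART */
	return sigaction(sig, &sa, NULL);
}

/* One read(); *interrupted is set to 1 if it failed with EINTR. */
ssize_t
read_once(int fd, void *buf, size_t len, int *interrupted)
{
	ssize_t n = read(fd, buf, len);

	*interrupted = (n < 0 && errno == EINTR);
	return n;
}
```

A caller that sees *interrupted set can abandon the terminal read
instead of having the system silently resume it.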

I can't be as enthusiastic over people's "courage" to try to improve
things when they don't take sufficient care in the new design.

wls@astrovax.UUCP (William L. Sebok) (01/11/84)

> .......  There are many times
> when I DO NOT WANT terminal i/o restarted after a signal, especially
> after a keyboard-generated signal.  This is difficult to arrange when
> terminal i/o gets automatically restarted by the system!

There are several problems of this sort in vnews under 4.1 BSD.  The problem
with signal 2 (SIGINT) I fixed with a kluge in the version of visual.c I posted
a couple of months ago.  The problem with signal 1 (SIGHUP) is still
outstanding.  When vnews receives a hangup signal, instead of exiting it goes
into an endless loop, eating up cpu time.  Maybe that is fixed in 4.2.  The
SIGHUP signal is handled in code shared with readnews.  Under 4.1 readnews did
not have to worry about this issue because it used the old signal mechanism.
If the 4.2 readnews has not been changed, it will now also have this bug.  I
will know in a few days when I try to bring up 4.2 here.
-- 
Bill Sebok			Princeton University, Astrophysics
{allegra,akgua,burl,cbosgd,decvax,ihnp4,kpno,princeton,vax135}!astrovax!wls