[comp.unix.wizards] No more signals

tomp@hcx1.UUCP (06/16/87)

I am trying to implement the SIGWINCH (window size change) signal and ran into
an interesting problem: no more signals are left. Our kernel is a combination
of both Berkeley and AT&T. Our machine has a 32 bit word size. The system keeps
a mask of the signals and checks it in such places as system call entry/exit.
I think that making up some sort of 64 bit structure to hold the mask would be 
a performance hit because the signal mask is checked in critical places. The
other solution is to overload an existing signal, but this isn't a general
solution.

Anyways, has anyone out there run into (or even thought about) this problem? If
so, I would like to get ideas about how this problem will be addressed in the
future. It is going to affect everyone pretty soon as Berkeley only has one
signal left (AT&T has about 8 left).
------------------------------------------------------------------------------
Tom Peck			"Hurry now. You don't want to be late."
Harris Computer Systems		"Late? Late for what?"
2101 W. Cypress Creek Rd.	"No, late as in 'The Late Arthur Dent' late."
Ft. Lauderdale, FL. 33309
uucp: tomp@hcx1.harris.com

mkhaw@teknowledge-vaxc.UUCP (06/18/87)

In article <48300005@hcx1> tomp@hcx1.SSD.HARRIS.COM writes:
>I am trying to implement the SIGWINCH (window size change) signal and ran into
...
>future. It is going to affect everyone pretty soon as Berkeley only has one
>signal left (AT&T has about 8 left).

I just checked on a Sun-3:  SIGWINCH is 28, SIGLOST is 29, SIGUSR1 is 30,
and SIGUSR2 is 31; i.e., one flavor of 4bsd has no signals left (as you put
it), but it does it with SIGWINCH defined.

Mike Khaw
-- 
internet:  mkhaw@teknowledge-vaxc.arpa
usenet:	   {hplabs|sun|ucbvax|decwrl|sri-unix}!mkhaw%teknowledge-vaxc.arpa
USnail:	   Teknowledge Inc, 1850 Embarcadero Rd, POB 10119, Palo Alto, CA 94303

edw@ius2.cs.cmu.edu (Eddie Wyatt) (06/18/87)

In article <48300005@hcx1>, tomp@hcx1.SSD.HARRIS.COM writes:
> 
> I am trying to implement the SIGWINCH (window size change) signal and ran into
> an interesting problem: no more signals are left. Our kernel is a combination
> of both Berkeley and AT&T. Our machine has a 32 bit word size. The system keeps
> a mask of the signals and checks it in such places as system call entry/exit.
> I think that making up some sort of 64 bit structure to hold the mask would be 
> a performance hit because the signal mask is checked in critical places. The
> other solution is to overload an existing signal, but this isn't a general
> solution.
> 
> Anyways, has anyone out there run into (or even thought about) this problem? If
> so, I would like to get ideas about how this problem will be addressed in the
> future. It is going to affect everyone pretty soon as Berkeley only has one
> signal left (AT&T has about 8 left).
> ------------------------------------------------------------------------------
> Tom Peck			"Hurry now. You don't want to be late."
> Harris Computer Systems		"Late? Late for what?"
> 2101 W. Cypress Creek Rd.	"No, late as in 'The Late Arthur Dent' late."
> Ft. Lauderdale, FL. 33309
> uucp: tomp@hcx1.harris.com

  I also find the number of user defined signals provided to be a limitation.
A better exception handling feature would be for the OS to provide generic
signals (up to max(unsigned int) of them).  The idea being to separate
the hardware concept of an interrupt handler from the software concept
of an exception handler (sound a little like Ada?).

   A possibility, that I haven't tried, is to overload an interrupt's meaning.
The task that generates the interrupt must somehow provide extra information
to the actual interrupt handler so it can determine the intended meaning of
the interrupt.  After the system interrupt handler has determined the
meaning, it calls the appropriate exception handler.

  Possibilities for pushing extra information around are: 1. if the
task is interrupting itself (kill) then set a global data structure
to indicate the extra information. 2. if the task is interrupting another
task then one could use pipes (yuk), shared memory (shmctl, shmget), or
common files (double yuk).

  This all sounds like a good library facility to provide, hmmmm.

  Another Unix function that I see having problems is "select".  If you
decide to muck around with the kernel and increase the max number of
file descriptors to more than the number of bits in an integer then
you're going to have problems using select to do multiplexing since
it uses an int as a bit mask for the files examined.

-- 
					Eddie Wyatt

e-mail: edw@ius2.cs.cmu.edu
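
A minimal sketch of the overloading idea in the post above, for the case where a process interrupts itself. Every name here (soft_signal, soft_raise, soft_dispatch, soft_record, MAX_SOFT_EVENTS) and the choice of SIGUSR1 as the carrier signal is hypothetical, not an existing library. The side channel is the global variable the post suggests; a cross-process version would need shared memory or similar, which this sketch does not attempt.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>

#define MAX_SOFT_EVENTS 8   /* hypothetical limit chosen for this sketch */

typedef void (*soft_handler)(int event);

soft_handler soft_table[MAX_SOFT_EVENTS];
volatile sig_atomic_t soft_pending_event = -1;  /* side channel: which event */
volatile sig_atomic_t soft_last_event = -1;     /* written by soft_record() */

/* Example software handler: just remembers which event it was called for. */
void soft_record(int event)
{
    soft_last_event = event;
}

/* The one real signal handler: read the side channel and dispatch.
 * Registered handlers run in signal context, so they should stay trivial. */
void soft_dispatch(int sig)
{
    int ev = soft_pending_event;

    (void)sig;
    if (ev >= 0 && ev < MAX_SOFT_EVENTS && soft_table[ev] != NULL)
        soft_table[ev](ev);
    soft_pending_event = -1;
}

/* Register a software exception handler for one event number. */
int soft_signal(int event, soft_handler fn)
{
    assert(event >= 0 && event < MAX_SOFT_EVENTS);
    soft_table[event] = fn;
    return signal(SIGUSR1, soft_dispatch) == SIG_ERR ? -1 : 0;
}

/* Raise an event in the current process (the "interrupting itself" case). */
int soft_raise(int event)
{
    assert(event >= 0 && event < MAX_SOFT_EVENTS);
    soft_pending_event = event;   /* set the extra information first */
    return raise(SIGUSR1);        /* then send the overloaded signal */
}
```

With this in place, soft_signal(3, soft_record) followed by soft_raise(3) runs soft_record(3) before soft_raise returns, since raise() does not return until the handler has.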

rml@hpfcdc.HP.COM (Bob Lenk) (06/20/87)

>I think that making up some sort of 64 bit structure to hold the mask would be 
>a performance hit because the signal mask is checked in critical places. The
>other solution is to overload an existing signal, but this isn't a general
>solution.

I don't think there's much choice.  If you want more than 32 signals you
need to implement a data structure with more than 32 bits to represent
them.  The performance hit can probably be minimized by doing most of
the checking in the (less frequently called) code where data structures
are modified (eg. signals are sent or signal handling/masking is
somehow modified) and then setting a flag that is easily checked on the
common paths.  Of course the 4.[23]BSD signal interface can't survive
such a change, since sigvec, sigsetmask, sigblock, and sigpause (and
sigreturn) all have an int in the interface.  The interface in the
current draft of the IEEE P1003.1 standard used a defined type to avoid
this limitation.

		Bob Lenk
		{ihnp4, hplabs}!hpfcla!rml
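
The arrangement described above can be sketched as follows: keep the pending and blocked sets as arrays of words, recompute a single summary flag whenever either set changes, and test only that flag on the hot path. This is an illustration under assumptions, not code from any kernel; XNSIG, struct xsigstate, and the xsig_* functions are made-up names.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define XNSIG     128   /* hypothetical: more signals than bits in one word */
#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)
#define XNWORDS   ((XNSIG + WORD_BITS - 1) / WORD_BITS)

struct xsigstate {
    unsigned long pending[XNWORDS];  /* posted but not yet delivered */
    unsigned long blocked[XNWORDS];  /* currently masked */
    int deliverable;                 /* summary: any pending and unblocked? */
};

/* Slow path: rescan all words after any change to either set. */
static void xsig_recompute(struct xsigstate *s)
{
    size_t i;

    s->deliverable = 0;
    for (i = 0; i < XNWORDS; i++)
        if (s->pending[i] & ~s->blocked[i])
            s->deliverable = 1;
}

/* Post a signal; signals are numbered from 1 as in Unix. */
void xsig_post(struct xsigstate *s, int sig)
{
    assert(sig >= 1 && sig <= XNSIG);
    s->pending[(sig - 1) / WORD_BITS] |= 1UL << ((sig - 1) % WORD_BITS);
    xsig_recompute(s);
}

/* Block (on != 0) or unblock (on == 0) a signal. */
void xsig_block(struct xsigstate *s, int sig, int on)
{
    unsigned long bit;

    assert(sig >= 1 && sig <= XNSIG);
    bit = 1UL << ((sig - 1) % WORD_BITS);
    if (on)
        s->blocked[(sig - 1) / WORD_BITS] |= bit;
    else
        s->blocked[(sig - 1) / WORD_BITS] &= ~bit;
    xsig_recompute(s);
}

/* Fast path (e.g. system call exit): one test, however large XNSIG is. */
int xsig_check(const struct xsigstate *s)
{
    return s->deliverable;
}
```

The cost of the wider mask lands in xsig_post and xsig_block, which run only when signals are sent or masks change; xsig_check stays a single load and compare.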

kre@munnari.UUCP (06/20/87)

In article <48300005@hcx1>, tomp@hcx1.SSD.HARRIS.COM writes:
> a mask of the signals and checks it in such places as system call entry/exit.
> ... the signal mask is checked in critical places.

I think you'll find that the vast majority of the checks (executed)
are actually "is there a signal", the code doesn't care which signal
is pending, only whether one is.

You could implement the signal pending and mask stuff any way that
you like from this point of view, only the signal sending, and delivery
code care about this part of things, and they're not time critical,
so indexing into an array of bits would be reasonable.

However, there's a problem with any change here ... the user interface
assumes that the signal bits can be implemented by a bit mask of 32
bits (actually an "int" which we hope like hell has NSIG (32) bits
in it).

This is one of the interfaces left that has to change.  When it's
changed there are going to be more howls of "Berkeley broke it again"
but fortunately this time, the changes need not break any working
Sys V (or V7) compliant code, only stuff that uses the extended
signal facilities.

No easy answer.

kre

simon@its63b.ed.ac.uk (Simon Brown) (06/22/87)

In article <48300005@hcx1> tomp@hcx1.SSD.HARRIS.COM writes:
>
>I am trying to implement the SIGWINCH (window size change) signal and ran into
>an interesting problem: no more signals are left. Our kernel is a combination
>of both Berkeley and AT&T. Our machine has a 32 bit word size. The system keeps
>a mask of the signals and checks it in such places as system call entry/exit.
>I think that making up some sort of 64 bit structure to hold the mask would be 
>a performance hit because the signal mask is checked in critical places. The
>other solution is to overload an existing signal, but this isn't a general
>solution.
>
>Anyways, has anyone out there run into (or even thought about) this problem? If
>so, I would like to get ideas about how this problem will be addressed in the
>future. It is going to affect everyone pretty soon as Berkeley only has one
>signal left (AT&T has about 8 left).

Of course, it's really only a problem because of the stupidity and short-
sightedness of both the BSD and AT&T people :-(. Under Version-7 one quickly
ran out of signals ('cos there were only 16 of them), and so when people started
implementing things on 32-bit machines, one would have thought that they would
have realized that this was a big problem that required a completely new way
of doing things, not just a case of "oh, well 16 wasn't enough, but I'm sure
that 32 will be!". For the same reason, going up to 64 is NOT a good solution
(though I'm sure that'll be what will happen).

	%{
	    Simon!
	%}

-- 
----------------------------------
| Simon Brown 		         | UUCP:  seismo!mcvax!ukc!its63b!simon
| Department of Computer Science | JANET: simon@uk.ac.ed.its63b
| University of Edinburgh,       | ARPA:  simon%its63b.ed.ac.uk@cs.ucl.ac.uk
| Scotland, UK.			 |
----------------------------------	 "Life's like that, you know"

simon@its63b.ed.ac.uk (Simon Brown) (06/22/87)

In article <1714@munnari.oz> kre@munnari.oz (Robert Elz) writes:
>In article <48300005@hcx1>, tomp@hcx1.SSD.HARRIS.COM writes:
>> a mask of the signals and checks it in such places as system call entry/exit.
>> ... the signal mask is checked in critical places.
>
>I think you'll find that the vast majority of the checks (executed)
>are actually "is there a signal", the code doesn't care which signal
>is pending, only whether one is.
>
>You could implement the signal pending and mask stuff any way that
>you like from this point of view, only the signal sending, and delivery
>code care about this part of things, and they're not time critical,
>so indexing into an array of bits would be reasonable.

One really nasty bug that really needs fixing fast is that the so-called
"safe signal" mechanism is anything but! If I send 5 SIGINT signals to a
process, there is no guarantee that that process will receive them all - ok,
it will certainly receive at least one, but whether it receives 2,3,4 or 5
is pretty much left to chance!

>
>However, there's a problem with any change here ... the user interface
>assumes that the signal bits can be implemented by a bit mask of 32
>bits (actually an "int" which we hope like hell has NSIG (32) bits
>in it).
>
Yes, but you can always build a little backward-compatibility library to
allow people to still use bitmasks if they want to.

Also, if your "int" doesn't have 32 bits, then you should be using a "long"
instead, for this - like 2.9BSD does.


	%{
	    Simon!
	%}


-- 
----------------------------------
| Simon Brown 		         | UUCP:  seismo!mcvax!ukc!its63b!simon
| Department of Computer Science | JANET: simon@uk.ac.ed.its63b
| University of Edinburgh,       | ARPA:  simon%its63b.ed.ac.uk@cs.ucl.ac.uk
| Scotland, UK.			 |
----------------------------------	 "Life's like that, you know"

gwyn@brl-smoke.ARPA (Doug Gwyn ) (06/22/87)

In article <1207@ius2.cs.cmu.edu> edw@ius2.cs.cmu.edu (Eddie Wyatt) writes:
-  Another Unix function that I see having problems is "select".  If you
-decide to muck around with the kernel and increase the max number of
-file descriptors to more than the number of bits in an integer then
-you're going to have problems using select to do multiplexing since
-it uses an int as a bit mask for the files examined.

No, it originally did that, but the current (4.3BSD) select() operates
with "fd sets", which in general are several words (enough for one bit
for each fd).

I seem to recall that the trend of the POSIX signals subcommittee was
toward a similar mechanism for signal masks.
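
For comparison, here is a short sketch using the opaque-set interface in the form POSIX.1 eventually standardized (sigset_t with sigemptyset, sigaddset, sigismember, sigprocmask). The draft under discussion in this thread may have differed in detail. The helper names block_signal and is_blocked are made up for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Add sig to the calling process's blocked set.
 * Returns 0 on success, -1 on error. */
int block_signal(int sig)
{
    sigset_t set;

    assert(sig > 0);
    if (sigemptyset(&set) == -1 || sigaddset(&set, sig) == -1)
        return -1;
    return sigprocmask(SIG_BLOCK, &set, NULL);
}

/* Return 1 if sig is currently blocked, 0 if not, -1 on error. */
int is_blocked(int sig)
{
    sigset_t cur;

    assert(sig > 0);
    /* A NULL new set leaves the mask alone and just reports it. */
    if (sigprocmask(SIG_BLOCK, NULL, &cur) == -1)
        return -1;
    return sigismember(&cur, sig);
}
```

Nothing in the caller depends on how many bits sigset_t holds, so the implementation can grow past 32 signals without another interface change.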

lm@cottage.WISC.EDU (Larry McVoy) (06/22/87)

In article <1207@ius2.cs.cmu.edu> edw@ius2.cs.cmu.edu (Eddie Wyatt) writes:
$   Another Unix function that I see having problems is "select".  If you
$ decide to muck around with the kernel and increase the max number of
$ file descriptors to more than the number of bits in an integer then
$ you're going to have problems using select to do multiplexing since
$ it uses an int as a bit mask for the files examined.

Not anymore it doesn't.  They use an array.  You can have as many bits as
there are fds.  This is in 4.3+NFS Wisconsin Unix - I dunno about the rest 
of the world.

Larry McVoy 	        lm@cottage.wisc.edu  or  uwvax!mcvoy

mark@applix.UUCP (06/22/87)

In article <1207@ius2.cs.cmu.edu> edw@ius2.cs.cmu.edu (Eddie Wyatt) writes:

   Another Unix function that I see having problems is "select".  If you
 decide to muck around with the kernel and increase the max number of
 file descriptors to more than the number of bits in an integer then
 you're going to have problems using select to do multiplexing since
 it uses an int as a bit mask for the files examined.

Vax Ultrix supports 64 file descriptors. The bit mask select receives is a
pointer to an integer (an array of integers?).  Furthermore, the first
parameter of select is the width of the bit mask, i.e., up to the value
returned by getdtablesize(2).

Now, does anyone know how the bits are laid out?  I would appreciate it if
someone would show me setbit, clearbit, and testbit algorithms that are
not machine- or width-specific.

Tnx.
-- 
                                    Mark Fox
       Applix Inc., 112 Turnpike Road, Westboro, MA 01581, (617) 870-0300
                    uucp:  seismo!harvard!m2c!applix!mark
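
One way to write the requested setbit, clearbit, and testbit without hard-coding a word width is to derive everything from sizeof and CHAR_BIT. This is only a sketch: the layout it uses (bit n lives in word n / BITS_PER_WORD at position n % BITS_PER_WORD) is an assumption of the example and is not guaranteed to match what any particular kernel's select expects. The names bitword, BITS_PER_WORD, and BIT_WORDS are made up here.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

typedef unsigned int bitword;   /* any unsigned integer type would do */

#define BITS_PER_WORD    (sizeof(bitword) * CHAR_BIT)
/* Number of words needed to hold nbits bits. */
#define BIT_WORDS(nbits) (((nbits) + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Turn bit n on. The caller guarantees map has room for bit n. */
void setbit(bitword *map, size_t n)
{
    assert(map != NULL);
    map[n / BITS_PER_WORD] |= (bitword)1 << (n % BITS_PER_WORD);
}

/* Turn bit n off. */
void clearbit(bitword *map, size_t n)
{
    assert(map != NULL);
    map[n / BITS_PER_WORD] &= ~((bitword)1 << (n % BITS_PER_WORD));
}

/* Return 1 if bit n is on, 0 otherwise. */
int testbit(const bitword *map, size_t n)
{
    assert(map != NULL);
    return (int)((map[n / BITS_PER_WORD] >> (n % BITS_PER_WORD)) & 1u);
}
```

A map for 64 descriptors would be declared as bitword map[BIT_WORDS(64)] and zeroed before use.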

guy%gorodish@Sun.COM (Guy Harris) (06/23/87)

> Now, does anyone know how the bits are laid out?  I would appreciate it if
> someone would show me setbit, clearbit, and testbit algorithms that are
> not machine- or width-specific.

Under 4.3BSD, and systems that have correctly picked up the 4.3
changes, there are macros to set, clear, and test bits, so you don't
have to know how the bits are laid out.  I presume Ultrix picked them
up.  The manual page should mention FD_SET, FD_CLR, FD_ISSET, and
FD_ZERO; those macros are in <sys/types.h>.
	Guy Harris
	{ihnp4, decvax, seismo, decwrl, ...}!sun!guy
	guy@sun.com
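
A small sketch of select() driven entirely through the macros named above, so the code never touches the bit layout. It is written against a present-day POSIX system, where fd_set and the FD_* macros come from <sys/select.h> rather than the <sys/types.h> mentioned for 4.3BSD. The function name read_byte_within is made up for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Wait up to `seconds` for fd to become readable, then read one byte.
 * Returns 1 if a byte was read, 0 on timeout or end of file, -1 on error. */
int read_byte_within(int fd, long seconds, char *out)
{
    fd_set rfds;
    struct timeval tv;
    int n;

    /* FD_SET is only defined for descriptors below FD_SETSIZE. */
    assert(fd >= 0 && fd < FD_SETSIZE);

    FD_ZERO(&rfds);         /* start from an empty set */
    FD_SET(fd, &rfds);      /* ask about this one descriptor */
    tv.tv_sec = seconds;
    tv.tv_usec = 0;

    /* First argument is the highest descriptor of interest plus one. */
    n = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (n <= 0)
        return n;           /* 0 = timed out, -1 = error */
    if (!FD_ISSET(fd, &rfds))
        return 0;
    return (int)read(fd, out, 1);
}
```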

mkhaw@teknowledge-vaxc.ARPA (Michael Khaw) (06/23/87)

In article <21812@sun.uucp> guy%gorodish@Sun.COM (Guy Harris) writes:
+Under 4.3BSD, and systems that have correctly picked up the 4.3
+changes, there are macros to set, clear, and test bits, so you don't
+have to know how the bits are laid out.  I presume Ultrix picked them
+up.  The manual page should mention FD_SET, FD_CLR, FD_ISSET, and
+FD_ZERO; those macros are in <sys/types.h>.

These macros are not in my Ultrix 1.2 <sys/types.h>

Mike Khaw
-- 
internet:  mkhaw@teknowledge-vaxc.arpa
usenet:	   {hplabs|sun|ucbvax|decwrl|sri-unix}!mkhaw%teknowledge-vaxc.arpa
USnail:	   Teknowledge Inc, 1850 Embarcadero Rd, POB 10119, Palo Alto, CA 94303

chris@mimsy.UUCP (Chris Torek) (06/24/87)

In article <1207@ius2.cs.cmu.edu> edw@ius2.cs.cmu.edu (Eddie Wyatt) writes:
>... If you decide to muck around with the kernel and increase the
>max number of file descriptors to more than the number of bits in
>an integer then you're going to have problems using select to do
>multiplexing since it uses an int as a bit mask for the files
>examined.

Funny you should mention that. . . .

If you muck around with the 4.2BSD kernel, you are wasting your
time; the 4.3BSD kernel is considerably cleaner.  The only big
uglies left to fix are the VM system and the file system interface.
In particular, select works on more than 32 file descriptors.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690)
Domain:	chris@mimsy.umd.edu	Path:	seismo!mimsy!chris

edw@ius2.cs.cmu.edu (Eddie Wyatt) (06/24/87)

In article <7188@mimsy.UUCP>, chris@mimsy.UUCP (Chris Torek) writes:
> Funny you should mention that. . . .
> 
> If you muck around with the 4.2BSD kernel, you are wasting your
> time; the 4.3BSD kernel is considerably cleaner.  The only big
> uglies left to fix are the VM system and the file system interface.
> In particular, select works on more than 32 file descriptors.
> -- 
> In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690)
> Domain:	chris@mimsy.umd.edu	Path:	seismo!mimsy!chris

Here's the man page declaration of select:
	
	Suns 3.2

     nfds = select(width, readfds, writefds, exceptfds, timeout)
     int width, *readfds, *writefds, *exceptfds;
     struct timeval timeout;

Holy shit they changed the man page on me,

	from our 4.2 with MACH extensions

     nfound = select(nfds, readfds, writefds, exceptfds, timeout)
     int nfound, nfds;
     fd_set *readfds, *writefds, *exceptfds;
     struct timeval *timeout;

Oh well, I'm happy now :-)

-- 
					Eddie Wyatt

e-mail: edw@ius2.cs.cmu.edu

terrorist, cryptography, DES, drugs, cipher, secret, decode, NSA, CIA, NRO.

gwyn@brl-smoke.ARPA (Doug Gwyn ) (06/25/87)

In article <492@its63b.ed.ac.uk> simon@its63b.ed.ac.uk (Simon Brown) writes:
>If I send 5 SIGINT signals to a
>process, there is no guarantee that that process will receive them all ...

This is a good point; per-signal counters would be best.  One would also
need a way to reset (flush) the counter for a signal.  Once you start
taking care of things like this, you end up implementing something very
much like System V semaphore IPC.

Signals were originally intended as a way to abnormally change the flow
of execution of a UNIX process.  Trying to press them into service for
general IPC will probably result in a suboptimal design.

edw@ius2.cs.cmu.edu (Eddie Wyatt) (06/26/87)

In article <492@its63b.ed.ac.uk>, simon@its63b.ed.ac.uk (Simon Brown) writes:
> 
> One really nasty bug that really needs fixing fast is that the so-called
> "safe signal" mechanism is anything but! If I send 5 SIGINT signals to a
> process, there is no guarantee that that process will receive them all - ok,
> it will certainly receive at least one, but whether it receives 2,3,4 or 5
> is pretty much left to chance!
> 

  This really isn't a bug as far as I know.  The documentation for signal
and sigvec specifically says that the interrupt causing the interrupt
handler to be called is masked out during the time the interrupt handler
is executing and then restored at the end of the call.  If any of
those interrupts come during the time the interrupt handler is
called then they are lost, simple as that.  You can always
have the interrupt handler reinstate the interrupt (sigsetmask) if
you really want to.

-- 
					Eddie Wyatt

e-mail: edw@ius2.cs.cmu.edu

terrorist, cryptography, DES, drugs, cipher, secret, decode, NSA, CIA, NRO.

mangler@cit-vax.Caltech.Edu (System Mangler) (06/28/87)

In article <492@its63b.ed.ac.uk>, simon@its63b.ed.ac.uk (Simon Brown) writes:
> One really nasty bug that really needs fixing fast is that the so-called
> "safe signal" mechanism is anything but! If I send 5 SIGINT signals to a
> process, there is no guarantee that that process will receive them all - ok,

Signals weren't designed to be used like that, but if you have to use
them for IPC, there's really only about one safe yet portable way to
wait for a signal, and that's to use longjmp.

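#include <setjmp.h>
#include <signal.h>
#include <unistd.h>

volatile sig_atomic_t ready, caught;
jmp_buf jbuf;

void catch(int sig)
{
	signal(sig, catch);	/* re-arm; old-style handlers reset on delivery */
	caught = 1;
	if (ready)		/* main is at or near pause(): jump past it */
		longjmp(jbuf, 1);
}

int main(void)
{
	signal(SIGUSR1, catch);	/* SIGUSR1 is only an example choice */

	/* do a bunch of work, then we want to wait for a signal to proceed */
	if (!caught && setjmp(jbuf) == 0) {
		ready = 1;	/* must be atomic */
		while (!caught)
			pause();
	}
	ready = caught = 0;
	/* can now tell outside world that we're ready for another signal */
	return 0;
}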

This longjmp stuff is necessary because the signal could arrive
between the time you check for it and the time you pause().
(If anyone knows another way in the absence of "reliable signals",
PLEASE tell me).

You can only reliably deal with one signal (of any type) at a time;
if you get two at once, the longjmp of one will abort the signal
catch routine of the other (if getting a signal in the middle of
longjmp doesn't screw up the stack).

Despite the limitations, signals can be useful for IPC, because
they're a lot faster than pipes, and they set runrun, which gets
the signaled process off to a running start.  On 4.[23] BSD,
flock() is just slightly faster, but you really have to go through
contortions to use it for IPC.	4.3 BSD dump uses it, and it's a mess.

If you want multiple occurrences of a signal to be distinct, what
you really need is message passing.  Think of a signal as a zero-
length message with a 4- or 5-bit type field, with a one-entry
queue for each type.

Don Speck   speck@vlsi.caltech.edu  {seismo,rutgers,ames}!cit-vax!speck
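
Where reliable signals are available, which the post above explicitly does not assume, the check-then-pause race can be closed without longjmp by keeping the signal blocked and letting sigsuspend() unblock and sleep in one atomic step. The sketch below uses the POSIX calls sigaction, sigprocmask, and sigsuspend; the names got_signal, note_signal, prepare_wait, and wait_for_signal are made up for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

volatile sig_atomic_t got_signal = 0;

/* Handler: only records that the signal arrived. */
void note_signal(int sig)
{
    (void)sig;
    got_signal = 1;
}

/* Install the handler and block sig, so that from now on it can only be
 * delivered inside wait_for_signal(). Returns 0 on success, -1 on error. */
int prepare_wait(int sig)
{
    struct sigaction sa;
    sigset_t block;

    assert(sig > 0);
    sa.sa_handler = note_signal;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    if (sigaction(sig, &sa, NULL) == -1)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, sig);
    return sigprocmask(SIG_BLOCK, &block, NULL);
}

/* Sleep until sig has been caught at least once. The test of got_signal
 * cannot race with delivery, because sig stays blocked everywhere except
 * inside sigsuspend(), which swaps the mask and waits atomically. */
void wait_for_signal(int sig)
{
    sigset_t waitmask;

    assert(sig > 0);
    sigprocmask(SIG_BLOCK, NULL, &waitmask);  /* fetch the current mask */
    sigdelset(&waitmask, sig);                /* same mask, minus sig */
    while (!got_signal)
        sigsuspend(&waitmask);
    got_signal = 0;
}
```

A signal that arrives before wait_for_signal() is called simply stays pending and is delivered as soon as sigsuspend() runs, so the wait returns instead of hanging.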

brett@wjvax.UUCP (Brett Galloway) (07/10/87)

In article <1216@ius2.cs.cmu.edu> edw@ius2.cs.cmu.edu (Eddie Wyatt) writes:
>In article <492@its63b.ed.ac.uk>, simon@its63b.ed.ac.uk (Simon Brown) writes:
>> ... If I send 5 SIGINT signals to a
>> process, there is no guarantee that that process will receive them all - ok,
>> it will certainly receive at least one, but whether it receives 2,3,4 or 5
>> is pretty much left to chance!
>  This really isn't a bug as far as I know.  The documentation for signal
>and sigvec specifically says that the interrupt causing the interrupt
>handler to be called is masked out during the time the interrupt handler
>is executing and then restored at the end of the call.  If any of
>those interrupts come during the time the interrupt handler is
>called then they are lost, simple as that ...

I was under the impression that signals were *blocked*, not
ignored, during the signal handler.  Thus, if a signal arrives
during the time the interrupt handler is called, it is blocked
until the handler returns, at which time the handler is invoked
*again*.  Signal blocking is what makes Berkeley signals
"reliable".

However, Berkeley signals are not queued (unlike, as I understand
it, SIGCLD in SYSV).  This is why the user often needs a secondary
mechanism (such as wait3(2) for SIGCHLD or select(2) for SIGIO)
to process a signal.  As is evident, Berkeley signals operate
conceptually just like hardware interrupts.

-- 
-------------
Brett Galloway
{pesnta,twg,ios,qubix,turtlevax,tymix,vecpyr,certes,isi}!wjvax!brett
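
The "blocked but not queued" behaviour can be observed with a small experiment: block a signal, send it several times, unblock it, and count handler runs. The sketch uses POSIX calls; deliveries, count_delivery, and deliveries_after_burst are names invented for the example. On BSD-derived and Linux systems, where an ordinary signal's pending state is a single bit, a burst of five is expected to produce one delivery; other systems are allowed to behave differently.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

volatile sig_atomic_t deliveries = 0;

/* Handler: count how many times the signal is actually delivered. */
void count_delivery(int sig)
{
    (void)sig;
    deliveries++;
}

/* Send n copies of sig to ourselves while it is blocked, then unblock it.
 * Returns the number of times the handler ran, or -1 on setup failure. */
int deliveries_after_burst(int sig, int n)
{
    struct sigaction sa;
    sigset_t block;
    int i;

    assert(n > 0);
    deliveries = 0;

    sa.sa_handler = count_delivery;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    if (sigaction(sig, &sa, NULL) == -1)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, sig);
    if (sigprocmask(SIG_BLOCK, &block, NULL) == -1)
        return -1;

    for (i = 0; i < n; i++)
        raise(sig);     /* held pending while blocked, not delivered yet */

    /* Unblocking delivers whatever is pending before the call returns. */
    if (sigprocmask(SIG_UNBLOCK, &block, NULL) == -1)
        return -1;
    return deliveries;
}
```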

davel@hpisoa1.HP.COM (Dave Lennert) (07/24/87)

> However, Berkeley signals are not queued (unlike, as I understand
> it, SIGCLD in SYSV).  This is why the user often needs a secondary
> mechanism (such as wait3(2) for SIGCHLD or select(2) for SIGIO)
> to process a signal.

Technically, SIGCLD is not queued in SYSV.

SIGCLD is resent by the kernel whenever a signal handler for SIGCLD
is reinstalled and there are unwaited for zombie children.  This
is different than queueing all instances of SIGCLD being sent.

For example, if a process sends SIGCLD via kill(2), it is not "queued".
Also, if several children die, each generating SIGCLD, and the
parent SIGCLD handler waits for all of them on its first entry
(processing the "first" SIGCLD) there will not be a reentry when
the handler is reinstalled since there are no longer any waiting
zombies.

-Dave Lennert   HP   ihnp4!hplabs!hpda!davel
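
Because one child-death signal may stand for several dead children, a handler normally drains every waiting zombie on each entry. The sketch below shows that pattern with POSIX waitpid() and WNOHANG in place of the wait3(2) mentioned earlier in the thread, and with SIGCHLD rather than System V's SIGCLD; both substitutions are assumptions of the example. The names reaped, reap_children, install_reaper, and spawn_and_reap are made up.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

volatile sig_atomic_t reaped = 0;

/* Handler: collect every child that has exited so far, not just one. */
void reap_children(int sig)
{
    int saved = errno;          /* waitpid may clobber errno */

    (void)sig;
    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped++;
    errno = saved;
}

int install_reaper(void)
{
    struct sigaction sa;

    sa.sa_handler = reap_children;
    sa.sa_flags = SA_RESTART;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGCHLD, &sa, NULL);
}

/* Fork n children that exit immediately and wait until the handler has
 * collected all of them. Returns the number reaped, or -1 on failure. */
int spawn_and_reap(int n)
{
    sigset_t block, waitmask;
    int i;

    assert(n > 0);
    reaped = 0;
    if (install_reaper() == -1)
        return -1;

    /* Keep SIGCHLD blocked except while sleeping in sigsuspend(). */
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    if (sigprocmask(SIG_BLOCK, &block, &waitmask) == -1)
        return -1;
    sigdelset(&waitmask, SIGCHLD);

    for (i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid == -1)
            return -1;
        if (pid == 0)
            _exit(0);           /* child: exit at once */
    }
    while (reaped < n)
        sigsuspend(&waitmask);  /* handler runs only in here */

    sigprocmask(SIG_UNBLOCK, &block, NULL);
    return reaped;
}
```

The handler may be entered fewer than n times for n children; the loop inside it is what keeps the count right.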