[comp.unix.programmer] List of routines safe to use in signals?

rang@cs.wisc.edu (Anton Rang) (12/11/90)

  Is there a list of library routines which can safely be used within
a signal handler under any of the "standard" UNIX versions (preferably
POSIX, but I'd like BSD 4.3 and SVR4 info as well)?  Presumably this
is in the POSIX standard, but if anybody has an online copy of such a
list, I'd appreciate it.

  So far, I know to avoid functions which use static areas of memory
for their return value (e.g. getpwent), and I avoid the standard I/O
routines (e.g. fprintf) unless I know that the file won't have other
I/O going on at the time.  I also save and restore errno, which is
perhaps overkill....

  Anyway, is such a list available on the net somewhere?  Lacking
that, is there a reasonable heuristic to guess what's safe?

	Anton
   
+---------------------------+------------------+-------------+
| Anton Rang (grad student) | rang@cs.wisc.edu | UW--Madison |
+---------------------------+------------------+-------------+

meissner@osf.org (Michael Meissner) (12/12/90)

In article <RANG.90Dec10231747@nexus.cs.wisc.edu> rang@cs.wisc.edu
(Anton Rang) writes:

|   Is there a list of library routines which can safely be used within
| a signal handler under any of the "standard" UNIX versions (preferably
| POSIX, but I'd like BSD 4.3 and SVR4 info as well)?  Presumably this
| is in the POSIX standard, but if anybody has an online copy of such a
| list, I'd appreciate it.
| 
|   So far, I know to avoid functions which use static areas of memory
| for their return value (e.g. getpwent), and I avoid the standard I/O
| routines (e.g. fprintf) unless I know that the file won't have other
| I/O going on at the time.  I also save and restore errno, which is
| perhaps overkill....
| 
|   Anyway, is such a list available on the net somewhere?  Lacking
| that, is there a reasonable heuristic to guess what's safe?

Here is the complete list:

	signal (or sigaction)  to set up the signal handler again.

That's it.  You may think something's safe, but it may be in the
middle of a malloc call or what have you.  Without having the library
do a sigblock around every single critical area (and then being slow
as molasses in January) there is no safe list of routines.  Either
rewrite your code not to call ANYTHING from a signal handler, or be
prepared to have it fail occasionally in unpredictable areas.
--
Michael Meissner	email: meissner@osf.org		phone: 617-621-8861
Open Software Foundation, 11 Cambridge Center, Cambridge, MA, 02142

Considering the flames and intolerance, shouldn't USENET be spelled ABUSENET?

boyd@necisa.ho.necisa.oz (Boyd Roberts) (12/12/90)

In article <RANG.90Dec10231747@nexus.cs.wisc.edu> rang@cs.wisc.edu (Anton Rang) writes:
>
>  Anyway, is such a list available on the net somewhere?  Lacking
>that, is there a reasonable heuristic to guess what's safe?
>

Assume that nothing is safe when it comes to library routines
being called from within signal handlers.  There should be a
FAQ entry that encourages the practice of doing as little
as possible in signal handlers.

Set a flag in the signal handler and check for it elsewhere in
the code.  Keep it simple and you won't get burned.


Boyd Roberts			boyd@necisa.ho.necisa.oz.au

``When the going gets weird, the weird turn pro...''

cgy@cs.brown.edu (Curtis Yarvin) (12/12/90)

In article <1960@necisa.ho.necisa.oz> boyd@necisa.ho.necisa.oz (Boyd Roberts) writes:
>In article <RANG.90Dec10231747@nexus.cs.wisc.edu> rang@cs.wisc.edu (Anton Rang) writes:
>>
>>  Anyway, is such a list available on the net somewhere?  Lacking
>>that, is there a reasonable heuristic to guess what's safe?
>
>Assume that nothing is safe when it comes to library routines
>being called from within signal handlers.  There should be a
>FAQ entry that encourages the practice of doing as little
>as possible in signal handlers.

Hey, excuse me if I'm being a complete newbie.  But how can a reentrant
library routine, say "strcpy", cause problems in this case?

>Boyd Roberts			boyd@necisa.ho.necisa.oz.au
>``When the going gets weird, the weird turn pro...''

Curtis

"I tried living in the real world
 Instead of a shell
 But I was bored before I even began." - The Smiths

jik@athena.mit.edu (Jonathan I. Kamens) (12/12/90)

In article <59190@brunix.UUCP>, cgy@cs.brown.edu (Curtis Yarvin) writes:
|> Hey, excuse me if I'm being a complete newbie.  But how can a reentrant
|> library routine, say "strcpy", cause problems in this case?

  The obvious answer is, "How do you *know* the routine is reentrant?"

  Now, in response to that, you can say, "Why would anyone have any reason to
write a strcpy function that isn't reentrant?" but that question misses the
point.  You can't assume anything about library functions that isn't in the
spec for the function.  If the man page (or, more recently, the ANSI spec, or
the POSIX spec, or whatever) for the function says that it's reentrant,
then it is, and you can count on it being reentrant.  However, if that is not
specified, then you cannot make any assumptions about whether or not the
function is reentrant.

  Obviously, a signal handler that calls strcpy isn't going to have any
trouble on the vast majority of systems currently in existence, if not on all
systems.  But there's no way to know that for sure.  Furthermore, there's the
"slippery slope" problem which says that if you first assume that strcpy is
reentrant, you may continue to make that assumption with more and more
functions, until suddenly you'll find yourself calling printf :-).

-- 
Jonathan Kamens			              USnail:
MIT Project Athena				11 Ashford Terrace
jik@Athena.MIT.EDU				Allston, MA  02134
Office: 617-253-8085			      Home: 617-782-0710

src@scuzzy.in-berlin.de (Heiko Blume) (12/13/90)

jik@athena.mit.edu (Jonathan I. Kamens) writes:
>If the man page (or, more recently, the ANSI spec, or
>the POSIX spec, or whatever) for the function says that it's reentrant,
>then it is, and you can count on it being reentrant.

you *DO* believe in man pages, not to speak of specifications?????

>However, if that is not
>specified, then you cannot make any assumptions about whether or not the
>function is reentrant.

you still can't make any assumptions if it is specified. you 
can only *hope* it'll work, unless you've got the source and
are able to *prove* it. if you try it with n different scenarios,
then you can say "it worked these n times". nothing more.

don't get me wrong, but there are so many things that are in the
man page, but just don't work.

-- 
      Heiko Blume <-+-> src@scuzzy.in-berlin.de <-+-> (+49 30) 691 88 93
                    public source archive [HST V.42bis]:
        scuzzy Any ACU,f 38400 6919520 gin:--gin: nuucp sword: nuucp
                     uucp scuzzy!/src/README /your/home

jik@athena.mit.edu (Jonathan I. Kamens) (12/14/90)

In article <1990Dec13.022804.7712@scuzzy.in-berlin.de>, src@scuzzy.in-berlin.de (Heiko Blume) writes:
|> don't get me wrong, but there are so many things that are in the
|> man page, but just don't work.

  Then they should be fixed as they are found.

  A man page for a library function is a specification for that function. 
Degenerating a little bit, a specification for a function is also a
specification for that function.  The definition and purpose of a
specification in computer programming is that it tells you, as a programmer,
what YOU need to do and what the FUNCTION guarantees to do when you call it.

  Saying that we shouldn't listen to what's in man pages because there are
mistakes in man pages is like saying that whenever you put an answer into a
crossword puzzle and peek at the answer key and it's different, you should
assume that the answer key is wrong since there have been mistakes in answer
keys in the past.

-- 
Jonathan Kamens			              USnail:
MIT Project Athena				11 Ashford Terrace
jik@Athena.MIT.EDU				Allston, MA  02134
Office: 617-253-8085			      Home: 617-782-0710

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (12/14/90)

In article <1990Dec13.205957.25208@athena.mit.edu> jik@athena.mit.edu (Jonathan I. Kamens) writes:
> In article <1990Dec13.022804.7712@scuzzy.in-berlin.de>, src@scuzzy.in-berlin.de (Heiko Blume) writes:
> |> don't get me wrong, but there are so many things that are in the
> |> man page, but just don't work.
>   Then they should be fixed as they are found.

Exactly. Like NFS should be fixed to report EDQUOT on write(), not
close().

>   A man page for a library function is a specification for that function. 
> Degenerating a little bit, a specification for a function is also a
> specification for that function.  The definition and purpose of a
> specification in computer programming is that it tells you, as a programmer,
> what YOU need to do and what the FUNCTION guarantees to do when you call it.

Exactly. Like close() is guaranteed to return either 0 or -1 with EBADF
or EINTR. Whether the file is local or over NFS, close() *guarantees*
not to return anything else.

>   Saying that we shouldn't listen to what's in man pages because there are
> mistakes in man pages is like saying that whenever you put an answer into a
> crossword puzzle and peek at the answer key and it's different, you should
> assume that the answer key is wrong since there have been mistakes in answer
> keys in the past.

Exactly. Saying that the man page doesn't list all the possible error
returns is like saying that... well, you get the idea. So you should
always assume that what the man page says about error returns is right.

(This is a mild flame, btw. Jon, are you sure you're being consistent?)

---Dan

jik@athena.mit.edu (Jonathan I. Kamens) (12/14/90)

In article <26828:Dec1404:06:4290@kramden.acf.nyu.edu>, brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:

I'll answer your last question first:

|> (This is a mild flame, btw. Jon, are you sure you're being consistent?)

  Yes, I am quite sure I'm being consistent.

|> In article <1990Dec13.205957.25208@athena.mit.edu> jik@athena.mit.edu (Jonathan I. Kamens) writes:
|> > In article <1990Dec13.022804.7712@scuzzy.in-berlin.de>, src@scuzzy.in-berlin.de (Heiko Blume) writes:
|> > |> don't get me wrong, but there are so many things that are in the
|> > |> man page, but just don't work.
|> >   Then they should be fixed as they are found.
|> 
|> Exactly. Like NFS should be fixed to report EDQUOT on write(), not
|> close().

  Whenever there is an inconsistency between a man page and a library function
or system call, then the question that must be asked is, which one is wrong --
the man page or the function?

  I refuse to argue with you again about which one is wrong in the case of
close() causing EDQUOT.  I see no reason for you to bring it up again, except
just to be moderately obnoxious.  I think everyone who read that discussion
knows that my opinion is that the man page is wrong, and that your opinion is
that the function/kernel is wrong.

  I think you are intelligent enough to realize that when there is an
inconsistency between a man page and a library function, the first decision
that must be made is which one is wrong, and therefore which one needs fixing.
If you are, indeed, intelligent enough to realize that, then I can only
interpret the posting to which I am responding as a petty attempt to get me to
re-enter an old argument for no reason.  If you aren't intelligent enough to
realize that, then I'm sorry for overestimating you.

|> (This is a mild flame, btw. Jon, are you sure you're being consistent?)

  This is a mild flame, btw.  Can't you just let a dead dog lie?

-- 
Jonathan Kamens			              USnail:
MIT Project Athena				11 Ashford Terrace
jik@Athena.MIT.EDU				Allston, MA  02134
Office: 617-253-8085			      Home: 617-782-0710

cameron@usage.csd.oz (Cameron Simpson) (12/14/90)

In <1990Dec13.205957.25208@athena.mit.edu>, by jik@athena.mit.edu (Jonathan I. Kamens):
| In <1990Dec13.022804.7712@scuzzy.in-berlin.de>, src@scuzzy.in-berlin.de (Heiko Blume):
| |> don't get me wrong, but there are so many things that are in the
| |> man page, but just don't work.
| 
|   Then they should be fixed as they are found.
| 
|   A man page for a library function is a specification for that function. 

True, but often a man page specifies more than what is portable.
For instance, free() may be defined as accepting NULL happily.
Or (my personal peeve) realloc() may say the old pointer is invalid
even if the realloc fails.

The thing is that there is seldom (nay, never) a distinction made in the
manual about what is specific to the local implementation and what is
generic (generic BSD, generic Sys5, generic ANSI C, etc etc etc). It is
often difficult to decide how much of a manual page one should trust when
writing code to run on different implementations.
	- Cameron Simpson
	  cameron@spectrum.cs.unsw.oz.au

meissner@osf.org (Michael Meissner) (12/15/90)

In article <1960@necisa.ho.necisa.oz> boyd@necisa.ho.necisa.oz (Boyd
Roberts) writes:

| In article <RANG.90Dec10231747@nexus.cs.wisc.edu> rang@cs.wisc.edu (Anton Rang) writes:
| >
| >  Anyway, is such a list available on the net somewhere?  Lacking
| >that, is there a reasonable heuristic to guess what's safe?
| >

	...

| Set a flag in the signal handler and check for it elsewhere in
| the code.  Keep it simple and you won't get burned.

Note that unless you are careful, even setting a flag may not be safe,
if it takes more than one instruction to store the value.  Off the top
of my head, this can happen because of:

   1)	Using an int flag on an 8-bit micro, which requires 2 or more
	instructions to store the various pieces.

   2)	Using a char flag on a RISC machine which has no byte
	addressing modes, and does store bytes by loading the word,
	and/or-ing the value into place, and storing the word.

   3)	Using an int flag on a CISC machine which doesn't align
	things, the flag may span page boundaries, and you catch
	things in the middle of a page fault.

ANSI specifies that the vendor must provide 'sig_atomic_t' in which it
is guaranteed to be safe to store static/global flags from signal
handlers.  Using 'char' will probably work on most of the machines (but
of course not all).
--
Michael Meissner	email: meissner@osf.org		phone: 617-621-8861
Open Software Foundation, 11 Cambridge Center, Cambridge, MA, 02142

Considering the flames and intolerance, shouldn't USENET be spelled ABUSENET?

jik@athena.mit.edu (Jonathan I. Kamens) (12/17/90)

In article <995@usage.csd.unsw.oz.au>, cameron@usage.csd.oz (Cameron Simpson) writes:
|> The thing is that there is seldom (nay, never) a distinction made in the
|> manual about what is specific to the local implementation and what is
|> generic (generic BSD, generic Sys5, generic ANSI C, etc etc etc). It is
|> often difficult to decide how much of a manual page one should trust when
|> writing code to run on different implementations.

  Agreed.  That is why I mentioned the POSIX standards as well.

  Personally, I would never rely on anything in a man page that I was not
absolutely, positively certain was true of ALL implementations of the function
described in the man page.  That means that I would most likely not assume
that a function is safe to use in a signal handler just because the man page
on one OS type says that it is.  Of course, since I don't recall ever seeing a
man page that actually mentions whether or not a function is safe to use in a
signal handler, the point is sort of moot.  Standards like POSIX may deal with
such things, but I don't think most man pages do :-).

-- 
Jonathan Kamens			              USnail:
MIT Project Athena				11 Ashford Terrace
jik@Athena.MIT.EDU				Allston, MA  02134
Office: 617-253-8085			      Home: 617-782-0710

gwc@root.co.uk (Geoff Clare) (12/18/90)

In <RANG.90Dec10231747@nexus.cs.wisc.edu> rang@cs.wisc.edu (Anton Rang) writes:

>  Is there a list of library routines which can safely be used within
>a signal handler under any of the "standard" UNIX versions (preferably
>POSIX, but I'd like BSD 4.3 and SVR4 info as well)?  Presumably this
>is in the POSIX standard, but if anybody has an online copy of such a
>list, I'd appreciate it.

I've seen a lot of replies to this question saying effectively that the only
safe thing you can do in a signal handler is set a flag of type sig_atomic_t.

That may be what the 'C' standard, says but this is comp.unix.programmer,
not comp.lang.c!  And Anton specifically asked about POSIX in his question!
There is a long list of interfaces that POSIX guarantees are safe to use
in signal handlers.  I've been waiting to see if anyone else posts the list,
but nobody has, so I'll do it.  I don't have the text of POSIX.1 on line,
but I do have the latest draft of POSIX.3, and the list should be the same
(I haven't checked it).  Here it is:

	_exit, access, alarm, cfgetispeed, cfgetospeed, cfsetispeed,
	cfsetospeed, chdir, chmod, chown, close, creat, dup, dup2,
	execle, execve, fcntl, fork, fstat, getegid, geteuid, getgid,
	getgroups, getpgrp, getpid, getppid, getuid, kill, link,
	lseek, mkdir, mkfifo, open, pathconf, pause, pipe, read, rename,
	rmdir, setgid, setpgid, setsid, setuid, sigaction, sigaddset,
	sigdelset, sigemptyset, sigfillset, sigpending, sigprocmask,
	sigismember, sigsuspend, sleep, stat, sysconf,
	tcflow, tcflush, tcgetattr, tcgetpgrp, tcsendbreak, tcsetattr,
	tcsetpgrp, time, times, umask, uname, unlink, ustat, utime,
	wait, waitpid, and write.

-- 
Geoff Clare <gwc@root.co.uk>  (Dumb American mailers: ...!uunet!root.co.uk!gwc)
UniSoft Limited, Hayne Street, London EC1A 9HH, England.   Tel: +44-71-315-6600

src@scuzzy.in-berlin.de (Heiko Blume) (12/18/90)

jik@athena.mit.edu (Jonathan I. Kamens) writes:

>In article <1990Dec13.022804.7712@scuzzy.in-berlin.de>, src@scuzzy.in-berlin.de (Heiko Blume) writes:
>|> don't get me wrong, but there are so many things that are in the
>|> man page, but just don't work.

>  Then they should be fixed as they are found.

sure, but what would you say if you can't mkfs a 512Byte Filesystem?
that's kind of weird for me, didn't anyone try that yet? (btw: i didn't
get any responses to my posting apart from 'why do you want to do that?').

>  A man page for a library function is a specification for that function. 
>Degenerating a little bit, a specification for a function is also a
>specification for that function.  The definition and purpose of a
>specification in computer programming is that it tells you, as a programmer,
>what YOU need to do and what the FUNCTION guarantees to do when you call it.

that's what i thought in the beginning. right now i have a real hard time
with the man pages for sig*(2). perhaps i should correct myself a bit:
it's not that there are many bugs in the man pages, it's what there
*isn't* in the man pages! for example what does sigsuspend(2) *really*
do? say, i block SIGCHLD. A child dies. The signal gets posted
to the parent. since it's blocked it's "pending". ok, what happens when
i call sigsuspend() then, with a mask all zeroes. the man page says it
suspends the process until the *delivery* of a signal not masked by
sigsuspend()'s argument. guess what, although the signal is pending,
the process gets blocked! WHY? [anyway, i'll elaborate on this in
another posting due soon...]

>  Saying that we shouldn't listen to what's in man pages because there are
>mistakes in man pages is like saying that whenever you put an answer into a
>crossword puzzle and peek at the answer key and it's different, you should
>assume that the answer key is wrong since there have been mistakes in answer
>keys in the past.

well, i didn't mean it that drastic, but i think there's at least a lot
missing in the man pages. especially something as complicated as the
(posix) signals should be documented minutely and with examples, or
at least not with sentences like:

"...when a signal is caught during a read(2), a write(2), an open(2),
or an ioctl(2) system call DURING a sigpause system call, or during a
wait(2) system call that does not return immediately due to the existence
of a previously stopped or zombie process, the signal-catching handler
will be executed."

during *WHAT* ??? (that's from sigset(2)). also, as far as i get it,
the BSD sigpause(2) is equivalent to posix sigsuspend(2), so what
has sigpause(2) to do in a posix environment anyway?


can someone give me a pointer to some documentation that *clearly*
explains posix signals and job control?!?!
-- 
      Heiko Blume <-+-> src@scuzzy.in-berlin.de <-+-> (+49 30) 691 88 93
                    public source archive [HST V.42bis]:
        scuzzy Any ACU,f 38400 6919520 gin:--gin: nuucp sword: nuucp
                     uucp scuzzy!/src/README /your/home

src@scuzzy.in-berlin.de (Heiko Blume) (12/18/90)

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>> The definition and purpose of a
>> specification in computer programming is that it tells you, as a programmer,
>> what YOU need to do and what the FUNCTION guarantees to do when you call it.

>Exactly. Like close() is guaranteed to return either 0 or -1 with EBADF
>or EINTR. Whether the file is local or over NFS, close() *guarantees*
>not to return anything else.

yeah!

close(2):

[ENOLINK]   Fildes is on a remote machine and the link to that machine
            is no longer active. (system V Release 3.2).

sounds adequate for NFS, but is for RFS as far as i know.
-- 
      Heiko Blume <-+-> src@scuzzy.in-berlin.de <-+-> (+49 30) 691 88 93
                    public source archive [HST V.42bis]:
        scuzzy Any ACU,f 38400 6919520 gin:--gin: nuucp sword: nuucp
                     uucp scuzzy!/src/README /your/home

src@scuzzy.in-berlin.de (Heiko Blume) (12/18/90)

jik@athena.mit.edu (Jonathan I. Kamens) writes:
> [...]  when there is an
>inconsistency between a man page and a library function, the first decision
>that must be made is which one is wrong, and therefore which one needs fixing.

since you said that the man page is the specification for the function,
then it's always the function that 'needs fixing' if there is an
inconsistency :-) Unless you change the specification in the first
place, of course. But who is in a position to do that? me? you? god? AT&T?
kinda religion problem in many cases, anyway.
-- 
      Heiko Blume <-+-> src@scuzzy.in-berlin.de <-+-> (+49 30) 691 88 93
                    public source archive [HST V.42bis]:
        scuzzy Any ACU,f 38400 6919520 gin:--gin: nuucp sword: nuucp
                     uucp scuzzy!/src/README /your/home

coren@osf.org (Robert Coren) (12/18/90)

In article <1990Dec18.013608.8234@scuzzy.in-berlin.de>,
src@scuzzy.in-berlin.de (Heiko Blume) writes:
|> ...what does sigsuspend(2) *really*
|> do? say, i block SIGCHLD. A child dies. The signal gets posted
|> to the parent. since it's blocked it's "pending". ok, what happens when
|> i call sigsuspend() then, with a mask all zeroes. the man page says it
|> suspends the process until the *delivery* of a signal not masked by
|> sigsuspend()'s argument. guess what, although the signal is pending,
|> the process gets blocked! WHY? [anyway, i'll elaborate on this in
|> another posting due soon...]

The specification of sigsuspend()'s behavior with respect to signals
that are *already* pending when a program calls sigsuspend() does seem
to be a little vague. I don't have POSIX 1003.1 handy at the moment,
but I took a look at the OSF/1 manpage, and it seems to contain at
least an implication that such a signal is delivered when you call
sigsuspend() (assuming the mask specified in sigsuspend() unmasks the
pending signal). I also took a quick look at the OSF/1 *code*, and
this is in fact what it does.

It may be that your system's implementation interpreted the spec for
sigsuspend() differently, or it may just be wrong.

Another possibility (admittedly remote) comes to mind. I would assume
that you have set up a handler for SIGCHLD, since you seem to be
interested in its delivery. Is this in fact the case? Have you
confirmed that the SIGCHLD signal is in fact pending (try using
sigpending())? The reason I ask this is that the default action for
SIGCHLD is to ignore the signal, in which case it would have vanished
silently when it was posted.

|> ...as far as i get it,
|> the BSD sigpause(2) is equivalent to posix sigsuspend(2), so what
|> has sigpause(2) to do in a posix environment anyway?

A system that implements sigsuspend() maintains sigpause() for
compatibility with "older" systems; they are essentially equivalent. A
potential difference is that the mask given to sigpause() is an int,
whereas that given to sigsuspend() is a sigset_t; a system that
implements more than 32 signals (as AIX does, I believe) can define
sigset_t in such a way that sigsuspend() can mask any combination of
signals, but sigpause() can only mask signals with numbers in the
range 1 <= s <= 32.
	Robert

src@scuzzy.in-berlin.de (Heiko Blume) (12/20/90)

coren@osf.org (Robert Coren) writes:


>In article <1990Dec18.013608.8234@scuzzy.in-berlin.de>,
>src@scuzzy.in-berlin.de (Heiko Blume) writes:
>|> ...what does sigsuspend(2) *really*
>|> do?

>The specification of sigsuspend()'s behavior with respect to signals
>that are *already* pending when a program calls sigsuspend() does seem
>to be a little vague. I don't have POSIX 1003.1 handy at the moment,
>but I took a look at the OSF/1 manpage, and it seems to contain at
>least an implication that such a signal is delivered when you call
>sigsuspend() (assuming the mask specified in sigsuspend() unmasks the
>pending signal). I also took a quick look at the OSF/1 *code*, and
>this is in fact what it does.

>It may be that your system's implementation interpreted the spec for
>sigsuspend() differently, or it may just be wrong.

seems so. but since interactive (i use that) just puts out the 2.2.1
release with 'posix fixes' (whatever that means), i *hope* that will
be fixed! i'll wait for that before i continue working on this problem.

>Another possibility (admittedly remote) comes to mind. I would assume
>that you have set up a handler for SIGCHLD, since you seem to be
>interested in its delivery. Is this in fact the case? Have you
>confirmed that the SIGCHLD signal is in fact pending (try using
>sigpending())? The reason I ask this is that the default action for
>SIGCHLD is to ignore the signal, in which case it would have vanished silently when
>it was posted.

in fact i'm trying to teach bash-1.05 posix job control which of course
catches SIGCHLD etc. i already tried

sigset_t pending, nullmask, savemask;
sigemptyset(&nullmask);	/* sigset_t need not be an integer type */
sigpending(&pending);
/* a sigset_t cannot be tested as a scalar; ask about SIGCHLD by name */
if (sigismember(&pending, SIGCHLD)) {
	sigprocmask(SIG_SETMASK, &nullmask, &savemask);
	/* should get pending signals here ?! */
	sigprocmask(SIG_SETMASK, &savemask, (sigset_t *)0);
}
	/* ugly window for signals to arrive */
else
	sigsuspend(&nullmask);

which is A Real Ugly Thing, and doesn't work all the time anyway.
also, if the process hangs in sigsuspend() i can kill -CLD it as often
as i like, it doesn't unblock.
I wonder how the csh works...

>|> ...as far as i get it,
>|> the BSD sigpause(2) is equivalent to posix sigsuspend(2), so what
>|> has sigpause(2) to do in a posix environment anyway?

>A system that implements sigsuspend() maintains sigpause() for
>compatibility with "older" systems; they are essentially equivalent. A
>potential difference is that the mask given to sigpause() is an int,
>whereas that given to sigsuspend() is a sigset_t; a system that
>implements more than 32 signals (as AIX does, I believe) can define
>sigset_t in such a way that sigsuspend() can mask any combination of
>signals, but sigpause() can only mask signals with numbers in the
>range 1 <= s <= 32.

ah, that's an argument.

something else i noticed: there's an

int p_chold; /* deferred signal bit mask; sigset(2)
			  * turns these bits on while signal(2)
			  * does not.
			  */

in proc.h. what does that tell me? should i use sigset instead of signal?
or do these bits only get set on sigset(SIGBLA,SIG_HOLD); ???

confused,

	Heiko
-- 
      Heiko Blume <-+-> src@scuzzy.in-berlin.de <-+-> (+49 30) 691 88 93
                    public source archive [HST V.42bis]:
        scuzzy Any ACU,f 38400 6919520 gin:--gin: nuucp sword: nuucp
                     uucp scuzzy!/src/README /your/home