[comp.lang.c] longjmp

pardo@june.cs.washington.edu (David Keppel) (03/30/88)

I got a copy of "Software Engineering in C" by Peter A. Darnell and
Philip E. Margolis from Springer-Verlag.  I haven't had a chance to
use it extensively yet, but it seems like a pretty good book.  Like
the new edition of K&R, it too precedes the ANSI standard and has
in it lots of ANSI-related things.  To its credit, it was published
early enough that it doesn't have anything in it about "noalias" :->

In the appendix on setjmp()/longjmp()  (page 421, A.10.2) the grey
book (:-) says:

    If longjmp() is invoked from a nested signal handler (that
    is, from a function invoked as a result of a signal raised
    during the handling of another signal), the behavior is
    undefined.  In all other interrupt and signal handling
    situations, longjmp() should execute correctly.

My question is:  why should this be so?

Seems like this could be a pretty common occurrence, say if you had
a (going Un*x here) ^Z handler which was then hit with a ^C, and
like "vi", the semantics of ^C are to jump out to some main loop.

I also don't understand why jumping out of a nested signal handler
is different from jumping out of a non-nested handler.  Somebody
here suggested that things might be funny on machines where the
invocation record of the signal handler is pushed onto a special
system stack, but I haven't yet figured out a way that this makes
a difference.

Please get answers to me any way you can!

     ;-D on  ( Thanks in advance and in retrospect, too )  Pardo

    pardo@cs.washington.edu	..!ucbvax!uw-beaver!uw-june!pardo

pardo@june.cs.washington.edu (David Keppel) (04/05/88)

According to both my Springer Verlag C book and to Chris Torek, the
dpANS document says that longjmp() out of nested signal handlers is
undefined.  I would like to know why this is; the reason is very
non-obvious to me.

This seems (possibly) like a really common occurrence.  For instance in
"vi", ^C is supposed to take you back to the main loop from wherever you
are, and it uses a longjmp() to do this (I think).  When "autowrite" is
set, sending a ^Z causes changes to be written before "vi" is suspended.
Sending ^C during ^Z is a well-defined operation and appears to be
implemented by longjmp() out of nested signal handlers, and would be
broken by the new standard.

This is at minimum a bothersome restriction from the programmers' view.
I assume that there is some good reason for this restriction either
from the compiler-writer's view or because of some hardware
organizations.  Somebody locally suggested that it might have something
to do with machines that use a separate stack for signal handlers, but
so far neither of us has had any ideas about why this (or anything
else, for that matter) would cause such a restriction.

Doug Gwyn suggests that this might have something to do with trying
to wrap multiple levels of trampoline code around longjmp(), but
doesn't have the gory details available.

So far only Doug Gwyn and Chris Torek have responded to my first
posting.  If you understand, please write!

	    ;-D on  ( Coming soon: ping-pong code )  Pardo

    pardo@june.cs.washington.edu   ..!ucbvax!uw-beaver!uw-june!pardo

nw@amdahl.uts.amdahl.com (Neal Weidenhofer) (04/06/88)

In article <4548@june.cs.washington.edu>, pardo@june.cs.washington.edu (David Keppel) writes:
> I also don't understand why jumping out of a nested signal handler
> is different from jumping out of a non-nested handler.
> 
>     pardo@cs.washington.edu	..!ucbvax!uw-beaver!uw-june!pardo

It isn't.  Using longjump from a signal handler ALWAYS results in
undefined behavior.

My favorite example is to consider the case of the signal being
raised while the program is in the middle of malloc(3) (for UN*X
types--something equivalent if you're using VMS or some other
OS).  There is NO WAY that your program is going to continue to
run correctly after control has been forcibly removed from some
routine while its internal tables are in an inconsistent state.

This is why dpANS limits signal handlers to setting a flag and
returning.  Most compilers and/or OS's are going to have to do
some work even to get this right.

The opinions expressed above are mine (but I'm willing to share.)

			Regards,
				Neal Weidenhofer
Nothin' ain't worth nothin'     ...{hplabs|ihnp4|ames|decwrl}!amdahl!nw
   But it's free.               Amdahl Corporation
				1250 E. Arques Ave. (M/S 316)
				P. O. Box 3470
				Sunnyvale, CA 94088-3470
				(408)737-5007

ix426@sdcc6.ucsd.EDU (Tom Stockfisch) (04/06/88)

In article <26739@amdahl.uts.amdahl.com> nw@amdahl.uts.amdahl.com (Neal Weidenhofer) writes:
>...  Using longjump from a signal handler ALWAYS results in
>undefined behavior.

This *greatly* reduces the value of signals.
I like to longjmp() back to a main processing loop on SIGINT.  If 
all I do is set a flag and return,
then I have to check that flag in a million other places to see if a user
is trying to get my attention.

The situation is even worse if the behavior is undefined for longjmp()ing
out of a signal handler for SIGFPE.  If you RETURN from a SIGFPE handler
the behavior is *badly* undefined -- you could be in an infinite loop.

>My favorite example is to consider the case of the signal being
>raised while the program is in the middle of malloc(3) (for UN*X
>types--something equivalent if you're using VMS or some other
>OS).  There is NO WAY that your program is going to continue to
>run correctly after control has been forcibly removed from some
>routine while its internal tables are in an inconsistent state.

You can avoid this by protecting all sections of code that shouldn't
be abandoned half-way thru:
	
	signal( SIGINT, on_sigint );
	...
	protect();
		...
		p =	malloc(SIZE);
		update_linked_list();
		...
	unprotect();
	...
	-----------
	on_sigint.c
	-----------
	static bool	interrupt_ok =		TRUE;
	static bool	interrupt_pending =	FALSE;

	void
	protect()
	{
		interrupt_ok =	FALSE;
	}

	void
	unprotect()
	{
		interrupt_ok =	TRUE;
		if ( interrupt_pending )
		{
			interrupt_pending =	FALSE;
			on_sigint();
		}
	}

	void
	on_sigint()
	{
		if ( !interrupt_ok )
		{
			interrupt_pending =	TRUE;
			return;
		}
		fprintf( stderr, "\n<interrupt>\n" );
		longjmp( main_loop, 1 );
	}

>This is why dpANS limits signal handlers to setting a flag and
>returning.  Most compilers and/or OS's are going to have to do
>some work even to get this right.
>...
>				Neal Weidenhofer

I thought unix implementers these days use a similar scheme to
protect sensitive portions of kernel and library function code.
I suppose it would be asking too much to require all C implementations
to do this.  I don't mind enclosing malloc() calls with protect()/unprotect(),
and I usually have some uninterruptible code of my own next to
these calls, but I wouldn't want to have to surround every getc(), printf(),
etc.

Anyway, if anyone is still listening, what versions of unix *do not*
shield sensitive system code from interrupts?
-- 

||  Tom Stockfisch, UCSD Chemistry   tps@chem.ucsd.edu

blarson@skat.usc.edu (Bob Larson) (04/06/88)

In article <26739@amdahl.uts.amdahl.com> nw@amdahl.uts.amdahl.com (Neal Weidenhofer) writes:

>Using longjump from a signal handler ALWAYS results in
>undefined behavior.
>My favorite example is to consider the case of the signal being
>raised while the program is in the middle of malloc(3) (for UN*X
>types--something equivalent if you're using VMS or some other
>OS).

>  There is NO WAY that your program is going to continue to
>run correctly after control has been forcibly removed from some
>routine while its internal tables are in an inconsistent state.

Just use mkon$p or mkonu$ (or the condition statement in a pl1 routine)
to create a handler for the cleanup$ condition.  Any non-local goto
("longjump") through the stack frame will cause the handler to be
called.  (Unless, of course, you're not running primos.  :-) (There are
occasional advantages to huge microcode supported calling sequences.)
[Please note before flaming me: this is to contrast the messages that
always assume vax/unix.  If you must flame, do so by mail.  I do agree
with the statement above if "portable" is inserted between "NO" and
"WAY".]

--
Bob Larson	Arpa: Blarson@Ecla.Usc.Edu	blarson@skat.usc.edu
Uucp: {sdcrdcf,cit-vax}!oberon!skat!blarson
Prime mailing list:	info-prime-request%fns1@ecla.usc.edu
			oberon!fns1!info-prime-request

greg@csanta.UUCP (Greg Comeau) (04/07/88)

In article <4609@june.cs.washington.edu> pardo@uw-june.UUCP (David Keppel) writes:
>
>According to both my Springer Verlag C book and to Chris Torek, the
>dpANS document says that longjmp() out of nested signal handlers is
>undefined.  I would like to know why this is; the reason is very
>non-obvious to me.

The problem that occurs with longjmp() is not within the function itself,
but with any side-effects that may occur because of the longjmp().
(This BTW, only happens to be spelled out in dpANS, but has been true
even without that being said since its first implementations).  For instance,
within library functions that you don't have source code to (or even your own
routines that you do have source code to), a given routine may be setting an
external variable for later use or maybe let's say as a semaphore.  If a
longjump occurs after the setting of the variable but before it's put to any
use, then you're in trouble.

Another quite obtuse reason is that the corresponding setjmp() may be
called in a line of code where it was part of a subexpression.  Yickie
poo for that one!

And of course, there is the always problematic code that longjmp()s
back to a routine that had made use of some register variable (or
variables that the compiler placed in registers), or that jumps to a
routine that has already returned!
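
The register-variable hazard can be made concrete.  What follows is
only a minimal sketch, assuming an ANSI C compiler; the names env,
bail() and count_attempts() are invented for illustration.  A local
that is changed between setjmp() and longjmp() has an indeterminate
value after the jump unless it is declared volatile, so the sketch
declares it volatile:

```c
#include <setjmp.h>

static jmp_buf env;             /* hypothetical jump target */

/* Stand-in for any deeply nested code that bails out. */
static void bail(void)
{
    longjmp(env, 1);
}

/* Returns the value of the local as seen after the longjmp(). */
static int count_attempts(void)
{
    /* volatile keeps the variable out of a register, so the
       store below is still visible after the jump */
    volatile int attempts = 0;

    if (setjmp(env) != 0)
        return attempts;        /* second return, via longjmp() */

    attempts = 1;               /* changed after setjmp() ... */
    bail();                     /* ... and before longjmp() */
    return -1;                  /* not reached */
}
```

Without the volatile qualifier a compiler is free to keep attempts in
a register that longjmp() restores to its setjmp()-time contents, which
is one way the "works most of the time" programs get into trouble.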

This is all about the dangers of longjmp in general though and does not
address nested signal handlers specifically, although this does present
problems for them.  Especially when it involves global variables or
ensuring that some event has completed before processing the second
signal (regardless of whether it is the same or not).

The main gist though is that a signal handler should do what it has to do
as quickly as possible.  Also of concern is the way signals are handled on
a given machine.  Whatever particular things it does to handle the interrupt
must be reversible for a normal return from the interrupt.
Allowing longjmp's to occur in a nested handler could make the signal
cleanup a real mess.

henry@utzoo.uucp (Henry Spencer) (04/08/88)

> I thought unix implementers these days use a similar scheme to
> protect sensitive portions of kernel and library function code.

Kernel, yes, library, no.

> I suppose it would be asking too much to require all C implementations
> to do this.  I don't mind enclosing malloc() calls with protect()/unprotect(),
> and I usually have some uninterruptible code of my own next to
> these calls, but I wouldn't want to have to surround every getc(), printf(),
> etc.

The implementors don't want to have to do that either, which is why they
generally don't.

> Anyway, if anyone is still listening, what versions of unix *do not*
> shield sensitive system code from interrupts?

Essentially all of them shield kernel code, but they do it by postponing
the interrupt, not by dancing around it in software after it happens.
Nobody that I know of makes any real attempt to shield library functions.
For one thing, it can be quite difficult to get all the details right.
For another thing, THIS PROBLEM HAS ALWAYS EXISTED, and programs that
ignore it are broken already.  X3J11 did not see any compelling reason
to tackle the difficult problem of unbreaking them.
-- 
"Noalias must go.  This is           |  Henry Spencer @ U of Toronto Zoology
non-negotiable."  --DMR              | {allegra,ihnp4,decvax,utai}!utzoo!henry

levy@ttrdc.UUCP (Daniel R. Levy) (04/08/88)

In article <3669@sdcc6.ucsd.EDU>, ix426@sdcc6.ucsd.EDU (Tom Stockfisch) writes:
# In article <26739@amdahl.uts.amdahl.com> nw@amdahl.uts.amdahl.com (Neal Weidenhofer) writes:
# >...  Using longjump from a signal handler ALWAYS results in
# >undefined behavior.
# 
# This *greatly* reduces the value of signals.
# I like to longjmp() back to a main processing loop on SIGINT.  If 
# all I do is set a flag and return,
# then I have to check that flag in a million other places to see if a user
# is trying to get my attention.

On the other hand, this can be desired.  You might be in an iterative procedure
where internal data is in a consistent state only in a few well-defined places.
Even if this isn't so, a careful look at any protracted, iterative sequence
of code that you want to be able to "bust out of" with a signal will reveal
"high traffic" areas where you can put the check; you don't need to put the
check in a "million other places."
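
A minimal sketch of that arrangement, assuming only ANSI C signal
handling.  The names interrupted, on_sigint() and iterate() are made
up, and the raise_at parameter exists purely so the demonstration can
interrupt itself with raise() instead of waiting for a real ^C:

```c
#include <signal.h>

/* The only object the handler touches. */
static volatile sig_atomic_t interrupted = 0;

static void on_sigint(int sig)
{
    (void)sig;
    interrupted = 1;            /* set a flag and return, nothing else */
}

/* Runs up to max_outer steps; returns how many were completed. */
static long iterate(long max_outer, long raise_at)
{
    long done = 0;

    while (done < max_outer) {
        if (interrupted)        /* the one "high traffic" check point; */
            break;              /* data is consistent here */

        /* ... one full step of work; data may be inconsistent ... */

        if (done == raise_at)
            raise(SIGINT);      /* demo only: stands in for the user */
        done++;
    }
    return done;
}
```

Install the handler with signal(SIGINT, on_sigint) before calling
iterate().  The step that is running when the signal arrives always
finishes, which is the point: the loop is abandoned only where its
data is known to be consistent.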
-- 
|------------Dan Levy------------|  Path: ..!{akgua,homxb,ihnp4,ltuxa,mvuxa,
|         an Engihacker @        |  	<most AT&T machines>}!ttrdc!ttrda!levy
|     AT&T Data Systems Group    |  Disclaimer?  Huh?  What disclaimer???
|--------Skokie, Illinois--------|

tps@chem.ucsd.edu (Tom Stockfisch) (04/09/88)

In article <2555@ttrdc.UUCP> levy@ttrdc.UUCP (Daniel R. Levy) writes:
>In article ix426@sdcc6.ucsd.EDU (Tom Stockfisch) THAT'S ME writes:
># In article <26739@amdahl.uts.amdahl.com> nw@amdahl.uts.amdahl.com (Neal Weidenhofer) writes:
># >...  Using longjump from a signal handler ALWAYS results in
># >undefined behavior.
># 
># This *greatly* reduces the value of signals.
># I like to longjmp() back to a main processing loop on SIGINT.
>
>... a careful look at any protracted, iterative sequence
>of code that you want to be able to "bust out of" with a signal will reveal
>"high traffic" areas where you can put the check
>----------Dan Levy-------

I like your idea -- I'll give it a try.  But then I have this question:
in an ANSI C conforming program, am I allowed to do a longjmp() from
a handler for SIGFPE (floating point exception)?
I do this in numeric programs in the same way
as I've been handling SIGINT.  On some systems at least, the result
of *returning* from a SIGFPE handler is undefined.
For instance, on our Celerity machine there is the following warning
in the man entry for signal(3):

     "The assembler optimizations produce code that is not suit-
     able for restoration from floating point exceptions. The
     results of trying to restore from floating point exception
     into assembler-optimized code are not defined.

     "Restoration from floating point exception occurs if there is
     a signal handler for SIGFPE that executes a return state-
     ment, or if SIGFPE is being ignored. In these cases assem-
     bler optimization should be disabled.

     "If the default signal handling is used for SIGFPE, or if the
     SIGFPE signal handler always calls exit(3) or longjmp(3),
     assembler optimization can be used."

So it would seem SIGFPE could not be caught portably at all.  If you
don't return, you violate ANSI, and if you don't exit() or longjmp(),
the results are undefined.
-- 

|| Tom Stockfisch, UCSD Chemistry	tps@chem.ucsd.edu

gwyn@brl-smoke.ARPA (Doug Gwyn ) (04/10/88)

In article <147@chem.ucsd.EDU> tps@chem.ucsd.edu (Tom Stockfisch) writes:
>So it would seem SIGFPE could not be caught portably at all.  If you
>don't return, you violate ANSI, and if you don't exit() or longjmp(),
>the results are undefined.

Somewhere this issue got on the wrong track.  The current dpANS for C
permits longjmp() in the presence of signals as well as from a signal
handler; however, if invoked from a NESTED signal handler the behavior
is undefined.  This is simply a loophole to permit implementors who
would have a hard time making that work correctly to punt on it.  It
is hoped that implementors who can make it work correctly would do so.

SIGFPE can be caught portably, but be aware that there is no requirement
that a SIGFPE be generated, even for division by zero.
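
A sketch of the pattern being argued about, not a ruling on it.  It
uses raise(SIGFPE) because nothing requires a division by zero to
generate the signal, and a handler entered through raise() is the
uncontroversial case.  The names fpe_env, on_sigfpe(), guarded_call(),
faulty() and harmless() are invented for the example:

```c
#include <setjmp.h>
#include <signal.h>

static jmp_buf fpe_env;

static void on_sigfpe(int sig)
{
    (void)sig;
    longjmp(fpe_env, 1);        /* leave the handler without returning */
}

/* Calls fn(); returns 1 if a SIGFPE was caught, 0 otherwise. */
static int guarded_call(void (*fn)(void))
{
    void (*old)(int) = signal(SIGFPE, on_sigfpe);
    int caught;

    if (setjmp(fpe_env) == 0) {
        fn();
        caught = 0;
    } else {
        caught = 1;             /* arrived here from on_sigfpe() */
    }
    signal(SIGFPE, old);        /* put the previous disposition back */
    return caught;
}

static void faulty(void)   { raise(SIGFPE); }   /* simulated exception */
static void harmless(void) { }
```

Whether SIGFPE is still blocked after the jump, and whether the same
code is safe when the signal arrives asynchronously in the middle of a
library call, are implementation matters that this sketch does not
settle.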

mishkin@apollo.uucp (Nathaniel Mishkin) (04/11/88)

In article <122@csanta.UUCP> greg@csanta.UUCP (Greg Comeau) writes:
>In article <4609@june.cs.washington.edu> pardo@uw-june.UUCP (David Keppel) writes:
>The problem that occurs with longjmp() is not within the function itself,
>but with any side-effects that may occur because of the longjmp().
>(This BTW, only happens to be spelled out in dpANSI, but has been true
>even without that being said since its first implementations).  For instance,
>within library functions that you don't have source code to (or even your own
>routines that you do have source code to), a given routine may be setting an
>external variable for later use or maybe let's say as a semaphore.  If a
>longjump occurs after the setting of the variable but before it's put to any
>use, then you're in trouble.
>
>Another quite obtuse reason is that the corresponding setjmp() may be
>called in a line of code where it was part of a subexpression.  Yickie
>poo for that one!

Apollo defines a package called PFM (Process Fault Manager) to deal with
this sort of problem.  The two primitives relevant to this discussion
are "pfm_$cleanup" and "pfm_$signal".  They are analogous to "setjmp"
and "longjmp" except they are "stacked".  Basically, you use them like:

    boolean SomeImportantStateVariable = false;

    foo()
    {
        pfm_$cleanup_rec crec;

        status = pfm_$cleanup(crec);
        if (status.all != pfm_$cleanup_set) {           /* first time? */
            SomeImportantStateVariable = false;         /* No, restore state */
            pfm_$signal(status);                        /* resignal */ 
        }
        else {
            SomeImportantStateVariable = true;
            /* ... Do some important stuff ... */
            SomeImportantStateVariable = false;
            pfm_$rls_cleanup(crec);
        }
    }

(Lisp people should recognize this as something like UNWIND-PROTECT.)

"pfm_$cleanup" returns the first time with the constant value "cleanup
set" (a la "setjmp" returning 0).  It returns the second time with the
integer value "thrown" by a "pfm_$signal".  "pfm_$signal" causes a long
jump to the site of the most recent "pfm_$cleanup".  "pfm_$signal" can
be called explicitly (like "longjmp").  Also, the various Unix signals
are automatically turned into calls to "pfm_$signal" if no signal handler
exists.  Cleanup handlers (the term for the "then" clause of the above
"if" statement) can either choose to resignal by calling "pfm_$signal"
or eat the signal and continue processing at that level.  Generally, you're
supposed to resignal unless you recognize the signal that was thrown.
         
I won't make any argument for this being syntactically "pretty", but
it is at least conceptually the right thing.  (I'd like language support
for exception handling, but I don't get to pick these things.)

As part of making Apollo's Network Computing System (NCS) portable, I
had to deal with making a portable subset of PFM.  NCS supports a remote
procedure call (RPC) facility and depends on the above cleanup mechanism.
When you make a call to a remote procedure, if the target of the call
doesn't respond, NCS raises an exception using PFM.  The remote call
looks syntactically like a local call (you're calling a local stub)
so even if we thought it was the right thing to indicate call failure by
returning some "error status" (we don't), we can't since we don't get to
pick the signature of the remote procedure.  Checking global status
variables (like "errno") for failure indications is also forbidden
if you ever want your software to work in an environment where there
are multiple threads of control per address space (we do).

It turns out that implementing the necessary parts of PFM on vanilla
Unix systems via "setjmp/longjmp" was pretty trivial.  (We're talking
300 lines of code here.)  Really, all PFM amounts to is a consistent
and disciplined use of "setjmp/longjmp".  I think I can post the source
(to some appropriate group) if anyone expresses interest.

By the way, the vanilla Unix implementation of PFM does in fact depend
on the "yicky poo" use of a "setjmp" inside an expression.  "pfm_$cleanup"
is:

    #define pfm_$cleanup(crec) \
        pfm_$_cleanup(setjmp(crec.buf), &crec)

So far, I've found two compilers/runtimes that don't handle this right.
For them, I write "pfm_$cleanup" as:

    #define pfm_$cleanup(crec) ( \
        pfm_$global_setjmp_value = setjmp(crec.buf), \
        pfm_$_cleanup(pfm_$global_setjmp_value, &crec) \
    )

at the cost of introducing a yicky poo global variable.  (Hey, I can't
fix *all* of Unix's problems at once!)
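
The "consistent and disciplined use" of setjmp/longjmp can be sketched
in portable C.  This is an illustration only and not Apollo's PFM
source: every name in it (cleanup_rec, cleanup_push, cleanup_signal and
so on) is hypothetical.  It keeps setjmp() as the whole operand of a
comparison and passes the thrown value through a global, which
sidesteps compilers that mishandle setjmp() inside a larger expression:

```c
#include <setjmp.h>
#include <stdio.h>
#include <stdlib.h>

/* All names here are hypothetical; this is not Apollo's PFM source. */
struct cleanup_rec {
    jmp_buf buf;                    /* where to resume */
    struct cleanup_rec *prev;       /* next-outer cleanup record */
};

static struct cleanup_rec *cleanup_top = NULL;  /* innermost record */
static int cleanup_code = 0;        /* value "thrown" most recently */
static int important_state = 0;

static void cleanup_push(struct cleanup_rec *rec)
{
    rec->prev = cleanup_top;
    cleanup_top = rec;
}

static void cleanup_release(struct cleanup_rec *rec)
{
    cleanup_top = rec->prev;
}

/* Jump to the innermost cleanup record, popping it first. */
static void cleanup_signal(int code)
{
    struct cleanup_rec *rec = cleanup_top;

    if (rec == NULL) {
        fprintf(stderr, "unhandled cleanup code %d\n", code);
        exit(EXIT_FAILURE);
    }
    cleanup_top = rec->prev;
    cleanup_code = code;
    longjmp(rec->buf, 1);
}

static void worker(void)
{
    struct cleanup_rec rec;

    cleanup_push(&rec);
    if (setjmp(rec.buf) != 0) {         /* second return: a throw */
        important_state = 0;            /* restore state */
        cleanup_signal(cleanup_code);   /* resignal to the outer record */
    }
    important_state = 1;
    cleanup_signal(42);                 /* simulated failure mid-way */
    important_state = 0;                /* not reached in this sketch */
    cleanup_release(&rec);
}

/* Returns the code that was thrown, or 0 if worker() finished. */
static int run_guarded(void)
{
    struct cleanup_rec rec;

    cleanup_push(&rec);
    if (setjmp(rec.buf) != 0)
        return cleanup_code;            /* eat the signal at this level */
    worker();
    cleanup_release(&rec);
    return 0;
}
```

Calling run_guarded() unwinds through worker()'s cleanup clause before
the outer level sees the code, so important_state is back to 0 by the
time the caller regains control.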
-- 
                    -- Nat Mishkin
                       Apollo Computer Inc.
                       Chelmsford, MA
                       {decvax,mit-eddie,umix}!apollo!mishkin

nw@amdahl.uts.amdahl.com (Neal Weidenhofer) (04/15/88)

In article <3669@sdcc6.ucsd.EDU>, ix426@sdcc6.ucsd.EDU (Tom Stockfisch) writes:
> In article <26739@amdahl.uts.amdahl.com> nw@amdahl.uts.amdahl.com (Neal Weidenhofer) writes:
> >...  Using longjump from a signal handler ALWAYS results in
> >undefined behavior.
> 
> This *greatly* reduces the value of signals.
> I like to longjmp() back to a main processing loop on SIGINT.  If 
> all I do is set a flag and return,
> then I have to check that flag in a million other places to see if a user
> is trying to get my attention.

The point is that this has ALWAYS been true.  If you
(permanently) exit a subroutine in a random place, global
and/or static variables may be left in an inconsistent state so
that future use of routines that depend on those variables may
fail in mysterious ways.

The fact that it hasn't happened to you yet only means that it
works most of the time, not that it's correct.

> The situation is even worse if the behavior is undefined for longjmp()ing
> out of a signal handler for SIGFPE.  If you RETURN from a SIGFPE handler
> the behavior is *badly* undefined -- you could be in an infinite loop.

Yes, this can be a problem on some architectures.  You may have
to choose the lesser of two evils (which, as some friends of
mine are fond of pointing out, is still evil.)

> >My favorite example is to consider the case of the signal being
> >raised while the program is in the middle of malloc(3) (for UN*X
> >types--something equivalent if you're using VMS or some other
> >OS).  There is NO WAY that your program is going to continue to
> >run correctly after control has been forcibly removed from some
> >routine while its internal tables are in an inconsistent state.
> 
> You can avoid this by protecting all sections of code that shouldn't
> be abandoned half-way thru:

[Long example deleted.]

This example will probably work, provided that you identify ALL
instances of vulnerable code.  Note that it works by having the
signal handler just set a flag most of the time anyway.

Consider the problem faced by X3J11 though.  It's our job to
specify the language in such a way that conforming programs will
ALWAYS work on conforming implementations.  There is just no way
that a language standard could specify a scheme such as this and
when it needs to be used.

> >This is why dpANS limits signal handlers to setting a flag and
> >returning.  Most compilers and/or OS's are going to have to do
> >some work even to get this right.
> >...
> >				Neal Weidenhofer
> 
> I thought unix implementers these days use a similar scheme to
> protect sensitive portions of kernel and library function code.
> I suppose it would be asking too much to require all C implementations
> to do this.  I don't mind enclosing malloc() calls with protect()/unprotect(),
> and I usually have some uninterruptible code of my own next to
> these calls, but I wouldn't want to have to surround every getc(), printf(),
> etc.

The kernel is typically protected by the hardware delaying some
or all interrupts until the kernel returns control to the user.

For a library routine though, stop and think for a moment.  How
would you accomplish it?

> ||  Tom Stockfisch, UCSD Chemistry   tps@chem.ucsd.edu

The opinions expressed above are mine (but I'm willing to share.)

			Regards,
				Neal Weidenhofer
Where have all the              ...{hplabs|ihnp4|ames|decwrl}!amdahl!nw
     graveyards gone?           Amdahl Corporation
Gone to flowers every one.      1250 E. Arques Ave. (M/S 316)
				P. O. Box 3470
				Sunnyvale, CA 94088-3470
				(408)737-5007

kent@happym.UUCP (Kent Forschmiedt) (04/19/88)

[ various discussion about promises (or lack of them) regarding
  the condition of one's data segment after longjmp() from a
  signal handler ]
>The kernel is typically protected by the hardware delaying some
>or all interrupts until the kernel returns control to the user.
>
>For a library routine though, stop and think for a moment.  How
>would you accomplish it?
----

Maybe I'm old fashioned, but I usually mask out signals in critical 
sections, and/or keep status flags to detect errors and reentry... 

I usually use a single signal handler as a dispatcher. This makes 
detection and handling of multiple signals easy.  It is always (in 
the implementations I use) possible to get another (NOT the same 
one) signal in before the handler has a chance to mask it out.
When it is really important, I use semaphores and dispatch queues
to keep it all under control.  That sort of thing usually requires
cooperation from otherwise unrelated parts of the application.


Anyway, the point I want to make is that none of that is new, nor
is any of this:

When a system call is interrupted by a signal, the function returns 
an error status and errno == EINTR.  That is orderly and well 
defined. If you care whether your program works (and sometimes we 
don't), your code must look for error conditions returned from 
system calls. 
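
For instance, a caller that wants an interrupted read() simply retried
might wrap it as below.  This is a sketch assuming a POSIX-style
read() that fails with errno set to EINTR; read_retry() is a made-up
name:

```c
#include <errno.h>
#include <unistd.h>

/* Like read(), but starts over if a signal interrupted the call
   before any data was transferred. */
static ssize_t read_retry(int fd, void *buf, size_t len)
{
    ssize_t n;

    do {
        n = read(fd, buf, len);
    } while (n == -1 && errno == EINTR);

    return n;                   /* byte count, 0 at EOF, or -1 on error */
}
```

Retrying is only one policy.  A program that treats the signal as a
request to stop would test its own flag inside the loop instead of
looping unconditionally.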

Any other library function, however, is the same as any user code. 
It is running in user mode, and may be asynchronously interrupted. 
Such interrupts may occur any old time, not just between source 
lines. That has always been the case, with every architecture and 
compiler that I have ever seen, so this whole issue of global and 
static data being questionable after a longjmp() from a signal 
handler is old hat.

So it's the same old story.  Codification of existing practice and 
all that.  If there is a potential problem, you must design your 
code to deal with it, or live with unreliable programs.  Various 
systems provide variously useful ways of dealing with it.  I don't 
know whether dpANS meddles much with signal control semantics - I 
doubt it. 

-- 
--
	Kent Forschmiedt -- uucp: tikal!camco!happym!kent
	Happy Man Corporation  206-282-9598

rml@hpfcdc.HP.COM (Bob Lenk) (04/20/88)

> Somewhere this issue got on the wrong track.  The current dpANS for C
> permits longjmp() in the presence of signals as well as from a signal
> handler; however, if invoked from a NESTED signal handler the behavior
> is undefined.

This is not off the track at all.  Quoting from section 4.7.1.1, page 121,
lines 16-18

	If the signal occurs other than as the result of calling the
	abort or raise function, the behavior is undefined if the
	signal handler calls any function in the standard library
	other than the signal function itself or ...

There are no exceptions for longjmp() or exit() (despite their mention
earlier on the same page).  This is correct, because their behavior is
indeed potentially incorrect if the signal has interrupted a call like
malloc(), atexit(), or one of the stdio functions.

Of course there are other times that calling exit() or longjmp() or
various other functions may indeed be safe, but there's no way using
just X3J11 primitives to determine that (in fact there's no way to
know for sure that a given signal is the result of abort() or raise();
consider a UNIX environment where another process has done a kill()
just before yours did the abort() or raise()).

POSIX addresses this less restrictively, and provides primitives to
mask signals so that applications can know what a signal is (or is not)
interrupting.  Of course C must run on a wider range of implementations
than POSIX.
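
A minimal sketch of that masking style, assuming a POSIX system with
sigprocmask().  The names got_sigint, run_protected(), demo_critical()
and seen_inside are invented, and demo_critical() raises the signal
itself only to show that delivery waits until the mask is restored:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

static volatile sig_atomic_t got_sigint = 0;
static int seen_inside = -1;    /* demo only: flag value inside the section */

static void on_sigint(int sig)
{
    (void)sig;
    got_sigint = 1;
}

/* Runs critical() with SIGINT blocked.  Returns 0 on success, -1 if
   the mask could not be changed. */
static int run_protected(void (*critical)(void))
{
    sigset_t block, saved;

    sigemptyset(&block);
    sigaddset(&block, SIGINT);
    if (sigprocmask(SIG_BLOCK, &block, &saved) != 0)
        return -1;

    critical();                 /* cannot be interrupted by SIGINT */

    /* Restoring the mask lets any pending SIGINT be delivered now. */
    return sigprocmask(SIG_SETMASK, &saved, NULL);
}

/* Demo critical section: a SIGINT arrives while it is running. */
static void demo_critical(void)
{
    raise(SIGINT);              /* held pending because it is blocked */
    seen_inside = got_sigint;   /* still 0: the handler has not run */
}
```

With the handler installed by signal(SIGINT, on_sigint), the handler
runs only after the critical section has finished, so it knows what it
is (and is not) interrupting.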

POSIX probably should require that longjmp() from a nested signal
handler work; it currently does not.

		Bob Lenk
		{ihnp4, hplabs}!hpfcla!rml
		rml%hpfcla@hplabs.hp.com

pardo@june.cs.washington.edu (David Keppel) (05/09/88)

Question about the (portable) definition of longjmp():

I'm writing (yet another) lightweight process manager.  I malloc() off
a bunch of little stacks.  I'd like to be able to terminate on
abnormal conditions by doing a longjmp() from any given stack to the
original ("heavyweight") stack and cleaning up the package.

It seems "obvious" to me that I can't just go longjmp()ing across
random stacks, but I wonder if

+ it is possible to portably jump back to the original stack from any
  arbitrary stack (e.g., longjmp() is guaranteed to be implemented so
  this will work)
+ whether I can force this to be portable (e.g., the dynamic link of
  the first frame on any stack always points to the "heavy" stack)
+ or this is definitely going to break on some machine(s).

Please reply directly, as I'm sure people would rather be reading
about "goto" (:-).  I'll summerize (or is that winterize?) if there's
sufficient interest.

	;-D on  ( All tanked up and nowhere to go )  Pardo

		..!ucbvax!uw-beaver!uw-june!pardo
		    pardo@june.cs.washington.edu

rtm@christmas.UUCP (Richard Minner) (10/11/90)

In article <13914:Oct920:48:3290@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>In article <:_A6T46@xds13.ferranti.com> peter@ficc.ferranti.com (Peter da Silva) writes:
>> signals. In fact, calling longjmp from within signals is evil. The only
>> thing you should do within a signal routine is set a flag... anything
>> else is a bug waiting to happen.
>Correct.

(Note: my comma key is DEAD; please excuse the misuse of semi-colons.)

I'm hoping; from Dan's followup to alt.religion.computers; that this 
really is a religious issue; perhaps like goto's (i.e. it's evil in
that it is easy to do stupid things with it).  You see; I just did
this evil thing (longjmp in handler) for the very first time yesterday;
and I wanted to know if I really will go to hell for it and if so; why?

I tried using a flag; honest; but I'm catching SIGSEGV@ and the stupid
thing just got stuck in a loop and never could get to where I tested
the flag.  (Perhaps I missed something there; but the def of signal says
to me that "if func (the handler) executes a return statement ... the
program will resume execution at the point it was interrupted".  In my
case this appears to be right back at the "invalid access to storage".
Someone please correct me if I'm mistaken here.)  So; what I did is
basically:
	catch_sigsegv();	/* set handler */
	if (setjmp(env) == 0)
		[code that might cause SIGSEGV]
	else
		[caught SIGSEGV; set error etc.]
	release_sigsegv();	/* restore SIGSEGV */

and	void handler(sig) int sig; { longjmp(env [comma] 1); }

Anyway; it seems to work fine and I don't see where the hidden bug is.
Is this really a rotten thing to do?
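
For comparison; here is the same shape written out as a sketch that
compiles.  It assumes a POSIX system and uses sigsetjmp()/siglongjmp()
so that the signal mask in force at the sigsetjmp() call is put back
after the jump.  The names segv_env; on_sigsegv(); try_risky() and
fake_fault() are made up; and fake_fault() only raises the signal
rather than touching bad memory:

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

static sigjmp_buf segv_env;

static void on_sigsegv(int sig)
{
    (void)sig;
    siglongjmp(segv_env, 1);    /* also restores the saved signal mask */
}

/* Returns 0 if risky() completed, 1 if SIGSEGV was caught, and -1 if
   the handler could not be installed. */
static int try_risky(void (*risky)(void))
{
    struct sigaction sa, old;
    int result;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigsegv;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0)
        return -1;

    if (sigsetjmp(segv_env, 1) == 0) {  /* nonzero: save the mask too */
        risky();
        result = 0;
    } else {
        result = 1;                     /* came back from the handler */
    }

    sigaction(SIGSEGV, &old, NULL);     /* restore SIGSEGV handling */
    return result;
}

static void fake_fault(void)
{
    raise(SIGSEGV);             /* demo stand-in for a real fault */
}
```

The sketch says nothing about the state of whatever risky() was in the
middle of when the signal arrived; that part of the question remains.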

------
@ Why catch SIGSEGV and do anything but abort()?  It has to do with
  mmap()'ing a file under Sun UNIX and running out of disk space.
  If you want details; send mail to the address below.


-- 
Richard Minner  || {uunet,sun,well}!island!rtm     (916) 736-1323 ||
                || Island Graphics Corporation     Sacramento, CA ||