[net.lang.c] setjmp/longjmp

ddb@mrvax.DEC (DAVID DYER-BENNET MRO1-2/L14 DTN 231-4076) (10/24/84)

I will propose a semantics for setjmp/longjmp that is compiler-independent,
implementable in a not-too-bad worst-case way on all machines with which
I am familiar, and reasonably consistent with current implementations.

I see in previous articles enough information to convince me that we can't
ask that all local variables be restored in longjmp to their values at the
time of the setjmp, or to their last value in the routine containing the
setjmp.  If there is a way to achieve this that is affordable, I'd far
prefer it to my proposal.

So the problem comes down to characterizing which variables will be
restored, and which will have random values.  I propose that only
variables not altered since the setjmp call be guaranteed restored; other
variables will have random values (not necessarily their last set value,
the value they had at the time of the setjmp, or anything else).
This is obviously achievable by simply saving and restoring all registers
at setjmp/longjmp; on some hardware considerable improvements in this can
be made.  
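
A minimal sketch of the rule proposed above, not the behavior of any
particular implementation: a local left alone after the setjmp call can
be relied on after the longjmp, while one changed in between cannot.
The `volatile` qualifier is an assumption borrowed from later standard
C, where it keeps the variable in memory; it is not part of the
proposal.  The names `env`, `bail_out`, and `run_demo` are made up for
illustration.

```c
#include <setjmp.h>

static jmp_buf env;                 /* hypothetical name */

static void bail_out(void)
{
    longjmp(env, 1);                /* resume at the setjmp below, returning 1 */
}

/* Returns untouched * 100 + counter as observed after the longjmp. */
int run_demo(void)
{
    int untouched = 7;              /* not altered after setjmp: guaranteed */
    volatile int counter = 0;       /* altered after setjmp: volatile keeps it in memory */

    if (setjmp(env) == 0) {
        counter = 5;                /* change made between setjmp and longjmp */
        bail_out();                 /* does not return */
    }
    return untouched * 100 + counter;
}
```

Calling run_demo() yields 705.  Without `volatile`, the proposal would
make no promise about what `counter` holds after the longjmp.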

So, is this enough to be useful?  And, is more achievable transportably?

I don't see how to achieve more transportably; if someone can, good.

I find this enough to be useful, but only marginally.  It is at least
fairly easy to know what I can count on.  I consider a little bit that
I can count on better than a lot that isn't there when I need it.

			-- David Dyer-Bennet
			-- ...decwrl!dec-rhea!dec-mrvax!ddb

gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (10/29/84)

> I see in previous articles enough information to convince me that we can't
> ask that all local variables be restored in longjmp to their values at the
> time of the setjmp, or to their last value in the routine containing the
> setjmp.  If there is a way to achieve this that is affordable, I'd far
> prefer it to my proposal.
> 
> So the problem comes down to characterizing which variables will be
> restored, and which will have random values.  I propose that only
> variables not altered since the setjmp call be guaranteed restored; other
> variables will have random values (not necessarily their last set value,
> the value they had at the time of the setjmp, or anything else).
> This is obviously achievable by simply saving and restoring all registers
> at setjmp/longjmp; on some hardware considerable improvements in this can
> be made.  
> 
> So, is this enough to be useful?  And, is more achievable transportably?

I agree with Dennis Ritchie that longjmp should resume after the call
to setjmp with all accessible data containing values as of the time of
the longjmp.  This can always be implemented, although some C run-time
designs seem to have not saved enough information on the call stack to
support longjmp.  Such implementations need to be changed.  Any system
where a subroutine call saves all the non-scratch registers (e.g., PDP-11,
Gould) is in good shape, as is any system where the number of registers
actually saved is recorded in the call frame (e.g. VAX).  Other systems
may have to have their calling sequences re-engineered, but that is what
they deserve for taking shortcuts.

johnl@godot.UUCP (11/01/84)

Doug Gwyn argues that longjmp() should always restore all variables in the 
calling routine, and that on machines where this is impossible (because you 
can't tell how many registers were saved at each call) the calling sequence 
"should be fixed." 

No dice.  On machines without stack hardware, such as most notably the IBM 370 
series, it's hard enough to set up a fast C calling sequence without adding 
extra restrictions on unwindability.  When designing VM/IX (the version of Unix 
that runs under VM/370) it took them months to get a calling sequence that 
worked and was only a few instructions.  Changing it so that longjmp() could 
unwind it would add several instructions to every call and return, or else 
require storing and loading many registers which don't change, a similar 
penalty.  This seems a lot to pay for an occasional call to longjmp().  Sounds 
to me much like the traditional bad habit in much Berkeley software of assuming 
that you can dereference a zero pointer and find a zero there -- again much 
hardware makes this difficult or impossible, and in that case I believe we all 
agree that programs that depend on such behavior aren't portable.  

There seems to be a consensus that the semantics for longjmp() should promise 
that all variables in the calling routine except those declared to be 
"register" are restored after a longjmp(), which is both easy to understand and 
straightforward to implement on all hardware of which I am aware.  It also 
happens to be the de-facto standard now, so claims that programs "can't work" 
if longjmp() behaves that way are hard to believe.  

Incendiarily, 

John Levine, ima!johnl or Levine@YALE.ARPA

stevel@haddock.UUCP (11/01/84)

How can one make setjmp/longjmp work with a compiler that
optimizes register usage to the point of holding values, both
intermediates and automatics, in registers across expressions?  I
know that pcc and the Ritchie compiler don't do this, but some
newer ones do.

Forcing one to know what was going on in all the registers and
where the current live value for each automatic variable is limits
how one can use graph coloring algorithms to allocate registers.

Steve Ludlum, decvax!yale-co!ima!stevel, {amd|ihnp4!cbosgd}!ima!stevel

david@imd.UUCP (11/01/84)

>***** imd:net.lang.c / brl-tgr!gwyn /  9:56 am  Oct 30, 1984
>I agree with Dennis Ritchie that longjmp should resume after the call
>to setjmp with all accessible data containing values as of the time of
>the longjmp.  This can always be implemented, although some C run-time
>designs seem to have not saved enough information on the call stack to
>support longjmp.  Such implementations need to be changed.  Any system
>where a subroutine call saves all the non-scratch registers (e.g., PDP-11,
>Gould) is in good shape, as is any system where the number of registers
>actually saved is recorded in the call frame (e.g. VAX).  Other systems
>may have to have their calling sequences re-engineered, but that is what
>they deserve for taking shortcuts.

Perhaps setjmp/longjmp can always be implemented, but a correct
implementation comes at the expense of performance.  I would also
disagree that the "pdp-11 is in good shape".  Either it works 100%
of the time, or it should not be relied on.  The following bugs
indicate to me that it would be a lot of work to get setjmp/longjmp
to work ALL THE TIME on the pdp11.

The following code shows a problem with setjmp/longjmp in that if
a routine does not follow proper register saving conventions, then
longjmp cannot know that a register was saved temporarily.

Our (system 3, I believe) ritchie pdp11 C compiler (C rel 2.3; UTS
rel 1.3) generates a push and restore of register 2 around
structure copies.  If a longjmp is needed during the structure
copy, longjmp will have no idea that this register was temporarily
used for the structure copy.

In the following code, the register variables are set to -1, and
never used.  Hence after a longjmp, they should all be -1.
However, longjmp does not realize that register 2 was saved on the
stack for the structure copy (as it was not done via csav).  Hence,
when the longjmp occurs, register variable 'k' is restored to an
incorrect value (which I believe is the number of words left for
the structure copy).  [Note that the following code works correctly
with a pdp11 pcc compiler, but fails with the ritchie compiler,
since the code that pcc is generating is so silly].

struct  {
	char    array[16000];
	}       astruct, bstruct;

int     jmpbuf[32];

int     alarmrtn ();

main () {
	register    int     i, j, k;

	for (;;) {
	    i = j = k = -1;             /* set all register vars to -1.     */
	    if (setjmp (jmpbuf) == 0)
		break;
	    printf ("i = %d, j = %d, k = %d\n", i, j, k);
	    }

	signal (14, alarmrtn);          /* cause longjmp after 1 second.    */
	alarm (1);                      /* set alarm higher for a busy      */
	for (;;)                        /* system to be sure gets into loop.*/
	    astruct = bstruct;          /* structure copy of 16000 bytes.   */
	}

alarmrtn ()
{
	longjmp (jmpbuf, 1);
	}


A similar problem occurs in libc.a or other assembly language
routines that do not use proper csav subroutine linkage.  For
example our strncmp(3) in libc.a [which may not be the distributed
strncmp] is written in assembler, and again saves and restores
register 2.

However, even if one fixes these problems, one has to fix csav!
Our csav is setting up a new frame (r5) prior to saving the
registers, hence if one longjmps from an intr/alarm from 2
instructions inside csav, [easiest way is with a debugger] all
register variables get clobbered!

Also, it is possible on the pdp11 to leave the floating point in
the wrong int/long (seti/setl) mode after a longjmp.  This is
probably worse than registers getting clobbered.  This can be
caused by a similar longjmp occurring during an intr/alarm of the C
code: long = double;  This bug can give core dumps!  Unfortunately,
the obvious fix requires putting in floating point into programs
that have setjmp, even if they do not use floating point.  [Which
may break "ed" on pdp's without floating point when one hits
<intr>].

So, even though I use setjmp/longjmp when I have to, didn't your
mother ever tell you to stay away from setjmps/longjmps?  :-)


	These views are my own, and are probably wrong!

	    David Marx
	    ima!imd!david

gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (11/01/84)

David Marx posted a nice article about the problems with longjmp'ing
from a signal handler on PDP-11s.  The problem is not quite so bad
on systems with atomic linkage sequences but it is still tricky.

This study reinforces a conclusion I drew long ago:  About all one
should do in response to an asynchronous trap is to set a flag for
periodic testing from the main program loop.  (That gives one MUCH
better control over program internal operation.)  It makes one wonder
what good longjmp is after all...
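
The flag-setting approach described above might look roughly like the
sketch below.  It assumes standard C signal handling; raise() stands in
for a genuinely asynchronous interrupt so that the outcome is
predictable, and the names `got_signal`, `on_signal`, and
`run_main_loop` are made up for illustration.

```c
#include <signal.h>

static volatile sig_atomic_t got_signal = 0;    /* hypothetical flag name */

static void on_signal(int sig)
{
    (void)sig;
    got_signal = 1;                 /* the handler only records the event */
}

/* Runs up to max_steps units of work and returns how many completed
   before the flag was noticed.  raise() simulates a trap arriving
   during step trigger_step. */
int run_main_loop(int max_steps, int trigger_step)
{
    int step;

    got_signal = 0;
    signal(SIGINT, on_signal);
    for (step = 0; step < max_steps; step++) {
        if (got_signal)             /* periodic test at a safe point */
            break;
        if (step == trigger_step)
            raise(SIGINT);          /* simulate the trap */
    }
    signal(SIGINT, SIG_DFL);        /* restore default handling */
    return step;
}
```

Here run_main_loop(10, 3) returns 4: the step that was interrupted
finishes normally, and the loop stops at its next test of the flag.
No registers or stack frames have to be unwound.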

That was the only article so far that inclines me to surrender the
desire for specifying clean longjmp semantics.  There is not much
point in insisting on that if it is unrealizable.

thomson@uthub.UUCP (Brian Thomson) (11/03/84)

While I agree that setjmp/longjmp SHOULD behave according to Gospel,
I have a great deal of sympathy for the implementation designer
who doesn't want to sacrifice the efficiency of his calling sequence
for the sake of longjmp.

There is a similar issue of whether to use a stack frame pointer.
Your call/return sequence will be faster if you don't, and you can
address everything relative to the sp.  All the frame pointer
buys you is the ability to get a calling trace from your favourite
core image debugger.  Is it worth it?

Some 780 instruction timings I did years ago indicated that the
CALLS/RET pair take ~15 microseconds minimum, while JSR/RTS execute
in 3.  That's a big penalty for the sake of setjmp(), calling traces,
and, oh yes, the infamous nargs().
-- 
		    Brian Thomson,	    CSRI Univ. of Toronto
		    {linus,ihnp4,uw-beaver,floyd,utzoo}!utcsrgv!uthub!thomson

bsa@ncoast.UUCP (Brandon Allbery) (11/06/84)

> Article <1@imd.UUCP>, from david@imd.UUCP
+----------------
| >I agree with Dennis Ritchie that longjmp should resume after the call
| >to setjmp with all accessible data containing values as of the time of
| >the longjmp.  This can always be implemented, although some C run-time
| >designs seem to have not saved enough information on the call stack to
| >support longjmp.  Such implementations need to be changed.  Any system
| >where a subroutine call saves all the non-scratch registers (e.g., PDP-11,
| >Gould) is in good shape, as is any system where the number of registers
| >actually saved is recorded in the call frame (e.g. VAX).  Other systems
| >may have to have their calling sequences re-engineered, but that is what
| >they deserve for taking shortcuts.

Ummm... it's easy to push the register mask when you moveml the registers
to be saved on entry to a subroutine in the 68000, but moveml mask,(a7)-
and moveml mask,(a7)+ expect the mask reversed from each other!  How
do you propose to invert a 16-bit mask end-for-end (i.e. 11100...001
<=> 100...00111 ) and still have a fast calling sequence?

--bsa
--
  Brandon Allbery @ North Coast Xenix  |   the.world!ucbvax!decvax!cwruecmp!
6504 Chestnut Road, Independence, Ohio |       {atvax!}ncoast!{tdi1!}bsa
   (216) 524-1416             \ 44131  | E1439@CSUOHIO.BITNET (friend's acct.)
---------------------------------------+---------------------------------------
			`Confusion is my natural state.'

henry@utzoo.UUCP (Henry Spencer) (11/08/84)

> Ummm... it's easy to push the register mask when you moveml the registers
> to be saved on entry to a subroutine in the 68000, but moveml mask,(a7)-
> and moveml mask,(a7)+ expect the mask reversed from each other!  How
> do you propose to invert a 16-bit mask end-for-end (i.e. 11100...001
> <=> 100...00111 ) and still have a fast calling sequence?

Easy; you're pushing a constant.  The inversion gets done at compile time.
Bear in mind also that the only thing that ever has to decipher that mask
is longjmp(), which doesn't have to be fast.
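
For concreteness, the end-for-end inversion is only a few lines, and a
compiler would run something like it once per function rather than on
every call.  This is a sketch, not code from any actual 68000 compiler;
the name `reverse_mask16` is made up.

```c
#include <stdint.h>

/* Reverse a 16-bit register mask end-for-end: bit 0 <-> bit 15,
   bit 1 <-> bit 14, and so on. */
uint16_t reverse_mask16(uint16_t mask)
{
    uint16_t out = 0;
    int i;

    for (i = 0; i < 16; i++) {
        out = (uint16_t)((out << 1) | (mask & 1u));  /* append lowest bit of mask */
        mask >>= 1;                                  /* move on to the next bit */
    }
    return out;
}
```

With the example pattern from the quoted article, reverse_mask16(0xE001)
gives 0x8007.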

Alas, even pushing a constant on every function call is a lot of expense
when every call has to pay it just so a few can do longjmp().
-- 
				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,linus,decvax}!utzoo!henry

gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (11/09/84)

> Ummm... it's easy to push the register mask when you moveml the registers
> to be saved on entry to a subroutine in the 68000, but moveml mask,(a7)-
> and moveml mask,(a7)+ expect the mask reversed from each other!  How
> do you propose to invert a 16-bit mask end-for-end (i.e. 11100...001
> <=> 100...00111 ) and still have a fast calling sequence?

Just great -- more computer architecture brain damage!

The only way I can see that makes sense to try is to record the NUMBER
of registers saved (equivalently, the last register number) and use a
quick scheme to generate the appropriate mask upon cret.  (I am not
very familiar with the 68000 and so I do not know if it has any fast
way to set up such a mask; one way that would work is to index a mask
table.)
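
A rough sketch of the mask-table idea, under assumptions that are mine
and not taken from any real 68000 calling sequence: at most 8 registers
are saved, they are always saved contiguously starting from the same
register, and bit i of the mask stands for the i-th of them.  The names
`mask_table` and `mask_for_count` are made up.

```c
#include <stdint.h>

/* mask_table[n] has the low n bits set: the mask for "n registers saved". */
static const uint16_t mask_table[9] = {
    0x0000, 0x0001, 0x0003, 0x0007, 0x000F,
    0x001F, 0x003F, 0x007F, 0x00FF
};

/* Look up the restore mask for a recorded register count.
   Counts outside the table are treated as "nothing saved". */
uint16_t mask_for_count(unsigned nsaved)
{
    return nsaved < 9 ? mask_table[nsaved] : 0;
}
```

The lookup is a single indexed load, so the cost on return would be
small, but the count still has to be stored on every call.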

In any case, I have given up on setjmp/longjmp.  I never had any use
for them anyway.  What a lossage.

ark@alice.UUCP (Andrew Koenig) (11/10/84)

Suppose you could arrange for a sorted table that contained
triples: (routine start addr, routine end addr, register mask).
You might need to make special provisions in the linker
for this.  While you're at it, make the linker sort the table
and store the table start and end addresses in some canonical
place.

Then, as longjmp unwinds the stack, it can use the return
address from each stack frame to do a binary search in this
table to figure out what registers to restore.
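
The lookup step could be sketched as below.  The struct layout, the
choice to treat the end address as one past the routine, the sample
addresses and masks, and the names `routine_entry`, `sample_table`, and
`find_routine` are all assumptions made for illustration; how longjmp
finds each return address is left out entirely.

```c
#include <stddef.h>
#include <stdint.h>

struct routine_entry {              /* hypothetical layout */
    uintptr_t start;                /* first address of the routine */
    uintptr_t end;                  /* one past its last address (assumed) */
    uint16_t  reg_mask;             /* registers the routine saves */
};

/* Made-up sample data, sorted by start address as the linker would leave it. */
static const struct routine_entry sample_table[3] = {
    { 0x1000, 0x1040, 0x00C0 },
    { 0x1040, 0x10A0, 0x0004 },
    { 0x2000, 0x2010, 0x00FC }
};

/* Binary search of a table sorted by start address with no overlaps.
   Returns the entry whose range contains ret_addr, or NULL if none does. */
const struct routine_entry *
find_routine(const struct routine_entry *table, size_t n, uintptr_t ret_addr)
{
    size_t lo = 0, hi = n;          /* search the half-open range [lo, hi) */

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (ret_addr < table[mid].start)
            hi = mid;               /* look in the lower half */
        else if (ret_addr >= table[mid].end)
            lo = mid + 1;           /* look in the upper half */
        else
            return &table[mid];     /* start <= ret_addr < end */
    }
    return NULL;
}
```

For example, find_routine(sample_table, 3, 0x1044) returns the second
entry, whose mask is 0x0004, while an address such as 0x1800 that falls
between routines returns NULL.  Nothing is added to the calling
sequence; the cost is paid only when longjmp runs.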

chuck@dartvax.UUCP (Chuck Simmons) (11/13/84)

> Alas, even pushing a constant on every function call is a lot of expense
> when every call has to pay it just so a few can do longjmp().
>
>				Henry Spencer @ U of Toronto Zoology

I find it hard to believe that your compiler is so good at optimizing
code that a single instruction (okay, 2 instructions on our honeywell)
on each function call is going to make that much difference.

dartvax!chuck

henry@utzoo.UUCP (Henry Spencer) (11/13/84)

> > Alas, even pushing a constant on every function call is a lot of expense
> > when every call has to pay it just so a few can do longjmp().
> 
> I find it hard to believe that your compiler is so good at optimizing
> code that a single instruction (okay, 2 instructions on our honeywell)
> on each function call is going to make that much difference.

I'm on a pdp11, where the calling sequence is not all that great, maybe
eight or ten instructions (I forget...).  Even adding one instruction to
this is perhaps a 5-15% difference in speed.  This is *not trivial* when
we are talking about the most commonly-used control structure in C code --
if, while, and such don't even come close -- and when you bear in mind
that the function-call sequence has a history of being a major factor in
the speed of C code.  No joke:  when you profile a program on a machine
like the 11, where the C calling sequence stubs show up in the profile,
they usually aren't all that far down the sorted list.  Function-call
overhead is a major expense in C programs.
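
The rough arithmetic behind that estimate can be written out as below.
It assumes every instruction in the calling sequence costs the same,
which is an oversimplification, and the name `call_overhead_percent` is
made up.

```c
/* Percentage growth of a calling sequence of base_insns instructions
   when extra_insns more are added, counting all instructions as
   equal in cost. */
double call_overhead_percent(int base_insns, int extra_insns)
{
    return 100.0 * extra_insns / base_insns;
}
```

One extra instruction on an eight-instruction sequence comes to 12.5%,
and on a ten-instruction sequence to 10%, both inside the 5-15% range
given above.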
-- 
				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,linus,decvax}!utzoo!henry