[net.lang.c] setjmp: read the manual

dmr@research.UUCP (10/04/84)

V7, 4.[01]BSD, and SVr0 manuals include this sentence in the setjmp/longjmp
documentation (setjmp(3)): "All accessible data have values as of the time
longjmp was called."  So setjmp shouldn't save register variables; they're
irrelevant.  That is,

   ...
	register x = 1;
	if (setjmp(sj_buf)) {
		printf("x = %d\n", x);
		exit(0);
	}
	x = 2;
	f();
   }
   f() { longjmp(sj_buf, 1); }

is supposed to print "x = 2" and longjmp is supposed to trace back in the
stack to make sure this happens.  (This is nontrivial, because the
place where the register that contained x was saved can be anywhere in
the call chain.)

There is a very good reason for doing it this way: it makes
register and automatic variables behave the same.  The specification
could have said that data values are restored to those at
the time that setjmp was called, but it doesn't (and shouldn't:
that's a lot of stuff to save).

I checked the behavior on V8, 4.2BSD, and SVr2.  It works right
(prints 2) on V8 and 4.2, works wrong (prints 1) on SV (on VAX, 3B20, 3B2).
I don't have 4.2 or SVr2 manuals at hand so I don't know what they say.

					Dennis Ritchie
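
For anyone wanting to repeat the check on another system, here is the
fragment above filled out into something that compiles on its own.  This
is only a sketch: the function name probe() and the explicit int types
are additions, and it returns the value instead of calling exit().

```c
#include <setjmp.h>
#include <stdio.h>

static jmp_buf sj_buf;

/* Jumps back to the setjmp call in probe(). */
static void f(void)
{
	longjmp(sj_buf, 1);
}

/*
 * Returns the value of x seen after the longjmp: 2 if x has its value
 * as of the longjmp call, 1 if it was rolled back to its value as of
 * the setjmp call.
 */
int probe(void)
{
	register int x = 1;

	if (setjmp(sj_buf)) {
		printf("x = %d\n", x);
		return x;
	}
	x = 2;
	f();
	return -1;	/* not reached: f() never returns */
}
```

Which of the two values comes back depends on the implementation, as
reported above.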

gnu@sun.uucp (John Gilmore) (10/10/84)

dmr says:
> V7, 4.[01]BSD, and SVr0 manuals include this sentence in the setjmp/longjmp
> documentation (setjmp(3): "All accessible data have values as of the time
> longjmp was called."  ...  (This is nontrivial, because the
> place where the register that contained x was saved can be anywhere in
> call chain.)
> 
> There is a very good reason for doing it this way: it makes
> register and automatic variables behave the same.
> 
> I checked the behavior on V8, 4.2BSD, and SVr2.  It works right
> (prints 2) on V8 and 4.2, works wrong (prints 1) on SV (on VAX, 3B20, 3B2).
> I don't have 4.2 or SVr2 manuals at hand so I don't know what they say.
> 
> 					Dennis Ritchie

There's a good reason for doing it the other way: efficiency.  The Sun
(4.2 on 68010's) setjmp(3) says:

       "All memory-bound data have values as of the time longjmp was called.
	The machine registers are restored to the values they had at the time
	setjmp was called.  But, because the register storage class is
	only a hint to the C compiler, variables declared as register
	variables may not necessarily be assigned to machine registers,
	so their values are unpredictable after a longjmp.  This is
	especially a problem for programmers trying to write machine-
	independent C routines."

We looked hard at this and decided to change the meaning of
setjmp/longjmp (to the above) rather than having EVERY function save
enough information to restore the registers to the desired state in
case the function's caller did a setjmp and some routine we call does a
longjmp.  On the VAX there is a call instruction that dumps all this
stuff on the stack for you (slowly; I understand a Modula compiler that
avoided the instruction sped itself up by 20%).  On the 68000 there is
no such whizzo instruction so we were forced to do it efficiently or
tell ourselves why not.  Longjmp wasn't a good enough reason.  Also,
some function calls internally generated by the compiler do not use the
standard calling sequence (again for efficiency) and this would have to
be abandoned too.

To fix a program that this breaks, you remove "register" declarations from
routines that call setjmp.  Not so bad.
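
This is not part of the thread, but later standard C (ANSI C and after)
settled the question along these lines: an automatic variable changed
between the setjmp call and the longjmp has an indeterminate value
afterwards unless it is declared volatile.  A minimal sketch of that
alternative to deleting "register" declarations follows; the names
probe_volatile() and jump_back() are made up for illustration.

```c
#include <setjmp.h>

static jmp_buf env;

/* Jumps back to the setjmp call in probe_volatile(). */
static void jump_back(void)
{
	longjmp(env, 1);
}

/*
 * x is volatile, so standard C guarantees that after the longjmp it
 * still holds the last value stored into it.
 */
int probe_volatile(void)
{
	volatile int x = 1;

	if (setjmp(env))
		return x;	/* reached via longjmp */
	x = 2;
	jump_back();
	return -1;		/* not reached */
}
```

Here probe_volatile() returns 2 on a conforming implementation.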

gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (10/16/84)

I am afraid I don't follow Sun's logic.  All that is required is to
unravel the call stack frames by following the back-pointers.  The
VAX built-in support for this stuff is not used on any UNIXes I am
familiar with.  When someone does a longjmp the odds are very good
that it is not in the inside of a tight loop, so the extra time
taken to unwind the stack frame to fix up clobbered registers should
not be an efficiency issue, unless your call frames have a variable
amount of information and no internal clue about how much.
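
A toy model of the unwinding described here, assuming every function
saves the same fixed set of registers on entry and links its frame to
its caller's.  The names struct frame and unwind_to() are illustrative
only and are not taken from any real implementation.

```c
#include <stddef.h>

#define NREGS 3		/* fixed set saved by every function */

struct frame {
	struct frame *caller;	/* back-pointer to the caller's frame */
	int saved[NREGS];	/* caller's registers, saved on entry */
};

/*
 * Unwind from the innermost frame out to target, the frame of the
 * function that called setjmp.  Each frame passed holds the register
 * values its caller had, so after the last restore regs[] holds what
 * target's function had when it made its call.
 * Returns the number of frames unwound, or -1 if target is not found.
 */
int unwind_to(struct frame *innermost, const struct frame *target,
	      int regs[NREGS])
{
	struct frame *f;
	int i, n = 0;

	for (f = innermost; f != NULL && f != target; f = f->caller) {
		for (i = 0; i < NREGS; i++)
			regs[i] = f->saved[i];
		n++;
	}
	return f == target ? n : -1;
}
```

The cost is one pass over the frames between the longjmp and the
setjmp, paid only when a longjmp actually happens.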

darryl@ism780.UUCP (10/17/84)

>> V7, 4.[01]BSD, and SVr0 manuals include this sentence in the setjmp/longjmp
>> documentation (setjmp(3): "All accessible data have values as of the time
>> longjmp was called."  ...  (This is nontrivial, because the
>> place where the register that contained x was saved can be anywhere in
>> call chain.)
>>
>> There is a very good reason for doing it this way: it makes
>> register and automatic variables behave the same.
>>
>There's a good reason for doing it the other way: efficiency.  The Sun
>(4.2 on 68010's) setjmp(3) says:
>
>       "All memory-bound data have values as of the time longjmp was called.
>        The machine registers are restored to the values they had at the time
>        setjmp was called.  But, because the register storage class is
>        only a hint to the C compiler, variables declared as register
>        variables may not necessarily be assigned to machine registers,
>        so their values are unpredictable after a longjmp.  This is
>        especially a problem for programmers trying to write machine-
>        independent C routines."

Gee, wouldn't it be easier to just make your jmp_buf big enough to hold all
of the registerable registers and save them at setjmp time and restore them
at longjmp?  Then they would retain their values as of setjmp...

	    --Darryl Richman
	    ...!cca!ima!ism780!darryl

darryl@ism780.UUCP (10/18/84)

Pardon me.  I should have done just as the title suggested:  read the manual.
(Actually, I did, but I didn't catch the distinction about "All accessible
data have values as of the time LONGjmp was called".  I kept reading
"... the time SETjmp was called".)

	    --Darryl Richman
	    ...!cca!ima!ism780!darryl

jim@ism780b.UUCP (10/18/84)

>There's a good reason for doing it the other way: efficiency.

Since when is efficiency an issue in regard to setjmp/longjmp?


>The Sun
>(4.2 on 68010's) setjmp(3) says:
>
>       "All memory-bound data have values as of the time longjmp was called.
>        The machine registers are restored to the values they had at the time
>        setjmp was called.  But, because the register storage class is
>        only a hint to the C compiler, variables declared as register
>        variables may not necessarily be assigned to machine registers,
>        so their values are unpredictable after a longjmp.  This is
>        especially a problem for programmers trying to write machine-
>        independent C routines."

I wouldn't let anyone who writes such garbledy-goop anywhere near my manuals.
Why not just say the values of register variables local to a setjmp call
are unpredictable following a return via longjmp, period?

-- Jim Balter, INTERACTIVE Systems (ima!jim)

jim@ism780b.UUCP (10/18/84)

>>To fix a program that this breaks, you remove "register" declarations from
>>routines that call setjmp.  Not so bad.
>
>Don't you also have to start removing the register declarations in the
>functions that call functions that call setjmp?  Seems like you could
>lose their (parents to funcs that call setjmp) register values further
>down in the call chain also...

No.  If setjmp saves all registers and longjmp restores them, the registers
are back to what they were at the time of the setjmp, so the parent is ok
(unless your program is in the form of a temporal Klein Bottle, and the
instantiation of the routine that called the routine that called setjmp
happened after the call to setjmp, somehow :-)

-- Jim Balter, INTERACTIVE Systems (ima!jim)

jim@ism780b.UUCP (10/18/84)

>I am afraid I don't follow Sun's logic.  All that is required is to
>unravel the call stack frames by following the back-pointers.  The
>VAX built-in support for this stuff is not used on any UNIXes I am
>familiar with.  When someone does a longjmp the odds are very good
>that it is not in the inside of a tight loop, so the extra time
>taken to unwind the stack frame to fix up clobbered registers should
>not be an efficiency issue, unless your call frames have a variable
>amount of information and no internal clue about how much.

The VAX built-in stuff used is the indication on the stack of which register
variables were saved.

-- Jim Balter, INTERACTIVE Systems (ima!jim)

geoff@desint.UUCP (Geoff Kuenning) (10/19/84)

In article <401@ism780.UUCP> Darryl Richman (darryl@ism780.UUCP) writes:

>>To fix a program that this breaks, you remove "register" declarations from
>>routines that call setjmp.  Not so bad.
>
>Don't you also have to start removing the register declarations in the
>functions that call functions that call setjmp?  Seems like you could
>lose their (parents to funcs that call setjmp) register values further
>down in the call chain also...
>

Nope.  "setjmp" itself takes care of this by saving 100% of the registers at
the time it is called.  So the *parents'* registers are completely
protected.

>>On the VAX there is a call instruction that dumps all this
>>stuff on the stack for you (slowly; I understand a Modula compiler that
>>avoided the instruction sped itself up by 20%).  On the 68000 there is
>>no such whizzo instruction so we were forced to do it efficiently or
>>tell ourselves why not.
>
>BTW the 68000 has a whizzo instruction for saving registers --
>`movem' (move multiple).

The register saving is not the problem.  The problem is recording *on the
stack* which registers have been saved, so that someone other than the routine
that saved the registers (e.g., longjmp) can restore them.  On the 68k, you
would have to explicitly push a movem mask, and unwinding that mask would
involve either slow bit shifting or (yuck) instruction modification (although
you don't have to unwind it in software very often).  But pushing that mask
(you can manage to pop it for free as part of your 'unlk') is a pretty heavy
penalty to pay in a 5-instruction entry/exit sequence (link/movem/movem/unlk
and rts), just so setjmp works a bit simpler.  And what about the pdp11, the
8086, and many other machines?  Bell supports UNIX on an ever-growing variety
of computers.

-- 
	Geoff Kuenning
	First Systems Corporation
	...!ihnp4!trwrb!desint!geoff
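
As a rough illustration of the unwinding cost described above, here is
a sketch of restoring registers from a frame that records its own save
mask.  It is a toy model in C, not 68000 code: struct frame,
restore_from_frame(), and the layout (saved values packed in ascending
register order) are assumptions made for the example.

```c
#include <stdint.h>

#define NREGS 16

struct frame {
	uint16_t mask;		/* bit i set: register i was saved */
	int saved[NREGS];	/* saved values, packed in ascending order */
};

/*
 * Walk the mask one bit at a time (the "slow bit shifting") and copy
 * each saved value back into its register.  Returns the number of
 * registers restored.
 */
int restore_from_frame(const struct frame *f, int regs[NREGS])
{
	uint16_t m = f->mask;
	int next = 0;	/* index of the next packed value */
	int r;

	for (r = 0; m != 0; r++, m >>= 1)
		if (m & 1)
			regs[r] = f->saved[next++];
	return next;
}
```

Without the mask in the frame, nobody but the function that did the
save knows which slots belong to which registers.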

rcd@opus.UUCP (Dick Dunn) (10/19/84)

Regarding the setjmp/longjmp interaction with register variables...
dmr gave one approach and justified it on the basis of "correct" behavior.
A response...
> There's a good reason for doing it the other way: efficiency...

Did I really miss something, or did someone just tell us that it is
reasonable to compromise correctness for efficiency?  I have a tough time
with such compromises.

>...
> We looked hard at this and decided to change the meaning of
> setjmp/longjmp (to the above) rather than having EVERY function save...

The parent article goes on to describe problems with saving all the
registers on a 68000.  In fact, there is a "whizzo" (his words) instruction
on the 68000 to save whatever set of registers you need, and you clearly
won't need to save all of them.

>...
> To fix a program that this breaks, you remove "register" declarations from
> routines that call setjmp.  Not so bad.

Actually, the task of fixing a program that breaks is a little harder:
First you have to realize that the problem is a setjmp/register
interaction.  Realistically, that might take days to find (particularly
if you're porting someone else's code).  I can't call that "not so bad".
-- 
Dick Dunn	{hao,ucbvax,allegra}!nbires!rcd		(303)444-5710 x3086
   ...Lately it occurs to me what a long, strange trip it's been.

sater@tjalk.UUCP (Hans van Staveren) (10/22/84)

In article <914@opus.UUCP> rcd@opus.UUCP (Dick Dunn) writes:
>Regarding the setjmp/longjmp interaction with register variables...
>dmr gave one approach and justified it on the basis of "correct" behavior.
>A response...
>> There's a good reason for doing it the other way: efficiency...
>
>Did I really miss something, or did someone just tell us that it is
>reasonable to compromise correctness for efficiency?  I have a tough time
>with such compromises.
>
>>...
>> We looked hard at this and decided to change the meaning of
>> setjmp/longjmp (to the above) rather than having EVERY function save...
>
>The parent article goes on to describe problems with saving all the
>registers on a 68000.  In fact, there is a "whizzo" (his words) instruction
>on the 68000 to save whatever set of registers you need, and you clearly
>won't need to save all of them.


           Setjmp/longjmp and register variables

                     Hans van Staveren
                     Vrije Universiteit
                     Amsterdam, Holland

     There has been some discussion going on in net.lang.c
about the semantics of setjmp/longjmp in combination with
register variables.  Let me first give a short introduction
to the problem for those of you caught napping during the
start of the discussion:

        #include <setjmp.h>
        jmp_buf env;

        main() {
                register foo;
                int bar;

                foo=1;bar=2;
                if (setjmp(env)!=0) {
                        printf("Foo=%d,bar=%d\n",foo,bar);
                        exit(0);
                }
                foo=3;bar=4;
                subr();
                abort();        /* cannot happen */
        }

        subr() {
                longjmp(env,1);
        }


     The preceding program, when executed on our 4.1BSD system,
gave as output:

        Foo=1,bar=4

How come?  The setjmp routine on our system saves *all*
registers, and the longjmp call reloads them all.  This
includes all register variables, with the effect that
stack variables after the longjmp have values as of the time
longjmp was called, while register variables have values as
of the time setjmp was called.
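
The effect can be modelled without real registers.  The sketch below is
a toy machine, not the 4.1BSD code: regs stands in for the register
file, stack_mem for stack memory, and toy_setjmp()/toy_longjmp() are
hypothetical names for routines that snapshot and reload every
register.

```c
#include <string.h>

#define NREGS 4

static int regs[NREGS];		/* toy register file */
static int stack_mem[4];	/* toy stack memory */

struct toy_jmp_buf {
	int saved[NREGS];	/* every register, as of toy_setjmp */
};

/* Save *all* registers, whether or not they hold variables. */
static void toy_setjmp(struct toy_jmp_buf *b)
{
	memcpy(b->saved, regs, sizeof regs);
}

/* Reload them all; stack memory is left alone. */
static void toy_longjmp(const struct toy_jmp_buf *b)
{
	memcpy(regs, b->saved, sizeof regs);
}

/*
 * Replays the program above with foo in register 0 and bar in stack
 * slot 0.  Stores the final values through the two pointers.
 */
void replay(int *foo, int *bar)
{
	struct toy_jmp_buf env;

	regs[0] = 1;  stack_mem[0] = 2;		/* foo=1;bar=2; */
	toy_setjmp(&env);
	regs[0] = 3;  stack_mem[0] = 4;		/* foo=3;bar=4; */
	toy_longjmp(&env);			/* subr() */
	*foo = regs[0];
	*bar = stack_mem[0];
}
```

replay() leaves foo at 1 and bar at 4, matching the output above.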

     Obviously this is incorrect behaviour.  Whatever the
motive may be for this implementation, it will give strange
results when porting software.  So as a user I will give
cries of outrage when I encounter this: "Those %$##'&'(&
compiler writers, ...."

     Now the problem as seen from the compiler writer's
point of view: in general the word register in front of a
declaration is a hint to the compiler that there may be some
advantage in putting the variable in a register.  Now an
optimizing compiler might have different ideas: it might see
other variables more fit to put in registers, it might
notice that the overhead cost of saving and restoring the
register is more than the saving in its use, etc.

     So the general problem boils down to this: suppose
there are a number of registers, of different sizes and
properties, and a number of variables with different sizes and
usage statistics; what is, per procedure/function, the best
assignment of variables to registers?  This is already a
hard problem to solve.

     The best solution to this problem will usually use a
different set of registers per procedure, and it is most
efficient to save only those registers that are actually
used in this procedure.  At the beginning of the procedure
the registers are saved, at the end they are restored.

     After the compiler writer has finished his task, he
will start implementing the library procedures, and what
does he see in chapter III?  "Those %$&%%('' language
designers, ...." They put the non-local goto in the library
instead of the language, horrors!  Now to implement
setjmp/longjmp as the users want it, there must be some way
to put the registers back to their proper values, if possible
without changing all those carefully thought out optimum
decisions for those functions that don't use that devilish
pair.

     Solution 1, PDP 11, Unix V7:
Save exactly the same registers every function call,
csv/cret style.  This is not optimum, but what the heck,
there are only three usable registers anyhow.  At longjmp
time, just walk the stack and put them back. (Rhyme)

     Solution 2, VAX 11, 4.1BSD:
Save all registers at setjmp time, put them back at longjmp
time.  Just slightly incorrect, but what the heck, who cares
about the value of some locals.

     Solution 3, VAX 11, 4.xBSD (don't know really):
Walk the stack at longjmp time, finding all those fancy
masks between the dozens of longwords in each stack frame,
and restore the right value.  Lucky to have such a luxury
call mechanism, eh?  It just costs more, but what the heck,
just upgrade to a 785 :-)

     General solution, all machines, Amsterdam Compiler Kit:
Hold on to your chair, under 18 stop reading, compiler
writer's porno coming up.  All those youngsters gone?  Hey,
you there kid, hiding in the back, out with you!!
Well here it is.  Have the C-frontend recognize the word
setjmp.  In a function containing a call to setjmp save all
registers, use none, and at the end restore them all.  At
longjmp time just close your eyes and jump; no registers
need be restored except the frame pointer, stack pointer and
program counter.  This means that functions not using
setjmp/longjmp are not bothered; in general the compiler can
continue its fancy register optimizations and all programs
will run correctly.
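
The recognition step might look roughly like the following.  This is a
sketch only and not the Amsterdam Compiler Kit front end: classify(),
the dangerous[] table and the enum are hypothetical, and the scan works
on raw text without skipping comments or string literals.  It also
flags a mention of setjmp that is not a direct call, since a scheme
like this cannot see through an indirect call.

```c
#include <ctype.h>
#include <string.h>

enum use { USE_NONE, USE_CALL, USE_OTHER };

/* Hypothetical table of names the front end treats specially. */
static const char *const dangerous[] = { "setjmp" };

static int is_dangerous(const char *s, size_t len)
{
	size_t i;

	for (i = 0; i < sizeof dangerous / sizeof dangerous[0]; i++)
		if (strlen(dangerous[i]) == len &&
		    strncmp(dangerous[i], s, len) == 0)
			return 1;
	return 0;
}

/*
 * Scan the text of a function body for a dangerous identifier.
 * USE_CALL:  called directly, so the function should use no registers.
 * USE_OTHER: mentioned without a call (e.g. address taken); warn.
 */
enum use classify(const char *body)
{
	enum use result = USE_NONE;
	const char *p = body;

	while (*p != '\0') {
		if (isalpha((unsigned char)*p) || *p == '_') {
			const char *start = p;

			while (isalnum((unsigned char)*p) || *p == '_')
				p++;
			if (is_dangerous(start, (size_t)(p - start))) {
				while (isspace((unsigned char)*p))
					p++;
				if (*p == '(')
					return USE_CALL;
				result = USE_OTHER;
			}
		} else {
			p++;
		}
	}
	return result;
}
```

A function whose body classifies as USE_CALL would be compiled with no
register variables; every other function is left to the optimizer.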

     We would be glad if some magician out there would find
a way to do it without having the compiler know the word
setjmp, and without having extra cost for those functions
not using it.  However, we don't think it is possible.  C is
a language that has evolved; unfortunately the non-local
goto has never been tackled by the designers, and we feel
our solution is the best that can be done under the
circumstances.

     Hope I have been a help to the discussion.  If you
disagree with our views do not hesitate to reply; flames
reduce the costs of heating :-) !!

-- 
			Hans van Staveren, Vrije Universiteit Amsterdam
			..!mcvax!vu44!sater

padpowell@wateng.UUCP (PAD Powell) (10/24/84)

Another method is to acknowledge the problem, and add a "key",
perhaps in a comment (shades of LINT), indicating that this function
is to have no registers used, and to save/restore all registers on
entry/exit.

Patrick Powell

john@x.UUCP (John Woods) (10/24/84)

The idea of having the compiler recognize setjmp and do something
helpful is interesting.  Our (CRDS) compiler has a similar stunt
which it pulls for our handle/raise routines (similar to setjmp/longjmp,
but uses named conditions; quite a bit more similar to CLU's conditions);
a function which calls handle to handle a condition will have its function-exit
code modified slightly, because the Frame-Pointer in the stack frame is
changed to point to a linked list of handled conditions.  Yes, it is fairly
grotesque, it's a hideous kludge, but it buys us a really powerful facility
at NO extra cost (and if you call before midnight tonight...).
-- 
John Woods, Charles River Data Systems, Framingham MA, (617) 626-1114
...!decvax!frog!john, ...!mit-eddie!jfw, jfw%mit-ccc@MIT-XX.ARPA

george@idis.UUCP (10/25/84)

I suggest that anyone seriously interested in setjmp/longjmp
read the section titled "unwinding" from BTL CSTR #102,
"The C Language Calling Sequence", by S. C. Johnson and D. M. Ritchie.
Perhaps Dennis Ritchie could post that section.

If anyone is going to implement a routine that has behavior
different from "longjmp" (as described in the tech. report
or UNIX V7 manuals) then it would be prudent to name it
something other than "longjmp".

Hans van Staveren wrote:

>	     General solution, all machines, Amsterdam Compiler Kit:
>			...
>	Well here it is.  Have the  C-frontend  recognize  the  word
>	setjmp.   In a function containing a call to setjmp save all
>	registers, use none, and at the end restore  them  all.   At
>	longjmp  time  just  close  your eyes and jump, no registers
>	need be restored except the frame pointer, stack pointer and
>	program  counter.   This  means  that  functions  not  using
>	setjmp/longjmp are not bothered, in general the compiler can
>	continue  its  fancy register optimizations and all programs
>	will run correct.

There is a minor problem with this.
Since one can have an undefined extern pointer to a function,
or a pointer to a function passed as an argument,
one may need to pessimize all calls from some functions
that do not explicitly call "setjmp".
Thus contrary to what Hans claimed,
some functions not using setjmp/longjmp are bothered.
Presumably these do not occur too frequently in practice.
Some programs not using setjmp/longjmp are also bothered.
(I also believe I have heard that the Amsterdam compilers can cheat
by looking at the source for undefined extern objects,
although I do not know if it does so in the above instances.)


		George Rosenberg
		duke!mcnc!idis!george
		decvax!idis!george

sater@tjalk.UUCP (Hans van Staveren) (10/31/84)

>Hans van Staveren wrote:
>
>>	     General solution, all machines, Amsterdam Compiler Kit:
>			...
>>	Well here it is.  Have the  C-frontend  recognize  the  word
>>	setjmp.   In a function containing a call to setjmp save all
>>	registers, use none, and at the end restore  them  all.   At
>>	longjmp  time  just  close  your eyes and jump, no registers
>>	need be restored except the frame pointer, stack pointer and
>>	program  counter.   This  means  that  functions  not  using
>>	setjmp/longjmp are not bothered, in general the compiler can
>>	continue  its  fancy register optimizations and all programs
>>	will run correct.
>
>There is a minor problem with this.
>Since one can have an undefined extern pointer to a function,
>or a pointer to a function passed as an argument,
>one may need to pessimize all calls from some functions
>that do not explicitly call "setjmp".
>Thus contrary to what Hans claimed,
>some functions not using setjmp/longjmp are bothered.
>Presumably these do not occur too frequently in practice.
>Some programs not using setjmp/longjmp are also bothered.
>(I also believe I have heard that the Amsterdam compilers can cheat
>by looking at the source for undefined extern objects,
>although I do not know if it does so in the above instances.)
>
>
>		George Rosenberg
>		duke!mcnc!idis!george
>		decvax!idis!george

Alright, alright!!
I knew when I posted it that someone would see this, but it is one
of the more obnoxious things about our implementation.
Sure, if someone takes the address of setjmp and passes it to another
function which calls it indirectly, our scheme is defeated, or so it
seems. We saw that too.
There are two ways to fix this, the first is to assume all indirect calls
to be dangerous and not use registers, the second and the one we chose
was to check for people taking the address of setjmp, and give them a 
long warning referring to the footnote on page 365a.2 of the manual.
This is ugly, so I didn't post it at first.
However, any programmer that uses functions as dangerous as setjmp
and then hides their calls by indirecting should be banished to
FORTRAN-land for at least ten years.
Before someone else starts screaming about yet another function
that is dangerous in his U*X implementation running up, under or side-by-side
the MAGIC-FOOBAR operating system on the SPEEDY mainframe, it must be
said that the frontend has a list of known dangerous functions compiled
in. At the moment the list is of length 1.

As far as the cheating comment by George is concerned, I haven't got a clue
what he is talking about.

"If you knew what happened, you would be glad longjmp jumped at all"
-- 
			Hans van Staveren, Vrije Universiteit Amsterdam
			..!{seismo|philabs|decvax}!mcvax!vu44!sater