[comp.std.c] setjmp/longjmp

fuat@cunixc.cc.columbia.edu (Fuat C. Baran) (04/27/89)

We just got an Encore Multimax running UMAX 4.2 (equivalent to BSD
4.2).  While bringing up one of our software packages we ran into a
strange problem with setjmp/longjmp.  The code is setting a flag
(declared int in main()) and then doing a setjmp.  It would then check
this flag and if true, set the flag to false and call a function which
would longjmp.  After the longjmp back it would come back to the code
that checked the flag (expecting it to now be false), and skip the
call to the function that longjmps.  (This is a simplified description
of what is going on, but essentially it wants to call the function
only once).  This works under Ultrix 2.0, SunOS 4.x, etc. but fails on
the Encore.  The flag always is true and the function doing the
longjmp is called repeatedly.  After running this through gdb and
watching the flag getting mysteriously corrupted we read the UMAX man
page for longjmp and found the following paragraph:

   Longjmp restores the environment saved by the last call of setjmp.  It
   then returns in such a way that execution continues as if the setjmp
   call had just returned the value val to the function that invoked
   setjmp.  The function that invoked setjmp must not have returned in the
   interim.  Longjmp cannot cause setjmp to return a value of 0. If longjmp
]  is invoked with 0 as the second argument, setjmp returns 1.  All acces-
]  sible data have values as of the time longjmp was called except for
]  objects of automatic storage class that have been changed between the
]  setjmp and longjmp calls.  The values of these objects is indeterminate
]  (this behavior conforms to the Draft ANSI C language standard).

I don't know much about the Draft ANSI C, so I was wondering if
someone could explain the idea behind the behaviour we are
experiencing on the Encore (i.e. why can't automatic variables be
modified after the setjmp such that they retain their value after the
longjmp back)?  I looked at the man pages for a couple of compilers
that sort of are "Draft ANSI"ish and this point is in general not
explained clearly.

Thanks in advance for your help.

						--Fuat
-- 
INTERNET: fuat@columbia.edu          U.S. MAIL: Columbia University
BITNET:   fuat@cunixc.cc.columbia.edu           Center for Computing Activities
USENET:   ...!rutgers!columbia!cunixc!fuat      712 Watson Labs, 612 W115th St.
PHONE:    (212) 854-5128                        New York, NY 10025

clyde@ut-emx.UUCP (Clyde W. Hoover) (04/27/89)

	Well, we ran headlong into that very problem ourselves.  What is
going on is that the Encore C compiler puts variables into registers
(making the 'register' declaration somewhat superfluous).  Upon longjmp,
the contents of the registers are restored to what they were at the time
of the call to setjmp, and since your flag has been stuffed into a register,
it is being reset.

	Declaring your flag variable as 'static' should solve your problem,
though even that did not work with older versions of the compiler.

	-Clyde Hoover


Shouter-To-Dead-Parrots @ Univ. of Texas Computation Center; Austin, Texas  
	clyde@emx.utexas.edu; ...!cs.utexas.edu!ut-emx!clyde

Tip #268: Don't feel insecure or inferior! Remember, you're ORGANIC!!
	  You could win an argument with almost any rock!

henry@utzoo.uucp (Henry Spencer) (04/27/89)

In article <1447@cunixc.cc.columbia.edu> fuat@cunixc.cc.columbia.edu (Fuat C. Baran) writes:
>I don't know much about the Draft ANSI C, so I was wondering if
>someone could explain the idea behind the behaviour we are
>experiencing on the Encore (i.e. why can't automatic variables be
>modified after the setjmp such that they retain their value after the
>longjmp back)? ...

Because in the general case it's very hard.  With a cooperative machine,
an implementation which accepts some inefficiency on function calls, or a
very clever compiler, it can be done.  But when the machine is unhelpful
(many are) and the efficiency of calls is important (it usually is) and
the compiler's cleverness is limited (it usually is), restoring the
values is difficult.

The major problem is in cases where the variable is in a register, either
as a result of being declared "register" or as a result of a clever compiler,
and there is a single set of registers that is used by all functions.  When
doing a longjmp, to get the values "right" it is necessary to restore the
registers to the values that they had when control left the function being
longjmped to.  Depending on the save/restore convention used by the
particular machine and compiler, this can range from trivial to impossibly
hard.  Compilers that have called functions save registers, and do so only
when necessary, will have saved values scattered through the call stack,
appearing wherever an intermediate function needed that register.  If the
machine has a self-describing stack, like the VAX, longjmp() may be able
to dig them out... but such stacks are inefficient and modern machines
seldom have them.  One can choose a save/restore convention that avoids
this, or a very clever compiler can change save/restore conventions when
it notices the setjmp(), but there are tradeoffs and it isn't always
practical.

If you have a very up-to-date compiler, you may be able to avoid this
problem by declaring the crucial variables with the "volatile" modifier,
which tells an ANSI-C-compliant compiler not to get tricky.
-- 
Mars in 1980s:  USSR, 2 tries, |     Henry Spencer at U of Toronto Zoology
2 failures; USA, 0 tries.      | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

rex@aussie.UUCP (Rex Jaeschke) (04/28/89)

Let me see if I can shed some light with an example. The following is 
taken verbatim from my latest book "Portability and the C Language" 
published by Hayden last October. It's in the setjmp/longjmp chapter, 
page 254.

==================================================================
#include <stdio.h>
#include <setjmp.h>

main()
{
	jmp_buf buffer;
	int i;
	int j = 10;
	register int k = 100;
	void test();

	i = setjmp(buffer);
	printf("setjmp return = %d\n", i);
	printf("j = %d, k = %d\n", j, k);

	j += 10;
	k += 20;
	if (i == 0)
		test(buffer);
}

void test(buffer)
jmp_buf buffer;
{
	longjmp(buffer, 1);
}

setjmp return = 0
j = 10, k = 100
setjmp return = 1
j = 20, k = 100

The first time through, j and k have the expected
values.  However, although both are incremented before test is
called, when longjmp returns control to setjmp, only j's
value is still intact.  The value of k was restored to its
initial value 100 rather than to 120.  For this implementation, it
appears the register variable is not preserved, while the auto
is, assuming, of course, the register variable actually is stored in a
register.  It could well be the opposite way around [[ or both or 
neither could be preserved.]]  In any case,
the result is undefined, and therefore unreliable, as we have
demonstrated.
==================================================================

Rex

----------------------------------------------------------------------------
Rex Jaeschke     | C Users Journal     |  Journal of C Language Translation
(703) 860-0091   | DEC PROFESSIONAL    |1810 Michael Faraday Drive, Suite 101
uunet!aussie!rex | Programmers Journal |     Reston, Virginia 22090, USA
----------------------------------------------------------------------------

fkittred@bbn.com (Fletcher Kittredge) (04/28/89)

In article <12491@ut-emx.UUCP> clyde@ut-emx.UUCP (Clyde W. Hoover) writes:
>
>	Well, we ran headlong into that very problem ourselves.  What is
> going on is that the Encore C compiler puts variables into registers
> (making the 'register' declaration somewhat superfluous).

This is common behavior for any optimizing compiler.  For instance,
the compilers for the RISC chips for Sun, HP and MIPS do this.  Compilers
can do a better job than humans in figuring out which variables should
be in registers.

>  Upon longjmp,
> the contents of the registers are restored to what they were at the time
> of the call to setjmp and since your flag has been stuffed into a register,
> it is being reset.

Again, this is standard behavior among modern CPUs and Unix implementations.
I have found this behavior in DEC, HP and Sun systems.  All three document this
behavior, and it appears to me that ANSI C and Posix both require this
behavior.

regards,
fletcher


Fletcher E. Kittredge  fkittred@bbn.com

rang@cpsin3.cps.msu.edu (Anton Rang) (04/28/89)

In article <1447@cunixc.cc.columbia.edu> fuat@cunixc.cc.columbia.edu (Fuat C. Baran) writes:

   We just got an Encore Multimax running UMAX 4.2 (equivalent to BSD
   4.2).  While bringing up one of our software packages we ran into a
   strange problem with setjmp/longjmp.  The code is setting a flag
   (declared int in main()) and then doing a setjmp.  It would then check
   this flag and if true, set the flag to false and call a function which
   would longjmp.

      [ story about it not working and searching the manual ]
								 All acces-
   ]  sible data have values as of the time longjmp was called except for
   ]  objects of automatic storage class that have been changed between the
   ]  setjmp and longjmp calls.  The values of these objects is indeterminate
   ]  (this behavior conforms to the Draft ANSI C language standard).

This happens with automatic variables which are allocated in registers.  The
setjmp/longjmp code may not save and restore these registers; in this case,
their values will be lost.  In some compilers, you can use 'volatile' to
avoid this; otherwise, declaring variables as 'static' may help (or making 
them 'static volatile' maybe?).

+---------------------------+------------------------+---------------------+
| Anton Rang (grad student) | "VMS Forever!"         | rec.music.newage is |
| Michigan State University | rang@cpswh.cps.msu.edu | under discussion... |
+---------------------------+------------------------+---------------------+

chris@mimsy.UUCP (Chris Torek) (04/29/89)

In article <1989Apr27.165319.23986@utzoo.uucp> henry@utzoo.uucp
(Henry Spencer) writes:
>... in the general case [getting variables right after setjmp/longjmp is]
>very hard.

It is not *that* hard.  It does inhibit some optimisation:

>... a very clever compiler can change save/restore conventions when
>it notices the setjmp() ....

This is a reasonable approach.  setjmp/longjmp are rare enough that I
doubt it will slow things much.

	... compiler code for function call ...
	if (function is setjmp || function is longjmp ||
	    setjmp is called previously in this function) {
		for (all active registers)
			if (register is used only to hold function address)
				/* do nothing */;
			else
				store the register and mark it invalid;
	}
	... generate the call itself ...
	if (function is longjmp)
		reachable = FALSE;	/* all variables are now dead */

(here `previously' means `is reachable via a flow path that might have
run before this function call is reached'---the call can appear below
if there is a loop label above this call.)  setjmp and longjmp
themselves then need not save or restore any registers.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

bph@buengc.BU.EDU (Blair P. Houghton) (04/29/89)

In article <13.UUL1.3#5077@aussie.UUCP> rex@aussie.UUCP (Rex Jaeschke) writes:
>Let me see if I can shed some light with an example. The following is 
>taken verbatim from my latest book "Portability and the C Language" 
>published by Hayden last October. It's in the setjmp/longjmp chapter, 
>page 254.

Well plugged! :-)

>setjmp return = 0
>j = 10, k = 100
>setjmp return = 1
>j = 20, k = 100
>
>The first time through, j and k have the expected
>values.  However, although both are incremented before test is
>called, when longjmp returns control to setjmp, only j's
>value is still intact.  The value of k was restored to its
>initial value 100 rather than to 120.  For this implementation, it
>appears the register variable is not preserved, while the auto
>is, assuming, of course, the register variable actually is stored in a
>register.  It could well be the opposite way around [[ or both or 
>neither could be preserved.]]  In any case,
>the result is undefined, and therefore unreliable, as we have
>demonstrated.

The Umax C compiler says:

setjmp return = 0
j = 10, k = 100
setjmp return = 1
j = 10, k = 100

And the Umax manual page for setjmp says:

    SETJMP(3)

    NAME
       setjmp, longjmp - non-local goto
    ...
    All accessible data have values as of the time longjmp was called
	except for objects of automatic storage class that have been
	changed between the setjmp and longjmp calls.  The values of
	these objects is indeterminate (this behavior conforms to the
	Draft ANSI C language standard).^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

So, as you say, it seems that the unreliability of setjmp wrt automatic
variables (registers not included ??) is definitely undefined, but not
entirely broken for all compilers.

However, and as usual, despite the evident "correctness" of the Umax C
compiler, its behavior in this case does not prove the code to be
portable.

				--Blair
				  "Some things you just have to *know*..."

henry@utzoo.uucp (Henry Spencer) (04/30/89)

In article <17179@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>>... in the general case [getting variables right after setjmp/longjmp is]
>>very hard.
>
>It is not *that* hard...

Depends on your environment.  If you are free to pick your own calling
conventions, in particular, it's not a big deal.  If you are constrained
by existing conventions, though, it can be pretty tricky, especially
in a dumb compiler.

>>... a very clever compiler can change save/restore conventions when
>>it notices the setjmp() ....
>
>This is a reasonable approach.  setjmp/longjmp are rare enough that I
>doubt it will slow things much.

I and others argued, as formal public comments, that odd behavior of
local variables should be restricted to variables declared "register",
on the grounds that the only real problem is silent promotion of non-
"register" variables into registers, and compilers that are smart enough
to do that are smart enough to notice setjmp and change conventions.
(Dumb compilers may be generating code on the fly, meaning that they
can't easily go back and fix earlier code on seeing setjmp(), but such
compilers generally wouldn't have enough info to promote variables.)
People are more or less used to problems with "register" variables after
longjmp; extending it to all local variables breaks a lot of programs.

Furthermore, if you look at the exact wording in the (draft) standard,
it says that only the variables in the function that called setjmp()
can be fouled up.  In particular, locals in a calling function can't be.
At first glance, satisfying this seems to be almost as hard as "doing
it right":  the setjmp()ing function has to protect any registers its
caller is using, so it might as well protect its own too.  Conventions
that guarantee that no caller registers are in use also usually guarantee
no problems on longjmp().

karl@haddock.ima.isc.com (Karl Heuer) (04/30/89)

In article <39197@bbn.COM> fkittred@BBN.COM (Fletcher Kittredge) writes:
>Again, this is standard behavior among modern CPUs and Unix implementations.
>I have found this behavior in DEC, HP and Sun systems.  All three document this
>behavior, and it appears to me that ANSI C and Posix both require this
>behavior.

That's probably the wrong word; it's hard to see how a value can be "required"
to be indeterminate.  Basically, the implementation is allowed to do whatever
is most convenient if you neglect to declare such an object |volatile|.

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

diamond@diamond.csl.sony.junet (Norman Diamond) (05/01/89)

In article <1989Apr29.232632.23997@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:

I >>>... a very clever compiler can change save/restore conventions when
N >>>it notices the setjmp() ....
E >>
W >>This is a reasonable approach.  setjmp/longjmp are rare enough that I
S >>doubt it will slow things much.
  >
I >I and others argued, as formal public comments, that odd behavior of
S >local variables should be restricted to variables declared "register",
  >on the grounds that the only real problem is silent promotion of non-
A >"register" variables into registers, and compilers that are smart enough
  >to do that are smart enough to notice setjmp and change conventions.
P >(Dumb compilers may be generating code on the fly, meaning that they
R >can't easily go back and fix earlier code on seeing setjmp(), but such
I >compilers generally wouldn't have enough info to promote variables.)
C >People are more or less used to problems with "register" variables after
K >longjmp; extending it to all local variables breaks a lot of programs.

Interesting.  What were the formal answers?  (I'd guess that there were no
actual answers but only formal answers :-)

Perhaps the marketplace should be encouraged to support this pseudo-standard.
If customers refuse to buy compilers with misfeatures, even if the compilers
are compliant, correct results can be obtained.

Norman Diamond, Sony Computer Science Lab (diamond%csl.sony.jp@relay.cs.net)
  The above opinions are my own.   |  Why are programmers criticized for
  If they're also your opinions,   |  re-inventing the wheel, when car
  you're infringing my copyright.  |  manufacturers are praised for it?

gwyn@smoke.BRL.MIL (Doug Gwyn) (05/02/89)

In article <10203@socslgw.csl.sony.JUNET> diamond@csl.sony.junet (Norman Diamond) writes:
>Perhaps the marketplace should be encouraged to support this pseudo-standard.
>If customers refuse to buy compilers with misfeatures, even if the compilers
>are compliant, correct results can be obtained.

Please get with the program.  The main reason a C standard was developed
was precisely because of this kind of "every vendor decide for himself"
approach to implementing C, which made portable programming excessively
difficult.  There are sound reasons for virtually every specification in
the forthcoming C standard, and most "why" questions are addressed by the
accompanying Rationale document.  Even if an implementer personally
disagrees with the rationale (and nearly everybody will probably find
some particular point he thinks should have been specified differently),
so long as the specification is unambiguous you do your customers no
favor by deviating from it.  You will probably also lose sales when your
compiler fails standard conformance tests.

henry@utzoo.uucp (Henry Spencer) (05/02/89)

In article <10203@socslgw.csl.sony.JUNET> diamond@csl.sony.junet (Norman Diamond) writes:
>I >I and others argued, as formal public comments, that odd behavior of
>S >local variables should be restricted to variables declared "register"...
>
>Interesting.  What were the formal answers?  (I'd guess that there were no
>actual answers but only formal answers :-)

Correct! :-)  The answer to my argument about this in the second public
comment was essentially "we decided this some time ago and aren't going to
change it now".

>Perhaps the marketplace should be encouraged to support this pseudo-standard.
>If customers refuse to buy compilers with misfeatures, even if the compilers
>are compliant, correct results can be obtained.

In practice there will be considerable pressure on implementors to "do it
right" in any case, since many existing programs will break with the more
liberal X3J11 rules.  I think we can rely on "quality of implementation"
concerns to get this right on any machine where it's practical.  (There
might be a few where it isn't.)  There are enough such topics -- where no
sane implementor would do it wrong, but the standard refuses to guarantee
doing it right -- to be annoying.

peter@ficc.uu.net (Peter da Silva) (05/02/89)

In article <10203@socslgw.csl.sony.JUNET> diamond@csl.sony.junet (Norman Diamond) writes:
>Perhaps the marketplace should be encouraged to support this pseudo-standard.
>If customers refuse to buy compilers with misfeatures, even if the compilers
>are compliant, correct results can be obtained.

In article <10189@smoke.BRL.MIL>, gwyn@smoke.BRL.MIL (Doug Gwyn) flames:
> Please get with the program...
> so long as the specification is unambiguous you do your customers no
> favor by deviating from it.  You will probably also lose sales when your
> compiler fails standard conformance tests.

Please read what you're following up to. Norman was recommending that compiler
writers guarantee the validity of non-register variables after a longjmp
by not doing certain optimisations around a setjmp call. The standard says
that this behaviour is undefined, so making it do the right thing is quite in
line. This will not cause a compiler to fail any conformance tests.

I agree with Norman... the dpANS handling of setjmp/longjmp is undesirable.
-- 
Peter da Silva, Xenix Support, Ferranti International Controls Corporation.

Business: uunet.uu.net!ficc!peter, peter@ficc.uu.net, +1 713 274 5180.
Personal: ...!texbell!sugar!peter, peter@sugar.hackercorp.com.

henry@utzoo.uucp (Henry Spencer) (05/03/89)

In article <10189@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>>Perhaps the marketplace should be encouraged to support this pseudo-standard.
>>If customers refuse to buy compilers with misfeatures, even if the compilers
>>are compliant, correct results can be obtained.
>
>Please get with the program.  The main reason a C standard was developed
>was precisely because of this kind of "every vendor decide for himself"
>approach to implementing C, which made portable programming excessively
>difficult...
>so long as the specification is unambiguous you do your customers no
>favor by deviating from it.  You will probably also lose sales when your
>compiler fails standard conformance tests.

Uh, Doug, he's talking about doing better than mere conformance, not about
deviating from it.  "Quality of implementation", remember?

gwyn@smoke.BRL.MIL (Doug Gwyn) (05/03/89)

In article <1989May2.225124.12977@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>Uh, Doug, he's talking about doing better than mere conformance, not about
>deviating from it.

That's okay, then -- I must have confused that with suggestions earlier
heard that existing UNIX setjmp/longjmp behavior be preserved, and there
seem to be some current implementations that don't conform to the standard.

It's harder than one might think to obtain more stringent guarantees for
what is preserved across a longjmp.  There was considerable discussion of
this in X3J11 meetings, much of which I no longer remember in detail.
I do recall that the outcome was that stricter requirements were
considered an excessive implementation burden.  If an implementer can do
better, more power to him, but portable programmers are not going to rely
on it.  Heck, they're probably not going to use longjmp much anyway.

fuat@cunixc.cc.columbia.edu (Fuat C. Baran) (05/04/89)

In article <4059@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
>I agree with Norman... the dpANS handling of setjmp/longjmp is undesirable.

I encountered this problem on an Encore multimax.  Upon further
reading of the Encore cc(1) man page, we found that if you give it the

	-q nocompiler_registers

option, then automatics do not get moved into registers by the
compiler.  Thus we don't need to change automatics to statics or
volatiles.

By the way, I looked at a copy of K&R (2nd ed) today (yes, I know,
this is not an ANSI or draft ANSI C reference), and on page 254 it
says:

"longjmp restores the state saved by the most recent call to setjmp,
using the information saved in env, and execution resumes as if the
setjmp function had just executed and returned the non-zero value val.
[...] Accessible objects have the values they had at the time longjmp
was called; values are not saved by setjmp."

						--Fuat

P.S.  What is the current status of ANSI C?  Has it become the
standard yet?  I don't have any recent information on its status.  We
would like to get a copy of the standard once it is official.



gwyn@smoke.BRL.MIL (Doug Gwyn) (05/04/89)

In article <1477@cunixc.cc.columbia.edu> fuat@cunixc.cc.columbia.edu (Fuat C. Baran) writes:
>What is the current status of ANSI C?

The proposed Standard is currently delayed until a 15-day response
period for a "late correspondent" (not his fault) expires, after
which if the correspondent has filed an objection to the official
X3J11 response to his comments there will be a 20-day X3 ballot.
If no objection is filed, no additional ballot will be necessary.
After final X3 approval, the proposed Standard will be sent to
ANSI for ratification.  We should know more sometime in June.

jeff@Alliant.COM (Jeff Collins) (05/04/89)

In article <1447@cunixc.cc.columbia.edu> fuat@cunixc.cc.columbia.edu (Fuat C. Baran) writes:
>someone could explain the idea behind the behaviour we are
>experiencing on the Encore (i.e. why can't automatic variables be
>modified after the setjmp such that they retain their value after the
>longjmp back)?  I looked at the man pages for a couple of compilers
>that sort of are "Draft ANSI"ish and this point is in general not
>explained clearly.
>

	When the setjmp is first invoked, the current register state is saved
	in the jmp_buf array.  This register state includes the current values
	of any automatic variables held in registers.  When the subsequent
	longjmp is called, the state saved in the jmp_buf array is restored,
	including the register set.  This means that any changes to such
	automatic variables after the setjmp are "overwritten", because the
	register that stores the variable is restored to the value it had at
	the time of the setjmp.  There is no way that setjmp/longjmp can handle
	this problem, as they do not know which automatic variables are going
	to change sometime in the future.

	The rule when dealing with setjmp/longjmp and an optimizing compiler
	(i.e. one that makes heavy use of registers, like Encore's) is to 
	force the variable in question to not be in a register.