[comp.sys.nsc.32k] A summary of the setjmp problem

jkp@sauna.hut.fi (Jyrki Kuoppala) (02/12/91)

Many people have told me to use 'volatile', or said that 'ANSI C
says that setjmp need not save registers, so you must use volatile or
take the local variables' address'.  I have apparently been unclear in
my postings (or people just didn't read all of them ;-) because many
people have not understood what I was talking about - please read this
message for what is hopefully a better explanation.

Here's what I sent to bug-gcc:

gcc 1.39, (emacs 18.57)

When porting emacs to a system (Minix 1.3 on NS 32532) whose library
function setjmp doesn't save the registers, I had the following trouble.

I tried compiling it both with and without -traditional.  With
-traditional it crashed in a slightly different place than without,
but it crashed either way.

The result of the emacs crash was that a subroutine clobbered
arguments in the caller routine - and the caller routine didn't call
setjmp().  Like this:

void foo (x, y)
int x;
int y;
{
	int count = x - y; /* the compiler puts this in a register
				variable like r5; say count is 5 */
	/* do various things, not using r5 */
	fprintf (stderr, "Done various things, count = %d\n", count);
		/* count is 5 here */
	bar();	/* call bar() */
	fprintf (stderr, "Done bar(), count = %d\n", count);
		/* Oh wow, count is a wild number like 846378 here ! */
	baz(count); /* crash in baz() because r5 (count) is bogus */
}

I am told that ANSI C doesn't require setjmp to save the automatic
variables not declared volatile - but if setjmp doesn't save the
registers, problems arise because of the function in which longjmp is
called: the registers it uses are left clobbered, since that function
never reaches its epilogue, which would restore them from the stack.

Here's an example program exhibiting the behaviour I had:

#include <stdio.h>
#include <setjmp.h>
#include <unistd.h>	/* sleep() */

jmp_buf env;
void bar();	/* defined below, but foo() calls it first */
void foo()
{
	/* called from main(), no registers used here so nothing is
		saved in prologue  */
	if (setjmp (env)) /* at first setjmp returns 0, so we go call
				bar() */
		sleep(1);
	else {
		bar();
	}
}

void bar()
{
	register int c = 3; /* gcc uses r3 here so we save r3 in the
				stack and set it */
	fprintf (stderr, "c = %d\n", c); /* print r3 */
	longjmp(env, 1); /* do a longjmp to sleep(1) */
	sleep (1);
	/* oops, we never get here where we would restore r3 from
		stack */
}

main()
{
	/* execution starts here */
	register int a = 1; /* we allocate r3 for a */
	foo(); /* call foo, by calling convention foo must save r3 if
			it changes it */
	fprintf (stderr, "a = %d\n", a); /* print value of a (r3), which
	magically is 3 */
}

With gcc 1.39 and a setjmp that doesn't save r3, the program prints 'a
= 3'.  With a setjmp saving r3, it prints properly 'a = 1'.
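
The mechanism can be modelled without any particular compiler or
library.  The following is a minimal sketch, not the Minix or gcc code:
reg[] stands in for r0..r7, and sim_jmp_buf, sim_setjmp, sim_longjmp,
sim_bar and sim_run are hypothetical names invented for this
illustration.  The argument of sim_run selects whether the modelled
setjmp saves the registers.

```c
#include <assert.h>
#include <string.h>

#define NREGS 8			/* r0..r7 */

static int reg[NREGS];		/* simulated register file */

/* Hypothetical jump buffer: an optional snapshot of the registers. */
struct sim_jmp_buf {
	int saved[NREGS];
	int has_regs;		/* 0 models a setjmp that saves no registers */
};

static void sim_setjmp(struct sim_jmp_buf *jb, int save_regs)
{
	jb->has_regs = save_regs;
	if (save_regs)
		memcpy(jb->saved, reg, sizeof reg);
}

/* Models longjmp: registers come back only if setjmp saved them. */
static void sim_longjmp(const struct sim_jmp_buf *jb)
{
	if (jb->has_regs)
		memcpy(reg, jb->saved, sizeof reg);
}

/* Models bar(): the prologue saves r3, the body overwrites it, and
   longjmp leaves before the epilogue can put the old value back. */
static void sim_bar(const struct sim_jmp_buf *jb)
{
	int saved_r3 = reg[3];	/* prologue: push r3 */
	reg[3] = 3;		/* register int c = 3 */
	sim_longjmp(jb);	/* control leaves here in the real program */
	(void)saved_r3;		/* epilogue (reg[3] = saved_r3) is skipped */
}

/* Models main() and foo(); returns a as main() sees it afterwards. */
int sim_run(int setjmp_saves_regs)
{
	struct sim_jmp_buf jb;

	assert(setjmp_saves_regs == 0 || setjmp_saves_regs == 1);
	reg[3] = 1;				/* main: a lives in r3 */
	sim_setjmp(&jb, setjmp_saves_regs);	/* foo: uses no registers */
	sim_bar(&jb);
	return reg[3];				/* main: read a back from r3 */
}
```

Under these assumptions sim_run(0) returns 3 and sim_run(1) returns 1,
matching the two outputs described above.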


Suggestions for a solution:

a) gcc should save all call-saved registers in the prologue of a
function calling setjmp and restore them in the epilogue

b) setjmp should always store call-saved registers.  This is
impossible if users change the set of call-saved registers with the
-fcall-saved-REG compiler switch, so a) would seem a bit better.  But
of course setjmp could save all the registers.  Also, if you for some
reason must use a system-provided setjmp, this is out.

c) gcc should presume all registers are clobbered after a call to
setjmp and save the registers which may not be clobbered by the
function just before setjmp / restore them after that (this doesn't
really differ from b) and doesn't look like a good alternative).

d) the longjmp library function should restore the registers saved on
the stack by the function in which longjmp is called and by all
functions between the calls to setjmp and longjmp.  I don't think this
is possible on most machines, but that's what longjmp does on bsd-vax
(LONGJMP_RESTORE_FROM_STACK in tm-vax.h and tm-tahoe.h).  Gcc could
perhaps be modified to save the info to restore the regs on all
machines but this would make the calling convention incompatible and
would cause some overhead.

(by the way, LONGJMP_RESTORE_FROM_STACK doesn't seem to be used
anywhere - the code to test it in flow.c is inside #if 0)


The a) alternative would seem best to me, with a tm-file macro to turn
it on for machines where setjmp doesn't save the registers.  d) could be
another alternative, but I don't really see how it's better, and the
incompatibility and overhead speak for a).

Anyway, I don't see any reason not to save all registers in setjmp,
but if the library on some machine doesn't save them, the
compiler should do it (as in a)).


In the current situation the program

foo() { int a = 1; bar(); printf ("%d\n", a); }

could print just about anything when setjmp/longjmp is called somewhere in
bar() or a function called by bar() and the setjmp in use doesn't
save registers.
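
The property at stake can be written as a small check in portable C.
This is only a sketch: deep(), bar_with_jump() and
caller_local_survives() are hypothetical names, and an optimizing
compiler may keep count out of a call-saved register altogether, in
which case the check passes trivially.

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf env;

/* Uses a register variable, then leaves through longjmp, so its
   epilogue never runs. */
static void deep(void)
{
	register int c = 3;

	if (c == 3)
		longjmp(env, 1);
}

/* Stands in for bar(): setjmp and longjmp both happen below the caller. */
static int bar_with_jump(void)
{
	if (setjmp(env) == 0) {
		deep();
		return 0;	/* not reached: deep() always jumps */
	}
	return 1;		/* reached through longjmp */
}

/* Stands in for foo(): it never calls setjmp itself, so count must
   still hold x - y after the call returns. */
int caller_local_survives(int x, int y)
{
	int count = x - y;
	int jumped = bar_with_jump();

	assert(jumped == 1);
	return count;
}
```

On an implementation whose setjmp/longjmp pair preserves the call-saved
registers, caller_local_survives(7, 2) returns 5.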

//Jyrki

jvh@galactus.hut.fi (Johannes Helander) (02/12/91)

Now that this setjmp/longjmp stuff has been discussed so much, I'll
add my ten cents' worth by noting that in any 4.3BSD or compatible
operating system all registers have to be saved in setjmp().

In 4.3 the setjmp/longjmp stuff is closely tied to signals. A
longjmp() is equivalent to a sigreturn(). In fact longjmp() calls
sigreturn(), giving as an argument a jmpbuf that is interpreted as a
sigcontext by the kernel. Thus the definitions of sigcontext and
jmpbuf have to be identical. (Ultrix uses a couple of extra fields at
the end of the jmpbuf on the pmax, but even here sigcontext is a proper
subset of jmpbuf).

As signals are asynchronous events, all the registers need to be saved.
Sigreturn then restores all the registers. Longjmp simply patches the
return value into the jmpbuf/sigcontext r0 slot.
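
The arrangement can be sketched as a small model.  This is not the
4.3BSD source: the context layout, model_cpu and the model_* functions
are hypothetical, and the "kernel" here is just a structure copy.  The
point is only that setjmp has to snapshot everything, because longjmp
reloads the whole context through sigreturn.

```c
#include <assert.h>

#define NREGS 8			/* r0..r7 */

/* Hypothetical context layout; the real struct sigcontext differs. */
struct model_sigcontext {
	int sc_regs[NREGS];
	int sc_pc;
	int sc_sp;
};

/* The jump buffer is the same type, so sigreturn can read it directly. */
typedef struct model_sigcontext model_jmp_buf;

struct model_sigcontext model_cpu;	/* stands in for the live machine state */

/* setjmp: snapshot every register, as a signal delivery would. */
void model_setjmp(model_jmp_buf *jb)
{
	assert(jb != 0);
	*jb = model_cpu;
}

/* sigreturn: reload the whole machine state from a saved context. */
void model_sigreturn(const struct model_sigcontext *sc)
{
	model_cpu = *sc;
}

/* longjmp: patch the return value into the r0 slot, then sigreturn. */
void model_longjmp(model_jmp_buf *jb, int val)
{
	assert(jb != 0);
	jb->sc_regs[0] = val ? val : 1;	/* standard C: a longjmp value of 0 becomes 1 */
	model_sigreturn(jb);
}
```

In this model a register changed between model_setjmp and
model_longjmp comes back with its old value, and r0 carries the value
passed to model_longjmp.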

The same goes for Mach. The 4.3BSD emulation has to save all registers
and this is the way I wrote the code for the pc532 Mach. How the FPU
regs should be handled is on the other hand quite unclear to me. Is
a signal handler allowed to use the FPU? If so, should the FPU status
be restored after sigreturn? Is a longjmp supposed to keep the FPU
state? I guess it should be the responsibility of the program itself
(or the programmer).

Then a couple of notes on the NS532 CPU:

MOVUSi and MOVSUi use both ptb registers even if the processor is
working in single address mode. Is the TLB associated with ptb1 updated
as if the processor were executing in dual mode? (The data sheets don't
mention this :-). We solved this problem by always loading ptb1 when
ptb0 is loaded and always writing to ivar1 whenever ivar0 is written to.

Whenever SBITI causes an ABT trap (page fault), the msr register will
indicate a read (data transfer) cycle -- not a read-modify-write cycle
as it should.  Thus only a read fault is recognized although it really
was a write fault. Is there any clean way to recognize this situation
(except looking at the instruction)? The SBIT (non-interlocked set
bit) works correctly. As it now stands, an SBITI instruction will
cause our trap handler to loop (always mapping the page for reading,
never for writing). Of course the SBITI instruction doesn't need to be
used on a single processor computer like the pc532, but somebody might
use it anyway...

A very disappointing feature of the NS32532 processor is that there is
no 32-bit absolute addressing, only 30 bits -- or actually 29 since
the first bit is a sign bit -- can easily be addressed. This leads to
a maximum of 512MB of usable memory without severe complications in
the compiler and many other places. Maybe we should start using near
and far pointers :-). It was unfortunate that NS didn't choose to use
the one "reserved" addressing mode for this purpose (what is it used
for?). BTW, am I correct in assuming that 30 bit negative addresses
will point to top of memory?

Does anybody have bug lists for the processor? Are there other known
bugs? (I'm sure all processors have several, and it is not nice finding
them the hard way; it always wastes many hours.)

	Johannes

jvh@cs.hut.fi -- mcsun!hutcs!jvh

dlr@daver.bungi.com (Dave Rand) (02/13/91)

[In the message entitled "Re: A summary of the setjmp problem" on Feb 12,  5:18, Johannes Helander writes:]
> A very dissapointing feature of the NS32532 processor is that there is
> no 32-bit absolute addressing, only 30 bits -- or actually 29 since
> the first bit is a sign bit -- can easily be addressed. This leads to
> a maximum of 512MB of usable memory without severe complications in
> the compiler and many other places. Maybe we should start using near
> and far pointers :-). It was unfortunate that NS didn't choose to use
> the one "reserved" addressing mode for this purpose (what is it used
> for?). BTW, am I correct in assuming that 30 bit negative addresses
> will point to top of memory?
> 

No. Negative addresses are invalid. In the current revision of the 532,
they do map to the top of memory, but this cannot be counted on.

The displacement addressing of the Series 32000 is a 30 bit, 2's complement
signed value. This means that it may range (in branch instructions for
example) +/- 16 meg for the 32008, 32016, 32032, 32C016, 32C032 and 32CG16,
covering the entire address range with no problem. For the 32332 and
32532 (and future 32 bit processors), the range is +/- 0.5 gigabyte, not
1 gigabyte. Values in _displacement_ may range from 0x1fffffff positive to
0x20000000 negative. Regretfully, negative values in absolute addresses
are "undefined" in the current programmer's reference manual for the
Series 32000, so you can't assume that:
	ADDR	-1,r0
will point r0 to the last byte in memory.

Register-relative addressing also uses a displacement field (so you
have all the advantages of code density), so it too is limited to
+/- 0.5 gigabyte. You may not use register relative to span the entire
4 gigabyte range.

So - for that Reeeally big program (> 500 megabytes), you will have to use
a jump table, or use immediate addressing modes.
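
The limits quoted above can be checked with a few lines of C.  This is
a sketch of the range arithmetic only, not of the instruction
encoding; disp_fits, checked_disp and abs_addr_defined are
hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/* A 30 bit two's complement displacement: -2^29 .. 2^29 - 1. */
#define DISP_MAX ((int32_t)0x1fffffff)
#define DISP_MIN (-(int32_t)0x20000000)

/* Does value fit in a 30 bit signed displacement field? */
int disp_fits(int64_t value)
{
	return value >= DISP_MIN && value <= DISP_MAX;
}

/* Narrow a displacement that the caller believes is in range. */
int32_t checked_disp(int64_t value)
{
	assert(disp_fits(value));
	return (int32_t)value;
}

/* Absolute addresses: only the non-negative half is defined. */
int abs_addr_defined(uint32_t addr)
{
	return addr <= (uint32_t)DISP_MAX;
}
```

With these definitions the ICU address 0xfffffe00 fails
abs_addr_defined(), which is why it has to be loaded into a register
first.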

The real problem is I/O on the 332 and 532. You must use a move instruction,
followed by register-relative to get to the really high addresses:

	MOVD	$ICU,r0		# load address of ICU (0xfffffe00)
	MOVB	xx,0(r0)	# access the ICU

Life is like that. Such a small price to pay for the benefit of tight
code that you can get on the 32000.

-- 
Dave Rand
{pyramid|mips|bct|vsi1}!daver!dlr	Internet: dlr@daver.bungi.com

dmason@uwaterloo.ca (Dave Mason) (02/14/91)

In article <m0j64dI-00005ZC@daver.bungi.com> dlr@daver.bungi.com (Dave Rand) writes:
> The real problem is I/O on the 332 and 532. You must use a move instruction,
> followed by register-relative to get to the really high addresses:
>   MOVD	$ICU,r0		# load address of ICU (0xfffffe00)
>   MOVB	xx,0(r0)	# access the ICU
> Life is like that. Such a small price to pay for the benefit of tight
> code that you can get on the 32000.

Or map the 0xfffff000 page to 0x1ffff000.  Thereafter, everything would
be reachable.  Any weird problems with this?
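
The arithmetic behind this suggestion can be checked with a short
sketch.  It assumes 4 KB pages and takes 0x1fffffff as the highest
defined absolute address, as described earlier in the thread; io_alias
and abs_reachable are hypothetical helper names.  It says nothing
about whether the MMU, caches or devices behave well under such a
mapping.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_OFFSET_MASK 0xfffu		/* assumes 4 KB pages */
#define IO_PAGE_PHYS	 0xfffff000u	/* top page, where the ICU lives */
#define IO_PAGE_VIRT	 0x1ffff000u	/* proposed alias for that page */
#define ABS_ADDR_MAX	 0x1fffffffu	/* highest defined absolute address */

/* Virtual address at which an address in the top page would appear. */
uint32_t io_alias(uint32_t phys)
{
	assert((phys & ~PAGE_OFFSET_MASK) == IO_PAGE_PHYS);
	return IO_PAGE_VIRT | (phys & PAGE_OFFSET_MASK);
}

/* Can this address be used directly as an absolute address? */
int abs_reachable(uint32_t addr)
{
	return addr <= ABS_ADDR_MAX;
}
```

Under these assumptions the ICU at 0xfffffe00 would appear at
0x1ffffe00, which is within reach of absolute addressing.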

	../Dave