[net.lang.c] setjmp/longjmp, 68000 moveml instruction

geoff@desint.UUCP (Geoff Kuenning) (11/13/84)

>> Ummm... it's easy to push the register mask when you moveml the registers
>> to be saved on entry to a subroutine in the 68000, but moveml mask,(a7)-
>> and moveml mask,(a7)+ expect the mask reversed from each other!  How
>> do you propose to invert a 16-bit mask end-for-end (i.e. 11100...001
>> <=> 100...00111 ) and still have a fast calling sequence?
>
>Just great -- more computer architecture brain damage!

The only brain damage around here is in the theory that you have to reverse
the bits in software during the calling sequence.

Here is the 68000 entry/exit sequence used by the MIT pcc:

	link	a6,#<localsize>
	tstb	sp@(132.)
	moveml	#mask,a6@(offset)
	...
	moveml	a6@(offset),#mask
	unlk	a6
	rts

(The tstb is a "stack probe" instruction; it has to do with the things
that keep you from doing virtual memory on a plain 68000.)

Now, to implement setjmp/longjmp, there is no need to write fancy code to
reverse the bits in a 16-bit word.  (By the way, that task can easily be
done in a very few instructions with a table lookup on the two bytes and a
byte swap.)  Why would you want to insert all that extra code into a
call/return sequence when you already *know* at compile time what those
reversed bits are?  In fact, the only thing you need to do is store the
information longjmp needs to restore the registers that were actually saved.
This involves pushing a 16-bit copy of the moveml mask onto the stack.  You
have to decode it in longjmp by bit shifting (unless you are into
instruction modification), but who cares if longjmp is a bit slow?
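
As a rough illustration of the two operations mentioned above, here is a
sketch in C rather than 68000 assembly.  The names rev8, reverse16 and
saved_regs are invented for this example.  The bit numbering assumed in
saved_regs (bit 0 = d0 through bit 15 = a7) is the one the 68000 uses for
moveml in every addressing mode except predecrement.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* rev8[b] holds byte b with its 8 bits reversed; filled in on first use. */
static uint8_t rev8[256];
static int rev8_ready = 0;

static void init_rev8(void)
{
    for (int i = 0; i < 256; i++) {
        uint8_t r = 0;
        for (int b = 0; b < 8; b++)
            if (i & (1 << b))
                r |= (uint8_t)(0x80 >> b);
        rev8[i] = r;
    }
    rev8_ready = 1;
}

/* Reverse a 16-bit mask end-for-end: one table lookup per byte,
 * with the two reversed bytes swapped as they are recombined. */
static uint16_t reverse16(uint16_t m)
{
    if (!rev8_ready)
        init_rev8();
    return (uint16_t)((rev8[m & 0xFF] << 8) | rev8[m >> 8]);
}

/* Decode a register mask by shifting, the way a longjmp could.
 * Assumes bit 0 = d0 ... bit 7 = d7, bit 8 = a0 ... bit 15 = a7.
 * Stores the numbers (0..15) of the saved registers in regs[] and
 * returns how many there are. */
static int saved_regs(uint16_t mask, int regs[16])
{
    int n = 0;
    assert(regs != NULL);
    for (int r = 0; mask != 0; r++, mask >>= 1)
        if (mask & 1)
            regs[n++] = r;
    return n;
}
```

For instance, reverse16(0xE001) gives 0x8007, which is the
11100...001 <=> 100...00111 case from the quoted question.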

There is one other subtlety here:  the MIT pcc puts the saved registers at
the *lowest* address of the stack frame, not the highest.  Since longjmp
does not know how big the frame is (it can't find the link instruction in
the general case), this makes it impossible to find the registers.  So,
while you were modifying the compiler, you would also have to make a slight
modification to allocate the saved registers at the very top of the frame,
where they would be easy to locate.
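
To make the point about frame layout concrete, here is a small C
simulation of what longjmp could do if the saved registers sat at the top
of the frame.  The layout in the comment is only an assumption made for
this sketch (it is not what the MIT pcc emits), and get16, get32 and
restore_frame are hypothetical names.  Memory is modelled as a big-endian
byte array, and fp stands in for the frame's a6.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed (hypothetical) frame layout, addresses relative to fp:
 *   fp + 0        caller's saved a6
 *   fp - 2        16-bit copy of the moveml mask
 *   fp - 2 - 4*n  the n saved registers, lowest-numbered first
 *   below that    locals, of a size longjmp never needs to know
 */

/* Big-endian loads from the simulated memory. */
static uint16_t get16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t get32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Reload the registers one frame saved into regs[] (0..7 = d0..d7,
 * 8..15 = a0..a7), using only fp and the mask word stored in the
 * frame.  Returns the number of registers reloaded. */
static int restore_frame(const uint8_t *mem, uint32_t fp, uint32_t regs[16])
{
    uint16_t mask = get16(mem + fp - 2);
    int n = 0;

    /* Count the saved registers by shifting the mask. */
    for (uint16_t m = mask; m != 0; m >>= 1)
        n += m & 1;
    assert(fp >= 2u + 4u * (uint32_t)n);

    /* The save area starts 4*n bytes below the mask word. */
    uint32_t addr = fp - 2u - 4u * (uint32_t)n;
    for (int r = 0; r < 16; r++) {
        if (mask & (1u << r)) {
            regs[r] = get32(mem + addr);
            addr += 4;
        }
    }
    return n;
}
```

Nothing here depends on the size of the locals, which is the property the
lowest-address placement lacks.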
-- 

	Geoff Kuenning
	First Systems Corporation
	...!ihnp4!trwrb!desint!geoff

gnu@sun.uucp (John Gilmore) (11/22/84)

> Here is the 68000 entry/exit sequence used by the MIT pcc:
> 
> 	link	a6,#<localsize>
> 	tstb	sp@(132.)
> 	moveml	#mask,a6@(offset)
> 	...
> 	moveml	a6@(offset),#mask
> 	unlk	a6
> 	rts

There is an awful lot of fat in the MIT pcc code.  Above is some of it.
For example, it generates the moveml's even when the mask is zero (no
registers need to be saved).  Also, the offset used is ALWAYS the same
as the <localsize>, so a6@(offset) is exactly where sp points after the
link, and they could have done:
	moveml	#mask,sp@
saving another word of instructions & slowness.

Now, we could add back in some fat by inserting
	movw	#mask,sp@(someoffset)
to make longjmp work, but if the average subroutine is 30 instructions long,
that's a large price to pay.  I don't want to dedicate 3% of my system
to housekeeping for longjmp.  Even if the average is 100, 1% for longjmp
is still too much!