geoff@desint.UUCP (Geoff Kuenning) (11/13/84)
>> Ummm... it's easy to push the register mask when you moveml the registers
>> to be saved on entry to a subroutine in the 68000, but moveml mask,(a7)-
>> and moveml mask,(a7)+ expect the mask reversed from each other!  How
>> do you propose to invert a 16-bit mask end-for-end (i.e. 11100...001
>> <=> 100...00111 ) and still have a fast calling sequence?
>
>Just great -- more computer architecture brain damage!

The only brain damage around here is in the theory that you have to reverse
the bits in software during the calling sequence.  Here is the 68000
entry/exit sequence used by the MIT pcc:

        link    a6,#<localsize>
        tstb    sp@(132.)
        moveml  #mask,a6@(offset)
        ...
        moveml  a6@(offset),#mask
        unlk    a6
        rts

(The tstb is a "stack probe instruction"; it has to do with the things that
keep you from doing virtual memory on a plain 68000).

Now, to implement setjmp/longjmp, there is no need to write fancy code to
reverse the bits in a 16-bit word.  (By the way, that task can easily be done
in a very few instructions with a table lookup on the two bytes and a byte
swap.)  Why would you want to insert all that extra code into a call/return
sequence when you already *know* at compile time what those reversed bits are?

In fact, the only thing you need to do is store the information longjmp needs
to restore the registers that were actually saved.  This involves pushing a
16-bit copy of the moveml mask onto the stack.  You have to decode it in
longjmp by bit shifting (unless you are into instruction modification), but
who cares if longjmp is a bit slow?

There is one other subtlety here:  the MIT pcc puts the saved registers at the
*lowest* address of the stack frame, not the highest.  Since longjmp does not
know how big the frame is (it can't find the link instruction in the general
case), this makes it impossible to find the registers.  So, while you were
modifying the compiler, you would also have to make a slight modification to
allocate the saved registers at the very top of the frame, where they would
be easy to locate.
-- 
        Geoff Kuenning
        First Systems Corporation
        ...!ihnp4!trwrb!desint!geoff
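
The two techniques mentioned in passing above (reversing a 16-bit mask with a byte table plus a byte swap, and decoding a saved moveml mask by shifting) can be sketched in C roughly as follows. This is an illustration only, not code from any poster or from the MIT pcc. The function names are hypothetical, and the bit layout is an assumption: it is the one moveml uses with an a6@(offset) operand, where bit 0 is d0, bit 7 is d7, bit 8 is a0 and bit 15 is a7.

```c
#include <stdint.h>

/* 256-entry table: rev_table[b] is byte b with its 8 bits reversed. */
static uint8_t rev_table[256];

/* Slow bit-by-bit reversal of one byte; used only to fill the table. */
static uint8_t reverse8_slow(uint8_t b)
{
    uint8_t r = 0;
    for (int i = 0; i < 8; i++) {
        r = (uint8_t)((r << 1) | (b & 1));
        b >>= 1;
    }
    return r;
}

/* Fill the table once, before the first call to reverse16(). */
static void init_rev_table(void)
{
    for (int i = 0; i < 256; i++)
        rev_table[i] = reverse8_slow((uint8_t)i);
}

/* Reverse a 16-bit mask end-for-end: look up each byte in the table,
 * then swap the two bytes. */
static uint16_t reverse16(uint16_t m)
{
    return (uint16_t)((rev_table[m & 0xFF] << 8) | rev_table[m >> 8]);
}

/* Decode a saved-register mask by shifting, as a longjmp might.
 * Assumed layout: bit 0 = d0 ... bit 7 = d7, bit 8 = a0 ... bit 15 = a7.
 * Stores the register numbers (0..15) of the set bits in out[], lowest
 * bit first, and returns how many registers the mask names. */
static int decode_mask(uint16_t mask, int out[16])
{
    int n = 0;
    for (int reg = 0; mask != 0; reg++, mask >>= 1) {
        if (mask & 1)
            out[n++] = reg;
    }
    return n;
}
```

With the mask 0x3C0C, decode_mask() reports six registers (d2, d3, a2 through a5). Assuming each register was stored as a 32-bit longword in ascending order, the k-th entry of out[] would sit 4*k bytes above the start of the save area.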
gnu@sun.uucp (John Gilmore) (11/22/84)
> Here is the 68000 entry/exit sequence used by the MIT pcc:
>
>       link    a6,#<localsize>
>       tstb    sp@(132.)
>       moveml  #mask,a6@(offset)
>       ...
>       moveml  a6@(offset),#mask
>       unlk    a6
>       rts

There is an awful lot of fat in the MIT pcc code.  Above is some of it.
For example, it generates the moveml's even when the mask is zero (no
registers need to be saved).  Also, the offset used is ALWAYS the same as
the <localsize>, thus they could have done:

        moveml  #mask,sp@

saving another word of instructions & slowness.

Now, we could add back in some fat by inserting

        movw    #mask,sp@(someoffset)

to make longjmp work, but if the average subroutine is 30 instructions long,
that's a large price to pay.  I don't want to dedicate 3% of my system to
housekeeping for longjmp.  Even if the average is 100, 1% for longjmp is
still too much!