morten@cs.qmw.ac.uk (Morten Ronseth) (03/19/90)
(A last desperate attempt to make `alloca' work on the Mac...)
This is the assembly for my `alloca' routine, a rewrite of the 
one used in GNU Emacs.
Now, the `alloca' works fine in all of my own code, both using MPW 
and LightspeedC. I thought I had it all figured out, until I linked
it into gcc (as in the actual GNU code), using MPW C, and found I 
was wrong. The compiler proper, cc1, just falls over on this code, and as 
the stack pointer gets garbled I can't do a stack trace. When I use 
`malloc' instead of `alloca', everything is fine (but don't tell me
to use `malloc' instead and just forget about the whole thing!) 
Can somebody please tell me what is wrong?
				Morten.
Strategy:
Because one can never predict how many registers (if any)
are saved at function entry (movem.l <registers>,-(sp)),
assume the worst case and copy ALL registers to TOS, where they
are expected to be found at return time.
	CASE OBJ
		
alloca PROC EXPORT
		movea.l	(sp)+,a0	; pop return addr from tos
		move.l	(sp)+,d0	; pop size in bytes from tos
		move.l	sp,a1		; save old sp for register copy
		move.l	sp,d1		; compute new sp
		sub.l	d0,d1		; allocate requested space on stack
		and.l	#-4,d1		; round down to longword
		sub.l	#16 * 4,d1	; space for saving registers
		move.l	d1,sp		; save new value of sp
		move.w	#16 - 1,d0	; loop counter...
@loop
		move.l	(a1)+,(sp)+	; copy the register to top of stack
		dbra	d0,@loop	; loop...
		move.l	sp,d0		; return value
		move.l	d1,sp		; load new value for sp
		jmp	(a0)		; rts
		ENDPROC
		END
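
As a reading aid, here is a minimal C model of what the routine above does to the stack. It is a sketch under stated assumptions, not the Mac code: `mem', `model_alloca', `check_layout' and the 1 KB simulated RAM are names and sizes invented for illustration, and the model ignores interrupts and whatever stack cleanup the caller does after the call.

```c
#include <assert.h>
#include <stdint.h>

#define NREGS 16                 /* longwords copied, as in the post */
#define MEM_LONGS 256            /* simulated RAM size (assumption) */

static uint32_t mem[MEM_LONGS];  /* byte address a lives in mem[a / 4] */

typedef struct {
    uint32_t sp;                 /* stack pointer on return */
    uint32_t d0;                 /* returned block address */
} Result;

/* Step-by-step model of the posted routine. On entry sp points at the
   return address, with the size argument in the longword above it. */
static Result model_alloca(uint32_t sp)
{
    Result r;
    uint32_t a1, d0, d1, i;

    assert(sp % 4 == 0 && sp + 8 + NREGS * 4 <= MEM_LONGS * 4);

    sp += 4;                         /* movea.l (sp)+,a0 : pop return address */
    d0 = mem[sp / 4]; sp += 4;       /* move.l  (sp)+,d0 : pop size */
    assert(d0 + NREGS * 4 + 3 <= sp);/* request must fit in the simulated RAM */

    a1 = sp;                         /* move.l sp,a1 : source of the copy */
    d1 = (sp - d0) & ~(uint32_t)3;   /* sub.l d0,d1 / and.l #-4,d1 */
    d1 -= NREGS * 4;                 /* sub.l #16 * 4,d1 */
    sp = d1;                         /* move.l d1,sp */
    for (i = 0; i < NREGS; i++) {    /* move.l (a1)+,(sp)+ / dbra */
        mem[sp / 4] = mem[a1 / 4];
        a1 += 4;
        sp += 4;
    }
    r.d0 = sp;                       /* move.l sp,d0 : return value */
    r.sp = d1;                       /* move.l d1,sp */
    return r;
}

/* Builds a caller frame, runs the model and checks the resulting layout.
   Returns 1 if every property holds, 0 otherwise. */
static int check_layout(uint32_t size)
{
    uint32_t top = MEM_LONGS * 4;        /* one past the highest address */
    uint32_t regs = top - NREGS * 4;     /* address of the first saved longword */
    uint32_t sp = regs - 8;              /* size and return address sit below */
    uint32_t i;
    Result r;

    for (i = 0; i < NREGS; i++)
        mem[regs / 4 + i] = 0xA0 + i;    /* recognisable "saved register" values */
    mem[(sp + 4) / 4] = size;
    mem[sp / 4] = 0xDEAD;                /* stand-in return address */

    r = model_alloca(sp);

    if (r.d0 % 4 != 0) return 0;               /* block is longword aligned */
    if (r.d0 + size > regs) return 0;          /* block ends at or below the originals */
    if (r.d0 != r.sp + NREGS * 4) return 0;    /* block starts just above the copies */
    for (i = 0; i < NREGS; i++)
        if (mem[r.sp / 4 + i] != 0xA0 + i) return 0;   /* copies match originals */
    return 1;
}
```

Calling `check_layout(10)' from a test harness exercises the rounding step: the address is rounded down, so the block is at least as large as the request.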
(If I remember correctly, MPW C defers the pop'ing, `addq.l #n,sp',
 of params after the return of a 'jsr', i.e. relying on an `unlk' to
 adjust the stack. Hence I do not need a `subq.l #4,sp' in my 'alloca'
 to fix the stack. But what, do I hear you ask, if the calling function
 doesn't take any params nor uses any local vars? Well, I'm compiling 
 with `-g' so the compiler should generate a `link' and an `unlk' no matter
 what...shouldn't it?) 
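
The bookkeeping behind that argument can be written down in a few lines of C. This is only a sketch: `sp_offset_after_call' and its two flags are hypothetical names for illustration, and the model assumes a later register restore reads from wherever sp points once the caller is done with the call.

```c
#include <assert.h>
#include <stdint.h>

/* Offset of sp, in bytes, from the first copied longword once the caller
   has finished with the call. 0 means sp is where a later register
   restore expects it. Both flags are hypothetical switches for this model. */
static int32_t sp_offset_after_call(int alloca_does_subq, int caller_pops_arg)
{
    int32_t sp = 0;                 /* alloca's `move.l d1,sp' lands on the copies */

    assert(alloca_does_subq == 0 || alloca_does_subq == 1);
    assert(caller_pops_arg == 0 || caller_pops_arg == 1);

    if (alloca_does_subq)
        sp -= 4;                    /* subq.l #4,sp before jmp (a0) */
    if (caller_pops_arg)
        sp += 4;                    /* caller's addq.l #4,sp after the jsr */
    return sp;
}
```

Under this model, a routine without the `subq.l' is only right when the caller never pops the argument, and a routine with it is only right when the caller always does. The mismatched cases leave sp 4 bytes too high or too low.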
	The LightspeedC version looks like this (pretty much the same, huh?):
#define MAXREG	16
long alloca ()
{
	asm{
		movea.l	(sp)+,a0	; pop return addr from tos
		move.l	(sp)+,d0	; pop size in bytes from tos
		move.l	sp,a1		; save old sp for register copy
		move.l	sp,d1		; compute new sp
		sub.l	d0,d1		; allocate requested space on stack
		and.l	#-4,d1		; round down to longword
		sub.l	#MAXREG * 4,d1	; space for saving registers
		move.l	d1,sp		; save new value of sp
		move.w	#MAXREG - 1,d0	; loop counter...
loop:
		move.l	(a1)+,(sp)+	; copy the register to top of stack
		dbra	d0,@loop	; loop...
		move.l	sp,d0		; return value
		move.l	d1,sp		; load new value for sp
		subq.l	#4,sp		; caller will do `addq.l #4,sp'
		jmp	(a0)		; rts
	}
}
(No optimization here, let's fix the stack right away, we've got all the
 time in the world, right?)
-- 
==============================================================================
Morten Lerskau Ronseth
UUCP:     morten@qmw-cs.uucp       	   or ...seismo!mcvax!ukc!qmw-cs!morten
JANET:    morten@uk.ac.qmw.cs 	       Post:  Dept of Computer Science 
ARPA:     morten%qmw.cs@ucl-cs.arpa    		  Queen Mary and Westfield College 
Easylink: 19019285                     		  University of London
Telex:    893750 QMCUOL                		  Mile End Road
Fax:      +44 1 981 7517               		  London E1 4NS
Phone:    +44 1 975 5220               		  England
brecher@well.sf.ca.us (Steve Brecher) (03/21/90)
In article <1801@sequent.cs.qmw.ac.uk>, morten@cs.qmw.ac.uk (Morten Ronseth) writes:
> This is the assembly for my `alloca' routine, a rewrite of the
> one used in GNU Emacs.
> ...
> 		movea.l	(sp)+,a0	; pop return addr from tos
> 		move.l	(sp)+,d0	; pop size in bytes from tos
> 		move.l	sp,a1		; save old sp for register copy
> 		move.l	sp,d1		; compute new sp
> 		sub.l	d0,d1		; allocate requested space on stack
> 		and.l	#-4,d1		; round down to longword
> 		sub.l	#16 * 4,d1	; space for saving registers
> 		move.l	d1,sp		; save new value of sp
> 		move.w	#16 - 1,d0	; loop counter...
> @loop
> 		move.l	(a1)+,(sp)+	; copy the register to top of stack
> 		dbra	d0,@loop	; loop...
> 		move.l	sp,d0		; return value
> 		move.l	d1,sp		; load new value for sp
> 		jmp	(a0)		; rts

Unfortunately I don't know what "alloca" is or what it's supposed to do;
but I know the above code is wrong, because (SP)+ never makes sense as the
destination of a move, unless maybe to make sure some data on the stack is
overwritten (say, a password) for security reasons. The stack is used by
asynchronous processes such as interrupt handlers and hence any data
immediately below SP is subject to destruction at any time.

I infer that what is wanted is something like this:

On entry--
		stuff[15]	("stuff" = saved register value?)
		...
		stuff[0]
		size of desired stack allocation
	SP->	return address

On exit--
		stuff[15]
		...
		stuff[0]
		end of allocated stack area
		...
	D0->	start of allocated stack area
		copy of stuff[15]
		...
	SP->	copy of stuff[0]

If I infer correctly, then:

		Move.L	(SP)+,A0	;return address
		MoveQ	#-4,D0		;truncation mask
		And.L	(SP)+,D0	;size of allocated area, truncated
		Lea	4*16(SP),A1	;point beyond end of stuff to be copied
		Sub.L	D0,SP		;allocate the area
		Move.L	SP,D0		;return value
		MoveQ	#16-1,D1	;Dbra count
@0		Move.L	-(A1),-(SP)	;copy a longword of stuff
		Dbra	D1,@0
		Jmp	(A0)		;return

> (If I remember correctly, MPW C defers the pop'ing, `addq.l #n,sp',
>  of params after the return of a 'jsr', i.e. relying on an `unlk' to
>  adjust the stack. Hence I do not need a `subq.l #4,sp' in my 'alloca'
>  to fix the stack. But what, do I hear you ask, if the calling function
>  doesn't take any params nor uses any local vars? Well, I'm compiling
>  with `-g' so the compiler should generate a `link' and an `unlk' no
>  matter what...shouldn't it?)

MPW C does not necessarily depend on a final UNLK to fix the stack; it may
deallocate the parameters to a called routine, or re-use the allocated
stack space, at any time after the call. So if the above is to be an MPW C
function, you should indeed fix the stack by inserting SubQ #4,SP before
the final Jmp.

Whether the assumption that there are saved registers (and not some
transient temporaries) on the stack immediately above the parameter is
valid I cannot say, but in the general case of a function called from an
arbitrary point in another, the assumption would NOT be warranted.
--
brecher@well.sf.ca.us (Steve Brecher)
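
The point about data below SP can be illustrated with a small C simulation of the two copy loops. It is a sketch under simplifying assumptions: `interrupt', `corrupted_after_copy', the memory layout and the two-longword "frame" are invented for illustration and do not reproduce a real 68000 exception frame. Addresses are longword indices rather than byte addresses.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NREGS 16
#define MEM_LONGS 128
#define FRAME_LONGS 2            /* longwords an interrupt overwrites (simplified) */

static uint32_t mem[MEM_LONGS];  /* simulated RAM, indexed by longword */

/* Simulated interrupt: whatever lies just below sp gets overwritten. */
static void interrupt(uint32_t sp)
{
    uint32_t i;

    assert(sp >= FRAME_LONGS);
    for (i = 1; i <= FRAME_LONGS; i++)
        mem[sp - i] = 0xFFFFFFFFu;
}

/* Copies NREGS longwords with one interrupt arriving right after move
   number irq_after_move (0-based). Returns how many copied longwords
   end up corrupted. */
static int corrupted_after_copy(int descending, int irq_after_move)
{
    const uint32_t src = 100, dst = 40;  /* arbitrary, non-overlapping */
    uint32_t a1, sp, i;
    int bad = 0;

    memset(mem, 0, sizeof mem);
    for (i = 0; i < NREGS; i++)
        mem[src + i] = 0xA0 + i;

    if (descending) {                    /* Move.L -(A1),-(SP) */
        a1 = src + NREGS;
        sp = dst + NREGS;
        for (i = 0; i < NREGS; i++) {
            mem[--sp] = mem[--a1];
            if ((int)i == irq_after_move)
                interrupt(sp);
        }
    } else {                             /* move.l (a1)+,(sp)+ */
        a1 = src;
        sp = dst;
        for (i = 0; i < NREGS; i++) {
            mem[sp++] = mem[a1++];
            if ((int)i == irq_after_move)
                interrupt(sp);
        }
    }

    for (i = 0; i < NREGS; i++)
        if (mem[dst + i] != 0xA0 + i)
            bad++;
    return bad;
}
```

In this model the ascending copy leaves each freshly written longword below sp, so an interrupt destroys it. The descending copy only exposes slots that have not been written yet, and those are overwritten afterwards. Separately, note that the And.L in the reply truncates the size itself, so a request that is not a multiple of 4 gets up to 3 bytes less than asked for; adding 3 to the size before masking would round it up instead.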