ian@sibyl.eleceng.ua.oz.au (12/20/89)
While trying to get the -fomit-frame-pointer option working for the
ns32k, I realised that a lot of the work required was processor
independent. Here is my implementation of a new (processor
independent) function which greatly simplifies the definition of
FIX_FRAME_POINTER_ADDRESS on most (all?) machines. The
FIX_FRAME_POINTER_ADDRESS macro itself still has to be processor
dependent, as it depends on the stack layout. The following has been
tested for the ns32k (stage 2 compare works with the
-fomit-frame-pointer option). No guarantees for other architectures,
although as I say I believe the fix_frame_pointer_address function to
be architecture independent.
I get a 16% increase in dhrystones using the -fomit-frame-pointer
option. I don't claim you will necessarily get that speed increase
for real programs.
I currently have the fix_frame_pointer_address function in the
aux_output.c file in order to keep all my hacks together. It logically
belongs in reload1.c.
The architecture independent function:
/* Used by FIX_FRAME_POINTER_ADDRESS.
 * This function recursively looks at "addr" to see if there is a frame
 * pointer reference in there somewhere. If so it converts it to a reference
 * to stack pointer plus offset.
 * This function is intended to work no matter how CISCy (or otherwise) the
 * addressing modes are.
 */
rtx fix_frame_pointer_address (addr, offset)
     rtx addr;
     int offset;
{
  rtx arg0, old_arg0, arg1, old_arg1;
  enum machine_mode mode;

  /* Check for a null rtx before GET_MODE dereferences it.  */
  if (addr == 0)
    return addr;

  mode = GET_MODE (addr);

  if (addr == frame_pointer_rtx)
    return plus_constant (stack_pointer_rtx, offset);

  switch (GET_CODE (addr))
    {
    case REG:
      return addr;

    case MEM:
      old_arg0 = arg0 = XEXP (addr, 0);
      arg0 = fix_frame_pointer_address (arg0, offset);
      if (arg0 != old_arg0)
        {
          /* Build a new MEM, preserving the volatile flag.  */
          rtx mem = gen_rtx (MEM, mode, arg0);
          MEM_VOLATILE_P (mem) = MEM_VOLATILE_P (addr);
          return mem;
        }
      else
        return addr;

    case PLUS:
      old_arg0 = arg0 = XEXP (addr, 0);
      old_arg1 = arg1 = XEXP (addr, 1);
      if (arg0 == frame_pointer_rtx)
        return plus_constant (gen_rtx (PLUS, mode, stack_pointer_rtx, arg1),
                              offset);
      else if (arg1 == frame_pointer_rtx)
        return plus_constant (gen_rtx (PLUS, mode, arg0, stack_pointer_rtx),
                              offset);
      else
        {
          arg0 = fix_frame_pointer_address (arg0, offset);
          arg1 = fix_frame_pointer_address (arg1, offset);
          if (arg0 != old_arg0 || arg1 != old_arg1)
            return gen_rtx (PLUS, mode, arg0, arg1);
          else
            return addr;
        }

    case MULT:
      old_arg0 = arg0 = XEXP (addr, 0);
      old_arg1 = arg1 = XEXP (addr, 1);
      arg0 = fix_frame_pointer_address (arg0, offset);
      arg1 = fix_frame_pointer_address (arg1, offset);
      if (arg0 != old_arg0 || arg1 != old_arg1)
        return gen_rtx (MULT, mode, arg0, arg1);
      else
        return addr;
    }

  /* Any other rtx code is left untouched.  */
  return addr;
}
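
For anyone who wants to play with the recursion outside the compiler,
here is a rough standalone sketch of the same idea on a toy expression
tree. None of it is gcc code: toy_node, toy_make, toy_is_fp and
toy_fix_fp are made-up names standing in for rtx, gen_rtx and the
function above, and the toy does not fold constants the way
plus_constant does. Like the real function, it hands back the original
pointer when nothing changed, so the caller can compare pointers to see
whether a rewrite happened.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for rtx codes; not GCC definitions.  */
enum toy_kind { T_REG, T_CONST, T_MEM, T_PLUS, T_MULT };

struct toy_node {
  enum toy_kind kind;
  int value;                  /* register number or constant value */
  struct toy_node *op0, *op1; /* operands; unused ones are NULL */
};

enum { TOY_FP_REGNO = 1, TOY_SP_REGNO = 2 };

/* Allocate a node.  The toy never frees anything.  */
struct toy_node *
toy_make (enum toy_kind kind, int value,
          struct toy_node *op0, struct toy_node *op1)
{
  struct toy_node *n = malloc (sizeof *n);
  assert (n != NULL);
  n->kind = kind;
  n->value = value;
  n->op0 = op0;
  n->op1 = op1;
  return n;
}

int
toy_is_fp (const struct toy_node *n)
{
  return n != NULL && n->kind == T_REG && n->value == TOY_FP_REGNO;
}

/* Return X with every frame pointer reference replaced by
   (plus sp offset).  Subtrees that contain no frame pointer are
   returned as the same pointer, never copied.  */
struct toy_node *
toy_fix_fp (struct toy_node *x, int offset)
{
  struct toy_node *a, *b;

  if (x == NULL)
    return x;
  if (toy_is_fp (x))
    return toy_make (T_PLUS, 0,
                     toy_make (T_REG, TOY_SP_REGNO, NULL, NULL),
                     toy_make (T_CONST, offset, NULL, NULL));
  switch (x->kind)
    {
    case T_MEM:
      a = toy_fix_fp (x->op0, offset);
      return a != x->op0 ? toy_make (T_MEM, 0, a, NULL) : x;
    case T_PLUS:
    case T_MULT:
      a = toy_fix_fp (x->op0, offset);
      b = toy_fix_fp (x->op1, offset);
      if (a != x->op0 || b != x->op1)
        return toy_make (x->kind, 0, a, b);
      return x;
    default:
      return x;               /* plain registers and constants */
    }
}
```

Feeding it the toy equivalent of (mem (plus fp 8)) with an offset of 12
should give back a new tree shaped like (mem (plus (plus sp 12) 8)),
while an address that never mentions the frame pointer comes back as
the very same pointer.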
The implementation of FIX_FRAME_POINTER_ADDRESS for the ns32k, using the
above function:
#define FIX_FRAME_POINTER_ADDRESS(ADDR,DEPTH) \
{ \
  register int regno, offset = (DEPTH) - 4; \
  extern char call_used_regs[]; \
  extern rtx fix_frame_pointer_address (); \
  for (regno = 0; regno < 16; regno++) \
    if (regs_ever_live[regno] && ! call_used_regs[regno]) \
      offset += 4; \
  ADDR = fix_frame_pointer_address (ADDR, offset); \
}
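
To make the offset arithmetic in the macro easier to check by hand,
here is a small standalone sketch of just that part. The function name
toy_fp_offset and the TOY_ constants are invented for illustration; the
two arrays play the role of regs_ever_live and call_used_regs, and the
values 16 and 4 are simply copied from the macro.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NREGS 16   /* the macro scans registers 0..15 */
#define TOY_WORD  4    /* bytes added per saved register */

/* Mirror the macro's arithmetic: start from depth - 4, then add 4 for
   each register that is ever live and is not call-used, i.e. each
   register the function itself has to save.  */
int
toy_fp_offset (int depth, const char *ever_live, const char *call_used)
{
  int regno;
  int offset = depth - TOY_WORD;

  assert (ever_live != NULL && call_used != NULL);
  for (regno = 0; regno < TOY_NREGS; regno++)
    if (ever_live[regno] && !call_used[regno])
      offset += TOY_WORD;
  return offset;
}
```

For instance, with a depth of 8 and two live registers that are not
call-used, the result is 8 - 4 + 2 * 4 = 12. A live register that is
call-used contributes nothing.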
I am still not all that sure about the "correct" way to write gcc code.
In particular, I am unsure exactly when a new rtx needs to be created
as opposed to simply changing the contents of an existing rtx. I also
don't know whether this creates a memory "leakage" problem. How do the
new rtx structures get freed? These concerns aside, it does, as I say,
work.