ian@sibyl.eleceng.ua.oz.au (12/20/89)
While trying to get the -fomit-frame-pointer option working for the ns32k, I realised that a lot of the work required was processor independent. Here is my implementation of a new (processor independent) function which greatly simplifies the definition of FIX_FRAME_POINTER_ADDRESS on most (all?) machines. The FIX_FRAME_POINTER_ADDRESS macro itself still has to be processor dependent, as it depends on the stack layout.

The following has been tested for the ns32k (stage 2 compare works with the -fomit-frame-pointer option). No guarantees for other architectures, although as I say I believe the fix_frame_pointer_address function to be architecture independent.

I get a 16% increase in Dhrystones using the -fomit-frame-pointer option. I don't claim you will necessarily get that speed increase for real programs.

I currently have the fix_frame_pointer_address function in the aux_output.c file in order to keep all my hacks together. It logically belongs in reload1.c.

The architecture independent function:

/* Used by FIX_FRAME_POINTER_ADDRESS
 * This function recursively looks at "addr" to see if there is a frame
 * pointer reference in there somewhere. If so it converts it to a reference
 * to stack pointer plus offset.
 * This function is intended to work no matter how CISCy (or otherwise) the
 * addressing modes are.
 */
rtx
fix_frame_pointer_address (addr, offset)
     rtx addr;
     int offset;
{
  rtx arg0, old_arg0, arg1, old_arg1;
  enum machine_mode mode;

  /* Check for a null rtx before GET_MODE dereferences it.  */
  if (addr == 0)
    return addr;
  if (addr == frame_pointer_rtx)
    return plus_constant (stack_pointer_rtx, offset);
  mode = GET_MODE (addr);

  switch (GET_CODE (addr))
    {
    case REG:
      return addr;
      break;

    case MEM:
      old_arg0 = arg0 = XEXP (addr, 0);
      arg0 = fix_frame_pointer_address (arg0, offset);
      if (arg0 != old_arg0)
        {
          /* The address changed: build a new MEM, keeping volatility.  */
          rtx mem = gen_rtx (MEM, mode, arg0);
          MEM_VOLATILE_P (mem) = MEM_VOLATILE_P (addr);
          return mem;
        }
      else
        return addr;
      break;

    case PLUS:
      old_arg0 = arg0 = XEXP (addr, 0);
      old_arg1 = arg1 = XEXP (addr, 1);
      if (arg0 == frame_pointer_rtx)
        return plus_constant (gen_rtx (PLUS, mode, stack_pointer_rtx, arg1),
                              offset);
      else if (arg1 == frame_pointer_rtx)
        return plus_constant (gen_rtx (PLUS, mode, arg0, stack_pointer_rtx),
                              offset);
      else
        {
          arg0 = fix_frame_pointer_address (arg0, offset);
          arg1 = fix_frame_pointer_address (arg1, offset);
          if (arg0 != old_arg0 || arg1 != old_arg1)
            return gen_rtx (PLUS, mode, arg0, arg1);
          else
            return addr;
        }
      break;

    case MULT:
      old_arg0 = arg0 = XEXP (addr, 0);
      old_arg1 = arg1 = XEXP (addr, 1);
      arg0 = fix_frame_pointer_address (arg0, offset);
      arg1 = fix_frame_pointer_address (arg1, offset);
      if (arg0 != old_arg0 || arg1 != old_arg1)
        return gen_rtx (MULT, mode, arg0, arg1);
      else
        return addr;
      break;
    }
  return addr;
}

The implementation of FIX_FRAME_POINTER_ADDRESS for the ns32k, using the above function:

#define FIX_FRAME_POINTER_ADDRESS(ADDR,DEPTH)                   \
{                                                               \
  register int regno, offset = (DEPTH) - 4;                     \
  extern char call_used_regs[];                                 \
  extern rtx fix_frame_pointer_address();                       \
  for (regno = 0; regno < 16; regno++)                          \
    if (regs_ever_live[regno] && ! call_used_regs[regno])       \
      offset += 4;                                              \
  ADDR = fix_frame_pointer_address(ADDR, offset);               \
}

I am still not all that sure about the "correct" way to write gcc code. In particular, I am unsure exactly when a new rtx needs to be created as opposed to simply changing the contents of an existing rtx.
I also don't know whether this creates a memory "leakage" problem. How do the new rtx structures get freed? These concerns aside, it does, as I say, work.
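
For anyone who wants to experiment with the idea outside the compiler, here is a minimal standalone sketch in ANSI C. It is not GCC code: toy_rtx, toy_new, toy_fix_fp, toy_eval and toy_fp_offset are hypothetical names invented for illustration, and the register numbers are arbitrary. It assumes the same two things the post relies on: a rewritten address shares any subtree that did not change, and the offset is DEPTH - 4 plus 4 bytes per live register that is not call-used. Unlike plus_constant, the sketch does not fold constants, and it compares register numbers rather than relying on one shared frame pointer object.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-ins for rtx codes; these are NOT the real GCC definitions. */
enum toy_code { TOY_REG, TOY_CONST, TOY_PLUS, TOY_MULT, TOY_MEM };

struct toy_rtx {
    enum toy_code code;
    long value;                 /* register number (REG) or constant (CONST) */
    struct toy_rtx *op0, *op1;  /* operands; unused ones are NULL */
};

enum { TOY_FP = 0, TOY_SP = 1 };  /* hypothetical register numbers */

/* Allocate a node. Nodes are deliberately never freed in this sketch. */
struct toy_rtx *toy_new(enum toy_code code, long value,
                        struct toy_rtx *op0, struct toy_rtx *op1)
{
    struct toy_rtx *x = malloc(sizeof *x);
    if (x == NULL)
        abort();
    x->code = code;
    x->value = value;
    x->op0 = op0;
    x->op1 = op1;
    return x;
}

/* Return a tree equal to ADDR with each frame-pointer register replaced
   by (sp + offset). The input is never modified; a new node is built
   only where an operand changed, so untouched subtrees are shared. */
struct toy_rtx *toy_fix_fp(struct toy_rtx *addr, long offset)
{
    struct toy_rtx *a0, *a1;

    if (addr == NULL)
        return NULL;
    switch (addr->code) {
    case TOY_REG:
        if (addr->value != TOY_FP)
            return addr;
        return toy_new(TOY_PLUS, 0,
                       toy_new(TOY_REG, TOY_SP, NULL, NULL),
                       toy_new(TOY_CONST, offset, NULL, NULL));
    case TOY_CONST:
        return addr;
    default:  /* PLUS, MULT, MEM: recurse into both operands */
        a0 = toy_fix_fp(addr->op0, offset);
        a1 = toy_fix_fp(addr->op1, offset);
        if (a0 == addr->op0 && a1 == addr->op1)
            return addr;
        return toy_new(addr->code, addr->value, a0, a1);
    }
}

/* Compute the address a tree denotes, given register values indexed by
   register number. A MEM evaluates to its address for this check. */
long toy_eval(const struct toy_rtx *x, const long *regs)
{
    assert(x != NULL);
    switch (x->code) {
    case TOY_REG:   return regs[x->value];
    case TOY_CONST: return x->value;
    case TOY_PLUS:  return toy_eval(x->op0, regs) + toy_eval(x->op1, regs);
    case TOY_MULT:  return toy_eval(x->op0, regs) * toy_eval(x->op1, regs);
    case TOY_MEM:   return toy_eval(x->op0, regs);
    }
    return 0;
}

/* Same arithmetic as the ns32k macro: DEPTH - 4, plus 4 bytes for each
   register that is ever live and not call-used. */
long toy_fp_offset(long depth, const char *ever_live,
                   const char *call_used, int nregs)
{
    long offset = depth - 4;
    int r;

    for (r = 0; r < nregs; r++)
        if (ever_live[r] && !call_used[r])
            offset += 4;
    return offset;
}
```

One way to check a rewrite is to pick register values with fp equal to sp plus the offset and confirm that toy_eval gives the same address before and after toy_fix_fp. A tree with no frame pointer reference should come back as the very same pointer, which is the "only create a new node when something changed" behaviour of the function in the post.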