[gnu.gcc.bug] -fomit-frame-pointer implementation

ian@sibyl.eleceng.ua.oz.au (12/20/89)
While trying to get the -fomit-frame-pointer option working for the
ns32k I realised that a lot of the work required was processor
independent. Here is my implementation of a new (processor
independent) function which greatly simplifies the definition of
FIX_FRAME_POINTER_ADDRESS on most (all?) machines. The
FIX_FRAME_POINTER_ADDRESS macro itself still has to be processor
dependent, as it depends on the stack layout. The following has been
tested for the ns32k (stage 2 compare works with the
-fomit-frame-pointer option). No guarantees for other architectures,
although as I say I believe the fix_frame_pointer_address function to
be architecture independent.

I get a 16% increase in dhrystones using the -fomit-frame-pointer
option. I don't claim you will necessarily get that speed increase
for real programs.

I currently have the fix_frame_pointer_address function in the
aux_output.c file in order to keep all my hacks together. It logically
belongs in reload1.c.

The architecture independent function:

    /* Used by FIX_FRAME_POINTER_ADDRESS.
     * This function recursively looks at "addr" to see if there is a frame
     * pointer reference in there somewhere. If so, it converts it to a
     * reference to stack pointer plus offset.
     * This function is intended to work no matter how CISCy (or otherwise) the
     * addressing modes are.
     */

    rtx fix_frame_pointer_address (addr, offset)
         rtx addr;
         int offset;
    {
      rtx arg0, old_arg0, arg1, old_arg1;
      enum machine_mode mode = GET_MODE (addr);

      if (addr == frame_pointer_rtx)
        return plus_constant(stack_pointer_rtx, offset);
      if (addr == 0)
        return addr;
      switch (GET_CODE(addr))
        {
        case REG:
          return addr;
          break;
        case MEM:
          old_arg0 = arg0 = XEXP(addr, 0);
          arg0 = fix_frame_pointer_address(arg0, offset);
          if ( arg0 != old_arg0)
            {
              rtx mem =  gen_rtx(MEM, mode, arg0);
              MEM_VOLATILE_P (mem) = MEM_VOLATILE_P (addr);
              return mem;
            }
          else
            return addr;
          break;
        case PLUS:
          old_arg0 = arg0 = XEXP(addr, 0);
          old_arg1 = arg1 = XEXP(addr, 1);
          if (arg0 == frame_pointer_rtx)
            return plus_constant(gen_rtx(PLUS, mode, stack_pointer_rtx, arg1),
                                 offset);
          else if (arg1 == frame_pointer_rtx)
            return plus_constant(gen_rtx(PLUS, mode, arg0, stack_pointer_rtx),
                                 offset);
          else
            {
              arg0 = fix_frame_pointer_address(arg0, offset);
              arg1 = fix_frame_pointer_address(arg1, offset);
              if (arg0 != old_arg0 || arg1 != old_arg1)
                return gen_rtx(PLUS, mode, arg0, arg1);
              else
                return addr;
            }
          break;
        case MULT:
          old_arg0 = arg0 = XEXP(addr, 0);
          old_arg1 = arg1 = XEXP(addr, 1);
          arg0 = fix_frame_pointer_address(arg0, offset);
          arg1 = fix_frame_pointer_address(arg1, offset);
          if (arg0 != old_arg0 || arg1 != old_arg1)
            return gen_rtx(MULT, mode, arg0, arg1);
          else
            return addr;
          break;
        }
      return addr;
    }
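
For anyone who wants to play with the idea outside the compiler, here is
a minimal stand-alone sketch of the same kind of rewrite on a toy
expression tree. Everything in it is hypothetical and is not GCC code:
struct node stands in for rtx, mk for gen_rtx, and fix_fp for
fix_frame_pointer_address; the register numbers, eval and fake_load
exist only so the result can be checked. Like the function above, it
shares unchanged subtrees and builds new nodes only along paths that
contained a frame pointer reference.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for an rtx; this is NOT the real GCC data structure. */
enum code { T_REG, T_CONST, T_PLUS, T_MULT, T_MEM };

struct node {
    enum code code;
    int value;               /* register number (T_REG) or constant (T_CONST) */
    struct node *op0, *op1;  /* operands; op1 is unused for T_MEM */
};

enum { FP_REGNO = 1, SP_REGNO = 2 };  /* hypothetical register numbers */

/* Allocate a node (hypothetical helper, playing the role of gen_rtx). */
struct node *mk(enum code code, int value, struct node *op0, struct node *op1)
{
    struct node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->code = code;
    n->value = value;
    n->op0 = op0;
    n->op1 = op1;
    return n;
}

/* Return a tree in which every frame pointer reference has become
   (sp + offset).  Subtrees without a frame pointer are returned as-is. */
struct node *fix_fp(struct node *n, int offset)
{
    struct node *a, *b;

    if (n == NULL)
        return NULL;
    switch (n->code) {
    case T_REG:
        if (n->value == FP_REGNO)
            return mk(T_PLUS, 0, mk(T_REG, SP_REGNO, NULL, NULL),
                      mk(T_CONST, offset, NULL, NULL));
        return n;
    case T_CONST:
        return n;
    case T_MEM:
        a = fix_fp(n->op0, offset);
        return a == n->op0 ? n : mk(T_MEM, 0, a, NULL);
    case T_PLUS:
    case T_MULT:
        a = fix_fp(n->op0, offset);
        b = fix_fp(n->op1, offset);
        return (a == n->op0 && b == n->op1) ? n : mk(n->code, 0, a, b);
    }
    return n;
}

/* Arbitrary deterministic "memory", so that T_MEM can be evaluated. */
long fake_load(long addr) { return addr * 3 + 1; }

/* Evaluate a tree, taking register values from regs[]. */
long eval(const struct node *n, const long *regs)
{
    switch (n->code) {
    case T_REG:   return regs[n->value];
    case T_CONST: return n->value;
    case T_PLUS:  return eval(n->op0, regs) + eval(n->op1, regs);
    case T_MULT:  return eval(n->op0, regs) * eval(n->op1, regs);
    case T_MEM:   return fake_load(eval(n->op0, regs));
    }
    return 0;
}
```

The property to check is that whenever regs[FP_REGNO] equals
regs[SP_REGNO] + offset, eval gives the same answer for a tree and for
fix_fp of that tree, and that the rewritten tree no longer depends on
regs[FP_REGNO] at all.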


The implementation of FIX_FRAME_POINTER_ADDRESS for the ns32k, using the
above function:

    #define FIX_FRAME_POINTER_ADDRESS(ADDR,DEPTH) \
    {                                                                  \
      register int regno, offset = (DEPTH) - 4;                        \
      extern char call_used_regs[];                                    \
      extern rtx fix_frame_pointer_address();                          \
      for (regno = 0; regno < 16; regno++)                             \
        if (regs_ever_live[regno] && ! call_used_regs[regno])          \
          offset += 4;                                                 \
      ADDR = fix_frame_pointer_address(ADDR, offset);                  \
    }
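
To make the offset arithmetic in the macro easy to check by hand, here
is the same loop pulled out into an ordinary function. This is only a
sketch: frame_to_stack_offset, NREGS and WORD are names I made up, and
the two arrays are passed in as parameters instead of being the
compiler's globals regs_ever_live and call_used_regs.

```c
#include <assert.h>

enum { NREGS = 16, WORD = 4 };  /* hypothetical names for the macro's 16 and 4 */

/* Mirror of the macro's arithmetic: start from depth - 4, then add one
   word for each register that is live somewhere in the function and is
   not call-used (that is, each register the function has to save). */
int frame_to_stack_offset(int depth, const char *ever_live,
                          const char *call_used)
{
    int regno, offset = depth - WORD;

    assert(ever_live && call_used);
    for (regno = 0; regno < NREGS; regno++)
        if (ever_live[regno] && !call_used[regno])
            offset += WORD;
    return offset;
}
```

For example, with a depth of 8 and three live registers that are not
call-used, the result is 8 - 4 + 3*4 = 16.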


I am still not all that sure about the "correct" way to write gcc code.
In particular, I am unsure exactly when a new rtx needs to be created as
opposed to simply changing the contents of an existing rtx. I also
don't know whether this creates a memory leak. How do the new rtx
structures get freed? These concerns aside, it does, as I say, work.