[comp.sys.ibm.pc] Optimizers and RISC

brad@looking.UUCP (Brad Templeton) (05/03/89)

Did RISC machines cause a push in optimizing compilers?  I always thought
the big advantage of the RISC was that it made the low level optimizations
so much simpler -- there's only one way to code the thing, so no need to
worry about the best use of fancy instructions and addressing modes.

High level optimizations are another matter, but they apply as well to CISC
as to RISC.
-- 
Brad Templeton, Looking Glass Software Ltd.  --  Waterloo, Ontario 519/884-7473

hedrick@geneva.rutgers.edu (Charles Hedrick) (05/03/89)

>Did RISC machines cause a push in optimizing compilers?  I always thought
>the big advantage of the RISC was that it made the low level optimizations
>so much simpler -- there's only one way to code the thing, so no need to
>worry about the best use of fancy instructions and addressing modes.

Sort of.  Actually it's probably better to say that RISC *depended
upon* a push in optimizing compilers.  How much a given RISC depends on
them varies with the specific architecture.  Some of the original RISC
proposals included things like delayed branching, and in fact the SPARC
(Sun 4) does that.  That is, a branch instruction doesn't take effect
until after the next instruction.  In order to produce optimal code,
your compiler may have to rearrange code.  Simple-minded compilers
just put a no-op after the branch.  Also, in order to get the benefit
out of RISC, you have to be able to use all those registers.  This
means a compiler intended for RISC should have a good register
allocation method.  When you can refer to memory about as easily as a
register, that's not quite as critical for CISC (though of course
minimizing memory references certainly helps there too).  The
conventional wisdom is that RISC machines need very smart compilers in
order to give reasonable performance.
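
To make the register allocation point concrete, here is a rough C-level
picture (a made-up example, not taken from any real compiler).  A
simple-minded compiler is happy with clip() as written, fetching the
global from memory on every comparison; a RISC compiler really wants the
effect of clip_reg(), where the hot values live in registers for the
whole loop:

    int threshold = 100;              /* a global that lives in memory */

    /* As the source is written, a naive compiler fetches "threshold"
       from memory on every comparison. */
    int clip(int *v, int n)
    {
        int i, clipped = 0;

        for (i = 0; i < n; i++)
            if (v[i] > threshold) {
                v[i] = threshold;
                clipped++;
            }
        return clipped;
    }

    /* The effect of good register allocation: the value is copied into
       a register once and the loop works out of registers.  A CISC with
       memory operands can compare against "threshold" directly, so it
       loses less when the compiler skips this. */
    int clip_reg(int *v, int n)
    {
        register int t = threshold;
        register int i, clipped = 0;

        for (i = 0; i < n; i++)
            if (v[i] > t) {
                v[i] = t;
                clipped++;
            }
        return clipped;
    }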

gph@hpsemc.HP.COM (Paul Houtz) (05/04/89)

brad@looking.UUCP (Brad Templeton) writes:

>Did RISC machines cause a push in optimizing compilers?  I always thought
>the big advantage of the RISC was that it made the low level optimizations
>so much simpler -- there's only one way to code the thing, so no need to

  Your question is a little ambiguous.   Are you asking if RISC machines
  caused a change in the INDUSTRY, or did they give a single vendor the
  impetus to start making optimizers?

  HP has RISC machines, and yes, all of a sudden we have optimizing
  compilers for the first time.

  You see, optimizers become a big boon with RISC because instead of having
  complex instructions (which cannot be optimized at compile time), you 
  have little programs that do what the CISC instructions used to do.  
  Those little programs are software, and once they are dragged into 
  the code, they can be optimized along with everything else in the 
  program.
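
  Roughly the idea in C terms (a made-up sketch, not HP's actual code
  expansion): where a CISC might supply a single "block move" instruction
  whose guts are microcode the compiler cannot touch, a RISC open-codes
  the same operation as ordinary instructions -- written here as a C
  loop -- and those instructions get optimized and scheduled right along
  with everything around them.

      void block_move(char *dst, const char *src, int len)
      {
          while (len-- > 0)            /* just ordinary instructions --  */
              *dst++ = *src++;         /* the optimizer sees them like   */
      }                                /* anything else in the program   */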

  Also, when you start talking about pipelining, delayed branch 
  slots, and single cycle instructions, it actually becomes more reasonable
  to allow the compiler to generate whatever construct is efficient for 
  a given function.   Then it is very simple for a separate pass to do
  things like fill in branch delay slots, move loop-invariant computations
  out of loops, etc.
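
  For instance, the "move things out of loops" part looks roughly like
  this at the C level (a toy sketch of my own, not any particular
  compiler's output): the multiply that doesn't depend on the loop gets
  hoisted so it is computed once instead of on every trip.

      void adjust(int *a, int n, int scale, int offset)
      {
          int i;

          for (i = 0; i < n; i++)
              a[i] = a[i] + scale * offset;   /* invariant multiply, redone every trip */
      }

      void adjust_opt(int *a, int n, int scale, int offset)
      {
          int i;
          int k = scale * offset;             /* hoisted: computed once */

          for (i = 0; i < n; i++)
              a[i] = a[i] + k;
      }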

allbery@ncoast.ORG (Brandon S. Allbery) (05/08/89)

As quoted from <3181@looking.UUCP> by brad@looking.UUCP (Brad Templeton):
+---------------
| Did RISC machines cause a push in optimizing compilers?  I always thought
| the big advantage of the RISC was that it made the low level optimizations
| so much simpler -- there's only one way to code the thing, so no need to
| worry about the best use of fancy instructions and addressing modes.
+---------------

I was under the impression that it had.  At least one person has suggested
that I was wrong -- but I'm not entirely certain of that.

Optimization under RISC consists primarily of (1) recognizing that some
loops will run faster when "unwound" into linear code and (2) optimizing the
use of registers.  The latter involves copying often-used values into
registers once instead of constantly fetching them from memory, and
recognizing that the "brute-force" method of using registers can result in
multiple registers having the same value, in which case only one of them
need be kept for future use while the others are "freed" for use by the
next calculation.

Both can be applied to CISC code; in fact, register optimization is, if
anything, *more* useful on hardware which doesn't have registers to spare.
Unwinding loops isn't normally useful unless you've decided to trade code
size for the absolute maximum of speed, although certain "trivial" loops
might always be expanded with little cost (and, probably, equally little
gain).
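
A toy C sketch of the "unwinding" trade-off (my own made-up example, not
from any compiler in particular):

    /* As written: eight trips around the loop, each with a compare and
       a branch for the pipeline to worry about. */
    int sum8(int *a)
    {
        int i, s = 0;

        for (i = 0; i < 8; i++)
            s += a[i];
        return s;
    }

    /* "Unwound" into linear code: more code, but no loop overhead and
       no branches. */
    int sum8_unrolled(int *a)
    {
        return a[0] + a[1] + a[2] + a[3]
             + a[4] + a[5] + a[6] + a[7];
    }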

The difference that RISC brings to it is that both are very nearly
*required* for RISC, whereas CISC can get by without either; consider yours
and my "favorite" compiler, AT&T's pcc.  The speed benefits of RISC are
quickly lost if the compiler produces code that wastes time doing things
that don't really need to be done because it isn't aware of prior context.

++Brandon
-- 
Brandon S. Allbery, moderator of comp.sources.misc	     allbery@ncoast.org
uunet!hal.cwru.edu!ncoast!allbery		    ncoast!allbery@hal.cwru.edu
      Send comp.sources.misc submissions to comp-sources-misc@<backbone>
NCoast Public Access UN*X - (216) 781-6201, 300/1200/2400 baud, login: makeuser

gph@hpsemc.HP.COM (Paul Houtz) (05/10/89)

allbery@ncoast.ORG (Brandon S. Allbery) writes:

>Optimization under RISC consists primarily of (1) recognizing that some
>loops will run faster when "unwound" into linear code and (2) optimizing the
>use of registers.  The latter involves copying often-used values into a
>register once instead of constantly fetching it from memory, and recognizing
>Both can be applied to CISC code; in fact, register optimization is, if
>anything, *more* useful on hardware which doesn't have registers to spare.

>The difference that RISC brings to it is that both are very nearly
>*required* for RISC, whereas CISC can get by without either; consider yours

The statement in the first paragraph is false.  RISC optimization does not 
consist PRIMARILY of unwinding loops and optimizing registers.   Optimizing
registers is a big part of RISC optimization, but you left out filling
branch delay slots.   You see, most RISC implementations are pipelined,
and branches cause bubbles in the pipeline.   Therefore RISC architectures
usually define a branch to take effect only after the instruction that
follows it has executed.  That way, the bubble in the pipeline can be
filled with useful work.

Architecture non-specific example:

     Normally you would code:

         ld r2, *+2            Load return address in register 2
         bl subprog            branch to subprogram

 subprog st r25,$ADDIT        store r25 at ADDIT
         ad $ADDIT,1          add 1 to ADDIT
         .
         .
         b r2                 return via the address in r2

     In a RISC implementation, the programs would look the same, except the
     code to get to the subroutine would be "optimized" to look like this:

        bl subprog            branch to subprogram
        ld r2, *+1            delay slot: load return address in register 2

     Since the instruction after the branch WILL be executed before the
     branch takes effect, you can save a couple of cycles by putting the
     load in the slot after the branch rather than wasting that slot on a
     no-op.
 

The RISC optimizers will fill branch delay slots first, then look at 
unwinding loops and then register optimization.  However, in the compilers
I have looked at, register optimization is something the compiler does, not
the optimizer.  Of course, not all systems are the same.

Also, I think it is incorrect to say that RISC architectures MUST be 
optimized to have acceptable performance.   I don't think that this is
borne out by the scientific evidence available.

Rather, I would say that RISC code can benefit more from optimization, since 
there is no ROM microcode.   You see, you cannot combine millions of lines
of ROM code and then optimize the whole mess, because it is ROM.   With
RISC, all the code is RAM resident.  Theoretically, once the compiler
pulls in the code it needs to do what CISC instructions used to do, the
whole thing can then be optimized.
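
  As a made-up C illustration of "optimize the whole mess" (names and
  code are mine, just to show the principle):

      /* The open-coded clear loop is ordinary RAM-resident code ... */
      void clear(int *p, int n)
      {
          while (n-- > 0)
              *p++ = 0;
      }

      /* ... so at a call site where the count is the constant 3, the
         compiler can pull the loop in, unroll it, and then notice that
         the assignment below makes the store to v[0] dead -- an
         optimization across what used to be an opaque, microcoded
         instruction boundary. */
      void init(int *v)                /* v points at three ints */
      {
          clear(v, 3);
          v[0] = 1;
      }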