[comp.software-eng] Cost/Benefit of compiler optimization techniques?

baxter@zola.ICS.UCI.EDU (Ira Baxter) (11/28/90)

Compiler optimizations for conventional procedural languages on
conventional architectures (for sake of argument, consider both CISC
and RISC in this class) seem to come at two levels:

   Machine-independent optimizations
   Machine-dependent optimizations

Aho's Dragon book suggests "some of the most useful optimizations" that
fall into the machine-independent class are:
  Function-preserving optimizations:
    Common subexpression elimination
    Use of algebraic identities
    Copy propagation
    Dead-code elimination
    Constant folding
  Loop Optimizations:
    Code Motion
    Induction-variable elimination
    Reduction in Strength
Others come to mind:
    Use of programmer-supplied data assertions or branch-probabilities
    Trace scheduling
    Procedure inlining
    Partial evaluation
    <fill in your favorites>

For code generation, some optimizations beyond a vanilla code generator are:
   Careful use of machine idioms
   Register allocation (by hueristic methods, graph coloring, etc.)
   JMP-chain elimination
   Peephole optimization
   Dynamic-programming for optimal code generation
   <fill in your favorites>

But there is little information about the relative utility of each of
these techniques.

I am interested in finding out the "most useful" optimizations (at both
levels) for conventional procedural languages (FORTRAN, C, etc.), by
comparing average quantitative payoffs which rank them (say, reduction in
execution time measured over some large suite of user programs) to some
measure of the average effort to implement that optimization (measured in
man-somethings).  Essentially what I am asking is, "What are the
cost/benefit ratios of various techniques"?  Additional useful information
would be something like conditional utility, i.e., if technique A is used,
then technique B is X% less useful.  Surely enough compilers have been built
so this information is known and documented.

Pointers to literature, knowledgeable sources, and even corrections to
the form of the question would be appreciated.  I'll summarize.
Thanks in advance.

[Do I want to build a compiler? Not this week. But this information
is clearly something every would-be compiler writer should have.]

IDB
(714) 856-6693  ICS Dept/ UC Irvine, Irvine CA 92717
-- 
Send compilers articles to compilers@iecc.cambridge.ma.us or
{ima | spdcc | world}!iecc!compilers.  Meta-mail to compilers-request.

turpin@cs.utexas.edu (Russell Turpin) (11/29/90)

In article <9011280511.aa16546@ICS.UCI.EDU> Ira Baxter <baxter@zola.ICS.UCI.EDU> writes:
> I am interested in finding out the "most useful" optimizations 
> (at both levels) for conventional procedural languages (FORTRAN, 
> C, etc.), by comparing average quantitative payoffs which rank
> them ... to some measure of the average effort
> to implement that optimization ...

I think that you will find that such studies are almost always
focused on an application area and a particular kind of
architecture.  For example, I have seen several papers on the
effectiveness of various kind of automatic vectorization
techniques when applied to fluid flow (or FEA or ...) programs on
the Cray.  It is not clear what information would be useful
unless the application area and target architecture are somehow
constrained.  

Russell
-- 
Send compilers articles to compilers@iecc.cambridge.ma.us or
{ima | spdcc | world}!iecc!compilers.  Meta-mail to compilers-request.

norvell@csri.toronto.edu (Theo Norvell) (11/29/90)

In article <9011280511.aa16546@ICS.UCI.EDU> Ira Baxter <baxter@zola.ICS.UCI.EDU> writes:
>
>I am interested in finding out the "most useful" optimizations (at both
>levels) for conventional procedural languages (FORTRAN, C, etc.), by
>comparing average quantitative payoffs which rank them (say, reduction in
>execution time measured over some large suite of user programs) to some
>measure of the average effort to implement that optimization (measured in
>man-somethings).  Essentially what I am asking is, "What are the
>cost/benefit ratios of various techniques"?  Additional useful information
>would be something like conditional utility, i.e., if technique A is used,
>then technique B is X% less useful.
                        ^^^^ (or more!)

An ought-to-be classic work on just this question is Fredrick Chow's thesis,
done under John Henessey:

Author:       F.C. Chow
Title:        A Portable Machine-independent Global Optimizer - Design and
              Measurements
Report:       CSL T.R. 83-254
Publisher:    Computer Systems Laboratory, Stanford University
Availability: cost: $8.25

Theo Norvell