[comp.compilers] Cost/Benefit of compiler optimization techniques?

baxter@zola.ICS.UCI.EDU (Ira Baxter) (11/28/90)

Compiler optimizations for conventional procedural languages on
conventional architectures (for sake of argument, consider both CISC
and RISC in this class) seem to come at two levels:

   Machine-independent optimizations
   Machine-dependent optimizations

Aho's Dragon book suggests "some of the most useful optimizations" that
fall into the machine-independent class are:
  Function-preserving optimizations:
    Common subexpression elimination
    Use of algebraic identities
    Copy propagation
    Dead-code elimination
    Constant folding
  Loop Optimizations:
    Code Motion
    Induction-variable elimination
    Reduction in Strength
Others come to mind:
    Use of programmer-supplied data assertions or branch-probabilities
    Trace scheduling
    Procedure inlining
    Partial evaluation
    <fill in your favorites>

For code generation, some optimizations beyond a vanilla code generator are:
   Careful use of machine idioms
   Register allocation (by hueristic methods, graph coloring, etc.)
   JMP-chain elimination
   Peephole optimization
   Dynamic-programming for optimal code generation
   <fill in your favorites>

But there is little information about the relative utility of each of
these techniques.

I am interested in finding out the "most useful" optimizations (at both
levels) for conventional procedural languages (FORTRAN, C, etc.), by
comparing average quantitative payoffs which rank them (say, reduction in
execution time measured over some large suite of user programs) to some
measure of the average effort to implement that optimization (measured in
man-somethings).  Essentially what I am asking is, "What are the
cost/benefit ratios of various techniques"?  Additional useful information
would be something like conditional utility, i.e., if technique A is used,
then technique B is X% less useful.  Surely enough compilers have been built
so this information is known and documented.

Pointers to literature, knowledgeable sources, and even corrections to
the form of the question would be appreciated.  I'll summarize.
Thanks in advance.

[Do I want to build a compiler? Not this week. But this information
is clearly something every would-be compiler writer should have.]

IDB
(714) 856-6693  ICS Dept/ UC Irvine, Irvine CA 92717
-- 
Send compilers articles to compilers@iecc.cambridge.ma.us or
{ima | spdcc | world}!iecc!compilers.  Meta-mail to compilers-request.

turpin@cs.utexas.edu (Russell Turpin) (11/29/90)

In article <9011280511.aa16546@ICS.UCI.EDU> Ira Baxter <baxter@zola.ICS.UCI.EDU> writes:
> I am interested in finding out the "most useful" optimizations 
> (at both levels) for conventional procedural languages (FORTRAN, 
> C, etc.), by comparing average quantitative payoffs which rank
> them ... to some measure of the average effort
> to implement that optimization ...

I think that you will find that such studies are almost always
focused on an application area and a particular kind of
architecture.  For example, I have seen several papers on the
effectiveness of various kind of automatic vectorization
techniques when applied to fluid flow (or FEA or ...) programs on
the Cray.  It is not clear what information would be useful
unless the application area and target architecture are somehow
constrained.  

Russell
-- 
Send compilers articles to compilers@iecc.cambridge.ma.us or
{ima | spdcc | world}!iecc!compilers.  Meta-mail to compilers-request.

norvell@csri.toronto.edu (Theo Norvell) (11/29/90)

In article <9011280511.aa16546@ICS.UCI.EDU> Ira Baxter <baxter@zola.ICS.UCI.EDU> writes:
>I am interested in finding out the "most useful" optimizations ... by
>comparing average quantitative payoffs which rank them ... to some
>measure of the average effort to implement that optimization ...
>Additional useful information would be something like conditional utility,
>i.e., if technique A is used, then technique B is X% less useful.
			                               ^^^^ (or more!)

An ought-to-be classic work on just this question is Fredrick Chow's thesis,
done under John Henessey:

Author:       F.C. Chow
Title:        A Portable Machine-independent Global Optimizer - Design and
              Measurements
Report:       CSL T.R. 83-254
Publisher:    Computer Systems Laboratory, Stanford University
Availability: cost: $8.25

Theo Norvell
-- 
Send compilers articles to compilers@iecc.cambridge.ma.us or
{ima | spdcc | world}!iecc!compilers.  Meta-mail to compilers-request.

victor@vivaldi.csc.ti.com (Brian Victor) (11/30/90)

> I am interested in finding out the "most useful" optimizations ...

Here is a reference to an article you may find interesting:

    `Interprocedural Optimization: Experimental Results', Richardson and
     Ganapathi, Software Practise & Experience, v19n2 (2/89).

I don't have the paper handy, but I believe they tested some algorithms in a
Pascal compiler on about 20 or 30 benchmarks.  I think the authors were
disappointed with amount of improvement shown, and found the most difficult
algorithms the least effective!

- Brian
  victor@itg.ti.com
-- 
Send compilers articles to compilers@iecc.cambridge.ma.us or
{ima | spdcc | world}!iecc!compilers.  Meta-mail to compilers-request.

preston@ariel.rice.edu (Preston Briggs) (11/30/90)

> I am interested in finding out the "most useful" optimizations ...

Some papers, in no special order

	Effectiveness of a Machine-Level, Global Optimizer
	Johnson and Miller
	Sigplan '86, Compiler Construction Conference
	[studies based on HP's Spectrum compiler]

	Measurement of Program Improvement Algorithms
	Cocke and Markstein
	IFIP '80
	[early measurements of the PL.8 compiler]

	Simple Code Optimizations
	Hanson
	Software -- Practice and Experience, 1983
	[argues for, and present 3 simple, effective optimizations]

	A Portable Optimizing Compiler for Modula-2
	Powell
	Siplan '84, Compiler Construction Conference
	[specifically tried for "best, simple" optimizations]

In general though, it's difficult to talk about the effectiveness and
expense of specific optimizations in isolation.  Good optimizers do many
optimizations, counting in the combined effect to do a good job.  The
implementation costs are especially difficult.  For example, a strength
reducer alone might cost X amount.  But, if you've got a dead code
eliminator, the strength reducer is simpler and can make use of some
common code (data flow analysis, ...).

For some exceptionally profitable optimizations, see

	Improving Register Allocation for Subscripted Variables
	Callahan, Carr, Kennedy
	Sigplan '90, Programmling Language Design and Implementation

Preston
-- 
Send compilers articles to compilers@iecc.cambridge.ma.us or
{ima | spdcc | world}!iecc!compilers.  Meta-mail to compilers-request.