[comp.lang.misc] Definition of "Aggressive optimization"

rcg@lpi.liant.com (Rick Gorton) (11/07/90)

Ungk.  Me mongo.  Mongo write Code Generator.
Code Generator for many Language.  Me confused.
Me no understand big wurd.  What "Aggressive" mean?

Seriously, I want to know what language (and furthermore, what
compiler) is being used as the basis for some of the discussion
going on:

barmar> So aggressive optimization refers to optimization that does
barmar> things like interprocedural flow analysis?  Are loop unrolling,
barmar> strength reduction, and common subexpression elimination
barmar> aggressive or not?

pcg> They are all aggressive. Even substituting a shift for a multiplication
pcg> or division by a power of two may be considered aggressive in certain
pcg> languages where the properties of the two are not precisely identical.

Huh?  If the language spec says that shifting is not the same as multiplying
for powers of two, then it is not an AGGRESSIVE optimization to convert
the multiply into a shift, rather an ERRONEOUS one.  And not just at the
intermediate language level.

barmar> The only optimizers that are independent of the primitive
barmar> operators are peephole optimizers.
pcg> Yep, yep. Long live peephole optimizers!

Ungk.  Mongo totally confused.  What peepholer you use?
Yes, peephole optimizers are usually independent of the intermediate
language, but in the case specified (about not shifting instead of multiplying)
it is REQUIRED that the peepholer know some things about the language
involved, if it doesn't have hooks into the intermediate language or
symbol table to get storage characteristics/requirements, etc.

How about an example say you?  Okay.  Here's a simple one:
On the 88K, the integer operation:
	mul	r2, r3, 9		(r2 = 9*r3)
takes more than 1 cycle.  However, the following 1 cycle instruction
does exactly the same thing, but by doing a scaled add:
	lda.d	r2, r3[r3]		( r2 = r3 + (r3 shifted 3 times))
Presumably, this is illegal in the language described.
A good peepholer is capable of doing all this, and much more.
In fact, there are times when peepholers can (and will) monkey
with loops, and pull things out of them that don't need to be in them.
But none of this can be done if the language specifies that this
behavior is incorrect.

pcg> I am sick of having to disassemble compiler output to see where the
pcg> optimizer has botched it. I'd rather like that the effort put in the
pcg> pursuit of code generator level aggressive optimizers were put into
pcg> source code level rewrite assistants, and in making compilers more
pcg> stable, efficient, and reliable at doing translation, not rewriting.

Which brings me back to: "So what language, and what compiler are
you using?"  Have you reported your problems to the compiler vendor?
Have they been responsive?  If not, why are you sticking with that vendor?
Or even further, "Is the buggy application written in the WRONG language?"
Or how about, "Was the language chosen the correct one, but the person
developing the buggy part of the code was not familiar with the language?"
Or more descriptively, "Was the person writing COBOL/FORTRAN programs
in PASCAL?"

pcg> I intended to say that the code that effects the transformation adds to
pcg> the size and complexity of the compiler and therefore affects the
pcg> compiler's reliability....You have a large amount of hairy code in the
pcg> compiler's bowels, and this is baaaad.

I maintain that if the compiler is trying to perform optimizations
explicitly prohibited by the language, then having code in the compiler
to support these illegal optimizations is bad.  Having large amounts
of difficult code is only bad if you find bugs or limitations in it
that require major rewrites.  In this case, we are no longer talking
about aggressive compilation, rather software [re]design, engineering,
and development issues.

All optimization is NOT evil.  All compilers are NOT worthless.
A compiler is a tool which translates source code into object code.
Sorry to have to be the bearer of bad news, but compilers CAN have bugs.
Just like Operating Systems.

So how about some POSITIVE discussions about what kind of features
good software productivity tools have/need, instead of this continuous
optimization-bashing festival?
-- 
Richard Gorton               rcg@lpi.liant.com  (508) 626-0006
Language Processors, Inc.    Framingham, MA 01760
Hey!  This is MY opinion.  Opinions have little to do with corporate policy.

pcg@cs.aber.ac.uk (Piercarlo Grandi) (11/12/90)

On 6 Nov 90 18:39:55 GMT, rcg@lpi.liant.com (Rick Gorton) said:

rcg> Seriously, I want to know what language (and furthermore, what
rcg> compiler) is being used as the basis for some of the discussion
rcg> going on:

barmar> So aggressive optimization refers to optimization that does
barmar> things like interprocedural flow analysis?  Are loop unrolling,
barmar> strength reduction, and common subexpression elimination
barmar> aggressive or not?

pcg> They are all aggressive. Even substituting a shift for a
pcg> multiplication or division by a power of two may be considered
pcg> aggressive in certain languages where the properties of the two are
pcg> not precisely identical.

rcg> Huh?  If the language spec says that shifting is not the same as
rcg> multiplying for powers of two, then it is not an AGGRESSIVE
rcg> optimization to convert the multiply into a shift, rather an
rcg> ERRONEOUS one.  And not just at the intermediate language level.

Well, I once called aggressive an optimization that involves considering
the meaning of multiple language primitives. Just to continue the
contrived example of shift vs. multiplication: if the language says that
*in every relevant case* shift and multiplication by 2^N are equivalent,
then they are in effect the same thing (for those relevant cases) and
can be translated in the same way.

But: if the language for example says that shift and multiplication by 2^N
are only equivalent if both operands are positive, and at least one of
the operands is a variable, then the correctness of the optimization
depends on analysis of the flow of values, and is thus aggressive.
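A small C sketch of the legality question this hypothetical rule poses
(the sign lattice, `struct operand`, and `shift_rewrite_legal` are all
invented for illustration): the rewrite may fire only when the flow
analysis has *proved* the required facts about each operand.

```c
#include <stdbool.h>

/* What a hypothetical flow analysis has proved about a value at
 * this program point. */
enum sign { S_UNKNOWN, S_POSITIVE, S_NONPOSITIVE };

struct operand {
    enum sign sign;
    bool is_variable;   /* a variable, as opposed to a literal constant */
};

/* The hypothetical language rule above: shift and multiplication by
 * 2^N are interchangeable only if both operands are positive and at
 * least one is a variable.  S_UNKNOWN means "not provably positive",
 * which forces the sound fallback to the plain multiply. */
static bool shift_rewrite_legal(struct operand a, struct operand b)
{
    if (a.sign != S_POSITIVE || b.sign != S_POSITIVE)
        return false;                /* cannot prove both positive */
    return a.is_variable || b.is_variable;
}
```

The point of the sketch is that the pattern match itself is trivial; the
aggressiveness lives entirely in how much the analysis can prove, which
is why a mistake there breaks programs rather than patterns.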

barmar> The only optimizers that are independent of the primitive
barmar> operators are peephole optimizers.

pcg> Yep, yep. Long live peephole optimizers!

rcg> Yes, peephole optimizers are usually independent of the
rcg> intermediate language, but in the case specified (about not shifting
rcg> instead of multiplying) it is REQUIRED that the peepholer know some
rcg> things about the language involved, if it doesn't have hooks into
rcg> the intermediate language or symbol table to get storage
rcg> characteristics/requirements, etc.

Why? The peepholer need only know the properties of the machine, not of
the language being compiled. In other words it can rewrite the generated
code in any way it likes, independently of the source language, as long
as it can prove that in the given context the two machine instruction
sequences are equivalent.

Note that a peephole optimizer can be aggressive, and thus more or less
error prone -- if the machine architecture gives simple rules for
instruction sequence equivalencing then probably the peephole rewriter
will be less buggy and more aggressive than one that has to deal with a
more difficult architecture. Note that IMNHO aggressive peephole
optimizers are not a good bet either.

     [ ... a clever code generation example ... ]

rcg> A good peepholer is capable of doing all this, and much more.  In
rcg> fact, there are times when peepholers can (and will) monkey with
rcg> loops, and pull things out of them that don't need to be in them.
rcg> But none of this can be done if the language specifies that this
rcg> behavior is incorrect.

Assuming that the generated machine code is correct, the source language
does not matter. Of course the peepholer can work better if it makes use
of contextual information derived from the source, but this is neither
necessary nor desirable.

pcg> I intended to say that the code that effects the transformation adds to
pcg> the size and complexity of the compiler and therefore affects the
pcg> compiler's reliability....You have a large amount of hairy code in the
pcg> compiler's bowels, and this is baaaad.

rcg> Having large amounts of difficult code is only bad if you find
rcg> bugs or limitations in it that require major rewrites.

Ahem. Software physics... It can also be bad if the bugs affect the
correctness of every other program in a system. A code generator is not
just difficult, it is critical. Maybe the most critical code in a
system.

You know, even little trivial bugs in a code generator, ones that can be
corrected with a one line patch, can cause major problems: not in the
compiler, which still works 99% of the time, but in all the compiled
programs, which may stop working entirely, in unanticipated ways. Do we
want to increase the chances for any type of bug in the code generator?
Maybe, if the programmers ("dusty programmer" problem) can be expected
to be even buggier.

rcg> In this case, we are no longer talking about aggressive
rcg> compilation, rather software [re]design, engineering, and
rcg> development issues.

Yep, precisely. But the motivation for doing aggressive optimization in
the first place is precisely engineering, i.e. cost effectiveness.
Aggressive optimization is not *necessary* for correctness or
functionality, only for reducing one of the costs of a program.

The probability of increasing the problems in a compiler is a software
engineering issue, just as is the probability of increasing the speed at
which programs execute, and just as are the consequences of using a code
generator level aggressive optimizer versus a source level rewriter.
These are all difficult software engineering issues.

rcg> So how about some POSITIVE discussions about what kind of features
rcg> good software productivity tools have/need, instead of this
rcg> continuous optimization-bashing festival?

Note that I am not against optimization (intended as translation to
optimal code) at all, and not even against aggressive optimization; the
former is IMNHO very important, and the latter is too, as long as it is
performed on the source, or however outside the compiler.

All the difficult problems of aggressive optimization, stemming from
program logic analysis and rewriting, are very interesting and
important; only I don't think that the best place to solve them is
the code generator.

Source level analysis and improvement is, in the EDP world, a very
important and commercially profitable activity. EDP organizations spend
immense amounts of money each year on cleaning, maintaining, and
converting dusty decks.

What I would like to see is more of this going on in the scientific
sector as well. But maybe many numerical analysts (because it is mostly
about these that we have been speaking) are stuffier than EDP people,
and they resist any idea of learning anything beyond what they managed
to pick up in their undergraduate Fortran or C classes ("I'm not a
programmer! Fortran II is perfectly adequate for my research, and
besides I am too busy!").

With Herman Rubin as a counterexample, just to confirm the rule,
maybe.
--
Piercarlo Grandi                   | ARPA: pcg%uk.ac.aber.cs@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcsun!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk