[comp.compilers] What is ...

chased@Eng.Sun.COM (David Chase) (08/01/90)

> I do not see how a compiler can recognize that the complicated body of code
>  can be replaced by a single instruction and make that substitution.

Sounds to me like another case of someone who hasn't done their homework.
Besides peephole optimization (mentioned by Preston Briggs), there's also
Graham-Glanville style code-generation, which has been adapted in recent
years to take advantage of improvements in tree-pattern matching.  Check out
the work of (for instance) Hatcher and Christopher, Pelegri-Llopart, Henry,
Tsiang (et al?), Fraser, and Aho and Ganapathi.  You should be able to find
this (and pointers to other work) in POPL proceedings back through about
1985.  O'Donnell's book, _Equational Logic as a Programming Language_ (MIT
press, 1985), gives a nice treatment of pattern-matching, though it doesn't
include the new table compression tricks or selecting the cheapest pattern
out of many.  For completeness (though non-Europeans may find these papers
harder to obtain) there's also been work by Wilhelm, Giegerich, Weisgerber,
and Moncke, and Kron's thesis from UC Santa Cruz (write to UMI for this one).

To make a long story short, you just write down the patterns, and (for
machines with registers [see Pelegri's thesis for details]) after a few
coffee breaks the tree-pattern-matcher generator spits out tables for you.
Find a new pattern, just add it to the list of patterns with a cost and code
generation rule, and you're set.  There's a bit of additional work required
to canonicalize the input patterns (think of all the horrible ways that
 
     ceil(x + 0.5) - (fmod(x*0.5 + 0.25, 1.0) != 0)

could be permuted), but similar transformations are already done to assist in
the recognition of loop-invariant and inductive computations (for example).
People have proposed optimizing APL in a similar fashion (Cheng, Omdahl and
Strawn, "Idiom Matching ...", Iowa SU Tech report, 1982), but had problems
with table sizes.  New techniques (not using costs) build tables that use
less than 11K without further compression [POPL, 1987, pp 168-177], or less
than 1K if the tables are further compressed [unpublished result -- use a
traveling salesman approximation to get the minimum size for a run-compressed
representation of the indexing arrays].

Do people build code generators like this today?  Yes.  Write to Robert
Henry, rrh@cs.washington.edu, and ask for UWCODEGEN.

David Chase
Sun Microsystems, Inc.
-- 
Send compilers articles to compilers@esegue.segue.boston.ma.us
{spdcc | ima | lotus| world}!esegue.  Meta-mail to compilers-request@esegue.