mike@thor.acc.stolaf.edu (Mike Haertel) (10/05/90)
In article <13208@june.cs.washington.edu> pardo@cs.washington.edu (David Keppel) writes: >Gcc has the capability to do two things: [ ... ] >* Optimization of instructions that the compiler doesn't know > how to emit, provided the instructions are in the machine > description. > >I'm much fuzzier on the latter, but I think it works something >like this: > >* The machine description contains information about how to > emit the "div" and "mod" instructions. > >* The machine description contains a description of a peephole > optimization that says something like ``if there's a "div" > instruction next to a "rem" instruction, and they operate on > the same operands, then trash the "div" instruction and get > the results from the "rem" which computes "div" as a side- > effect". > >* The compiler has no way of producing a "rem" instruction. > >* The user defines an "asm" that emits a "rem" instruction. > >* If the peephole matches, the optimization occurs, even tho' > the compiler never emitted the "rem". Sorry, gcc doesn't do this. It does have the capability of producing instructions that the code generator proper does not know how to admit; it does this, as described above, with a peephole recognizer driven by the machine description. However, the peephole recognizer is only capable of recognizing combinations that were originally produced by the code generator; it cannot recognize anything from an asm(). (Why? Because asm()s can contain arbitrary text, and writing a parser for their contents would be more trouble than it's worth, since it would need to be redone for each new architecture.) Conceivably, a restricted form of asm() could be specified that would be allowed to contain only strings in a syntax recognizable by the compiler, which could then relate asm() contents with RTL descriptions of the semantics of those contents, but the benefits of this would probably be too insignificant to be worth the effort. An unrelated topic: In a previous post I commented that gcc's back end is driven by a largely language-independent tree representation, and the moderator wondered just how language-independent it is. Offhand, I would say it is immediately suitable for Fortran, Pascal, and Modula-2. Since the tree representation was extended to support C++ (and the extended representation is now used in all versions of the compiler) I believe it is probably also suitable for Modula-3. To support Ada or Eiffel, I believe it might be necessary to extend the representation in some way to deal with generics (parameterized classes in Eiffel), but since the C++ world is looking to introduce some mechanism for the same sort of thing, it is quite possible that this extension is already being worked on for G++. Ada tasks would probably also require a further extension. It probably would not be reasonable to use the tree representation as the intermediate language for a non-imperative language. (But it might still win to generate RTL expressions and feed them directly to the true back end, although this might be difficult because of assumptions made by the optimizer about the output of the code generator.) Just to help clarify the preceding discussion, here is a block diagram of the overall structure of gcc: +--------+ +-----------+ +----------+ | |-1->| code |----------3--------->| assembly | | parser | | generator | +-----------+ | language | | |-2->| |-4->| optimizer |-5->| output | +--------+ +-----------+ +-----------+ +----------+ Notes: 1. The parser makes some direct procedure calls to the code generator for control flow statements. 2. The parser passes expression trees to the code generator. 3. The code generator makes some direct procedure calls to the output routines. (But only a very few, and none that produce actual code.) 4. The code generator produces RTL code which it feeds to the optimizer. This is where the bulk of the code generator's output goes, including all of the output corresponding to actual machine instructions. 5. The optimizer feeds RTL code to the output routines. In a non-optimizing compilation the "optimizer" block is replaced with a "stupid register allocator" block, since the output of the code generator assumes an infinite number of registers. In summary, ad-hoc procedure calls are used at (1) and (3), the tree representation is used at (2), and the RTL representation is used at (4) and (5). This should give some idea of how the various portions of the back end of gcc might be re-used in different contexts. -- Mike Haertel <mike@acc.stolaf.edu> -- Send compilers articles to compilers@esegue.segue.boston.ma.us {ima | spdcc | world}!esegue. Meta-mail to compilers-request@esegue.