compilers@ima.UUCP (01/10/86)
[from kurt at ALLEGRA/FLUKE (Kurt Guntheroth)]
Organization: John Fluke Mfg. Co., Inc., Everett, WA

1. When I first learned about optimization, I thought the ideal optimizer would work in the following way: parse the source into trees/dags/IL/whatever, then reorder and simplify the trees into the optimal equivalent program. Then generate code by any reasonably good technique. The code would be almost perfect, since it was generated from an optimal program.

Now I find out that this doesn't work too well. Real machines are so un-orthogonal that you inevitably de-optimize the code when you generate machine instructions. People seem to be concentrating more and more on the machine language: generating machine instructions simply, then optimizing the instructions by performing translations permitted by some rule set (grammars, tables, etc.). People like Fraser swear by this method. Other people I have talked to say that almost any high-level optimization might de-optimize code for a given processor by making it more difficult to generate some instruction sequence. What do you practitioners consider the 'right' way to do things? It seems that optimizations involving moving code would be much more difficult if you do them on machine instructions (the grammars don't handle it too well). What about this, oh gurus?

2. I once considered generating code for the 6502. This miserable little processor doesn't even have a 16-bit register, so you must form and move addresses one byte at a time (ugh). I suspect the problems of generating code for the 6502 must be similar to the problems of generating good code for a segmented machine like the x86, where the low address byte is like an offset and the high address byte is like a segment. Any comments?

3. How do I get the GNU compiler tools? Ideally, are they small enough to post to usenet's net.sources? Also, is there a public domain S/SL compiler? S/SL is a tool from U of Toronto that builds table-driven ad hoc top-down compilers.
This may sound strange, but it seems to combine many of the nice intuitive features of ad hoc recursive descent compilers with the speed and compactness of table-driven parsers. Building the S/SL translator is not difficult (especially compared to writing an LALR parser generator), but I haven't done it yet and I am interested to hear whether somebody else has.

4. Has anybody seen good books on actually implementing good optimization? I have Wulf's book on BLISS -- I mean any other ones. I can find literature about optimization, but it is generally at a very abstract level. I would like to steal implementation-level ideas if I can, instead of reinventing all the wheels in the world.

Kurt Guntheroth
John Fluke Mfg. Co., Inc.
{uw-beaver,decvax!microsof,ucbvax!lbl-csam,allegra,ssc-vax}!fluke!kurt

[I haven't seen any books, but there have been many articles on machine-specific optimizations. Look at the November 1980 issue of the IBM Journal of R+D, or at the various compiler construction conference proceedings published by Sigplan. -John]
compilers@ima.UUCP (01/14/86)
[from Chris Torek <harvard!mimsy.umd.edu!gymble!chris>]

Like yourself, I `thought the ideal optimizer would ... parse the source into trees/dags/IL/whatever and reorder and simplify [these] to the optimal equivalent program'. If I understand correctly, the problem with this is that the compiler's back end has to do some really strange things for some of what the front end might generate. But if this is the case, then it seems to me the *real* problem is a lack of communication from the back end to the front end.

What is needed is a way for the code generator to tell the optimizer `x is not good', `y is very good', `z is only worthwhile under the following conditions', and so on. Maybe a weighted, production-system-style database interpreted by the front end during optimization would be the way to do this. Then you could say `if there is repeated division by a power of two inside a loop, try to determine whether the dividend is positive, even if you have to put an ``if'' around the outside and make two loops', or `loops that count down to zero are twice as good as those that count up to a constant', or whatever. (The former lets you directly use arithmetic shifts instead of integer divides on two's complement machines---this is extremely important on many microprocessors---and the latter allows use of the `sob'-style VAX instructions or the NS32000 count-to-zero instructions.) Of course, with enough rules this might produce the world's slowest compiler. . . . :-)
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 4251)
UUCP: seismo!umcp-cs!chris
CSNet: chris@umcp-cs
ARPA: chris@mimsy.umd.edu

[My understanding of the PL.8 compiler from IBM is that it first compiles into an extremely low-level intermediate code, second does machine-independent optimizations such as common subexpression elimination and loop-invariant removal, and then does register assignment and instruction selection. The intermediate code has an infinite number of registers and very simple operations, so that real machine instructions usually subsume several intermediate instructions. I don't know how they deal with giving hints to the intermediate code generator so that it can use fancy real machine instructions, but this was part of the 801 RISC project, which had a machine with very few "clever" instructions. They get superb code, but the compiler is so slow that I gather nothing less than a 370/168 will do. -John]
compilers@ima.UUCP (01/16/86)
[from allegra!sbcs!debray (Saumya Debray)]

It isn't entirely surprising that transforming a HLL program P1 to an equivalent "optimal" (!?) HLL program P2 and then generating code from P2 might not always yield the best results. The virtual machine corresponding to the high-level language is really very different from that corresponding to the low-level one, and one might expect their cost criteria to be different as well. I have a couple of comments against restricting optimization _exclusively_ to low-level code, however:

(1) The idea behind code optimization is to preserve equivalence between programs at the HLL level while improving the estimated cost at the low level. However, if optimization is restricted to object code, then we are necessarily required to preserve equivalence at the object code level. This may be overly restrictive, since it may not take into account the fact that certain aspects of the object code programs are irrelevant at the HLL level.

(2) The level of abstraction provided by a HLL may make certain optimizations much simpler at that level than at the object code level. For example, the transformation of a recursive program to compute the factorial function,

	fact(N) = if N = 0 then 1 else N * fact(N-1)

to tail-recursive and thence to iterative form is fairly straightforward at the HLL level, using the associativity of *. It's not obvious to me how the corresponding transformation would go working only on object code.

-Saumya Debray
SUNY at Stony Brook