markhall@pyrps5.pyramid.com (Mark Hall) (11/30/88)
I'm interested in knowing whether any of the common code generator-generators (Glanville, Cattell, Ganapathi, etc.) are used in any production compilers. Assuming they are not, is this just a manifestation of academia vs. industry, or are there real reasons for not using them in (*ahem*) real compilers? Pointers and opinions are welcome, particularly from those with experience with more than one CGG.

-Mark Hall
(smart mailer): markhall@pyramid.pyramid.com
{amdahl|decwrl|sun|seismo|lll-lcc}!pyramid!markhall
grow@druhi.att.com (12/02/88)
In article <2964@ima.ima.isc.com>, markhall@pyrps5.pyramid.com (Mark Hall) writes:
> I'm interested in knowing whether any of the common code
> generator-generators (Glanville, Cattell, Ganapathi, etc) are used in
> any production compilers. ...

The AIE Ada compiler done by Intermetrics Incorporated of Cambridge, Mass. used the PQCC (Production Quality Compiler Compiler) technology. This compiler does very sexy instruction selection and register allocation. Anybody who mentions this compiler is liable to say that it was a real dog as far as its speed of compilation was concerned. This is probably true, but the blame should be placed on the intermediate representation (DIANA, to be exact) rather than on the code generator technology!

The ALS Ada compiler done by the SofTech Corporation of Waltham, Mass. used the Graham-Glanville technology. This compiler's code wasn't nearly as sexy as the code produced by the AIE, but it was supposed to be fairly good. The same thing can be said about its speed of compilation that was said about the AIE. Note that I believe the ALS compiler is still under development, so any comments made about it must be taken as relevant only to its early history!
--
Gary Oblock
Compiler Consultant to Bell Labs
att!druhi!grow
(303) 538-4169
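Since the Graham-Glanville technique comes up throughout this thread, a toy sketch of the underlying idea may help readers who haven't seen it. This is purely illustrative and not code from any of the compilers discussed (which were written in Ada; C is used here for compactness): the IR is a prefix-form string, and each machine instruction is a grammar rule that rewrites an IR fragment to the nonterminal "reg". A real Graham-Glanville system compiles the rules into SLR(1) parse tables; this sketch just matches its two rules top-down for clarity.

	/* Toy Graham-Glanville-style instruction selection (illustrative).
	 * Grammar: reg -> c            emits  LOAD r, const
	 *          reg -> + reg reg    emits  ADD  r1, r1, r2
	 */
	#include <stdio.h>

	static const char *ir;        /* prefix-form IR, e.g. "+ c + c c" */
	static int next_reg = 0;

	static int gen(void)          /* cover one IR subtree, return its register */
	{
	    char op = *ir;
	    while (*ir && *ir != ' ') ir++;   /* consume the token */
	    while (*ir == ' ') ir++;          /* skip the separator */

	    if (op == 'c') {                  /* rule: reg -> c */
	        int r = next_reg++;
	        printf("\tLOAD r%d, const\n", r);
	        return r;
	    }
	    if (op == '+') {                  /* rule: reg -> + reg reg */
	        int a = gen(), b = gen();
	        printf("\tADD  r%d, r%d, r%d\n", a, a, b);
	        return a;
	    }
	    return -1;                        /* no rule applies: selection is "blocked" */
	}

	int main(void)
	{
	    ir = "+ c + c c";                 /* (const + (const + const)) in prefix form */
	    gen();
	    return 0;
	}

The table-size problem mentioned below arises because, without factoring, every addressing mode of every operand of every instruction becomes its own grammar rule, and the cross product of modes blows the tables up.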
knighten@encore.uucp (Bob Knighten) (12/06/88)
In article <2964@ima.ima.isc.com>, markhall@pyrps5.pyramid.com (Mark Hall) writes:
> I'm interested in knowing whether any of the common code
> generator-generators (Glanville, Cattell, Ganapathi, etc) are used in
> any production compilers. ...

I've tangentially worked on two code generators built using CGGs and designed a third, so I guess I've something to offer.

The first was the SofTech ALS Ada compiler for the VAX. I worked on this towards the end of the implementation phase. It was a fairly direct implementation of the technique described in Glanville's thesis; Graham was actually a consultant during the early design stages for this compiler. Glanville didn't really address global optimization/register allocation issues, and the ALS compiler dealt with this by doing on-the-fly register allocation based on lifetime information produced by the global optimizer, which worked on a largely machine-independent expanded DIANA tree. The parser generator was written by a SofTech engineer. It was written in Ada, the parsers were in Ada, and the parser tables were Ada aggregates. A minor but important change in the language accepted by the parser (compared to Glanville's design) was to factor addressing modes. This was crucial to bring the size of the parse tables for the VAX down to merely LARGE.

This compiler was retargeted to various of the Intel microprocessors (8086, 80286, ?) and to a Navy computer (AN/UYK-?). The local code generation was acceptable, but for the VAX and AN/UYK the overall code quality was poor, because the register management scheme really did not work at all well. The code quality for the Intel machines was fairly good, partly because register management is just not very important on them and partly because more tuning was done on those compilers.

The SofTech code generator was redesigned (by a group headed by me and a colleague) for the NTT/DIPS machine, a variation on the IBM 370 architecture. Instead of using the straight LR approach of Graham and Glanville, we went to an affix grammar approach, similar to, but not the same as, Ganapathi's. For this we contracted with MetaWare to modify their TWS to produce an affix grammar parser. The actual language used is described in the MetaWare TWS manual (such as it is); it shows a notable Ada influence. Again the parser tables are Ada aggregates, and the parser skeleton is written in Ada.

We also changed a number of aspects of the code generation/register allocation process. The machine description used was not for the actual machine, but rather for a similar virtual machine having all three-operand instructions and an infinite supply of all types of registers. [In contrast to the IBM machine, the DIPS has separate arithmetic and address registers.] Following code generation, we then do flow analysis, replacement of three-operand instructions by two-operand instructions, and register binding using a register coloring approach (a sketch of this rewrite appears below). I left SofTech before the implementation was complete, but I've been assured by friends who are still there that the quality of the code generated is quite good. We actually had a contractual commitment to produce code within some fixed percentage of the execution speed of the best hand-assembled code for a set of benchmarks, and we met that goal.

All of these compilers are quite slow. This was primarily because the front end and global optimizer are slow, but it allowed the code generator to slip by.
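The lowering step just described is easy to picture. Here is a minimal sketch, mine rather than SofTech's (again in C rather than Ada, with all names hypothetical): code is first selected for the idealized machine with three-operand instructions and unlimited virtual registers, then each three-operand instruction is rewritten into the real machine's two-operand form (ADD d,s meaning d = d + s) before register coloring binds virtual registers to real ones.

	/* Sketch: lower three-operand virtual-machine code to two-operand form. */
	#include <stdio.h>

	struct insn { const char *op; int dst, src1, src2; };

	static void lower(struct insn i)
	{
	    if (i.dst != i.src1)                 /* dst and src1 differ: copy first */
	        printf("\tMOV v%d, v%d\n", i.dst, i.src1);
	    printf("\t%s v%d, v%d\n", i.op, i.dst, i.src2);
	}

	int main(void)
	{
	    struct insn code[] = {
	        { "ADD", 2, 0, 1 },              /* v2 = v0 + v1                 */
	        { "MUL", 2, 2, 3 },              /* v2 = v2 * v3: no copy needed */
	    };
	    for (int k = 0; k < 2; k++)
	        lower(code[k]);
	    return 0;
	}

Presumably the inserted MOVs are grist for the coloring pass, which can coalesce many of them away by giving both virtual registers the same color; this sketch ignores that interaction.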
As I said, these are slow code generators, but this seems to reflect that speed was not important rather than any inability to be fast.

The most recent CGG-based compiler I was involved with is the Retargetable Back End at Prime Computer. Aspects of that project are discussed in two recent papers:

  Prescott K. Turner, "Up-down parsing with prefix grammars", SIGPLAN
  Notices, Vol. 21, No. 12 (Dec. 1986), pp. 167-174.

  David Spector and Prescott K. Turner, "Limitations of Graham-Glanville
  style code generation", SIGPLAN Notices, Vol. 22, No. 2 (Feb. 1987),
  pp. 100-108.

The major change here was that Scott Turner came up with a different parsing method. I'm told that Weisgerber and Wilhelm at the Universitat des Saarlandes have devised a similar tree matching code generator. Two compilers were built at Prime using this technique for code generation. They were used internally but never released, due to a change in corporate plans. The speed was not good and the code quality was poor. Again, the local code quality was quite good, but nothing was done to take advantage of global information. In particular, the problem of integrating register management well had not been solved when the project was cancelled.

In <2971@ima.ima.isc.com>, grow@druhi.att.com (Gary Oblock) mentions the Intermetrics AIE Ada compiler as being based on PQCC technology. Most of the recent Intermetrics compilers have been inspired by the PQCC project, and the Ada compiler came much closer than most to using the full PQCC approach, but it's probably too strong to say it was based on the PQCC technology. This also illustrates why the term "code generator generator" is ill-advised. The original hope of the PQCC project was to largely automate the production of compilers, using language and machine specifications as input. At Intermetrics a large suite of tools has been built to help automate the building of compilers, but the process has not gotten terribly much faster or cheaper. Similarly at SofTech, the reason for going with the Graham-Glanville technique was to produce retargetable compilers. But the process of retargeting these compilers has not proved particularly easier than building hand-crafted code generators from well-defined specifications. The reason for this is that local code selection - the part that can be done by parsing, tree matching, or what have you - is not the hardest part of code generation, so there remains a substantial hand-crafted part.

There are other production compilers based on CGGs:

  Rodney Farrow, "Experience with an attribute grammar-based compiler",
  SIGPLAN Notices, 1982. This is a report on Intel's Pascal-86 compiler.

  Hans-Stephan Jansohn, Rudolf Landwehr, Jochen Hayek and Michael Thatner,
  "Generating MC68000 code for Ada", SIGPC Notes, Vol. 6, No. 2 (1983),
  pp. 81-87 (1983 ACM Conference on Personal and Small Computers). This is
  a discussion of the University of Karlsruhe Ada compiler, which is based
  on GAG (Generator based on Attribute Grammars) and CGSS (Code Generator
  Synthesis System). At least the VAX version of this compiler is available
  from a small German company, Systeam AG.

  John Yates and Robert Schwartz, "Dynamic programming and industrial-strength
  instruction selection: Code generation by tiring, but not exhaustive,
  search", SIGPLAN Notices, Vol. 23, No. 10 (October 1988), pp. 131-140.
  This describes the new code generators being used in the Apollo compilers
  for their new RISC machines. It is based on bottom-up tree matching and
  was implemented using LEX, YACC and C code.
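The dynamic programming in the Yates/Schwartz approach is bottom-up over the expression tree: every node records the cheapest known cost of covering its subtree, and larger patterns (such as a fused memory addressing mode) compete against combinations of smaller ones. A toy version, with the patterns and costs invented for illustration and nothing taken from the paper itself:

	/* Toy bottom-up dynamic-programming tree tiling (illustrative only). */
	#include <stdio.h>

	enum kind { CONST, ADD, LOAD };

	struct node {
	    enum kind kind;
	    struct node *l, *r;   /* children, 0 for leaves          */
	    int cost;             /* min cost to cover this subtree  */
	    const char *tile;     /* winning pattern, for reporting  */
	};

	static void tile(struct node *n)
	{
	    if (!n) return;
	    tile(n->l);           /* children first: bottom-up */
	    tile(n->r);
	    switch (n->kind) {
	    case CONST:
	        n->cost = 1; n->tile = "li";          /* load immediate */
	        break;
	    case ADD:
	        n->cost = n->l->cost + n->r->cost + 1;
	        n->tile = "add";
	        break;
	    case LOAD:
	        /* Two competing tiles: a plain load through a register, or a
	         * fused reg+const addressing mode that swallows the ADD and
	         * the CONST in a single instruction. */
	        n->cost = n->l->cost + 1; n->tile = "load (r)";
	        if (n->l->kind == ADD && n->l->r->kind == CONST) {
	            int fused = n->l->l->cost + 1;
	            if (fused < n->cost) { n->cost = fused; n->tile = "load d(r)"; }
	        }
	        break;
	    }
	}

	int main(void)
	{
	    /* LOAD(ADD(CONST, CONST)): the fused tile should win, cost 2 vs 4. */
	    struct node c1 = { CONST, 0, 0, 0, 0 }, c2 = { CONST, 0, 0, 0, 0 };
	    struct node a  = { ADD,  &c1, &c2, 0, 0 };
	    struct node ld = { LOAD, &a,  0,   0, 0 };
	    tile(&ld);
	    printf("best cover: %s, cost %d\n", ld.tile, ld.cost);
	    return 0;
	}

With the costs above, the fused "load d(r)" tile (cost 2) beats li + add + load (cost 4); picking the cheaper covering consistently across all tree shapes is exactly what the dynamic programming buys over an ad hoc selector.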
Finally, I'll mention that COMPASS has an automated code generator production system (PEARL?) based on a lexical analysis technique. This is discussed in a COMPASS technical report by Kathleen Knobe and Joan Lukas.

Based on my experience, both direct and indirect, I've come to feel that the value of machine-generated code generators (or, more likely, machine-generated code selectors) is in providing a framework for generating AND MAINTAINING many different compilers. And this is not because making one new compiler is much faster this way than in a less automated fashion, but because automation enforces a discipline and uniformity which is very hard to achieve without it.

I would be very interested in hearing other viewpoints on this question.

Bob Knighten
Encore Computer Corp.
257 Cedar Hill St.
Marlborough, MA 02172
(508) 460-0500 ext. 2626
Arpanet: knighten@multimax.arpa
Usenet: {bu-cs,decvax,necntc,talcott}!encore!knighten