johnson@p.cs.uiuc.edu (Ralph Johnson) (05/11/89)
The work of Davidson and Fraser points the way to a practical UNCOL. We have been using a register transfer langauge as an intermediate language in an optimizing compiler for Smalltalk. Davidson and Fraser designed a machine independent peephole optimizer that would translate a machine language program into register transfers, optimize the resulting program, and combine the transfers back into machine language. They claimed that register transfers can be a useful UNCOL. Early attempts at UNCOLs, they claimed, were essentially union languages that tried to include all the features that any machine might have. In contrast, register transfers contain only the intersection of features that any machine might have. They claimed that their combining algorithm would reconstruct more complicated instructions. Our use of register transfers as an intermediate language has been very successful. Our original system was quite similar to PO, which Davidson wrote. The main difference was that we generated register transfers directly. Also, we attempted more global optimizations and better register assignment, though this was not significantly better than what Davidson and Fraser did. Our version of the combining algorithm was much faster than theirs because we could use a lot of tricks in Smalltalk. However, we gradually changed our code generator until it is now quite different from PO. In particular, our intermediate data structure is similar to Static Single Assignment (SSA), invented by folks at IBM. We think that we now have a very elegant and reasonably efficient low-level representation for programs that is easy to retarget to different machines. The code generator has only been targeted to two different machines, so we don't yet have proof that it is really portable. One of the machines is the 68020 and the other is a local horizontally microcoded machine. However, our model stretches to fit both those machines, which is a pretty good sign. I don't think we will have any trouble with RISC machines. In summary, I think that a very simple-minded register transfer language is a good candidate for an UNCOL, using a combining algorithm similar to that of Davidson and Fraser for converting UNCOL programs into machine language for each kind of machine. Ralph Johnson [Isn't this the same model that GCC uses, fairly successfully? In any event, I wonder how well it addresses machines like the 360 and to some extent the PDP-10 where there are register asymmetries, e.g. the famous 360 BXLE instruction which does "r1 += r2, if r1 <= r3 goto x", but r2 and r3 have to be in an even-odd register pair. I know that IBM's PL.8 uses a similar scheme to generate 370 code, but believe that it has a lot of hints to help put things in the right registers. I know that such instructions are now considered old-fashioned, but there are a lot of old-fashioned machines. -John] certain registers in [From Ralph Johnson <johnson@p.cs.uiuc.edu>] -- Send compilers articles to compilers@ima.isc.com or, perhaps, Levine@YALE.EDU Plausible paths are { decvax | harvard | yale | bbn}!ima Please send responses to the originator of the message -- I cannot forward mail accidentally sent back to compilers. Meta-mail to ima!compilers-request
pardo@june.cs.washington.edu (David Keppel) (05/13/89)
Ralph Johnson <johnson@p.cs.uiuc.edu> writes: >[Fraser/Davidson RTL IR] >[moderator: isn't that what gcc uses (successfully)?] My compilers prof has pointed out to me that matching for optimizations and instruction selection in an RTL (Fraser/Davidson, IBM PL.801) is a special case of matching in a DAG (directed acyclic graph) -structured ir (Graham/Glanville/Henry). One important distinction is that all matching in an RTL is done at the level of things that usually look like RISC machine instructions. Thus the code quality is better when there is a good match between the instructions in the machine-independent RTL and the instructions in the physical hardware. ;-D on ( The code degenerator ) Pardo -- pardo@cs.washington.edu {rutgers,cornell,ucsd,ubc-cs,tektronix}!uw-beaver!june!pardo -- Send compilers articles to compilers@ima.isc.com or, perhaps, Levine@YALE.EDU Plausible paths are { decvax | harvard | yale | bbn}!ima Please send responses to the originator of the message -- I cannot forward mail accidentally sent back to compilers. Meta-mail to ima!compilers-request
compilers-sender@ima.ima.isc.com (05/18/89)
> The work of Davidson and Fraser points the way to a practical UNCOL. > We have been using a register transfer langauge as an intermediate > language in an optimizing compiler for Smalltalk. Davidson and Fraser > designed a machine independent peephole optimizer that would translate > a machine language program into register transfers, optimize the > resulting program, and combine the transfers back into machine language. But an RTL really only addresses part of the UNCOL problem, namely one part of the executable code. There are some knotty problems in the data side-- think about the usual nasties like FORTRAN EQUIVALENCE and other problems of defining the "memory model" assumed by the intermediate language. Para- meter passing has been another sticking point--although maybe machines are more similar now than they were 10-15 years ago. (Does anyone still use copy/restore for FORTRAN?) I'll grant that their work *does* suggest some useful progress for the part of the UNCOL problem it does address. The trouble with trying to figure out why various UNCOL attempts didn't succeed is that it's usually not possible to point to any single area and say "THAT's where the big problem is; that's why it didn't work." It tends to be a myriad of little problems that overwhelms a putative solution as it tries to migrate from experimental to practical. (That's also why I was fussing earlier about OSF's two-machine/one-language requirement--it may not be enough to find problems.) --- Dick Dunn UUCP: {ncar,nbires}!ico!rcd (303)449-2870 [Last time I checked, IBM 370s use copy/restore because the lack of indirect addressing makes direct reference slow. I suggest that one of the ANDF target machines for the demo should be one with a wordsize not a power of two. -John] -- Send compilers articles to compilers@ima.isc.com or, perhaps, Levine@YALE.EDU Plausible paths are { decvax | harvard | yale | bbn}!ima Please send responses to the originator of the message -- I cannot forward mail accidentally sent back to compilers. Meta-mail to ima!compilers-request
johnson@p.cs.uiuc.edu (Ralph Johnson) (05/18/89)
Question from editor: [Isn't this the same model that GCC uses, fairly successfully? In any event, I wonder how well it addresses machines like the 360 and to some extent the PDP-10 where there are register asymmetries, e.g. the famous 360 BXLE instruction which does "r1 += r2, if r1 <= r3 goto x", but r2 and r3 have to be in an even-odd register pair. I know that IBM's PL.8 uses a similar scheme to generate 370 code, but believe that it has a lot of hints to help put things in the right registers. I know that such instructions are now considered old-fashioned, but there are a lot of old-fashioned machines. -John] It is quite similar to the model that GCC uses. GCC has also been influenced by the work of Robert Henry at U. of Washington. Either our compiler or PO will handle instructions that combine assignments with changes of control flow. I think that other table-driven code generators, like Henry's, will also handle those kind of instructions. The technique we use for register allocation will handle register asymmetries with no problem. I don't recall how PO did it. In our compiler, converting register transfers into machine instructions creates constraints on register allocation. For example, when the register transfer r3 = r1 + r2 gets converted to a 68020 ADD instruction, registers r1 and r3 are constrained to be the same, because ADD is a two-address instruction. If this constraint cannot be met (i.e. if r1 is used later) then the register allocator will have to introduce a new logical register. We use a graph-coloring based register allocation algorithm. In general, these kinds of algorithms are good at handling constraints. Ralph Johnson [From Ralph Johnson <johnson@p.cs.uiuc.edu>] -- Send compilers articles to compilers@ima.isc.com or, perhaps, Levine@YALE.EDU Plausible paths are { decvax | harvard | yale | bbn}!ima Please send responses to the originator of the message -- I cannot forward mail accidentally sent back to compilers. Meta-mail to ima!compilers-request