mcg@mipon2.intel.com (12/19/90)
[Mr. McGeady sent me a copy of his inliner, which appears to be pretty much the same version as the one in the comp.sources.unix archives. Here's some of his comments and parts of the NOTES file from the distribution. -John] I have written of these. It seems to mostly work, but probably still has bugs in it. I don't use this on a regular basis anymore, but it was robust enough for several large software projects. Please send back bug fixes if you decide to use this. I also have some correspondence on file from people over the years who have pointed out minor problems, some of which I've fixed and some of which I haven't. Ask via return mail if you want any of this. The copyright statement is hereby relaxed for sufficient for you to use this program for any direct use, but not to sell it or otherwise redistribute it without asking me first. Hope this is OK. I don't want to get caught up in supporting something that's 4 years old. [From the NOTES file:] 'Inline' Implementation Notes - Mon Apr 27 14:28:43 PDT 1987 - Sat May 9 20:53:04 PDT 1987 - Wed Sep 23 16:47:59 PDT 1987 (c) Copyright, 1986, 1987, S. McGeady - all rights reserved. General Comments: This program was written as part of a challenge. When discussing the implementation of an inline substituter with a colleague (Jim Valerio), he suggested that the best way to do this was by parsing the entire program and rewriting expressions, whereas I felt that it could be done entirely on a lexical level. I didn't want to go to the trouble of parsing, keeping an entire symbol table, etc. He felt that one didn't have enough information at the lexical level to do the job. While I suppose I have proved my point, I think that Jim also proved his. This program consists of over 4000 lines of totally twisted code, and most of the complication herein involves working around things that wouldn't need to be worked around if I had a parse tree. While there are areas where I resort to some ad-hoc recursive-descent parsing, for the most part the problems are solved with goofy lexical heuristics. I have pumped a lot of code through this program, and that leaves me to believe it is mostly correct, but I would have felt better if I had a formal grammar. Some of the biggest problems that I ran into are things I didn't think about originally, such as confusing user-defined types (typedefs) and structure tags with identifiers, problems related to identifier hiding between scopes, and the parsing and regurgitating of function pointer declarations. Until recently, this program didn't support the biggest area where inline functions are needed - in the control parts of for() and while() loops, i.e. the places where you need an expression, rather than an embedded local-scope block. I long thought that this problem couldn't be solved without a parse tree, but it turns out that the heuristics to expressionize a routine are fairly simple (at least in comparison to the rest of this program) - about 300 lines of code. It is amusing to note that the previous version of these notes contained the statement, referring to this module (rewrite.c) "I have come up with some heuristics for this, but they are too grotesque for even me to put in at this point." They really aren't that bad. In my defense, I did finish this program (insofar as this is finished), and Jim never finished his. On the whole, this does a more complete and correct job than the inliner in the release of C++ that I have seen (an early one), which won't handle inlines with loops in them, and which screws up inlines with multiple returns. Someday I may rewrite this to contain a full C parser, but not today. What to expect from this in terms of performance improvement in real code? Well, the operative phrase is "your mileage may vary". I took the widely-distributed "compress" program, which is a prime candidate for this because the input and output routines are called once per character, and are only called once. For very large files, "compress" was improved by 10% by making the output routine an inline. I suspect a more typical example is 2-5%. Inlining does *not* generally buy you big increases in performance, and it is generally *not* going to get you anything in carefully-crafted C code. It's major utility is to provide a new feature in C - macros that are not sensitive to side-effects in their arguments, and which can be more complex than cpp macros typically can be. Perhaps this will be considered the "prior art" needed for inline's inclusion in C next time around. S. McGeady ---------------- Overall Strategy: The general technique is to take a declaration like: inline int foo(a,b) { int c; return(c = a+b); } and a call to it like: boff = foo(x+y,400); and turn it into a local block like: { int __foo_a; int __foo_b; int __foo_retval; __foo_a = (x+y); __foo_b = (400); { int c; __foo_retval = c = __foo_a + __foo_b; goto __foo_end; } __foo_end: boff = __foo_retval; } Additionally, we turn it into an expression like: (retval = c = a+b),retval In case it is used in one of these sorts of contexts: while(foo(1,2)) { ... } for ( ; x < foo(2,3); ..) { ... } We also try to do some optimization on the actual parameters themselves, only assigning them to block-locals (e.g. __foo_a) when they are either assigned to inside the inline or when they have side-effects (e.g. foo(x++,y)), so that the actual expansion above looks more like: { int __foo_a; int __foo_b; int __foo_retval; { int c; __foo_retval = c = (x+y) + (400); goto __foo_end; } __foo_end: boff = __foo_retval; } There are a myriad of other little issues, including renaming variables so as not to conflict with those in enclosing scopes, etc. To avoid any naming conflicts, the following variable renaming is done: 1) if a procedure calls an inline, all of it's local variables are changed to '_auto_var' (where 'var' is the original variable name). this prevents collision of external references from inlines that have been expanded with local variables of the same name in the calling routine; 2) formal parameter names in inlines are changed to avoid conflicts during initialization. they are changed to __foo_formal (where 'foo' is the inline function name and 'formal' is the formal name; 3) variables in locally-scoped block within inlines are renamed to '_loc_var_lev' (where 'var' is the variable name and 'lev' is a digit indicating the scoping level), so that when the statement is expressionized and all of the initializations are moved to the same level, the local variables remain unique. -- Send compilers articles to compilers@iecc.cambridge.ma.us or {ima | spdcc | world}!iecc!compilers. Meta-mail to compilers-request.