[comp.compilers] Squashing C Source - Inline

mcg@mipon2.intel.com (12/19/90)
[Mr. McGeady sent me a copy of his inliner, which appears to be pretty much
the same version as the one in the comp.sources.unix archives.   Here's some
of his comments and parts of the NOTES file from the distribution. -John]

I have written of these.  It seems to mostly work, but probably still has
bugs in it.  I don't use this on a regular basis anymore, but it was robust
enough for several large software projects.  Please send back bug fixes if
you decide to use this.  I also have some correspondence on file from people
over the years who have pointed out minor problems, some of which I've fixed
and some of which I haven't.  Ask via return mail if you want any of this.
The copyright statement is hereby relaxed for sufficient for you to use this
program for any direct use, but not to sell it or otherwise redistribute it
without asking me first.  Hope this is OK.  I don't want to get caught up in
supporting something that's 4 years old.

[From the NOTES file:]

'Inline' Implementation Notes	-   Mon Apr 27 14:28:43 PDT 1987
				-   Sat May  9 20:53:04 PDT 1987
				-   Wed Sep 23 16:47:59 PDT 1987

(c) Copyright, 1986, 1987, S. McGeady - all rights reserved.

General Comments:

This program was written as part of a challenge.  When discussing the
implementation of an inline substituter with a colleague (Jim Valerio), he
suggested that the best way to do this was by parsing the entire program and
rewriting expressions, whereas I felt that it could be done entirely on a
lexical level.  I didn't want to go to the trouble of parsing, keeping an
entire symbol table, etc.  He felt that one didn't have enough information
at the lexical level to do the job.  While I suppose I have proved my point,
I think that Jim also proved his.  This program consists of over 4000 lines
of totally twisted code, and most of the complication herein involves working
around things that wouldn't need to be worked around if I had a parse tree.
While there are areas where I resort to some ad-hoc recursive-descent parsing,
for the most part the problems are solved with goofy lexical heuristics.  I
have pumped a lot of code through this program, and that leaves me to believe
it is mostly correct, but I would have felt better if I had a formal grammar.

Some of the biggest problems that I ran into are things I didn't think about
originally, such as confusing user-defined types (typedefs) and structure tags
with identifiers, problems related to identifier hiding between scopes, and
the parsing and regurgitating of function pointer declarations.

Until recently, this program didn't support the biggest area where inline
functions are needed - in the control parts of for() and while() loops,
i.e. the places where you need an expression, rather than an embedded
local-scope block.  I long thought that this problem couldn't be solved
without a parse tree, but it turns out that the heuristics to expressionize
a routine are fairly simple (at least in comparison to the rest of this
program) - about 300 lines of code.  It is amusing to note that the
previous version of these notes contained the statement, referring to this
module (rewrite.c) "I have come up with some heuristics for this, but they
are too grotesque for even me to put in at this point."  They really
aren't that bad.

In my defense, I did finish this program (insofar as this is finished), and
Jim never finished his.  On the whole, this does a more complete and correct
job than the inliner in the release of C++ that I have seen (an early one),
which won't handle inlines with loops in them, and which screws up inlines
with multiple returns.

Someday I may rewrite this to contain a full C parser, but not today.

What to expect from this in terms of performance improvement in real code?
Well, the operative phrase is "your mileage may vary".  I took the
widely-distributed "compress" program, which is a prime candidate for this
because the input and output routines are called once per character, and are
only called once.  For very large files, "compress" was improved by 10% by
making the output routine an inline.  I suspect a more typical example is
2-5%.  Inlining does *not* generally buy you big increases in performance,
and it is generally *not* going to get you anything in carefully-crafted C
code.  It's major utility is to provide a new feature in C - macros that are
not sensitive to side-effects in their arguments, and which can be more
complex than cpp macros typically can be.  Perhaps this will be considered
the "prior art" needed for inline's inclusion in C next time around.

S. McGeady

----------------

Overall Strategy:


The general technique is to take a declaration like:

	inline int foo(a,b) {
		int c;

		return(c = a+b);
	}

and a call to it like:

	boff = foo(x+y,400);


and turn it into a local block like:

	{
		int __foo_a; int __foo_b;
		int __foo_retval;
		__foo_a = (x+y);
		__foo_b = (400);
		{
			int c;
			__foo_retval = c = __foo_a + __foo_b;
			goto __foo_end;
		}
	__foo_end:
		boff = __foo_retval;
	}

Additionally, we turn it into an expression like:

	(retval = c = a+b),retval

In case it is used in one of these sorts of contexts:

	while(foo(1,2)) { ... }
	for ( ; x < foo(2,3); ..) { ... }

We also try to do some optimization on the actual parameters themselves, only
assigning them to block-locals (e.g. __foo_a) when they are either assigned
to inside the inline or when they have side-effects (e.g. foo(x++,y)), so
that the actual expansion above looks more like:

	{
		int __foo_a; int __foo_b;
		int __foo_retval;
		{
			int c;
			__foo_retval = c = (x+y) + (400);
			goto __foo_end;
		}
	__foo_end:
		boff = __foo_retval;
	}


There are a myriad of other little issues, including renaming variables
so as not to conflict with those in enclosing scopes, etc.  To avoid any
naming conflicts, the following variable renaming is done:

	1) if a procedure calls an inline, all of it's local variables
	   are changed to '_auto_var'  (where 'var' is the original variable
	   name).  this prevents collision of external references from inlines
	   that have been expanded with local variables of the same name
	   in the calling routine;

	2) formal parameter names in inlines are changed to avoid
	   conflicts during initialization.  they are changed to
	   __foo_formal (where 'foo' is the inline function name and 'formal'
	   is the formal name;

	3) variables in locally-scoped block within inlines are renamed
	   to '_loc_var_lev' (where 'var' is the variable name and 'lev'
	   is a digit indicating the scoping level), so that when the
	   statement is expressionized and all of the initializations are
	   moved to the same level, the local variables remain unique.
-- 
Send compilers articles to compilers@iecc.cambridge.ma.us or
{ima | spdcc | world}!iecc!compilers.  Meta-mail to compilers-request.