burley@gnu.ai.mit.edu (Craig Burley) (05/03/91)
I disagree that "inline" in C/C++ compilers has been oversold... at least, it hasn't in any literature I've read. I agree that arbitrary inlining can be disadvantageous to performance and program size. However, the "marketing" for inline has always been, for me, the ability to compose smaller, semantically correct functions without undue performance implications.

For example: it used to be that if I wrote a name-space manager that happened to keep all the names it knew about on a linked list, any users of that manager might do the following to get the next name:

    name n;   /* "name" is, say, "typedef struct {...} *name;" */
    ...
    n = n->next;

Then I learned (through education and the necessities of maintenance and debugging) to abstract such things away, so the implementation was known only to the name-space manager, not to all its users:

    n = name_next(n);

(Or, better yet, define a nameHandle typedef and let users use that, in case I needed to change name's implementation of lists for optimization or other purposes.)

The obvious way to do this in C is via a macro (in name.h):

    #define name_next(n) ((n)->next)

All the parens are, in the general case, necessary. Now, when I started down this path, I naturally found out (as do most C programmers of library packages of this sort) that macro replacement (string substitution) has its problems. Parenthesizing to avoid surprising precedence problems is one thing, but two other problems come up. The first is multiple side effects when an argument is referenced twice, as in the trivial case of:

    #define some_function(i) ((i) * 3 + (i) / 2)

(If "some_function" is invoked with an expression having a side effect, the side effect happens twice.)
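The double-side-effect problem can be made visible with a small sketch (next_val and the call counter are my own illustration, not from the original post):

```c
#include <assert.h>

/* The macro from the text: each textual use of the argument is
   re-expanded, so a side-effecting actual argument runs twice. */
#define some_function(i) ((i) * 3 + (i) / 2)

/* A helper with a visible side effect: it counts its own calls. */
static int calls = 0;
static int next_val(void)
{
    calls++;
    return 4;
}

/* some_function(next_val()) expands to
       ((next_val()) * 3 + (next_val()) / 2)
   so next_val() is invoked twice: the result is 4*3 + 4/2 = 14,
   and 'calls' goes up by 2, not 1. */
```

Had `some_function` been a real function, `next_val()` would of course have been evaluated exactly once.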
The other, nastier problem (since I wasn't used to it) hit me the other day when I used a macro like this:

    #define symbol_set_value(s,value) ((s) = expression_eval(value))

The compiler barfed, and after an unpleasant amount of mucking around, I discovered that the macro's use of "value" as a dummy conflicted with expression_eval (a macro) referencing a structure member named "value" (a member of the dummy's structure type, for example), causing substitution where it wasn't wanted. Ideally, the "definer" of symbol_set_value shouldn't have to care what goes on in the implementation of expression_eval. (Changing the dummy argument name from "value" to "v", for example, solved the problem, but obviously shouldn't be necessary when looking at the situation as an abstractionist.)

Anyway, the "proper" solution to all this is to use the mechanisms already provided in C to make binding and evaluation happen the way people expect when they use function-invocation form: i.e., use functions. Unfortunately, calling functions for things like evaluating "n->next" given n is almost always (on most architectures) slower AND larger (in program memory size) than doing these things directly.

So the reason for inline in C (as GCC provides it), at least as I've always understood it, is to allow the (usually) more semantically reliable and predictable mechanism of function definition and reference (vs. macro definition and reference) without sacrificing performance, by putting the function definition in the #include file with the "inline" attribute. That is, to promote data abstraction and encapsulation to a higher degree by offering more reliability without sacrificing performance.
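What the name-space example looks like with an inline function in the header instead of a macro might be sketched like this (the struct layout and names are my assumptions; `static inline` is used so the sketch is valid in both GNU C and later C standards):

```c
#include <assert.h>
#include <stddef.h>

/* A hypothetical name-space manager type, as in the text. */
typedef struct name_rec {
    struct name_rec *next;
    /* ... other members known only to the manager ... */
} name_rec, *name;

/* Instead of  #define name_next(n) ((n)->next)  -- a static inline
   function gives ordinary argument binding and scoping (no surprise
   substitution, no double evaluation), yet a compiler like GCC can
   still emit the same direct member access as the macro would. */
static inline name name_next(name n)
{
    return n->next;
}
```

The dummy argument `n` here is a real function parameter, so it cannot collide with member names used inside other macros or functions it calls.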
In C++, since the idea is to use such abstraction as much as possible, and things like references are provided so that macros like

    #define set_me(x,v) ((x) = (v))

can be transparently made into functions (which they can't be in ANSI C), that apparently drove the original decision to provide the "inline" directive, so such trivial abstractions can still be compiled into efficient code. (At least I think "inline" first appeared in C++, but I'm not sure.)

There are arguments against inlining that still apply to trivial abstractions like the ones I've shown above:

- Changing them requires recompiling all user modules. (Sure, that's true, but changes to trivial abstractions, at least in my experience, tend to be changes to names and formal parameters at least as often as to implementation, and the former changes require recompilation anyway.)

- Inlining slows down compilation. (Yes, of course abstraction where direct expression entry would suffice will slow down compilation in most cases, but that's a case of having the computer do the kind of work it is supposed to do -- it should "know" what "symbol_next" means and how to do it, rather than the programmer "knowing" to write "n->next" each time and having to change every such reference when the implementation changes.)

- Inlining slows down optimization. (Yes, but the optimization is typically much better for trivial abstractions, since code surrounding these trivial inlines doesn't get broken up by a call possibly requiring dumping/restoring registers and breaking control-flow analysis.)

- Inlining doesn't always do as good a job as interprocedural analysis. (Yes, and interprocedural analysis is orders of magnitude more difficult to implement so that it does at least as well as inlining in most cases... and even then, inlining gives interprocedural analysis a great head start as long as it is used intelligently.)

Inlining naturally has advantages beyond what I've discussed for trivial abstractions.
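The point about references can be seen by trying to turn set_me into a C function: with no reference types, the closest one gets is passing a pointer, which changes the call syntax. A minimal sketch (the `int` specialization is my assumption; the macro itself works for any assignable lvalue):

```c
#include <assert.h>

/* The macro assigns through any lvalue transparently: */
#define SET_ME(x, v) ((x) = (v))

/* A C function can't bind the caller's lvalue directly, so the caller
   must pass a pointer: the call changes from set_me(x, v) to
   set_me(&x, v).  A C++ inline function taking a reference,
   inline void set_me(int &x, int v) { x = v; },
   would keep the original call syntax -- which is why inline slots
   into C++ so naturally. */
static inline void set_me(int *x, int v)
{
    *x = v;
}
```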
For example, in cases where an actual argument to an inline procedure is a constant, the inline function definition might easily be optimized by the compiler to fold the constant; in cases where a formal argument isn't even referenced by the inline function definition, the compiler might decide to avoid wasting time constructing the corresponding actual argument (evaluating the expression, except perhaps for its side effects); and so on.

However, I don't think "inline" is anything close to a GENERAL solution for control over optimization of an application, and I always support the idea of doing before/after performance evaluation when turning on inlining. (Note: "turning on" inlining in a decent compiler like GCC near the end of a large programming project is much less likely to introduce lots of new bugs than "turning on" macroization of trivial abstractions, as has been typical in the past; I've participated in enough of the latter sort of project to know how dangerous it is, which is why I like "inline" so much.)

For example, when it comes to a function definition that isn't "trivial" (a simple get or set of a member value, or evaluation of a simple function) but is still smallish, one doesn't always want to say "inline me always" in the definition so much as "inline this guy here" at the reference, such as inside a tight loop. On the other hand, using naming conventions, #include, and so on, I suppose "inline" could be used to create a mechanism for doing this in a standard way on a given project.

Generally, I wouldn't recommend "inline" for any function that incorporates a loop or a decision without first evaluating exactly how much the inline code differs from the code needed to call a procedure (non-inline) version of the function. (That would be a "nonportable" decision, i.e. one which would need to be reevaluated for every port to a new machine; it is rarely necessary to do so for inlining trivial abstractions.)
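The constant-folding advantage can be sketched as follows (`scale` and `scaled_width` are hypothetical names of my own, chosen only to illustrate the point):

```c
#include <assert.h>

/* A trivial abstraction: one expression, no loops or decisions. */
static inline int scale(int x)
{
    return x * 3 + 1;
}

/* Because scale() is inlined and its argument here is a constant, a
   compiler can fold the whole body down to the constant 43 at compile
   time.  An out-of-line call to the same function cannot be folded
   without interprocedural analysis. */
static int scaled_width(void)
{
    return scale(14);
}
```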
Oh, one other advantage of inline that I'm not sure of, but that makes conceptual sense anyway: to a debugger, an inline function has a much better chance of looking like something useful (something "invokable" at the debugger command line, for example) than a corresponding macro does (since the latter is usually removed by the preprocessor and thus never seen by the debugger).

Hope this rambling helps people who thought inline was great and then weren't so sure. It's great when you remember why it was created (IMHO): for fairly trivial abstractions/functions. To make a lexer run faster by inlining the function that implements it? Naaah. :-)
--
James Craig Burley, Software Craftsperson    burley@gnu.ai.mit.edu
--
Send compilers articles to compilers@iecc.cambridge.ma.us or
{ima | spdcc | world}!iecc!compilers.  Meta-mail to compilers-request.