[comp.lang.c++] inline policy

pcg@aber-cs.UUCP (Piercarlo Grandi) (11/10/89)

I was really astonished when I realized that the "peephole" optimizer in
the new System V 3.2 for the 386 does (by default!) inlining, based on an
analysis of the assembler code generated by the compiler. There is even
an option ("-y N") that allows you to tell the optimizer not to inline
code sections longer than N bytes (or instructions?).

A daring but probably effective approach, and one that is entirely independent
of the source language. Inlining at the assembler level could of course also be
applied to cfront output.

An aside: I'd really be curious to see some description of that incredible
peephole optimizer. Inlining is just one of the hairy things it does. There
are nearly two dozen options to control various transformations, and none is
documented... For example, inlining creates problems with the use of the
alloca() procedure, and the way to disable it (or even the fact that inlining
is performed at all) is never mentioned.
-- 
Piercarlo "Peter" Grandi           | ARPA: pcg%cs.aber.ac.uk@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcvax!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk

gary@dgcad.SV.DG.COM (Gary Bridgewater) (11/10/89)

In article <28966@shemp.CS.UCLA.EDU> rjc@cs.ucla.edu (Robert Collins) writes:
>In article <1520010@hpmwjaa.HP.COM> jeffa@hpmwtd.HP.COM (Jeff Aguilera) writes:
>>As Gary Bridgewater fell off the deep end, he wrote:
I was pushed.
>>If you really want such fine control over your application, then program in
>>assembly, not C++.  The world will be a better place.
I used to and the world is pretty much as it was.
>>You are sure arrogant.  Since inlining involves various tradeoffs, good
>>engineering practice should dictate the decision whether a particular idiom
>>is inlined or not.  A self-centered programmer who covets every line of
>>his code and inlines for any damn reason is obviously not intelligent enough
>>to decide.
Petulant perhaps, but not arrogant. I don't advocate big inlines, just the
freedom to have them. In a language full of concepts, having to remember
what I can and cannot inline seems an unnecessary burden. Programming practices
are a matter of policy and tools don't - or shouldn't - try to make policy.
Detroit doesn't make cars that will only go 55MPH (65 on Interstates).

>You should apologize to Gary too, IMHO.  I think you both should just
>relax a little!  Put a few smileys in your articles!  :-) :-) :-)  Gosh,
>I feel better already.  Try it! ;-)
I must apologize to the person I was originally replying to - I missed the
point of his posting. I don't feel I am owed one. This is almost a religious
topic and I expected some amount of disagreement. Ok - :^}

>Ok, let's talk about inlining.  I think you both are right, although I
>also think you both go a little overboard.  Inline functions are good
Again - I was pushed.
>for two things (1) avoiding function call overhead and (2) procedural
>integration (optimization). 
3) concentrating platform specific code as opposed to zillions of
   embedded #ifdef's,
4) building powerful "verbs" to allow inherently complex code to be more
    easily understood in a safer, more controlled manner than #defines,
5) as a surrogate for private functions such as is found in PL/1 or Pascal.
to name a few more.
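
A minimal sketch of point 4, assuming nothing beyond standard C++: a macro
pastes its argument text and may evaluate it twice, while an inline function
evaluates each argument exactly once. MAX_MACRO, max_inline, next_value and
the two counting helpers are illustrative names, not anything from the thread.

```cpp
#include <cassert>

// Macro "verb": argument text is pasted in, so an argument may run twice.
#define MAX_MACRO(a, b) ((a) > (b) ? (a) : (b))

// Inline "verb": arguments are evaluated exactly once and are type-checked.
inline int max_inline(int a, int b) { return a > b ? a : b; }

// Hypothetical helper with a visible side effect: it counts its own calls.
int calls = 0;
int next_value() { ++calls; return 10 * calls; }

// Number of next_value() calls made by one use of the macro.
int calls_via_macro() {
    calls = 0;
    int r = MAX_MACRO(next_value(), 5);   // next_value() appears twice after expansion
    assert(r == 20);                      // the second call's result is what comes back
    return calls;
}

// Number of next_value() calls made by one use of the inline function.
int calls_via_inline() {
    calls = 0;
    int r = max_inline(next_value(), 5);  // argument evaluated once
    assert(r == 10);
    return calls;
}
```

With these definitions the macro version runs the helper twice and the inline
version runs it once, which is the "safer, more controlled" part of the claim.
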
>... An optimizing compiler has the potential to do a whole lot
>more when it can actually see all the code that is to be executed. ...
A good point - especially on RISCs.

>Now, I don't intend to imply that *every* function should be inlined.
Exactly
>I want the compiler to help me out.  I will help it, by telling it I
>think the resulting program might benefit if a given function were
>expanded inline.  I want it to help me by evaluating the costs/benefits
>of inlining each occurrence of the function.  And I don't mean the
>current strategy of simply counting the number of instructions to
>implement the `standalone inlinable function'.  The evaluation has to
>take into account the *net* increase/decrease in code size that will
>result.  Now, I realize that this is a very hard problem to do well,
>especially for a C++ `compiler' that produces C as the target language.
>That is ok, though, because the last thing we want is bored, unchallenged
>compiler-writers, right? :-)
Yes - inlining for performance will be much more interesting with c++
compilers that generate native code. Now we get back to coding c++ like
it was assembly language. Why is that so abhorrent? Coding in assembler
on RISC systems is not very simple and not at all easy to maintain w.r.t.
optimization.
So, I agree that at some unspecified future date - soon, I hope - the
compiler will be spiffy enough to decide what the right thing really is
to do for inlining based on performance metrics. That time is not now,
however, so I want to be able to do it myself. Again - that is not the same
as saying I want to do it, and far from saying I want to do it all the time
to the detriment of maintainability, clarity or sanity. I don't even
want to know of an obfuscated c++ contest, much less enter one.
-- 
Gary Bridgewater, Data General Corporation, Sunnyvale California
gary@proa.sv.dg.com or {amdahl,aeras,amdcad}!dgcad.SV.DG.COM!gary
Shaken but not stirred.

bs@alice.UUCP (Bjarne Stroustrup) (11/12/89)

On inlining:

Inlining is an optimization. In most cases inlining is best used only
after profiling has given you a good idea where the function call
overhead is significant enough to be worth optimizing away.

Personally, I typically use inlining only for ``one liners'' where
a function body is ``clearly'' shorter than the function call it
would replace.

The value of inlining depends critically on the target machine architecture.
Typically, inlining is worthwhile more frequently for a machine with
a relatively slow function call.

Some inlining can be done by some linkers. The C++ mechanism for inlining
attempts to make necessary optimizations independent of the smarts of the
linker used. Most linkers are not smart enough to do significant inlining.
Thus, the C++ inlining mechanism can be seen as a portability aid.

Not every call of an inline function can be inlined. Consider a pair of
mutually recursive inline functions with a non-trivial criterion for ending
the recursion. In the limit, this is the halting problem. The criteria for
when to inline will be compiler dependent. Cfront will inline a function
at most once in a single expression. The primary reason for this is to
protect against recursion. A secondary reason is to allow different inlinings
of a function in a block to share local temporary variables. The latter can
be important on architectures with limited stack space.
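
A minimal sketch of a mutually recursive pair, with a deliberately trivial
termination test; is_even, is_odd and parity_demo are illustrative names.
Expanding either function inline exposes a call to the other, so any compiler
has to stop expanding at some depth and emit an ordinary call.

```cpp
#include <cassert>

// Two inline functions that call each other. The 'inline' request cannot be
// honoured for every call, because each expansion contains another call.
inline bool is_odd(unsigned n);
inline bool is_even(unsigned n) { return n == 0 ? true  : is_odd(n - 1); }
inline bool is_odd(unsigned n)  { return n == 0 ? false : is_even(n - 1); }

// Usage: the results do not depend on how much was actually inlined.
void parity_demo() {
    assert(is_even(10));
    assert(is_odd(7));
    assert(!is_even(3));
}
```
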

If the address of an inline function is taken, or if it is decided that a call
of an inline cannot be inlined, an ``outline'' version must be laid down
somewhere. Since `inline' implies internal linkage, the obvious thing to
do is to lay down a `static' function. Doing this in several files can
cause several ``outlined'' copies of a single inline function to be laid
down. Such replication can be avoided, but currently no cfront/linker
combination does that.
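
A minimal sketch of the address-taken case; twice, apply, direct and
via_pointer are illustrative names. It is written for a present-day compiler,
so the linkage details differ from the cfront behaviour described above: in
later ISO C++ an inline function has external linkage by default and the
out-of-line copies are normally merged at link time.

```cpp
#include <cassert>

// Small inline function; the name is illustrative.
inline int twice(int x) { return 2 * x; }

typedef int (*IntFn)(int);

// Calls through a pointer; in general the callee is not known here.
int apply(IntFn f, int x) { return f(x); }

// Direct call: a candidate for inline expansion.
int direct(int x) { return twice(x); }

// Address taken: a real function body must exist for the pointer to refer to.
int via_pointer(int x) { return apply(&twice, x); }

// Usage: both routes compute the same value.
void outline_demo() { assert(direct(21) == via_pointer(21)); }
```
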

It is possible for a compiler to estimate the relative cost of a function
call and inlining of the called function. Cfront makes such estimates
and outlines a function if the inlined code would be ``too large''
relative to the call. This is a source of outlined inlines. Improvements
are clearly possible here.

pcg@aber-cs.UUCP (Piercarlo Grandi) (11/13/89)

    A more reasonable approach (which unfortunately only a true compiler like
    G++ can implement) is to let the compiler generate the code on the fly
    for each potential inlined function and then

bright@Data-IO.COM (Walter Bright) (11/14/89)

The inline keyword is a *hint* to the compiler, nothing more, nothing less.
It is not a command.

The similarities to the register keyword are worth
noting. The compiler is free to use or ignore the register keyword. In
fact, modern compilers ignore it and do their own register assignments.
The original reason for the register keyword was that compilers weren't
sophisticated enough to do it automatically.

Currently (with the separate compilation model), compilers are not
sophisticated enough to make the proper decisions about inlining versus
instantiating functions. The inline keyword is therefore there to help
out. I fully expect in the future that the inlining decision will be
made appropriately by the compiler, and that the inline keyword can be
dropped.

tom@elan.elan.com (Tom Smith) (11/14/89)

From article <10121@alice.UUCP>, by bs@alice.UUCP (Bjarne Stroustrup):
> On inlining:

[ discussion of inlining being an optimization with varied value removed ]

> Personally, I typically use inlining only for ``one liners'' where
> a function body is ``clearly'' shorter than the function call it
> would replace.

My most common use by far is "access functions", to provide read-only
access to private data members.

> Cfront will inline a function
> at most once in a single expression. The primary reason for this is to
> protect against recursion. A secondary reason is to allow different inlinings
> of a function in a block to share local temporary variables. The latter can
> be important on architectures with limited stack space.

This seems like a limiting and potentially damaging generalization.
Take the following illustration of an inline access-type member function:
    class Foo {
        int a;
    public:
        int A() const  { return a; }
    };

There is clearly (from the compiler's point of view) no danger of recursion
or side effects, given the lack of local variables and the 'const' declaration
of the function itself.  If the above comment regarding inlining within
a statement is taken literally, the following results in a static "outlined"
copy of Foo::A():
    int b = foo.A() + foo.A() / 2;
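
A self-contained version of the fragment above, as a sketch: the constructor
and the names one_and_a_half and access_demo are additions for illustration
and are not in the original post. Whether a particular translator expands both
calls of A() inline can only be settled by reading its output.

```cpp
#include <cassert>

class Foo {
    int a;
public:
    explicit Foo(int value) : a(value) {}  // hypothetical constructor, added so 'a' can be set
    int A() const { return a; }            // inline access function from the post
};

// Two calls of the same inline member function in a single expression.
int one_and_a_half(const Foo& foo) {
    int b = foo.A() + foo.A() / 2;
    return b;
}

// Usage: 4 + 4 / 2 is 6 with integer division.
void access_demo() { assert(one_and_a_half(Foo(4)) == 6); }
```
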

After using C++ almost exclusively for 3 years on several large-scale
projects, I have found a major problem to be that of executable size.
The primary causes of an overly-large executable are either multiple
virtual function table instances (no longer a problem in cfront 2.0),
or gratuitous local function copies resulting from non-inlineable functions.
If cfront outlines a copy of Foo::A() in the statement above, without
issuing a warning, then it will be necessary for programmers using the
AT&T C++ implementation to examine the translated output of cfront
regularly for outlined functions, and to tune their source code in
non-intuitive ways merely to ensure adequate performance.

    Thomas Smith
    Elan Computer Group, Inc.
    tom@elan.com, ...!{ames, uunet, hplabs}!elan!tom

sabbagh@acf5.NYU.EDU (sabbagh) (11/14/89)

In article <2203@dataio.Data-IO.COM> bright@dataio.Data-IO.COM (Walter Bright) writes:
>The inline keyword is a *hint* to the compiler, nothing more, nothing less.
>It is not a command.
>

OK, OK, I've resisted long enough.  Here's another reason for C++ source code
translators to DO inline expansions:

	-- To permit vectorizing C compilers to optimize loops.

I am working on a large numerical simulation of fluid flow.  I have
been pondering the use of C++ via cfront to generate Cray II code.
In order to maximize the use of the vectorizing C compiler, I _need_ some 
of the member functions associated with my Vector class to be inlined,
since the subroutine overhead is much larger than the two or three 
vector instructions needed to do things like dot products, etc. on the
Cray.
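
A minimal sketch of the kind of member and helper functions meant here. This
Vector is a hypothetical stand-in, not the poster's class, and nothing below
is specific to the Cray compiler. The point is only that once size(),
operator[] and dot() are expanded inline, the loop body is a plain
multiply-add over two arrays, which is the form a vectorizing C compiler can
work with.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical non-owning view over an array of doubles.
class Vector {
    const double* data_;
    std::size_t   n_;
public:
    Vector(const double* data, std::size_t n) : data_(data), n_(n) {}
    std::size_t size() const { return n_; }                     // one-line inline accessor
    double operator[](std::size_t i) const { return data_[i]; } // one-line inline accessor
};

// Dot product; after inlining the accessors the loop has no calls left in it.
inline double dot(const Vector& x, const Vector& y) {
    assert(x.size() == y.size());   // both operands must have the same length
    double sum = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i)
        sum += x[i] * y[i];
    return sum;
}
```
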

BTW, Walter: I just received Zortech 2.0 Developer's kit and am 
_very_ impressed. Keep up the excellent work.

-hgs