pcg@aber-cs.UUCP (Piercarlo Grandi) (11/10/89)
I was really astonished when I realized that the "peephole" optimizer in the new System V 3.2 for the 386 does (by default!) inlining, based on an analysis of the assembler code generated by the compiler. There is even an option ("-y N") that allows you to tell the optimizer not to inline code sections longer than N bytes (or instructions?).

A daring, but probably effective, approach that is entirely source language independent. Inlining the assembler code could of course also be applied to cfront output.

An aside: I'd really be curious to see some description of that incredible peephole optimizer. Inlining is just one of the hairy things it does. There are nearly two dozen options to control various transformations, and none is documented... For example, inlining creates problems with the use of the alloca() procedure, and the way to disable it (or even the fact that inlining is performed at all) is never mentioned.

--
Piercarlo "Peter" Grandi           | ARPA: pcg%cs.aber.ac.uk@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcvax!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk
gary@dgcad.SV.DG.COM (Gary Bridgewater) (11/10/89)
In article <28966@shemp.CS.UCLA.EDU> rjc@cs.ucla.edu (Robert Collins) writes:
>In article <1520010@hpmwjaa.HP.COM> jeffa@hpmwtd.HP.COM (Jeff Aguilera) writes:
>>As Gary Bridgewater fell off the deep end, he wrote:

I was pushed.

>>If you really want such fine control over your application, then program in
>>assembly, not C++. The world will be a better place.

I used to and the world is pretty much as it was.

>>You are sure arrogant. Since inlining involves various tradeoffs, good
>>engineering practice should dictate the decision whether a particular idiom
>>is inlined or not. A self-centered programmer who covets every line of
>>his code and inlines for any damn reason is obviously not intelligent enough
>>to decide.

Petulant perhaps, but not arrogant. I don't advocate big inlines, just the freedom to have them. In a language full of concepts, having to remember what I can and cannot inline seems an unnecessary burden. Programming practices are a matter of policy and tools don't - or shouldn't - try to make policy. Detroit doesn't make cars that will only go 55MPH (65 on Interstates).

>You should apologize to Gary too, IMHO. I think you both should just
>relax a little! Put a few smileys in your articles! :-) :-) :-) Gosh,
>I feel better already. Try it! ;-)

I must apologize to the person I was originally replying to - I missed the point of his posting. I don't feel I am owed one. This is almost a religious topic and I expected some amount of disagreement. Ok - :^}

>Ok, let's talk about inlining. I think you both are right, although I
>also think you both go a little overboard. Inline functions are good

Again - I was pushed.

>for two things (1) avoiding function call overhead and (2) procedural
>integration (optimization).

3) concentrating platform-specific code, as opposed to zillions of embedded #ifdef's,
4) building powerful "verbs" to allow inherently complex code to be more easily understood, in a safer, more controlled manner than #defines,
5) as a surrogate for private functions such as are found in PL/1 or Pascal,
to name a few more.

>... An optimizing compiler has the potential to do a whole lot
>more when it can actually see all the code that is to be executed. ...

A good point - especially on RISCs.

>Now, I don't intend to imply that *every* function should be inlined.

Exactly.

>I want the compiler to help me out. I will help it, by telling it I
>think the resulting program might benefit if a given function were
>expanded inline. I want it to help me by evaluating the costs/benefits
>of inlining each occurrence of the function. And I don't mean the
>current strategy of simply counting the number of instructions to
>implement the `standalone inlinable function'. The evaluation has to
>take into account the *net* increase/decrease in code size that will
>result. Now, I realize that this is a very hard problem to do well,
>especially for a C++ `compiler' that produces C as the target language.
>That is ok, though, because the last thing we want is bored, unchallenged
>compiler-writers, right? :-)

Yes - inlining for performance will be much more interesting with C++ compilers that generate native code. Now we get back to coding C++ like it was assembly language. Why is that so abhorrent? Coding in assembler on RISC systems is not very simple and not at all easy to maintain w.r.t. optimization. So, I agree that at some unspecified future date - soon, I hope - the compiler will be spiffy enough to decide what the right thing really is to do for inlining based on performance metrics. That time is not now, however, so I want to be able to do it.

Again - that is not the same as saying I want to do it, and far from saying I want to do it all the time to the detriment of maintainability, clarity or sanity. I don't even want to know of an obfuscated C++ contest, much less enter one.

--
Gary Bridgewater, Data General Corporation, Sunnyvale California
gary@proa.sv.dg.com or {amdahl,aeras,amdcad}!dgcad.SV.DG.COM!gary
Shaken but not stirred.
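
Points 3) and 4) in the post above can be sketched roughly as follows. This is only an illustration, not code from the thread: SQUARE_MACRO, square, next, path_separator and the two counting helpers are hypothetical names made up for the example. It shows one #ifdef tucked inside a single inline function, and a macro evaluating its argument twice where the inline function evaluates it once.

```cpp
// All names here are hypothetical, chosen only for illustration.

// Point 3: one platform test lives in one inline function instead of
// being repeated as #ifdef blocks at every place the value is needed.
inline char path_separator() {
#ifdef _WIN32
    return '\\';
#else
    return '/';
#endif
}

// Point 4: a #define pastes its argument text wherever the parameter appears.
#define SQUARE_MACRO(x) ((x) * (x))

// The inline function evaluates its argument exactly once.
inline int square(int x) { return x * x; }

// Helper with a visible side effect: bumps the counter and returns it.
inline int next(int& counter) { return ++counter; }

// Number of times next() runs when passed through the macro.
int evaluations_with_macro() {
    int counter = 0;
    (void)SQUARE_MACRO(next(counter));  // expands to two calls of next()
    return counter;
}

// Number of times next() runs when passed through the inline function.
int evaluations_with_inline() {
    int counter = 0;
    (void)square(next(counter));        // next() is called once
    return counter;
}
```

Calling evaluations_with_macro() and evaluations_with_inline() side by side makes the difference observable without looking at generated code.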
bs@alice.UUCP (Bjarne Stroustrup) (11/12/89)
On inlining:

Inlining is an optimization. In most cases inlining is best used only after profiling has given you a good idea where the function call overhead is significant enough to be worth optimizing away.

Personally, I typically use inlining only for ``one liners'' where a function body is ``clearly'' shorter than the function call it would replace.

The value of inlining depends critically on the target machine architecture. Typically, inlining is worthwhile more frequently for a machine with a relatively slow function call.

Some inlining can be done by some linkers. The C++ mechanism for inlining attempts to make necessary optimizations independent of the smarts of the linker used. Most linkers are not smart enough to do significant inlining. Thus, the C++ inlining mechanism can be seen as a portability aid.

Not every call of an inline function can be inlined. Consider a pair of mutually recursive inline functions with a non-trivial criterion for ending the recursion. In the limit, this is the halting problem.

The criteria for when to inline will be compiler dependent. Cfront will inline a function at most once in a single expression. The primary reason for this is to protect against recursion. A secondary reason is to allow different inlinings of a function in a block to share local temporary variables. The latter can be important on architectures with limited stack space.

If the address of an inline function is taken, or if it is decided that a call of an inline cannot be inlined, an ``outline'' version must be laid down somewhere. Since `inline' implies internal linkage, the obvious thing to do is to lay down a `static' function. Doing this in several files can cause several ``outlined'' copies of a single inline function to be laid down. Such replication can be avoided, but currently no cfront/linker combination does that.

It is possible for a compiler to estimate the relative cost of a function call and inlining of the called function. Cfront makes such estimates and outlines a function if the inlined code would be ``too large'' relative to the call. This is a source of outlined inlines. Improvements are clearly possible here.
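
A minimal sketch of two of the cases described above: an inline function whose address is taken, and a pair of mutually recursive inline functions. The names twice, apply, is_even and is_odd are hypothetical and not from the post. What a given compiler actually does with each call is implementation dependent; the code only shows the source-level situations.

```cpp
// Hypothetical names, for illustration only.

// A "one liner": the body is about as small as the call it replaces.
inline int twice(int x) { return 2 * x; }

// Calls through a function pointer. Passing &twice here takes the address
// of an inline function, so an out-of-line copy has to exist somewhere.
int apply(int (*f)(int), int v) { return f(v); }

// Mutually recursive inline functions: the recursion depth depends on a
// run-time value, so the calls cannot all be expanded in place.
inline bool is_even(unsigned n);
inline bool is_odd(unsigned n)  { return n == 0 ? false : is_even(n - 1); }
inline bool is_even(unsigned n) { return n == 0 ? true  : is_odd(n - 1); }
```

For example, apply(&twice, 21) goes through the out-of-line copy, while a direct call such as twice(21) is a candidate for expansion in place.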
pcg@aber-cs.UUCP (Piercarlo Grandi) (11/13/89)
A more reasonable approach (which unfortunately only a true compiler like G++ can implement) is to let the compiler generate the code on the fly for each potential inlined function and then
bright@Data-IO.COM (Walter Bright) (11/14/89)
The inline keyword is a *hint* to the compiler, nothing more, nothing less. It is not a command. The similarities to the register keyword are worth noting. The compiler is free to use or ignore the register keyword. In fact, modern compilers ignore it and do their own register assignments. The original reason for the register keyword was that compilers weren't sophisticated enough to do it automatically. Currently (with the separate compilation model), compilers are not sophisticated enough to make the proper decisions about inlining versus instantiating functions. The inline keyword is therefore there to help out. I fully expect in the future that the inlining decision will be made appropriately by the compiler, and that the inline keyword can be dropped.
tom@elan.elan.com (Tom Smith) (11/14/89)
From article <10121@alice.UUCP>, by bs@alice.UUCP (Bjarne Stroustrup):
> On inlining:
[ discussion of inlining being an optimization with varied value removed ]
> Personally, I typically use inlining only for ``one liners'' where
> a function body is ``clearly'' shorter than the function call it
> would replace.

My most common use by far is "access functions", to provide read-only access to private data members.

> Cfront will inline a function
> at most once in a single expression. The primary reason for this is to
> protect against recursion. A secondary reason is to allow different inlinings
> of a function in a block to share local temporary variables. The latter can
> be important on architectures with limited stack space.

This seems like a limiting and potentially damaging generalization. Take the following illustration of an inline access-type member function:

    class Foo {
        int a;
    public:
        int A() const { return a; }
    };

There is clearly (from the compiler's point of view) no danger of recursion or side effects, given the lack of local variables and the 'const' declaration of the function itself. If the above comment regarding inlining within a statement is taken literally, the following results in a static "outlined" copy of Foo::A():

    int b = foo.A() + foo.A() / 2;

After using C++ almost exclusively for 3 years on several large-scale projects, I have found a major problem to be that of executable size. The primary causes of an overly large executable are either multiple virtual function table instances (no longer a problem in cfront 2.0), or gratuitous local function copies resulting from non-inlineable functions. If cfront outlines a copy of Foo::A() in the statement above, without issuing a warning, then it will be necessary for programmers using the AT&T C++ implementation to examine the translated output of cfront regularly for outlined functions, and to tune their source code in non-intuitive ways merely to ensure adequate performance.

Thomas Smith
Elan Computer Group, Inc.
tom@elan.com, ...!{ames, uunet, hplabs}!elan!tom
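
For anyone who wants to try this on their own compiler, here is a self-contained version of the fragment in the post above. The constructor and the function use_twice are additions made only so the example compiles on its own; they are assumptions, not part of the original. Whether a particular translator outlines Foo::A() for the second call can only be checked by inspecting its output.

```cpp
// Foo and A() follow the post; the constructor and use_twice are
// hypothetical additions so that the fragment stands alone.
class Foo {
    int a;
public:
    explicit Foo(int value) : a(value) {}
    int A() const { return a; }   // read-only access function
};

// The accessor is called twice within a single expression, which is the
// case the post is worried about.
int use_twice(const Foo& foo) {
    int b = foo.A() + foo.A() / 2;
    return b;
}
```

With Foo(10), use_twice returns 10 + 10 / 2, that is 15, regardless of whether either call was expanded in place.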
sabbagh@acf5.NYU.EDU (sabbagh) (11/14/89)
In article <2203@dataio.Data-IO.COM> bright@dataio.Data-IO.COM (Walter Bright) writes:
>The inline keyword is a *hint* to the compiler, nothing more, nothing less.
>It is not a command.
>

OK, OK, I've resisted long enough. Here's another reason for C++ source code translators to DO inline expansions:

-- To permit vectorizing C compilers to optimize loops.

I am working on a large numerical simulation of fluid flow. I have been pondering the use of C++ via cfront to generate Cray II code. In order to maximize the use of the vectorizing C compiler, I _need_ some of the member functions associated with my Vector class to be inlined, since the subroutine overhead is much larger than the two or three vector instructions needed to do things like dot products, etc. on the Cray.

BTW, Walter: I just received Zortech 2.0 Developer's kit and am _very_ impressed. Keep up the excellent work.

-hgs
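
A rough sketch of the kind of Vector class the post above describes, assuming a simple array-of-doubles representation. The member names, the std::vector storage and the free function dot are all assumptions for illustration, not the poster's code. The point is that the element accessors are small inline members, so once they are expanded the loop body in dot is a plain multiply-add that a vectorizing compiler has a chance to recognize.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical Vector class; layout and names are assumptions.
class Vector {
    std::vector<double> data_;
public:
    explicit Vector(std::size_t n, double fill = 0.0) : data_(n, fill) {}

    std::size_t size() const { return data_.size(); }

    // Small accessors defined in the class body, so they are inline
    // candidates at every call site.
    double  operator[](std::size_t i) const { return data_[i]; }
    double& operator[](std::size_t i)       { return data_[i]; }
};

// Dot product; assumes x and y have the same length (not checked here).
// If the accessors are expanded in place, nothing but arithmetic and
// loads remains inside the loop.
inline double dot(const Vector& x, const Vector& y) {
    double sum = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        sum += x[i] * y[i];
    }
    return sum;
}
```

If the accessors were compiled as real calls instead, each iteration would contain two function calls, which is the overhead the post is concerned about.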