David.Chase@Eng.Sun.COM (David Chase) (05/04/91)
First, there is some literature. Anne Holler recently finished a dissertation at U Virginia studying inlining: when it works and when it doesn't. (I must confess that I have only read a draft in serious detail.) Mary Wolcott Hall at Rice also spent some time studying inlining ("compiler torture" is how I heard it described), so Preston Briggs has some inside information. The conclusion is that *with current register allocation technology*, register pressure is the big risk, not code bloat in the cache. This doesn't mean code bloat doesn't happen -- it just means that register pressure happens first.

In discussions of technology, it is important to separate theory from practice. Daniel Weise argues that register pressure is no reason to damn inlining; perhaps the fault lies in the register allocators. It does. Unfortunately, right now they're the only allocators that we've got, and improved allocators are of no use to code that has already been compiled. Register allocators will improve, so these decisions must be reexamined in the future.

For a real datapoint, the Sun compilers just released do make use of (automatic) inlining at -O4, and it does help performance (generally -- see below), and tuning it to avoid compiler explosion was not trivial. (I didn't do this -- I watched Kurt Goebel and Chris Aoki do this.)

Note that the SPARC, if used in the manner described in the ABI, has features that tend to favor inlining. Inlining, tail-call elimination, and leaf-routine optimization all help reduce churning of register windows. Also, the SAVE and RESTORE instructions operate on blocks of 16 (integer) registers, whether you use them or not. Until all the registers are used, inlining adds no additional (integer) register pressure (i.e., nothing else needs to be spilled in the procedure prologue or restored in the epilogue -- this is a slightly different sense of "register pressure", but I think the idea is clear).
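
To make the register-window point concrete, here is a toy model, not a description of any real SPARC implementation or of the Sun compilers. It assumes every call does a SAVE and every return does a RESTORE, that the register file holds a fixed number of windows, and that an overflow or underflow trap moves exactly one window. The names window_traps, trace, and nwindows are made up for this sketch.

```python
def window_traps(events, nwindows=8):
    """Count window overflow/underflow traps for a list of 'call'/'ret' events.

    Toy model: 'nwindows' is the number of usable windows (hypothetical;
    real hardware typically keeps one in reserve for the trap handler).
    """
    resident = 1   # windows currently in the register file (the running frame)
    spilled = 0    # windows that have been written out to memory
    traps = 0
    for ev in events:
        if ev == "call":               # SAVE: needs a free window
            if resident == nwindows:   # none free: overflow trap spills the oldest
                traps += 1
                resident -= 1
                spilled += 1
            resident += 1
        elif ev == "ret":              # RESTORE: back to the caller's window
            resident -= 1
            if resident == 0:          # caller's window was spilled: underflow trap
                if spilled == 0:
                    raise ValueError("return without a matching call")
                traps += 1
                spilled -= 1
                resident = 1
        else:
            raise ValueError("unknown event: %r" % (ev,))
    return traps


def trace(depth, leaf_calls, inline_leaf):
    """Call chain 'depth' deep; each level calls a small leaf 'leaf_calls' times.

    If inline_leaf is true, the leaf calls vanish from the trace, which is
    the only effect of inlining that this model captures.
    """
    events = []
    for _ in range(depth):
        events.append("call")
        if not inline_leaf:
            events += ["call", "ret"] * leaf_calls
    events += ["ret"] * depth
    return events


without_inlining = window_traps(trace(3, 2, inline_leaf=False), nwindows=4)
with_inlining = window_traps(trace(3, 2, inline_leaf=True), nwindows=4)
```

With these made-up parameters the non-inlined trace takes 2 traps and the inlined trace takes none, because the leaf calls at the deepest level are what push the chain past the last free window. The model says nothing about register pressure inside a procedure or about code size, which are the costs on the other side of the trade.
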

Other architectures probably benefit from inlining for other reasons. The point is that the exact effects and costs may be very architecture (and calling-convention) dependent. If it isn't obvious already, there are so many knobs to twiddle in compiler and architecture design that few results in the field are timeless, or even necessarily general.

David Chase
Sun
--
Send compilers articles to compilers@iecc.cambridge.ma.us or {ima | spdcc | world}!iecc!compilers. Meta-mail to compilers-request.