druschel@cs.arizona.edu (Peter Druschel) (08/25/90)
Consider the following code (compiled with g++ 1.37.1):

    class Base {
    public:
        inline virtual void foo() {}
    };

    class Derived : public Base {
    public:
        inline virtual void foo() {}
    };

    class Container {
    public:
        inline void bar(Base &b) { b.foo(); }
    };

    main()
    {
        Derived d;
        Container c;

        d.foo();  /* this call to Derived::foo() is statically
                     resolved and inline expanded */

        c.bar(d); /* this call to Container::bar() is also statically
                     resolved and inline expanded; the virtual function
                     foo(), called within bar(), however, is invoked
                     indirectly through Derived's virtual table */
    }

Note that bar() is inline expanded in main() and the argument d passed to
bar() is known to be of class Derived. Thus, the exact type of the argument
passed to foo() from within bar() is also known. Consequently, foo() could
also be statically resolved and inline expanded.

It seems to me that, with little effort, the compiler could notice this
situation and take advantage of it. This optimization would be very useful;
in many cases, it could even eliminate the need for parametric container
classes. Usually, the member functions of container classes invoke small
virtual functions of their arguments (e.g. the contained objects). In many
cases, the member functions of the container class are themselves short and
can be inlined. If the compiler were capable of doing the above
optimization, it could eliminate all the virtual invocations. The resulting
container class would be just as efficient as a parametric container class
created specifically for a particular (contained) object class.

What do people think about this? Is there any chance this (and other
strategies for exploiting static resolution of virtual functions, as hinted
at in the "ideas" file in the distribution) will be implemented any time
soon?

I have thought about making the required modifications to g++, but I don't
know enough about g++ internals to be able to estimate what would be
involved. Some information about how g++ does inlining and a few hints
might encourage me to do it.

--
Peter Druschel            Internet: druschel@cs.arizona.edu
Dept of Computer Science
University of Arizona
Tucson, AZ 85721  USA
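[For concreteness, here is a sketch of what main() above could reduce to if
the compiler performed the resolution asked for here; this is illustrative
only, not actual g++ 1.37.1 output:]

    main()
    {
        Derived d;
        Container c;

        /* d.foo(): inlined body of Derived::foo() -- empty */

        /* c.bar(d): inlined body of Container::bar() is b.foo() with
           b bound to d; since d is known to be exactly a Derived,
           foo() resolves statically to Derived::foo(), whose inlined
           body is again empty */
    }

No virtual table access survives; because both foo() bodies happen to be
empty, the calls disappear entirely.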
dl@g.g.oswego.edu (Doug Lea) (08/25/90)
As Paul Vaughan mentioned (thanks!), I've looked into ways of permitting
and performing static resolution, and described my best thoughts for doing
so in the '90 USENIX C++ proceedings. I'll outline the highlights here.

First, it's important to realize that the goal of static resolution is NOT
merely to replace indirect function calls with direct function calls.
Happily, virtual function calls usually result in very small, often
unnoticeable, performance penalties over direct calls. The goal is instead
to enable PROCEDURAL INTEGRATION, achieved by the inlining of a typically
small number of crucial member functions, which in turn enables other
standard powerful optimization techniques to take place. For example, the
sum(Matrix&) routine listed in my paper, in which all access functions
(rows(), operator()(i, j), etc.) are virtual, runs less than twice as fast
when virtual calls are replaced by direct calls, but 25 times faster when
hand-optimized in a way that static resolution could automate. (These are
on a Sun4/110 using g++.)

Factors of 25 are hard to ignore. People (myself included) sometimes use
highly compromised, obscure, and even unsafe design and coding schemes to
avoid such things. My goals in exploring static resolution lay not so much
in getting raw speed per se (since this can almost always be done now via
ugly hacks) but instead in encouraging better design in the first place, by
eliminating much of the motivation for such compromise. (See my discussions
about `Abstract Base Classes' (ABCs), etc., in the paper.)

After thinking to themselves, ``why can't my compiler just figure out how
to optimize this code all by itself'', most people's first good idea about
how to enable static resolution is to be able to somehow indicate whether a
class is a `leaf' class. Since methods of such a class could not be
overridden, hardwiring would be possible. I found this unacceptable for
several reasons:

1) As Steve Clamage wrote, it disables an important software reuse idiom in
   a thoroughly unacceptable manner, since leaf classes could never be
   subclassed. Thus, the strategy hurts the common practice of inheriting
   mechanism across implementation classes.

2) Unless ALL leaf classes are required to be designated as such (and
   perhaps even still, depending on various other details), programmers
   would still have to write out each and every special leaf-class version
   of polymorphic functions themselves, even though, as is almost always
   the case, the special versions would contain EXACTLY the same C++
   statements, just optimized in different ways for different arguments.
   This gets out of hand very quickly for functions with several arguments.
   Moreover, if a programmer decides to modify a hierarchy, changing a leaf
   into a non-leaf class, all of this work would have to be manually
   redone.

3) Because of (2), as Steve and others have also pointed out, this leads to
   no pragmatic improvement over existing constructs, since you can now
   manually invoke subclass routines inside functions anyway, albeit in a
   sometimes-ugly and always-dangerous manner. (Dangerous, since such
   routines can fail when new subclasses are introduced. Despite other
   failings, specially denoting leaf classes would prevent this.)

The alternative strategy I laid out is to make support for
polymorphism-by-synthesis (as opposed to polymorphism-by-dispatch) a
first-class mechanism in C++.
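[For readers without the proceedings at hand, the routine in question has
roughly this shape; this is a sketch, and the exact Matrix interface shown
here is assumed, not copied from the paper:]

    class Matrix {
    public:
        virtual int rows() const = 0;
        virtual int cols() const = 0;
        virtual float operator()(int i, int j) const = 0;
    };

    float sum(Matrix& m)
    {
        float s = 0;
        for (int i = 0; i < m.rows(); i++)       /* virtual call per test */
            for (int j = 0; j < m.cols(); j++)   /* virtual call per test */
                s += m(i, j);                    /* virtual call per element */
        return s;
    }

Merely making these calls direct saves only the dispatch overhead; inlining
them exposes the loop body to ordinary loop optimizations, which is where a
factor like 25 can come from.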
The technique is called `customization', and is based on work originally
described by Ungar, Chambers, and others developing the SELF compiler. The
strategy amounts to alerting the compiler (via a barely-defensible
bastardization/alternative usage of the `template' keyword) that all
specializations of a method or function should be resynthesized
(customized) by the compiler, rather than relying on virtual function calls
that otherwise allow the same executable code to work with each different
subclass argument it might encounter. This can be implemented via an
internal macro-expander-like system driven by a type inferencer.

For example, if there were a base class `Matrix', a function `float
sum(template Matrix& m)', a Matrix subclass `RowMajorMatrix', and finally
client code

    RowMajorMatrix a;
    float s = sum(a);

then a specialized version of `float sum(RowMajorMatrix& m)', specifically
optimized for RowMajorMatrix (and not any subclass thereof!), would
automatically be synthesized to handle the `float s = sum(a)' call. Of
course, there are requirements for backup strategies, involving continued
use of dispatch-based versions for cases where the exact type of an
argument is unknown at compile time, along with various other niceties
described in the paper (as well as a few that weren't -- write for details
if you are interested).

The most controversial aspects of the proposal are:

1) It has surprisingly far-reaching effects on the C++ type system itself,
   including a few outright conflicts where static resolution and dynamic
   resolution can currently have differing effects. (Of course, I argue
   that all such changes are for the best! Just for one example, it
   provides a way for programmers to avoid `slicing' via customized call by
   value, which addresses a topic recently discussed here.)

2) Even though several of the most important cases where customization can
   be applied are relatively straightforward to implement, getting the
   whole thing right doesn't look to be doable in an afternoon, a week, a
   month, or perhaps even a year.

3) It has consequences for C++ environments. Means for programmers to have
   some say about tradeoffs of potential code bloat for efficiency,
   prospects for integrating customization with parameterized types, and
   impact on the separate compilation model all need to be addressed.

Many people apparently believe that there might be a means to get similar
effects with less impact on the language and current compilers. I remain
unconvinced that this is possible. The fact that both the SELF and TS
(Typed Smalltalk) designers and implementers use variations on
customization extensively to obtain their impressive performance
improvements over standard OO compilation schemes lends credence to the
notion that customization can play an important role in maintaining C++'s
usual edge in efficiency compared to other OO languages.

However, one viable alternative is to incorporate customization into a
metalanguage with associated tools comprising a C++ programmer's workbench,
rather than into the language proper. I've been exploring this too.

-Doug

--
Doug Lea, Computer Science Dept., SUNY Oswego, Oswego, NY 13126
(315)341-2688
email: dl@g.oswego.edu or dl@cat.syr.edu
UUCP : ...cornell!devvax!oswego!dl or ...rutgers!sunybcs!oswego!dl
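[Concretely, the synthesis might look as follows. This is a sketch: the
RowMajorMatrix layout and the mangled name are invented for illustration,
and the members are left public to keep it short. Given a subclass]

    class RowMajorMatrix : public Matrix {
    public:
        float* data;
        int nr, nc;
        int rows() const { return nr; }
        int cols() const { return nc; }
        float operator()(int i, int j) const { return data[i * nc + j]; }
    };

[a customizing compiler handling `float s = sum(a)' would emit, behind the
scenes, something equivalent to:]

    /* synthesized for the `float s = sum(a)' call;
       the name here is hypothetical */
    float sum__RowMajorMatrix(RowMajorMatrix& m)
    {
        float s = 0;
        for (int i = 0; i < m.nr; i++)
            for (int j = 0; j < m.nc; j++)
                s += m.data[i * m.nc + j];  /* all access fully inlined */
        return s;
    }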
brucec@phoebus.phoebus.labs.tek.com (Bruce Cohen) (08/28/90)
In article <DL.90Aug25123348@g.g.oswego.edu> dl@g.g.oswego.edu (Doug Lea)
writes:

... a description of his proposal for "customization" ...

> Many people apparently believe that there might be a means to get
> similar effects with less impact on the language and current
> compilers. I remain unconvinced that this is possible. The fact that
> both the SELF and TS (Typed Smalltalk) designers and implementers use
> variations on customization extensively to obtain their impressive
> performance improvements over standard OO compilation schemes lends
> credence to the notion that customization can play an important
> role in maintaining C++'s usual edge in efficiency compared to other
> OO languages.

It seems to me there's another viable approach to the optimization of leaf
object functions, etc.: making the decisions in the linker rather than the
compiler. The problem with doing this sort of thing in the compiler has
always been the file granularity of compilation in the Unix(tm)/C model of
development, which results in the compiler not being able to guarantee that
it has all the global information it needs to make the decisions.

On the other hand, the linker is in the right place in the pipeline; we
just have to ensure that the information it needs is available. We also
have to make sure that the linker is smart enough, and the object module
format comprehensive enough, to be able to sew the functions and their
calls together properly. It's theoretically possible that a smart enough
linker could even be able to sew an inline version of a virtual function
into a place where it could guarantee the class of object being invoked.

The compiler and linker techniques are complementary, and could easily be
made symbiotic. I'll have to read your paper to be able to comment on how
other aspects of customization fit in, but I wouldn't be surprised if some
of the work could be foisted off on the linker.

> However, one viable alternative is to incorporate customization into a
> metalanguage with associated tools comprising a C++ programmer's
> workbench, rather than into the language proper. I've been exploring
> this too.

Are you proposing something like Demeter, which provides an object
metalanguage that can be compiled into code skeletons for several different
languages? Or a language for controlling the tools separately from
specifying the program they'll transform?

---------------------------------------------------------------------------
NOTE: USE THIS ADDRESS TO REPLY, REPLY-TO IN HEADER MAY BE BROKEN!
Bruce Cohen, Computer Research Lab        email: brucec@tekcrl.labs.tek.com
Tektronix Laboratories, Tektronix, Inc.   phone: (503)627-5241
M/S 50-662, P.O. Box 500, Beaverton, OR 97077
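[To illustrate the simplest such case; this is a sketch, and the Shape
class and the whole-program assumption are supplied here for illustration,
not taken from the post:]

    // shape.h -- included by several translation units
    class Shape {
    public:
        virtual float area() const { return 0; }
    };

    // client.cc -- the compiler must emit an indirect call here, since
    // some other translation unit might define a subclass of Shape
    // that overrides area()
    float f(Shape& s) { return s.area(); }

Only at link time, with every object module in hand, is it known whether
such a subclass exists. If none does, the indirect call through the
virtual table in f() can be rewritten as a direct call to Shape::area(),
and, given a comprehensive enough object module format (say, one that
records an intermediate form of the function body), even replaced by its
inline expansion: `return 0;'.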
glenn@huxley.huxley.bitstream.com (Glenn P. Parker) (08/29/90)
In article <BRUCEC.90Aug27134836@phoebus.phoebus.labs.tek.com>
brucec@phoebus.phoebus.labs.tek.com (Bruce Cohen) writes:

> It seems to me there's another viable approach to the optimization of
> leaf object functions, etc.: making the decisions in the linker rather
> than the compiler.

This is what Eiffel does. The big problem is the n-squared time (it might
not be that bad) it takes to link large projects. A recent article in HOOT
pointed this out as a serious drawback, but I agree that the linker is
generally the right place for this kind of work.

--
--------------------------------------------------------------------------
Glenn P. Parker                    Bitstream, Inc.
uunet!huxley!glenn                 215 First Street
glenn@bitstream.com                Cambridge, MA 02142-1270