druschel@cs.arizona.edu (Peter Druschel) (08/25/90)
Consider the following code (compiled with g++ 1.37.1):
class Base {
public:
    inline virtual void foo() {}
};

class Derived : public Base {
public:
    inline virtual void foo() {}
};

class Container {
public:
    inline void bar(Base &b) { b.foo(); }
};
main()
{
    Derived d;
    Container c;

    d.foo();    /* this call to Derived::foo() is statically resolved
                   and inline expanded */

    c.bar(d);   /* this call to Container::bar() is also statically
                   resolved and inline expanded; the virtual function
                   foo(), called within bar(), however, is invoked
                   indirectly through Derived's virtual table */
}
Note that bar() is inline expanded in main() and the argument d passed
to bar() is known to be of class Derived. Thus, the exact type of the
argument passed to foo() from within bar() is also known. Consequently,
foo() could also be statically resolved and inline expanded.
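As a rough sketch (this intermediate form is illustrative only, not
actual g++ output), inlining bar() into main() conceptually leaves:

    /* body of Container::bar() expanded in place: */
    Base &b = d;    /* static type Base&, but the dynamic type is
                       provably Derived at this call site */
    b.foo();        /* normally a vtable dispatch... */

and since b can only refer to the Derived object d, the dispatch could
be rewritten as the direct, and therefore inlinable, call

    d.Derived::foo();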
It seems to me that with little effort, the compiler could notice
this situation and take advantage of it.
This optimization would be very useful; in many cases, it could even
eliminate the need for parametric container classes.
Usually, the member functions of container classes invoke small virtual
functions of their arguments (e.g. the contained objects). In many cases,
the member functions (of the container class) themselves are short and
can be inlined.
If the compiler was capable of doing the above optimization, it could
eliminate all the virtual invocations. The resulting container class
would be just as efficient as a parametric container class created
specifically for a particular (contained) object class.
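For illustration, a container along these lines might look like the
following (Printable, List, and the member names are made-up examples,
not from any real library):

class Printable {                  /* contained-object interface */
public:
    inline virtual void print() {}
};

class List {                       /* non-parametric container */
public:
    List(Printable **e, int count) : elems(e), n(count) {}
    inline void printAll() {
        for (int i = 0; i < n; i++)
            elems[i]->print();     /* small virtual call on each element;
                                      statically resolvable whenever the
                                      element type is known at the
                                      call site */
    }
private:
    Printable **elems;
    int n;
};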
What do people think about this? Is there any chance this (and other
strategies for exploiting static resolution of virtual functions, as
hinted in the "ideas" file in the distribution) will be implemented
any time soon?
I have thought about making the required modifications to g++, but I
don't know enough about g++ internals to be able to estimate what
would be involved. Some information about how g++ does inlining
and a few hints might encourage me to do it.
--
Peter Druschel Internet: druschel@cs.arizona.edu
Dept of Computer Science
University of Arizona
Tucson, AZ 85721 USA

dl@g.g.oswego.edu (Doug Lea) (08/25/90)
As Paul Vaughan mentioned (thanks!), I've looked into ways of
permitting and performing static resolution, and described my best
thoughts for doing so in the '90 USENIX C++ proceedings. I'll outline
the highlights here.
First, it's important to realize that the goal of static resolution is
NOT merely to replace indirect function calls with direct function
calls. Happily, virtual function calls usually result in very small,
often unnoticeable, performance penalties over direct calls. The goal
is instead to enable PROCEDURAL INTEGRATION, achieved by the inlining
of a typically small number of crucial member functions, which in turn
enables other standard powerful optimization techniques to take place.
For example, the sum(Matrix&) routine listed in my paper, in which all
access functions (rows(), operator()(i, j), etc.) are virtual,
executes less than 2 times faster when virtual calls are replaced by
direct calls, but runs 25 times faster when hand-optimized in a way
that static resolution could automate. (These are on a Sun4/110 using
g++.)
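The paper's actual code isn't reproduced in this post; a routine of the
general shape described might look like this (the Matrix interface and
the signatures are guesses for illustration, not the paper's text):

class Matrix {
public:
    virtual int   rows() const = 0;
    virtual int   cols() const = 0;
    virtual float operator()(int i, int j) const = 0;
};

float sum(Matrix& m)
{
    float s = 0;
    for (int i = 0; i < m.rows(); i++)
        for (int j = 0; j < m.cols(); j++)
            s += m(i, j);   /* each rows()/cols()/operator() call is a
                               virtual dispatch; inlining them, rather
                               than merely devirtualizing them, is what
                               enables the loop optimizations behind
                               the factor of 25 */
    return s;
}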
Factors of 25 are hard to ignore. People (myself included) sometimes
use highly compromised, obscure, and even unsafe design and coding
schemes to avoid such things. My goals in exploring static resolution
lay not so much in getting raw speed per se (since this can almost
always be done now via ugly hacks) but instead in encouraging better
design in the first place, by eliminating much of the motivation for
such compromise. (See my discussions about `Abstract Base Classes'
(ABCs), etc., in the paper.)
After thinking to themselves, ``why can't my compiler just figure
out how to optimize this code all by itself'', most people's first
good idea about how to enable static resolution is to be able to
somehow indicate whether a class is a `leaf' class. Since methods of
such a class could not be overridden, hardwiring would be possible. I
found this unacceptable for several reasons:
1) As Steve Clamage wrote, it disables an important software reuse
idiom in a thoroughly unacceptable manner, since leaf classes
could never be subclassed. Thus, the strategy hurts the
common practice of inheriting mechanism across implementation
classes.
2) Unless ALL leaf classes are required to be designated as such
(and perhaps even still, depending on various other details),
programmers would still have to write out each and every
special leaf class version of polymorphic functions
themselves, even though, as is almost always the case, the
special versions would contain EXACTLY the same C++
statements, just optimized in different ways for different
arguments. This gets out of hand very quickly for functions
with several arguments. Moreover, if a programmer decides to
modify a hierarchy, changing a leaf into a non-leaf class, all
of this work would have to be manually redone.
3) Because of (2), as Steve and others have also pointed out, this
leads to no pragmatic improvement over existing constructs,
since you can now manually invoke subclass routines inside
functions anyway, albeit in a sometimes-ugly and
always-dangerous manner. (Dangerous, since such routines can
fail when new subclasses are introduced. Despite other failings,
specially denoting leaf classes would prevent this.)
The alternative strategy I laid out is to make support for
polymorphism-by-synthesis (as opposed to polymorphism-by-dispatch) a
first-class mechanism in C++. The technique is called `customization',
and is based on work originally described by Ungar, Chambers, and
others developing the SELF compiler. The strategy amounts to alerting
the compiler (via a barely-defensible bastardization/alternative usage
of the `template' keyword) that all specializations of a method or
function should be resynthesized (customized) by the compiler rather
than relying on virtual function calls that otherwise allow the same
executable code to work with each different subclass argument it might
encounter. This can be implemented via an internal
macro-expander-like system driven by a type inferencer.
For example, if there were a base class `Matrix', a function `float
sum(template Matrix& m)', a Matrix subclass `RowMajorMatrix', and
finally client code
RowMajorMatrix a; float s = sum(a);
then a specialized version of `float sum(RowMajorMatrix& m)',
specifically optimized for RowMajorMatrix (and not any subclass
thereof!) would automatically be synthesized to handle the `float
s = sum(a)' call. Of course, there are requirements for backup
strategies, involving continued use of dispatch-based versions for
cases where the exact type of an argument is unknown at compile time,
along with various other niceties described in the paper (as well
as a few that weren't -- write for details if you are interested.)
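Assembled into one block, the proposed usage would read as follows (the
`template' marking on a parameter is the proposed extension, not legal
C++ today; the class bodies are elided for brevity):

class Matrix { /* virtual accessors, as in sum() above */ };
class RowMajorMatrix : public Matrix { /* one concrete layout */ };

float sum(template Matrix& m);  /* marked for customization */

RowMajorMatrix a;
float s = sum(a);   /* compiler synthesizes float sum(RowMajorMatrix&),
                       optimized for RowMajorMatrix exactly; dispatch-
                       based versions remain as the backup for arguments
                       whose exact type is unknown at compile time */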
The most controversial aspects of the proposal are
1) It has surprisingly far-reaching effects on the C++ type system
itself, including a few outright conflicts where static
resolution and dynamic resolution can currently have differing
effects. (Of course, I argue that all such changes are for
the best! Just for one example, it provides a way for
programmers to avoid `slicing' via customized call by value,
which addresses a topic recently discussed here.)
2) Even though several of the most important cases where
customization can be applied are relatively straightforward to
implement, getting the whole thing right doesn't look to be
doable in an afternoon, a week, a month, or perhaps even a
year.
3) It has consequences for C++ environments. Means for programmers
to have some say about tradeoffs of potential code bloat for
efficiency, prospects for integrating customization with
parameterized types, and impact on the separate
compilation model all need to be addressed.
Many people apparently believe that there might be a means to get
similar effects with less impact on the language and current
compilers. I remain unconvinced that this is possible. The fact that
both the SELF and TS (Typed Smalltalk) designers and implementers use
variations on customization extensively to obtain their impressive
performance improvements over standard OO compilation schemes lends
credence to the notion that customization can play an important
role in maintaining C++'s usual edge in efficiency compared to other
OO languages.
However, one viable alternative is to incorporate customization into a
metalanguage with associated tools comprising a C++ programmer's
workbench, rather than into the language proper. I've been exploring
this too.
-Doug
--
Doug Lea, Computer Science Dept., SUNY Oswego, Oswego, NY, 13126 (315)341-2688
email: dl@g.oswego.edu or dl@cat.syr.edu
UUCP : ...cornell!devvax!oswego!dl or ...rutgers!sunybcs!oswego!dl

brucec@phoebus.phoebus.labs.tek.com (Bruce Cohen;;50-662;LP=A;) (08/28/90)
In article <DL.90Aug25123348@g.g.oswego.edu> dl@g.g.oswego.edu (Doug Lea)
writes:

... a description of his proposal for "customization" ...

> Many people apparently believe that there might be a means to get
> similar effects with less impact on the language and current
> compilers. I remain unconvinced that this is possible. The fact that
> both the SELF and TS (Typed Smalltalk) designers and implementers use
> variations on customization extensively to obtain their impressive
> performance improvements over standard OO compilation schemes lends
> credence to the notion that customization can play an important
> role in maintaining C++'s usual edge in efficiency compared to other
> OO languages.

It seems to me there's another viable approach to the optimization of
leaf object functions, etc.: making the decisions in the linker rather
than the compiler.

The problem with doing this sort of thing in the compiler has always
been the file granularity of compilation in the Unix(tm)/C model of
development, which results in the compiler not being able to guarantee
that it has all the global information it needs to make the decisions.
On the other hand, the linker is in the right place in the pipeline; we
just have to ensure that the information it needs is available. We also
have to make sure that the linker is smart enough, and the object
module format comprehensive enough, to be able to sew the functions and
their calls together properly. It's theoretically possible that a smart
enough linker could even be able to sew an inline version of a virtual
function into a place where it could guarantee the class of the object
being invoked.

The compiler and linker techniques are complementary, and could easily
be made symbiotic. I'll have to read your paper to be able to comment
on how other aspects of customization fit in, but I wouldn't be
surprised if some of the work could be foisted off on the linker.

> However, one viable alternative is to incorporate customization into a
> metalanguage with associated tools comprising a C++ programmer's
> workbench, rather than into the language proper. I've been exploring
> this too.

Are you proposing something like Demeter, which provides an object
metalanguage that can be compiled into code skeletons for several
different languages? Or a language for controlling the tools separately
from specifying the program they'll transform?

---------------------------------------------------------------------------
NOTE: USE THIS ADDRESS TO REPLY, REPLY-TO IN HEADER MAY BE BROKEN!
Bruce Cohen, Computer Research Lab        email: brucec@tekcrl.labs.tek.com
Tektronix Laboratories, Tektronix, Inc.   phone: (503)627-5241
M/S 50-662, P.O. Box 500, Beaverton, OR 97077
glenn@huxley.huxley.bitstream.com (Glenn P. Parker) (08/29/90)
In article <BRUCEC.90Aug27134836@phoebus.phoebus.labs.tek.com>
brucec@phoebus.phoebus.labs.tek.com (Bruce Cohen;;50-662;LP=A;) writes:

> It seems to me there's another viable approach to the optimization of
> leaf object functions, etc.: making the decisions in the linker rather
> than the compiler.

This is what Eiffel does. The big problem is the n-squared time (it
might not be that bad) it takes to link large projects. A recent
article in HOOT pointed this out as a serious drawback, but I agree
that the linker is generally the right place for this kind of work.
--
--------------------------------------------------------------------------
Glenn P. Parker          Bitstream, Inc.
uunet!huxley!glenn       215 First Street
glenn@bitstream.com      Cambridge, MA 02142-1270