[comp.lang.c++] Efficient compilation of virtual functions

nelson@suna2 (Taed Nelson) (08/23/90)

I was wondering how intelligent the C++ compilers (I have the newest
  releases of G++ an CFront) were with respect to virtual functions.  It
  appears that the answer is "not very."

If we have a virtual member function in the base class, and the same member
  function is redeclared in the subclass, whenever we are using an instance
  of the subclass, there is _no need_ to consult the virtual function table
  since we will _always_ call the subclass function.  In addition, there
  are other cases where we do not have to consult the virtual function
  table.

Unfortunately, I could get neither of the compilers to avoid looking at
  the virtual function table in such a clear-cut case.  This is important
  for a number of reasons (to me especially):
	1.  Virtual lookup is comparatively expensive.
	2.  We are unable to inline virtual functions.

To examine the behavior of the following code, I used the following lines:
	g++ -O -finline-functions -S xxx
	CC -O -S xxx

Am I missing something?  Shouldn't this be an easy optimization for these
  compilers to make?  Anyone know of a way to get around this limitation?

In the following code,
	subclassPointer->mutate() should DEFINITELY be inlined/non-looked-up
	and the other two could be since there are no other uses of the ptrs.

--------------

int number = 0;		// so that no optimizations can be done

class base
{
public:
	virtual void mutate () { number = 1; };
};

class subclass : public base
{
public:
	void mutate () { number = 2; };
};


main ()
{
	base *basePointer = new base;
	subclass *subclassPointer = new subclass;
	base *indirectPointer = new subclass;

	basePointer->mutate();
	subclassPointer->mutate();	// this should be optimized/inlined
	indirectPointer->mutate();
}
------------------

pejn@wolfen.cc.uow.oz (Paul Nulsen) (08/23/90)

nelson@suna2 (Taed Nelson) writes:

> ...
>If we have a virtual member function in the base class, and the same member
>  function is redeclared in the subclass, whenever we are using an instance
>  of the subclass, there is _no need_ to consult the virtual function table
>  since we will _always_ call the subclass function. 
> ...
>	base *basePointer = new base;
>	subclass *subclassPointer = new subclass;
>	base *indirectPointer = new subclass;

>	basePointer->mutate();
>	subclassPointer->mutate();	// this should be optimized/inlined
>	indirectPointer->mutate();

This is not as straightforward as you suggest: subclass may have another
class derived from it, in which case subclassPointer->mutate() is ambiguous
and must be resolved via the virtual function table.

The problem then is that the compiler needs to know that subclass has
nothing derived from it. This is not possible with separate compilation.

Paul Nulsen
pejn@wolfen.cc.uow.edu.au

amanda@iesd.auc.dk (Per Abrahamsen) (08/24/90)

>>>>> On 22 Aug 90 23:40:48 GMT, pejn@wolfen.cc.uow.oz (Paul Nulsen) said:

Paul> The problem then is that the compiler needs to know that subclass has
Paul> nothing derived from it. This is not possible with separate compilation.

It would be nice to be able to declare a class or even a member
function as "leaf". Even if we disregard the possible optimizations,
it would add to the programmers ability to document the classes.

jean@paradim.UUCP (Jean Pierre LeJacq) (08/24/90)

In article <1990Aug22.193347.18486@ux1.cso.uiuc.edu>, nelson@suna2 (Taed Nelson) writes:
> If we have a virtual member function in the base class, and the same member
>   function is redeclared in the subclass, whenever we are using an instance
>   of the subclass, there is _no need_ to consult the virtual function table
>   since we will _always_ call the subclass function. ...
> 
> class base
> {
> public:
> 	virtual void mutate () { number = 1; };
> };
> 
> class subclass : public base
> {
> public:
> 	void mutate () { number = 2; };
> };
> 
> main ()
> {
> 	subclass *subclassPointer = new subclass;
> 	subclassPointer->mutate();	// this should be optimized/inlined
> }

There are two ways of preventing use of the virtual mechanism: call
the member function with an object and not a pointer;

        subclass S;
        S.mutate();

or call the member function using explicit qualification.

        subclass S;
        subclass * SP = new subclass;
        S.base::mutate();
        SP->subclass::mutate();

In both cases you are telling the compiler that you know the class
of the object. In general a member function is declared virtual
exactly for situations where the class of the object is unknown.
That is, you require polymorphism.

In your example an optimizing compiler with sufficient knowledge
of control flow can prevent use of the virtual mechanism. It
appears that your compilers are not smart enough.

rpk@rice-chex.ai.mit.edu (Robert Krajewski) (08/24/90)

In article <1990Aug22.193347.18486@ux1.cso.uiuc.edu> nelson@suna2.UUCP (Taed Nelson) writes:
>
>If we have a virtual member function in the base class, and the same member
>  function is redeclared in the subclass, whenever we are using an instance
>  of the subclass, there is _no need_ to consult the virtual function table
>  since we will _always_ call the subclass function.

This does not obtain if you allow separate compilation.  A virtual function
call can only be resolved statically (that is, by the compiler), if
it knows that the subclass really lies at the end of the tree (i.e.,
none of *its* subclasses implements the same virtual function).

A linker could do this sort of thing if it was given the entire class
tree to look at during the link process.  A compiler could do it if
you told it that the class tree that it knew about was the whole
enchilada.

Robert P. Krajewski
Internet: rpk@ai.mit.edu ; Lotus: robert_krajewski.lotus@crd.dnet.lotus.com

rfg@NCD.COM (Ron Guilmette) (08/24/90)

In article <6428@wolfen.cc.uow.oz> pejn@wolfen.cc.uow.oz (Paul Nulsen) writes:
+nelson@suna2 (Taed Nelson) writes:
+
+> ...
+>If we have a virtual member function in the base class, and the same member
+>  function is redeclared in the subclass, whenever we are using an instance
+>  of the subclass, there is _no need_ to consult the virtual function table
+>  since we will _always_ call the subclass function. 
+> ...
+>	base *basePointer = new base;
+>	subclass *subclassPointer = new subclass;
+>	base *indirectPointer = new subclass;
+
+>	basePointer->mutate();
+>	subclassPointer->mutate();	// this should be optimized/inlined
+>	indirectPointer->mutate();
+
+This is not as straightforward as you suggest: subclass may have another
+class derived from it, in which case subclassPointer->mutate() is ambiguous
+and must be resolved via the virtual function table.
+
+The problem then is that the compiler needs to know that subclass has
+nothing derived from it. This is not possible with separate compilation.

This whole issue is rather interesting.

If I may, I'd like to refine Paul's final statement to say that "What the
compiler need to know is that it is not possible for the given pointer to
point to an object of a yet more derived class."

Paul is correct that this can be a virtually impossible thing for the
compiler to know in cases where the pointer variable itself is either
(a) a file-scope variable with external linkage, or (b) a parameter
to the current function whose value is being passed in from God knows
where.

Note however that in the example given by Taed, the pointer variable in
question was local to the function, and a simple bit of data flow analysis
could have helped the compiler to establish (unambiguously) that the given
pointer variable would *never* take on a value which pointed to any "more
derived" class.

Many (most?) optimizing compilers do perform at least some flow analysis,
so it would seem that de-virtualizing some calls on some virtual functions
should be both possible and reasonable.

This looks like it could be a potentially fertile type of C++ specific
optimization which future C++ compilers may implement.  On the other hand
(and until the day when lots of optimizing C++ compilers do this) it might
be useful (for time critical things) to provide both virtual and non-virtual
member functions (which do the same thing, but which have slightly different
names) in derived classes (calling the non-virtual one only when you are
sure that the non-virtual one will do what is needed).

Humm... sounds bad stylistically.... too error prone.  OK.  Forget I said that.
:-)
-- 

// Ron Guilmette  -  C++ Entomologist
// Internet: rfg@ncd.com      uucp: ...uunet!lupine!rfg
// Motto:  If it sticks, force it.  If it breaks, it needed replacing anyway.

steve@taumet.com (Stephen Clamage) (08/24/90)

amanda@iesd.auc.dk (Per Abrahamsen) writes:

|>>>>> On 22 Aug 90 23:40:48 GMT, pejn@wolfen.cc.uow.oz (Paul Nulsen) said:

|Paul> The problem then is that the compiler needs to know that subclass has
|Paul> nothing derived from it. This is not possible with separate compilation.

|It would be nice to be able to declare a class or even a member
|function as "leaf". Even if we disregard the possible optimizations,
|it would add to the programmers ability to document the classes.

I don't think I agree with this.  If you declared a leaf node, you 
would be prohibiting further reuse of the class by deriving from it.
How do you know that another object with slightly more specialized
behavior will never be needed in the future?

Someone needing this capability while also using your code would be
required to duplicate the existing code with new names, then create
conversion functions to your old class hierarchy.  This seems like a
steep price to pay for some special-case micro-optimization.

I believe the original poster wanted to optimize calls to a virtual
function of the current class.  While this is potentially dangerous,
the language already provides the mechanism to do this.  Example:

class C {
    ...
    virtual foo();
    bar();
};

C::bar()
{
    ...
    foo();	// virtual call
    ...
    C::foo();	// not a virtual call
    ...
}

In the second call, the compiler must make a direct call to C::foo, with
no virtual function call overhead.
-- 

Steve Clamage, TauMetric Corp, steve@taumet.com

vaughan@mcc.com (Paul Vaughan) (08/25/90)

Check out Doug Lea's article in the 1990 Usenix Conference on C++ for
some well reasoned suggestions about possible C++ optimizations.


 Paul Vaughan, MCC CAD Program | ARPA: vaughan@mcc.com | Phone: [512] 338-3639
 Box 200195, Austin, TX 78720  | UUCP: ...!cs.utexas.edu!milano!cadillac!vaughan

amanda@iesd.auc.dk (Per Abrahamsen) (08/26/90)

>>>>> On 24 Aug 90 15:59:02 GMT, steve@taumet.com (Stephen Clamage) said:

Per> It would be nice to be able to declare a class or even a member
Per> function as "leaf". Even if we disregard the possible optimizations,
Per> it would add to the programmers ability to document the classes.

Stephen> I don't think I agree with this.  If you declared a leaf node, you 
Stephen> would be prohibiting further reuse of the class by deriving from it.

I think the class designer should be able to specify which classes can
be reused by deriving from them, in the same way as the class designer
can specify which member functions can be used by an derived class
(protected and public) and which can not (private).

Stephen> How do you know that another object with slightly more specialized
Stephen> behavior will never be needed in the future?

In my programs, such a situation would indicate the need for a new
abstract class, which provided the functionality shared by both
classes. 

Stephen> Someone needing this capability while also using your code would be
Stephen> required to duplicate the existing code with new names, then create
Stephen> conversion functions to your old class hierarchy.

Unless he is allowed to introduce a new abstract class. 

Stephen> This seems like a steep price to pay for some special-case
Stephen> micro-optimization.

The only price to pay for people who do not like that way of
programming, would be yet another feature of the language to ignore,
in a language which are already to full of such features.

jimad@microsoft.UUCP (Jim ADCOCK) (08/28/90)

In article <1311@lupine.NCD.COM> rfg@NCD.COM (Ron Guilmette) writes:
>Note however that in the example given by Taed, the pointer variable in
>question was local to the function, and a simple bit of data flow analysis
>could have helped the compiler to establish (unambiguously) that the given
>pointer variable would *never* take on a value which pointed to any "more
>derived" class.
>
>Many (most?) optimizing compilers do perform at least some flow analysis,
>so it would seem that de-virtualizing some calls on some virtual functions
>should be both possible and reasonable.

An obvious first step is to get compilers to understand that operator new
really is a special member function, so that in the common case of:

	BASE* pb = new DIRV;

new is known by the compiler to return a ptr to an object of exactly DIRV type.
Compilers need to understand that special member functions are special, and
not just some other function.

Another "obvious" optimization that "C++" compilers need to understand is
the difference between stack and heap, so that this==0 ? type tests are
not generated for objects created on the stack.  Surprisingly, many C 
compilers do not recognize that objects on the stack must not have null
addresses.  Again, C++ optimization issues are different from C optimization
issues.

jimad@microsoft.UUCP (Jim ADCOCK) (08/28/90)

In article <417@taumet.com> steve@taumet.com (Stephen Clamage) writes:
|amanda@iesd.auc.dk (Per Abrahamsen) writes:
|
||>>>>> On 22 Aug 90 23:40:48 GMT, pejn@wolfen.cc.uow.oz (Paul Nulsen) said:
|
||Paul> The problem then is that the compiler needs to know that subclass has
||Paul> nothing derived from it. This is not possible with separate compilation.
|
||It would be nice to be able to declare a class or even a member
||function as "leaf". Even if we disregard the possible optimizations,
||it would add to the programmers ability to document the classes.
|
|I don't think I agree with this.  If you declared a leaf node, you 
|would be prohibiting further reuse of the class by deriving from it.
|How do you know that another object with slightly more specialized
|behavior will never be needed in the future?
|
|Someone needing this capability while also using your code would be
|required to duplicate the existing code with new names, then create
|conversion functions to your old class hierarchy.  This seems like a
|steep price to pay for some special-case micro-optimization.

C++ already provides a way for a class designer to prevent others from
deriving from it:  Just make the constructors private.  Given that C++
already gives class designers the right to specify that no class can
be derived from a given class, should compilers be allowed to optimize
on that class's leaf-ness?  Should there be a friendlier way to specify
leaf-ness rather than have to specify constructors private?

---

Besides leaf-ness, there are other good reasons why a class designer might
want to prohibit direct derivation.  For example, it might be known that
the base class's member variable list might not be very stable, so that
it is desirable to be able to change the size of the base class without
recompiling all the subclasses.  At the cost of a new, and a pointer-
indirection, the derived class can inherit from a reference-to-the-base-class
class, maintaining identical usage syntax, but allowing the size of the
base class implementation to grow without recompiling the derived classes.

sarima@tdatirv.UUCP (Stanley Friesen) (08/28/90)

In article <AMANDA.90Aug25200023@ra.iesd.auc.dk> amanda@iesd.auc.dk (Per Abrahamsen) writes:
 
>Stephen> How do you know that another object with slightly more specialized
>Stephen> behavior will never be needed in the future?
 
>In my programs, such a situation would indicate the need for a new
>abstract class, which provided the functionality shared by both
>classes. 

	This is quite fine for a situation where the whole project is being
done in one organization.  It fails big where a large part of the project
is in the form of libraries from other groups.   In order to introduce a new
abstract class above an existing class, the source code to the leaf class is
required, so it can be recoded as a derived class of the new base class.
(Note this requires modification of the implementation of the leaf class).

	So, if you want to do this, do NOT ever put such a leaf class in a
library that is distributed to anyone else.  This way the users of the
libarary will not be limited in how they use it, by being required to
duplicate its functionality elsewhere.

glenn@huxley.huxley.bitstream.com (Glenn P. Parker) (08/29/90)

In article <56971@microsoft.UUCP> jimad@microsoft.UUCP (Jim ADCOCK) writes:
> C++ already provides a way for a class designer to prevent others from
> deriving from it:  Just make the constructors private.  Given that C++
> already gives class designers the right to specify that no class can
> be derived from a given class, should compilers be allowed to optimize
> on that class's leaf-ness?  Should there be a friendlier way to specify
> leaf-ness rather than have to specify constructors private?

The question was about "leaf" classes, right?  Private constructors alone
do not imply leaf-ness because they do not completely prevent derivation.
A child class may be a friend of the class with private constructors.  The
only thing private constructors do is prevent users from directly creating
instances of the class.

This technique is useful when you have a class that must be visible to the
outside world, but which must be instantiated in a context controlled by
the programmer (either through a derived class, or through a friend class
or friend function).  Even if this could be construed as implying
leaf-ness, it seems that it would make leaf classes pretty useless, except
as an obscure optimization.  The user would be unable to create instances
of the leaf class without going through some contrived protocol.
--
--------------------------------------------------------------------------
Glenn P. Parker        Bitstream, Inc.
uunet!huxley!glenn     215 First Street
glenn@bitstream.com    Cambridge, MA 02142-1270

adamsf@turing.cs.rpi.edu (Frank Adams) (09/01/90)

In article <AMANDA.90Aug23194233@ra.iesd.auc.dk> amanda@iesd.auc.dk (Per Abrahamsen) writes:
>It would be nice to be able to declare a class or even a member
>function as "leaf". Even if we disregard the possible optimizations,
>it would add to the programmers ability to document the classes.

I proposed exactly this about a year ago in this forum.  The only response
I got was a note from Stroustrap, saying in effect that he was unwilling to
put (more) pure performance hacks in the language.

(Actually, I proposed declaring a member function or even a class as "leaf".
The former seems more natural to me.)

An alternative approach might be to use a pragma:

#pragma leaf foo	// class foo is a leaf
#pragma leaf foo::bar()	// method bar in class foo is a leaf

I'm not sure what the C standard says about pragmas (I remember some language
like "cannot change the semantics of the program", but I don't know if that
made the final draft.)  In any event, this isn't C.

It seems to me that a reasonable constraint of this sort on pragmas is that
the pragma cannot be allowed to change an incorrect program into a correct
program.  Except for the fact that compiling and running a program does not
find all errors, this is equivalent to saying that a program that works with
the pragma will still work if it is removed/ignored.
-----------

Another approach would be to allow "exact" or "non-virtual" pointers.  Then

foo exact * p1;
foo * p2;

would declare p1 a pointer specifically to a foo, while p2 would point to
an instance of foo or one of its subclasses.  Exact pointers can be converted
freely to ordinary pointers to the class (and thence to superclasses), but
the reverse conversion should require a type-cast.

This puts a bit more of the burden on the programmer than the "leaf"
declaration, since each pointer must be so annotated, not just the class.
On the other hand, it is applicable in cases where the former is not; I
might know that a particular pointer is to a foo (perhaps because I just
created it), even if foo does have subclasses.

Another use for an exact keyword might be for member declarations: one can
declare a member function which is not inherited by subclasses.

class foo {
	int i;
public:
	virtual void zap() {i = -1;}
	exact void zap() {i = 0;}
}

Then foo::zap() would set i to -1, but the method for foo inherited by
subclasses would set i to 0.  This particular example is contrived, but
there have been times when I have wanted to do such a thing.  (Creating
an abstract class above foo is an alternative way to accomplish the same
end.)

amanda@iesd.auc.dk (Per Abrahamsen) (09/01/90)

>>>>> On 31 Aug 90 22:09:16 GMT, adamsf@turing.cs.rpi.edu (Frank Adams) said:

amanda>It would be nice to be able to declare a class or even a member
amanda>function as "leaf". Even if we disregard the possible optimizations,
amanda>it would add to the programmers ability to document the classes.

adamsf> I proposed exactly this about a year ago in this forum.  The
adamsf> only response I got was a note from Stroustrap, saying in
adamsf> effect that he was unwilling to put (more) pure performance
adamsf> hacks in the language.

Does nobody else regard the ability to declare something as "leaf" as
anything more than a performance hack?  Or are performance hacks the
only thing which really interest C++ programmers?

I regard "leaf" to serve the same function as "private", "protected",
and "const": They all provide valuable information to the person who
is maintaining or (in the case of a class library) using the code.
They also, as an extra bonus, sometimes allow the compiler to produce better
code.

On the down side, they also make it more difficult to do
"unconstrained reuse" of existing code, as known from Smalltalk.  They
trade flexibility for reliability, and put a large burden upon the
class designer.

mat@mole-end.UUCP (Mark A Terribile) (09/03/90)

> |Paul> The problem then is that the compiler needs to know that subclass has
> |Paul> nothing derived from it. This is not possible with separate compilation.
> 
> |It would be nice to be able to declare a class or even a member
> |function as "leaf". Even if we disregard the possible optimizations,
> |it would add to the programmers ability to document the classes.
> 
> I don't think I agree with this.  If you declared a leaf node, you 
> would be prohibiting further reuse of the class by deriving from it.
> How do you know that another object with slightly more specialized
> behavior will never be needed in the future?

Well, from the point of view of the developer who is trying to build
a planned, engineered, and maintainable product, this inability may
indeed be a good thing.  All that is necessary in your case is to
create an intermediate class from which further derivation is possible,
and a leaf class with no changes from the `derivable' class.

An alternative would be to block the derivation of all classes except
those declared in the `leaf' declaration.

If you DON'T know what does or does not derive from a class, your ability
to alter it once the product is released is very limited, just as you must
be careful not to alter a global variable upon which anything may depend.

In the real world, some degree of special care is needed to design and
create a class with a good protected interface.  Inheritance is an
engineering tool, not a means to avoid engineering.  Dare I say that
there is `no silver bullet'?  This is especially true for a language
that is meant for large projects.  SmallTalk may have been meant for
rapid prototyping or for teaching; C++ seems better as a way to move
some of the mid-level specs and architectural design into the code itself.
 
> Someone needing this capability while also using your code would be
> required to duplicate the existing code with new names, then create
> conversion functions to your old class hierarchy.  This seems like a
> steep price to pay for some special-case micro-optimization.

Why the conversion functions?  Yes, you would have to duplicate code, but
you could still derived the class from the leaf class's ultimate base
class(es).
 
> I believe the original poster wanted to optimize calls to a virtual
> function of the current class.  While this is potentially dangerous,
> the language already provides the mechanism to do this.  Example:
  . . .
> C::bar()
> {
>     ...
>     foo();	// virtual call
>     ...
>     C::foo();	// not a virtual call

Actually, I believe that the point was to allow the compiler to perform
the optimization.  Forcing it on the programmer is unsafe; as much as I
hate duplicated code, I would prefer my efficiencies to be properly
engineered rather than either hacked in as a hard-to-find derivation or
built up like a house of cards in my code.
-- 

 (This man's opinions are his own.)
 From mole-end				Mark Terribile

johnb@srchtec.UUCP (John Baldwin) (09/07/90)

In article <7&`%1J-@rpi.edu> adamsf@turing.cs.rpi.edu (Frank Adams) writes:
  [referring to how to inform the compiler something is at the end
   of the inheritance tree]...
>
>Another approach would be to allow "exact" or "non-virtual" pointers.  Then
>
>foo exact * p1;
>foo * p2;
>
>would declare p1 a pointer specifically to a foo, while p2 would point to
>an instance of foo or one of its subclasses.  Exact pointers can be converted
>freely to ordinary pointers to the class (and thence to superclasses), but
>the reverse conversion should require a type-cast.

This seems like a very appropriate addition to C++.  It is a lot like other
"type modifiers" (const, etc.) and has a very "C++ -ish flavor" to it.
That is, it can both be used to inform the compiler of cases where
additional optimizations may be performed, and it also has the effect
of allowing a greater degree of control over the code.

I would view the "exact" modifier as an additional control-of-scope,
especially if it could be applied to member functions in the same way
that "virtual" is.

I'm not sure it would be a good idea to allow both directions of conversion
as mentioned above by Mr. Adams.  To me, "exact" is of a constraining
nature, like "const."   It's generally not considered to be a good idea
to be able to cast a const-something to a non-const-something and then
change it.       :-)

Where else could the hypothetical keyword "exact" be used to good effect?

-- 
John T. Baldwin                      |  johnb%srchtec.uucp@mathcs.emory.edu
Search Technology, Inc.              | 
                                     | "... I had an infinite loop,
My opinions; not my employers'.      |  but it was only for a little while..."