[comp.lang.c++] Efficiency of Multiple Inheritance

bothner@decwrl.dec.com (Per Bothner) (06/12/89)

If one adds a feature to a language, a general goal is that
a user should only have to pay to cost of the feature when
using it. Adding multiple inheritance to C++ violates this
rule: Now \every/ virtual function call must do an extra addition,
to adjust the 'this' pointer properly. In most cases (including
all cases of single inheritance), the adjustment is zero,
but that cannot be determined at compile time. So when a virtual
function is called, the adjustment is taken from the virtual
function table, added to the 'this' pointer, and the result passed
to the called routine.

An alternative is to have the \called/ routine do the adjustment,
instead of the \caller/. The trick is to have stub routines,
which adjust 'this' and then jump to the real routine.

E.g.:

class A {
   ...
};

class B {
   virtual void f();
   virtual void g();
};

class D : public A, public B {
   virtual void f();
};

Consider the virtual function table used when a D instance
is pointed to by a B pointer. It has two slots:
The g slot points directly to B::g.
However, the f slot points to this compiler-generated stub:

void __D_via_B_f(THIS)
{
    ((D*)((char*)THIS - sizeof(A))->D::f(); // essentially
}

The obvious problem is that this costs an extra stack frame,
copying all the parameters, and an extra procedure call.
However, if you are generating machine instructions directly
(instead of translating to C), you can simplify each stub to two
instructions: an add (of a constant to a parameter register or
stack location), and a jump (to the known address of D::f). (*)
Even if you are translating to C, you can win it your target
C compiler does proper tail-call elimination (which is
required for Scheme compilers, but unusual for C compilers).

Another advantage of this scheme is that the virtual function
table does not need to store the adjustment values, so each
slot remains 32 bits (typically) instead of doubling to 64 bits.

Comments? Am I missing something?

(*) On some machines (including ones made by a well-known mini-computer
manufacturer), the start of a procedure is not its first instruction,
but an entry mask instead. This complicates life. The stub routine
must jump to the first instruction, skipping the entry mask.
More bothersome is that we must make sure that the stub routine
has the same entry mask as the real routine. It is possible to
copy over the entry mask when the program is initialized (using
mechanisms similar to those used to call constructores for static
instances). [The entry mask is not known at compile time, though
it is known at link time. Suitable linker magic could work.]
-- 
	--Per Bothner
Western Software Lab, Digital Equipment, 100 Hamilton Ave, Palo Alto CA 94301
bothner@wsl.dec.com ...!decwrl!bothner

bs@alice.UUCP (Bjarne Stroustrup) (06/12/89)

MI can indeed  be implemented that way. The technique is often referred
to as using `thunks' because a similarity to the technique for inplementing
call by name in Algol60. It has the obvious benefit of saving a memory
reference and an addition in each virtual function call and a word in
every virtual table entry. You don't in general require an extra stack
frame (though certain well known computer manufactures does make life
difficult), a jump suffices on most architectures.

I use the `add a delta' technique in my implementation for portability.
I mention the principle of `not requiring a user to pay for what he/she
doesn't use' in my paper `Multiple Inheritance for C++' (Proc EUUG conference
May 1987, Helsinki and the release notes for release 2.0). I had to
chose between the memory reference + addition cost and adding several
days to the porting time (from a base of a day or so). I chose portability.
Other implementor will have different constrints and goals and may chose
differently.

ark@alice.UUCP (Andrew Koenig) (06/12/89)

In article <1484@bacchus.dec.com>, bothner@decwrl.dec.com (Per Bothner) writes:

> If one adds a feature to a language, a general goal is that
> a user should only have to pay to cost of the feature when
> using it. Adding multiple inheritance to C++ violates this
> rule: Now \every/ virtual function call must do an extra addition,
> to adjust the 'this' pointer properly.

This is an implementation issue, not as a language issue,
as you point out yourself later on...

> An alternative is to have the \called/ routine do the adjustment,
> instead of the \caller/. The trick is to have stub routines,
> which adjust 'this' and then jump to the real routine.

The main problem with your scheme is that it becomes hard to generate
code for these stub functions when they contain s `...'  argument list.
The stub may have to copy the whole argument list, but it doesn't know how
big it is!  There's no problem generating machine language code do it
for most machines, but we haven't been able to figure out how to
generate appropriate C that has any chance of running on more than one
machine.

Therefore cfront takes the porable route despite the slight overhead;
presumably other, more machine-dependent implementations will
choose the most efficient route for their machine.
-- 
				--Andrew Koenig
				  ark@europa.att.com