[comp.std.c++] Proposed implementation of contravariance on function return type

cok@islsun.Kodak.COM (David Cok) (04/01/91)
In a recent post (<1991Mar25.220712.71518@microsoft.UUCP>) Jim Adcock claimed 
that contravariance on function return type causes costs for all functions, 
not just contravariant ones.  In this article I dispute that by providing an 
implementation which
	-- adds cost only to functions which use contravariance
	-- the additional cost is very minimal (an add or two per call of
		a contravariant function) 
	-- fits in directly with current typical implementations of virtual
		functions
	-- works in the environment of separately compilable source files

If implementation problems are what is preventing this concept from being
included at the ANSI level, someone please tell me whom to contact.  I firmly
believe that adding contravariance on return type adds simplicity and elegance
to programs which need it, and that the need is not uncommon: I can't see how
to write a sensible clone function in C++ without it (by sensible I mean 
without using casts or an additional hidden helper function and even those do
not work for clone in MI where more than one base class defines the same 
function).

BACKGROUND

Consider a chain of classes D0, D1, ... Dn ... where Di is derived from D(i-1)
and perhaps from other classes (i.e. MI is allowed here), and that each class
in the chain defines a virtual function f with the same argument types.  Then
function Di::f overrides D(i-1)::f, and Di::f is executed on an object of
class Di even if it is accessed by means of a pointer or reference to Dj  (j<i).
This is conventional inheritance.

The current rules (ARM 10.2 p.208 "It is an error for a derived class function
to differ from a base class' virtual function in return type only.") require
that the return types of the various functions Di::f must all be the same.  The
proposal of "contravariance on function return type" means (to me) to relax
this rule to read something like

"The return type of the derived class function must be either
	a) the same as the return type of the base class function, or
	b) be a pointer to a class X, where the base class function returns a
	   pointer to a class Y and X is derived (perhaps indirectly) from Y, or
	c) be a reference to a class X, where the base class function returns a
	   reference to a class Y and X is derived (perhaps indirectly) from Y."

Note that the argument types of all the functions must still be the same.  There
have been proposals to relax that rule, but I am not proposing that.

Case a above is the current rule.  No changes are necessary to implement those
virtual functions.  Case b is discussed below and case c is completely 
analogous to case b.

In general we do not have a chain of classes but a DAG of classes which 
inherit the function f from each other.  Two complicating situations are
(1) a class may inherit the function f from more than one of its base classes
and (2) there may be virtual base classes involved among the function return
types.


SEPARATE COMPILATION

Separate compilation requires that some parts of code be able to be compiled 
without knowledge of other parts.  In particular, the entire base class D0 and
the definitions of its functions, including f, must be able to be compiled
without knowledge of any of the classes derived from it.

However, a derived class Di inherits from D(i-1) and indirectly from all the
Dj with j<i.  The declaration of class Di is not legal unless there have been
prior declarations of the Dj (j<i) within the same source file (commonly
by means of include files).  Thus when the compiler is compiling the declaration
of Di, it can know whether there is any case of contravariance in Di or the
Dj from which Di inherits.

Similarly, any use or definition of Di::f must be preceded by the declarations
of Di and Dj (j<i), so it can be known when Di::f is used or defined that there
is contravariance to be dealt with.  However, when f is invoked on a D0*, it is
not known at compile time which Di D0* points to.

The corollary to this is that there can be  no change to the implementation of
the class Di and the syntactic use or definition of the virtual function Di::f
so long as the return type of Di::f is the same as the return type of D0::f 
for j<i.  Any changes may only be in the chain at and after the point of the
first contravariant function.

IMPLEMENTATION

The implementation will be presented in full generality first.  Rest assured 
that, as the examples which follow demonstrate, simple cases actually turn out 
to have simple modifications.  My first attempt at this fell into the trap of
a case analysis where the cases were situations of increasing complexity.  I
realized that the solutions for all situations were special cases of one
general set of rules, which made for one elegant prescription for implementing
contravariance.  Unfortunately, the rationale for the general case is not so
obvious until one works through some examples.

We are concerned with the DAG of classes which inherit the virtual function
f from each other.  For a class c, the return type of c::f is denoted by Rc*.
I will say that c multiply-inherits f if c inherits f from more than one of
its base classes.  I will say that class a is an ancestor of class c if c
is derived from a, possibly through other intermediate classes.  Class c is
virtually-inherited from a if at least one of the classes in the inheritance
chain from a to c is a virtual base class of the next class on the chain.

For classes b and c within the DAG for function f,
if b is a base class of c, then Rb is equal to or an ancestor of Rc (by the
rules for contravariance).  Then if b is an ancestor of c, Rb is equal to
or an ancestor of Rc.

Recall that each object of class c contains within it all its ancestors,
and that for each ancestor a there is a (in typical implementations) virtual
function table which indicates, for all the functions inherited from a, which
actual function to call on the object of class c.  In practice, many of these
virtual function tables are shared and even the pointers to the virtual
function tables are shared.  The problem in contravariance begins with the
fact that the functions d::f for the descendants d of a class a return Rd* so
the function dynamically called via an_a->f() on a pointer to an object of
class a returns a variable type.  The implementation described below corrects
this problem by changing the return types to something fixed, namely Ra'*, and
then using (automatically introduced) casting in the program text to bring
the return type back to what the programmer originally specified.

The implementation of contravariance consists in making the following 
modifications to the programs and the virtual function tables.

For each class c in the DAG for f, choose a class Rc' which satisfies
	a) Rc' is equal to or an ancestor of Rc
	b) Rc is not virtually-inherited from Rc'
	[ I give below a prescription for the Rc' which optimizes the
	  implementation]

Then make the following modifications:

1) Change the return value of c::f from Rc* to Rc'*.  This is a legal cast
since Rc' is equal to or an ancestor of Rc.  In implementation it is a simple
add of a compile-time constant to a pointer.  The cast is unnecessary if Rc
is the same as Rc'.

2) Change the virtual function table entry for function f of class a in an
object of class c: ordinarily it would be &c::f (which has been modified in 
rule 1 to return a  pointer to the Rc' part of an Rc object; instead make the
vtable entry point to the thunk   (Ra'*)(Rc*)c::f
[ Before you get worried, I'll promise that most of the time this will be
equal to &c::f and the vtable will not have to be changed.]
Since c::f, as modified by rule 1, returns a pointer to the Rc' part of an 
Rc object and Rc is not virtually-inherited from Rc', the (Rc*) cast is ok;
since Ra' is equal to or an ancestor of Ra, which is equal to or an ancestor of
Rc, the (Ra'*) cast is legal.  If Ra' is equal to Rc' the casts do nothing 
and &c::f can be left in the vtable.

3) Change every use of f in an expression as follows:  Let c be the STATIC
class of the object to which f is being applied (after applying this rule to
the expression to which f is being applied, if appropriate); then cast the 
result of applying f with the cast (Rc*).  This makes sure that the type of 
the final result of applying f to an object of class c is Rc*, as the original
declaration of the function called for.  For example if c_ptr is statically 
declared to be a variable of class c*, then (c_ptr->f()) becomes 
((Rc*)(c_ptr->f())); also c_ptr->f()->f() becomes
		((Rd*) ( ((Rc*)(c_ptr->f())) -> f() ) )
where d == Rc.  This cast is again a simple add of a compile time constant
to the pointer which is returned by the function f.  If the function f is
applied to an object of class c, the thunk (Rc'*)(Rc*)c::f is invoked which
returns a pointer to the Rc' part of an Rc object.  If the function f is 
applied to a pointer to the c part of a d object, the thunk (Rc'*)(Rd*)d::f is 
invoked, which returns a pointer to the Rc' part of a Rd object (c being an
ancestor of d).  In either case the cast from Rc'* to Rc* is ok and
accomplishes what the user intended, namely that c_ptr->f() should return an
Rc*.  This cast is unnecessary if Rc is the same as Rc'.
NOTE: If f is used by having its address taken, a new function will have to
be created which implements (Rc*)f, and its address taken instead.

How should the Rc' be picked?  The criteria are (1) to require no changes
to functions or vtables in cases (given separate compilation) where it is not
known that contravariance exists, (2) to avoid the necessity for thunks when
possible, and (3) to retain sharing of virtual function tables and pointers.
Criterion 1 is required.  Criteria 2 and 3 can be accomplished in all cases in 
which f is not multiply inherited and there are no virtual base classes 
involved; in particular, if there is no MI (and thus no need for virtual base 
classes) no thunks are necessary and this scheme could be implemented by a 
front-end to CC (that is by a CCfront!).  The key is to choose Rc' to be equal 
to Ra' in rule 2 above.  The following procedure accomplishes that.


	Beginning at the top (most Base-ic classes) of the DAG for the function
	f, choose Rc' as follows:

	Case 1: c does not inherit f from any of its base classes (i.e. is a 
	   leaf of the DAG for f): then choose Rc' = Rc.  [This rule is 
	   necessary since when the compiler has seen only this class, it does 
	   not know that there is any contravariance in f, so it will know 
	   neither to apply rule 1 to the definition of f nor to apply rule 3 
	   to the uses of f.]

	Case 2: c inherits f from some number of base classes b1, b2 ,...:
	a) If Rc is virtually-inherited from all of the Rbi', then Rc' = Rc.
	b) If Rc is non-virtually inherited from at least one Rbi', then
		Rc' = Rb' where b is the first class in the list of base classes
		for c for which Rc is not virtually-inherited from Rb'.
		[ The reason for picking the class nearest the beginning of the
		list of base classes for c is that if b is actually the first
		class in the list, the virtual function table for a b object
		within a c object can still be shared with the vtable for the
		c object.]
	NOTE: It does not matter whether there are virtual base classes among
	the b1..bn, just whether there are virtual base classes involved in the
	function return types.

EXAMPLE 1: No multiple-inheritance of f and no virtual base classes.

Consider this example of a clone function in a simple hierarchy.
(Assume that neither X nor Y declare or inherit a virtual member function f.)

class D0 {
	D0();
	D0(const D0&);
	virtual D0* clone() { return new D0(*this); }
};

class D1: public X, public D0 {
	D1();
	D1(const D1&);
	virtual D1* clone() { return new D1(*this); }
	virtual D0* up() { return (D0*)this;}
};

class D2: public Y, public D1 {
	D2();
	D2(const D2&);
	virtual D2* clone() { return new D2(*this); }
	virtual D1* up() { return (D1*)this;}
};

Then a program could include these statements and expressions:
	D0 d0; D1 d1; D2 d2;
	D0* d0p; D1* d1p; D2* d2p;

	d0.clone(); // returns a pointer to a cloned D0
	d1.clone(); // returns a pointer to a cloned D1
	d0p = &d1;   // a pointer to the D0 part of a D1
	d1p = &d1;   // a pointer to a D1
	d0p->clone(); // returns a pointer to the D0 part of a cloned D1 
			// 					(copy of d1)
	d1p->clone(); // returns a pointer to a cloned D1 (copy of d1)
	d0p = &d2;   // a pointer to the D0 part of a D2
	d1p = &d2;   // a pointer to the D1 part of a D2
	d2p = &d2;   // a pointer to a D2
	d0p->clone(); // returns a pointer to the D0 part of a cloned D2 
			// 					(copy of d2)
	d1p->clone(); // returns a pointer to the D1 part of a cloned D2 
			// 					(copy of d2)
	d2p->clone(); // returns a pointer to a cloned D2 (copy of d2)
	d2p->up(); // returns a pointer to the D1 part of d2
	d2p->up()->up(); // returns a pointer to the D0 part of d2
	d1p->up(); // returns a pointer to the D0 part of d2

For clone():
We have R0 = D0, R1 = D1, and R2 = D2.  The rules for choosing the R' dictate
that R0' = R0 = D0, R1' = R0' = D0, and R2' = R1' = D0.  
For up():
R1 = D0, R2 = D1.  R1' = R1 = D0, R2' = R1' = D0.
Indeed, for any hierarchy for a contravariant function without virtual base 
classes and without the function being multiply-inherited, all the R' will be 
equal to the return type of the most basic class.

The application of the implementation does this:
rule 1) in the definition of each clone function, change the result type to D0*.
	in the definition of each up function, change the result type to D0*.
rule 2) since Ra' = Rc' for any combination of classes, no changes to the
	virtual function tables are needed and no thunks are introduced,
rule 3) in each use of the clone function, cast the result returned by the
	clone function to the static type of the object to which it is applied.
	in each use of the up function, cast the result returned by the up
	function to the base class of the static type of the object to which it
	is applied.

For the example given above, each clone and up function would be modified to 
return a (D0*), and the subsequent code would be modified to be this:

	d0.clone(); // returns a pointer to a cloned D0
	((D1*)(d1.clone())); // returns a pointer to a cloned D1
			// d1.clone() returns a pointer to the D0 part of a D1
			// which is then cast back to a pointer to the D1
	d0p = &d1;   // a pointer to the D0 part of a D1
	d1p = &d1;   // a pointer to a D1
	d0p->clone(); // returns a pointer to the D0 part of a cloned D1 
			// 					(copy of d1)
	((D1*)(d1p->clone())); // returns a pointer to a cloned D1 (copy of d1)
			// d1p->clone() returns a pointer to the D0 part of a D1
			// which is then cast back to a pointer to the D1
	d0p = &d2;   // a pointer to the D0 part of a D2
	d1p = &d2;   // a pointer to the D1 part of a D2
	d2p = &d2;   // a pointer to a D2
	d0p->clone(); // returns a pointer to the D0 part of a cloned D2 
			// 					(copy of d2)
	((D1*)(d1p->clone()); // returns a pointer to the D1 part of a cloned D2
			// 					(copy of d2)
			// d1p->clone() returns a pointer to the D0 part of a D2
			// which is then cast back to a pointer to the D1 part
	((D2*)(d2p->clone()); // returns a pointer to a cloned D2 (copy of d2)
			// d2p->clone() returns a pointer to the D0 part of a D2
			// which is then cast back to a pointer to the D2
	(D1*)(d2p->up()); // returns a pointer to the D1 part of d2
			// d2p->up now returns a D0* which is cast back to D1*
	((D0*)( ((D1*)(d2p->up())) ->up()); 
			// returns a pointer to the D0 part of d2
			// The static type of ((D1*)(d2p->up())) is D1* though
			// the D1 pointed to is part of a D2 object; so the
			// result of applying up() to this D1* is cast to R1*
			// which is D0*
	d1p->up(); // returns a pointer to the D0 part of d2; contravariance is
			// not necessarily known when this function is seen

In Jim Adcock's example:

> Base
> {
> ....
> public:
> 	virtual  Base* doSomething();
> ....
> };
> 
> void usesDoSomething(Base* bp)
> {
> 	bp = bp->doSomething();
> 	bp = bp->doSomething();
> }
> 
> class Derived: public Foo, public Base
> {
> 	Derived* doSomething();
> }

the function usesDoSomething is not modified at all, which is great because, as
he pointed out, the compiler  wouldn't know that it needed to.  Instead 
Derived::doSomething() is modified to return a Base* and the user's code

void usesDoSomething(Derived *dp) {
	dp = dp->doSomething();
	dp = dp->doSomething();
}

is modified automatically by the compiler to

void usesDoSomething(Derived *dp) {
	dp = ((Derived*)(dp->doSomething());  // doSomething now returns a Base*
	dp = ((Derived*)(dp->doSomething());  // doSomething now returns a Base*
}

No runtime casts are needed because the virtual functions (doSomething) are
modified to return Base* (i.e. D0*).  The casting to the type the user wanted 
can be done statically.  The user could do it him or herself, but only by
putting casts everywhere in the code which is unsafe and unsightly.


EXAMPLE 2. -- multiply inherited functions

Now suppose that a class D1 inherits f from both Da and Db.  If the 
corresponding Ra and Rb are identical, we can proceed as in case 1.  The
additional difficulty comes when Ra and Rb are different.  R1 is required
to be a descendant of both of them.  The clone function is a good example:

class Da {
	Da* clone() { return new Da(*this); }
};

class Db {
	Db* clone() { return new Db(*this); }
};

class D1: public Da, public Db {
	D1* clone() { return new D1(*this); }
};

class D2: public D1 {
	D2* clone() { return new D2(*this); }
};
{
D1 d1;
Da* dap = &d1; Db* dbp = &d1; D1* d1p = &d1;

d1p->clone(); // a pointer to a D1 which is a copy of d1
dap->clone();// a pointer to the Da part of a D1 which is a copy of d1
dbp->clone();// a pointer to the Db part of a D1 which is a copy of d1
}{
D2 d2;
Da* dap = &d2; Db* dbp = &d2; D1* d1p = &d2; D2* d2p = &d2;

d2p->clone(); // a pointer to a D2 which is a copy of d2
d1p->clone(); // a pointer to the D1 part of a D2 which is a copy of d2
dap->clone();// a pointer to the Da part of a D2 which is a copy of d2
dbp->clone();// a pointer to the Db part of a D2 which is a copy of d2
}

According the the prescription above, Ra' = Ra = Da, Rb' = Rb = Db,
R1' = Ra' = Da (since Da is before Db in D1's list of base classes), and
R2' = R1' = Da.

The definition and use of Da::f and Db::f and the vtables in Da and Dob
are unchanged.  For a D1 object we (1) modify D1::clone to return a (Da*),
and cast the result of each application of clone to objects with static type
D1 to (D1*); the virtual function tables end up like this:

		    	     __________
	D1 object:   Da part |        |
			     |   -----|-- vtable for Da in a D1 object 
			     |        |      and D1 in D1:
			     |        |		clone: &D1:clone
		    	     |________|
	             Db part |        |
			     |   -----|-- vtable for Db in a D1 object:
			     |        |		clone: (Db*)(D1*)D1::clone
		    	     |________|
		     D1 part |        |
			     |        |
		    	     |________|

The reason for picking R1' to be the same as Ra' was that then D1 and Da
could still share the vtable, and no thunk needed to be introduced.  A thunk
is unavoidable for Db however.  Note the reason for the double cast:
D1::clone returns a Da* pointer to a D1 object, which is cast to
a D1* and then up the other branch of the inheritance DAG to a Db*; obviously
this can be optimized to just one pointer offset.

The cloning example above is modified by the compiler automatically to

class Da {
	Da* clone() { return new Da; }
};		// no change -- we don't know about contravariance yet

class Db {
	Db* clone() { return new Db; }
};		// no change -- we don't know about contravariance yet

class D1: public Da, public Db {
	Da* clone() { return (Da*)new D1; } // change return type to Da*
}
class D2: public D1 {
	Da* clone() { return (Da*)new D2; } // change return type to Da*

D1 d1;
Da* dap = &d1; Db* dbp = &d1; D1* d1p = &d1;

(D1*)(d1p->clone()); // a pointer to a D1 which is a copy of d1
		// d1p->clone() returns a ptr to the Da part of a D1
		// which is then cast to a D1* (adjusting the pointer value)
dap->clone();// a pointer to the Da part of a D1 which is a copy of d1
		// no change to this -- it could occur in a context where dap
		// pointed to a Da object and there was no knowledge of 
		// D1 or contravariance; in this case D1::clone is invoked
		// which has been modified to return a Da*
dbp->clone();// a pointer to the Db part of a D1 which is a copy of d1
		// no change to this -- it could occur in a context where dbp
		// pointed to a Db object and there was no knowledge of 
		// D1 or contravariance; in this case the thunk
		//		(Db*)(D*)(D::clone())
		// is invoked.  D::clone() returns a ptr to the Da part of a
		// D1 which gets cast in two steps to the Db part of the same
		// D1.

Similarly,

(D2*)(d2p->clone()); // a pointer to a D2 which is a copy of d2
(D1*)(d1p->clone()); // a pointer to the D1 part of a D2 which is a copy of d2
dap->clone();// a pointer to the Da part of a D2 which is a copy of d2
dbp->clone();// a pointer to the Db part of a D2 which is a copy of d2

EXAMPLE 3 -- virtual base classes

Virtual base classes would pose no additional complication if it were possible
to cast from a virtual base class pointer to an object back down to a pointer
to the object.  However, if B is a virtual base class of D, and B* b = new D,
it is illegal to apply the cast (D*)b, even though b does point to the B part
of a D object.  This is because it is not known at runtime where in the D
object the B part might be.  However, the implementation described above
still works at the cost of introducing some thunks.  Here is an example.


class D0 { D0* clone() { new D0; } };
class D1a: virtual public D0 { 	D1a* clone() { new D1a; }};
class D1b: virtual public D0 { 	D1b* clone() { new D1b; }};
class D2: public D1a, public D1b { D2* clone() { new D2; }};

D2 d2;
D0* d0p = &d2;
D1a* d1ap = &d2;
D1b* d1bp = &d2;
D2* = &d2;

d0p->clone(); // a pointer to the D0 part of a D2 (a copy of d2)
d1ap->clone(); // a pointer to the D1a part of a D2 (a copy of d2)
d1bp->clone(); // a pointer to the D1b part of a D2 (a copy of d2)
d2->clone(); // a pointer to a new D2 (copy of d2)

Following the rules above, R0' = R0 = D0, R1a' = R1a = D1a ( because D1a is 
virtually-inherited from D0), R1b' = R1b = D1b, and R2' = R1a' = D1a.

Only the definition of D2::clone ends up being changed -- to have a return 
type of D1a*.

The virtual function tables are modified like this:

D0 -- unchanged: the slot for clone in D0 contains &D0::clone()

			   _______
D1a object:	D1a part: |       |
			  |   ----|-----vtable: clone:  &D1a::clone
			  |       |
			  |       |
			  |_______|
		D0 part:  |       |
			  |   ----|-----vtable: clone:  (D0*)(D1a*)D1a::clone
			  |       |   [ vtables could have been shared in the
			  |_______|    absence of contravariance]

D1b -- changed analogously to D1a

			   _______
D2 object:	D1a part: |       |
			  |   ----|-----vtable: clone:  &D2::clone
			  |       |
			  |       |
			  |_______|
		D1b part: |       |
			  |   ----|-----vtable: clone:  (D1b*)(D2*)D2::clone
			  |       |
			  |_______|
		D2 part:  |       |
			  |   ----|-----vtable: clone:  &D2::clone
			  |       |	[ Would be shared with D1a; could also
			  |_______|	use same structure element for the ptr]
		D0 part:  |       |
			  |   ----|-----vtable: clone:  (D0*)(D2*)D2::clone
			  |       |
			  |_______|

	Because the modified return type of D2::clone was chosen to be the
	same as the modified return type of D1a::clone (namely D1a*), the
	vtable for D2 and the vtable for D1a can still be shared.

The lines of code become

d0p->clone(); // no change
		// on execution it calls the thunk (D0*)(D2*)D2::clone
		// D2::clone returns a D1a* which points to the D1a part of a 
		// D2 object; this is cast to a pointer to the D0 part of the
		// D2 object 
d1ap->clone(); // a pointer to the D1a part of a D2 (a copy of d2)
		// on execution it calls D2::clone which returns a D1a*
		// which is eventually cast to a D0*
d1bp->clone(); // a pointer to the D1b part of a D2 (a copy of d2)
		// on execution it calls the thunk (D1b*)(D2*)D2::clone() which
		// (after casting) is a pointer to the D1b part of the new D2
		// object
((D2*)(d2->clone())); // a pointer to a new D2 (copy of d2)
		// This calls D2::clone which returns a pointer to the D1a part
		// of a D2 object; this is cast to be a pointer to the D2 part
		// of the same object.

Note that there would not be any problem with virtual base classes among the 
return types (beyond that caused by multiply-inheriting the contravariant
function) if it were possible to cast from the virtual base class to the
derived class.

CONCLUSION

-- There is a low-cost implementation of contravariance (lower cost than
   multiple inheritance or virtual functions themselves)
-- It does not add any cost to functions which are not contravariant
-- The implementation works in the context of separate compilation
-- For single inheritance, the execution time cost is only at most an extra
   add of a compile-time constant in the execution of a virtual function and 
   an extra add of a compile-time constant at each location where such a 
   function is used.
-- In some multiple inheritance cases, thunks must be introduced, but in most
   cases, virtual function tables and pointers to such tables which could be 
   shared in the absence of contravariance, can still be shared.

The utility of contravariant functions -- such as for clone -- is high.
It simplifies programs and reduces the amount of casting a programmer needs to
do.  Another example I have encountered in real problems is providing numeric
operations for structured types: Base class NumericImage, derived classes
DoubleImage, FloatImage, ...; an operator-() should return a NumericImage&
when applied to a NumericImage, a DoubleImage& when applied to a DoubleImage,
etc.

What is preventing contravariance on function return type from being adopted?

David R. Cok
Imaging Science Lab
Eastman Kodak Company
cok@Kodak.COM