[gnu.g++] On relaxing an unnecessary restriction.

rfg@ics.uci.edu (Ron Guilmette) (12/14/89)

Recently, as part of the research for a paper I'm writing, I had occasion
to read an M.S. thesis by Justin Graver (of the Univ. of Illinois) on
"Type-Checking and Type-Inference for Object-Oriented Programming Languages".

This mostly has to do with a "typed" version of Smalltalk-80 that Ralph
Johnson, Justin Graver, and others, have been working on at UIUC.

One comment in particular struck me, especially since it draws attention to
one aspect of the C++ language rules:

    "One problem with Borning and Ingalls' approach is the restriction they
     place on the return type of certain methods.  If an inherited method
     is redefined by a class, they require that its return type be a subtype
     of the return type of the method being overridden.  This is not a
     requirement of Smalltalk, nor is it even a convention."

After reading this, I got to thinking that this "restriction" that Graver is
describing as a "problem" is one that many C++ programmers would love to have.

As things stand right now, in C++ if you redefine an inherited "method" in
a derived class, the redefinition must have a parameter-type & result-type
profile which is IDENTICAL to that of the method being overridden.

Could this restriction be relaxed in C++?  I don't see why not.

Basically, I see no reason why it would be either difficult or impossible
to allow derived methods to override base methods in more flexible ways
so long as the following rule is obeyed:

    The return type of a derived class method which overrides a base class
    method must be "return-type-compatible" with the return type of the
    base class method being overridden.

    There are three ways in which return types may be "return-type-compatible".

    (1) The two types are identical.

    (2) The return type of the base method is of type "pointer-to-class"
    and the return type of the derived method is also of type
    "pointer-to-class" and the "pointed-at" class for the return type of
    the base method is a superclass of the "pointed-at" class for the
    return type of the derived class method.

    (3) The return type of the base method is of type "reference-to-class"
    and the return type of the derived method is also of type
    "reference-to-class" and the "referenced" class for the return type of
    the base method is a superclass of the "referenced" class for the
    return type of the derived class method.

I know that may all sound like gibberish, but here is a concrete (and short)
example of exactly what I am proposing:

-----------------------------------------------------------------------
class base1 { /* ... */ };

class derived1 : public base1 { /* ... */ };

class base2 {
	// ...
public:
	virtual base1& function (void);
};

class derived2 : public base2 {
	// ...
public:

	virtual derived1& function (void);
};
-----------------------------------------------------------------------

In the example above the function called `function' is defined in the
base class `base2' and then is redefined for the derived class `derived2'.

In this redefinition, the return type is different from that originally
defined for `function' in the base class.  These two return types are
however sufficiently `related' so as to satisfy the rules above.
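
As a rough sketch only, the three rules can be written down as a
compile-time predicate and checked against the example.  The name
`return_type_compatible' is made up for this illustration (it is not part
of any compiler or library), the sketch assumes the <type_traits> header
of later C++ standards, and it does not model const/volatile qualifiers
or access control precisely.

```cpp
#include <type_traits>

// Hypothetical predicate, named for this illustration only.
// Rule (1): identical types are always compatible.
template <typename BaseRet, typename DerivedRet>
struct return_type_compatible : std::is_same<BaseRet, DerivedRet> {};

// Rule (2): pointer-to-class results, where the base method's pointed-at
// class is the same as, or a superclass of, the derived method's.
template <typename B, typename D>
struct return_type_compatible<B*, D*>
    : std::integral_constant<bool, std::is_same<B, D>::value ||
                                   std::is_base_of<B, D>::value> {};

// Rule (3): the same test for reference-to-class results.
template <typename B, typename D>
struct return_type_compatible<B&, D&>
    : std::integral_constant<bool, std::is_same<B, D>::value ||
                                   std::is_base_of<B, D>::value> {};

// Stand-ins for the classes in the example above.
struct base1 {};
struct derived1 : public base1 {};

static_assert(return_type_compatible<int, int>::value, "rule (1)");
static_assert(return_type_compatible<base1*, derived1*>::value, "rule (2)");
static_assert(return_type_compatible<base1&, derived1&>::value, "rule (3)");
static_assert(!return_type_compatible<derived1&, base1&>::value,
              "wrong direction is rejected");
static_assert(!return_type_compatible<base1, derived1>::value,
              "results returned by value are not covered");
```

The last assertion reflects that the relaxation applies to pointer and
reference results only; two different class types returned by value are
not treated as compatible.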

I cannot see how allowing this sort of redefinition would cause any problems
for anybody (including compiler writers).

Note that if `function' is called for an object which is unambiguously
of type `base2' then we get back a value which is unambiguously of
type `base1&'.  Thus, the compiler can still do all normal type checking.

Likewise, if `function' is called for an object which is unambiguously
of type `derived2' then we get back a value which is unambiguously of type
`derived1&'.  Again, all compile-time type checking is undisturbed.

Finally, if we call `function' for an object where the compiler cannot tell
(at compile-time) whether the object is a `base2' object or a `derived2'
object, then the current rules (which are already being used) call for the
compiler to make the minimal assumption that the object in question is
(at least) a `base2'.  In such cases, the compiler could likewise make
the (perfectly safe) assumption that the value returned is (at least)
of type `base1&' and do all normal type checking accordingly.
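
The three cases can be sketched as follows, assuming a compiler that
accepts the relaxed rule (present-day standard C++ compilers do, under
the name "covariant return types").  The members `name' and `extra' and
the two helper functions are hypothetical additions, there only to make
the static and dynamic types visible.

```cpp
#include <cassert>
#include <string>

struct base1 {
    virtual ~base1() {}
    virtual std::string name() const { return "base1"; }
};

struct derived1 : public base1 {
    virtual std::string name() const { return "derived1"; }
    int extra() const { return 42; }   // exists only in derived1
};

struct base2 {
    virtual ~base2() {}
    virtual base1& function() { static base1 b; return b; }
};

struct derived2 : public base2 {
    // Same name and parameters as base2::function, narrower result type.
    virtual derived1& function() { static derived1 d; return d; }
};

// Third case: the caller knows only that it has (at least) a base2.
std::string call_through_base(base2& obj) {
    base1& r = obj.function();   // type-checked as base1&
    return r.name();             // the virtual call reveals the dynamic type
}

// First two cases: the static type of the object is known exactly.
int call_on_known_types() {
    base2 b;
    derived2 d;
    base1& r1 = b.function();      // base2    gives base1&
    derived1& r2 = d.function();   // derived2 gives derived1&, no cast needed
    assert(r1.name() == "base1");
    return r2.extra();             // usable because r2 is a derived1&
}
```

Passing a `derived2' object to `call_through_base' still type-checks
against `base1&', even though the object actually returned is a
`derived1'.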

Note that I only propose that this extension be permitted for return types
which are reference or pointer types.  The reason for this is that return
values of reference and pointer types are always of (compile-time) known
(and usually identical) sizes, and thus, the issue of varying storage
allocations (for varying types of return values) does not even enter
into the question.

I solicit comments from C++ implementors on this (very minor) enhancement
regarding permissible forms of overriding.  Specifically, I'm curious
to know why this isn't already allowed!  It seems so potentially useful
and yet seems so trivial to implement.

One final note.  The additional flexibility I have described here for return
types could likewise be made available for parameter types.  If this were
done however, the rules would have to be exactly *backwards* for parameter
types (relative to those for return types).  That is to say that the type
"referred to" by the type of a given parameter of a base class method would
have to be a subtype of the type "referred to" by the type of the corresponding
parameter for the derived class method.  It is left as an exercise for the
reader to determine why this is so.
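
One possible starting point for that exercise is to look at the call from
the caller's side: a caller going through the base class supplies an
argument typed according to the base method, and the overriding method
must be able to accept it.  The sketch below states that as a compile-time
check.  The name `parameter_type_compatible' is hypothetical, the check is
meant only for pointer-to-class and reference-to-class parameters
(std::is_convertible also admits unrelated conversions such as int to
double), and it only compares types; it does not make any compiler accept
such an override.

```cpp
#include <type_traits>

// Stand-ins for a base class and a class derived from it.
struct base1 {};
struct derived1 : public base1 {};

// Hypothetical predicate: BaseParam is the parameter type declared by the
// base class method, DerivedParam the one declared by the overriding
// method.  The argument arrives typed as BaseParam, so it must convert
// implicitly to DerivedParam.
template <typename BaseParam, typename DerivedParam>
struct parameter_type_compatible
    : std::is_convertible<BaseParam, DerivedParam> {};

// The override may accept MORE than the base method promised to pass ...
static_assert(parameter_type_compatible<derived1&, base1&>::value,
              "widening a reference parameter is safe");
static_assert(parameter_type_compatible<derived1*, base1*>::value,
              "widening a pointer parameter is safe");

// ... but not less: a plain base1 might arrive where a derived1 is expected.
static_assert(!parameter_type_compatible<base1&, derived1&>::value,
              "narrowing a reference parameter is unsafe");
static_assert(!parameter_type_compatible<base1*, derived1*>::value,
              "narrowing a pointer parameter is unsafe");
```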