[comp.lang.c++] Seeking neat way to do binary "virtual" functions.

ngo@tammy.harvard.edu (Tom Ngo) (03/20/91)

I am looking for a pretty way to accomplish the following task, which
must be fairly common.  I have a base class B with publicly derived
classes D1 and D2.  And I have a class BHandler that handles
homogeneous collections of B's, i.e. collections of B's which are
either all D1 or all D2.

BHandler causes B's to undergo certain virtual unary operations, i.e.
ones that involve only one B.  These are easy to implement.  To sketch
what I mean in case I'm being unclear:

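    class B {
    public:
        virtual void unary_op() = 0;   // pure virtual: each derived class supplies it
    };
    class D1 : public B {
    public:
        void unary_op();
    };
    class BHandler {
        B *b0, *b1;
    public:
        void do_unary_op();
    };
    void BHandler::do_unary_op()
    {
        b0->unary_op();                // dispatches on the dynamic type of *b0
    }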

Now my problem is, BHandler needs to make B's undergo binary
operations, and I have been struggling to find a way to implement
these in an elegant manner.  Here are a few of the solutions I have
found:

(1) Cast the B* to a D1* within a D1 member function:

    void BHandler::do_binary_op()
    {
        b0->binary_op(b1);
    }
    void D1::binary_op(B* that_)
    {
        D1 *that = (D1 *) that_;
        // do stuff with this and that
    }

    I don't like this solution very much because even though I know
    that_ can be converted to a D1*, the cast still seems dangerous.

(2) Take advantage of the fact that the only truly safe way to convert
    a B* to a D1* is through a virtual function call:

    void BHandler::do_binary_op()
    {
        b1->operate_with();
        b0->binary_op();
    }
    void D1::operate_with()
    {
        that = this;  // then, that is a static member of D1, of type D1*
    }
    void D1::binary_op()
    {
        // do stuff with this and that
    }

    This is even worse because in my application some calls to
    binary_op() need to be nested, so I still have to put a copy of
    that in an automatic variable, i.e.

    void D1::binary_op()
    {
        D1* mythat = that;
        // do stuff with this and mythat
    }

(3) The safest implementation I have thought of seems less error-prone
    but is somewhat cumbersome:

    class B {
    public:
        virtual void is_that() = 0;
        virtual void request_that() = 0;
    };
    class BHandler {
        B *b0, *b1;
    public:
        void do_binary_op();
        void provide_that();
    };
    void BHandler::provide_that()
    {
        b1->is_that();        // b1 announces itself through a virtual call
    }
    class D1 : public B {
        BHandler* h;
        D1* that;
        static D1* static_that;
    public:
        D1 (BHandler* h_) : h(h_) {}
        void is_that() { static_that = this; }
        void request_that()
          { h->provide_that(); that = static_that; }
        void binary_op();
    };
    void D1::binary_op()
    {
        request_that();
        // do stuff with this and that
    }

    This is my favorite implementation because by merely glancing at
    the code one can tell it is safe... but at what cost?
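
To make the nesting problem in solution (2) concrete, here is a minimal sketch. It is an illustration under assumptions, not code from the original application: the label member, the nested_partner parameter, and check_nesting() are hypothetical names added so that the effect can be observed. A nested operation overwrites the static that, while the automatic copy mythat still names the original partner.

```cpp
#include <cassert>
#include <string>

class B {
public:
    virtual ~B() {}
    virtual void operate_with() = 0;
    // nested_partner (hypothetical): if non-null, a nested operation runs first
    virtual std::string binary_op(B* nested_partner) = 0;
};

class D1 : public B {
    std::string label;              // hypothetical, only for observing results
    static D1* that;                // set by operate_with(), as in solution (2)
public:
    explicit D1(const std::string& l) : label(l) {}
    void operate_with() override { that = this; }
    std::string binary_op(B* nested_partner) override {
        assert(that != nullptr);    // operate_with() must have been called first
        D1* mythat = that;          // automatic copy, taken before any nested call
        if (nested_partner != nullptr) {
            nested_partner->operate_with();   // nested call re-targets the static
            binary_op(nullptr);
        }
        return label + "+" + mythat->label + ", static that=" + that->label;
    }
};
D1* D1::that = nullptr;

// Usage sketch: b is the intended partner of a, c takes part in a nested call
void check_nesting() {
    D1 a("a"), b("b"), c("c");
    b.operate_with();
    assert(a.binary_op(&c) == "a+b, static that=c");
}
```

After the nested call the static member points at c, so only the local copy still refers to b.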

I would appreciate any suggestions.  Do people consider this a
fundamental limitation of C++, that the only way to safely convert a 
pointer to a class to a pointer to one of its derived classes is
through a virtual function call?

--
  Tom Ngo
  ngo@harvard.harvard.edu
  617/495-1768 lab number, leave message

cok@islsun.Kodak.COM (David Cok) (03/20/91)

In article <NGO.91Mar19114016@tammy.harvard.edu> ngo@tammy.harvard.edu (Tom Ngo) writes:
>
>I am looking for a pretty way to accomplish the following task, which
>must be fairly common.  I have a base class B with publicly derived
>classes D1 and D2.  And I have a class BHandler that handles
>homogeneous collections of B's, i.e. collections of B's which are
>either all D1 or all D2.
>
	... background stuff deleted
>
>Now my problem is, BHandler needs to make B's undergo binary
>operations, and I have been struggling to find a way to implement
>these in an elegant manner.  Here are a few of the solutions I have
>found:
>
	.. his examples included at end

>I would appreciate any suggestions. 

You'll undoubtedly get many suggestions, but I'll pitch in one:

Try implementing your own type-safe downcast:

#include <iostream>

class D1;	// forward declarations so that B can name the derived types
class D2;

class B {
public:
	virtual void binary_op(B*) = 0;
	virtual D1* safe_cast_to_D1() { return (D1*)0; }
	virtual D2* safe_cast_to_D2() { return (D2*)0; }
};

class D1: public B {
public:
	virtual void binary_op(B* b2) {
		D1* d = b2->safe_cast_to_D1();
		if (d == (D1*)0) std::cout << "UhOh!";
		else {
			// do binary op with this and d
		}
	}
	virtual D1* safe_cast_to_D1() { return this; }
};

class D2: public B {
public:
	virtual void binary_op(B*); // complementary implementation to D1
	virtual D2* safe_cast_to_D2() { return this; }
};

class BHandler {
	B *b0,*b1;
public:
	void do_binary_op() { b0->binary_op(b1); }
};

>Do people consider this a
>fundamental limitation of C++, that the only way to safely convert a 
>pointer to a class to a pointer to one of its derived classes is
>through a virtual function call?
>
>--
>  Tom Ngo
>  ngo@harvard.harvard.edu
>  617/495-1768 lab number, leave message

This uses virtual functions, as does your preferred solution [3] below, but
seems to me to be clearer.  What I do not like about both of these is that
they require the base class to know about the derived classes.  At least the
base class does not have to know about the non-inherited member functions of
the derived classes, but it still seems to me to be a lack in the language.
This lack could be corrected by the simple addition of type-safe down casting,
namely requiring (D1*)b2 to do what b2->safe_cast_to_D1() does above.  This
would not break any programs which do not already have bugs (from down casting
to the wrong derived type).

A similar problem arises in virtual functions which return pointers to the
base type.  In the derived class, they must also return pointers to the
base class, when often one wants the derived class function to return a
pointer to a derived class object.  Again, one can provide this in C++ by
adding an additional helper function, but it requires source code access to
the Base class, which to my mind should not be necessary.  I also believe that
this is a lack in the language: it should allow contravariance on the function
return type and this problem would go away.  Again this solution does not
break existing programs, does not add keywords, and would make some of my
programs 10s of percent more concise.
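
For what it is worth, the return-type relaxation asked for here was later adopted: standard C++ (C++98 onward) lets an overriding virtual function return a pointer or reference to a derived class, under the name covariant return types. A minimal sketch, with hypothetical names self(), name(), only_in_derived(), and check_covariant(), written for a C++11 or later compiler:

```cpp
#include <cassert>
#include <string>

class Base {
public:
    virtual ~Base() {}
    virtual Base* self() { return this; }
    virtual std::string name() const { return "Base"; }
};

class Derived : public Base {
public:
    // Covariant return: still an override of Base::self(), but typed Derived*
    Derived* self() override { return this; }
    std::string name() const override { return "Derived"; }
    std::string only_in_derived() const { return "derived-only"; }
};

// Usage sketch
void check_covariant() {
    Derived d;
    assert(d.self()->only_in_derived() == "derived-only");  // no cast needed
    Base* bp = &d;
    assert(bp->self()->name() == "Derived");                // still dispatched virtually
}
```

No helper function and no change to the base class is needed beyond the ordinary virtual declaration.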

A third place this problem comes up is on getting to derived class members
from a pointer to a Base class.  That is, with

class B {
	...
};

class D1: public B {
	...
	void g(); // something specific to D1's
};

class D2: public B {
	...
	void h(); // something specific to D2's
};

B* b1 = new D1;

how do I apply D1::g() to b1?  A simple cast is not safe in the general case
where b1 may have a long and complicated history.  Some recent discussion on
the net advocated adding a virtual function g() to B which would give an
error message when applied to a true B* (and hence to a D2*).  I dislike this
immensely because it confuses the base class with what should be derived class
concerns.  It also requires source code access to the base class.  Providing
type-safe down casting would make this trivial and would satisfy at least one
part of why some people want to ask objects for their type.
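
The type-safe down cast described above later entered the language as dynamic_cast, which yields a null pointer when the object is not of the requested type. A minimal sketch, assuming a compiler with run-time type information and giving B a virtual destructor so that it is polymorphic; apply_g(), check_apply_g(), and the returned strings are hypothetical:

```cpp
#include <cassert>
#include <string>

class B {
public:
    virtual ~B() {}                 // dynamic_cast needs a polymorphic base
};
class D1 : public B {
public:
    std::string g() { return "D1::g"; }   // something specific to D1's
};
class D2 : public B {
public:
    std::string h() { return "D2::h"; }   // something specific to D2's
};

// Applies D1::g() only when b really points at a D1; otherwise reports failure
std::string apply_g(B* b) {
    D1* d = dynamic_cast<D1*>(b);   // null pointer if *b is not a D1
    if (d == nullptr) return "not a D1";
    return d->g();
}

// Usage sketch
void check_apply_g() {
    D1 d1;
    D2 d2;
    assert(apply_g(&d1) == "D1::g");
    assert(apply_g(&d2) == "not a D1");
}
```

B itself declares nothing about g() or about its derived classes.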

David R. Cok
Eastman Kodak Company
cok@Kodak.COM

Appendix:
	... Tom Ngo's solutions (1), (2) and (3), quoted verbatim from his
	article above, deleted here to save space

ericg@ucschu.ucsc.edu (Eric Goodman) (03/22/91)

In article <NGO.91Mar19114016@tammy.harvard.edu> ngo@tammy.harvard.edu 
(Tom Ngo) writes:
> I would appreciate any suggestions.  Do people consider this a
> fundamental limitation of C++, that the only way to safely convert a 
> pointer to a class to a pointer to one of its derived classes is
> through a virtual function call?

I've been running into the problem myself.  With binary operators of 
unknown actual type only the first object gets run time determined.  To test
type compatibility for two arbitrary base class references there is no way to
determine whether or not they are "the same" other than that they are derived
from the same class.

What I want:

class B {
public:
     virtual B& operator+(B&);
};
class D: public B{
public:
     B& operator+(D&);
};

D::operator+(D&) does not override B::operator+(B&); it only hides it.  I'd
like a mechanism that would check the actual type of the second argument, and
call the correct function at run time:

B& f(B& one, B& two) {
     return (one+two); // if (either or) both are B's, use B::operator+(B&)
                       // if both are D's, use D::operator+(D&)
}
I put the "either or" in parentheses because it is not apparent to me that 
this is the correct thing to do in the case of an incomplete match.  
Perhaps B should define an "operator+mismatch()" that gets called in this 
case? 

I almost consider it a limitation in the language, but the overhead of 
doing something like this implicitly seems to me to be very high.  Not 
being a compiler writer myself, I hesitate to condemn others for a problem I 
couldn't fix myself :-).
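
A short sketch of why the D version is never reached through base references: overload resolution looks only at the static types of the operands, so f() always selects B::operator+(B&), and D does not override that function. Returning std::string instead of B& is an assumption made here so that the chosen function can be observed; check_static_overload() is likewise hypothetical.

```cpp
#include <cassert>
#include <string>

class B {
public:
    virtual ~B() {}
    virtual std::string operator+(B&) { return "B::operator+(B&)"; }
};
class D : public B {
public:
    // Different parameter type: a new function that hides, not overrides, B's
    std::string operator+(D&) { return "D::operator+(D&)"; }
};

// Both operands have static type B&, whatever their actual types are
std::string f(B& one, B& two) { return one + two; }

// Usage sketch
void check_static_overload() {
    D d1, d2;
    assert(f(d1, d2) == "B::operator+(B&)");     // both are D's, yet B's version runs
    assert((d1 + d2) == "D::operator+(D&)");     // static type D selects D's version
}
```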

Eric Goodman, UC Santa Cruz

ericg@ucschu.ucsc.edu  or  @ucschu.bitnet
Eric_Goodman.staff@macmail.ucsc.edu

bgbg@cbnewsd.att.com (brian.g.beuning) (03/24/91)

From article <13683@darkstar.ucsc.edu>, by ericg@ucschu.ucsc.edu (Eric Goodman):
> ...  With binary operators of 
> unknown actual type only the first object gets run time determined.
> To test type compatibility for two arbitrary base class references
> there is no way to determine whether or not they are "the same" other
> than that they are derived from the same class.

There was an article in a recent Journal of OOP (JOOP) about
supporting arithmetic in C++ that got into this topic.
One suggestion was to use:

class B {
public:
	virtual B& operator+( B& );
	virtual B& add_to_D( class D& );
	virtual B& add_to_E( class E& );
};
class D: public B {
public:
	B& operator+( B& );
	B& add_to_D( D& );	// D + D
	B& add_to_E( E& );	// E + D
};
class E: public B {
public:
	B& operator+( B& );
	B& add_to_D( D& );	// D + E
	B& add_to_E( E& );	// E + E
};

B&
D::operator+( B& arg )
{
	return( arg.add_to_D( *this ) );	// second run-time lookup
}
B&
E::operator+( B& arg )
{
	return( arg.add_to_E( *this ) );
}

If you have N derived classes, you end up with about N^2 methods.
But you can do it.  It is also extendable to more than 2 arguments.
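
A compilable sketch of the scheme above, for anyone who wants to try it. The functions return std::string rather than B& purely so that the selected combination can be seen; that choice, sum(), and check_double_dispatch() are assumptions of this sketch, not part of the JOOP article.

```cpp
#include <cassert>
#include <string>

class D;
class E;

class B {
public:
    virtual ~B() {}
    virtual std::string operator+(B& rhs) = 0;   // first run-time lookup
    virtual std::string add_to_D(D& lhs) = 0;    // computes lhs + *this, lhs is a D
    virtual std::string add_to_E(E& lhs) = 0;    // computes lhs + *this, lhs is an E
};
class D : public B {
public:
    std::string operator+(B& rhs) override;
    std::string add_to_D(D&) override { return "D + D"; }
    std::string add_to_E(E&) override { return "E + D"; }
};
class E : public B {
public:
    std::string operator+(B& rhs) override;
    std::string add_to_D(D&) override { return "D + E"; }
    std::string add_to_E(E&) override { return "E + E"; }
};

// The left operand knows its own type and forwards to the right operand,
// whose virtual call supplies the second run-time lookup
std::string D::operator+(B& rhs) { return rhs.add_to_D(*this); }
std::string E::operator+(B& rhs) { return rhs.add_to_E(*this); }

std::string sum(B& one, B& two) { return one + two; }

// Usage sketch
void check_double_dispatch() {
    D d;
    E e;
    assert(sum(d, d) == "D + D");
    assert(sum(d, e) == "D + E");
    assert(sum(e, d) == "E + D");
    assert(sum(e, e) == "E + E");
}
```

Adding a third derived class means one more add_to_ function in B and in every existing class, which is where the N^2 growth comes from.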

		Brian Beuning

davidm@uunet.UU.NET (David S. Masterson) (03/25/91)

>>>>> On 21 Mar 91 21:14:22 GMT, ericg@ucschu.ucsc.edu (Eric Goodman) said:

Eric> What I want:

Eric> class B {
Eric>      virtual B& operator+(B&);
Eric> };
Eric> class D: public B{
Eric>      B& operator+(D&);
Eric> };

Eric> the D::operator+() function is not overloaded.  I'd like a mechanism
Eric> that would check the actual type of the second argument, and call the
Eric> correct function at run time:

Eric> B& f(B& one, B& two) {
Eric>      return (one+two);       // if (either or) both are B's, use
Eric>                                     // B::operator+(B&)
Eric>                                     //  if both are D's, use 
Eric>                                     //  D::operator+(D&)
Eric> };

Something that might be of interest is the article on "Generalized Arithmetic
in C++" by Tim Budd in the February issue of the Journal of Object-Oriented
Programming.  It describes a couple of techniques (coercive generality and
double polymorphism) that might show good ways of doing this and the pitfalls.
--
====================================================================
David Masterson					Consilium, Inc.
(415) 691-6311					640 Clyde Ct.
uunet!cimshop!davidm				Mtn. View, CA  94043
====================================================================
"If someone thinks they know what I said, then I didn't say it!"

jimad@microsoft.UUCP (Jim ADCOCK) (03/26/91)

In article <1991Mar19.200816.10480@kodak.kodak.com> cok@islsun.Kodak.COM (David Cok) writes:
|A similar problem arises in virtual functions which return pointers to the
|base type.  In the derived class, they must also return pointers to the
|base class, when often one wants the derived class function to return a
|pointer to a derived class object.  Again, one can provide this in C++ by
|adding an additional helper function, but it requires source code access to
|the Base class, which to my mind should not be necessary.  I also believe that
|this is a lack in the language: it should allow contravariance on the function
|return type and this problem would go away.  Again this solution does not
|break existing programs, does not add keywords, and would make some of my
|programs 10s of percent more concise.

If we look at the contravariance issue for a few moments, I think you'll
see that it ties in with a lot of the recent talk about run-time type
information and type-casting.

First, what would it mean to support contravariance on the return type for
C++?  I don't think you're much interested in returning a Derived where
the Base class specifies a Base -- you're really interested in returning
a Derived* or Derived& where the Base class specifies a Base* or a Base&.
What would it take to support this?  At first thought the problem seems
trivial -- until one considers multiple inheritance with virtual functions:

class Base
{
....
public:
	virtual  Base* doSomething();
....
};

void usesDoSomething(Base* bp)
{
	bp = bp->doSomething();
	bp = bp->doSomething();
}

class Derived: public Foo, public Base
{
public:
	Derived* doSomething();
};

Do you see the problem?  Since usesDoSomething(Base*) occurs before
Derived, today's compilers would assume that the address returned from
doSomething() the first time represents the address to dispatch relative
to the second time.  However, in Derived the Base part is typically 
generated as the second part of the Derived structure -- thus the
wrong part of Derived is being referred to by bp in order to correctly
dispatch doSomething() the second time.  This problem could be solved
if the compiler automatically generated code something like the following:

void usesDoSomething(Base* bp) /* as automatically generated by the compiler */ 
{
	bp = (bp->doSomething())->runtimeCastToBasePtr();
	bp = (bp->doSomething())->runtimeCastToBasePtr();
}

If this doesn't seem too troublesome, consider that contravariance implies
the following would also have to be generated [given some slight changes
to usesDoSomething ]

void usesDoSomething(Derived* dp) /*as automatically generated by the compiler*/
{
	dp = (dp->doSomething())->runtimeCastToDerivedPtr();
	dp = (dp->doSomething())->runtimeCastToDerivedPtr();
}

All these cast functions are virtual, and compilers would have to automatically
generate such for all public base classes of a derived class with vptrs,
and also would have to implement the dynamic cast of a class to itself.

People would have to pay this cost whether they use contravariance or not --
since the compiler cannot determine this when usesDoSomething(Base*) is
being compiled.
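
For reference, C++ as later standardized accepts the covariant override in this multiple-inheritance arrangement and leaves the pointer adjustment to the implementation. The sketch below is an illustration, not a statement of how any particular compiler does it; Foo's contents, name(), baseIsOffset(), and check_multiple_inheritance() are hypothetical. On common implementations baseIsOffset() reports true because the Base part does not start at the beginning of Derived, but the language does not guarantee a particular layout, so the sketch does not assert it.

```cpp
#include <cassert>
#include <string>

class Foo {
public:
    Foo() : foo_data(0) {}
    virtual ~Foo() {}
    int foo_data;                   // hypothetical payload
};

class Base {
public:
    virtual ~Base() {}
    virtual Base* doSomething() { return this; }
    virtual std::string name() const { return "Base"; }
};

// Written knowing only Base, before Derived is declared
std::string usesDoSomething(Base* bp) {
    bp = bp->doSomething();
    bp = bp->doSomething();         // must dispatch through a valid Base*
    return bp->name();
}

class Derived : public Foo, public Base {
public:
    Derived* doSomething() override { return this; }   // covariant override
    std::string name() const override { return "Derived"; }
};

// True when the Base part starts at a different address than the whole object
bool baseIsOffset(Derived* dp) {
    return static_cast<void*>(static_cast<Base*>(dp)) != static_cast<void*>(dp);
}

// Usage sketch
void check_multiple_inheritance() {
    Derived d;
    assert(usesDoSomething(&d) == "Derived");
    assert(d.doSomething() == &d);
}
```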

Also, would compilers then automatically generate such virtual functions to
perform downcasting as well?  If so, this would require a
portion of each class's vtable to be generated at link time, as opposed
to compile time.  These virtual-function downcasts might be implemented
by the compiler something like:

	dp = bp->RuntimeClassInfo()->runtimeCastToDerivedPtr();

where say RuntimeClassInfo() dispatches via "slot 0" of a vtable to 
a secondary table of "runtime class information" functions that might 
have to be generated by the compiler at link time.

----

What I'd really like to see is an approach that doesn't cost people anything
if they don't do downcasting or contravariance, but would require the 
compiler to "do the right thing" if such were used.  [I can't think of such
an approach off the top of my head.]

PS:

If you don't see the problem for a compiler trying to generate these
virtual function runtime casts -- try doing them yourself by hand.
See what problems you run into, while [of course] maintaining C++'s traditional
modular compilations.