[comp.lang.c++] VIRTUAL-NESS LOST - A C++ mystery involving funcs returning objects not ptrs

charlie@genrad.com (Charlie D. Havener) (03/10/90)

WHEN IS "VIRTUAL-NESS LOST"?

A puzzle involving overloaded operator functions, multiple inheritance,
and functions returning class objects instead of pointers or references
to an object.

------------------------
SYNOPSIS: If you return a derived class object from a
function, can you then invoke a virtual function on it and 
expect C++ to find the right one?

class Base { public: virtual void print(); };
class Der : public Base { public: virtual void print(); };

Base func() { Der x; return x; }

main()  { func().print(); }  // should it use the Base or Der print?
		       // cfront uses Der print, Zortech 2.06 uses Base print
if it was
main()  { Base t; t = func(); t.print(); } // Clearly it should use the Base print
-------------------------

I can find no definitive answer in any of my books or the
C++ reference manual. Section 12.2 in the Ref manual about Temporary
Objects comes close. It seems to say this might be implementation
dependent. 

It would be most useful if I could count on the virtual nature of
a returned object to be alive as long as I don't actually assign
it to something of the base type. Consider the following real
application. First I will describe it, and then provide a shar
archive of source code suitable for experimenting on a Sun.
(OK for Zortech too, but you must remove the complex type or
add a small complex implementation for Zortech.)

EXAMPLE:

Extend the Abstract Syntax Tree interpreter example from the 
Dewhurst and Stark textbook page 110 to handle types other than
plain vanilla integers. In a Binop node you will have an
expression like:
	return left->eval() + right->eval();
In order to avoid garbage collection and frequent inefficient
invocations of new, it is desirable to return an actual Data
object. Thus the eval() function returns 'Data' and things like
Int, Real, and Cpx are derived from Data.  Multiple inheritance
is used since Int, Real, etc derive from both Node and Data.
The eval() function actually returns, for example an Int. Then the
'+' operator becomes the overloaded virtual '+' that is a member
function of the Int class. If virtual-ness is lost when the function
returns the Int, then the default Data class '+' gets invoked. This
is what happens in Zortech 2.06. So, who is right? What can I
count on? 
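
A reduced sketch of the design described above, written in present-day
C++ rather than the 1990 dialect of the archive below. The names
eval_int, sum_via_sliced_values, sum_via_derived_objects and check_sketch
are hypothetical, and -1 is only an illustrative marker for the base-class
fallback. Under standard C++ rules, returning an Int from a function whose
return type is Data copies only the Data part, so the '+' applied to two
such results is the one in Data, not the one in Int.

```cpp
#include <cassert>

struct Data {
    int ival = 0;
    virtual ~Data() = default;
    virtual Data operator+(const Data& rhs) const;  // base-class fallback
};

struct Int : Data {
    explicit Int(int v) { ival = v; }
    // Overrides the fallback; the Int result is sliced to Data on return.
    Data operator+(const Data& rhs) const override {
        return Int(ival + rhs.ival);
    }
};

// Fallback marks its result with -1 so the caller can tell it ran.
Data Data::operator+(const Data&) const {
    Data d;
    d.ival = -1;
    return d;
}

// Plays the role of eval(): declared to return Data, so the Int is sliced.
Data eval_int(int v) { return Int(v); }

// Both operands are plain Data objects here, so Data::operator+ runs.
int sum_via_sliced_values() { return (eval_int(7) + eval_int(67)).ival; }

// The left operand really is an Int (seen through a reference),
// so the call dispatches to Int::operator+.
int sum_via_derived_objects() {
    Int a(7), b(67);
    const Data& ra = a;
    return (ra + b).ival;
}

void check_sketch() {
    assert(sum_via_sliced_values() == -1);
    assert(sum_via_derived_objects() == 74);
}
```

Only the static return type of the eval-like function differs between the
two cases; the operands are built from the same Int values.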

Charlie Havener - charlie@genrad.com  - 508-369-4400 x3302

Shar archive below
---------------------

echo x - Nodes.C
cat >Nodes.C <<'!Funky!Stuff!'
// Nodes.C -  Implementation of Nodes.h for AST

#include <Nodes.h>
#include <stdlib.h>

Data Node::eval()  { cerr << "In Node virtual eval()"; return Int(0);}

Data Data::operator+(const Data&)
    { cerr << "virtual Data::operator+()!!!\n"; return Int(0); }

Data Int::operator+(const Data& y)
    {
    switch ( y.GetType() )  // GetType is virtual, so not really inline
	{
	case INT:
	    return Int(u.ival + y.u.ival);
	case REAL:
	    return Real(u.ival + y.u.rval);
	case UNKNOWN:
	case COMPLEX:
	default:
	    cerr << "unknown rhs type in int'+' operator \n";
	    return Int(0);
	}
    }

Data Real::operator+(const Data& y)
    {
    switch ( y.GetType())
	{
	case INT:
	    return Real(u.rval + y.u.ival);
	case REAL:
	    return Real(u.rval + y.u.rval);
	case UNKNOWN:
	case COMPLEX:
	default:
	    cerr << "unknown rhs type in real '+' operator \n";
	    return Int(0);
	}
    }

Data Cpx::operator+(const Data& y)
    {
    switch ( y.GetType())
	{
	case INT:
	    return Cpx(u.cpxval + y.u.ival);
	case REAL:
	    return Cpx(u.cpxval + y.u.rval);
	case COMPLEX:
	    return Cpx(u.cpxval + y.u.cpxval);
	case UNKNOWN:
	default:
	    cerr << "unknown rhs type in complex '+' operator \n";
	    return Int(0);
	}
    }

!Funky!Stuff!
echo x - tiny.C
cat >tiny.C <<'!Funky!Stuff!'
// This example is a part of an Abstract Syntax Tree
// as described in Dewhurst and Stark book
// extended to handle Complex Numbers & Reals

#include <stream.h>
#include <Nodes.h>
#include <complex.h>
main()
    {
    Node *np;

    cout << "Tiny expression interpreter running\n";

    np = new Int(7);
    (np->eval()).print();
    delete np;
    cout << "\n";

    np = new Plus(new Real(5.42), new Int(67));
    (np->eval()).print();
    delete np;
    cout << "\n";

    np = new Plus(new Cpx(5), new Cpx(93));
    (np->eval()).print();
    delete np;
    cout << "\n";

    np = new Plus(new Cpx(5), new Real(6.4));
    (np->eval()).print();
    delete np;
    cout << "\n";
    }
!Funky!Stuff!
echo x - Nodes.h
cat >Nodes.h <<'!Funky!Stuff!'
/* Nodes for Abstract Syntax Tree  */
#ifndef NODES_H
#define NODES_H
#include <stream.h>
#include <complex.h>

//---------------- Data Related classes follow -----

class Value
    {
public:
    union  // anonymous union!
	{
	int ival;
	double rval;
	};
    complex cpxval; // can't have class object with ctor in a union 
    };


class Data  // an impure abstract class
    {
    // could put type info here but that makes the Data object bigger
protected:
    Data() {} // forbid direct instantiation
    Value u;  // derived classes can access it only if friends
    friend class Int;
    friend class Real;
    friend class Cpx;
public:
    enum DataType { INT, REAL, COMPLEX, UNKNOWN };
    // cannot make pure virtual funcs below in C++ 2.0
    // because no func can return a pure abstract class like Data
    virtual DataType GetType() const { return UNKNOWN; }
    virtual ~Data() {}
    virtual Data operator+(const Data&);
    virtual void print(void) { cout << "oops! virtual data base class " ; }
    virtual Data eval() { cerr << "Data eval() base class\n"; return *this;}
    };
    
// -------------  Parse tree nodes ----------------
class Node
    {
protected:
//    Node(){}
public:
    Node(){}
    virtual ~Node() {}
    virtual Data eval();
    virtual void print() { cerr << "No print() defined\n"; }
    };

class Binop : public Node
    {
protected:
    Node *left;
    Node *right;
    Binop(Node *l,Node *r) { left = l; right = r;}
    ~Binop() { delete left; delete right; }
    };

class Plus : public Binop
    {
public:
    Plus(Node *l,Node *r) : Binop(l,r) {}
    Data eval() { return ( left->eval() + right->eval()); }
    };

//---------------- Data Related classes follow -----
// class Int : public Node, public Data  // this order fails
class Int : public Data, public Node
    {
public:
    Int(int t = 0) { u.ival = t; }
    // no dtor needed since ctor didn't use new
    DataType GetType() const { return INT; }
    Data eval() { return *this; }
    void print(void) { cout << u.ival ; }
    Data operator+(const Data&);
    };

class Real : public Data, public Node
    {
public:
    Real(double t) { u.rval = t; }
    DataType GetType() const { return REAL; }
    void print(void) { cout << u.rval ; }
    Data eval() { return *this; }
    Data operator+(const Data&);
    };

       
class Cpx : public Data, public Node
    {
public:
    Cpx(complex t) { u.cpxval = t; }
    DataType GetType() const { return COMPLEX; }
    void print(void) { cout << u.cpxval ; }
    Data eval() { return *this; }
    Data operator+(const Data&);
    };

#endif
!Funky!Stuff!
echo x - makefile
cat >makefile <<'!Funky!Stuff!'

#makefile for the object oriented design expression
#interpreter. Uses the Sun C++ 2.0 
#Charlie Havener
SHELL= /bin/sh
CC= /usr/local/CC/sun3/CC
FLAGS= 

tiny: tiny.o Nodes.o
	$(CC) $(FLAGS) -otiny tiny.o Nodes.o -lcomplex

Nodes.o: Nodes.C Nodes.h
	$(CC) -c -I. $(FLAGS) Nodes.C

tiny.o: tiny.C Nodes.h
	$(CC) -c -I. $(FLAGS) tiny.C

listing: Nodes.C tiny.C Nodes.h makefile
	pr -f  makefile Nodes.h tiny.C Nodes.C > listing

archive:  Nodes.C tiny.C Nodes.h makefile 
	rm -f archive
	shar archive Nodes.C tiny.C Nodes.h makefile

clean:
	rm -f Nodes.o tiny.o tiny
!Funky!Stuff!

charlie@genrad.com (Charlie D. Havener) (03/16/90)

This is a summary of the three E-mail replies I received
about VIRTUAL-NESS LOST. The consensus is that the 'feature'
in cfront that keeps virtual-ness alive for a returned object
is not part of the C++ language definition. I suspected this.
I have been experimenting with returning pointers to Data
rather than the object, but even with a class-specific overload
of the new operator, the expression interpreter takes twice as
long. Sigh....    Thank you for responding.

// this seems to be what one must do in order to avoid
// garbage build up. I wish there was a more elegant/efficient solution.
Data *Plus::eval()  
    {
    Data *l;
    Data *r;
    Data *result;
    result =  (*(l = left->eval()) + *(r = right->eval())); 
    delete l;
    delete r;
    return result;
    }
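
For comparison, the same ownership pattern can be sketched in present-day
C++ with std::unique_ptr, so the two intermediate results are released
automatically instead of by explicit delete. This is only a sketch under
assumptions: the classes are cut down to a single Real type, and Const,
add, as_double, demo and check_demo are hypothetical names, not part of
the posted code. It does not remove the allocation cost complained about
above; it only removes the manual cleanup.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Data {
    virtual ~Data() = default;
    virtual std::unique_ptr<Data> add(const Data& rhs) const = 0;
    virtual double as_double() const = 0;
};

struct Real : Data {
    double v;
    explicit Real(double x) : v(x) {}
    std::unique_ptr<Data> add(const Data& rhs) const override {
        return std::make_unique<Real>(v + rhs.as_double());
    }
    double as_double() const override { return v; }
};

struct Node {
    virtual ~Node() = default;
    virtual std::unique_ptr<Data> eval() const = 0;
};

// Leaf node holding a constant (stands in for the Int/Real/Cpx leaves).
struct Const : Node {
    double v;
    explicit Const(double x) : v(x) {}
    std::unique_ptr<Data> eval() const override {
        return std::make_unique<Real>(v);
    }
};

struct Plus : Node {
    std::unique_ptr<Node> left, right;
    Plus(std::unique_ptr<Node> l, std::unique_ptr<Node> r)
        : left(std::move(l)), right(std::move(r)) {}
    std::unique_ptr<Data> eval() const override {
        auto l = left->eval();
        auto r = right->eval();
        return l->add(*r);  // l and r are destroyed when eval() returns
    }
};

double demo() {
    Plus sum(std::make_unique<Const>(5.5), std::make_unique<Const>(67.0));
    return sum.eval()->as_double();
}

void check_demo() { assert(demo() == 72.5); }
```

Because add() is called through a pointer to a heap object, the dynamic
type is preserved and the derived override is the one that runs.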
----------------------------------------
Partial E-mail from:
Scott
sdm@cs.brown.edu

In article <33629@genrad.UUCP> you write:
> SYNOPSIS: If you return a derived class object from a
> function, can you then invoke a virtual function on it and 
> expect C++ to find the right one?
>
> class Base { public: virtual void print(); };
> class Der : public Base { public: virtual void print(); };
>
> Base func() { Der x; return x; }
>
> main()  { func().print(); }  // should it use the Base or Der print?
>	 	       // cfront uses Der print, Zortech 2.06 uses Base print
> if it was
> main()  { Base t; t = func(); t.print(); } // Clearly it should use the Base print
>
>I can find no definitive answer in any of my books or the
>C++ reference manual. Section 12.2 in the Ref manual about Temporary
>Objects comes close. It seems to say this might be implementation
>dependent. 

cfront is wrong, Zortech is right.  func() returns a Base, not a pointer to
a Base and not a reference to a Base, so functions called on the result of
func() will be statically bound.  Always.  More technically, the result of
func() is an object of type Base that is initialized with an object of type
Der.  Check the manuals under object initialization.
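
The point can be sketched in present-day C++ as follows. The helper names
by_value, call_on_returned_value, call_after_assignment and check_slicing
are hypothetical, and name() stands in for print() so the result can be
checked rather than printed. The object returned by by_value() is a Base
that was initialized from a Der, so both forms from the synopsis end up
in Base's function.

```cpp
#include <cassert>
#include <string>

struct Base {
    virtual ~Base() = default;
    virtual std::string name() const { return "Base"; }
};

struct Der : Base {
    std::string name() const override { return "Der"; }
};

// Returns by value: the result is a Base object copy-initialized from
// the Base part of x. Nothing of Der survives the return.
Base by_value() {
    Der x;
    return x;
}

// First form in the synopsis: func().print()
std::string call_on_returned_value() { return by_value().name(); }

// Second form in the synopsis: Base t; t = func(); t.print();
std::string call_after_assignment() {
    Base t;
    t = by_value();
    return t.name();
}

void check_slicing() {
    assert(call_on_returned_value() == "Base");
    assert(call_after_assignment() == "Base");
}
```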

I ran across this same problem on the "other side" of the function call,
i.e., passing parameters into functions.  This was my posting:

> Article 5164 of comp.lang.c++:
> Subject: When polymorphic, when not?
> Date: 7 Dec 89 19:12:10 GMT
> 
---------------------------------------------------

Reply-To: mit-eddie!Sun.COM!tiemann

Virtualness is lost in GNU C++ as well.  I believe that GNU and
Zortech are right, and that cfront is wrong.

Michael
--------------------------------------------------

From mit-eddie!mcc.com!mit-eddie!@MCC.COM:vaughan%cadillac.cad.mcc.com Mon Mar 12 12:12:49 1990

When dealing with an object, rather than a reference or pointer to an
object, calls to function members are never "virtual".  The compiler
generally assumes that the object is exactly the type that it thinks
it is (it had to allocate space for it after all) and avoids the extra
overhead for looking them up in the virtual function table.  In your
example, the Base portion of the Der x object is copied and returned
as a Base object.  I would hope that the virtual function table
pointer is also made appropriate for a Base object.  Note that if this
were not the case, virtual functions redefined on Der that access
slots defined on Der but not on Base might be called.  Such slots
don't exist for the returned object.
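
A small sketch of why this matters, in present-day C++. The member names
id and extra and the helper functions are hypothetical. Der's override
reads a member that only Der objects have; the object that comes back
from make() is exactly a Base, so the Base version must be the one that
runs.

```cpp
#include <cassert>
#include <typeinfo>

struct Base {
    int id = 1;
    virtual ~Base() = default;
    virtual int value() const { return id; }
};

struct Der : Base {
    int extra = 42;  // this slot exists only in Der objects
    int value() const override { return extra; }
};

// Only the Base part of x is copied into the returned object.
Base make() {
    Der x;
    return x;
}

// The dynamic type of the returned object is Base, not Der.
bool returned_object_is_exactly_base() {
    const Base& b = make();  // temporary kept alive by the reference
    return typeid(b) == typeid(Base);
}

// Base::value() runs, so the result comes from id, not from extra.
int value_of_returned_object() { return make().value(); }

void check_returned_object() {
    assert(returned_object_is_exactly_base());
    assert(value_of_returned_object() == 1);
}
```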
-- 
 Paul Vaughan, MCC CAD Program | ARPA: vaughan@mcc.com | Phone: [512] 338-3639
 Box 200195, Austin, TX 78720  | UUCP: ...!cs.utexas.edu!milano!cadillac!vaughan

-------------------------------------------------------

creemer@ohm.sw.mcc.com (David Zachary Creemer) (03/17/90)

In article <33897@genrad.UUCP>, charlie@genrad.com (Charlie D. Havener) writes:
> In article <33629@genrad.UUCP> you write:
> > SYNOPSIS: If you return a derived class object from a
> > function, can you then invoke a virtual function on it and 
> > expect C++ to find the right one?
> >
> > class Base { public: virtual void print(); };
> > class Der : public Base { public: virtual void print(); };
> >
> > Base func() { Der x; return x; }
> >
> > main()  { func().print(); }  // should it use the Base or Der print?
> >	 	       // cfront uses Der print, Zortech 2.06 uses Base print
> > if it was
> > main()  { Base t; t = func(); t.print(); } // Clearly it should use the Base print
> >
> >I can find no definitive answer in any of my books or the
> >C++ reference manual. Section 12.2 in the Ref manual about Temporary
> >Objects comes close. It seems to say this might be implementation
> >dependent. 
> 
> cfront is wrong, Zortech is right.  func() returns a Base, not a pointer to
...
> 
> Virtualness is lost in GNU C++ as well.  I believe that GNU and
> Zortech are right, and that cfront is wrong.
> 
> Michael
> --------------------------------------------------
> 
> From mit-eddie!mcc.com!mit-eddie!@MCC.COM:vaughan%cadillac.cad.mcc.com Mon Mar 12 12:12:49 1990
... another opinion
> -- 
>  Paul Vaughan, MCC CAD Program | ARPA: vaughan@mcc.com | Phone: [512] 338-3639
>  Box 200195, Austin, TX 78720  | UUCP: ...!cs.utexas.edu!milano!cadillac!vaughan
> 
> -------------------------------------------------------


Yes, but, the AT&T 2.0 language reference manual (the one I get with
my Sun CC compiler) reads,
"The interpretation of the call of a virtual function depends on the
type of an object for which it was called,..."

It makes no mention of the pointer-to-object vs. object argument going
on here. So, as I read it, cfront is behaving exactly as the Language
Ref. manual says it should. Furthermore, given the following:

class Base {
  public:
    virtual void print() { cout << "base\n"; }
};

class Der : public Base {
  public:
    virtual void print() { cout << "der\n"; }
};

Base& func() { Der *x = new Der; return *x; }

main() {   func().print(); }

both g++ (1.37.something) and CC call the derived class' print method.
I argue that this approach is best.  References are supposed to give
"pointer like" behavior (call by reference) without having to be
bothered by pointer dereferencing problems. In any event, it's a
curious fact which will undoubtedly cause someone a headache at some
time.

-- David

David Creemer | MCC Software Technology Program |    creemer@mcc.com 
512 338-3403  | 9390 Research, Kaleido II Bldg., Austin, Texas 78759

ark@alice.UUCP (Andrew Koenig) (03/17/90)

In article <2887@ohm.sw.mcc.com>, creemer@ohm.sw.mcc.com (David Zachary Creemer) writes:

> Yes, but, the AT&T 2.0 language reference manual (the one I get with
> my Sun CC compiler) reads,
> "The interpretation of the call of a virtual function depends on the
> type of an object for which it was called,..."

Did my earlier posting on this subject get lost?

This example is a bug in cfront 2.0.
It's fixed in cfront 2.1.

The statement in the manual is correct.  HOWEVER, if you have a function

	void f(Base b) { /* stuff that deals with b */ }

then calling f with a derived object:

	Derived d;
	f(d);

is a request to use the Base(const Base&) constructor to form a copy
of the Base part of d and use that as the argument to f.

Once inside f, the formal parameter is a Base, not a Derived.
If you had said

	void f(Base& b) { /* stuff that deals with b */ }

the situation would be quite different.
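
The two declarations can be put side by side in a short sketch, written
in present-day C++. The names f_by_value, f_by_reference and
check_parameter_passing are hypothetical, and the functions return a
string instead of printing so the difference can be checked.

```cpp
#include <cassert>
#include <string>

struct Base {
    virtual ~Base() = default;
    virtual std::string name() const { return "Base"; }
};

struct Derived : Base {
    std::string name() const override { return "Derived"; }
};

// b is a fresh Base built from the Base part of the argument.
std::string f_by_value(Base b) { return b.name(); }

// b refers to the caller's object, whatever its actual type is.
std::string f_by_reference(Base& b) { return b.name(); }

void check_parameter_passing() {
    Derived d;
    assert(f_by_value(d) == "Base");
    assert(f_by_reference(d) == "Derived");
}
```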

> It makes no mention of the pointer-to-object vs. object argument going
> on here. So, as I read it, cfront is behaving exactly as the Language
> Ref. manual says it should. Furthermore, given the following:

> class Base {
>   public:
>     virtual void print() { cout << "base\n"; }
> };

> class Der : public Base {
>   public:
>     virtual void print() { cout << "der\n"; }
> };

> Base& func() { Der *x = new Der; return *x; }

> main() {   func().print(); }

> both g++ (1.37.something) and CC call the derived class' print method.
> I argue that this approach is best.  References are supposed to give
> "pointer like" behavior (call by reference) without having to be
> bothered by pointer dereferencing problems. In any event, it's a
> curious fact which will undoubtedly cause someone a headache at some
> time.

Both g++ and CC are correct this time.
-- 
				--Andrew Koenig
				  ark@europa.att.com