kelleymt@luther.cs.unc.edu (Michael T. Kelley) (07/26/89)
I've been working on a set of classes for manipulating graphics
primitives. For many of the sub-classes, there are members I don't
want to "stub-out" in the parent class. The problem comes when
grouping redefined functions in the sub-class with functions
not stubbed out in the parent into a single expression:

    #include <stream.h>

    class Base {
    public:
        Base() { }
        virtual Base& hello() { cout << "hello "; return *this; }
    };

    class Derived : public Base {
    public:
        Derived() { }
        Base& hello() { cout << "g'day "; return *this; }  // ugh.
        void goodbye() { cout << "see ya!\n"; }
    };

    main()
    {
        Derived d;
        d.hello().goodbye();    // ERROR: goodbye undefined
    }

In this case, I know I'm dealing with a Derived, so I'd like to be able
to use goodbye(). Can someone explain the harm in allowing hello()
to return a Derived& in the Derived class? Or is there an
implementation issue lurking underneath?

Michael T. Kelley
University of North Carolina at Chapel Hill
kelleymt@cs.unc.edu   uunet!mcnc!unc!kelleymt   (919) 962-1761
dlw@odi.com (Dan Weinreb) (07/28/89)
In article <8975@thorin.cs.unc.edu> kelleymt@luther.cs.unc.edu (Michael T. Kelley) writes:
   In this case, I know I'm dealing with a Derived, so I'd like to be able
   to use goodbye(). Can someone explain the harm in allowing hello()
   to return a Derived& in the Derived class? Or is there an
   implementation issue lurking underneath?

My colleagues and I have seen this problem several times, in slightly
different guises. As far as we can tell, there is no good solution;
you have to use explicit casts. In my experience so far with C++,
this is the biggest problem caused by C++'s mixture of run-time and
compile-time type checking. (More accurately, its mixture of
run-time-polymorphic objects and typed variables.) There's no such
problem in languages like Smalltalk-80 and Lisp/CLOS, in which
variables are untyped. (I recognize that typed variables have
advantages too, and I'm not trying to provoke a general discussion of
the virtues and drawbacks of typed variables!)
The underlying philosophical problem is that d.hello().goodbye(); is
doing something that is perfectly meaningful, but the C++ compile-time
type checking system cannot prove in advance that the code is meaningful,
so the compiler must tag the code as an error. You have to put an
explicit cast into the code in order to assert to C++ that
everything's really all legal.
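
A minimal sketch of the explicit-cast workaround, written in present-day
C++ rather than the 1989 dialect. The `log` member and the
`greet_with_cast` helper are illustrative assumptions that do not appear
in the original post; they only make the effect observable without
printing.

```cpp
#include <cassert>
#include <string>

// Hypothetical rendering of the Base/Derived pair from the original post.
struct Base {
    virtual ~Base() = default;
    virtual Base& hello() { log += "hello "; return *this; }
    std::string log;  // records what would have been printed (assumption)
};

struct Derived : Base {
    // Same return type as the base version, as the 1989 rules required.
    Base& hello() override { log += "g'day "; return *this; }
    void goodbye() { log += "see ya!"; }
};

// The caller asserts what the compiler cannot prove: that the Base&
// coming back from hello() really refers to a Derived.
inline std::string greet_with_cast() {
    Derived d;
    static_cast<Derived&>(d.hello()).goodbye();
    return d.log;
}
```

The static_cast is unchecked; if the object might not be a Derived,
dynamic_cast<Derived&> performs the same conversion with a run-time
check and throws std::bad_cast on failure.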
I don't see any way to fix this that is consonant with the general C++
language design. It looks like a tough problem. Is there anybody out
there who can tell us whether Trellis/Owl has a way to deal with this?
Are there other object-oriented languages with typed variables in
which an analogous construct can be written?
Dan Weinreb Object Design, Inc.
alonzo@microsoft.UUCP (Alonzo Gariepy) (07/28/89)
In article <8975@thorin.cs.unc.edu> kelleymt@luther.cs.unc.edu (Michael T. Kelley) writes:
| I've been working on a set of classes for manipulating graphics
| primitives. For many of the sub-classes, there are members I don't
                                                     ^^^^ [i.e., goodbye(), below]
| want to "stub-out" in the parent class. The problem comes when
| grouping redefined functions in the sub-class with functions
| not stubbed out in the parent into a single expression:
|
|     #include <stream.h>
|
|     class Base {
|     public:
|         Base() { }
|         virtual Base& hello() { cout << "hello "; return *this; }
|     };
|
|     class Derived : public Base {
|     public:
|         Derived() { }
|         Base& hello() { cout << "g'day "; return *this; }  // ugh.
|         void goodbye() { cout << "see ya!\n"; }
|     };
|
|     main()
|     {
|         Derived d;
|         d.hello().goodbye();    // ERROR: goodbye undefined
|     }

The problem you are looking at is really one of typechecking. If you
do have Derived::hello() return a Derived& then your example makes
some sense, but what about:

    Derived d;
    Base &b = d;
    b.hello().goodbye();

This would run and return a Derived& but it doesn't make much sense.

The semantics you want involve the notion of This (This-Class as
opposed to this-object). Your functions become:

    virtual This& Base::hello() { cout << "hello "; return *this;}

and

    virtual This& Derived::hello() { cout << "g'day "; return *this;}

and consider

    This& Base::somebasefunc() {blah, blah, ...;}

This makes these functions of identical type and formalizes the idea of
passing the invocation type through polymorphic and inherited functions.
For example,

    d.hello();          // Derived::hello() and returns Derived&

    Base b;
    b.hello();          // Base::hello() and returns Base&

    Base &b = d;
    b.hello();          // Derived::hello() and returns Base&

    d.somebasefunc();   // Base::somebasefunc() and returns Derived&

I think This is a useful notion. It is effectively an automatic cast,
but only in circumstances that are clearly safe. The typechecking
would not be difficult.
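
The effect asked for with This& can be approximated in today's C++
with the curiously recurring template pattern (CRTP). This is a
swapped-in technique, not the proposed language feature: the names
`Greeter`, `Plain`, `Aussie`, `log` and `chain_demo` are hypothetical,
and the sketch gives up the common non-template base class, so
hello() here is hidden rather than virtually overridden.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// CRTP stand-in for the proposed `This&`: Self names the class the
// member function is being used through.
template <class Self>
struct Greeter {
    std::string log;  // records what would have been printed (assumption)
    Self& hello()        { log += "hello "; return static_cast<Self&>(*this); }
    Self& somebasefunc() { log += "base ";  return static_cast<Self&>(*this); }
};

struct Plain : Greeter<Plain> {};

struct Aussie : Greeter<Aussie> {
    Aussie& hello() { log += "g'day "; return *this; }  // hides Greeter::hello
    void goodbye()  { log += "see ya!"; }
};

// An inherited function called on an Aussie yields Aussie&, and the same
// function called on a Plain yields Plain&, all decided at compile time.
static_assert(std::is_same<decltype(std::declval<Aussie&>().somebasefunc()),
                           Aussie&>::value, "inherited call keeps Aussie&");
static_assert(std::is_same<decltype(std::declval<Plain&>().hello()),
                           Plain&>::value, "base-only class keeps Plain&");

inline std::string chain_demo() {
    Aussie d;
    d.somebasefunc().hello().goodbye();  // no cast needed anywhere in the chain
    return d.log;
}
```

The cast inside Greeter is the "automatic cast" of the proposal, written
once by the class author instead of at every call site.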
Of course the reason for all this is so that I can conveniently say

    subob.basefunc().subfunc().subfunc().superbasefunc().subfunc();

This is a useful feature that can be added to the language compatibly
(except for anyone who has named a class This) and simplifies
things rather than complicating them.

Alonzo Gariepy    // If Microsoft has an opinion,
alonzo@microsoft  // they haven't told it to me.
alonzo@microsoft.UUCP (Alonzo Gariepy) (07/28/89)
As further explanation of why you would want the notion of passing
invocation class through inherited and virtual functions, I present
the following examples.
First, remember that (in the proposed scheme)
This& Base::func(){...; return *this;}
is defined to return a Base& when it is called with
Base b; b.func(); // returns a Base&
but a Derived& when it is inherited by class Derived and called with
Derived d; d.func(); // returns a Derived&
EXAMPLE ONE inherited function
Let's say we have a class of object called Bar. These objects can
be incremented with the += operator, but they have to be normalized
first. We don't automatically normalize them in the += function, because
it takes a while, and we often know that a particular instance is already
normalized.
Bar bell;
bell.normalize() += 8;
Now we also have a subclass called Nun. It contains some extra stuff
that must also be incremented, so we redefine += to
Nun& Nun::operator+=( int i )
{
this->Bar::operator+=( i );
extra += i;
return *this;
}
When we increment nuns we would like to use identical syntax without
worrying about the return type of Bar::normalize()
Nun foo;
foo.normalize() += 5;
For this to correctly call Nun::operator+=(), the definition of
normalize must be
This& Bar::normalize() {...; return *this;}
As it must be for the following to compile without a type error
Nun& Nun::nunfunc() {...; return *this;}
foo.normalize().nunfunc();
If normalize() was added after class Nun, or is later removed, the
definition of class Nun remains unchanged. As it should be, normalize()
is only relevant to Bar.
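
Example one can be approximated in current C++ with a non-member
function template, which hands its argument back with the caller's
static type intact. This is a sketch under assumptions, not the
proposed This& feature: the members `value`, `normalized` and `extra`,
and the names `do_normalize`, `normalize` and `increment_nun`, are
invented for illustration.

```cpp
#include <cassert>

struct Bar {
    int  value = 0;
    bool normalized = false;
    void do_normalize() { normalized = true; }  // stand-in for the slow step
    Bar& operator+=(int i) { value += i; return *this; }
};

struct Nun : Bar {
    int extra = 0;
    Nun& operator+=(int i) {
        Bar::operator+=(i);  // increment the Bar part first
        extra += i;          // then the extra stuff
        return *this;
    }
};

// T is deduced from the argument, so a Nun goes in and a Nun& comes out.
template <class T>
T& normalize(T& x) { x.do_normalize(); return x; }

inline Nun increment_nun() {
    Nun foo;
    normalize(foo) += 5;  // T = Nun, so Nun::operator+= is selected
    return foo;
}
```

As in the proposal, Nun does not mention normalization at all; the
helper could be added or removed without touching the subclass.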
EXAMPLE TWO virtual functions
As the example stands, the operator+=() functions in classes Bar and Nun
cannot be made virtual because they return different types. If class Nun
and the += functions are redefined as
virtual This& Bar::operator+=(int);
virtual This& Nun::operator+=(int);
then the following code compiles as indicated
Nun nun;
Bar& bar = nun;
nun.normalize() += 2; // OK.
bar.normalize() += 2; // OK.
(bar += 2).normalize(); // OK.
(nun.normalize() += 2).nunfunc(); // OK.
(bar.normalize() += 2).nunfunc(); // **error** makes no sense anyway
Alonzo Gariepy // If Microsoft has an opinion,
alonzo@microsoft // they didn't tell it to me.
ark@alice.UUCP (Andrew Koenig) (07/29/89)
At first glance it may seem obvious that it should be possible
in C++ for a virtual function to return a reference to *this
without converting it to its base class. However, a closer look
brings second thoughts. For example:

    class X {
    public:
        virtual X& self() { return *this; }
    };

    class Y: public X {
    public:
        virtual Y& self() { return *this; }   // illegal
    };

    X* xp;

The reason the line marked `illegal' is illegal is because
if it were legal, then it would be impossible to know the
type of xp->self() at compilation time.

This type uncertainty would have ramifications throughout
the language. For example:

    void f(X);
    void f(Y);      // overloading

    f(xp->self());

Evidently it would be necessary to do a run-time dispatch to
decide which f() to call. If f has multiple arguments, the
problem becomes that much harder.

This implicit run-time dispatch can even creep into non-virtual
member functions. For instance:

    X* xp = new Y;
    X* xq = new Y;

    xp->self() = xq->self();

Since xp and xq both point at Y objects, one would presumably
want this assignment to copy a Y rather than just copying the
X part. To do that, though, effectively requires making all
assignments into virtual functions. Worse yet, there are four
possible combinations of X and Y that xp and xq might point to;
what is the right thing to do in each case?

The problems become still more complicated in the presence
of multiple inheritance.

In short, the reason that a virtual function must have the same
return type in a derived class as in its base class(es) is that
otherwise the type of an expression becomes impossible to determine
during compilation.
--
                                --Andrew Koenig
                                  ark@europa.att.com
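
The assignment case can be observed directly in current C++. The sketch
below is illustrative only: the members `a` and `b`, the `isa` function
and both helper names are assumptions, and plain references stand in
for the results of xp->self(). Assigning through X references runs
X::operator=, so only the X part is copied and the target keeps its own
dynamic type.

```cpp
#include <cassert>
#include <string>

struct X {
    int a = 0;
    virtual const char* isa() const { return "X"; }
};

struct Y : X {
    int b = 0;
    const char* isa() const override { return "Y"; }
};

// Both operands are really Y objects, but the assignment is written
// in terms of X&, so the compiler selects X::operator=.
inline int b_after_base_assignment() {
    Y src; src.a = 1; src.b = 2;
    Y dst;
    X& xr = dst;
    X& xs = src;
    xr = xs;              // copies a only
    assert(dst.a == 1);
    return dst.b;         // still the default value
}

// Assigning a Y into an X object does not turn the X into a Y.
inline std::string type_after_assignment() {
    Y src;
    X plain;
    X& r = plain;
    r = src;              // slices: only the X part of src is copied
    return plain.isa();
}
```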
alonzo@microsoft.UUCP (Alonzo Gariepy) (07/31/89)
In article <9694@alice.UUCP> ark@alice.UUCP (Andrew Koenig) writes:
>At first glance it may seem obvious that it should be possible
>in C++ for a virtual function to return a reference to *this
>without converting it to its base class. However, a closer look
>brings second thoughts. For example:
>
>    class X {
>    public:
>        virtual X& self() { return *this; }
>    };
>
>    class Y: public X {
>    public:
>        virtual Y& self() { return *this; }   // illegal
>    };
>
>    X* xp;
>
>The reason the line marked `illegal' is illegal is because
>if it were legal, then it would be impossible to know the
>type of xp->self() at compilation time.

Yes and no. It would be very poor programming practice indeed to expect
xp->self() to return anything but an (X&). Even if xp points to a
subclass of X, this expression must return the compile time type of the
call. The whole point of polymorphism is that a call such as xp->self()
is transparent in the code (compile time).

But why can't yp->self() return a (Y&)? The actual case that everyone
is using is: <please-return-the-same-type-as-I-called-you-with>. This
works very well whether or not the function is virtual or has been
redefined in a subclass. It is an intelligent cast, no more.

If you wish this semantic in a virtual function, all redefinitions of it
share the same semantic. Give the semantic a type name (This&) and all
the redefinitions share the same type. Subclasses can inherit the
semantic without silly redirection functions.

Given a function (or set of virtual functions) defined with this type,
you can tell exactly what type will be returned at compile time. Now I
can call a function Y::someYfunc(Y&),

    y1.self().someYfunc(y2.someXfunc());

as long as self() and X::someXfunc() share the new semantic.

My only question is whether it is worth putting more special purpose
syntax into C++. On the other hand, why stop now?

Alonzo Gariepy    // If Microsoft has an opinion,
alonzo@microsoft  // they haven't told it to me.
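
For comparison, ISO C++ (from C++98 onward) accepts the declaration
marked illegal above as a covariant return type, and resolves it much
as described here: the type of the call expression follows the static
type of the object expression, so nothing is left to decide at run
time. The sketch is illustrative; `only_in_y` and `through_y` are
hypothetical names.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

struct X {
    virtual ~X() = default;
    virtual X& self() { return *this; }
};

struct Y : X {
    Y& self() override { return *this; }  // covariant return type
    int only_in_y() { return 42; }
};

// Called through an X*, the expression has type X&, whatever xp points at.
static_assert(std::is_same<decltype(std::declval<X*>()->self()), X&>::value,
              "static type X gives X&");
// Called through a Y*, the same virtual function yields Y&.
static_assert(std::is_same<decltype(std::declval<Y*>()->self()), Y&>::value,
              "static type Y gives Y&");

inline int through_y() {
    Y y;
    return y.self().only_in_y();  // no cast needed
}
```

Covariant returns cover only functions that the derived class
overrides. A function merely inherited from X still returns X& when
called on a Y, so the inherited-function half of the This& idea is not
provided by this rule.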
jima@hplsla.HP.COM (Jim Adcock) (08/01/89)
>/ hplsla:comp.lang.c++ / dlw@odi.com (Dan Weinreb) / 1:17 pm Jul 27, 1989 /
>In article <8975@thorin.cs.unc.edu> kelleymt@luther.cs.unc.edu (Michael T. Kelley) writes:
>
>   In this case, I know I'm dealing with a Derived, so I'd like to be able
>   to use goodbye(). Can someone explain the harm in allowing hello()
>   to return a Derived& in the Derived class? Or is there an
>   implementation issue lurking underneath?
>
>My colleagues and I have seen this problem several times, in slightly
>different guises. As far as we can tell, there is no good solution;
>you have to use explicit casts. In my experience so far with C++,
>this is the biggest problem caused by C++'s mixture of run-time and
>compile-time type checking. (More accurately, its mixture of
>run-time-polymorphic objects and typed variables.) There's no such
>problem in languages like Smalltalk-80 and Lisp/CLOS, in which
>variables are untyped. (I recognize that typed variables have
>advantages too, and I'm not trying to provoke a general discussion of
>the virtues and drawbacks of typed variables!)

I believe there is such an issue in Smalltalk-like languages; it's just
resolved in a different manner. In Smalltalk-like [untyped] languages
there is no compile-time check of the sanity of a particular method name
being applied to a particular object. Since there are no types, there is
no type checking, and the compiler simply allows it. Thus the sanity of
applying a particular method name to a particular object must be checked
manually by the programmer, and if a mistake is made, it is only detected
as a run-time bomb. A typical problem is to send a message to an object
that doesn't understand that message, or does understand it, but
understands it to mean something different than you meant. This happens
frequently on large projects where hundreds of message names are created
weekly. Joe and Susan independently give their methods the same name,
though they have different meanings. You typically don't find this out
until a couple of months later on the project, when Jack tries to access
both Joe's and Susan's objects in a uniform manner.

C++ is a typed language, and thus squawks at compile-time if you apply
a method name to an object in what seems to not be a sane manner. You
coerce the object to apply any necessary transformations to make the
object/method-name combo sane, or to tell the compiler you really do
know what you're doing after all.

So I see this as an issue of permissive compiler design versus safe
compiler design.

>The underlying philosophical problem is that d.hello().goodbye(); is
>doing something that is perfectly meaningful, but the C++ compile-time
>type checking system cannot prove in advance that the code is meaningful,
>so the compiler must tag the code as an error. You have to put an
>explicit cast into the code in order to assert to C++ that
>everything's really all legal.

Agreed.

....If you want to take a Smalltalk-like "untyped" approach to a portion
of your code, derive from a specific base class, make virtual function
templates, always return references to objects of that base type, and you
have a Smalltalk-like approach. Almost no type checking. I don't
recommend this, though. Type-checking helps verify the sanity of your
coding, and can lead to much faster code.
cook@hplabsz.HPL.HP.COM (William Cook) (08/03/89)
Recently there has been some discussion on problems in typing
methods that return "self". I want to point out that Eiffel
allows these methods and gives them their most natural typing.
Unfortunately, Eiffel does not seem to be perfect in this respect,
in that it allows some typings that are insecure (see my paper,
"A Proposal for Making Eiffel Type-safe", in the proceedings of
ECOOP'89).

Eiffel uses "Current" instead of "self" or "this". It also
has a novel typing construct called _declaration by association_
which allows a type of the form "like Current" to represent the
type of a method that returns "Current".

The example given by Michael T. Kelley, which was illegal in C++,
can be coded in Eiffel. I haven't run this code, so it may
not be perfect. The keyword "is" indicates the return value type
of a method.

    class Base
    feature
        hello is Like Current
            do
                output("hello");
                Result := Current;
            end
    end

    class Derived
    inherit Base
    feature
        hello is Like Current
            do
                output("g'day ");
                Result := Current;
            end
        goodbye is
            do
                output("see ya!");
            end
    end

    class Main
    feature
        Create is
            local d : Derived;
            do
                d.hello.goodbye;    -- type-correct
            end;
    end

A general "copy" method can also be implemented with the same typing,
by "creating" an object of type "Like Current".

I don't understand Andrew Koenig's conclusion that this behavior
should be illegal because you don't know the type at compile-time. I
thought not knowing the precise type of things at compile-time is
essential to OOP. When working with a variable v:Base one can only
know that v.hello will be an instance of some subclass of Base. But
this is enough because any such object will handle all the messages
defined in Base.

-william
cook@hplabs.hp.com
vaughan@mcc.com (Paul Vaughan) (08/03/89)
William Cook (cook@hplabsz.HPL.HP.COM) writes:
>I don't understand Andrew Koenig's conclusion that this behavior
>should be illegal because you don't know the type at compile-time. I
>thought not knowing the precise type of things at compile-time is
>essential to OOP. When working with a variable v:Base one can only
>know that v.hello will be an instance of some subclass of Base. But
>this is enough because any such object will handle all the messages
>defined in Base.

I also don't understand Andrew Koenig's conclusion, but you
are not quite correct for c++. When working with a variable

    Base v;

v is most definitely a Base and not a subclass of Base. Only with a
variable

    Base* vp;

can you deal with an object which may be a subclass of Base. (Sorry
if I'm picking nits.)

One of Andrew's arguments had to do with ambiguity when you
had references that referred to objects which could be a subclass of
the stated type (indefinitely typed). Note that this is also
impossible in the current c++. I wrote an article previously (which,
from the lack of response, either didn't get out, or was too garbled
to understand) that compared indefinitely typed references with pointers.
Basically, I don't think there is any ambiguity added with
indefinitely typed references that the language doesn't already
resolve with pointers. His other argument had to do with overloaded
functions. My response to that was that overloaded functions always
work on the stated type of their arguments, rather than the actual
type, and that there is no ambiguity involved there either. The rule
about not being able to overload functions based only on pointer types
simply hides this fact. This isn't perhaps the most desirable
property in terms of language power, but it is efficient. The only
language I know of that allows generic functions with multiple
dispatch is CLOS.
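
The split between stated type and actual type can be seen in a few
lines of current C++. This is a minimal sketch; `name`, `f` and
`dispatch_demo` are hypothetical names chosen for the illustration.

```cpp
#include <cassert>
#include <string>

struct Base {
    virtual ~Base() = default;
    virtual std::string name() const { return "Base"; }
};

struct Derived : Base {
    std::string name() const override { return "Derived"; }
};

// Two overloads that differ only in the class of their argument.
inline std::string f(const Base&)    { return "f(Base)"; }
inline std::string f(const Derived&) { return "f(Derived)"; }

inline std::string dispatch_demo() {
    Derived d;
    const Base& b = d;   // stated type Base, actual type Derived
    // f is chosen at compile time from the stated type of b;
    // name() is chosen at run time from the actual type of the object.
    return f(b) + " / " + b.name();
}
```

Calling dispatch_demo() yields "f(Base) / Derived": the overload is
picked from the stated type, the virtual function from the actual one.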
Paul Vaughan, MCC CAD Program | ARPA: vaughan@mcc.com | Phone: [512] 338-3639
Box 200195, Austin, TX 78720  | UUCP: ...!cs.utexas.edu!milano!cadillac!vaughan
Paul.Vaughan@mamab.FIDONET.ORG (Paul Vaughan) (08/06/89)
--
Fidonet:  Paul Vaughan via 1:363/9
Internet: Paul.Vaughan@mamab.FIDONET.ORG
Usenet:   ...!peora!rtmvax!libcmp!mamab!Paul.Vaughan
strick@osc.COM (henry strickland) (08/14/89)
[ Notice! THIS ARTICLE CONTAINS A RETRACTION at the end! Do not flame me
[ until you see my turn of thinking. I include my faulty argument
[ because it is VERY tempting and because other issues are addressed,
[ such as why someone would ever use operator=() on indefinitely-typed
[ objects. If I'm still wrong at the end, please fry me! I want to know! ]]]]]

In article <2045@cadillac.CAD.MCC.COM> vaughan@mcc.com (Paul Vaughan) writes:

>       One of Andrew's arguments had to do with ambiguity when you
>had references that referred to objects which could be a subclass of
>the stated type (indefinitely typed). Note that this is also
>impossible in the current c++. I wrote an article previously (which,

        [ paul: you say this is impossible, but I think I do it below.
          Impossible or illegal? --strick ]
        [ I do not retract this: in fact, I do it. --strick ]

>from the lack of response, either didn't get out, or was too garbled
>to understand) that compared indefinitely typed references with pointers.
>Basically, I don't think there is any ambiguity added with
>indefinitely typed references that the language doesn't already
>resolve with pointers. His other argument had to do with overloaded
>functions. My response to that was that overloaded functions always
>work on the stated type of their arguments, rather than the actual
>type, and that there is no ambiguity involved there either. The rule
>about not being able to overload functions based only on pointer types
>simply hides this fact. This isn't perhaps the most desirable
>property in terms of language power, but it is efficient. The only
>language I know of that allows generic functions with multiple
>dispatch is CLOS.

I agree. Virtual functions are selected by the actual type of the
object; overloaded functions are discriminated by the declared type
of the reference or pointer.

        [ I do not retract this --strick ]

.............................................................................
Concerning Andrew Koenig <ark@alice.UUCP>'s piece of code:

    class X {
    public:
        virtual X& self() { return *this; }
    };

    class Y: public X {
    public:
        virtual Y& self() { return *this; }   // illegal
    };

    main()
    {
        X* xp = new Y;
        X* xq = new Y;

        xp->self() = xq->self();
    }

The problem here is that a NON-VIRTUAL operator=() is being used in a
context where a VIRTUAL one is required.

        /* and a smart virtual one at that: C++ has another problem
           from the beginning: in smalltalk, you can't assign objects,
           just pointers: all pointers are the same size; in C++ the
           language allows you to assign objects, but if there are
           derived classes, it sounds like a very bad idea. */

        [ I do not retract these ]

...........................................................................

The above virtual/nonvirtual problem also exists in the language even if
you don't have specialized return types from derived class virtuals.

This code compiles without errors with cfront1.2, cfront2.0, and g++1.35.1-:

    struct X { int a; virtual char* isa(); };
    struct Y : X { int b; virtual char* isa(); };

    char* X::isa() { return "I am an X."; }
    char* Y::isa() { return "I am a Y."; }

    X* anXptr(int y) { return y? (new Y) : (new X); }
    X& anXref(int y) { return *( y? (new Y) : (new X) ); }

    main()
    {
        // the problem exists with functions returning pointers

        X* xp= anXptr(0);
        X* xq= anXptr(1);
        *xp = *xq;      // the problem exists with ptr variables

        X& xr= *anXptr(0);
        X& xs= *anXptr(1);
        xr = xs;        // the problem exists with ref variables

        // the problem exists with functions returning references

        X* xxp= & anXref(0);
        X* xxq= & anXref(1);
        *xxp = *xxq;    // the problem exists with ptr variables

        X& xxr= anXref(0);
        X& xxs= anXref(1);
        xxr = xxs;      // the problem exists with ref variables
    }

Again, the ONLY problems I can see are that we are using a non-virtual
operator=() where a virtual is required, and that assignment of objects
of specializable classes is, methinks, a bad idea.
Notice that all four of the assignments will break: field b will not
be copied. I don't know whether the virtual function table pointer
will be copied. I suppose so. Either way it's heinous.

        [ I am wrong about the following, however.... ]

I see no problem with specialized return types from virtual functions
of derived classes. It seems to me that they meet the protocol
specified by the base class. In the case you have an indefinitely-typed
pointer/reference to the object, you get something within the realm of
what you're promised. In the case you have a specifically-typed
pointer/reference (you know for some reason the exact type), you do not
have to do any casts to use any special protocol of the exact type.
I thought it was the obvious thing to do.

I'm afraid I'm really missing something, and I want to fully understand
this problem. The language designer and the implementers of cfront1.2,
cfront2.0, and g++1.35 all disallow specialized return types in the
virtual function case. Why? What am I missing?

                                        strick

........................................................................

AND THEN, as I typed ":wq", but before I hit return, I WAS ENLIGHTENED.

This is one of the fundamental differences between single- and
multiple- inheritance in C++. This is what I was missing, and
is probably what many of us who have been using single-inheritance
C++ for a couple of years are not used to thinking:

+---------------------------------------------------------------------------+
| In a multiply-inherited object, the pointers to derived classes are NOT   |
| necessarily the same addresses as the pointers to their base classes.     |
+---------------------------------------------------------------------------+

So suppose instead my class Y were defined with three bases,
with the base X buried in the middle:

    struct Z { int z; };
    struct W { int w; };
    struct X { int a; virtual char* isa(); };
    struct Y : Z, X, W { int b; virtual char* isa(); };

The memory layout of a Y is probably like this:

    Y* ==> Z* ==>   | int z; |
           X* ==>   | int a; |
           W* ==>   | int w; |
                    | int b; |

If the bases had been virtual, Z* and Y* would not even point
to the same address.

Now if I had either "virtual X& X::self()" or "virtual X* X::self()",
like this:

    struct X { int a; virtual char* isa(); virtual X* self(); };

, a class derived from X in a multiply-inherited way cannot have
a virtual function returning a pointer to itself and call that a pointer
to an X:

    struct Y : Z, X, W { int b; virtual char* isa();
                         virtual Y* self(); };      // illegal

, because a pointer to a Y is most definitely NOT a pointer to an X,
as shown in my layout diagram above.

Forgive me if someone else made this clear, but it didn't sink in
when I read it.

.........................................................................

To let this sink in a little more, go back to my function `anXptr',
and assume the multiple-inherited definition of class Y:

    X* anXptr(int y) { return y? (new Y) : (new X); };

If y is false, the address returned by (new X) is returned by the function.
But if y is true, the function returns the location of the X base-object
inside the Y instance allocated by (new Y). The code for these two
return types is NOT symmetric, like it would be in the single-inheritance
case.

Also notice that with m-i one can not say

    { X* xp= new Y; delete xp; }

for the same reason. You can only delete it using pointer of
the same type that it was allocated from. In single-inheritance
C++ you could use any base pointer to delete it.

                                        strick
--
strick@osc.com 415-325-2300 uunet!lll-winken!pacbell!osc!strick
( also strick@gatech.edu )
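
The boxed statement can be checked with a short sketch in current C++.
Which base subobject sits at the start of a Y is left to the
implementation (some compilers may place the polymorphic base X first
rather than Z), so the sketch only counts how many base pointers differ
from the address of the whole object. The default member values, the
virtual destructor, and the name `adjusted_base_count` are assumptions
added for the illustration.

```cpp
#include <cassert>

struct Z { int z = 0; };
struct W { int w = 0; };
struct X {
    int a = 0;
    virtual ~X() = default;
    virtual const char* isa() const { return "I am an X."; }
};
struct Y : Z, X, W {
    int b = 0;
    const char* isa() const override { return "I am a Y."; }
};

// Counts the base-class pointers whose address differs from the Y itself.
inline int adjusted_base_count() {
    Y y;
    void* whole = &y;
    int moved = 0;
    if (static_cast<void*>(static_cast<Z*>(&y)) != whole) ++moved;
    if (static_cast<void*>(static_cast<X*>(&y)) != whole) ++moved;
    if (static_cast<void*>(static_cast<W*>(&y)) != whole) ++moved;

    X* xp = &y;                         // implicit conversion adjusts the address
    assert(static_cast<Y*>(xp) == &y);  // the downcast undoes the adjustment
    return moved;
}
```

Three non-empty base subobjects cannot all begin at the same address,
so at least two of the conversions must adjust the pointer. In standard
C++, deleting a Y through an X* is well-defined when X has a virtual
destructor, as it does in this sketch.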
alonzo@microsoft.UUCP (Alonzo Gariepy) (08/18/89)
In article <480@osc.COM> strick@osc.UUCP (henry strickland) writes:
...
>This is one of the fundamental differences between single- and
>multiple- inheritance in C++. This is what I was missing, and
>is probably what many of us who have been using single-inheritance
>C++ for a couple of years are not used to thinking:
>
> +---------------------------------------------------------------------------+
> | In a multiply-inherited object, the pointers to derived classes are NOT   |
> | necessarily the same addresses as the pointers to their base classes.     |
> +---------------------------------------------------------------------------+

This is a simple implementation detail. The problem you have cited for
multiple inheritance applies as much to polymorphic use as it does to
variable return types. Any pointer adjustments are identical to those
which must take place when you write things like:

    SomeBaseClass *pSBC;
    MultiplyDerivedClass MDC;

    pSBC = (SomeBaseClass *)&MDC;   // pointer adjustment needed

I think the biggest point being missed in this discussion about alternate
return types is that all the work is done on the calling side at compile
time so that full information about involved classes is readily available.

With a preprocessor that knows about C++ calling syntax and a compiler that
allows compound expressions to yield L-values (G++), you can get the correct
effect with the macro definition:

    #define x.func(y,z) (x.thefunc(y, z), x)

this works equally well for virtual or inherited functions. But this is
not a satisfactory solution because: if x is an expression it is evaluated
more than once; more logic is needed to handle the ->func() case; no such
preprocessor exists; compound expressions yielding L-values are nonstandard;
you need to invent a new name for the macro; and the idea is so simple that
it can be implemented without preprocessing right in the compiler.

Alonzo Gariepy    // these opinions do not reflect
alonzo@microsoft  // the policies of Microsoft Corp.
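
The expansion the macro is meant to produce can be written out by hand.
In ISO C++ the built-in comma operator yields an lvalue when its right
operand is one, so the form below needs no extension today. A minimal
sketch; `log` and `comma_demo` are illustrative assumptions, and the
classes are a present-day rendering of the ones that opened the thread.

```cpp
#include <cassert>
#include <string>

struct Base {
    virtual ~Base() = default;
    virtual Base& hello() { log += "hello "; return *this; }
    std::string log;  // records what would have been printed (assumption)
};

struct Derived : Base {
    Base& hello() override { log += "g'day "; return *this; }
    void goodbye() { log += "see ya!"; }
};

inline std::string comma_demo() {
    Derived d;
    // Call hello() for its effect, discard the Base& it returns, and
    // continue the chain on d itself, whose static type is still Derived.
    (d.hello(), d).goodbye();
    return d.log;
}
```

As the list of drawbacks above says, `d` appears twice, so an expression
in its place would be evaluated twice.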