[comp.lang.c++] Default copy constructor not making a copy.

jlk@pravda.gatech.edu (Janet Kolodner) (02/14/91)

Can someone explain why the two classes defined below exhibit different
behavior?  The class without the explicit copy constructor does not make
a copy of the *this argument for the + operator.  There is a related
comment in Ellis & Stroustrup (12.6.1): "Had no copy constructor been
declared...all would have happened exactly as before because a copy
constructor would have been generated.  Again, a good compiler would
eliminate the use of the generated copy constructor."  I'm not convinced
that this statement applies in this case.  Any enlightenment on this
subject would be welcome.  I'm using ATT C++ 2.0 on a Sparc.


#include <iostream.h>
#include <iomanip.h>

class NoCopy {
public:
  int x;
  NoCopy (int a) { x = a; };
  NoCopy &operator +=(int a) { x += a; return *this; };
  NoCopy  operator + (int a) { return NoCopy(*this) += a; };
};

class Copy {
public:
  int x;
  Copy(const Copy &c) : x(c.x) {};
  Copy (int a) { x = a; };
  Copy &operator =(int a) { x = a; return *this; };
  Copy &operator +=(int a) { x += a; return *this; };
  Copy  operator + (int a) { return Copy(*this) += a; };
};

void main()
{
  NoCopy xNo(1), xNo2(-1);
  Copy xYes(2), xYes2(-2);
  cout << "No=" << xNo.x << " Yes=" << xYes.x << endl;
  cout << "No2=" << xNo2.x << " Yes=" << xYes2.x << endl;
  xNo2 = xNo + 2;
  xYes2 = xYes + 2;
  cout << "No=" << xNo.x << " Yes=" << xYes.x << endl;
  cout << "No2=" << xNo2.x << " Yes=" << xYes2.x << endl;
}

Output of the program:
No=1 Yes=2
No2=-1 Yes=-2
No=3 Yes=2
No2=3 Yes=4

jgro@lia (Jeremy Grodberg) (02/19/91)

In article <22023@hydra.gatech.EDU> gatech.edu!blsouth!klein (Michael Klein) writes:
>Can someone explain why the two classes defined below exhibit different
>behavior?  The class without the explicit copy constructor does not make
>a copy of the *this argument for the + operator....
>
>#include <iostream.h>
>#include <iomanip.h>
>
>class NoCopy {
>public:
>  int x;
>  NoCopy (int a) { x = a; };
>  NoCopy &operator +=(int a) { x += a; return *this; };
>  NoCopy  operator + (int a) { return NoCopy(*this) += a; };
>};

The ARM is immensely confusing on this issue, and I was going to post
saying that it was completely ambiguous whether NoCopy(*this) should 
return a distinct object or not.  My thought was that if NoCopy(*this)
is treated as a cast, then it was OK for NoCopy(*this) to evaluate to
the same object you started with.  However, after a lot of digging, 
I believe I can make the case that NoCopy(*this) should return a distinct 
object in this situation.  The example output shows that NoCopy(*this) 
is in fact returning *this, and not a temporary object.

The problem seems to be the ambiguity between NoCopy(*this) as a cast
operator (Sec. 5.2.3) and as an explicit constructor call (Sec. 12.1).
Section 6.8 attempts to disambiguate this expression, and claims
it is an expression statement, given a choice between an expression
and a declaration, but Section 12.1 makes no comment about the ambiguity
between a cast and an explicit constructor call. It appears that there is no
difference.  Section 5.2.3 says that a function-style cast "constructs a 
value of the specified type...."  Section 5.4 makes no distinction
between a function-style cast and a normal C style cast except that
the latter can be used to convert to a type that does not have a
"simple-type-name."  It goes on to say that "An object or a value
may be converted to a class object (only) if an appropriate constructor
or conversion operator has been defined,"  which implies that such
constructor or conversion operator will be called.  Therefore, it appears 
the only time there is a difference between a cast and an explicit 
constructor call is when the "destination" of a cast is not a class object,
in which case the expression cannot possibly be interpreted as a constructor
call.

Therefore, the ARM at least strongly implies that whether this NoCopy(*this)
is treated as a function-style cast or as an explicit constructor call,
it should return a distinct object.  Another reason to expect a distinct
object is that there is no alternate syntax for creating an unnamed
temporary.  So, I'd say we have a bug in CFront, and some fuzzy areas in
the spec to clean up.  If you say you can make the case that I am wrong, 
then I will go back to my original position that the ARM is too unclear
about this to say either way, and we really need a better spec (and an
answer from the C++ gods in the interim).

>
>class Copy {
>public:
>  int x;
>  Copy(const Copy &c) : x(c.x) {};
>  Copy (int a) { x = a; };
>  Copy &operator =(int a) { x = a; return *this; };
>  Copy &operator +=(int a) { x += a; return *this; };
>  Copy  operator + (int a) { return Copy(*this) += a; };
>};

Here, Copy(*this) is taken as an explicit constructor call, and you get
a different, temporary object.

>
>void main()
>{
>  NoCopy xNo(1), xNo2(-1);
>  Copy xYes(2), xYes2(-2);
>  cout << "No=" << xNo.x << " Yes=" << xYes.x << endl;
>  cout << "No2=" << xNo2.x << " Yes=" << xYes2.x << endl;
>  xNo2 = xNo + 2;
>  xYes2 = xYes + 2;
>  cout << "No=" << xNo.x << " Yes=" << xYes.x << endl;
>  cout << "No2=" << xNo2.x << " Yes=" << xYes2.x << endl;
>}
>
>Output of the program:
>No=1 Yes=2
>No2=-1 Yes=-2
>No=3 Yes=2
>No2=3 Yes=4


-- 
Jeremy Grodberg      "I don't feel witty today.  Don't bug me."
jgro@lia.com          

jimad@microsoft.UUCP (Jim ADCOCK) (02/20/91)

In article <22023@hydra.gatech.EDU> gatech.edu!blsouth!klein (Michael Klein) writes:
|Can someone explain why the two classes defined below exhibit different
|behavior?  The class without the explicit copy constructor does not make
|a copy of the *this argument for the + operator.  There is a related
|comment in Ellis & Stroustrup (12.6.1): "Had no copy constructor been
|declared...all would have happened exactly as before because a copy
|constructor would have been generated.  Again, a good compiler would
|eliminate the use of the generated copy constructor."  I'm not convinced
|that this statement applies in this case.  Any enlightenment on this
|subject would be welcome.  I'm using ATT C++ 2.0 on a Sparc.

Part of the confusion arises from the word "good" above -- today's C++
compilers vary greatly in how "good" they are at understanding what
things in C++ they can or cannot optimize.  Thus, C++ compilers can and
do give greatly varying and, at first glance, nonsensical performance when
presented with implied or explicit conversion or cast operators.

Note that given a class FOO,  FOO(x), (FOO)x, and (FOO)(x) all use a
FOO constructor to make an unnamed temporary FOO from x.

This is to be distinguished from reference casting:  (FOO&)x  -- which
says to consider that glob of bits at x's location as if it were a FOO.
This reference cast hack is essentially the same as the common "C" pointer
hack:  *((FOO*)(&x))  "Consider x's address as if it were a FOO pointer and
dereference that FOO pointer."

But I digress.  Back to that unnamed FOO temporary created by FOO(x).
The C++ specs allow compilers to eliminate any and all unnamed temporaries
if the only way to tell if the temporary existed is via side-effects of
the constructor and destructor.  Further,  ARM 12.1.2c states that unnamed
temporaries can be eliminated when their side effect is to make a copy!

Conversely, C++ compilers are free to introduce unnamed temporaries when
the compiler considers it necessary or convenient.

These rules can be easily summarized from the C++ programmer's point of view:

"NEVER rely on unnamed temporaries to do anything you want them to do!"

How can one tell if one is relying on an unnamed temporary?  Simple, ask 
yourself if that object you think you're relying on has a name.  If so, what
is its name?  If it doesn't have a name, you CANNOT rely on it.  How then,
can one introduce a NAMED temporary?  The most common way is to use a named
local "auto" variable:

FOO FOO::operator + (int a) 
{ 
	FOO foo(*this);
	return foo += a; 
}

In this example the introduction of a named local variable "foo" forces a
copy of *this.  Adding a to foo then cannot modify *this.  The value "foo += a"
is in turn returned from "operator+" via unnamed temporary -- another issue 
which I will address below.

Another approach to giving a temporary a name is to assign a reference to it.
[ARM pg 268] A reference is just another way to give a name to an object:	

FOO FOO::operator + (int a)
{
	FOO& foo = FOO(*this);
	return foo += a;
}

I claim in this example that since the temporary created by FOO(*this) is 
bound to a name "foo", the compiler is not free to optimize it away.
However, this rule is subtle enough that you shouldn't be surprised if many
of today's C++ compilers screw it up.  Thus, I recommend the first approach.
Where the reference-to-a-temporary approach can be important however, is
in forcing a valid return object from a function:

	FOO foo;
	int a;
	....
	foo + a;  // which calls foo.operator+(a)

This allows the compiler to "optimize away" the unnamed temporary used to return
the results from FOO FOO::operator+(int).  Whereas if you say:

	const FOO& namedvalue = foo + a;

then the compiler is forced to generate a valid temporary, named
"namedvalue", which remains valid for as long as "namedvalue" is in scope.
[Note that binding a reference to a temporary doesn't keep the temporary
alive indefinitely.  Since namedvalue is declared in the scope of the call
to the function, the temporary stays alive only until the end of the scope
in which "operator+" is called.  Also, function return values aren't
lvalues, so the reference bound to one must be a const reference.]

In Summary:

1) NEVER rely on unnamed temporaries to do anything you want them to do.

2) ALWAYS introduce a named variable to force a copy.

3) COUNT ON finding C++ compilers today that screw up even these simple rules.