[comp.lang.c++] GC, TC++ 1.01 calls X::X

stephens@motcid.UUCP (Kurt Stephens) (01/23/91)

	I know this is kinda long, but...
	Has anybody else seen TC++ 1.01 generating code where a
copy constructor makes a copy of this'self?
	I ran into this problem when writing a small Lisp
interpreter.  All pointers to Lisp objects (class T and its derived
classes) are wrapped by an LVAL class, which increments a reference
count in the pointed-to object on construction and decrements it
(possibly GCing, i.e. deleting, the unreferenced object) in
LVAL::~LVAL().

-------------------------cut here-------------------------------
#include <iostream.h>

#define DEBUG

class T {
	friend class LVAL;
	int	refcount; // number of LVAL refs to this.
public:
	T() { 
#ifdef DEBUG
		cerr << "T::~(): " << (void*) this << '\n';
#endif
		refcount = 0;
	}
	virtual	~T() {
#ifdef DEBUG
		cerr << "T::~T(): " << (void*) this << '\n';
#endif
		if ( refcount > 0 )
			cerr << "T::~T(): this = " << (void*) this
			<<	": " << refcount << " dangling reference(s)\n";
	}
	int	refs() { return refcount; }
	virtual	void	print ( ostream& os ) { os << "T"; }
};

class Nil : public T {
public:
	Nil() {}
	void	print ( ostream& os ) { os << "NIL"; }
};

class LVAL {
	T*	p;
	void	ref(T* ptr) { p = ptr; if ( p != NULL ) p->refcount ++; }
	void	unref() {
		if ( p != NULL && -- (p->refcount) <= 0 )
			delete p;
		p = NULL;
	}
public:
	LVAL () {
#ifdef DEBUG
		cerr << "LVAL::LVAL(): "
		<<	(void*) this << '\n';
#endif
		p = NULL;
	}
	LVAL ( T* p ) {
#ifdef DEBUG
		cerr << "LVAL::LVAL(T* p): "
		<<	(void*) this << ": "
		<<	(void*) p << '\n';
#endif
		ref(p);
	}
	~LVAL () {
#ifdef DEBUG
		cerr << "LVAL::~LVAL(): "
		<<	(void*) this << '\n';
#endif
		unref();
	}
	//
	// THE CULPRIT!!
	//
	LVAL ( LVAL& lval ) {
#ifdef DEBUG
		cerr << "LVAL::LVAL(LVAL& lval): "
		<<	(void*) this << ": "
		<<	(void*) &lval << '\n';

		if ( this == &lval )
			cerr << "LVAL::LVAL(LVAL& lval): "
			<<	(void*) this
			<<	": this == &lval: compiler error\n";
#endif
		ref(lval.p);
	}
	LVAL&	operator = ( LVAL& lval ) {
		if ( this != &lval ) {	// self-assignment: unref() could delete p first
			unref();
			ref(lval.p);
		}
		return *this;
	}
	T&	operator() () { return *p; }
#define	CAST(x,t)	((t&)((x)()))

	void	print ( ostream& os ) { p->print(os); }

	friend	ostream& operator << ( ostream& os, LVAL& lval ) {
#ifdef DEBUG
		os << ((void*) &lval) << "->"
		   << (void*) lval.p << ':'
		   << lval.p->refs() << ':';
#endif
		lval.print(os); return os; }
};


class Cons : public T {
	LVAL	car, cdr;
public:
	Cons () : car(new Nil), cdr(new Nil) {}
	Cons ( LVAL& CAR, LVAL& CDR ) : car(CAR), cdr(CDR) {}

	LVAL	CAR() { return car; }
	LVAL	CDR() { return cdr; }

	void	print ( ostream& os ) {
		os << '(' << car << " . " << cdr << ')';
	}
};

//
// other classes, functions.
//


//
// WARNING:
// 	No dynamic type checking!!! BE CAREFUL!!
//
LVAL	car (LVAL& cons) { return CAST(cons,Cons).CAR(); }
LVAL	cdr (LVAL& cons) { return CAST(cons,Cons).CDR(); }

#define p(x)	cout << #x << " = " << (x) << '\n'

int main() {
	LVAL	A = new T;
	p(A);

	LVAL	B = new Nil;
	p(B);

	LVAL	C = new Cons ( A, B );
	p(C);
	p(car(C));
	p(cdr(C));

	LVAL	D;
	p(D = C);
}
-------------------------cut here-------------------------------

	Now implement some other functions that return and pass LVAL instances
(deeply nested calls, please ;^).
	The problem is that sometimes LVAL::LVAL(LVAL& lval) is
called with this == &lval!
	LVAL::LVAL(LVAL& lval) then increments this->p->refcount, which
was already incremented for this instance of LVAL.  Since
LVAL::~LVAL() is called only once for it, p->refcount never goes back
to 0, so the garbage isn't collected.
(I know this is pretty primitive GC, but it works, when it works! ;^)
Obviously, there is a simple kludge:

	LVAL (LVAL& lval) { if ( this != &lval ) ref(lval.p); }

	But the point is that a class instance should not be constructed
from itself, right?  Now, I don't know if the above
example will produce the funky calls to LVAL::LVAL(LVAL&)
(I'm reproducing this from my noodle).  I'll try to come back with
a definite (short) example.
	Maybe it has something to do with copy constructors (CC's) for return values.
I've implemented it using inline and non-inline functions, both
with the same problems.
	Could some people place runtime checks in some real
working TC++ classes to check my suspicions, like:

	X::X(X&x) { if ( this == &x )
		cerr << "X::X(X&): this == &x\n"; }

	Has anybody seen this happen before? Are there any other
implementations that do this?  This program works fine on Sun 2.0:

T::~(): 0x203ac
LVAL::LVAL(T* p): 0xefffcb8: 0x203ac
A = 0xefffcb8->0x203ac:1:T
T::~(): 0x20bc0
LVAL::LVAL(T* p): 0xefffcb4: 0x20bc0
B = 0xefffcb4->0x20bc0:1:NIL
T::~(): 0x20bcc
LVAL::LVAL(LVAL& lval): 0x20bd4: 0xefffcb8
LVAL::LVAL(LVAL& lval): 0x20bd8: 0xefffcb4
LVAL::LVAL(T* p): 0xefffcac: 0x20bcc
C = 0xefffcac->0x20bcc:1:(0x20bd4->0x203ac:2:T . 0x20bd8->0x20bc0:2:NIL)
LVAL::LVAL(LVAL& lval): 0xefffca0: 0x20bd4
car(C) = 0xefffca0->0x203ac:3:T
LVAL::LVAL(LVAL& lval): 0xefffc9c: 0x20bd8
cdr(C) = 0xefffc9c->0x20bc0:3:NIL
LVAL::~LVAL(): 0xefffc9c
LVAL::~LVAL(): 0xefffca0
LVAL::LVAL(): 0xefffc98
D = C = 0xefffc98->0x20bcc:2:(0x20bd4->0x203ac:2:T . 0x20bd8->0x20bc0:2:NIL)
LVAL::~LVAL(): 0xefffc98
LVAL::~LVAL(): 0xefffcac
LVAL::~LVAL(): 0x20bd8
LVAL::~LVAL(): 0x20bd4
T::~T(): 0x20bcc
LVAL::~LVAL(): 0xefffcb4
T::~T(): 0x20bc0
LVAL::~LVAL(): 0xefffcb8
T::~T(): 0x203ac

P.S.: Put this one in the TC++ bug list.

Kurt A. Stephens		Foo::Foo(){return Foo();}
stephens@void.rtsg.mot.com	"When in doubt, recurse."

karel@prisma.cv.ruu.nl (Karel Zuiderveld) (01/24/91)

In <6309@celery34.UUCP> stephens@motcid.UUCP (Kurt Stephens) writes:

>	I know this is kinda long, but...
>	Has anybody else seen TC++ 1.01 generating code where a
>copy constructor makes a copy of this'self?

Yes, I happened to run into the same problem using TC++ 1.00 (I don't
know how to get the update to 1.01 in the Netherlands :-().

I finally traced the bug down to the copy constructor of the class
'string' I was using.

The workaround is obvious: an explicit test of whether the address of the
object to copy is equal to this.

Since the workaround was so easy, I didn't track down in which specific
cases the bug occurs.

Beware!

Karel
-- 
Karel Zuiderveld                            E-mail: karel@cv.ruu.nl
3D Computer Vision - Room E.02.222          Tel:    (31-30) 506682/507772
Academisch Ziekenhuis Utrecht               Fax:    (31-30) 513399
Heidelberglaan 100, 3584 CX Utrecht, The Netherlands

jimad@microsoft.UUCP (Jim ADCOCK) (01/29/91)

I've been looking for the "correct" answer to these kinds of questions
on comp.std.c++ too.  The questions being basically, what kinds of 
copies are allowed and/or not allowed, including in what situations
is a compiler allowed to make a copy at the same address as a previous
object, and do functions return by copy.

"Identity" seems to be a very poorly defined concept in C++ [ARM] today.

stephens@motcid.UUCP (Kurt Stephens) (02/07/91)

jimad@microsoft.UUCP (Jim ADCOCK) writes:

>I've been looking for the "correct" answer to these kinds of questions
>on comp.std.c++ too.  The questions being basically, what kinds of 
>copies are allowed and/or not allowed, including in what situations
>is a compiler allowed to make a copy at the same address as a previous
>object, and do functions return by copy.

To copy construct onto this'self doesn't make any sense at all.
It should not be allowed.  A copy constructor should be *constructing*
a NEW object.

Kurt A. Stephens		Foo Foo::Foo(){return Foo();}
stephens@void.rtsg.mot.com	"When in doubt, recurse."

jimad@microsoft.UUCP (Jim ADCOCK) (02/16/91)

In article <6432@celery34.UUCP> stephens@motcid.UUCP (Kurt Stephens) writes:
|jimad@microsoft.UUCP (Jim ADCOCK) writes:
|
|>I've been looking for the "correct" answer to these kinds of questions
|>on comp.std.c++ too.  The questions being basically, what kinds of 
|>copies are allowed and/or not allowed, including in what situations
|>is a compiler allowed to make a copy at the same address as a previous
|>object, and do functions return by copy.
|
|To copy construct onto this'self doesn't make any sense at all.
|It should not be allowed.  A copy constructor should be *constructing*
|a NEW object.

Let me give you a couple of counterexamples:

A not uncommon C++ programming hack is to allocate a large block of
memory, and use the placement operator to create an "Object" block
header at the start of that block.  The block header is used to keep
information about the memory allocation.  Then, if one creates subclasses
of the block header, the placement operator is used again to create the
new subclass header "on top of" the existing header.

Admittedly, such an approach is a hack, but are you saying that compilers
are allowed to generate copy constructors in such a way that re-constructing
an object on top of itself will fail?   -- If compilers *are* allowed to 
assume no aliasing problems between source and the copied object, then 
better, faster constructor code can be generated  -- but then the
placement hack won't necessarily work.

Another variation on the placement hack is using placement and constructors
such as to "change the type" of an object in place.  Are constructors 
required to "work correctly" in those situations or not?

Here's another example:

A function "copy" has a const foo& for a parameter, and returns a foo by
value.  The compiler is smart enough to return the foo value in caller
space, avoiding a temporary.  The compiler is also smart enough to 
determine the foo being passed by reference never has its fields accessed
after the call to copy, and it has no destructor side-effects to worry about.
Can the compiler return the foo "value" in the same physical location 
as the foo being passed as a reference?  Why or why not?  Note
that the foo being passed as a parameter is "dead" after "copy", but
the program might still make use of its address -- as long as that
address is not dereferenced.  Consider something like:

#ifdef ONE_WAY_OR_ANOTHER
inline
#endif
const foo copy(const foo& parm) { return parm; }

BOOL doesItCopy()
{
	foo parm;
	const foo& result = copy(parm);
	if (&parm == &result) return NO;
	else return YES;
}

Must the "copy" of the value returned be in a separate place, or can
the copy be returned in the same place?  -- Again, this is an important
issue for people doing "object-oriented" programming, using addresses to
represent "identity."  And if the copy constructor is simply a bit-wise
copy, and "copy" is inline, one can easily imagine that optimizing
compilers would love to place parm and result at the same location --
then the whole "copy" routine becomes a NO-OP!