[comp.lang.c++] Smart Pointers - Another Proposal

gintera@fsd.cpsc.ucalgary.ca (Andrew Ginter) (02/05/91)

I agree with Tim Atkins' assessment of dumb pointer temporary problems
in Ron Guilmette's most recent smart pointers proposal (the one that
overloads operator& and operator T&).  These problems can be overcome
in single-tasking, serially garbage collected applications if the
compiler understands that for expressions involving smart pointers:

* all side effects of evaluating a component of an expression must
  take effect before evaluating the next component, and

* the RHS of primitive assignment operators (including +=, ++, etc.)
  must be evaluated before the LHS is evaluated.

These restrictions rule out some optimizations, but they work.  These
are not the only restrictions that work, and they are not the weakest
that work, though they may be the simplest outside of conservative
collectors.  Additional compiler support is required to support
parallel and incremental collectors and parallel applications.
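
The second restriction can be illustrated with a minimal sketch built on a toy relocating heap. Everything in it is an assumption made for illustration and is not part of either proposal: the names `ToyHeap`, `raw`, `collect_and_move` and `compute` are hypothetical, and so is the premise that the collector moves objects. Slot indices stand in for dumb pointers so that the stale access stays well defined.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy heap: `table` maps a handle to the slot currently holding its object.
struct ToyHeap {
    std::vector<int> slots;
    std::vector<std::size_t> table;

    std::size_t alloc(int v) {
        slots.push_back(v);
        table.push_back(slots.size() - 1);
        return table.size() - 1;           // the handle
    }

    // Stand-in for a dumb pointer: the object's current slot index.
    std::size_t raw(std::size_t handle) const { return table[handle]; }

    // Stand-in for a collection that relocates the object.
    void collect_and_move(std::size_t handle) {
        std::size_t old_slot = table[handle];
        int value = slots[old_slot];
        slots.push_back(value);            // copy object to its new home
        slots[old_slot] = -1;              // poison the vacated slot
        table[handle] = slots.size() - 1;
    }
};

// An RHS whose evaluation happens to trigger a collection.
inline int compute(ToyHeap& h, std::size_t handle) {
    h.collect_and_move(handle);
    return 99;
}

// LHS first: the dumb temporary is taken before the collection, so it is stale.
inline int store_lhs_first(ToyHeap& h, std::size_t handle) {
    std::size_t dumb = h.raw(handle);
    int rhs = compute(h, handle);
    h.slots[dumb] = rhs;                   // write lands in the vacated slot
    return h.slots[h.raw(handle)];         // live object still holds the old value
}

// RHS first: the dumb temporary is taken after the collection.
inline int store_rhs_first(ToyHeap& h, std::size_t handle) {
    int rhs = compute(h, handle);
    std::size_t dumb = h.raw(handle);
    h.slots[dumb] = rhs;
    return h.slots[h.raw(handle)];
}

// Usage: the same assignment either misses or reaches the live object.
inline void demo_evaluation_order() {
    ToyHeap h1;
    assert(store_lhs_first(h1, h1.alloc(7)) == 7);
    ToyHeap h2;
    assert(store_rhs_first(h2, h2.alloc(7)) == 99);
}
```

In this sketch the only difference between the two functions is the order in which the dumb value and the RHS are obtained, which is the ordering the restriction is meant to pin down.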

There are other problems with the proposal.  For instance:

* it is not clear in function signatures whether or not a pointer is
  smart, and

* there is no way to explicitly use a dumb pointer to a smart class,
  even if a developer knows that such use is safe.

I think that the way around these problems is to identify smart
pointers as such to C++ compilers explicitly.  This can be done with a
new keyword (the Modula-3 "traced" comes to mind) or by some other
syntactic convention in the declaration of smart pointer classes.

If anyone's interested, I've just finished writing a technical report
describing problems with various smart pointer proposals in the
context of a cooperative garbage collector for C++.  The report is
available via anonymous ftp from ucnet.ucalgary.ca, in the file

  DUA3:[ANONYMOUS.PUB.TECH]CPSC-91-417-01-DVI.Z.

The file really is a compress'ed ".dvi" file, even if the naming is
a bit off.

Andrew Ginter, 403-220-6320, gintera@cpsc.ucalgary.ca

rfg@NCD.COM (Ron Guilmette) (02/08/91)

In article <1991Feb5.021414.22979@cpsc.ucalgary.ca> gintera@fsd.cpsc.ucalgary.ca (Andrew Ginter) writes:
+I agree with Tim Atkins' assessment of dumb pointer temporary problems
+in Ron Guilmette's most recent smart pointers proposal (the one that
+overloads operator& and operator T&).  These problems can be overcome
+in single-tasking, serially garbage collected applications if the
+compiler understands that for expressions involving smart pointers:
+
+* all side effects of evaluating a component of an expression must
+  take effect before evaluating the next component, and
+
+* the RHS of primitive assignment operators (including +=, ++, etc.)
+  must be evaluated before the LHS is evaluated.

If my proposal for allowing *complete* overloading of `T::operator T&'
and `T::operator&' were to be accepted, you *would not* need any additional
help from the compiler.  You would however be required to write your code
carefully (and well) within those few regions of your program where the
dumb pointers were indeed allowed to roam freely.

In particular, once you had generated a dumb pointer value which pointed
to some object, you would need to avoid deleting that object until you
were sure that the given dumb pointer value (and any replicas of it)
would never be used again.

+Additional compiler support is required to support
+parallel and incremental collectors and parallel applications.

I never said it wasn't.  That is a separate problem from the one my
proposal was trying to address.

+There are other problems with the proposal.  For instance:
+
+* it is not clear in function signatures whether or not a pointer is
+  smart, and

Huh?  That's entirely cryptic.

+* there is no way to explicitly use a dumb pointer to a smart class,
+  even if a developer knows that such use is safe.

That's totally incorrect.  Obviously, whenever a member function of a
"smart" class is called, it gets a `this' pointer which is a dumb
pointer to an object of the class.  The `this' pointer can be used
in any and all ways that a pointer to such an object may be used.
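
As a minimal sketch of this point (the class `Counter` and the helper `bump` are hypothetical names, not taken from the proposal): inside a member function, `this` is an ordinary dumb pointer and can be passed along or copied like any other.

```cpp
#include <cassert>

class Counter {
    int n_;
    // Private helper that takes a plain (dumb) pointer to the object.
    static void bump(Counter* self) { ++self->n_; }
public:
    Counter() : n_(0) {}
    void increment_twice() {
        bump(this);                // `this` passed as an ordinary Counter*
        Counter* alias = this;     // ...or copied into another dumb pointer
        bump(alias);
    }
    int value() const { return n_; }
};

// Usage: both dumb-pointer uses reach the same object.
inline void demo_this_pointer() {
    Counter c;
    c.increment_twice();
    assert(c.value() == 2);
}
```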

What's the problem?

-- 

// Ron Guilmette  -  C++ Entomologist
// Internet: rfg@ncd.com      uucp: ...uunet!lupine!rfg
// Motto:  If it sticks, force it.  If it breaks, it needed replacing anyway.

gintera@fsd.cpsc.ucalgary.ca (Andrew Ginter) (02/09/91)

In article <3783@lupine.NCD.COM> rfg@NCD.COM (Ron Guilmette) writes:
> If my proposal for allowing *complete* overloading of `T::operator T&'
> and `T::operator&' were to be accepted, you *would not* need any additional
> help from the compiler.  You would however be required to write your code
> carefully (and well) within those few regions of your program where the
> dumb pointers were indeed allowed to roam freely.

I see now.  In your proposal, most designers of an "access controlled"
class X would not implement a public "operator ->" for the smart
pointer class referring to objects of type X.  Such an operator would
represent an undesirable "leak" of dumb pointers and references to X
into areas of an application which X's designer can't anticipate.

This means that most "generally useful" operations on X would be
phrased as static member functions or as "friend" functions.  This is
because most users of the class will not be able to use a smart
pointer x to X to say "x -> foo ()".  They'll have to say something
like "X::foo (x)" or just "foo (x)".
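
A minimal sketch of the two calling styles, in C++ as it stands today. The names `XPtr`, `foo` and `bar` are hypothetical, and the built-in `&` is used to create the smart pointer only because the proposed overloading rules do not exist.

```cpp
#include <cassert>

class XPtr;                                // hypothetical smart pointer to X

class X {
    int value_;
public:
    explicit X(int v) : value_(v) {}
    static int foo(const XPtr& x);         // called as X::foo(x)
    friend int bar(const XPtr& x);         // called as bar(x)
};

class XPtr {
    X* ptr_;                               // dumb pointer stays inside X and XPtr
    friend class X;
    friend int bar(const XPtr& x);
public:
    explicit XPtr(X* p) : ptr_(p) {}
    // Deliberately no operator-> and no operator*.
};

int X::foo(const XPtr& x) { return x.ptr_->value_; }
int bar(const XPtr& x)    { return x.ptr_->value_ * 2; }

// Usage: clients never write x->something.
inline void demo_calling_styles() {
    X obj(21);
    XPtr x(&obj);
    assert(X::foo(x) == 21);
    assert(bar(x) == 42);
}
```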

Andrew Ginter, 403-220-6320, gintera@cpsc.ucalgary.ca

rfg@NCD.COM (Ron Guilmette) (02/10/91)

In article <1991Feb8.212951.6212@cpsc.ucalgary.ca> gintera@fsd.cpsc.ucalgary.ca (Andrew Ginter) writes:
+In article <3783@lupine.NCD.COM> rfg@NCD.COM (Ron Guilmette) writes:
+> If my proposal for allowing *complete* overloading of `T::operator T&'
+> and `T::operator&' were to be accepted, you *would not* need any additional
+> help from the compiler.  You would however be required to write your code
+> carefully (and well) within those few regions of your program where the
+> dumb pointers were indeed allowed to roam freely.
+
+I see now.  In your proposal, most designers of an "access controlled"
+class X would not implement a public "operator ->" for the smart
+pointer class referring to objects of type X.

100% Correct.

+Such an operator would
+represent an undesirable "leak" of dumb pointers and references to X
+into areas of an application which X's designer can't anticipate.

100% Correct.

+This means that most "generally useful" operations on X would be
+phrased as static member functions or as "friend" functions.

Not necessarily.  Each "generally useful" member function for a type X
object could have a parallel counterpart member function for the "smart
pointer" class.  When you called the parallel operation for the smart
pointer class, the implication would be to perform the given operation
on the pointed-at object.

+This is
+because most users of the class will not be able to use a smart
+pointer x to X to say "x -> foo ()".

100% Correct.  Since there will be no operator-> or (unary) operator*
for the smart pointer class, if `sptr' is one of these smart pointer
variables, then both `sptr->foobar()' and `(*sptr).foobar()' will be
illegal simply because there are no such operators defined for the
`sptr' object.

As a matter of fact, this is an important component of my proposed scheme
for encapsulating any and all *legitimate non-null* dumb pointer *values*
within the class itself and within its (friendly) associated smart pointer
class.  YOU MUST NOT DEFINE EITHER AN operator* OR AN operator-> FOR THE
SMART POINTER CLASS.

In current examples of C++ code using "smart pointers" (such as those in
some of Stroustrup's papers on the subject) each smart pointer class does
define an operator* and an operator-> and these operators typically return
values of type `T&' (where `T' is the controlled class).  That however would
constitute a serious leakage of `dumb' reference values under my proposed
scheme.  Under my scheme, you would want to AVOID defining any public member
functions which yield dumb pointer values or dumb reference values.
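
The leak can be seen in a minimal sketch of a conventional smart pointer. `Widget`, `WidgetPtr` and the two `leak_via_...` helpers are hypothetical names chosen for illustration, not code from any of the papers mentioned.

```cpp
#include <cassert>

class Widget {
public:
    int size;
    Widget() : size(0) {}
};

// Conventional smart pointer: operator* and operator-> hand out dumb values.
class WidgetPtr {
    Widget* ptr_;
public:
    explicit WidgetPtr(Widget* p) : ptr_(p) {}
    Widget& operator*() const  { return *ptr_; }
    Widget* operator->() const { return ptr_; }
};

// Any client can recover the underlying dumb pointer in one expression.
inline Widget* leak_via_star(const WidgetPtr& sp)  { return &*sp; }
inline Widget* leak_via_arrow(const WidgetPtr& sp) { return sp.operator->(); }

// Usage: both routes yield the raw address of the controlled object.
inline void demo_leak() {
    Widget w;
    WidgetPtr sp(&w);
    assert(leak_via_star(sp) == &w);
    assert(leak_via_arrow(sp) == &w);
}
```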

Assuming that each smart pointer class is ultimately implemented in terms of
an actual `dumb pointer' member variable (of the smart pointer class) and
assuming that you could (through language rules and careful programming)
make sure that valid non-null values of type `dumb pointer to T' could only
be generated and/or stored within the class T itself and within its (friendly)
associated `smart pointer' class, then you would want to maintain the
isolation of all such dumb pointer values within these two classes by
writing all member functions of these two classes such that they do not
return any of the legitimate non-null dumb pointer values which they may
have to the outside world.  Rather, whenever some operation (any operation)
on such an "encapsulated" dumb pointer value was needed, it would have to
be provided as a member function of the smart pointer class in order to
fully maintain the encapsulation of the `dumb' values.

What I had in mind was something like this:

	class smart_pointer;

	class controlled {
		// ... data members ...
		operator controlled& ();	// make it private!
		controlled* address_of ()	// sneaky (and private) way...
			{ return this; }	// ...to get address of self
		friend class smart_pointer;	// very important!
	public:
		controlled ();
		~controlled ();
		smart_pointer& operator& ();	// address-of operator

		void print_self ();
		void mutate_self ();
		void operate_on_self ();
	};

	class smart_pointer {
		controlled *ptr;
		// ... other data members ...
		friend class controlled;	// very important!
	public:
		smart_pointer (controlled *arg)
			{ ptr = arg; }
		smart_pointer (const smart_pointer&);  // copy constructor
		~smart_pointer ();

		void print_referent ();
		void mutate_referent ();
		void operate_on_referent ();
	};

	// create/generate a smart pointer to one's self

	smart_pointer& controlled::operator& ()
	{
		return *(new smart_pointer(this));
	}

	void smart_pointer::print_referent ()
	{
		ptr->print_self ();
	}

Note that operations upon the dumb pointer member variable `ptr' are kept
strictly within the encapsulation region.

You could only be confident of achieving the kind of "encapsulation of
dumbness" that I have been talking about if:

	(a) you code carefully, so that no member functions (except
	    private ones) return (or otherwise yield) any dumb pointer
	    or dumb reference values for the controlled class, and

	(b) the language rules change so that `operator&' applies in cases
	    where you (implicitly) get the address of the zeroth element
	    of an array, and

	(c) the language rules change so that you can define your own
	    `T::operator T&' (which will actually be used when values of
	    type T are converted to values of type T&) so that you can take
	    control over the production of dumb references, and

	(d) you define your controlled classes with explicit `T::operator T&'
	    and `T::operator&' operators (making `T::operator T&' private of
	    course).
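
The gaps that (b) and (c) are meant to close can be seen in C++ as it stands. In the minimal sketch below, `Handle`, `Controlled` and the call counter are hypothetical stand-ins; only the explicit `&` goes through the overloaded operator, while array decay and reference binding bypass it.

```cpp
#include <cassert>

struct Handle { };                     // hypothetical stand-in for a smart pointer

struct Controlled {
    static int calls;                  // counts uses of the overloaded operator&
    Handle operator&() { ++calls; return Handle(); }
};
int Controlled::calls = 0;

inline int count_address_of_calls() {
    Controlled::calls = 0;
    Controlled one;
    Controlled many[3];

    Handle h = &one;                   // explicit &: the overloaded operator runs
    Controlled* p = many;              // array decay: built-in conversion, no call
    Controlled& r = one;               // reference binding: no call either
    (void)h; (void)p; (void)r;
    return Controlled::calls;          // only the explicit & was counted
}

// Usage: two dumb values were produced without the class being consulted.
inline void demo_uncontrolled_paths() {
    assert(count_address_of_calls() == 1);
}
```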

-- 

// Ron Guilmette  -  C++ Entomologist
// Internet: rfg@ncd.com      uucp: ...uunet!lupine!rfg
// Motto:  If it sticks, force it.  If it breaks, it needed replacing anyway.

gintera@fsd.cpsc.ucalgary.ca (Andrew Ginter) (02/11/91)

In article <3813@lupine.NCD.COM> rfg@NCD.COM (Ron Guilmette) writes:
>In current examples of C++ code using "smart pointers" (such as those in
>some of Stroustrup's papers on the subject) each smart pointer class does
>define an operator* and an operator-> and these operators typically return
>values of type `T&' (where `T' is the controlled class).  That however would
>constitute a serious leakage of `dumb' reference values under my proposed
>scheme.  Under my scheme, you would want to AVOID defining any public member
>functions which yield dumb pointer values or dumb reference values.

This sounds workable.  You should note that while this approach does
not require most data members of a "smart" class to be private, those
members may as well all be private, because they cannot be directly
accessed by saying something like "X -> a = 3".

My biggest remaining complaint with the proposal is that there is a
fair amount of work to be done on the part of smart class/smart
pointer designers.  Member functions must be implemented for the smart
class and a calling interface must be provided in the smart pointer
class.  Smart pointers must implement conversion operators so that
pointers to derived classes can be converted to pointers to base
classes.  And implementors of smart class member functions must code
very carefully to avoid creating dumb pointers which might survive a
garbage collection.
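
To give a sense of the conversion work involved, here is a minimal sketch of a hand-written derived-to-base conversion between two smart pointer classes. The names `Base`, `Derived`, `BasePtr`, `DerivedPtr` and `kind_via_base` are hypothetical, and the sketch ignores collection entirely.

```cpp
#include <cassert>

class Base {
public:
    virtual ~Base() {}
    virtual int kind() const { return 1; }
};

class Derived : public Base {
public:
    virtual int kind() const { return 2; }
};

class BasePtr {
    Base* ptr_;
public:
    explicit BasePtr(Base* p) : ptr_(p) {}
    int kind_of_referent() const { return ptr_->kind(); }
};

class DerivedPtr {
    Derived* ptr_;
public:
    explicit DerivedPtr(Derived* p) : ptr_(p) {}
    // Built-in pointers convert Derived* to Base* for free;
    // a smart pointer class has to spell the conversion out.
    operator BasePtr() const { return BasePtr(ptr_); }
};

inline int kind_via_base(BasePtr p) { return p.kind_of_referent(); }

// Usage: a DerivedPtr is accepted where a BasePtr is expected.
inline void demo_conversion() {
    Derived d;
    DerivedPtr dp(&d);
    assert(kind_via_base(dp) == 2);
}
```

One such operator is needed for every base class a smart pointer should convert to, which is part of the per-class effort described above.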

Andrew Ginter, 403-220-6320, gintera@cpsc.ucalgary.ca