[comp.std.c++] Smart Pointers -- A proposed language extension

rmartin@clrcom.clear.com (Bob Martin) (12/29/90)

				Smart Pointers for C++
					A Proposal for
				A new language feature

				REQUEST FOR COMMENTS

THE PROBLEM
	The facilities of C++ allow for overloading the pointer operators
	so as to create "smart pointers".  These are classes which behave
	in some ways as pointers to an object, but have other more
	intelligent behaviors as well.  For example it would be possible to
	create a smart pointer to a block of bytes which would not allow
	itself to be assigned any value outside the bounds of the block.
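
	The bounds-checked pointer described above could look roughly like
	the following sketch.  The class name BytePtr, the offset() helper
	and the choice to throw std::out_of_range are assumptions made for
	illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical smart pointer into a fixed block of bytes.
// It refuses to take any position outside [base, base + size].
class BytePtr {
    unsigned char* base_;   // start of the block
    std::size_t    size_;   // number of bytes in the block
    std::size_t    pos_;    // current offset, always <= size_

public:
    BytePtr(unsigned char* base, std::size_t size)
        : base_(base), size_(size), pos_(0) {
        assert(base != nullptr);
    }

    // Move the pointer; throws instead of leaving the block.
    BytePtr& operator+=(std::ptrdiff_t n) {
        std::ptrdiff_t next = static_cast<std::ptrdiff_t>(pos_) + n;
        if (next < 0 || static_cast<std::size_t>(next) > size_)
            throw std::out_of_range("BytePtr moved outside its block");
        pos_ = static_cast<std::size_t>(next);
        return *this;
    }

    // Dereference; the one-past-the-end position may be held but not read.
    unsigned char& operator*() const {
        if (pos_ >= size_)
            throw std::out_of_range("BytePtr dereferenced at end of block");
        return base_[pos_];
    }

    std::size_t offset() const { return pos_; }
};
```

	A failed move leaves the pointer where it was, so the caller can
	recover and continue.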

	Although this concept is useful it is limited in that there is no
	way to enforce the use of smart pointers to classes designed to
	be referenced by them.  Thus if a user creates a class and intends 
	that it should only be referenced via a "smart pointer" he must 
	inform all the users of the class that they may not use normal 
	pointers but must use his "smart pointer" class.  If a programmer
	forgets and uses a normal pointer, then the program will not work
	and it will be very difficult to find the error.

PURPOSE
	The following is a proposal for a language addition which would
	allow the user to declare a smart pointer to a class, and enforce
	its use by causing the compiler to transform all pointer 
	declarations or expressions referring to that class to be 
	instantiations of the smart pointer.

PROPOSED SYNTAX
	Whereas "class T" is a declaration of T, "class *T" is a
	declaration of a "smart pointer" to T.  If such a class is
	defined then the compiler is warned that a smart pointer to T
	exists and that all expressions and declarations which
	involve pointers to T (T*) should be converted to expressions or
	declarations involving the smart pointer (*T) instead.  For example
	the following declaration:  T *a;  would be interpreted as: *T a;
	Or the following code segment:
		T x;
		T *p;
		p = &x;
	Would be interpreted as:
		T x;
		*T p;
		p = *T(x);
	
	Thus a user can enforce the use of smart pointers without having 
	to trust the memory of other users. The other users simply declare 
	and use pointers to these objects normally without necessarily 
	knowing that the pointers are in fact "smart pointers".
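
	Since existing compilers cannot parse "class *T", the effect of the
	transformation can only be imitated by hand.  In the sketch below
	the name PtrT is a hypothetical stand-in for *T, and rewritten()
	is an illustrative function showing what the compiler would
	substitute for the user's code under this proposal.

```cpp
#include <cassert>

struct T { int value; };

// Hypothetical stand-in for the proposed "*T"; standard C++ cannot
// spell that name, so the sketch calls it PtrT.
class PtrT {
    T* p_;                            // the underlying dumb pointer
public:
    PtrT() : p_(nullptr) {}           // empty smart pointer
    explicit PtrT(T& t) : p_(&t) {}   // "pointerizing" constructor
    T& operator*() const  { assert(p_ != nullptr); return *p_; }
    T* operator->() const { assert(p_ != nullptr); return p_; }
};

// What the user would write under the proposal:
//     T x;  T *p;  p = &x;  p->value = 7;
// What the compiler would effectively substitute:
int rewritten() {
    T x = {0};
    PtrT p;          // "T *p;"   becomes a smart pointer declaration
    p = PtrT(x);     // "p = &x;" becomes a pointerizing construction
    p->value = 7;    // goes through PtrT::operator->
    return x.value;
}
```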

DEFINING SMART POINTERS
	Within the scope of a *T definition, pointer operations on type T
	are not converted to *T.  Thus within the member functions of *T,
	pointers to T can be used normally.  For example:

		class *T {
			T *p;

		public:
			*T() 		{p = (T*)NULL;}
			*T(T& t) 	{p = &t;}	// take the address; a T& cannot be cast to T*
		};
	works because T* is a dumb pointer when in the context of
	the *T definition.

	Smart pointers to T can also be used in the scope of the *T 
	definition, but they must be declared as *T.

	A definition of the form class *T{}; creates a dummy smart pointer
	which behaves exactly the way a dumb pointer does.  It is a way
	of saying "I don't want a smart pointer to T."  This is useful
	if you wish to cancel the inheritance of a smart pointer to a 
	BASE CLASS.  (See SMART POINTER INHERITANCE)

CASTING SMART POINTERS
	Variables of type *T cannot normally be cast to pointer types.
	Specifically, they cannot be cast to (void *) unless the
	conversion has been added as a member function operator void*().

	Variables or expressions of type void* can be converted into *T
	by casting as follows: 
		void *v;
		T *p;
		p = (T *)v;  // converts void* into *T;

	This allows functions of the form operator *T(void*) to be
	written so as to aid in the conversion from void* to *T.
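
	In standard C++ the two conversions can be approximated with an
	explicit constructor and an explicit conversion operator.  This is
	a sketch under assumptions: PtrT stands in for *T, the constructor
	taking void* stands in for the proposed operator *T(void*), and
	round_trip() is a hypothetical demonstration function.

```cpp
#include <cassert>

struct T { int value; };

class PtrT {   // hypothetical stand-in for "*T"
    T* p_;
public:
    PtrT() : p_(nullptr) {}
    explicit PtrT(T& t) : p_(&t) {}

    // Stand-in for the proposed "operator *T(void*)": builds a smart
    // pointer from an untyped address.  The caller must guarantee that
    // v really addresses a T (or is null).
    explicit PtrT(void* v) : p_(static_cast<T*>(v)) {}

    // Opt-in conversion back to void*; without this member the cast
    // below would be rejected by the compiler.
    explicit operator void*() const { return p_; }

    T& operator*() const { assert(p_ != nullptr); return *p_; }
};

int round_trip() {
    T x = {42};
    PtrT p(x);
    void* v = static_cast<void*>(p);   // uses operator void*()
    PtrT q(v);                         // plays the role of (T *)v
    return (*q).value;
}
```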

SMART POINTER INHERITANCE
	When a smart pointer *B has been declared for a class B, then any
	use of pointers to class D derived from B will use *B smart
	pointers unless a *D smart pointer has been declared.  Thus:

		class B;
		class *B;
		class D : B;  // D is derived from B

		D *p;		// p is really a *B

	In the case of multiple inheritance, if one or more of the base
	classes has a smart pointer associated with it, it is illegal to
	use any pointer operations unless a smart pointer has been declared
	for the derived class:

		class A;
		class *A;	// smart pointer to A
		class B;
		class D : A,B; // D is derived from A and B.

		D *p;		// illegal unless *D is declared.
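
	The single inheritance rule can be imitated by hand as follows.
	PtrB is a hypothetical stand-in for *B, public inheritance and the
	virtual function id() are assumptions added so that the example
	has something observable to return.

```cpp
#include <cassert>

struct B {
    virtual ~B() {}
    virtual int id() const { return 1; }
};
struct D : public B {
    int id() const override { return 2; }
};

// Hypothetical stand-in for "*B".
class PtrB {
    B* p_;
public:
    PtrB() : p_(nullptr) {}
    explicit PtrB(B& b) : p_(&b) {}   // also accepts a D, since D is a B
    B& operator*() const  { assert(p_ != nullptr); return *p_; }
    B* operator->() const { assert(p_ != nullptr); return p_; }
};

// Under the proposal "D *p; p = &d;" would become the lines below,
// because no smart pointer to D has been declared.
int inherited_pointer() {
    D d;
    PtrB p;
    p = PtrB(d);
    return p->id();   // virtual call still reaches D::id
}
```

	In this sketch only the interface of B is reachable through p;
	how members specific to D would be reached through a *B is not
	addressed here.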

SMART POINTERS AS CLASSES
	Smart pointers are true classes.  They can have member data and
	functions, can inherit from one or more base classes, etc.  It
	seems likely that an inheritance hierarchy of classes would be
	shadowed by an inheritance hierarchy of smart pointers to those
	classes.

REQUIRED MEMBER FUNCTIONS FOR *T CLASS DEFINITIONS
	*T()
		The compiler does not supply a default constructor.  One must
		be defined with no parameters so that the compiler can create
		empty smart pointers.

	*T(T&)
		The pointerizing constructor.  Used by the compiler to create
		pointers to existing objects.

	operator *()
		Should return a reference to an object of type T.

	operator ->()
		Must return a true pointer to type T. (Not a smart pointer)
		The compiler will not attempt to convert it to a smart pointer.

RECOMMENDED MEMBER FUNCTIONS FOR *T CLASS DEFINITIONS		
	operator[](int)
		Should return a reference to an object of type T.

	*T& operator+=(int) and *T& operator-=(int)
		Should perform reasonable transformations on the pointer and
		return references to *T (hopefully references to "this").

	*T operator+(int) and *T operator-(int);
	friend *T operator+(int, *T&);
		Should perform reasonable transformations on the pointer and
		return a new *T.

	operator void*()
		Should return a void* 'v' which can properly be converted back
		to type *T via (T *)v.
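
	Taken together, the required and recommended arithmetic members
	might be written as in the sketch below.  PtrT again stands in for
	*T, no bounds checking is attempted, and walk() is a hypothetical
	function that exercises each operator once.

```cpp
#include <cassert>

struct T { int value; };

class PtrT {   // hypothetical stand-in for "*T"
    T* p_;
public:
    PtrT() : p_(nullptr) {}
    explicit PtrT(T& t) : p_(&t) {}

    T& operator*() const       { assert(p_ != nullptr); return *p_; }
    T* operator->() const      { assert(p_ != nullptr); return p_; }
    T& operator[](int i) const { assert(p_ != nullptr); return p_[i]; }

    // Compound forms modify this pointer and return a reference to it.
    PtrT& operator+=(int n) { p_ += n; return *this; }
    PtrT& operator-=(int n) { p_ -= n; return *this; }

    // Binary forms leave this pointer alone and return a new one.
    PtrT operator+(int n) const { PtrT r(*this); r += n; return r; }
    PtrT operator-(int n) const { PtrT r(*this); r -= n; return r; }
    friend PtrT operator+(int n, const PtrT& p) { return p + n; }
};

int walk() {
    T items[3] = {{1}, {2}, {3}};
    PtrT p(items[0]);
    p += 2;            // now at items[2]
    PtrT q = p - 1;    // new pointer at items[1]
    PtrT r = 1 + q;    // new pointer at items[2]
    // Combine one read through each access operator into a single number.
    return (*p).value * 100 + q->value * 10 + r[-2].value;
}
```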
		
		
CONCLUSION
	This technique could be a useful feature to the C++ language.  It
	seems in character with other features of the language, and
	provides a way to control all indirect access to any type of
	object.
				

-- 
+-Robert C. Martin-----+:RRR:::CCC:M:::::M:| Nobody is responsible for |
| rmartin@clear.com    |:R::R:C::::M:M:M:M:| my words but me.  I want  |
| uunet!clrcom!rmartin |:RRR::C::::M::M::M:| all the credit, and all   |
+----------------------+:R::R::CCC:M:::::M:| the blame.  So there.     |

gintera@fsd.cpsc.ucalgary.ca (Andrew Ginter) (01/06/91)

In <1990Dec28.203554.21028@clear.com>, rmartin@clrcom.clear.com (Bob Martin) writes:

>	The following is a proposal for a language addition which would
>	allow the user to declare a smart pointer to a class, and enforce
>	its use by causing the compiler to transform all pointer 
>	declarations or expressions refering to that class to be 
>	instantiations of the smart pointer.

To be powerful enough to implement a garbage collector, the smart
pointer proposal must emphasize that, outside of the definition of the
smart pointer class, all language constructs involving machine
addresses must use smart pointers.  This includes:

* all pointers declared by the user and declared as being returned
  from functions (these were mentioned in the proposal),

* reference variables, 

* "this" pointers declared implicitly by the compiler,

* pointers to member functions (which use a pointer to an object
  internally),

* and other temporary pointers which may arise when invoking
  memberwise constructors, destructors or assignment operators.

This will guarantee that expressions like the following no longer goof
up:

	ptr_to_bar X;
	X -> boo (foo ());

With the existing "smart pointer" support, this would be translated to
something like:

	ptr_to_bar X;
	boo ( operator-> (&X), foo ());

The problem with this is that the order of evaluation of arguments to
boo is not defined.  If "operator->" is called before foo is, then a
machine pointer to a "bar" will have been left on the stack while foo
potentially activated a compacting garbage collector.  With "this"
pointers and all other internal pointers made smart pointers, the same
expression would translate to something like:

	ptr_to_bar X, tmp;
	ptr_to_bar_constructor (&tmp, &X);   /* initialize tmp = X */
	boo (&tmp, foo ());
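
The hazard can be reproduced in miniature.  The sketch below is a toy
under stated assumptions, not a real collector: ToyHeap, PtrToBar,
boo_dumb and boo_smart are hypothetical names, and compact() merely
copies the object between two slots and marks the old one with -1.
The evaluation order is forced by hand, since the language leaves it
unspecified.

```cpp
#include <cassert>
#include <cstddef>

struct Bar { int value; };

// Toy "heap" with two slots; compact() moves the live object to the
// other slot, the way a compacting collector relocates objects.
struct ToyHeap {
    Bar slots[2];
    std::size_t live;   // index of the slot currently holding the object
    ToyHeap() : live(0) { slots[0].value = 1; slots[1].value = -1; }
    void compact() {
        assert(live < 2);
        std::size_t to = 1 - live;
        slots[to] = slots[live];
        slots[live].value = -1;   // old location is now garbage
        live = to;
    }
};

// Hypothetical smart pointer: stores no address, looks it up on each use.
class PtrToBar {
    ToyHeap* heap_;
public:
    explicit PtrToBar(ToyHeap& h) : heap_(&h) {}
    Bar* operator->() const { return &heap_->slots[heap_->live]; }
};

// Stands in for foo(): a call that happens to trigger a collection.
int foo(ToyHeap& h) { h.compact(); return 0; }

// Stands in for boo receiving a dumb "this".
int boo_dumb(Bar* self, int) { return self->value; }
// Stands in for boo receiving a smart "this".
int boo_smart(PtrToBar self, int) { return self->value; }

int stale_read() {
    ToyHeap h;
    PtrToBar x(h);
    Bar* raw = x.operator->();   // address taken first...
    int arg = foo(h);            // ...then the object moves
    return boo_dumb(raw, arg);   // reads the abandoned slot
}

int fresh_read() {
    ToyHeap h;
    PtrToBar x(h);
    PtrToBar tmp(x);             // initialize tmp = X
    int arg = foo(h);            // the object moves
    return boo_smart(tmp, arg);  // address looked up after the move
}
```

In this toy, stale_read() sees the -1 marker left in the old slot,
while fresh_read() sees the object's real value.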

Even with this support, implementing a garbage collector is not easy.
The big loophole is people taking the address of instance variables
in classes with smart pointers defined for them.  If the instance
variables have no smart pointers defined to them, the user will wind up
with a dumb pointer to the middle of a smart object.  If a garbage
collector tries to relocate the object, the dumb pointer is not updated.
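
One way to picture a smart pointer to an instance variable is a smart
pointer to the enclosing object paired with a pointer-to-member.  The
sketch below assumes the same kind of two-slot toy heap; ToyHeap,
PtrToBar and PtrToBarInt are hypothetical names and the design is only
one possibility.

```cpp
#include <cassert>
#include <cstddef>

struct Bar { int a; int b; };

// Toy two-slot heap; compact() relocates the object and fills the
// abandoned slot with -1 so that stale reads are visible.
struct ToyHeap {
    Bar slots[2];
    std::size_t live;
    ToyHeap() : live(0) {
        slots[0].a = 1;  slots[0].b = 2;
        slots[1].a = -1; slots[1].b = -1;
    }
    void compact() {
        std::size_t to = 1 - live;
        slots[to] = slots[live];
        slots[live].a = -1;
        slots[live].b = -1;
        live = to;
    }
};

// Smart pointer to the whole object: resolves the address on each use.
class PtrToBar {
    ToyHeap* heap_;
public:
    explicit PtrToBar(ToyHeap& h) : heap_(&h) {}
    Bar& operator*() const  { return heap_->slots[heap_->live]; }
    Bar* operator->() const { return &heap_->slots[heap_->live]; }
};

// Smart pointer to an int member: object pointer plus member offset,
// so it keeps working after the object has been relocated.
class PtrToBarInt {
    PtrToBar obj_;
    int Bar::* member_;
public:
    PtrToBarInt(PtrToBar obj, int Bar::* m) : obj_(obj), member_(m) {
        assert(m != nullptr);
    }
    int& operator*() const { return (*obj_).*member_; }
};

int dumb_interior() {
    ToyHeap h;
    PtrToBar x(h);
    int* pb = &x->b;   // dumb pointer into the middle of the object
    h.compact();
    return *pb;        // still aimed at the old slot
}

int smart_interior() {
    ToyHeap h;
    PtrToBar x(h);
    PtrToBarInt pb(x, &Bar::b);
    h.compact();
    return *pb;        // resolved against the new location
}
```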

You can get around this by requiring that all instance variables in
smart classes be themselves smart.  In an application with different
garbage collectors for different smart classes, this won't buy you
anything though.  In such an application it will be impossible for the
compiler to decide whether or not all of the instance variables in a
class are of the same "smartness" (i.e., register pointers with the same
garbage collector).

You can get around this by making all instance variables in the smart
class private.  This limits the use of smart pointers to special
purpose applications.  A general purpose garbage collector can't
require all instance variables of all collected classes to be private.  And
even for special purpose applications, programmers will have to watch
very carefully.  If a method in the smart class invokes a method in a
dumb instance class, the dumb method will be passed a dumb "this"
pointer.  If that method calls other functions which eventually
trigger a garbage collection in the original smart class, the dumb
method's "this" pointer will be invalidated.

For anyone who's interested, I'm almost finished writing a report
describing the interaction between smart pointers and various garbage
collection options.  If you send me mail, I can send you a copy of the
report when it's finished.

Andrew Ginter, 403-220-6320, gintera@cpsc.ucalgary.ca