[comp.lang.c++] assignment to `this' is deprecated in C++ 2.0

ark@alice.UUCP (Andrew Koenig) (06/04/89)

While C++ 2.0 still supports assignment to `this' in
constructors and destructors, it also supports other
memory allocation mechanisms that we believe will offer
better performance and greater flexibility and conceptual
clarity.  Eventually, assignment to `this' will disappear
altogether; people who presently use it should convert
their programs to the new mechanisms once they start
using C++ 2.0.

The philosophical basis of the new mechanisms is that
allocation and deallocation are completely separate from
construction and destruction.  Construction and destruction
are handled by constructors and destructors (surprise!).
Allocation and deallocation are handled by `operator new'
and `operator delete'.  The rest of this article will ignore
the fact that assignments to `this' are still possible and
discuss only the new mechanisms.

At the time a constructor is entered, memory has already been
allocated in which the constructor will do its work.  Similarly,
a destructor does not have any control over what will happen
to the memory occupied by the object it is destroying after
the destructor is finished.

Here's a simple case:

	void f() {
		T x;
	}

Executing f causes the following to happen:

	Allocate enough memory to hold a T;
	Construct the T in that memory;
	Destroy the T;
	Deallocate the memory.

Similarly, saying

	T* tp = new T;

does the following:

	Allocate enough memory to hold a T;
	If allocation was successful,
		construct a T in that memory;
	Store the address of the memory in tp

and saying

	delete tp;

means:

	If tp is non-zero,
		destroy the T in the memory addressed by tp;
		free the memory addressed by tp.

In all cases, the operations implied by `construct a T'
and `destroy a T' are precisely the same.
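
The sequence above can be watched directly.  What follows is a minimal
sketch, not part of the original article: the class name Tracked, the
events log, and the helper trace_heap_object are all hypothetical names
chosen for illustration.  It gives a class its own operator new and
operator delete and records the order in which the four steps happen.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>

// Hypothetical log of events, in the order they occur.
static std::string events;

struct Tracked {
    Tracked()  { events += "construct;"; }
    ~Tracked() { events += "destroy;"; }

    // Class-level allocation functions, used for `new Tracked` and `delete p`.
    static void* operator new(std::size_t sz) {
        events += "allocate;";
        void* p = std::malloc(sz);
        if (p == nullptr) throw std::bad_alloc();
        return p;
    }
    static void operator delete(void* p) {
        events += "deallocate;";
        std::free(p);
    }
};

// Runs one new/delete cycle and returns the recorded event order.
std::string trace_heap_object() {
    events.clear();
    Tracked* tp = new Tracked;
    delete tp;
    return events;
}
```

The constructor and destructor never see the allocator; they only run
between the allocate and deallocate steps.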


How, then, do you control the memory allocated for T objects?
How, for instance, do you say `I want to put a T in magical
super-duper graphics memory?'  The answer lies in the allocation
process, not the construction process.  In other words, C++ 2.0
provides fine-grained control over just what you mean when you say

	`allocate enough memory to hold a T.'

Let's look first at the simple case.  If you say

	T* tp = new T;

then `allocate enough memory to hold a T' means a call to

	operator new(sizeof(T))

The size argument of operator new is of type size_t, as for malloc
in ANSI C (which is a change from previous versions of C++).

Operator new can be either a globally defined function or a member
of class T or a base class of T.  Here is a minimal example of a
global definition of operator new:

	extern "C" void* malloc(size_t);

	void* operator new(size_t sz)
	{
		return malloc(sz);
	}

As with any other global function, there may be only one operator new
(with these particular argument types) in an executable.  If you don't
supply one, there's one in the C++ run-time library that's only a
little more complicated than this one.

Because there can be only one operator new(size_t), you would do
well to define operator new as a member of class T if you want to
control allocation for objects of class T:

	class T {
	// stuff
	public:
		void* operator new(size_t);
	// more stuff
	};

	extern "C" void* malloc(size_t);

	void* T::operator new(size_t sz)
	{
		return malloc(sz);
	}

The version of operator new above will be used only when allocating
objects of class T or classes derived from T.  Making it inline can
make it faster:

	extern "C" void* malloc(size_t);

	class T {
	// stuff
	public:
		void* operator new(size_t sz) { return malloc(sz); }
	// more stuff
	};

There's little to be gained in this particular example, but specialized
allocators can beat malloc by a factor of 10 or more.
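
As an illustration of what a specialized allocator might look like,
here is a sketch of a per-class free list.  It is an assumption about
one common technique, not the allocator the article has in mind, and
it makes no timing claim.  The names Node, FreeBlock, free_list and
reuses_freed_block are hypothetical.  Freed blocks are kept on a list
and handed back on the next allocation instead of being returned to
the general-purpose allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

class Node {
public:
    explicit Node(int v) : value(v), next(nullptr) {}
    int value;
    Node* next;

    static void* operator new(std::size_t sz);
    static void operator delete(void* p, std::size_t sz);
};

// A freed block's own storage holds the link to the next free block.
struct FreeBlock { FreeBlock* next; };
static FreeBlock* free_list = nullptr;

static_assert(sizeof(Node) >= sizeof(FreeBlock),
              "a freed Node must be large enough to hold a link");

void* Node::operator new(std::size_t sz) {
    // A derived class may be larger; leave those to the global allocator.
    if (sz != sizeof(Node)) return ::operator new(sz);
    if (free_list != nullptr) {
        FreeBlock* b = free_list;      // pop a recycled block
        free_list = b->next;
        return b;
    }
    return ::operator new(sz);         // list empty: get fresh memory
}

void Node::operator delete(void* p, std::size_t sz) {
    if (p == nullptr) return;
    if (sz != sizeof(Node)) { ::operator delete(p); return; }
    FreeBlock* b = ::new (p) FreeBlock; // reuse the storage as a list link
    b->next = free_list;                // push onto the free list
    free_list = b;
}

// Returns true if a second allocation reuses the block freed by the first.
bool reuses_freed_block() {
    Node* a = new Node(1);
    std::uintptr_t first = reinterpret_cast<std::uintptr_t>(a);
    delete a;
    Node* b = new Node(2);
    bool ok = reinterpret_cast<std::uintptr_t>(b) == first && b->value == 2;
    delete b;
    return ok;
}
```

The sketch never returns blocks to the system and is not thread-safe;
both would need attention in real use.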

Operator new can take additional arguments of any type that it can
use as it wishes.  For example:

	enum Memory_speed { Slow, Normal, Fast };

	void* operator new(size_t sz, Memory_speed sp)
	{ /* stuff */ }

This definition of operator new makes it possible to allocate objects
in various kinds of memory (assuming an appropriate underlying
machine architecture).  You give arguments to operator new at the
time you allocate an object:

	T* tp = new(Fast) T;

The argument list after `new' is passed to operator new in addition
to the size argument.  The C++ compiler supplies the size argument,
which is always first in the argument list.
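
A sketch of how the extra argument travels, under stated assumptions:
the requests counter and the function allocate_fast are hypothetical,
and the body simply forwards to the ordinary global allocator as a
stand-in for real per-speed memory pools.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

enum Memory_speed { Slow, Normal, Fast };

// Hypothetical bookkeeping: number of requests seen per kind of memory.
static int requests[3] = {0, 0, 0};

// The compiler supplies sz; the caller supplies sp via new(sp) T.
void* operator new(std::size_t sz, Memory_speed sp) {
    ++requests[sp];
    return ::operator new(sz);   // stand-in for a real per-speed arena
}

// Matching form, used by the compiler only if T's constructor throws.
void operator delete(void* p, Memory_speed) noexcept {
    ::operator delete(p);
}

struct T {
    int x;
    T() : x(7) {}
};

// Allocates one T in "Fast" memory and returns the value it was built with.
int allocate_fast() {
    T* tp = new(Fast) T;         // calls operator new(sizeof(T), Fast)
    int seen = tp->x;
    delete tp;                   // memory came from ::operator new above
    return seen;
}
```

Because this stand-in gets its memory from the ordinary global
operator new, a plain delete is enough to release it; a real arena
would need its own release path.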


Operator new (and operator delete) obeys the same scope rules as any
other member function: if defined inside a class, operator new hides
any global operator new.  For example:

	class T {
	/* stuff */
	public:
		void* operator new(size_t, Memory_speed);
	/* more stuff */
	};

	T* tp = new T;		// Error!

The use of `new T' is incorrect in this example because the member
operator new hides the global operator new, so no operator new
can be found for T that does not require a second argument.

There are two ways to solve this problem.  First, the class definition
for T might contain an explicit declaration:

	class T {
	/* stuff */
	public:
		void* operator new(size_t, Memory_speed);
		void* operator new(size_t sz) { return ::operator new(sz); }
	/* more stuff */
	};

Because T::operator new(size_t) is explicitly declared as part of
the class definition of T, the compiler will find it, so that

	T* tp = new T;

will call T::operator new(size_t), which simply forwards to the global
operator new.  Alternatively, you can explicitly
request the global operator new when allocating a T:

	T* tp = ::new T;

Everything I have said for operator new applies to operator delete,
except that you can't overload operator delete.  The reason for this
asymmetry is that operator delete can presumably figure out how to
delete an object by looking at its address.  Alternatively, operator
new might store some kind of magic cookie with the objects it allocates
to enable operator delete to figure out how to delete them.

Of course, if T has its own operator new, it should also have its own
operator delete.  Moreover, if you bypass T::operator new and go
explicitly to the global allocator by saying

	T* tp = ::new T;

you should probably go directly to the global deallocator when you're
done with it:

	::delete tp;
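
The lookup rules in this section can be checked with a small sketch.
The counters and the function exercise_lookup are hypothetical names
added for illustration; each member function just counts the call and
forwards to the global allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

enum Memory_speed { Slow, Normal, Fast };

// Hypothetical counters showing which function each expression reaches.
static int member_plain_calls = 0;
static int member_speed_calls = 0;
static int member_delete_calls = 0;

class T {
public:
    int x;
    T() : x(0) {}

    static void* operator new(std::size_t sz, Memory_speed) {
        ++member_speed_calls;
        return ::operator new(sz);
    }
    // Declared explicitly so that plain `new T` still compiles.
    static void* operator new(std::size_t sz) {
        ++member_plain_calls;
        return ::operator new(sz);
    }
    static void operator delete(void* p) {
        ++member_delete_calls;
        ::operator delete(p);
    }
    // Matching form, used only if a constructor throws during new(speed) T.
    static void operator delete(void* p, Memory_speed) {
        ::operator delete(p);
    }
};

void exercise_lookup() {
    T* a = new T;        // T::operator new(size_t)
    T* b = new(Fast) T;  // T::operator new(size_t, Memory_speed)
    T* c = ::new T;      // global operator new, members ignored
    delete a;            // T::operator delete
    delete b;            // T::operator delete
    ::delete c;          // global operator delete, member ignored
}
```

After one call to exercise_lookup, each member operator new has been
reached once and the member operator delete twice; the ::new / ::delete
pair touched none of them.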


Finally, this new allocation scheme provides a way to construct an object
in an arbitrary location.  Merely say:

	void* operator new(size_t, void* p) { return p; }

Now you can do something like this:

	void* vp = malloc(sizeof(T));	// allocate memory
	T* tp = new(vp) T;		// construct a T there.

Because it is possible to construct an object in memory that has
already been allocated, we need a way to destroy an object without
deallocating its memory.  To do that, call the destructor directly:

	tp->T::~T();

You must include the `T::' explicitly as a bit of a check that
this is really what you meant.
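
Putting the two halves together, here is a minimal sketch of
constructing and destroying an object in memory obtained separately.
The live counter and the function construct_in_place are hypothetical.
One difference from the article's text: with current compilers the
operator new(size_t, void*) form is supplied by the standard header
<new> and should not be defined by the program, so the sketch includes
that header instead.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>      // supplies operator new(std::size_t, void*)

// Hypothetical count of T objects currently alive.
static int live = 0;

struct T {
    int x;
    T() : x(42) { ++live; }
    ~T() { --live; }
};

// Builds a T in malloc'ed memory, destroys it, then frees the memory.
bool construct_in_place() {
    void* vp = std::malloc(sizeof(T));   // allocate memory
    if (vp == nullptr) return false;

    T* tp = new(vp) T;                   // construct a T there
    bool ok = live == 1 && tp->x == 42 && static_cast<void*>(tp) == vp;

    tp->T::~T();                         // destroy, but do not deallocate
    ok = ok && live == 0;

    std::free(vp);                       // deallocation is a separate step
    return ok;
}
```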

These kinds of memory allocation games are not recommended for
amateurs.  When cautiously used, though, they can be invaluable.


-- 
				--Andrew Koenig
				  ark@europa.att.com

jeffb@grace.cs.washington.edu (Jeff Bowden) (06/04/89)

In article <9432@alice.UUCP> ark@alice.UUCP (Andrew Koenig) writes:

>These kinds of memory allocation games are not recommended for
>amateurs.  When cautiously used, though, they can be invaluable.

I think this can safely be said of the language in general. :-)
--
"C++ - the Ratfor of the '90s"

davidm@cimshop.UUCP (David Masterson) (06/06/89)

In the discussion of "this", I'm wondering what the mechanism is for
supporting references to classes by library functions outside of C++.  For
instance, functions within things like a screen manager (X, Suntools) can take
arguments that they will then pass to other functions when the screen manager
makes a callback (Callback routines).  In order to set the context of the
routine that is called back, does it make sense to pass a reference to "this"
to a screen manager function for later passing to the callback routine?
If not, what is the prescribed method of leaving the "class" context of C++
for dealing with routines in other areas/languages and then reconstituting
that context when the routines later return?

David Masterson					(preferred address)
uunet!cimshop!davidm		or		DMasterson@cup.portal.com

rfg@pink.ACA.MCC.COM (Ron Guilmette) (06/06/89)

In article <9432@alice.UUCP> ark@alice.UUCP (Andrew Koenig) writes:
>
>These kinds of memory allocation games are not recommended for
>amateurs.  When cautiously used, though, they can be invaluable.

Yea!  So don't try this at home kids!  Get your parents to allocate
your memory for you. :-)

Sorry... I couldn't resist! :-) :-) 

-- 
// Ron Guilmette  -  MCC  -  Experimental Systems Kit Project
// 3500 West Balcones Center Drive,  Austin, TX  78759  -  (512)338-3740
// ARPA: rfg@mcc.com
// UUCP: {rutgers,uunet,gatech,ames,pyramid}!cs.utexas.edu!pp!rfg

kearns@read.columbia.edu (Steve Kearns) (06/08/89)

What will happen with declaration of a "stack" variable:

{
    TypeT T;
    ...
}

Does new() get called with some indication that allocation on
the stack is expected?  Is new() not called?

I assume that when a vector is newed():
{
    TypeT * tp = new TypeT[100];
    ...
}

then new gets passed 100*sizeof(TypeT)?  

-steve

ark@alice.UUCP (Andrew Koenig) (06/08/89)

In article <6355@columbia.edu>, kearns@read.columbia.edu (Steve Kearns) writes:

> What will happen with declaration of a "stack" variable:

> {
>     TypeT T;
>     ...
> }

> Does new() get called with some indication that allocation on
> the stack is expected?  Is new() not called?

operator new is not called; only the constructor is.

> I assume that when a vector is newed():
> {
>     TypeT * tp = new TypeT[100];
>     ...
> }

> then new gets passed 100*sizeof(TypeT)?  

Nope -- again, operator new is not called.

Array types are effectively built-in types for the purpose of
memory allocation, so all requests for array allocation and
deallocation go through the built-in allocator.
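
Both answers can be checked with a small sketch.  The counters and the
function stack_and_array are hypothetical names added for illustration.
One caveat: in current compilers the array form goes through a separate
operator new[], which a class may also define; that facility came after
this article.  The sketch defines only the non-array member, so the
behavior described above still shows.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical counters.
static int member_new_calls = 0;
static int constructed = 0;

struct TypeT {
    TypeT() { ++constructed; }

    static void* operator new(std::size_t sz) {
        ++member_new_calls;
        return ::operator new(sz);
    }
    static void operator delete(void* p) { ::operator delete(p); }
};

void stack_and_array() {
    {
        TypeT t;                     // stack object: constructor only
        (void)t;
    }
    TypeT* tp = new TypeT[100];      // array: TypeT::operator new not used
    delete[] tp;

    TypeT* one = new TypeT;          // single object: TypeT::operator new
    delete one;
}
```

Of the 102 objects constructed, only the single heap object reaches
TypeT::operator new.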
-- 
				--Andrew Koenig
				  ark@europa.att.com