[comp.lang.c++] explicit initializations

shapiro@inria.UUCP (Marc Shapiro) (12/12/86)

Several messages have asked for a syntax for calling constructors
explicitly.  The official answer has been that it's not a good idea; that
it's better to provide an initialization procedure and call that procedure
explicitly.  I don't like that answer: I remember reading in some
introductory paper on C++ that one of the advantages of C++ over C is that
initialization is automatic, whenever a constructor is provided, and I
believed it.

Another answer is to use derived classes: put the necessary
initializations in a derived class.  This is not always appropriate since
the derived constructor is always called after the base constructor.  Here
is an example: suppose you want to dedicate a task to each instance.  You
spawn a task in the base constructor.  You lose, because the derived
constructor is executed in parallel with the newly-spawned task instead
of being done within it.  In our project we have many other examples.
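
The ordering behind this complaint can be seen in a minimal sketch in
present-day C++.  The trace vector and the function construction_order()
are hypothetical names introduced only for illustration; no task is
actually spawned, and the comment just marks where the spawn would happen.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical trace log, not part of the original post.
static std::vector<std::string> trace;

struct Base {
    // Imagine the per-instance task being spawned here.
    Base() { trace.push_back("Base ctor"); }
};

struct Derived : Base {
    // This body starts only after Base() has returned, so anything
    // started in Base() is already under way by now.
    Derived() { trace.push_back("Derived ctor"); }
};

// Builds one Derived and returns the order in which the
// constructor bodies ran.
std::vector<std::string> construction_order() {
    trace.clear();
    Derived d;
    (void)d;  // silence unused-variable warnings
    return trace;
}
```

With these definitions, a check such as
assert(construction_order()[0] == "Base ctor") holds: the base body always
runs to completion before the derived body begins.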

I had thought of the following workaround: all my classes are derived from
some base class.  This base class declares a private, virtual initializer
"initialize()".  The constructor for the base class calls "initialize()"
which is re-declared in all the subclasses to do whatever is needed.
Unluckily, this does not work: even though "initialize()" is a virtual, it
is always base::initialize which is called, as demonstrated by the
following program:

int printf (char* ...);

class base {
   virtual int sz() {return sizeof (*this);};
   virtual void initialize() {i=sz();};
 public:
   int i;
   int size () {return sz();};
   base () { initialize();};
};

class derived: public base {
    virtual int sz() {return sizeof (*this);};
    char* s;
 public:
    int j;
    derived (char* ss) {j=sz(); s=ss;};
};

void main (int argc, char *argv[]) {
    base b;
    derived d ("zzz");
    printf ("sizeof b = %2d, b.size() = %2d, b.i = %2d\n",
             sizeof b,       b.size(),       b.i);
    printf ("sizeof d = %2d, d.size() = %2d, d.i = %2d, d.j = %2d\n",
             sizeof d,       d.size(),       d.i,       d.j);
    }

which prints the following results (on a Vax):

sizeof b =  8, b.size() =  8, b.i =  8
sizeof d = 16, d.size() = 16, d.i =  8, d.j = 16


I find it odd that the call to virtual 'sz' has different semantics
within the constructor for base (call base::sz) and within
procedure 'size' (call this->sz).  Is this a bug or a feature?
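
The same effect can be reproduced on a present-day compiler with a smaller
sketch.  The class and member names here (name, seen_in_ctor, seen_later)
are assumptions made for illustration and do not come from the program
above.

```cpp
#include <cassert>
#include <string>

struct Base {
    std::string seen_in_ctor;  // what name() returned while Base() was running

    // Inside the constructor the object is still "only a Base", so this
    // call resolves to Base::name even when a Derived is being built.
    Base() { seen_in_ctor = name(); }
    virtual ~Base() {}

    virtual std::string name() const { return "Base"; }

    // Called after construction: ordinary virtual dispatch applies.
    std::string seen_later() const { return name(); }
};

struct Derived : Base {
    std::string name() const override { return "Derived"; }
};
```

For a Derived d, d.seen_in_ctor is "Base" while d.seen_later() is
"Derived", mirroring the d.i / d.size() discrepancy in the output above.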

Conclusion: there exists a need for a syntax to express explicit
initialization of an instance.  This need would be filled by a syntax to
get the address of the constructor.  Such an addition would leave room for
the future enhancement of dynamic linking (that's why my group needs it).


				Marc

mikem@otc.OZ (Michael Mowbray) (12/14/86)

In article <340@inria.UUCP>, shapiro@inria.UUCP (Marc Shapiro) writes:
> 
> This base class declares a private, virtual initializer
> "initialize()".  The constructor for the base class calls "initialize()"
> which is re-declared in all the subclasses to do whatever is needed.
> Unluckily, this does not work: even though "initialize()" is a virtual, it
> is always base::initialize which is called, as demonstrated by the
> following program:
> 
> [ .. deleted .. ]
> 
> I find it odd that the call to virtual 'sz' has different semantics
> within the constructor for base (call base::sz) and within
> procedure 'size' (call this->sz).  Is this a bug or a feature?
> 

When I first came across this occurrence, it seemed like a bug to me. I
mailed Bjarne about it some time ago. The following is a condensed
version of what I said:

    *************************************************************************

    When a derived class's constructor calls its base class's constructor,
    it ought to initialize the derived class's vptr first, just in case the
    base class constructor uses it in some way.

    Currently what happens in the derived class's ctor is:

	struct Derived *_Derived__ctor( _auto_this, ..... )
	struct Derived *_auto_this;
	{
	    if (_auto_this == 0)  /* allocate space */
		_auto_this = (struct Derived *) _new((long)...);

	    _auto_this = _Base__ctor(_auto_this, .... );
					    /* But vptr not yet set! */

	    _auto_this->_Base__vptr = _Derived__vptr;

		.... etc....

	    return _auto_this;
	}

	/* and a declaration of an instance of Derived would generate: */

	struct Derived  d;
	_Derived__ctor(&d);

    So if the Base ctor wanted to use a virtual function, it would make a
    mistake. Perhaps more correct code would be:

	struct Derived *_Derived__ctor( _auto_this, _auto_Base__vptr, ..... )
	struct Derived *_auto_this;
	int (**_auto_Base__vptr)();
	{
	    if (_auto_this == 0)
		_auto_this = (struct Derived *) _new((long)...); 

	    _auto_this = _Base__ctor(_auto_this, _auto_Base__vptr, .... );

		...etc...

	    return _auto_this;
	}

	struct Base *_Base__ctor(_auto_this, _auto_Base__vptr, ....)
	struct Base *_auto_this;
	int (**_auto_Base__vptr)();
	{
	    if (_auto_this == 0)
		_auto_this = (struct Base *) _new((long)...);

	    _auto_this->_Base__vptr = _auto_Base__vptr;

		...etc...

	    return _auto_this;
	}

	/* Declarations of instances of Base and Derived would	*/
	/* have to generate:					*/

	struct Base  b;
	_Base__ctor(&b, _Base__vtbl);

	struct Derived  d;
	_Derived__ctor(&d, _Derived__vtbl);

    I.e., leave it to the Base ctor to set the vptr, based on what is
    passed to it.

    ********************************************************************
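
The two schemes in the mail above can be compared with a hand-written
table of function pointers.  This is only a sketch of the idea, not the
translator's real output: Obj, SzFn, the *_v1 / *_v2 functions and the
return values 1 and 2 are all hypothetical.

```cpp
#include <cassert>

struct Obj;
typedef int (*SzFn)(const Obj*);

struct Obj {
    const SzFn* vptr;  // stand-in for the hidden virtual-table pointer
    int i;             // set by the base constructor through a "virtual" call
};

int base_sz(const Obj*)    { return 1; }  // stands for Base's version
int derived_sz(const Obj*) { return 2; }  // stands for Derived's version

const SzFn base_vtbl[]    = { base_sz };
const SzFn derived_vtbl[] = { derived_sz };

// Scheme 1 (the behaviour described as current): each constructor
// installs its own table, and the derived one does so only after the
// base constructor has returned.
void base_ctor_v1(Obj* self) {
    self->vptr = base_vtbl;
    self->i = self->vptr[0](self);  // reaches base_sz
}
void derived_ctor_v1(Obj* self) {
    base_ctor_v1(self);
    self->vptr = derived_vtbl;
}

// Scheme 2 (the proposal): the most-derived table is passed down, so
// the base constructor already sees the derived function.
void base_ctor_v2(Obj* self, const SzFn* vtbl) {
    self->vptr = vtbl;
    self->i = self->vptr[0](self);  // reaches whatever the caller passed
}
void derived_ctor_v2(Obj* self) {
    base_ctor_v2(self, derived_vtbl);
}
```

Under scheme 1 an object built by derived_ctor_v1 ends up with i == 1;
under scheme 2, derived_ctor_v2 leaves i == 2, i.e. the derived function
ran before any derived-class initialisation could have taken place.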

The essence of Bjarne's reply was that a Base constructor must establish
an environment for whatever happens later. I believe he's RIGHT. When
you're writing the code for the Derived class's virtual function, you
tend to assume that everything's ok (i.e. that the instance has already
been set up), but if the above were implemented, you could only be sure
of that by reading the code for the Base class's constructor and virtual
function and checking that you hadn't forgotten anything.

Therefore, doing things along the lines above is probably a recipe for
a time-bomb.

So, in answer to the question "Is it a bug or a feature?" I think the
answer is "it's a feature".  One shouldn't expect virtual functions to
work as normal until the instance has been properly constructed, because
a derived class shouldn't be allowed to alter how the base class is
constructed, apart from passing appropriate arguments to the base class
constructor.
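
One way to stay within that rule, in the spirit of the "initialization
procedure" mentioned at the start of this thread, is to run the virtual
initializer only after construction has finished.  The sketch below
assumes present-day C++; the helper init() and the log member are
hypothetical names used for illustration.

```cpp
#include <cassert>
#include <string>

struct Base {
    std::string log;  // records which initializer ran
    virtual ~Base() {}
    virtual void initialize() { log += "Base::initialize"; }
};

struct Derived : Base {
    std::string s;
    explicit Derived(const std::string& ss) : s(ss) {}
    // Safe to use s here: init() calls this only after Derived's
    // constructor has completed.
    void initialize() override { log += "Derived::initialize(" + s + ")"; }
};

// Hypothetical helper: second phase of a two-phase initialization.
// The caller constructs the object first, then hands it to init().
Base& init(Base& obj) {
    obj.initialize();  // ordinary virtual dispatch on a complete object
    return obj;
}
```

After Derived d("zzz"); init(d); the log reads "Derived::initialize(zzz)".
The price is that the second step is explicit, which is exactly what the
original poster objected to.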

Also, the only reason for wanting to do the above is to enable explicit
initialisation of an instance. (This was also my reason for initially
wanting to do the above.)

> Conclusion: there exists a need for a syntax to express explicit
> initialization of an instance.

I'm having difficulty estimating just how often such a thing is needed.
There are certainly times when you DO need it (e.g. if you're doing
low-level stuff, and a particular instance MUST reside at a certain
location.) But I have difficulty imagining where else it's essential.
See below.

> This need would be filled by a syntax to get the address of the
> constructor.

Superficially, it could be done this way:

	class SomeClass {
		......
	    public:
		SomeClass();
		~SomeClass();
	};

	SomeClass::* ctor_ptr() = &SomeClass::SomeClass;
	SomeClass::* dtor_ptr() = &SomeClass::~SomeClass;

or something like that. This would also need some way to express that it's
a pointer to a constructor rather than, say, a pointer to a void member fn.
In any case, how would you use this?

	main()
	{
	    int buf[.....];

	    (((SomeClass*)buf)->*ctor_ptr)();    ?????

		..... do whatever .....

	    (((SomeClass*)buf)->*dtor_ptr)();
	}

This wouldn't work correctly, because the translator has no way of knowing
whether to actually free space when the dtor_ptr is called.

Allowing constructors/destructors to be called in the normal syntax as
per the following isn't good enough either:

	int buf[.....];	 // assuming alignment is correct.

	((SomeClass*)buf)->SomeClass(); // perform whatever initialisations
					// are needed, using buf as 'this'.

Once again, the destructor can't be called properly. This is a reason
to be wary of providing a feature for taking addresses of
constructors.

The mechanism I posted earlier:

    class SomeClass {
	     .....
	public:
	    SomeClass( ...args...);
	    ~SomeClass();
    };

    struct Aux_SomeClass : public SomeClass {
	    Aux_SomeClass(void* buf, ...args...) : (...args...)
	    {
		this = (this==0) ? (Aux_SomeClass*)buf : this;
	    }
    };

    main()
    {
	int buf[......];

	SomeClass *sc = (SomeClass*) new  Aux_SomeClass(buf, ...args...);
    }

isn't good enough either, for the same reason. There doesn't seem to be any
way of arranging for ~SomeClass() to be called correctly for buf at the
end of main(). (``delete sc;'' wouldn't work correctly because buf is
an automatic variable.)

My Conclusion:
	Explicit initialisation and tidy-up of instances have uses, but
probably not that many. The obvious syntax for taking addresses of
constructors doesn't seem to have any chance of working in the case of
destructors. To achieve the desired aim, maybe you'd need a syntax like:

	main()
	{
	    int buf[.....];

	    SomeClass  sc(...args...) : buf;
	}

I.e. something that lets the translator know exactly what you're doing.
This sort of thing obviously involves work and the net gain to the
language overall is probably not that great.
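
For readers on a present-day compiler: standard C++ as it exists today
covers this case with placement new plus an explicit destructor call,
rather than with the syntax suggested above.  The sketch below is an
illustration under that assumption; the live counter, the int argument and
construct_in_buffer() are hypothetical additions, not part of any code in
this thread.

```cpp
#include <cassert>
#include <new>  // declares the placement form of operator new

struct SomeClass {
    static int live;  // number of currently constructed instances
    int value;
    explicit SomeClass(int v) : value(v) { ++live; }
    ~SomeClass() { --live; }
};
int SomeClass::live = 0;

// Constructs a SomeClass inside an automatic buffer, reads it, and
// destroys it again without ever allocating or freeing heap storage.
int construct_in_buffer() {
    alignas(SomeClass) unsigned char buf[sizeof(SomeClass)];

    SomeClass* sc = new (buf) SomeClass(42);  // construct at buf's address
    int seen = sc->value;

    sc->~SomeClass();  // run the destructor only; buf itself is not freed
    return seen;
}
```

Here construct_in_buffer() returns 42 and SomeClass::live is back to 0
afterwards, so both the constructor and the destructor ran on storage that
the caller owned throughout.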

> Such an addition would leave room for the future
> enhancement of dynamic linking (that's why my group needs it).

Marc, would you explain your need for this in more detail?

			Mike Mowbray
			Systems Development
			Overseas Telecommunications Commission (Australia)

UUCP:   {seismo,mcvax}!otc.oz!mikem              ACSnet: mikem@otc.oz