[comp.lang.c++] Failed allocation in constructors?

escher@Apple.COM (Michael Crawford) (07/11/90)

Can anyone suggest strategies for handling failed memory allocation
in constructors?

Constructors may not have return values; they aren't even void.
I have seen some references to how there is no unified method, and
one just needs to hack it.

Example source code I have seen does not even address the issue, and
just assumes the construction succeeds.

The problem, as I see it, is that what one does upon memory failure
depends greatly on the situation: do you crash and burn, wait, advise
the user to close documents, or free a sentinel block (a large block
allocated at the start of a program to provide a reserve) and retry
the allocation?
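For concreteness, here is a rough sketch of that last (sentinel-block)
idea, in a somewhat more modern dialect (std::set_new_handler from
<new>); the 64K reserve size and the names g_reserve and
release_reserve are made up for illustration:

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <new>

// A reserve ("sentinel") block grabbed at program start.  The 64K
// size and these names are invented for the example.
static char *g_reserve = new char[64 * 1024];

// Installed with std::set_new_handler().  operator new calls this
// when an allocation fails, then retries; freeing the reserve gives
// the retry a chance to succeed.  If the reserve is already gone,
// give up.
static void release_reserve()
{
    if (g_reserve) {
        delete [] g_reserve;
        g_reserve = 0;
        std::fprintf(stderr, "low memory: released reserve block\n");
    } else {
        std::fprintf(stderr, "out of memory and no reserve left\n");
        std::abort();
    }
}
```

One would call std::set_new_handler(release_reserve) early in main();
everything else then allocates through plain new with no per-call
checking.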

Here is the problem:

class foo {
	char *thePtr;
public:
	void CauseACrash();
	foo();
	~foo();
};

void theFun()
{
	foo theFoo;	// foo(), the constructor, gets called

	theFoo.CauseACrash();

	// ~foo() gets called here, if we're still running!
}

foo::foo()
{
	thePtr = NewPtr( 123456789 );

	if ( thePtr == (char*)NULL ){
		// Whadda I do?
	}
}

void foo::CauseACrash()
{
	int i;

	for ( i = 0; i < 123456789; i++ ){
		thePtr[i] = 'Q';
	}
}

On a Mac, CauseACrash will scrawl all over the interrupt vector table; on
Unix, it should core dump (if page 0 is mapped out, as good Unices do).

Seems to me a member function ought to be able to assume that its instance
is valid, and one should not have to validate every instance one creates;
the fact that the object is dynamically allocated should be hidden to the
user.  This would mean that the failure should be handled within the
constructor.

I can conceive of some ways to do it, but I am a beginner here, and am
having enough trouble just figuring out the language.  Others must have
invented this
wheel before, care to roll one by me?

I will summarize and post if you mail me suggestions, though this looks
like a general enough problem to have an open discussion.

Thank you very much,
-- 
Michael D. Crawford
Oddball Enterprises		Consulting for Apple Computer Inc.
606 Modesto Avenue		escher@apple.com
Santa Cruz, CA 95060		Applelink: escher@apple.com@INTERNET#
oddball!mike@ucscc.ucsc.edu	The opinions expressed here are solely my own.

		alias make '/bin/make & rn'

fair@Apple.COM (Erik E. Fair) (07/11/90)

In the referenced article, escher@Apple.COM (Michael Crawford) writes:

on Unix, it should core dump (if page 0 is mapped out, as good Unices do).


Mike,
	Dereferencing address zero isn't a crime, unless it isn't in
your address space. There are lots of versions of UNIX where it is, and
that program won't crash and burn until you reach the bottom of your
allocated address space.

The reason that some UNIX systems don't map page zero, as you put it,
varies. Some do it because their hardware (or compiler) has things set
up so that a user program's address space doesn't begin at zero.

Some do it because attempting to dereference address zero is a very
common portability error (and more than occasionally, a logic error)
for the systems that don't have it mapped, particularly for those
programs which assume that location zero *contains* a zero of some
size. So some UNIX systems that might otherwise have address zero
mapped, don't do so, as an encouragement to write portable code.

So, dereferencing zero isn't a bad thing; it's just bad to assume that
you always can across all possible hardware architectures that your
program might be run on.

	Erik E. Fair	apple!fair	fair@apple.com

imp@dancer.Solbourne.COM (Warner Losh) (07/12/90)

In article <42825@apple.Apple.COM> fair@Apple.COM (Erik E. Fair) writes:
>So, dereferencing zero isn't a bad thing; it's just bad to assume that
>you always can across all possible hardware architectures that your
>program might be run on.

Dereferencing "zero" is a bad thing.  "0" is the Nil pointer in C and
C++.  It is defined to point to an address that doesn't exist (or is
at least unique).  So if you dereference it, you are asking for
trouble.

In most implementations of C and C++ a Nil pointer has all its bits
turned off, but not all implementations do this.  Regardless of the
implementation, saying "p=0" will get you a Nil pointer.  So if you
then say *p, that is an error.  However, according to the standard,
compilers are free not to notify the user that this error has occurred.

-- 
Warner Losh		imp@Solbourne.COM
All the world is not a VAX. :-)

lsr@Apple.COM (Larry Rosenstein) (07/12/90)

In article <9075@goofy.Apple.COM> escher@Apple.COM (Michael Crawford) writes:
>
>Example source code I have seen does not even address the issue, and
>just assumes the construction succeeds.
>
>The problem, as I see it, is that what one does upon memory failure
>depends greatly on the situation: do you crash and burn, wait, advise
>the user to close documents, or free a sentinel block (a large block

The purpose of the constructor is to put the object in a known, consistent
state.  This doesn't necessarily mean it is in a valid or normal state.
Probably the best approach is to have the constructor flag the invalid
object, and have the other methods check the validity before proceeding.

After the client creates the object, it can check to see if the creation
succeeded.  If it doesn't check, the error can be reported later, but at
least everything is consistent, and the client can handle the error as it
chooses. 
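A bare-bones sketch of that approach (the class, the IsValid() name,
and the use of malloc, which reports failure with a null return, are
all illustrative):

```cpp
#include <cassert>
#include <cstdlib>

// Sketch of "constructor flags the invalid object; other methods
// check validity before proceeding".  Names here are invented.
class Buffer {
    char *thePtr;
public:
    Buffer(unsigned long n)
    {
        // malloc reports failure with a null return, so the
        // constructor can record the failure instead of leaving
        // the object in an inconsistent state.
        thePtr = (char *) std::malloc(n);
    }
    ~Buffer() { std::free(thePtr); }   // free(0) is harmless

    int IsValid() const { return thePtr != 0; }

    // Members check validity rather than assuming it.
    char *Data() { return IsValid() ? thePtr : 0; }
};
```

The client checks IsValid() right after creating the object; if it
forgets, Data() still fails safely with a null return instead of
scribbling through a wild pointer.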

I don't think a constructor should try to decide what to do about the error
(e.g., crash, advise the user, etc.); that's the function of higher levels
of software.  Freeing a sentinel block is a common approach on the
Macintosh, but this is normally done at a lower level (within a Memory
Manager GrowZoneProc).

Another approach would be similar to what is done in Object Pascal.  The
constructor is set up to do initialization that can't fail, and there is a
separate initialization step that the client has to explicitly call, which
can return an error.  (This makes things more complicated for the client, of
course.) 
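Sketched out, with invented names (Init() returning 0 on success, and
malloc standing in for whatever allocation can fail):

```cpp
#include <cassert>
#include <cstdlib>

// Sketch of the Object Pascal style: the constructor does only work
// that cannot fail; a separate Init() call does the fallible part
// and returns an error code.  All names here are made up.
class Document {
    char *buf;
public:
    Document() { buf = 0; }            // cannot fail
    ~Document() { std::free(buf); }

    // Returns 0 on success, -1 on failure.
    int Init(unsigned long n)
    {
        buf = (char *) std::malloc(n);
        return buf ? 0 : -1;
    }
};
```

The client must remember the extra step, e.g. Document d; if
(d.Init(n) != 0) handle the error, which is exactly the added
complication for the client noted above.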

The issue isn't just that constructors don't return a value.  One can always
use setjmp/longjmp to break out of the constructor.  The problem is
recovering, which requires that destructors be invoked for automatic
objects, for example.  (The proposed C++ exception handling mechanism does
this, for example.)

>On a Mac, CauseACrash will scrawl all over the interrupt vector table; on

True, but not very useful.

-- 
		 Larry Rosenstein,  Object Specialist
 Apple Computer, Inc.  20525 Mariani Ave, MS 46-B  Cupertino, CA 95014
	    AppleLink:Rosenstein1    domain:lsr@Apple.COM
		UUCP:{sun,voder,nsc,decwrl}!apple!lsr

escher@Apple.COM (Michael Crawford) (07/13/90)

In article <1990Jul11.180345.21464@Solbourne.COM> imp@dancer.solbourne.com 
writes:
>In article <42825@apple.Apple.COM> fair@Apple.COM (Erik E. Fair) writes:
>>So, dereferencing zero isn't a bad thing; it's just bad to assume that
>>you always can across all possible hardware architectures that your
>>program might be run on.
>
>In most implementations of C and C++ a Nil pointer has all its bits
>turned off, but not all implementations do this.  Regardless of the
>implementation, saying "p=0" will get you a Nil pointer.  So if you
>then say *p, that is an error.  However, according to the standard,
>compilers are free to not notify the user this error has occurred.

According to me, no system that pretends to have protected memory
has any excuse not to map out page 0.

This, to me, is like buying a fancy car, and having them leave out
the brakes.

I understand it is perfectly legal to have it mapped in, as long
as address 0 is either never used _or_ (void*)NULL does not have
all 0 bits (this is not always the case in practice, as recently
discussed in comp.unix.wizards), but I consider segmentation 
violations and core dumps to be extremely important debugging tools,
and would not buy a protected-mode OS that did not do this.

Of course the Macintosh doesn't, but it does not claim to have
protected memory, and one is not paying for it.  I program the mac
because I find it aesthetically pleasing, but I also program on Unix
(Sun particularly) because I find it easier to get things to work.

-- 
Michael D. Crawford
Oddball Enterprises		Consulting for Apple Computer Inc.
606 Modesto Avenue		escher@apple.com
Santa Cruz, CA 95060		Applelink: escher@apple.com@INTERNET#
oddball!mike@ucscc.ucsc.edu	The opinions expressed here are solely my own.

		alias make '/bin/make & rn'

roger@grenada.UUCP (Roger Corman) (07/13/90)

In article <9075@goofy.Apple.COM> escher@Apple.COM (Michael Crawford) writes:
>Can anyone suggest strategies for handling failed memory allocation
>in constructors?
>

This is the problem that has been most bothering me about C++.
Constructors and destructors are very powerful and elegant, so that
I would like to use them to hide messy memory allocations and such that
need to occur when a new object is created.  I have always felt that
good programming practice is to *always* assume that a memory
allocation can fail.  When creating commercial software packages,
especially on microcomputers, this is important.  Note that other
problems besides lack of memory can cause a constructor to fail.

How then do we test for constructor failure?

C++ provides a way of intercepting failed calls to new(), via the
'new() handler'.  This allows deallocation of unnecessary storage and other
measures to be taken when memory allocation fails.  This is a very
useful language feature, but doesn't really solve my problems.  It 
isn't useful for automatic object allocation (on the stack) because
new() isn't called.  True, new() is called by my constructor to allocate
other space for object-related things, but the problem is not at this
level.  My problem is how to handle the case when the object cannot
be allocated (for whatever reason).  The constructor needs to be able
to notify the routine which invoked it.  The other problem with the
new handler is that it is not customizable on a per class basis (I
can't have a different one for each class).

I can override new() for each class.  I suspect that this may offer
a solution but I haven't totally thought it through.  I can live
with not being able to use automatic (stack) allocation for the classes
which allow for constructor failure.  It would be nice if I could
specify, in the object definition, that it can only be created via new()
(possible enhancement to the language?).  I typically test for 
new() returning NULL as an indicator of failure.  If I override new()
for a class, and cause it to return NULL whenever the constructor
fails, the class interface would be satisfactory.  Unfortunately,
new() does not call the constructor--the constructor calls new().
(I may be wrong on some of these points--someone please correct me.)
Even if new() did call the constructor, I would have the same problem
with the constructor not returning a value.  I could build (at least
some of) the constructor code into the new() function, but this
seems inelegant and seems to be circumventing the constructor
concept.

In some code I have written, I have added an error field to the
object which is routinely tested after an object is created, to
check for an allocation error.  This works, but seems wasteful and
messy.  Also it is easy to ignore.

I would be happy to be embarrassed by someone pointing out a trivial
way to do this which I am overlooking.  Otherwise maybe someone has
thought of a reasonably elegant way.  I have been using C++ for
about a year, and this is the only problem I have really come upon
with the language.

Roger Corman
Island Graphics
Santa Rosa, CA
{uunet,ucbcad,sun}!island!roger
(707) 523-4465

burch@hpccc.HP.COM (Jeff Burch) (07/13/90)

A feature of C++ that I just discovered may be helpful.  Operator new has a
built in mechanism that will check if memory allocation fails.  It will 
automatically call a user supplied routine.  Consequently, you can deal
with this condition in one place and not have to check for NULL's in your
code.

For instance, without this feature, you would have to do this:

void foo()
{
	char *pnt = new char[ 123456789];
	if (pnt == NULL)
		{
		// oops!
		...
		}
}

With this feature, you can skip the error checking.  To enable it, do this
somewhere (probably in main()):

#include <new.h>
#include <stdio.h>	/* for fprintf() */
#include <stdlib.h>	/* for abort() */

main()
{
	void free_store_exception( void);	// your routine

	// catch memory allocation failures:
	set_new_handler( free_store_exception);

	...

}

void free_store_exception( void)
{
	// Oops, memory allocation has failed!
	fprintf(stderr,"Major bomb, out of memory!\n");

	// What you choose to do at this point is wide open...

	// for now, abort and dump a core (unix systems)
	abort();
}

We view this as a catastrophic failure and abort.  Note that this mechanism
functions similarly to unix signals.  We routinely trap things like
floating point errors, segment violations, etc.  See signal() for details.

For more info, see p. 170 (_new_handler) in the C++ Primer by Stanley B.
Lippman.

Regards,

	Jeff Burch
	jeff@hpoemb

jimad@microsoft.UUCP (Jim ADCOCK) (07/14/90)

In article <1990Jul11.180345.21464@Solbourne.COM> imp@dancer.Solbourne.COM (Warner Losh) writes:
|In article <42825@apple.Apple.COM> fair@Apple.COM (Erik E. Fair) writes:
|>So, dereferencing zero isn't a bad thing; it's just bad to assume that
|>you always can across all possible hardware architectures that your
|>program might be run on.
|
|Dereferencing "zero" is a bad thing.  "0" is the Nil pointer in C and
|C++.  It is defined to point to an address that doesn't exist (or is
|at least unique).  So if you dereference it, you are asking for
|trouble.
|
|In most implementations of C and C++ a Nil pointer has all its bits
|turned off, but not all implementations do this.  Regardless of the
|implementation, saying "p=0" will get you a Nil pointer.  So if you
|then say *p, that is an error.  However, according to the standard,
|compilers are free to not notify the user this error has occurred.

The bottom of pg 35 of E&S says [short quote:]

A constant expression (@5.19) that evaluates to zero is converted to a 
pointer, commonly called the null pointer.  It is guaranteed that this
value will produce a pointer distinguishable from a pointer to any object.

	* Note that the null pointer need not be represented by the same
	bit pattern as the integer 0.

[unquote]

(Please note: only *constant* expressions that evaluate to zero can be
converted to that pointer commonly called the null pointer.)

I claim it is presently *ill*defined in C++ when, if ever, it is legal
to dereference a null pointer.  I also claim it is presently *ill*defined
in C++ what one gets if one does dereference a null pointer.

By *ill*defined, I mean to say that I believe there is currently a flaw in
the language definition in regards this issue. IE the standardization
committee should address themselves to this issue.

I can only think of one reasonable situation where dereferencing a null
pointer might be a reasonable thing to do -- when creating a reference:

	object* ptrob = 0;

	...

	object& refob = *ptrob;	// potentially, a null pointer dereference

	...

	if (&refob)
	{
		refob.DoSomething();
	}
	else
	{
		HandleNullObjectCase();
	}

Readers will immediately respond:  "But you're not *really* dereferencing a
null pointer when assigning it to a reference.  It just *looks* that way
in the C++ code.  In actuality, references are going to be implemented using
pointers, you're really just assigning a pointer to a pointer, and the 
apparent dereference of the pointer is there just to make the type calculus
work right."

All of which, I claim, are issues of code generation, not language
definition.  I still say this situation is presently *ill*defined in the
language.

Compiler writers have pointed out to me, that in practice they can't
stop programmers from creating "null references" this way -- because
you sure as hell don't want to perform a runtime test and trap anytime
a "dereferenced" pointer is used to initialize a reference.

So I say: *why not* explicitly define the language to allow dereferencing
null pointers in the one unique case of initializing a reference, and
define dereferencing null pointers to be undefined in all other cases ???

andrew@ist.CO.UK (Andrew Grant) (07/17/90)

From article <860@grenada.UUCP>, by roger@grenada.UUCP (Roger Corman):
> In article <9075@goofy.Apple.COM> escher@Apple.COM (Michael Crawford) writes:
>>Can anyone suggest strategies for handling failed memory allocation
>>in constructors?
>>
> This is the problem that has been most bothering me about C++.
> 
> How then do we test for constructor failure?
> 
> C++ provides a way of intercepting failed calls to new(), via the
> 'new() handler'.  This allows deallocation of unnecessary storage and other
> measures to be taken when memory allocation fails.  This is a very
> useful language feature, but doesn't really solve my problems.  It 
> isn't useful for automatic object allocation (on the stack) because
> new() isn't called.  True, new() is called by my constructor to allocate
> other space for object-related things, but the problem is not at this
> level.  My problem is how to handle the case when the object cannot
> be allocated (for whatever reason).  The constructor needs to be able
> to notify the routine which invoked it.  The other problem with the
> new handler is that it is not customizable on a per class basis (I
> can't have a different one for each class).
> 
If space for the object cannot be allocated, the constructor will not
be called (see the final paragraph of page 235 of Lippman's C++ Primer;
if you don't have it, I recommend buying it).  In the case of stack
allocation, stack overflow handling is normally implementation dependent.

If the constructor allocates additional memory it will of course have to
handle any problems with allocating sufficient space.  The usual options are
program termination, freeing up some other memory, or setting some kind
of flag inside the class that member functions can test.

sdm@cs.brown.edu (Scott Meyers) (07/17/90)

In article <860@grenada.UUCP> roger@grenada.uu.net (Roger Corman) writes:
| C++ provides a way of intercepting failed calls to new(), via the
| 'new() handler'.  This allows deallocation of unnecessary storage and other
| measures to be taken when memory allocation fails.  This is a very

void (*_new_handler)() isn't part of the language.  It's mentioned only in
passing in the ARM (E&S) in connection with the proposed exception handling
facilities that C++ doesn't have yet.  Instead, _new_handler is part of the
AT&T implementation of C++.  In short, you can't rely on it existing.

| I can override new(), for each class.  I suspect that this may offer
| a solution but I haven't totally thought it through.  I can live

Given you have an AT&T implementation, you could declare a static class
variable void (*class_new_handler)() and then rewrite operator new for each
class as follows:

    void *operator new(size_t size)
    {
      void *ptr;
      _new_handler = class_new_handler;  // set local handler
      ptr = ::operator new(size);      // call global operator new
      _new_handler = default_handler;    // unset local handler
      return ptr;                        // return allocated memory
    }

| which allow for constructor failure.  It would be nice if I could
| specify, in the object definition, that it can only be created via new()
| (possible enhancement to the language?).  I typically test for 

This isn't as elegant as you'd like, but it's not too gruesome:

    class NewOnly {
    private:
      NewOnly(int params) { ... }              // private constructor

    public:
      ~NewOnly() { ... }                       // public destructor

      static NewOnly *makeNewOnly(int params)  // only way to make an object
      { return new NewOnly(params); }          // from outside the class
    };

To create an object, you say

      NewOnly *p = NewOnly::makeNewOnly(args);

and to delete it you say

      delete p;
    
Good points: it works, and the compiler will flag usage errors.
Bad points:  the syntax is inconsistent with the usual way of creating
             objects, and members and friends of NewOnly can circumvent
             the private constructor.


Scott
sdm@cs.brown.edu