[comp.lang.c] realloc and malloc and zero-sized objects

msb@sq.com (Mark Brader) (04/05/89)

[Before posting any followup to this article, see the note at the end.]

> > > And when handed a non-NULL pointer and a zero size, [realloc] acts
> > > like free.  This behavior is required by the draft ANSI Standard.

Right.

> > Does it return NULL in this case (when it acts like free)?
> 
> It's implementation defined -- the implementation is allowed to
> return either a NULL pointer or a pointer to a zero-sized object
> (although many people find that concept quite repugnant, that's
> the way many existing implementations behave).

Perhaps right as regards existing implementations, but wrong as regards
the proposed Standard.  First, let me dispose of the "zero-sized object":

#  #1.6 Definitions of terms [excerpt, my emphasis added]
#  ...
#  Object - a region of data storage ...  Except for bit-fields, objects
#  are composed of sequences of *one or more* bytes ...

Now, I'd better cite the description of malloc first, to contrast it with
that for realloc.

#  #4.10.3.3 The malloc function
#  Synopsis
#	#include <stdlib.h>
#	void *malloc (size_t size);
#
#  Description
#  The malloc function allocates space for an object whose size is
#  specified by size and whose value is indeterminate.
#
#  Returns
#  The malloc function returns either a null pointer or a pointer to
#  the allocated space.

Remember that the general rule for out-of-range arguments, which applies
wherever there is no explicit statement to the contrary, is:

#  #4.1.6 Use of library functions [excerpt]
#  ...
#  If an argument to a function has an invalid value (such as a value
#  outside the domain of the function, or a pointer outside the address
#  space of the program, or a null pointer), the behavior is undefined.

Since 0 is not a possible size for an object, the invocation malloc(0)
falls under the general rule and is undefined behavior.  It would of
course be permissible for an implementation to define its behavior in
this circumstance; this would be an extension to the language.


The situation for realloc is somewhat different:

#  #4.10.3.4 The realloc function
#  Synopsis
#	#include <stdlib.h>
#	void *realloc (void *ptr; size_t size);
#
#  Description
#  The realloc function changes the size of the object pointed to by ptr
#  to the size specified by size.  The contents of the object shall be
#  unchanged up to the lesser of the new and old sizes.  If the new size
#  is larger, the value of the newly allocated portion is indeterminate.
#  If ptr is a null pointer, the realloc function behaves like the malloc
#  function for the specified size.

That last sentence is the one that started this thread; most existing
implementations do *not* behave that way, but the pANS requires it.
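
As a minimal sketch of what that requirement permits, assuming a library
that conforms to the pANS on this point: an array can be grown from
nothing by starting with a null pointer, with no special case for the
first allocation.  The helper name append_int is hypothetical.

```c
#include <stdlib.h>

/* Hypothetical helper: append value to the array *arr of length *len.
 * *arr may be a null pointer when *len is 0; realloc then behaves
 * like malloc.  Returns 0 on success, -1 if allocation fails (the
 * existing array is left untouched in that case). */
int append_int(int **arr, size_t *len, int value)
{
    int *bigger = realloc(*arr, (*len + 1) * sizeof **arr);

    if (bigger == NULL)
        return -1;
    bigger[*len] = value;
    *arr = bigger;
    *len += 1;
    return 0;
}
```

On an implementation where realloc does not accept a null pointer, the
first call would have to use malloc instead.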

[Discussion of other invalid pointer values, giving undefined behavior,
omitted.]

#  ... If size is zero and ptr is not a null pointer, the object it
#  points to is freed.
#
#  Returns
#  The realloc function returns either a null pointer or a pointer to the
#  possibly moved allocated space.

Note the either-or.  Since a null pointer cannot point to allocated space,
this is really an exclusive or.  Since an object cannot have zero size
and also because of the last sentence in the Description section, it is
clear that realloc is *not* allocating any space if it is called as
realloc(p,0) where p is a valid non-null pointer.  Therefore realloc *must*
return a null pointer.

Therefore the answer to the question:
> > Does it return NULL in this case (when it acts like free)?
is *yes*.


We may also observe that if ptr is a null pointer *and* size is 0,
the same behavior as malloc(0) is required, and that is undefined.


Finally, some notes on C jargon.  One should speak of "null pointer", not
"NULL pointer".  "NULL" is the name of a macro which commonly contains
a "null pointer constant", which need not itself have pointer type
(in particular, 0 is a null pointer constant, yet has type int);
"null pointer" is the term for the result of converting a null pointer
constant to pointer type.

Similarly, one should speak of "null character" or "the character NUL",
not "NULL character" or "NUL character" or even "ASCII NUL character".
The use of NULL here is confusing in that that word is usually used in
connection with pointers; NUL does not relate to any macros but to the
character names used with the ASCII character set.  The pANS tries to
avoid any ASCII-chauvinism and generally uses phrases like "a code of
value zero" rather than "null character".
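
The two distinctions above can be illustrated with a short sketch.  The
function names are hypothetical; the only assumption is that <stddef.h>
defines NULL as some null pointer constant.

```c
#include <stddef.h>

/* Returns 1 if a pointer initialized from the int constant 0 compares
 * equal to one initialized from the NULL macro. */
int null_pointers_agree(void)
{
    char *from_zero = 0;      /* 0 has type int; converting it gives a null pointer */
    char *from_macro = NULL;  /* NULL expands to some null pointer constant */

    return from_zero == from_macro;
}

/* Counts the characters in s up to the terminating null character. */
size_t count_to_null_char(const char *s)
{
    size_t n = 0;

    while (s[n] != '\0')      /* '\0' is the null character, a code of value zero */
        n++;
    return n;
}
```

Neither function mentions ASCII: the null character is defined by its
value, not by a character-set name.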

And some acronyms.
ASCII - American Standard Code for Information Interchange.  A character
	set that proves that an inadequate standard may be better than
	none at all.  "7 bits suffice?"  Feh!
ANSI  - American National Standards Institute.  Once called American
	Standards Association, hence the "ASA" of photographic film speeds.
ANS   - American National Standard, i.e., a standard from ANSI.
        The informal term "ANSI Standard" means the same thing.
pANS  - proposed ANS - this is the current state of what we all believe
        will soon be the ANS for C.
ISO   - International Organization for Standardization.  We hope that
	after the pANS becomes an ANS it will then be adopted as an ISO
	Standard also.
Please note the different number of I's in these acronyms.



This article is cross-posted to comp.lang.c and comp.std.c, as was
the one it's a followup to, but I've directed further followups to
comp.std.c.  Followups discussing *existing* implementations should
go to comp.lang.c instead, though, and followups on the topic of how
to write a null pointer constant should not be posted at all.

-- 
Mark Brader, Toronto	"If the standard says that [things] depend on the
utzoo!sq!msb		 phase of the moon, the programmer should be prepared
msb@sq.com		 to look out the window as necessary."  -- Chris Torek

This article is in the public domain.

msb@sq.com (Mark Brader) (04/10/89)

[I'm posting this back to comp.lang.c as well as comp.std.c because the
 second part of the article discusses an "existing implementations" topic.]

Well, I should have known better.  Usually when I post articles citing
the proposed Standard (pANS), they deal with a topic that is of importance
to me, and thus one on which I've read the pANS carefully at some time.
I'm not a fan of zero-sized objects, so I had not read those sections as
carefully; I elected to post anyway, and I got it wrong.

I wrote:

> > > Does [realloc] return NULL ... when it acts like free ...?
> > It's implementation defined -- the implementation is allowed to
> > return either a NULL pointer or a pointer to a zero-sized object
> Perhaps right as regards existing implementations, but wrong as regards
> the proposed Standard.

This is still correct.  However, I then cited #1.6 to show that zero-sized
objects are prohibited in pANS C, cited #4.10.3.3 to show that malloc(0) is
an attempt to allocate such an object, and used the general rule in #4.1.6
on argument values out of their domain, to conclude that:

> Since 0 is not a possible size for an object, the invocation malloc(0)
> falls under the general rule and is undefined behavior.  It would of
> course be permissible for an implementation to define its behavior in
> this circumstance; this would be an extension to the language.

Bzzt!  Jerry Schwarz of Bell Labs sent email, and Larry Jones posted in
comp.std.c, both pointing out my error.  Before concluding that the general
rule (#4.1.6) applied because #4.10.3.3 didn't have anything to override it,
I should also have looked at the higher-level sections #4.10.3 and #4.10.
Had I done so, I would have found:

#  #4.10.3 Memory management functions
#
#  ... If the size of the space requested is zero, the behavior is
#  implementation-defined; the value returned shall be either a null
#  pointer or a unique pointer.
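
Since either result is allowed, a caller cannot take a null return from
malloc(0) as proof that allocation failed.  A minimal sketch of two ways
to cope, using hypothetical helper names:

```c
#include <stdlib.h>

/* Hypothetical wrapper: never asks for zero bytes, so a null return
 * always means that the allocation failed. */
void *alloc_nonzero(size_t n)
{
    return malloc(n != 0 ? n : 1);
}

/* Hypothetical check for the result p of a plain malloc(n): a null
 * pointer counts as failure only when n was nonzero. */
int malloc_failed(const void *p, size_t n)
{
    return p == NULL && n != 0;
}
```

In either case the pointer obtained for a zero-size request should not
be dereferenced, though it may be passed to free.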

However, contrary to Larry's article, I claim that this still affects only
malloc(0) and realloc(0,0).  [I use here the terse syntax valid only if
there's a prototype in scope.]  The case of realloc(p,0) where p isn't a
null pointer is covered explicitly.  As I said before, #4.10.3.4 says
this about that:

#	void *realloc (void *ptr, size_t size);
#  ... If size is zero and ptr is not a null pointer, the object it
#  points to is freed.

I claim that this sentence defines realloc(p,0) *not* to be a request to
allocate space.  In that case the words of #4.10.3 do not apply, so
realloc() must not return a unique pointer, and it must therefore return
a null pointer.
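
Code that would rather not depend on this reading can sidestep the
question by never passing a zero size to realloc.  A minimal sketch,
assuming only that free accepts a null pointer (as the pANS specifies);
the name resize_block is hypothetical:

```c
#include <stdlib.h>

/* Hypothetical helper: resize the block *pp to n bytes.
 * n == 0 frees the block and sets *pp to a null pointer, without
 * ever calling realloc with a zero size.
 * Returns 0 on success, -1 on failure (*pp is then unchanged). */
int resize_block(void **pp, size_t n)
{
    void *q;

    if (n == 0) {
        free(*pp);          /* no action if *pp is a null pointer */
        *pp = NULL;
        return 0;
    }
    q = realloc(*pp, n);    /* behaves like malloc if *pp is null */
    if (q == NULL)
        return -1;
    *pp = q;
    return 0;
}
```

The caller then sees the same result for a zero size whatever realloc
itself would have returned.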


By the way, I put a semicolon in place of the comma in the prototype
syntax last time.  This was a typo induced by having used Algol in the past.
I personally would have preferred the Algol-like syntax there for
delimiting arguments, even though I prefer C syntax generally.


I also wrote:

> [Discussion of other invalid pointer values, giving undefined behavior,
> omitted.]

Then, yesterday, I bought a copy of Andrew Koenig's "C Traps and Pitfalls",
which looks like worthwhile reading for most people who ask questions in
these newsgroups.  And people who answer questions wouldn't do badly to read
it either, especially if they ever get the answers wrong.  And besides, it
mentions my name, so it must be good!  :-)

The book quotes the V7 UNIX manual as follows:

!  Realloc also works if ptr points to a block freed since the last call
!  of realloc, malloc, or calloc.

He goes on to point out that this behavior is descended from a still
earlier form of realloc() that *required* realloc(p,n) to be preceded
by free(p).

Existing implementations now often do not support this behavior, and the
pANS, in the section I skipped over, explicitly makes such a sequence of
calls undefined behavior.


#  #4.10.3 Memory management functions
#  ... The value of a pointer that refers to freed space is indeterminate.

#  #4.10.3.4 The realloc function
#
#  ... Otherwise [i.e. if ptr isn't a null pointer], if ptr does not match
#  a pointer earlier returned by the calloc, malloc, or realloc function,
#  or if the space has been deallocated by a call to the free or realloc
#  function, the behavior is undefined.
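
To contrast the two sequences, here is a minimal sketch.  The name
grow_buffer is hypothetical, and the commented-out lines show the old
idiom only as something to avoid.

```c
#include <stdlib.h>

/* Hypothetical helper: returns the resized buffer, or a null pointer
 * on failure, in which case buf is still valid and still belongs to
 * the caller.  new_size is assumed to be nonzero. */
char *grow_buffer(char *buf, size_t new_size)
{
    /* The V7-era sequence, undefined under the pANS:
     *
     *     free(buf);
     *     buf = realloc(buf, new_size);
     */
    return realloc(buf, new_size);   /* buf has not been freed */
}
```

The contents of the old block, up to the lesser of the two sizes, are
preserved in the block that is returned.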


Now, as I said last time (and this time I mean it!)....
This article is cross-posted to comp.lang.c and comp.std.c, as was
the one it's a followup to, but I've directed further followups to
comp.std.c.  Followups discussing *existing* implementations should
go to comp.lang.c instead, though, and followups on the topic of how
to write a null pointer constant should not be posted at all.
(To which I now add, likewise for followups on the topic of what syntax
should have been adopted for delimiters in prototypes!)

-- 
Mark Brader, Toronto		"The singular of 'data' is not 'anecdote.'"
utzoo!sq!msb, msb@sq.com				-- Jeff Goldberg

This article is in the public domain.