[comp.lang.c] What is NULL? There is no right answer!

gmt@arizona.edu (Gregg Townsend) (07/13/87)

Confusion about the *concept* of NULL seems to be causing most of the problems.
Some credible definitions are:

    1.  NULL is simply shorthand for 0 and is used only for readability.

    2.  NULL is a zero pointer of type (char *), and as such NULL is a legal
        parameter to a function expecting a (char *) argument.

    3.  NULL is a zero pointer of the universal pointer type used by malloc()
        etc., which just happens to be (char *).

I can't find any historical definition that I consider authoritative.  Kernighan
& Ritchie use a privately defined NULL on page 97 but it's not clear that we
should consider that any more universal than ALLOCSIZE defined the line below.
The index entry for NULL also points to a page containing a discussion of 0 as
a pointer value, and using the lower-case adjective "null", but that's too
tenuous to be considered a definition.

The value of NULL that most of us use comes from <stdio.h>.  The v7 man page
for stdio(3S) defines stdin, stdout, and stderr (which are FILE *) and then
says "A constant `pointer' NULL (0) designates no stream at all."   This sounds
like a narrow definition as a (FILE *), but then the man pages for gets() and
others reference NULL in a (char *) context.

In newer references, Kernighan and Pike (p.177) say NULL is "usually" defined
as (char *)0, while talking about the return from (FILE *) fopen().  Harbison
and Steele (p.94) say it's usually defined as 0.

In the absence of a clear and visible definition, it's no wonder people have
built up concepts based on observed usage.  With many (most?) C compilers,
programs assuming ANY of the above concepts run just fine with NULL defined as
0 in <stdio.h>.  With 16-bit ints and 32-bit longs and pointers, a definition
of 0L keeps all the concepts viable.  But when pointers get bigger than longs,
or different pointers have different sizes, a definition satisfying all
assumptions becomes difficult or impossible.

After writing the above paragraphs without prejudice I planned to look at the
proposed ANSI standard [Oct 86].  After all, they've presumably spent a lot
more time wrestling with this problem than most of us.  And now what do I find? 

    (p. 85)  NULL [...] expands to an implementation-defined 
             null pointer constant;

    (p. 30)  An integral constant expression with the value 0, or such an
             expression cast to type void *, is called a null pointer constant.

So, allowing for (void *) as the new form of universal pointer, they've allowed
both #1 and #3, at the implementation's choice.  Because (void *) 0 can be
compared to any other pointer without error, if you code assuming definition 1
you're always safe.
The rationale (Sep 86, p. 65) adds:
        [definition of NULL as (void *)0] is necessary on architectures where
        the pointer size(s) do(es) not equal the size of any integer type.
This seems to imply that NULL must always work as an argument to a (void *)
parameter (concept #3), but the rationale isn't the standard.
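
As a minimal sketch of the draft wording quoted above, the following assumes
a compiler that accepts function prototypes and a NULL defined as either 0 or
(void *)0.  The helper names are hypothetical and exist only for illustration.
Each comparison is against a null pointer constant, so each should behave the
same whichever definition of NULL the implementation picked:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helpers: each compares a different pointer type
 * against a null pointer constant written a different way. */
static int is_null_char(char *p)  { return p == 0; }     /* plain 0      */
static int is_null_file(FILE *fp) { return fp == NULL; } /* NULL macro   */
static int is_null_void(void *vp) { return !vp; }        /* implicit test */

static void demo_null_constants(void)
{
    char buf[1];

    /* With prototypes in scope, 0 and NULL are converted to the
     * parameter type before the call. */
    assert(is_null_char(0));
    assert(is_null_file(NULL));
    assert(is_null_void(0));
    assert(!is_null_char(buf));
}
```

Calling demo_null_constants() from main() exercises all three forms.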

No wonder everyone's confused.  I learned something just researching this
message.  There is no authoritative, correct answer.  Even the proposed 
new standard is ambivalent.

     Gregg Townsend / Computer Science Dept / Univ of Arizona / Tucson, AZ 85721
     +1 602 621 4325      gmt@Arizona.EDU       110 57 17 W / 32 13 47 N / +758m

guy%gorodish@Sun.COM (Guy Harris) (07/14/87)

I can definitely tell you one thing NULL is *not*; it is not the
fundamental representation of a null pointer in C.  The official C
language representation of a null pointer of a particular type is the
integral constant 0, coerced to a pointer of that type.  This
coercion occurs automatically in comparisons:

	char *p, *q;

	/*
	 * Yes, both of these are comparisons against null pointers.
	 */
	if (p == 0 || !q)

assignments:

	p = 0;

and, if you have an implementation of C that supports function
prototypes (which, barring any massive surprises, implementations of
ANSI standard C will do once the standard is official), most, but NOT
all, procedure calls:

	/*
	 * Don't try this one at home, kids, unless you have a function
	 * prototype for "setbuf" in scope!
	 */
	setbuf(stdout, 0);

but not

	execl("/bin/sh", "sh", "-c", "rm -rf /", 0);

Unless some function prototype syntax appears that enables you to
tell the compiler that the last argument to "execl" is of type
"char *", the compiler won't be able to figure out that the last
argument should be of type "char *" and that the 0 must be coerced to
that type.  Thus, you must do

	execl("/bin/sh", "sh", "-c", "rm -rf /", (char *)0);

instead.  Furthermore, if your implementation of C doesn't support
function prototypes,

	setbuf(stdout, 0);

won't work either, since the compiler can't be told what the types of
the arguments to "setbuf" are; in this case, as well, you must do

	setbuf(stdout, (char *)0);

Whether NULL is defined as 0 or 0L or (char *)0 or (void *)0 is
irrelevant.  NULL is merely syntactic sugar around 0, which is a
token used to construct null pointer constants; there are plenty of
programs that are on a sugar-free diet and use 0 rather than NULL.
If they do not include any casts necessary to cause the
aforementioned coercions of 0 to a null pointer constant of the
appropriate type, defining NULL as 0L or (char *)0 or (void *)0 or
whatever isn't going to make those programs work any better on
machines where the bit pattern for 0 and for null pointers of some
types aren't the same.
	Guy Harris
	{ihnp4, decvax, seismo, decwrl, ...}!sun!guy
	guy@sun.com
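
The execl() point above can be sketched with a small variable-argument
function.  This is only an illustration, assuming <stdarg.h> is available;
count_strings() is a hypothetical helper, not part of any library.  Like
execl(), it has no way to tell the compiler the type of its trailing
arguments, so the caller has to supply the terminating null pointer with an
explicit cast:

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical helper: counts string arguments up to a null-pointer
 * sentinel, the same calling convention execl() uses. */
static int count_strings(const char *first, ...)
{
    va_list ap;
    const char *s = first;
    int n = 0;

    va_start(ap, first);
    while (s != 0) {
        n++;
        /* The callee fetches a char *, so the caller must have
         * passed a char *, not an int 0. */
        s = va_arg(ap, char *);
    }
    va_end(ap);
    return n;
}

static void demo_sentinel(void)
{
    /* The cast is required: no prototype covers the "..." part. */
    assert(count_strings("sh", "-c", "echo hi", (char *)0) == 3);
    assert(count_strings((char *)0) == 0);
}
```

Writing the sentinel as a bare 0 would pass an int, which need not have the
size or bit pattern of a null "char *".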

steele@unc.cs.unc.edu (Oliver Steele) (07/15/87)

Is there any guarantee that coercing (type1 *)0 into a (type2 *) yields
(type2 *)0 ?  In particular, might
	if (foo == 0)
differ from
	if (foo == (char *)0)
in some implementation of ANSI C where (char *) and (void *) have
different representations?

Sorry to post, but I couldn't find enough information to prove the above
(I don't have access to the draft).


------------------------------------------------------------------------------
Oliver Steele				  ...!{decvax,ihnp4}!mcnc!unc!steele
							steele%unc@mcnc.org

	"They're directly beneath us, Moriarty.  Release the piano!"

guy%gorodish@Sun.COM (Guy Harris) (07/15/87)

> Is there any guarantee that coercing (type1 *)0 into a (type2 *) yields
> (type2 *)0 ?

No.

> In particular, might
> 	if (foo == 0)
> differ from
> 	if (foo == (char *)0)
> in some implementation of ANSI C where (char *) and (void *) have
> different representations?

Well, if "foo" is of type "char *", the two are obviously the same.
If "foo" is not of type "char *", the first is a comparison of it
with a null pointer, while the second does not have a well-defined
meaning.  The only pointer conversions that are described in 3.2.2.3
"Pointers" are conversions of pointers to or from type "pointer to
void", and conversions of integral constant expressions with the
value 0 to pointer types.

I wouldn't count on the latter construct doing what you expect, or
even count on it being permitted by the compiler, if "char *" and the
type of "foo" had different representations (which could happen even
if "char *" and "void *" have the same representation; "void *"
does not appear in your example, and its representation isn't
relevant).
	Guy Harris
	{ihnp4, decvax, seismo, decwrl, ...}!sun!guy
	guy@sun.com
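
A minimal sketch of the two conversions the reply above relies on, assuming
"foo" is an "int *" (the original question does not say) and using
hypothetical helper names.  The null test uses the constant 0 directly, and
the comparison between two different pointer types goes through "void *"
rather than comparing an "int *" with a "char *" as-is:

```c
#include <assert.h>

/* Null test: 0 is converted to the type of foo, whatever that is. */
static int foo_is_null(int *foo)
{
    return foo == 0;
}

/* Comparing pointers of different types: convert both to void *
 * first, instead of writing foo == p with mismatched types. */
static int same_object(int *foo, char *p)
{
    return (void *)foo == (void *)p;
}

static void demo_compare(void)
{
    int x = 0;
    void *vp = &x;      /* int * to void * */
    char *cp = vp;      /* void * to char * */

    assert(foo_is_null(0));
    assert(!foo_is_null(&x));
    assert(same_object(&x, cp));
}
```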