[comp.std.c] references to dereferenced null pointers

rfg@ics.uci.edu (Ronald Guilmette) (03/10/90)

In article <52081@microsoft.UUCP> jimad@microsoft.UUCP (Jim ADCOCK) writes:
>...  Reading the [ C++ ] reference, I find:
>
>"A constant expression that evaluates to 0 is converted to a pointer,
>commonly called the null pointer.  It is guaranteed that this value will
>produce a pointer distinguishable from a pointer to any object."
>
>If the null pointer does not point to any object, then why should
>I assume that I can take the address of that "object" which isn't an
>object?  What guarantee is there that an object that doesn't exist
>even has an address?  And why should that address be "0" ???
>
>It may be that many present systems represent *some* C++ null pointers
>as 32 bits all zero, but this is not the same as saying that making such
>an assumption is portable to future systems.  In particular, a system
>with [hardware support for] typed pointers would not have null pointers
>represented as 32 bits all zero.
>
>Also, note that the "0" that creates a null pointer must be a constant
>expression -- allowing a compiler to construct a special representation
>for null pointers at compile time.  Also note:  This means the fairly
>common "C" coding technique of assigning to a pointer a runtime expression
>that evaluates to zero is not guaranteed to be legal.  And assigning
>one null pointer to a different type of pointer need not keep the same
>bit representations.

Jim,

I see that I should not have been nearly so cavalier in response to your
questions about "null references".  You raise some important questions.

Note however that many of these questions apply equally well to the C standard.
For instance, in C, should the following be considered legal?

	void *vp;

	vp = 0;
	vp = &(*vp);

Also, you noted the obscure restriction that a null pointer must be
something created from "an integral constant expression with the value 0"
(see ANSI C standard 3.2.2.3).  This could be an issue for both C and
C++, since it would seem to place in doubt the validity of the following:

	int nil ()
	{
		return 0;
	}

	...

		void *vp;

		vp = (void *) nil ();

I am cross posting this to comp.std.c just to see if anyone there will
clarify for us the legality (or illegality) of the two examples above
(and possibly also the relevant rationale).

>So, I interpret your response to be: "This is presently undefined,
>but who cares, it's trivial."  Well, I care, and I claim it's not
>a trivial issue, but rather has important impact for the mapping of
>C++ onto systems with typed pointers.  If making null references
>*is* legal, let someone in the know state so unequivocally, and I
>can get on with my programming.  I'd like to see this made explicitly
>legal -- but doing so may have a sharp negative impact on systems
>with typed pointers.

At the very least, these issues should be clarified a bit, both for C++
and for C.


// Ron Guilmette (rfg@ics.uci.edu)
// C++ Entomologist
// Motto:  If it sticks, force it.  If it breaks, it needed replacing anyway.

henry@utzoo.uucp (Henry Spencer) (03/11/90)

In article <25F8D2FB.10981@paris.ics.uci.edu> rfg@ics.uci.edu (Ronald Guilmette) writes:
>...noted the obscure restriction that a null pointer must be
>something created from "an integral constant expression with the value 0"
>(see ANSI C standard 3.2.2.3).  This could be an issue for both C and
>C++, since it would seem to place in doubt the validity of the following:
>
>	int nil ()
>	{
>		return 0;
>	}
>
>	...
>
>		void *vp;
>
>		vp = (void *) nil ();

This code is illegal in C, and always has been.  There is no general
relationship between the integer value 0 and the null pointer.  An
integer *constant* value equal to zero in a pointer context is automatically
converted to a null pointer; this may be a non-trivial conversion, and is
guaranteed valid only at compile time.  The result of converting the
integer value 0 to a pointer at run time is implementation-defined.
-- 
MSDOS, abbrev:  Maybe SomeDay |     Henry Spencer at U of Toronto Zoology
an Operating System.          | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

henry@utzoo.uucp (Henry Spencer) (03/11/90)

In article <1990Mar11.015305.28264@utzoo.uucp> I wrote:
>This code is illegal in C, and always has been... The result of converting the
>integer value 0 to a pointer at run time is implementation-defined.

Oops; change "illegal" to "naive, improper, unreliable, and unportable".
It's thoroughly bad code, but it is not actually illegal, as my own
further comments implied.
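
A minimal sketch of the distinction drawn above, assuming a hosted C compiler that provides intptr_t. The helper names (from_constant, from_runtime, is_null, demo_constant_zero) are illustrative only; nil() mirrors the function in the example under discussion. Only the conversion of the constant 0 is guaranteed to give a null pointer. The run-time conversion is implementation-defined, so the sketch asserts nothing about its result.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors nil() from the example under discussion: returns the int value 0. */
static int nil(void)
{
	return 0;
}

/* The literal 0 is a null pointer constant; the compiler converts it. */
static void *from_constant(void)
{
	return 0;
}

/* A run-time int converted to a pointer: the result is implementation-defined.
   The intptr_t step is only there to avoid a size-mismatch warning. */
static void *from_runtime(void)
{
	return (void *)(intptr_t)nil();
}

/* p == 0 compares against a null pointer constant, so it is portable. */
static int is_null(const void *p)
{
	return p == 0;
}

static void demo_constant_zero(void)
{
	assert(is_null(from_constant()));	/* guaranteed by the language */
	(void)from_runtime();			/* no portable guarantee either way */
}
```

On most current machines both functions happen to return the same value, which is exactly why code of the second kind tends to survive until it is ported.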
-- 
MSDOS, abbrev:  Maybe SomeDay |     Henry Spencer at U of Toronto Zoology
an Operating System.          | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

haug@almira.uucp (Brian R Haug) (03/12/90)

In article <1990Mar11.015305.28264@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>In article <25F8D2FB.10981@paris.ics.uci.edu> rfg@ics.uci.edu (Ronald Guilmette) writes:
>>...noted the obscure restriction that a null pointer must be
>>something created from "an integral constant expression with the value 0"
>>(see ANSI C standard 3.2.2.3).  This could be an issue for both C and
>>C++, since it would seem to place in doubt the validity of the following:
>>
>>	int nil ()
>>	{
>>		return 0;
>>	}
>>
>>	...
>>
>>		void *vp;
>>
>>		vp = (void *) nil ();
>
>This code is illegal in C, and always has been.  There is no general
>relationship between the integer value 0 and the null pointer.  An
>integer *constant* value equal to zero in a pointer context is automatically
>converted to a null pointer; this may be a non-trivial conversion, and is
>guaranteed valid only at compile time.  The result of converting the
>integer value 0 to a pointer at run time is implementation-defined.

Henry,
    I beg to differ.  I see where you are coming from with the wording:
"an integral constant expression with the value 0", but in K&R First edition,
page 97, there is an explicit example in which 0 is returned by a function
returning a pointer to a character.  Also, from the text "C guarantees that no
pointer that validly points at data will contain zero, so a return value of
zero can be used to signal an abnormal event,..."
    Also, from Appendix A, section 7.7 (Equality operators) "A pointer may be
compared to an integer, but the result is machine dependent unless the integer
is the constant 0.  A pointer to which 0 has been assigned is guaranteed not to
point to any object, and will appear to be equal to 0; in conventional usage,
such a pointer is considered to be null."  This implies to me that I can
compare a pointer to any arbitrary expression which evaluates to 0 and have
machine independent behavior.  And from section 7.14, "it is guaranteed that
the assignment of the constant 0 to a pointer will produce a null pointer
distinguishable from a pointer to any object."  Since the "BNF" for this
section lists "lvalue = expression," I would assume that if expression
evaluates to 0 then the pointer will produce a null pointer distinguishable
from a pointer to any object.
    As far as ANSI C goes, I do not have easy access to the standard at the
current time, nor the experience reading it.  However, I would have expected
this behavior to be brought forward (hopefully not a silly idea, but we'll
see).
     This was a quick scan through K&R, so I hope I have not taken anything
out of context.  The only reason this objection stuck in my mind is that I had
once thought of doing an implementation where NULL != 0, but after further
reading convinced myself the implementation would be invalid.

			Share and Enjoy!

			      Brian Haug

Disclaimer:  These opinions are mine alone.

tim@nucleus.amd.com (Tim Olson) (03/12/90)

In article <1990Mar11.222634.2701@almira.uucp> haug@Columbia.NCR.COM (Brian Haug) writes:
| In article <1990Mar11.015305.28264@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
| >This code is illegal in C, and always has been.  There is no general
| >relationship between the integer value 0 and the null pointer.  An
| >integer *constant* value equal to zero in a pointer context is automatically
| >converted to a null pointer; this may be a non-trivial conversion, and is
| >guaranteed valid only at compile time.  The result of converting the
| >integer value 0 to a pointer at run time is implementation-defined.
| 
| Henry,
|     I beg to differ.  I see where you are coming from with the wording:
| "an integral constant expression with the value 0", but in K&R First edition,
| page 97, there is an explicit example in which 0 is returned by a function
| returning a pointer to a character.

That is a correct use.  Just as in an assignment to a pointer type,
the compiler knows that it is to coerce the integral constant 0 to the
machine-specific nil representation for that pointer type.

This differs from the case that was being discussed, in that the
return value is specifically declared to be a pointer type.  In the
previous discussion, the return value was declared "int".


|     Also, from Appendix A, section 7.7 (Equality operators) "A pointer may be
| compared to an integer, but the result is machine dependent unless the integer
| is the constant 0.  A pointer to which 0 has been assigned is guaranteed not to
	 ^^^^^^^^^^
| point to any object, and will appear to be equal to 0; in conventional usage,
| such a pointer is considered to be null."  This implies to me that I can
| compare a pointer to any arbitrary expression which evaluates to 0 and have
| machine independent behavior.

Not unless it is a constant expression that evaluates to 0.  Constant
expressions are a subset of expressions that involve only integral
constants and certain operators -- function calls are not allowed.


	-- Tim Olson
	Advanced Micro Devices
	(tim@amd.com)

henry@utzoo.uucp (Henry Spencer) (03/13/90)

In article <1990Mar11.222634.2701@almira.uucp> haug@Columbia.NCR.COM (Brian Haug) writes:
>    I beg to differ.  I see where you are coming from with the wording:
>"an integral constant expression with the value 0", but in K&R First edition,
>page 97, there is an explicit example in which 0 is returned by a function
>returning a pointer to a character...

K&R1 was kind of sloppy about this, partly because the pdp11 let you get
away with a lot.  However, the definition of C was the reference manual,
not the textbook...

>    Also, from Appendix A, section 7.7 (Equality operators) "A pointer may be
>compared to an integer, but the result is machine dependent unless the integer
>is the constant 0.  A pointer to which 0 has been assigned is guaranteed not to
>point to any object, and will appear to be equal to 0; in conventional usage,
>such a pointer is considered to be null."  This implies to me that I can
>compare a pointer to any arbitrary expression which evaluates to 0 and have
>machine independent behavior...

How do you conclude this from a statement which explicitly says that the
comparison is machine-dependent unless the *constant* 0 is used?

>And from section 7.14, "it is guaranteed that
>the assignment of the constant 0 to a pointer will produce a null pointer
>distinguishable from a pointer to any object."  Since the "BNF" for this
>section lists "lvalue = expression," I would assume that if expression
>evaluates to 0 then the pointer will produce a null pointer distinguishable
>from a pointer to any object.

Again, you're ignoring the fact that the text quite explicitly calls for
the *constant* 0 to be used.  The BNF defines only the syntax, not the
semantics and type structure; it must be read together with the text to
understand the language.

>    As far as ANSI C goes, I do not have easy access to the standard at the
>current time, nor the experience reading it.  However, I would have expected
>this behavior to be brought forward...

ANSI C carries forward the rules from K&R1 and later references:  a null
pointer is an integer *constant* zero converted to a pointer type.  ANSI
does permit a *constant* expression which evaluates to zero, e.g. "1-1",
but that does not remove the requirement that the value be known at
compile time.
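
A short sketch of the rule just stated, assuming a C compiler in C90 or later mode. The function names are hypothetical. Each initializer is a null pointer constant whose value is known at compile time; some compilers warn about the "1 - 1" form even though the language accepts it.

```c
#include <assert.h>
#include <stddef.h>

/* Counts how many of the three spellings produce a pointer equal to 0. */
static int count_null_forms(void)
{
	char *a = 0;		/* integer constant 0 */
	char *b = 1 - 1;	/* constant expression with the value 0 */
	char *c = NULL;		/* whatever null pointer constant <stddef.h> supplies */
	int n = 0;

	if (a == 0)
		n++;
	if (b == 0)
		n++;
	if (c == 0)
		n++;
	return n;
}

static void demo_constant_expression(void)
{
	assert(count_null_forms() == 3);
}
```

A function call such as nil() cannot appear in any of these positions and still count as a null pointer constant, because a call is not a constant expression.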

>... I had
>once thought of doing an implementation where NULL != 0, but after further
>reading convinced myself the implementation would be invalid.

Not invalid, no; I think there are one or two such.  Unwise, yes, because
many many programs are sloppy about this and have to be fixed to run on
such an implementation.
-- 
MSDOS, abbrev:  Maybe SomeDay |     Henry Spencer at U of Toronto Zoology
an Operating System.          | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

gordon@sneaky.UUCP (Gordon Burditt) (03/13/90)

Implementations in which the value of a null pointer is not zero are
legal, but painful.  

>    I beg to differ.  I see where you are coming from with the wording:
>"an integral constant expression with the value 0", but in K&R First edition,
>page 97, there is an explicit example in which 0 is returned by a function

The example shows the integer constant 0 being returned from a function that
returns a pointer to character.  Fine.  The compiler can convert it
to:
	movl	#0xdeadbee1,d0
	rts
at compile time.  The compiler knows it has to convert 0 in a return statement
to the type of the return type of the function.

>returning a pointer to a character.  Also, from the text "C guarantees that no
>pointer that validly points at data will contain zero, so a return value of
>zero can be used to signal an abnormal event,..."

"contain 0": that combination of bits that results from executing:
	p = 0;
commonly implemented by:
	movl	#0xdeadbee1,_p

>    Also, from Appendix A, section 7.7 (Equality operators) "A pointer may be
>compared to an integer, but the result is machine dependent unless the integer
>is the constant 0.  A pointer to which 0 has been assigned is guaranteed not to
        ^^^^^^^^^^
>point to any object, and will appear to be equal to 0; in conventional usage,

"appear to be equal to zero" means p == 0 is true, not something involving
peeking at bits in a machine register.

>such a pointer is considered to be null."  This implies to me that I can
>compare a pointer to any arbitrary expression which evaluates to 0 and have

It sure doesn't imply that to me.  The zero being discussed is constant 0, 
not an int variable containing 0.  The comparison p == 0 (to yield a "boolean" 
value, not a branch) might be translated as:

	cmpl	#0xdeadbee1,_p
	jeq	.L5001
	movl	#0,d0
	jmp	.L5002
.L5001: movl	#1,d0
.L5002:

Slow, yes.  But a perfectly legal implementation.

>machine independent behavior.  And from section 7.14, "it is guaranteed that
>the assignment of the constant 0 to a pointer will produce a null pointer
                       ^^^^^^^^^^
>distinguishable from a pointer to any object."  Since the "BNF" for this
>section lists "lvalue = expression," I would assume that if expression
>evaluates to 0 then the pointer will produce a null pointer distinguishable
>from a pointer to any object.

Where do you get that assumption?  It is legal to assign an integer expression
to a pointer, but that doesn't make it machine-independent.  The BNF
certainly doesn't say anything explicitly about the machine-dependent
characteristics of addition with overflow, division by zero, or shifts by
negative amounts, so why should it be expected to say something in this
case?  

Just before the section you quoted, it says, "The compilers currently allow 
a pointer to be assigned to an integer, an integer to a pointer, and a 
pointer to a pointer of another type.  The assignment is a pure copy 
operation, with no conversion.  This usage is unportable, and may produce 
pointers which cause addressing exceptions when used."

Since "an integer variable containing zero" does not fit the exception
for the integer constant zero, assigning one to a pointer IS unportable.

>    As far as ANSI C goes, I do not have easy access to the standard at the
>current time, nor the experience reading it.  However, I would have expected
>this behavior to be brought forward (hopefully not a silly idea, but we'll
>see).

ANSI C seems quite careful NOT to require that the representation of a
null pointer is all zero bits.

>     This was a quick scan through K&R, so I hope I have not taken anything
>out of context.  The only reason this objection stuck in my mind is that I had
>once thought of doing an implementation where NULL != 0, but after further
>reading convinced myself the implementation would be invalid.

Don't confuse NULL and the bits used to represent a null pointer.
NULL == 0 must be true.  assembly_language_representation_of(NULL) == 0
need not be true.  An implementation in which ((char *) NULL) == 0 and 
((char *) NULL) == 0xdeadbee1 are both true is possible, provided that 
assembly-language address 0xdeadbee1 isn't a possible address for a variable.
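
The difference between "p == 0" and "the bits of p are zero" can be checked directly. The following is only a sketch with hypothetical function names: the first function must return 1 on any conforming implementation, while the second reports what this particular implementation stores, and either answer is conforming.

```c
#include <assert.h>
#include <string.h>

/* Always 1 on a conforming implementation: the 0 is converted to char *. */
static int compares_equal_to_zero(void)
{
	char *p = 0;

	return p == 0;
}

/* 1 if the stored bytes of a null char * are all zero, 0 otherwise.
   The language permits either result. */
static int null_bytes_are_all_zero(void)
{
	char *p = 0;
	unsigned char bytes[sizeof p];
	size_t i;

	memcpy(bytes, &p, sizeof p);	/* copy out the object representation */
	for (i = 0; i < sizeof p; i++)
		if (bytes[i] != 0)
			return 0;
	return 1;
}

static void demo_null_representation(void)
{
	int zero_bits = null_bytes_are_all_zero();

	assert(compares_equal_to_zero());
	assert(zero_bits == 0 || zero_bits == 1);
}
```

A program that behaves differently depending on the second result is relying on the representation, not on the language.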

						Gordon L. Burditt
						sneaky.lonestar.org!gordon

henry@utzoo.uucp (Henry Spencer) (03/14/90)

In article <1990Mar12.175613.12082@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>>... in K&R First edition,
>>page 97, there is an explicit example in which 0 is returned by a function
>>returning a pointer to a character...
>
>K&R1 was kind of sloppy about this, partly because the pdp11 let you get
>away with a lot...

Oops, I goofed.  (Yesterday was a bad day...)  There is no problem with
this.  The function is known to return pointer to character, so the
"return(0)" [actually "return(NULL)", prefaced by "#define NULL 0"]
gets compiled as if it were "return((char *)0)" and everything is fine.
By the time the value reaches the caller, it has already been converted
to a pointer.  Automatic conversion in "return" has been in the language
since well before K&R1 (I can remember when it arrived, but that was a
*long* time ago).

On closer inspection of K&R1, at the prodding of a friend, no, K&R1 is
*not* sloppy about this.  I can't find any instance where the rules are
being violated.  Conversion of zero to pointer is always conversion of the
constant 0.
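
For illustration, here is a small allocator in the general spirit of the kind of example being discussed. It is a sketch, not the book's code; tiny_alloc, pool, next_free and POOL_SIZE are made-up names. The point is that "return 0" appears inside a function declared to return char *, so the conversion to a null pointer happens where the constant is visible.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 100			/* hypothetical pool size */

static char pool[POOL_SIZE];
static char *next_free = pool;

/* Returns a pointer to n bytes from the pool, or a null pointer if they do not fit. */
static char *tiny_alloc(size_t n)
{
	if ((size_t)(pool + POOL_SIZE - next_free) >= n) {
		next_free += n;
		return next_free - n;
	}
	return 0;	/* compiled as if it were return (char *)0 */
}

static void demo_tiny_alloc(void)
{
	char *p = tiny_alloc(10);
	char *q = tiny_alloc(POOL_SIZE);	/* cannot fit: only 90 bytes are left */

	assert(p != 0);
	assert(q == 0);
}
```

The caller's test "q == 0" is equally safe, since that 0 is also a constant in a pointer context.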
-- 
MSDOS, abbrev:  Maybe SomeDay |     Henry Spencer at U of Toronto Zoology
an Operating System.          | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

guy@auspex.auspex.com (Guy Harris) (03/14/90)

>    I beg to differ.  I see where you are coming from with the wording:
>"an integral constant expression with the value 0", but in K&R First edition,
>page 97, there is an explicit example in which 0 is returned by a function
>returning a pointer to a character.

Since the compiler knows that the function is supposed to return a "char
*", the compiler knows that the integral constant with the value 0 being
returned by that function should be converted to a null pointer-to-char.
In other words, said function doesn't return an integral value of 0, it
returns a null pointer-to-char.  This conversion is no different from
the conversion in

	char *p;

	p = 0;

or

	extern void foo(char *p);

	foo(0);

or

	char *p;

	if (p == 0)
		...

or even

	char *p;

	if (!p)		/* equivalent to previous example */
		...

The same applies to:

>Also, from the text "C guarantees that no
>pointer that validly points at data will contain zero, so a return value of
>zero can be used to signal an abnormal event,..."

since if your function returns a pointer value, it had better be defined
as doing so....

ken@argus.UUCP (Kenneth Ng) (03/14/90)

In article <1990Mar12.175613.12082@utzoo.uucp>, henry@utzoo.uucp (Henry Spencer) writes:
: ANSI C carries forward the rules from K&R1 and later references:  a null
: pointer is an integer *constant* zero converted to a pointer type.  ANSI
: does permit a *constant* expression which evaluates to zero, e.g. "1-1",
: but that does not remove the requirement that the value be known at
: compile time.
[edit]
: >once thought of doing an implementation where NULL != 0, but after further
: >reading convinced myself the implementation would be invalid.
: Not invalid, no; I think there are one or two such.  Unwise, yes, because
: many many programs are sloppy about this and have to be fixed to run on
: such an implementation.

I'm confused, is a non zero NULL pointer valid or not?  I'm not asking if
it will break 90% of the programs out there that use 0 instead of NULL.
On a 370 here I'd love to define NULL as -1 because it will cause an
immediate addressing exception if it is referenced.  But, I was told that
NULL is defined as always being the value zero.



-- 
Kenneth Ng: Post office: NJIT - CCCC, Newark New Jersey  07102
uucp !andromeda!galaxy!argus!ken *** NOT ken@bellcore.uucp ***
bitnet(prefered) ken@orion.bitnet  or ken@orion.njit.edu

henry@utzoo.uucp (Henry Spencer) (03/15/90)

In article <1623@argus.UUCP> ken@argus.UUCP (Kenneth Ng) writes:
>I'm confused, is a non zero NULL pointer valid or not?  I'm not asking if
>it will break 90% of the programs out there that use 0 instead of NULL.
>On a 370 here I'd love to define NULL as -1 because it will cause an
>immediate addressing exception if it is referenced.  But, I was told that
>NULL is defined as always being the value zero.

NULL is just a convenient way of writing the constant 0, for all practical
purposes.  It is *not*, by itself, a null pointer, because there is no
"generic null pointer" type.  NULL has to be converted to a specific pointer
type to become a null pointer.  That (compile-time) conversion may well
change the representation in some strange way.  `NULL' must remain 0 (or a
close equivalent), but `(foo *)NULL' can be a different story.

There is absolutely nothing wrong with having a pointer representation in
which the bit pattern for a null pointer is not all zeros... except that
there are a lot of old, badly-written programs which will break.  Thus my
earlier comment that it is valid but unwise.
-- 
MSDOS, abbrev:  Maybe SomeDay |     Henry Spencer at U of Toronto Zoology
an Operating System.          | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

guy@auspex.auspex.com (Guy Harris) (03/15/90)

>I'm confused, is a non zero NULL pointer valid or not?

The confusion stems from confusion over the meaning of "non-zero".

4.1.5:

	...

	The macros are

	     NULL

	which expands to an implementation-defined null pointer
	constant; ...

3.2.2.3:

	...

	   An integral constant expression with the value 0, or such an
	expression cast to type "void *", is called a *null pointer
	constant*.

So if some implementation defines NULL as something other than 0 or
(void *)0 or (17 - 17) or ((void *)(17 - 17)) or..., the implementation
is not a valid C implementation. 

*However*, this does *not* mean that the implementation must *represent*
a null pointer of any type as a bit string of all zero bits.

>I'm not asking if it will break 90% of the programs out there that
>use 0 instead of NULL.

Defining NULL as something other than the values listed (or implied...)
and having that be the only null pointer constant will break all of the
programs out there that use 0 instead of NULL.  Thus, doing so would be
invalid.

Representing a null pointer constant as something other than a bit
string of all zero bits will not break *any* valid ANSI C program.  It
will break some *in*valid programs, which is why Henry described it as
"unwise" even though it's completely valid.

>On a 370 here I'd love to define NULL as -1 because it will cause an
>immediate addressing exception if it is referenced.  But, I was told that
>NULL is defined as always being the value zero.

If by "define NULL as -1" you mean putting in something like

	#define	NULL	-1

that would, indeed, be invalid.

If you mean changing your 3*0 C implementation such that null pointers
are represented by 32 1 bits, *including making sure that the statements

	char *p;

	p = 0;

assign a value of 32 1 bits to "p"* (this is a bit that confuses a lot
of people who think that the "p = 0;" is obliged to assign an
all-zero-bits value to "p"), then it would be valid as long as
you make sure this representation convention is followed consistently;
you also have to make sure that non-C code that calls C code or is
called by C code obeys this convention.
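
What such a compiler has to do can be modelled in a few lines. This is a toy model under stated assumptions, not a description of any real 370 compiler: toy_ptr stands in for a 32-bit pointer cell, NULL_BITS is the assumed all-ones null representation, and every function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model: a "pointer" is a 32-bit cell; NULL_BITS is the assumed
   all-ones null representation on the hypothetical machine. */
typedef struct { uint32_t bits; } toy_ptr;
#define NULL_BITS UINT32_C(0xFFFFFFFF)

/* What the compiler would emit for "p = 0;". */
static toy_ptr toy_assign_zero(void)
{
	toy_ptr p;

	p.bits = NULL_BITS;
	return p;
}

/* What the compiler would emit for "p == 0". */
static int toy_equals_zero(toy_ptr p)
{
	return p.bits == NULL_BITS;
}

/* A pure bit copy of an integer, as a careless run-time conversion
   might do; with bits == 0 this is NOT a null pointer in this model. */
static toy_ptr toy_from_raw_bits(uint32_t bits)
{
	toy_ptr p;

	p.bits = bits;
	return p;
}

static void demo_toy_null(void)
{
	assert(toy_equals_zero(toy_assign_zero()));
	assert(!toy_equals_zero(toy_from_raw_bits(0)));
}
```

In the model, source-level "0" and the stored bits never have to agree, as long as every assignment and comparison goes through the same translation.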

karl@haddock.ima.isc.com (Karl Heuer) (03/15/90)

In article <1990Mar14.164539.23685@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>There is absolutely nothing wrong with having a pointer representation in
>which the bit pattern for a null pointer is not all zeros... except that
>there are a lot of old, badly-written programs which will break.  Thus my
>earlier comment that it is valid but unwise.

Note that "p = 0", "p == 0", "!p", "char *f() { return 0; }" are *not*
examples of such badly-written code; they may be bad style, but the compiler
is required to generate correct code involving a true null pointer.  The only
"dangerous" context (other than hacking with unions and such) is when a null
pointer constant is being passed as an argument to a function.  (In C++ and
ANSI C, any argument not covered by a prototype.  In old C, any function
argument at all.)  In particular, neither of the two calls
	execl("/bin/sh", "sh", "-i", 0);
	execl("/bin/sh", "sh", "-i", NULL);
is correct; it should be written as either of
	execl("/bin/sh", "sh", "-i", (char *)0);
	execl("/bin/sh", "sh", "-i", (char *)NULL);

But this problem can occur even without strange null pointers: such sloppy
code will already break on certain implementations where pointers and ints
have different lengths.
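
The same hazard can be reproduced with any variadic function. The sketch below uses a hypothetical count_args() in place of execl(); it walks its char * arguments until it sees a null char *, so the terminator has to be passed as a pointer, not as a bare 0.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical execl-like function: counts char * arguments up to a null char *. */
static int count_args(const char *first, ...)
{
	va_list ap;
	int n = 0;
	const char *s;

	va_start(ap, first);
	for (s = first; s != 0; s = va_arg(ap, char *))
		n++;
	va_end(ap);
	return n;
}

static void demo_count_args(void)
{
	/* The cast matters: nothing tells the compiler that the variable
	   arguments are pointers, so a bare 0 would be passed as an int. */
	assert(count_args("sh", "-i", (char *)0) == 2);
}
```

Writing count_args("sh", "-i", 0) would make va_arg read an int as if it were a char *, which is undefined even on machines where it appears to work.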

Karl W. Z. Heuer (karl@ima.ima.isc.com or harvard!ima!karl), The Walking Lint
Followups to comp.lang.c.

randall@uvaarpa.virginia.edu (Randall Atkinson) (03/15/90)

Henry's observation on MSDOS and Doug's comments about the
shortcomings of the Intel architecture are both very well taken.

On at least one such system that I am forced to deal with regularly,
the NULL pointer is internally represented as address F000:0000 hex.
This invariably trips up newcomers to the compiler in question,
which compiler is originally from Washington State (yes, them :-(.

ray@philmtl.philips.ca (Raymond Dunn) (03/16/90)

>In article <1623@argus.UUCP> ken@argus.UUCP (Kenneth Ng) writes:
>>But, I was told that NULL is defined as always being the value zero.

All the things that have been said about NULL and 0 and the *internal*
hidden actual value perhaps being different are of course true.

Perhaps Kenneth was actually looking for a simpler answer though.  In any
program, you are free to use *any* value you like to represent *your* value
of the illegal pointer.

So long as you don't define NULL to be this, and so long as you don't pass it
to library routines and expect it to be equivalent to passing NULL, or
expect library routines to return that value, you will have no problems.

As an example you could define:

#define NOPTR(type) ((type *)-1)

and use for example NOPTR(char) within your own code with the restrictions
stated.  It is still of course legal to pass such a value to standard
routines, but if they process NULL gently, they probably won't process your
NOPTR gently, but will indeed generate an exception.
-- 
Ray Dunn.                    | UUCP: ray@philmtl.philips.ca
Philips Electronics Ltd.     |       ..!{uunet|philapd|philabs}!philmtl!ray
600 Dr Frederik Philips Blvd | TEL : (514) 744-8200  Ext : 2347 (Phonemail)
St Laurent. Quebec.  H4M 2S9 | FAX : (514) 744-6455  TLX : 05-824090

karl@haddock.ima.isc.com (Karl Heuer) (03/16/90)

In article <12347@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>	#define NULL ((void*)(1-1))
>	#define NULL 0L
>	#define NULL ((void*)0)
>	#define NULL 0
>[All of the above are legal, though] the first example is silly.  I
>personally recommend the last example, which (despite some arguments you will
>get from certain IBM PC implementors) is always a correct way to define NULL
>and is simpler than the others.  However, the next-to-last example does have
>one advantage, namely that it will cause diagnostics to be generated for more
>instances of abuse of the NULL macro than will the last example.

Another idea I've been toying with is
	#define NULL __builtin_NULL
where __builtin_NULL is a keyword that, in a pointer context, acts like a
properly-typed null pointer constant (i.e. just like `0' does), and in a non-
pointer context, causes a diagnostic to be issued.  This is "even better" than
the ((void *)0) definition since it should catch *all* abuses of the macro,
though it does of course depend on having a hook in the compiler.

Strictly speaking, NULL is supposed to be defined as 0 or a casted 0, but I
think this would be legal by the as-if rule.  (Since no correct program could
tell the difference.)

(It is perhaps worth mentioning again that none of this has anything to do
with the internal representation of a null pointer, which may or may not have
all bits zero.)

Karl W. Z. Heuer (karl@ima.ima.isc.com or harvard!ima!karl), The Walking Lint

chris@mimsy.umd.edu (Chris Torek) (03/16/90)

In article <1136@philmtl.philips.ca> ray@philmtl.philips.ca
(Raymond Dunn) writes:
>... In any program, you are free to use *any* value you like to represent
>*your* value of the illegal pointer.

No, not really.

>As an example you could define:
>
>#define NOPTR(type) ((type *)-1)

The conversion from any integral constant other than zero to any
pointer type is up to the implementation.  Most often, this means
that for any

	#define XXX ((mytype *)some_int_constant)

you find two problems: the constant you picked happens to be equal
to a valid pointer, and/or: the constant you picked causes the run
time system to phone your mother and leave a nasty message on her
answering machine.  (Well, not really. :-)  But something like
`NOPTR(char)' may cause a run-time fault---a `bus error (core dumped)'
sort of error.)

Incidentally, the reason for the latter is that the system is allowed
to trap as soon as an invalid pointer is examined or created.

Again, the only solidly defined conversion from integral constant to
pointer is for zero: the integral constant zero becomes, in a pointer
context, a nil pointer of the appropriate type.  (In addition, every
nil pointer is in principle different, except for nil pointer to char
vs. nil pointer to void.)
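
If a program really needs a second "no pointer" marker distinct from the nil pointer, one portable alternative (not proposed in this thread, and offered only as a sketch) is to use the address of a private object instead of a converted integer. NO_PTR, no_ptr_marker and is_no_ptr are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* A private object whose only job is to have a unique, valid address. */
static char no_ptr_marker;

/* Hypothetical replacement for NOPTR: a valid pointer that is neither null
   nor the address of any object the program otherwise uses. */
#define NO_PTR ((void *)&no_ptr_marker)

static int is_no_ptr(const void *p)
{
	return p == NO_PTR;
}

static void demo_no_ptr(void)
{
	int x = 0;

	assert(is_no_ptr(NO_PTR));
	assert(!is_no_ptr(NULL));	/* an object's address never equals a null pointer */
	assert(!is_no_ptr(&x));
}
```

Unlike ((type *)-1), creating or comparing this value cannot trap, because it is an ordinary valid pointer. Dereferencing it is still a bug, but a quiet one.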
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@cs.umd.edu	Path:	uunet!mimsy!chris

roberto@ssd.csd.harris.com (Roberto Shironoshita) (03/17/90)

In article <1623@argus.UUCP> ken@argus.UUCP (Kenneth Ng) writes:
> I'm confused, is a non zero NULL pointer valid or not?  I'm not asking if
> it will break 90% of the programs out there that use 0 instead of NULL.
> On a 370 here I'd love to define NULL as -1 because it will cause an
> immediate addressing exception if it is referenced.  But, I was told that
> NULL is defined as always being the value zero.

The Dec. 88 draft states that <stddef.h> defines the macro NULL, which
expands to an implementation-defined null pointer constant (section 4.1.5).

\begin{quote}
	An integral constant expression with the value 0, or such an
	expression cast to type void *, is called a _null pointer
	constant_. (...)
\end{quote}


So the implementation has two choices:

	#define NULL	0
or
	#define NULL	((void *)0)

(where '0' stands for an integral constant expression that evaluates
to 0).

Your compiler is free to use whatever address it pleases when
translating code, so long as the program does not consider it a
valid address.  Note that dereferencing a NULL pointer causes
undefined behavior.
--
                               ||   Internet: shirono@ssd.csd.harris.com
     Roberto Shironoshita      ||
      Harris Corporation       ||             ...!novavax-\
   Computer Systems Division   ||   UUCP:                  -!hcx1!shirono
                               ||             ...!uunet---/
DISCLAIMER: The opinions expressed here are my own; they in no way reflect the
            opinion or policies of Harris Corporation.

gwyn@smoke.BRL.MIL (Doug Gwyn) (03/17/90)

In article <1136@philmtl.philips.ca> ray@philmtl.philips.ca (Raymond Dunn) writes:
>As an example you could define:
>#define NOPTR(type) ((type *)-1)

A conforming implementation is not required to support this.
It is much better to simply use the implementation-defined NULL
macro or simply (type*)0.

throopw@sheol.UUCP (Wayne Throop) (03/18/90)

> From: guy@auspex.auspex.com (Guy Harris)
>> I'm confused, is a non zero NULL pointer valid or not?
> The confusion stems from confusion over the meaning of "non-zero".

While what Guy says in explanation is certainly correct, I think
that the confusion is over the meaning of "NULL pointer", not
non-zero.  That is, Guy explained things thoroughly, but the basis
of the problem is the distinction between properties of a name
of a thing and properties of the thing itself.

In the phrase "NULL pointer" in C, there are TWO levels of naming
going on, either one of which could have the property "zeroness".

Or let me put it this way.

In C, the name of the nil pointer is called "NULL".

But that's only what the name is CALLED, you see.

The NAME of the nil pointer is "0".

The nil pointer itself can have any bit pattern it pleases. 



The above sequence of statements explains
   1) why NULL should only be used as a pointer (this is a convention).
      It shouldn't be used to mean ASCII NUL.
   2) why the only proper definition of the macro NULL is the string "0".
      (or, actually, any way to spell the constant zero in C)
      (though, granted, the advent of ANSI C means that the best
       definition of NULL is arguably the string "((void*)0)" and
       variants on this theme.  Nevertheless "0" remains proper.)
   3) in just what way C's nil pointer may have a non-zero bit pattern.

It omits explanations of why one should never (in K&R1 C) pass NULL
as an argument without a cast to a specific pointer type.

It also omits explanation of the fact that there isn't really one
nil pointer, or (necessarily) one nil pointer bit pattern.  (This
is due to the fact that all pointer values in C have specific
types.)

So, to conclude, "NULL" in C should never have a "non-zero" definition.

Any particular nil pointer value (eg, one named ((void*)0)) can have any
bit pattern the implementor of a C language system chooses. 
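
The point about never passing a bare NULL where no prototype supplies the pointer type can be sketched with a variadic function, since the variable arguments get no such conversion. The function count_strings is hypothetical, written only for this illustration. Its argument list is terminated by (char *)0, so the terminator reaches the callee as a nil pointer to char whatever NULL expands to and whatever bit pattern that nil pointer has.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical variadic function: counts its string arguments up to a
 * terminating nil pointer to char. */
int count_strings(const char *first, ...)
{
    va_list ap;
    const char *s;
    int n = 0;

    va_start(ap, first);
    /* Every variable argument is fetched as a char *, so the caller
     * must pass the terminator with exactly that type. */
    for (s = first; s != 0; s = va_arg(ap, char *))
        n++;
    va_end(ap);
    return n;
}

/* Usage: the terminator is spelled (char *)0, never a bare 0 or NULL. */
void check_count_strings(void)
{
    assert(count_strings("a", "b", (char *)0) == 2);
    assert(count_strings((char *)0) == 0);
}
```
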

--
Wayne Throop <backbone>!mcnc!rti!sheol!throopw or sheol!throopw@rti.rti.org

henry@utzoo.uucp (Henry Spencer) (03/18/90)

In article <ROBERTO.90Mar16130444@ecx1.ssd.csd.harris.com> shirono@ssd.csd.harris.com writes:
>	An integral constant expression with the value 0, or such an
>	expression cast to type void *, is called a _null pointer
>	constant_. (...)

Note, also, that the cast to `void *' is legal not because it somehow
creates a generic null pointer -- there is no such thing in C -- but
because there may not be an integer type of the same size as a pointer,
and breakage of old programs is minimized if NULL is the same size as
a pointer.  (Even on machines where pointers are not all the same size,
one can at least reduce the breakage somewhat this way.)
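
The size argument can be made concrete with a small sketch; the helper names below are invented for illustration. A bare 0 has type int, while ((void *)0) has the size of a pointer to void. On an implementation where those two sizes differ, an old program that passes NULL to an unprototyped function passes the right number of bytes only with the second spelling, and then only for pointer types that share the size of void *.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: size of the expression 0, which has type int. */
size_t size_of_plain_zero(void)
{
    return sizeof 0;
}

/* Hypothetical helper: size of the expression ((void *)0). */
size_t size_of_void_zero(void)
{
    return sizeof ((void *)0);
}

/* The two sizes are tied to int and void * respectively; whether they
 * agree with each other depends on the implementation. */
void check_sizes(void)
{
    assert(size_of_plain_zero() == sizeof(int));
    assert(size_of_void_zero() == sizeof(void *));
}
```
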
-- 
MSDOS, abbrev:  Maybe SomeDay |     Henry Spencer at U of Toronto Zoology
an Operating System.          | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

rfg@ics.uci.edu (Ronald Guilmette) (03/18/90)

In article <10582@alice.UUCP> shopiro@alice.UUCP (Jonathan Shopiro) writes:
>
>
>To me the interesting question in this null pointer business is
>whether there is any circumstance where it is legal to say *p
>where p is a null pointer.
...
>I think the fundamental issue here is when is a pointer dereferenced?
>(Since it is clearly illegal to dereference the null pointer).  I don't
>see why writing
...
>
>	&*p
>
>should dereference p.

I think this discussion is coming full circle.

If you scan backwards over the (20-30) postings on this issue you'll note that
I said almost exactly this same thing 20-30 messages ago.

It was quickly pointed out to me that I was being naive (which I now
freely admit that I was) because the question becomes confusing if p
contains a NULL pointer value.

Perhaps we should just take a vote and put the issue to bed.


// Ron Guilmette (rfg@ics.uci.edu)
// C++ Entomologist
// Motto:  If it sticks, force it.  If it breaks, it needed replacing anyway.

raeburn@athena.mit.edu (Ken Raeburn) (03/23/90)

In article <ROBERTO.90Mar16130444@ecx1.ssd.csd.harris.com>,
roberto@ssd.csd.harris.com (Roberto Shironoshita) writes:
[Dec '88 dpANS says:]
> \begin{quote}
> 	An integral constant expression with the value 0, or such an
> 	expression cast to type void *, is called a _null pointer
> 	constant_. (...)
> \end{quote}
> 
> So the implementation has two choices:
> 
> 	#define NULL	0
> or
> 	#define NULL	((void *)0)

A small point of the logic of these arguments is nagging at me.  Does the
standard say that values equivalent to (void *)0 in pointer contexts
cannot be used for the definition of NULL?

Just to be weird, let us consider a machine/compiler combination which has
an integral type that has more bits than a pointer uses.  Say the machine
uses 24 bits for addressing, and a "long int" provides 32.  Could not
"(void *) 0xc0000000" be used for NULL?  The compiler would have to
understand that the top bits should be stripped out for comparisons, or
the architecture could provide a pointer-compare instruction; would it
need to provide anything else?  Or is this simply disallowed because the
standard is worded wrong?  Or is there something fundamental about it I am
missing that would make it fail miserably?

It would probably be a strange architecture that would do things this way,
but I expect there are some strange architectures out there, each with its
own reasons.
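
The comparison rule in this thought experiment can be simulated in portable C. Everything below is an assumption made for illustration: sim_ptr, SIM_ADDR_MASK, SIM_NULL and sim_ptr_equal are invented names, and a 32-bit unsigned word stands in for a pointer on the hypothetical machine. The sketch says nothing about whether the standard permits such a NULL. It only shows that, once the top bits are stripped, the proposed word compares equal to address zero.

```c
#include <assert.h>
#include <stdint.h>

/* Simulation only: a 32-bit word standing in for a pointer on the
 * hypothetical machine, where only the low 24 bits address memory. */
typedef uint32_t sim_ptr;

#define SIM_ADDR_MASK  UINT32_C(0x00FFFFFF)   /* assumed 24 address bits */
#define SIM_NULL       UINT32_C(0xC0000000)   /* the proposed NULL word  */

/* Pointer comparison as the hypothetical compiler would have to emit it:
 * strip the top 8 bits from both operands before comparing. */
int sim_ptr_equal(sim_ptr a, sim_ptr b)
{
    return (a & SIM_ADDR_MASK) == (b & SIM_ADDR_MASK);
}

/* Under the masking rule the proposed word is indistinguishable from
 * address zero, and differs from any other 24-bit address. */
void check_sim(void)
{
    assert(sim_ptr_equal(SIM_NULL, 0));
    assert(!sim_ptr_equal(SIM_NULL, UINT32_C(0x00000001)));
    assert(sim_ptr_equal(UINT32_C(0xFF001234), UINT32_C(0x00001234)));
}
```
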

-- Ken

jc@atcmp.nl (Jan Christiaan van Winkel) (03/23/90)

From article <1990Mar23.072132.7307@athena.mit.edu>, by raeburn@athena.mit.edu (Ken Raeburn):
> an integral type that has more bits than a pointer uses.  Say the machine
> uses 24 bits for addressing, and a "long int" provides 32.  Could not
> "(void *) 0xc0000000" be used for NULL?  The compiler would have to
> 
> It would probably be a strange architecture that would do things this way,
> but I expect there are some strange architectures out there, each with its
> own reasons.

I believe the Apple Macintosh used to use the upper 8 bits of a pointer
to store information such as a "relocatable" flag or something.  Now
they've made their system "32-bit clean" and pointers are the full 32 bits.
Note that I do not want to imply that the Mac has a strange architecture,
nor that it doesn't... :-)

JC.

guy@auspex.auspex.com (Guy Harris) (03/24/90)

>Does the standard say that values equivalent to (void *)0 in pointer
>contexts cannot be used for the definition of NULL?

Even if it did, what would the point be in doing so?  If the value truly
is equivalent, why is it any better than 0 or "(void *)0"?

karl@haddock.ima.isc.com (Karl Heuer) (03/24/90)

In article <1990Mar23.072132.7307@athena.mit.edu> Ken Raeburn <Raeburn@MIT.EDU> writes:
>Could not "(void *) 0xc0000000" be used for [the macro] NULL?

Oh, I suppose you *could* (by the as-if rule)--but the compiler still has to
generate the right answer for "0" and "(void *)0", so now you've got one more
"magic string" to worry about%.  If you're going to do that, you might as well
have it expand into "__builtin_NULL", which also allows compile-time detection
of improper NULL usage&.

Karl W. Z. Heuer (karl@ima.ima.isc.com or harvard!ima!karl), The Walking Lint
________
% Also, people who understand that "0" is *always* a valid definition for
  "NULL" will suspect that you don't know what you're doing.
& E.g. wrongly using NULL instead of '\0', or passing an uncasted NULL as a
  function argument not covered by a prototype.