[comp.lang.c] the nil pointer is not zero

chris@mimsy.umd.edu (Chris Torek) (11/13/90)

In article <s64421.658489804@zeus> s64421@zeus.usq.EDU.AU (house ron) writes:
>Zero is actually a perfectly legitimate address, but it got snaffued
>by C to stand for a NULL pointer.

This is true in many *implementations*, but it is not part of C itself.
A nil pointer is a pointer that points to no object.  The actual bit
pattern used to represent a nil pointer at runtime is up to the
implementation, may (but is not required to) depend on the type of
object to which the pointer can point, and is not necessarily all 0
bits.

On computers on which address location zero has `interesting' contents
(e.g., a boot reset vector, or simply regular memory), implementations
are faced with two choices:  either the nil pointer to some
type(s) must not be the bit pattern that means `address location
zero'---typically, this means that, e.g., something like 0x3fc0ee7b
must be used as the bit pattern for nil pointers---or else the
implementation must make sure no object ever happens to be assigned
address 0.  In the former case, one still writes

	char *p = 0;

to put a nil pointer (value 0x3fc0ee7b) into p.
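
A minimal sketch of that distinction, under stated assumptions: the
helper names is_nil and nil_is_all_zero_bytes are hypothetical and
appear nowhere in the thread.  The first compares at the language
level, where 0 always means "nil pointer"; the second only reports
what the stored bytes happen to be on the implementation at hand,
which C leaves open.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if p is a nil pointer.  The comparison happens at the
   language level, so it holds whatever bit pattern nil uses. */
int is_nil(const char *p)
{
	return p == 0;
}

/* Returns 1 if a nil char pointer is stored as all zero bytes on
   this implementation, 0 otherwise.  C does not promise either. */
int nil_is_all_zero_bytes(void)
{
	char *p = 0;                    /* nil pointer, by definition */
	unsigned char zeros[sizeof p];

	memset(zeros, 0, sizeof zeros);
	return memcmp(&p, zeros, sizeof p) == 0;
}
```

Portable code should depend only on the first kind of test; the
second is useful for curiosity and nothing else.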

See the Frequently Asked Questions for more details.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 405 2750)
Domain:	chris@cs.umd.edu	Path:	uunet!mimsy!chris

stanley@phoenix.com (John Stanley) (11/14/90)

chris@mimsy.umd.edu (Chris Torek) writes:

> In article <s64421.658489804@zeus> s64421@zeus.usq.EDU.AU (house ron) writes:
> >Zero is actually a perfectly legitimate address, but it got snaffued
> >by C to stand for a NULL pointer.
> 
> This is true in many *implementations*, but it is not part of C itself.
> 
> On computers on which address location zero has `interesting' contents
> (e.g., a boot reset vector, or simply regular memory), implementations
> are faced with two choices:  either the nil pointer to some

   And some operating systems will not map address 0 into the process
space, which makes 0 the NULL pointer, and dereferencing it immediately
painful. I remember reading that those systems which do not map 0
addresses are a direct result of protecting against the most common
invalid pointer reference.

<> "Aneth!  That's a charming place!" "You've been to Aneth?" 
<> "Yes, but not yet." -- The Doctor and Seth, "The Horns of Nimon".
><
<> "Sanity check!" "Sorry, we can't accept it, it's from out of state." - me

gwyn@smoke.brl.mil (Doug Gwyn) (11/14/90)

In article <27636@mimsy.umd.edu> chris@mimsy.umd.edu (Chris Torek) writes:
>On computers on which address location zero has `interesting' contents ...

One implementation possibility is to use the address of some reserved
object in the run-time library for the null pointer.

rja7m@hopper.cs.Virginia.EDU (Ran Atkinson) (11/15/90)

In article <27636@mimsy.umd.edu> chris@mimsy.umd.edu (Chris Torek) writes:
> On computers on which address location zero has `interesting' contents ...

In article <14459@smoke.brl.mil> gwyn@smoke.brl.mil (Doug Gwyn) writes:
% One implementation possibility is to use the address of some reserved
% object in the run-time library for the null pointer.

Indeed, a compiler used on a former project had the NULL pointer address
be 0xf0000000 because that happened to point into a place where the system
would generate a runtime access violation if a NULL pointer were dereferenced.

It invariably confused folks who hadn't thought clearly and went to debug
their code, so as a style convention we required folks to always write "NULL" 
rather than some other equivalent in their C code to reinforce the notion 
that the null pointer isn't necessarily address 0x00000000.

bright@nazgul.UUCP (Walter Bright) (11/17/90)

In article <27636@mimsy.umd.edu> chris@mimsy.umd.edu (Chris Torek) writes:
/On computers on which address location zero has `interesting' contents
/(e.g., a boot reset vector, or simply regular memory), implementations
/are faced with two choices:  either the nil pointer to some
/type(s) must not be the bit pattern that means `address location
/zero'---typically, this means that, e.g., something like 0x3fc0ee7b
/must be used as the bit pattern for nil pointers---or else the
/implementation must make sure no object ever happens to be assigned
/address 0.

Or a third method, commonly used on the PC: 0 is both the NULL pointer
*and* is a valid address. If you wish to poke into the interrupt vector
table, nothing stops you from doing this (and it works fine):
	long far *p = 0;
	*p = whatever;
I'm also told that this solution is used on that Prime computer which seems
to be the only one where NULL!=0. All that is necessary is to adjust malloc
and the layout of the code and data so that it never sits on 0.

I think the ANSI C committee missed the boat on this. Thousands of hours
of wasted time, confusion, and net debate would have been eliminated if
NULL had been fixed at all bits 0.

gwyn@smoke.brl.mil (Doug Gwyn) (11/18/90)

In article <164@nazgul.UUCP> bright@nazgul.UUCP (Walter Bright) writes:
-Or a third method, commonly used on the PC: 0 is both the NULL pointer
-*and* is a valid address. If you wish to poke into the interrupt vector
-table, nothing stops you from doing this (and it works fine):
-	long far *p = 0;
-	*p = whatever;
-I'm also told that this solution is used on that Prime computer which seems
-to be the only one where NULL!=0. All that is necessary is to adjust malloc
-and the layout of the code and data so that it never sits on 0.

Certainly nothing stops you from dereferencing a null pointer (which
is what your example does), on systems where it happens to work by
accident.

-I think the ANSI C committee missed the boat on this. Thousands of hours
-of wasted time, confusion, and net debate would have been eliminated if
-NULL had been fixed at all bits 0.

I don't think so.  What good would it do you to know how a null pointer
is represented?  There is nothing useful you can do about that.

mcdonald@aries.scs.uiuc.edu (Doug McDonald) (11/18/90)

In article <14516@smoke.brl.mil> gwyn@smoke.brl.mil (Doug Gwyn) writes:
>In article <164@nazgul.UUCP> bright@nazgul.UUCP (Walter Bright) writes:
>
>-I think the ANSI C committee missed the boat on this. Thousands of hours
>-of wasted time, confusion, and net debate would have been eliminated if
>-NULL had been fixed at all bits 0.
>
>I don't think so.  What good would it do you to know how a null pointer
>is represented?  There is nothing useful you can do about that.


The point is that if it were indeed actually all bits zero, in all
contexts, period, and so that you could not, for example, 
say,

char *i;
int j;
scanf("%d",&j);
i = (char *) j;

and end up with something other than a null pointer if you input 
0 to the scanf, then there would be a lot less discussion in comp.lang.c.

Doug McDonald

steve@taumet.com (Stephen Clamage) (11/20/90)

mcdonald@aries.scs.uiuc.edu (Doug McDonald) writes:

|The point is that if it were indeed actually all bits zero, in all
|contexts, period, and so that you could not, for example, 
|say,

|char *i;
|int j;
|scanf("%d",&j);
|i = (char *) j;

|and end up with something other than a null pointer if you input 
|0 to the scanf, then there would be a lot less discussion in comp.lang.c.

What in the world are you going to do with a pointer value that you read
from a file?  The only useful information you could get would be to note
whether it was zero, which you hope to equate to a nil pointer.  This
purpose might be better served by using a flag instead: it doesn't purport
to carry more information than it really has, and it doesn't rely on
nil pointers being all-bits-zero.
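
A sketch of the flag approach, assuming a text record that holds 1
when a link is present and 0 when it is not.  The function name
link_from_flag and its parameters are hypothetical, chosen only for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Parses text such as "1" or "0" saying whether a link is present.
   On success stores either target or a nil pointer in *out and
   returns 1; returns 0 if the text holds no integer. */
int link_from_flag(const char *text, char *target, char **out)
{
	int present;

	assert(text != NULL && out != NULL);
	if (sscanf(text, "%d", &present) != 1)
		return 0;
	/* The nil pointer comes from NULL, never from casting the
	   integer that was read. */
	*out = present ? target : NULL;
	return 1;
}
```

No integer is ever converted to a pointer here, so the result does
not depend on how the implementation represents nil.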
-- 

Steve Clamage, TauMetric Corp, steve@taumet.com

bright@nazgul.UUCP (Walter Bright) (11/20/90)

In article <14516@smoke.brl.mil> gwyn@smoke.brl.mil (Doug Gwyn) writes:
/I don't think so.  What good would it do you to know how a null pointer
/is represented?  There is nothing useful you can do about that.

1. Explanations of how C works become much simpler (note the endless debate
   on usenet about NULL and 0). Making C more understandable with fewer
   counter-intuitive rules is useful.

2. You will be able to reliably use memset and calloc to initialize
   structures containing pointers.
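
To make point 2 concrete, here is a sketch of what portable code has
to do today, assuming a hypothetical struct node and helper
node_clear (neither is from the thread).  memset alone clears the
bytes, but only the explicit assignments are guaranteed to yield a
nil pointer and a floating zero on every implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct node {
	struct node *next;
	double weight;
	int count;
};

/* Clears *n without assuming that a nil pointer or 0.0 is stored
   as all-bits-zero. */
void node_clear(struct node *n)
{
	assert(n != NULL);
	memset(n, 0, sizeof *n);  /* zeroes every byte, padding included */
	n->next = NULL;           /* nil pointer, whatever its bit pattern */
	n->weight = 0.0;          /* floating zero, whatever its bit pattern */
	n->count = 0;
}
```

If nil were fixed at all bits 0, the assignment to next would be
redundant; the one to weight would still depend on the floating-point
format.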

rob@mowgli.eng.ohio-state.edu (Rob Carriere) (11/22/90)

In article <171@nazgul.UUCP> bright@nazgul.UUCP (Walter Bright) writes:
>In article <14516@smoke.brl.mil> gwyn@smoke.brl.mil (Doug Gwyn) writes:
>/I don't think so.  What good would it do you to know how a null pointer
>/is represented?  There is nothing useful you can do about that.
>
>1. Explanations of how C works become much simpler (note the endless debate
>   on usenet about NULL and 0). Making C more understandable with fewer
>   counter-intuitive rules is useful.

True, but hardly the only criterion.  I would argue that this is a problem of
education: you don't see mathematicians protesting the extreme overloading of
symbols like 0, 1 or x.  The potential for confusion is much greater (and at
least as serious) there as it is with the nil == 0 situation.  Finding what
the math educators are doing better than the C-educators and changing the way
we teach C accordingly seems a more productive solution.[1]

>2. You will be able to reliably use memset and calloc to initialize
>   structures containing pointers.

Again, true.  But isn't this a specification bug, rather than a language bug?
I think we need a function that can reliably initialize structures containing
pointers, floats, doubles and long doubles; I don't think we should hack the
language instead.

[1] My guess: (1) C is taught at too low a level of abstraction.  If your
    daily mode of thinking about pointers involves bit-patterns, something is
    wrong.    (2) Insufficient stress on the fact of overloading.  My abstract
    algebra book spends _several_pages_ carefully going through the various
    meanings of the symbol x (a number, an unknown, a polynomial), how they
    interact, why the overloading convention makes sense, and under what
    circumstances you should disambiguate to avoid confusion. 

SR
---

jimp@cognos.UUCP (Jim Patterson) (11/22/90)

In article <6205@quanta.eng.ohio-state.edu> rob@mowgli.eng.ohio-state.edu (Rob Carriere) writes:
>In article <171@nazgul.UUCP> bright@nazgul.UUCP (Walter Bright) writes:
>>In article <14516@smoke.brl.mil> gwyn@smoke.brl.mil (Doug Gwyn) writes:
>>/I don't think so.  What good would it do you to know how a null pointer
>>/is represented?  There is nothing useful you can do about that.
>>
>>2. You will be able to reliably use memset and calloc to initialize
>>   structures containing pointers.
>
>Again, true.  But isn't this a specification bug, rather than a language bug?
>I think we need a function that can reliably initialize structures containing
>pointers, floats, doubles and long doubles; I don't think we should hack the
>language instead.

But how can you write such a function, for the GENERAL case, without
"hacking the language"? I wouldn't call it "hacking the language"
though to specify what the value of a NULL pointer is, instead of
leaving it "unspecified". If there's a "specification bug", I think
this is it.  While I can understand why ANSI made the choice they did,
it's still hard to justify NOT using memset given the rarity of
machines which don't equate "NULL" with "all-bits-zero" at the
hardware level.

The C runtime environment simply doesn't provide the information
necessary to write a more general initialization function than memset,
meaning you either use "memset" (if it works), or you write specific
initialization code for each structure (and unless you do both, you'll
leave slack bytes uninitialized).  Also, you need tagged unions if you
want to handle union initialization correctly.

I'm all for using memset because it's reliable (no chance of missing
a field in the initialization, unless you miss the whole thing). I
dread the day that I have to port code to a Symbolics Lisp machine
though (:^).

Here's an alternative proposal, maybe for the next round of C
standardization.  In the same way that C allows a cast of 0 to a
pointer type, it could allow a cast of 0 to a struct or union type
and interpret it as being the default-initialized value for that object.
This may seem like a hack, but in fact it's no more of a hack than the
same cast of a pointer value would be. (Except for the 0 special case,
it's an error to assign an integer to a pointer just as it's an error
to assign an integer to a struct).  This proposal would allow you to
initialize any struct or union by an assignment e.g.

   struct A { int b; char* c; double d; } x;
   ...
   x = (struct A)0;

You can do the same thing already if you declare static default
initializers for all struct types and use them to provide default
initializations.  However, it's a lot more work; you have to adopt a
naming convention, ensure each one is declared in one-and-only-one
place, etc. A simple language construct like that above would be a lot
more convenient.
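
For comparison, a sketch of the static-default-initializer workaround
in today's C.  The names A_default and A_reset are hypothetical, one
possible naming convention among many.

```c
#include <assert.h>
#include <stddef.h>

struct A {
	int b;
	char *c;
	double d;
};

/* b is set to 0 explicitly; the members left out of the initializer
   are initialized as static objects are: c to a nil pointer and d to
   0.0, whatever the underlying bit patterns. */
static const struct A A_default = { 0 };

/* Resets *x to the default value by ordinary struct assignment. */
void A_reset(struct A *x)
{
	assert(x != NULL);
	*x = A_default;
}
```

Each struct type needs its own default object and reset function,
which is exactly the bookkeeping the proposed cast would remove.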
-- 
Jim Patterson                              Cognos Incorporated
UUCP:uunet!mitel!cunews!cognos!jimp        P.O. BOX 9707    
PHONE:(613)738-1440                        3755 Riverside Drive
NOT a Jays fan (not even a fan)            Ottawa, Ont  K1G 3Z4

steve@taumet.com (Stephen Clamage) (11/24/90)

bright@nazgul.UUCP (Walter Bright) writes:

>In article <14516@smoke.brl.mil> gwyn@smoke.brl.mil (Doug Gwyn) writes:
>/I don't think so.  What good would it do you to know how a null pointer
>/is represented?  There is nothing useful you can do about that.

>1. Explanations of how C works become much simpler (note the endless debate
>   on usenet about NULL and 0).

I have to disagree with Walter here.  Consider Pascal as a counter-
example.  A pointer either points to a particular object, or it has
the value nil; 'nil' is a keyword.  A nil pointer points nowhere.  It
is always an error to dereference a nil pointer, always caught at run time
(in a standard-conforming Pascal system).  What could be simpler than that?

The confusion in C stems not from nil pointers which might not be
zero, but from using a literal zero to represent nil pointers.  This
leads to the 'obvious' conclusion that a nil pointer must point at
address zero and be represented by all-bits-zero.  The obvious
conclusion is unfortunately not always correct.
-- 

Steve Clamage, TauMetric Corp, steve@taumet.com