[comp.lang.c] Referencing NULL pointers

gwyn@smoke.BRL.MIL (Doug Gwyn) (07/19/89)

In article <1759@cadillac.CAD.MCC.COM> ned%cad@MCC.COM (CME Ned Nowotny) writes:
>However, there are environments where addresses can be represented
>by numeric constants and there really is something important at address 0.

There better not be.  C guarantees that valid object addresses compare
unequal to null pointers, and since a null pointer constant is written
as "0" in C source code, you cannot obtain a valid object address by
casting 0 to the object pointer type.

If this is really a problem for some implementation, then it can arrange
for the "equivalent integer" mapped form of pointers to be essentially
the conventional machine address plus one, or some similar mapping that
keeps 0 from appearing to be a possible valid object address in a C
program.

gwyn@smoke.BRL.MIL (Doug Gwyn) (07/21/89)

In article <1796@cadillac.CAD.MCC.COM> ned%cad@MCC.COM (CME Ned Nowotny) writes:
>There are byte-addressable systems where important data may reside at
>address 0.

I think Chris and I know that.  What we're saying is, don't attempt to
use what appears like an integer constant "0" to construct an access to
such a location.  Instead, try something like
	#define	word_at(loc)	((int*)(sizeof(int)+(loc)))[-1]
which will not run afoul of C's null pointer convention.  Of course,
there may be some C implementations (maybe "Safe C")  that perform
run-time dereferencing checks, in which case it would be possible that
there is no valid way to access address zero directly from C.  I doubt
that you'd be using such an implementation if you were munging address
zero, however.

>Other languages do not care whether 0 is a valid address or not.

I have no doubt that it would have been preferable to have designed
an explicit "nil" keyword into C.  For one thing, it would have avoided
these endless discussions about dereferencing null pointers!  However,
as with numerous other "warts", we're stuck with it and merely need to
learn how to live with it.  For the vast majority of applications it
should not pose a problem.

ned@pebbles.cad.mcc.com (CME Ned Nowotny) (08/15/89)

In article <10556@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>In article <1759@cadillac.CAD.MCC.COM> ned%cad@MCC.COM (CME Ned Nowotny) writes:
>>However, there are environments where addresses can be represented
>>by numeric constants and there really is something important at address 0.
>
>There better not be.  C guarantees that valid object addresses compare
>unequal to null pointers, and since a null pointer constant is written
>as "0" in C source code, you cannot obtain a valid object address by
>casting 0 to the object pointer type.
>
>If this is really a problem for some implementation, then it can arrange
>for the "equivalent integer" mapped form of pointers to be essentially
>the conventional machine address plus one, or some similar mapping that
>keeps 0 from appearing to be a possible valid object address in a C
>program.

While Chris Torek and you have made good arguments against attempting
to cast a 0 to a valid pointer, I believe you are losing sight of
the distinction between a null pointer represented by the numeric
constant 0 and the address 0 in a given environment.

There are byte-addressable systems where important data may reside at
address 0.  On these systems, it is desirable for system specific C
code to be able to bind pointers to specific addresses which have
a one-to-one mapping from integer value to address location.  (Yes,
it would be better to use symbolic names and bind them to specific
addresses in the link editor.  Unfortunately, this is not an option
in most cases.)

Now C adopted a convention that maps well into its set of logical
operations whereby a 0 represents a null pointer.  In fact, this was
stated in much stronger terms than a mere convention even in K&R I.
However, the language is distinct from the environment.  Other
languages do not care whether 0 is a valid address or not.  If you
can get a handle on it (In *gasp* BASIC, "peek" will do.), you can
read the contents.  The C language description is distinct from
whether useful data resides at address 0.  The question is how do
you get at it if it is there?

The two clean (?) solutions are either to write assembly language
routines to access the data and link them with your C program, or to
have a link editor capable of the trick described above.  The first
is generally more trouble than it is worth and the second may not be an option.
Therefore, Chris's trick using an integer variable set to 0 and cast
to the appropriate pointer type is a desirable approach when writing
non-portable, system-specific code on a byte-addressable system which
implements a one-to-one mapping between addresses and integers.  These
constraints describe a wide range of environments and it almost certainly
is not in the interest of compiler vendors to complicate things by
using an offset mapping as you describe.  Besides, if integers (i.e. unsigned
longs) are the same size as pointers, you will have problems accessing the
highest address location and frequently this is as important as the lowest
address location, 0.

It is absolutely true that C dictates that a null pointer can never point to
a valid object and that this condition must hold when writing portable code.
It is almost universally true that programs that do dereference null pointers
are broken even if they appear to "work" in some environments.  If this were
to be worth discussion, it would probably have best been covered in comp.lang.c
or comp.std.c.  If anything justifies covering it here, it is the question of
whether there are cases in which valid data resides at location zero in some
systems, including implementations of UNIX, and the question of how to access
it.  (Then again, perhaps not.)

Ned Nowotny, MCC CAD Program, Box 200195, Austin, TX  78720  Ph: (512) 338-3715
ARPA: ned@mcc.com                   UUCP: ...!cs.utexas.edu!milano!cadillac!ned
-------------------------------------------------------------------------------
"We have ways to make you scream." - Intel advertisement in the June 1989 DDJ.

gwyn@smoke.BRL.MIL (Doug Gwyn) (08/28/89)

In article <1796@cadillac.CAD.MCC.COM> ned%cad@MCC.COM (CME Ned Nowotny) writes:
>In article <10556@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>>There better not be.  C guarantees that valid object addresses compare
>>unequal to null pointers, and since a null pointer constant is written
>>as "0" in C source code, you cannot obtain a valid object address by
>>casting 0 to the object pointer type.
>While Chris Torek and you have made good arguments against attempting
>to cast a 0 to a valid pointer, I believe you are losing sight of
>the distinction between a null pointer represented by the numeric
>constant 0 and the address 0 in a given environment.

No, in fact I directly addressed the idea of forcing an integral 0
"address" to point at an object.  This has nothing to do with whether
or not you can obtain useful data from "address" 0 by dereferencing
such a pointer.  The compiler is entirely at liberty to IMMEDIATELY
turn (whatever*)0 into an internal form such as (&__nullity), where
__nullity is part of the C library.  If you were able to dereference
such a "0-valued" pointer, in fact you'd access __nullity, not address 0.
Therefore you'd better not have anything at address 0 that really needs
to be accessed from C code.

>whether there are cases in which valid data resides at location zero in some
>systems, including implementations of UNIX, and the question of how to access
>it.

Original UNIX implementations inserted a "shim" at I-space virtual
address 0 to prevent useful data from being allocated there, just
to guarantee that null pointers were always distinct from pointers
to valid objects.

tneff@bfmny0.UUCP (Tom Neff) (08/28/89)

I would also point out:

 (1) A truly portable application is unlikely to have any legitimate need
	 to mess with things at "real" address 0.  Lots of system-y type programs
	 might have an excuse (it's a manipulable interrupt vector location in
	 the Intel architectures for instance) but by definition they are not
	 very portable.

 (2) In architectures where accessing address 0 is a legitimate concern,
	 good compilers will probably provide a non-ANSI non-portable extension
	 of some kind to let you do it.  The pANS is not a nursemaid.
-- 
"We walked on the moon --	((	Tom Neff
	you be polite"		 )) 	tneff@bfmny0.UU.NET

kremer@cs.odu.edu (Lloyd Kremer) (08/29/89)

In article <10830@smoke.BRL.MIL> gwyn@smoke.BRL.MIL (Doug Gwyn) writes:

>The compiler is entirely at liberty to IMMEDIATELY
>turn (whatever*)0 into an internal form such as (&__nullity), where
>__nullity is part of the C library.  If you were able to dereference
>such a "0-valued" pointer, in fact you'd access __nullity, not address 0.
>Therefore you'd better not have anything at address 0 that really needs
>to be accessed from C code.


This is true for integral constant 0, but could you not access memory
location 0 by writing:

	int data, p;

	p = 0;  /* integer variable that happens to be set to zero */
	data = *(int *)p;  /* no constant expression in this line */

I would think the "promotion to nil pointer" rule would not apply here.

-- 
					Lloyd Kremer
					...!uunet!xanth!kremer
					Have terminal...will hack!

gwyn@smoke.BRL.MIL (Doug Gwyn) (08/29/89)

In article <9838@xanth.cs.odu.edu> kremer@cs.odu.edu (Lloyd Kremer) writes:
>	p = 0;  /* integer variable that happens to be set to zero */
>	data = *(int *)p;  /* no constant expression in this line */
>I would think the "promotion to nil pointer" rule would not apply here.

Correct, as Chris has already pointed out.  However I think there are
enough requirements imposed on converting between integers and pointers
that if you allow the above you'll also have to make the null pointer a
funny bit pattern, with resulting higher overhead than most people would
think necessary in that environment.  I don't have a formal proof though..

barmar@think.COM (Barry Margolin) (08/29/89)

In article <10862@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>In article <9838@xanth.cs.odu.edu> kremer@cs.odu.edu (Lloyd Kremer) writes:
>>	p = 0;  /* integer variable that happens to be set to zero */
>>	data = *(int *)p;  /* no constant expression in this line */
>>I would think the "promotion to nil pointer" rule would not apply here.
>Correct, as Chris has already pointed out.  However I think there are
>enough requirements imposed on converting between integers and pointers
>that if you allow the above you'll also have to make the null pointer a
>funny bit pattern, with resulting higher overhead than most people would
>think necessary in that environment.  I don't have a formal proof though..

Why?  You'd only need a funny bit pattern for the null pointer if you
want to guarantee a runtime error when dereferencing the null pointer,
or if some library routine might return a pointer to location 0 that
must be distinguished from NULL.

Such a library routine would have to be implementation-dependent, of
course; malloc, for example, could never return a pointer that
happened to have the same representation as the null pointer.  But if
there's some implementation-specific data at location 0, and the
implementation uses a pointer to 0 as the null pointer, there's no
reason you couldn't use
	ptr = 0;
	foo = *ptr;
to access it.  This code is non-portable, but so is the fact that this
particular piece of data is at location 0.

Barry Margolin
Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar

richard@aiai.ed.ac.uk (Richard Tobin) (08/29/89)

In article <9838@xanth.cs.odu.edu> kremer@cs.odu.edu (Lloyd Kremer) writes:
>This is true for integral constant 0, but could you not access memory
>location 0 by writing:

>	p = 0;  /* integer variable that happens to be set to zero */
>	data = *(int *)p;  /* no constant expression in this line */

Probably, but there's nothing to stop a cast doing something strange.
This may work better (but of course is still completely unreliable):

   union {int i; int *p;} x;

   x.i = 0;
   data = *x.p;

-- Richard
-- 
Richard Tobin,                       JANET: R.Tobin@uk.ac.ed             
AI Applications Institute,           ARPA:  R.Tobin%uk.ac.ed@nsfnet-relay.ac.uk
Edinburgh University.                UUCP:  ...!ukc!ed.ac.uk!R.Tobin