[comp.lang.c] More crap about nil

chris@mimsy.umd.edu (Chris Torek) (05/02/90)
>        "The name of the null pointer is called 'NULL'"
>
>... but that's only what the name is called, you know.
>The name really IS "0".  But that's only the name, not a representation.
>--
>Wayne Throop <backbone>!mcnc!rti!sheol!throopw or sheol!throopw@rti.rti.org

Right.  There are three separate issues; people generally insist on
tying all three into one, so they have to be handled together.

  A.  The (runtime) representation for `the' nil pointer.

	This issue is particularly sticky.  As a programmer writing
	portable C, you (a) have no idea what the representation is and
	(b) have no idea whether there is only one representation.
	There can be---and on some machines, the most `natural' setup
	has---several different `nil's, such as a `nil pointer that
	points to bytes' versus a `nil pointer that points to words',
	or---as on an IBM PC in certain models---a `nil pointer that
	points to data' versus a `nil pointer that points to code'.
	All of these `nil's are different, at least in principle if
	not often in fact.  In some cases the difference cannot be
	swept under the rug.  The mixed-model IBM PC example is one
	of the best.  There is no way the compiler can cheat with
		#define NULL 0
	or
		#define NULL 0L
	that will hide the fact that a nil data pointer and a nil code
	pointer have different sizes in a mixed memory model.

	Fortunately, it Just So Happens that, while you have no idea
	what representation(s) the nil pointer(s) has/have, you do
	not need to know.
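
	A minimal sketch of why you do not need to know.  The helper
	names `is_nil_data' and `is_nil_code' are made up for
	illustration.  Each compares its argument against the constant
	`0' in a pointer context, so the compiler supplies whichever
	nil representation the machine uses for that kind of pointer.

```c
/* Hypothetical helper: returns 1 if p is a nil data pointer.
 * The `0' is converted by the compiler to a nil `const void *',
 * whatever bit pattern that happens to be on this machine. */
int is_nil_data(const void *p)
{
    return p == 0;
}

/* Hypothetical helper: the same test for a pointer to code.
 * A nil code pointer may differ in size and representation from
 * a nil data pointer; the source text is the same either way. */
int is_nil_code(void (*f)(void))
{
    return f == 0;
}
```

	Neither function looks at bits, so neither depends on them.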

  B.  The (compile-time) syntax for the nil pointer.

	In Classic C (K&R C), there was only one syntax for a nil
	pointer.  One wrote up an expression such that the compiler
	could tell `I mean for this to be a pointer', then wrote the
	constant `0'.  In New C (ANSI C), there are two: one writes up
	an expression as before, then writes either `0' or `(void *)0'.
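
	As a rough illustration, here are three places where the
	compiler can tell `I mean for this to be a pointer': an
	initializer, a comparison, and a return statement.  The
	function `first_space' is a made-up example, not anything
	from a library.

```c
/* Hypothetical example: find the first space in s, or return nil. */
char *first_space(char *s)
{
    char *p = 0;            /* initializer: context says `char *' */

    while (*s != '\0') {
        if (*s == ' ') {
            p = s;
            break;
        }
        s++;
    }
    if (p == 0)             /* comparison: context comes from p */
        return (void *)0;   /* New C spelling; converted to `char *' */
    return p;
}
```

	In each case the declared type of `p' or of the function
	supplies the context; the `0' never stands alone.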

	Other languages are more clever about this: they have a keyword
	(`nil') that tells the compiler `I want a nil'.  If C had such
	a keyword, one could use it in all situations and let the
	compiler complain when one accidentally left out the context
	telling it what *particular kind* of nil pointer one wants (see
	point A above).  Unfortunately, C is not clever this way; if a
	programmer leaves out the context, the compiler has to assume
	that the programmer meant `the integer zero' or `a nil pointer
	of type pointer-to-void'.  If you want to make sure you never
	accidentally leave out the context, there are two very simple
	ways to do this.

	Way 1: always use a cast.  (This is overkill.)

	Way 2: always use a cast in every function call.  (In Classic C,
	this is never overkill and is always required.  In New C, this
	is overkill when a function prototype supplies the pointer
	context.  Not all prototypes do so: one that ends in `...'
	supplies no context for the trailing arguments.  The cast
	never hurts.)
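
	A sketch of a case where the prototype does not supply the
	context: a variable-argument function whose list ends with a
	nil `char *' (execl is the familiar real-world case).  The
	function `count_strings' is hypothetical.  Without the cast
	the caller would pass `the integer zero', which need not have
	the size or representation of a nil `char *'.

```c
#include <stdarg.h>

/* Hypothetical example: count the strings in a list that is
 * terminated by a nil `char *'.  The `...' in the prototype
 * gives the compiler no pointer context for those arguments. */
int count_strings(char *first, ...)
{
    va_list ap;
    char *s;
    int n = 0;

    va_start(ap, first);
    for (s = first; s != 0; s = va_arg(ap, char *))
        n++;
    va_end(ap);
    return n;
}
```

	A correct call reads `count_strings("a", "b", (char *)0)'.
	Writing a bare `0' (or a NULL defined as `0') as the last
	argument is exactly the missing-context mistake.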

  C.  The word used for the syntax (when humans read and write C code).

	K&R (both editions), and many programming styles, recommend
	that programmers write `NULL' when they mean `I think I have
	supplied context for the compiler, and I want a nil pointer of
	the particular kind specified by this context'.  This helps the
	human reading the code tell that (a) you think you supplied a
	pointer context (whether you did or not) and (b) you
	wanted a nil pointer of some kind.  A competent C programmer
	should be able to tell whether there is in fact sufficient
	context for the compiler, and any C programmer should be able
	to tell that you meant `a nil pointer of some kind'.  The
	competent maintenance programmer can then find the context, see
	what kind of nil is going to get used, and decide whether this
	is the right thing to do.  (The same competent maintenance
	programmer might find instead that the context is missing,
	which might well be the source of the bug that sent said
	maintenance programmer off to look at the code in the first
	place.)

	You can, of course, use the compile-time syntax for the nil
	pointer (i.e., `0') instead of NULL---after all, you *did*
	provide the context the compiler needs (you did, right? ... are
	you *sure*?).  The danger in doing this is that even a
	competent maintenance programmer can then miss the fact that
	this is supposed to be a nil pointer, not really a zero; this
	can lead the maintenance programmer astray, and your name may
	become infamous.
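
	To make the readability point concrete, here is a small
	sketch with a made-up list type and function, `struct node'
	and `list_length'.  The loop test has pointer context from
	`p', so the compiler would accept `0' just as well; NULL is
	there for the human.

```c
#include <stddef.h>

/* Hypothetical singly linked list node, for illustration only. */
struct node {
    int value;
    struct node *next;
};

/* Count the nodes in a list whose last `next' is a nil pointer. */
int list_length(const struct node *head)
{
    const struct node *p;
    int n = 0;

    /* `p != NULL' says `nil pointer' to the reader;
     * `p != 0' would compile to the same test. */
    for (p = head; p != NULL; p = p->next)
        n++;
    return n;
}
```

	A reader skimming this sees at once that the loop stops at a
	nil pointer, not at a count of zero.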

So, to sum up:
  - The name of the nil pointer is called NULL.
  - But that is just what the name is called.  The name itself is `0'.
  - But that is just the name.  The nil pointer itself is something
    else entirely, if indeed there is only one nil pointer.  If you
    want to write a compiler, you will need to know what the nil pointer
    really is; but if you just want to write C programs, you can make
    do with the name of the nil pointer, and with what the name is called.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@cs.umd.edu	Path:	uunet!mimsy!chris