jc@minya.UUCP (John Chambers) (07/08/90)
After following a bit of debate in another newsgroup concerning
dereferencing null pointers, I've become curious as to how various C
compilers actually represent null pointers. I've never actually seen a
C compiler that uses anything other than all 0 bits for a null
pointer, but some people insist that this is quite common (and I'm a
total idiot for not knowing it ;-). Now, I've known for some time that
all-zeroes wasn't the *required* representation of a null pointer, but
I also understand why it's the obvious one.

Consider that the C bible (page 192) says, concerning assignments
to/from pointers and other types, "The assignment is a pure copy
operation, with no conversion." This means that in:

	int i;
	char *p;
	i = 0;
	p = i;

the value assigned to p is the same bit pattern as i (which needs to
be long on some machines, of course). Now, I've never even heard of a
C compiler that uses anything other than all zeroes for an int (or
long) 0, so it seems to my naive little mind that p must be all
zeroes, also. Of course, the manual doesn't quite say anywhere that
the above code gives p the same value as

	p = 0;

though I sorta expect that most programmers would be surprised if the
values were different.

Anyhow, what's the story here? Are there really C compilers that use
something other than all-zero bits for a null pointer? If so, can you
name the compilers, and describe their representations and how they
handle code like the above? This seems like it could be the source of
a lot of fun portability problems. Any insights here?

[It seems to me that it'd be useful if there were a compiler option to
specify the representation of null pointers, but that's probably far
too much to hope for... :-]
--
Typos and silly ideas Copyright (C) 1990 by:
Uucp: ...!{harvard.edu,ima.com,eddie.mit.edu,ora.com}!minya!jc (John Chambers)
Home: 1-617-484-6393
Work: 1-508-952-3274
unhd (Paul A. Sand) (07/10/90)
In article <422@minya.UUCP> jc@minya.UUCP (John Chambers) writes:
>Anyhow, what's the story here? Are there really C compilers that
>use something other than all-zero bits for a null pointer? If so,
>can you name the compilers, and describe their representations and
>how they handle code like the above?

    "Certain Prime computers use a value different from all-bits-0 to
    encode the null pointer. Also, some large Honeywell-Bull machines
    use the bit pattern 06000 to encode the null pointer. On such
    machines, the assignment of 0 to a pointer yields the special bit
    pattern that designates the null pointer. Similarly, (char *)0
    yields the special bit pattern that designates a null pointer."

	-- "Portable C" by H. Rabinowitz and Chaim Schaap,
	   Prentice-Hall, 1990, page 147.

A good book. Rex Jaeschke's "Portability and the C Language" (Hayden,
1988) makes the same point but doesn't name names.

>This seems like it could be the source of a lot of fun portability
>problems. Any insights here?

I bet you're right, although it's rather easy to be careful in these
cases; there are a lot of more common and subtler portability
problems.

These books point out, for example, that calloc() initializes its
allocated memory to all-bits-0. Interestingly [at least for those
interested by such things], Rabinowitz & Schaap claim that "most C
environments" initialize non-explicitly-initialized static and extern
variables to all-bits-0. On the other hand, Jaeschke claims that such
variables are assigned the value of 0 "cast to their type." Unless
you're working in guaranteed ANSI-land only, I wouldn't rely on
Jaeschke being right.

It's also a good idea, I'm told, to cast the null pointer explicitly
when using it as a function argument, for this and other reasons.
--
-- Paul A. Sand
-- University of New Hampshire
-- uunet!unhd!pas -or- pas@unh.edu
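[Sand's calloc() point deserves a concrete sketch: calloc() promises
only all-bits-0, which on the Prime and Honeywell-Bull machines named
above need not be the null pointer. A table of pointers therefore has
to be nulled by explicit assignment; the helper name below is invented
for the example:]

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Portably build a table of n null char pointers.  calloc() would only
 * guarantee all-bits-0, which need not be the null pointer on every
 * machine; assigning the constant lets the compiler emit whatever bit
 * pattern its null pointer really uses. */
char **make_null_table(size_t n)
{
    char **v = (char **)malloc(n * sizeof *v);
    size_t i;

    if (v != NULL)
        for (i = 0; i < n; i++)
            v[i] = NULL;        /* guaranteed null, whatever its bits */
    return v;
}
```

On an all-bits-0 machine the loop and a calloc() happen to produce the
same bytes; the loop is the version that is portable by definition.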
henry@zoo.toronto.edu (Henry Spencer) (07/10/90)
In article <422@minya.UUCP> jc@minya.UUCP (John Chambers) writes:
>Consider that the C bible (page 192) says, concerning assignments
>to/from pointers and other types, "The assignment is a pure copy
>operation, with no conversion." This means that in:
>	int i;
>	char *p;
>	i = 0;
>	p = i;
>the value assigned to p is the same bit pattern as i...

Reading the Old Testament (K&R1) and trying to apply it to modern C is
a mistake. This code isn't even legal nowadays. You need an explicit
cast to turn the int into a pointer, and there is no promise that that
cast doesn't do some sort of arcane conversion operation.

Actually, even the Old Testament continued with: "This usage is
nonportable, and may produce pointers which cause addressing
exceptions when used. However, it is guaranteed that assignment of the
*constant* 0 to a pointer will produce a null pointer distinguishable
from a pointer to any object." [emphasis added]

The constant 0 in a pointer context has no relationship whatsoever to
the integer value 0; it is a funny way of asking for the null pointer,
which need not resemble the int value 0 in any way.
--
NFS is a wonderful advance: a Unix    | Henry Spencer at U of Toronto Zoology
filesystem with MSDOS semantics. :-(  | henry@zoo.toronto.edu   utzoo!henry
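[Spencer's distinction between the *constant* 0 and an int variable
holding 0 is exactly what the guaranteed forms rely on. A minimal
sketch of the comparison the language does promise; the helper name is
made up:]

```c
#include <assert.h>

/* The *constant* 0 in a pointer context always means the null
 * pointer, whatever its bit pattern.  An int variable holding 0
 * enjoys no such promise: converting it would need a cast, with
 * machine-dependent results. */
int is_null(const char *p)
{
    return p == 0;      /* constant 0 converts to the null pointer */
}
```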
roger@everexn.uucp (Roger House) (07/11/90)
In <422@minya.UUCP> jc@minya.UUCP (John Chambers) writes:
> ... "The assignment is a pure copy
>operation, with no conversion." This means that in:
>	int i;
>	char *p;
>	i = 0;
>	p = i;
>the value assigned to p is the same bit pattern as i (which needs
>to be long on some machines, of course). ...

I don't know about the Bible, but the ANSI C standard does NOT say
that p = i is a pure copy. Page 37 of the standard Rationale says:

    Since pointers and integers are now considered incommensurate, the
    only integer that can be safely converted to a pointer is the
    constant 0. The result of converting any other integer to a
    pointer is machine dependent.

Also, p38 of the standard itself says:

    An integral constant expression with the value 0, or such an
    expression cast to type void *, is called a null pointer constant.
    If a null pointer constant is assigned to or compared for equality
    to a pointer, the constant is converted to a pointer of that type.
    Such a pointer, called a null pointer, is guaranteed to compare
    unequal to a pointer to any object or function.

Note the term "integral constant expression". In your example, i is
not a constant expression, so the result of p = i is machine
dependent.

						Roger House
amoore@softg.uucp (07/12/90)
In article <1990Jul10.141208.24902@uunet!unhd>, pas@uunet!unhd (Paul A. Sand) writes:
> "Certain Prime computers use a value different from all-bits-0 to
> encode the null pointer.

The Prime 50 series (segmented) architecture has a segment 0. A null
pointer on a Prime is segment 7777, location 0 (usually written
7777/0). The C compiler (written by Garth Conboy of Pacer Software)
deals with comparisons to 0.
marking@drivax.UUCP (M.Marking) (07/13/90)
jc@minya.UUCP (John Chambers) writes:
) After following a bit of debate in another newsgroup concerning
) dereferencing null pointers, I've become curious as to how various
) C compilers actually represent null pointers. I've never actually
) seen a C compiler that uses anything other than all 0 bits for
) a null pointer, but some people insist that this is quite common
My recollection is that there are "funny" NULLs on those machines
that use 1s-complement arithmetic (Univac 1100, some CDC stuff)
because sometimes you can conveniently generate exceptions on
"minus zero". But it's been a few years...
bls@svl.cdc.com (brian scearce) (07/14/90)
The CDC Cyber series of computers uses not-all-0-bits for NULL.
Cyber addresses are 48 bits long, with 4 bits for ring, 12 bits
for segment and 32 bits for offset. If you load an address register
with a number with ring == 0, you get a hardware trap.
So, on our compiler, NULL is represented by ring == (ring your
program is executing in), segment == 0, offset == 0.
This means that you have to be quite careful in those situations
where you type 0 and mean NULL and it isn't inferable from context
what you mean. The only time that this makes a difference is (I
think) arguments to functions (should be "non-prototyped functions",
but I haven't implemented ANSI yet).
So, the output from:
main()
{
char *p = 0;
printf("%x\n", (int)p);
}
is:
b00000000000
It's still a very good compiler. Really. This small oddity isn't
as bad as most people seem to think. As I've explained to a few
through email, it's almost like floating point. Nobody expects
2.0 to have the same representation as 2, but we still write 2
sometimes when we mean 2.0 (like double x = 2; is OK).
--
Brian Scearce \ "I tell you Wellington is a bad general, the English are
(not on CDCs behalf) \ bad soldiers; we will settle the matter by lunch time."
bls@u02.svl.cdc.com \ -- Napoleon Bonaparte, June 18, 1815 (at Waterloo)
shamash.cdc.com!u02!bls \ From _The Experts Speak_, Cerf & Navasky
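[Scearce's caveat is narrow: wherever the compiler can see that a
pointer is expected (assignment, comparison, prototyped call), a bare
0 becomes the machine's real null pointer, Cyber ring bits and all.
Only untyped argument lists need the cast. A small sketch, with an
invented helper:]

```c
#include <assert.h>

/* In a pointer context the compiler converts a bare 0 itself, even on
 * a machine whose null pointer is not all-bits-0.  Only where no type
 * is visible (unprototyped or variadic arguments) must the programmer
 * write the cast. */
const char *pick(const char *a, const char *b)
{
    return a != 0 ? a : b;      /* bare 0 is fine: pointer context */
}
```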
rja@edison.cho.ge.com (rja) (07/17/90)
I used to use a compiler for MSDOS and the 80x86 cpus whose NULL pointer was F000:0000 hex when examined via a debugger. It of course did compile fine as long as one used sense and compared pointers to NULL rather than a constant of zero... Compilers where NULL isn't represented as all zero bits just aren't that uncommon.
darcy@druid.uucp (D'Arcy J.M. Cain) (07/17/90)
In article <9007161750.AA00664@edison.CHO.GE.COM> rja <rja@edison.cho.ge.com> writes:
>I used to use a compiler for MSDOS and the 80x86 cpus
>whose NULL pointer was F000:0000 hex when examined via
>a debugger. It of course did compile fine as long as one
>used sense and compared pointers to NULL rather than
>a constant of zero...

Which compiler was that? I hope it didn't claim to be ANSI compatible.
The NULL pointer does not have to be represented in memory as all zero
bits, but it does have to be denoted by the token "0" in the context
of a pointer comparison. Comparing a pointer to 0 is always correct
and does not have anything to do with the internal representation of
the NULL pointer.

However, I always use NULL for two reasons: broken compilers on
brain-dead CPUs, and the fact that if NULL is defined as "(void *)0"
it catches accidental comparisons of a non-pointer variable against
NULL. For example:

	int a = 0;
	if (a == NULL)
		do(something);

If tested against 0 the compiler won't complain, but it will complain
if a is tested against (void *)0. At least GNU C complains. In other
words, use NULL not because 0 may not be the NULL pointer but because
NULL can't be anything else.
--
D'Arcy J.M. Cain (darcy@druid)  |  Government:
D'Arcy Cain Consulting          |  Organized crime with an attitude
West Hill, Ontario, Canada      |
(416) 281-6094                  |
peter@ficc.ferranti.com (Peter da Silva) (07/17/90)
In article <9007161750.AA00664@edison.CHO.GE.COM> rja <rja@edison.cho.ge.com> writes:
> I used to use a compiler for MSDOS and the 80x86 cpus
> whose NULL pointer was F000:0000 hex when examined via
> a debugger. It of course did compile fine as long as one
> used sense and compared pointers to NULL rather than
> a constant of zero...

If that was the case, the compiler was broken. A constant zero in a
pointer context is the definition of NULL. !pointer == 0! and
!pointer == NULL! should evaluate the same way (as if they generated
the same code).

> Compilers where NULL isn't represented as all zero bits
> just aren't that uncommon.

Compilers where it's something you need to watch out for should be.
--
Peter da Silva.   `-_-'   +1 713 274 5180.   <peter@ficc.ferranti.com>
ergo@netcom.UUCP (Isaac Rabinovitch) (07/18/90)
In <9007161750.AA00664@edison.CHO.GE.COM> rja@edison.cho.ge.com (rja) writes:
>I used to use a compiler for MSDOS and the 80x86 cpus
>whose NULL pointer was F000:0000 hex when examined via
>a debugger. It of course did compile fine as long as one
>used sense and compared pointers to NULL rather than
>a constant of zero...

True. But what the "NULL should always be 0" diehards want is not to
write (for example)

	for (ptr = fist; ptr != 0; ptr = ptr->next)

in which 0 should probably be #DEFINED anyway, but rather

	for (ptr = first; ptr ; ptr = ptr->next)

which produces tighter code and (most important of all) looks
spiffier. It has the elegance of expression old C hands crave.

>Compilers where NULL isn't represented as all zero bits
>just aren't that uncommon.

My '78 K&R says that assigning 0 to a pointer is (or was) guaranteed
to produce a NULL, even on compilers that didn't like other
integer-to-pointer assignments. But, interestingly, they did *not*
guarantee, even then, the reverse!
--
ergo@netcom.uucp                     Isaac Rabinovitch
atina!pyramid!apple!netcom!ergo      Silicon Valley, CA
uunet!mimsy!ames!claris!netcom!ergo

"I hate quotations. Tell me what you know!" -- Ralph Waldo Emerson
leo@ehviea.ine.philips.nl (Leo de Wit) (07/18/90)
In article <1990Jul17.123627.1932@druid.uucp> darcy@druid.uucp (D'Arcy J.M. Cain) writes:
[]
|However I always use NULL for two reasons. Broken compilers on brain dead
|CPUs and if NULL is defined as "(void *)0" then it tests for accidentally
|testing a NULL pointer against a non pointer variable. For example:
|	int a = 0;
|	if (a == NULL)
|		do(something);
|If tested against 0 the compiler won't complain but it will complain if it
|is tested against (void *)0. At least GNU C complains. In other words, use
|NULL not because 0 may not be the NULL pointer but because NULL can't be
|anything else.

For much the same reason I always use explicit casts for null
pointers; this also catches inadvertent assignments or comparisons to
a pointer of a different type, and has the additional advantage that
null pointers as parameters have the same "appearance"; you don't have
to develop different habits of treating pointers. As an example:

	if (fgets(inbuf, sizeof(inbuf), fp) != (char *)0)
		....

but also:

	execl("/bin/ls", "ls", (char *)0);

Another advantage is that you see immediately from the program text
what kind of pointer is expected at some stage; in production code,
and especially in the maintenance phase, this may prevent a lot of
type lookups.

	Leo.
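[de Wit's execl() example is the case where the cast is not just
style: in a variadic argument list the compiler has no pointer type in
sight, so a bare 0 is passed as an int, which may be the wrong size or
bit pattern. A sketch of an execl-style sentinel walker; the function
is invented for illustration:]

```c
#include <stdarg.h>

/* Count the char* arguments before a (char *)0 sentinel, execl-style.
 * va_arg(ap, const char *) pulls a pointer-sized object off the list,
 * so passing a bare 0 (an int) as the sentinel can misread the stack
 * on machines where int and char * differ; the cast makes it portable. */
int count_args(const char *first, ...)
{
    va_list ap;
    int n = 0;
    const char *s;

    va_start(ap, first);
    for (s = first; s != (const char *)0; s = va_arg(ap, const char *))
        n++;
    va_end(ap);
    return n;
}
```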
steve@taumet.com (Stephen Clamage) (07/18/90)
ergo@netcom.UUCP (Isaac Rabinovitch) writes:
>But what the "NULL should always be 0" diehards want is not to
>write (for example)
>	for (ptr = fist; ptr != 0; ptr = ptr->next)
>in which 0 should probably be #DEFINED anyway, but rather
>	for (ptr = first; ptr ; ptr = ptr->next)
>which produces tighter code ...

If in this context the expression ptr produces code which is better
than ptr != 0, then you are the victim of a lazy compiler writer.
There should be no difference, since 'ptr' is shorthand for
'ptr != 0'. I would complain to the vendor, or buy a better compiler.

This is the sort of micro-optimization that programmers in a higher
level language should NOT have to worry about. On compiler A, one code
version may produce a more efficient program; on compiler B, the
reverse may be true. Compiler B might even be the next release of
compiler A. Thus, the effort spent in this micro-optimization is not
only wasteful, but may be counter-productive over time.
--
Steve Clamage, TauMetric Corp, steve@taumet.com
ark@alice.UUCP (Andrew Koenig) (07/19/90)
In article <12288@netcom.UUCP>, ergo@netcom.UUCP (Isaac Rabinovitch) writes:
> True. But what the "NULL should always be 0" diehards want is not to
> write (for example)
>	for (ptr = fist; ptr != 0; ptr = ptr->next)
> in which 0 should probably be #DEFINED anyway, but rather
>	for (ptr = first; ptr ; ptr = ptr->next)

These two forms are guaranteed to be equivalent (if you change `fist'
to `first' in the first example). Period.

> which produces tighter code and (most important of all) looks
> spiffier. It has the elegance of expression old C hands crave.

Whether it produces tighter code is a matter between you and your
local implementation. Since the two forms are equivalent, there is no
particular reason to believe that they will produce different code at
all.

> My '78 K&R says that assigning 0 to a pointer is (or was) guaranteed
> to produce a NULL, even on compilers that didn't like other
> integer-to-pointer assignments. But, interestingly, they did *not*
> guarantee, even then, the reverse!

Yes indeed. However, if you write

	ptr != 0

then the 0 is converted to a pointer of the appropriate type and then
compared, and that *is* guaranteed to work.
--
				--Andrew Koenig
				  ark@europa.att.com
karl@haddock.ima.isc.com (Karl Heuer) (07/19/90)
In article <12288@netcom.UUCP> ergo@netcom.UUCP (Isaac Rabinovitch) writes:
>	for (ptr = fist; ptr != 0; ptr = ptr->next)
>	for (ptr = first; ptr ; ptr = ptr->next)
>which produces tighter code and (most important of all) looks
>spiffier.

There is no reason it should produce tighter code; the compiler still
has to generate a compare against zero. And whether it "looks
spiffier" is a matter of taste. I personally switched to explicit
compares (against an *appropriately typed* zero!) many years ago.
Redundancy is your friend.

Karl W. Z. Heuer (karl@kelp.ima.isc.com or ima!kelp!karl), The Walking Lint

	if (i != 0 && c != '\0' && x != 0.0 && p != NULL) abort();
martin@mwtech.UUCP (Martin Weitzel) (07/19/90)
Sometimes I think we should collect votes for starting a new group "comp.lang.c.nullpointers". Yes, I know about the kill-files but some of us could tell our feeds then not to send this group ... 1/2:-). -- Martin Weitzel, email: martin@mwtech.UUCP, voice: 49-(0)6151-6 56 83
thacher@unx.sas.com (Clarke Thacher) (07/20/90)
Actually, the Prime 50 series has both types of pointers. The older
(PL1) style of pointer uses segment 7777 (octal) in its NULL pointers
(there can never be a segment 7777); dereferencing this type of
pointer raises a NULL_POINTER condition.

The newer C compiler uses a pointer with a segment of 0 and an offset
of 0. This pointer is still not bit-equal to an integer 0: there are
two extra bits (for the ring number) that may or may not be set. To
solve this, Prime added a TCNP (test C null pointer) instruction to
the instruction set. They also added a bunch of other instructions for
the C compiler (mostly having to do with character operations).
--
Clarke Thacher              PRIMOS Host developer
SAS Institute, Inc.         sasrer@unx.sas.com
(919) 677-8000 x7703        Box 8000, Cary, NC 27512
ansok@stsci.EDU (Gary Ansok) (07/22/90)
In article <12288@netcom.UUCP> ergo@netcom.UUCP (Isaac Rabinovitch) writes:
>True. But what the "NULL should always be 0" diehards want is not to
>write (for example)
>
>	for (ptr = fist; ptr != 0; ptr = ptr->next)
>
>in which 0 should probably be #DEFINED anyway, but rather
>
>	for (ptr = first; ptr ; ptr = ptr->next)
>
>which produces tighter code and (most important of all) looks
>spiffier. It has the elegance of expression old C hands crave.

Once more with feeling:

	if (ptr)	/* or for(;ptr;) */

is exactly equivalent to

	if (ptr != 0)

which is exactly equivalent to

	if (ptr != (typeof ptr) 0)

which is exactly equivalent to

	if (ptr != NULL-pointer-for-typeof-ptr)

Any C compiler that has a not-all-bits-zero NULL internal
representation and does not compare a pointer to that in "if (ptr)" or
"for (...; ptr; ...)" is seriously BROKEN.

Whether you like "if (ptr)" on readability grounds is a different
question (I like it, but I seem to be in the minority) -- but that's
purely a style question and the compiler had better produce the
correct code.

			Gary Ansok
			ansok@stsci.edu
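[Ansok's chain of equivalences can be spelled out as code. The three
spellings below are required by the language to agree on every
implementation, whatever bit pattern its null pointer uses; the
function names are invented:]

```c
#include <assert.h>
#include <stddef.h>

/* Three spellings of the same null test.  A conforming compiler must
 * make all three behave identically, even when NULL is not
 * all-bits-0 internally. */
int truth_style(const int *p) { return p ? 1 : 0; }
int zero_style(const int *p)  { return (p != 0) ? 1 : 0; }
int null_style(const int *p)  { return (p != NULL) ? 1 : 0; }
```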