[comp.lang.c] NULL question not in FAQ

phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN) (03/27/91)

Given that the compiler is supposed to translate the constant "0" to the
appropriate value for a NULL pointer on the machine type, how does one
get a pointer value whose representation happens to be all zeroes, but
is a non-NULL pointer?

Are all of these equivalent or are any different?

    p = ( (char *) 0 );
    p = ( (char *) 00 );
    p = ( (char *) 0x0 );
    p = ( (char *) 0x00 );
    p = ( (char *) 1-1 );

If not, what are they supposed to compile to?

Suppose I do this:

    char *p;
    int i;
    p = NULL;
    i = (int) p;

Will I get a value of zero in "i" always, regardless of the way the
machine type represents a NULL?

I am wanting to discern whether or not the special case of a pointer value
being translated to the NULL representation is done with the particular
constant of a single digit "0" or if it is done with ANY constant whose
integer value is numerically 0.

For low level code, e.g. drivers and such, where portability and machine
independence are not issues, it would still be nice to be able to cast
address values into pointers as desired.  Some machine types might very
well have special functionality at address 0x00000000 and use some other
form of address for a NULL.  Some machine types actually have no NULL at
all and one is simply chosen by convention.

Can someone summarize this, depending on what the real answers are, and
include it in the FAQ in the section on NULL?  This might clear up (or
confuse further) the distinction of NULL.
-- 
 /***************************************************************************\
/ Phil Howard -- KA9WGN -- phil@ux1.cso.uiuc.edu                              \
\ Lietuva laisva -- Brivu Latviju -- Eesti vabaks                             /
 \***************************************************************************/

shields@yunexus.YorkU.CA (Paul Shields) (03/27/91)

phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN) writes:

>Given that the compiler is supposed to translate the constant "0" to the
>appropriate value for a NULL pointer on the machine type, how does one
>get a pointer value whose representation happens to be all zeroes, but
>is a non-NULL pointer?

>Are all of these equivalent or are any different?
 [.. examples deleted..]

I think all the original examples are equivalent.  But if the stored 
value for a NULL pointer is something other than all 0 bits, then
perhaps this will do it...

     char *x;
     memset( &x, 0, sizeof(char *));

torek@elf.ee.lbl.gov (Chris Torek) (03/27/91)

In article <1991Mar26.235643.4498@ux1.cso.uiuc.edu> phil@ux1.cso.uiuc.edu
(Phil Howard KA9WGN) writes:
>Given that the compiler is supposed to translate the constant "0" to the
>appropriate value for a NULL pointer on the machine type, how does one
>get a pointer value whose representation happens to be all zeroes, but
>is a non-NULL pointer?

In standard C, one simply writes:

	p = &obj;

or

	p = malloc(size);

or whatever, and it Just Happens to return a value whose representation
is all-zero-bits, yet compares unequal to 0.

Outside the standard, there is no way to say how one will do it, precisely
*because* one is outside the standard.  In order to describe the rules,
one must first know them.

That said---warning: opinion follows---the best way for an implementor,
faced with such a machine, to allow O/S programmers and similar ilk access
to the Special Stuff at machine address 0, despite the fact that nil
pointers to types T1, T2, and T3 must be nonzero bit patterns, is to
make

	struct foo *p; long l = 0; p = l;

`do the right thing'.  Another entirely acceptable alternative is:

	#include <machine/cpu.h>

	volatile union zeropointer z;
	...
		zeropointer(&z);	/* point to machine address 0 */
		z.ptr->zp_frammitz = 1;	/* turn on the frammitz */

>Are all of these equivalent or are any different?
>
>    p = ( (char *) 0 );
>    p = ( (char *) 00 );
>    p = ( (char *) 0x0 );
>    p = ( (char *) 0x00 );

These are all equivalent.

>    p = ( (char *) 1-1 );

This is not, but only because the cast binds more tightly;

	p = (char *)(1 - 1);

*is* equivalent to the other four assignments above, but

	p = (char *)1 - 1;

is unpredictable.  It might be another way to get an all-bits-zero pointer
value, but it might not (perhaps all `char *' pointers have the low 2 bits
set indicating a <char> tag, so this might give a pointer that happens to
have the integer value `3').

This is another acceptable, but machine dependent (as always), method of
obtaining an `all zero bits pointer'.

>Suppose I do this:
>
>    char *p;
>    int i;
>    p = NULL;
>    i = (int) p;
>
>Will I get a value of zero in "i" always, regardless of the way the
>machine type represents a NULL?

No.

>I am wanting to discern whether or not the special case of a pointer value
>being translated to the NULL representation is done with the particular
>constant of a single digit "0" or if it is done with ANY constant whose
>integer value is numerically 0.

A null pointer to some type T is produced by writing the integer
constant 0 (i.e., any constant expression whose type is integral and
whose value is zero) in the appropriate pointer context.

In other words, it is `any constant', not just the single digit `0'.
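A minimal sketch of that rule, using the spellings from the question plus
two others (0L and a parenthesized 1 - 1).  The function name all_are_null
is an illustrative assumption, not anything from the standard or the FAQ:

```c
#include <stddef.h>

/* Each cast operand below is an integral constant expression whose
   value is zero, so each conversion yields a null pointer, whatever
   bit pattern the implementation uses for one. */
int all_are_null(void)
{
    char *from_decimal = (char *)0;
    char *from_octal   = (char *)00;
    char *from_hex     = (char *)0x0;
    char *from_long    = (char *)0L;
    char *from_expr    = (char *)(1 - 1);   /* note the parentheses */

    /* Returns 1 when every pointer compares equal to NULL. */
    return from_decimal == NULL && from_octal == NULL &&
           from_hex == NULL && from_long == NULL && from_expr == NULL;
}
```

A conforming compiler should make all_are_null() return 1 regardless of
how null pointers are stored.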

Note that some expressions that are immutable are not actually
`constant expressions', e.g.,

	(""[0])

is immutably% 0, yet it is not an `integer constant zero' by ANSI
rules.  A compiler is allowed, but not required, to treat it as if
it were just `0'.
-----
% Well, you can sometimes persuade it not to be 0 if you do something
  illegal, like scribble on your text segment.
-----

Incidentally, one simple method for getting an arbitrary machine-
dependent pointer value, without gimmicking the compiler or using
unions or any such other grotesqueries, is to use assembly code.
For instance:

	extern struct cypripareunia s;

combined with:

	.set	_cypripareunia, 0
or
	_cypripareunia = 0

will often do the trick.  (Those of you who look up this word, or
[shudder :-) ] know it offhand, will see what I think of this. :-) )
-- 
In-Real-Life: Chris Torek, Lawrence Berkeley Lab CSE/EE (+1 415 486 5427)
Berkeley, CA		Domain:	torek@ee.lbl.gov

henry@zoo.toronto.edu (Henry Spencer) (03/28/91)

In article <1991Mar26.235643.4498@ux1.cso.uiuc.edu> phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN) writes:
>Given that the compiler is supposed to translate the constant "0" to the
>appropriate value for a NULL pointer on the machine type, how does one
>get a pointer value whose representation happens to be all zeroes, but
>is a non-NULL pointer?

I assume you're talking about a machine where such a thing is possible.
On many machines an all-zeros pointer is a null pointer.

This will work:

	char *p;

	memset(&p, 0, sizeof(p));

Note that an all-zeros pointer might cause a core dump whenever you try
to compare it or assign it, never mind follow it.  Turning an arbitrary
bit pattern into a pointer is fraught with peril.
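A hedged sketch of how one might find out, on a given implementation,
whether the all-zeros pattern and the null pointer coincide, without ever
loading the all-zeros pattern as a pointer value.  The function name
null_is_all_bits_zero is an illustrative assumption, and the check only
compares the one null representation the compiler happens to store:

```c
#include <string.h>

/* Returns 1 if the stored bytes of a null char * are all zero on this
   implementation, 0 otherwise.  The bytes are compared with memcmp, so
   the all-zero pattern is never read back as a pointer. */
int null_is_all_bits_zero(void)
{
    char *null_ptr = 0;                    /* a genuine null pointer */
    unsigned char zeros[sizeof(char *)];

    memset(zeros, 0, sizeof zeros);        /* the all-zero-bits pattern */
    return memcmp(&null_ptr, zeros, sizeof zeros) == 0;
}
```

If this returns 0, the memset trick above produces something other than
a null pointer, and the warnings about using it apply in full.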

>Are all of these equivalent or are any different?
>    p = ( (char *) 0 );
>    p = ( (char *) 00 );
>    p = ( (char *) 0x0 );
>    p = ( (char *) 0x00 );
>    p = ( (char *) 1-1 );

Apart from the precedence error in the last one, they are all equivalent.
Any zero-valued constant expression turns into a null pointer when used
in a pointer context.

>Suppose I do this:
>    char *p;
>    int i;
>    p = NULL;
>    i = (int) p;
>Will I get a value of zero in "i" always, regardless of the way the
>machine type represents a NULL?

No; the value you get is *totally* implementation-dependent.  There is
absolutely no guaranteed relationship between the representation of
integer zero and the representation of null pointers.  The 0-to-null
conversion is entirely a compile-time phenomenon.

>For low level code, e.g. drivers and such, where portability and machine
>independence are not issues, it would still be nice to be able to cast
>address values into pointers as desired...

There is nothing stopping you from doing this, but you have to have
implementation-specific knowledge of what the results will be.  It's
not portable.
-- 
"[Some people] positively *wish* to     | Henry Spencer @ U of Toronto Zoology
believe ill of the modern world."-R.Peto|  henry@zoo.toronto.edu  utzoo!henry

ckp@grebyn.com (Checkpoint Technologies) (03/28/91)

In article <1991Mar26.235643.4498@ux1.cso.uiuc.edu> phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN) writes:
>Given that the compiler is supposed to translate the constant "0" to the
>appropriate value for a NULL pointer on the machine type, how does one
>get a pointer value whose representation happens to be all zeroes, but
>is a non-NULL pointer?

void some_func(void) {
	int **ip;

	ip = (int **) calloc(sizeof(int *), 1);

	...

And now, *ip is a pointer whose representation is all-bits-zero, which
of course is not guaranteed to be NULL, but *probably* is on your
implementation.  Most architectures are not capable of distinguishing
between a pointer representation of all-bits-zero, and a null pointer.
But you as a programmer should make this distinction.
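One way to make that distinction in practice is to assign null pointers
explicitly after calloc rather than relying on the zeroed bytes.  This is
only a sketch; struct node and new_node are hypothetical names chosen for
illustration:

```c
#include <stdlib.h>

struct node {                /* hypothetical type, for illustration only */
    struct node *next;
    int value;
};

/* Allocate a node; returns NULL if the allocation fails. */
struct node *new_node(int value)
{
    struct node *n = calloc(1, sizeof *n);

    if (n != NULL) {
        n->next = NULL;      /* explicit null; calloc only zeroes bytes */
        n->value = value;
    }
    return n;
}
```

The explicit assignment costs one store and removes the dependence on
null pointers being all-bits-zero.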

And, if I might restate the rest, for brevity...

I am comfortable with the idea that 0 may be transformed into the null
pointer value.  However, the FAQ says that the *compiler* translates
the integer 0 *constant* into the null pointer.  It says nothing about
transforming, at run time, integer zero values which may be in an object.
Therefore, the following is not guaranteed to give me a null pointer:

	int i = 0;
	char *p = (char *)i;

So is it correct to interpret this code as unportable?
--
First comes the logo: C H E C K P O I N T  T E C H N O L O G I E S      / /
                                                ckp@grebyn.com      \\ / /
Then, the disclaimer:  All expressed opinions are, indeed, opinions. \  / o
Now for the witty part:    I'm pink, therefore, I'm spam!             \/

scs@adam.mit.edu (Steve Summit) (03/28/91)

In article <1991Mar26.235643.4498@ux1.cso.uiuc.edu> phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN) writes:
>Given that the compiler is supposed to translate the constant "0" to the
>appropriate value for a NULL pointer on the machine type, how does one
>get a pointer value whose representation happens to be all zeroes, but
>is [not necessarily a] NULL pointer?
>
>For low level code, e.g. drivers and such, where portability and machine
>independence are not issues, it would still be nice to be able to cast
>address values into pointers as desired.

Indeed.  In fact, K&R1 stated that "the mapping function [between
integral types and pointers] is... machine dependent, but is
intended to be unsurprising to those who know the addressing
structure of the machine."  (Section A14.4, p. 210.)  Of course,
in these more modern and civilized times, wanton conversions
between pointers and integers are more strongly discouraged, and
the "intended to be unsurprising" language does not appear in K&R2.

>Some machine types might very
>well have special functionality at address 0x00000000 and use some other
>form of address for a NULL.

While such machines are certainly possible (and, with time,
perhaps more so) I would hazard to guess that they are rare at
best.  If a machine is intended to be manipulated directly via
absolute addresses, the manufacturer (and compiler architect)
will presumably make it convenient to do so.  In effect, the old
"intended to be unsurprising" language is likely to be observed.

I can think of three general ways to get a pointer that really
points at address zero.  (All are, of course, unportable, but
that's obviously okay here.)

     1.	char *p = (char *)0;

     2. int zero = 0;
	char *p = (char *)zero;

     3.	char *p;
	memset((char *)&p, 0, sizeof(char *));

(Chris and Henry, and others, have already described several
variations on methods 2 and 3.)

Number one is the most obvious (and easiest) technique; there's
nothing "magic" or surprising about it at all.  It simply assumes
that the internal representation of a null pointer really is
address 0.  (This is, after all, a safe assumption for many
machines, frequent exhortations here to the contrary
notwithstanding.)

Number two is a bit safer, because it dodges the source code null
pointer constant rules in favor of run-time int->pointer
conversion rules, which are more likely to behave as intended, as
long as the compiler author happens to have heard or thought of
something equivalent to the old K&R1 "intended to be unsurprising"
advice.  (On a machine with nonzero internal null pointers,
number two wouldn't work if the compiler writer tried to be
"helpful" by making runtime int->pointer coercions, for zero
values, mimic compile-time null pointer generation.)

Number three (including analogous tricks using unions) is
probably the safest pure-C approach, but it is somewhat clumsy.

(Number four, not shown here, is to use an auxiliary assembly
language file, as Chris described, which is the very safest
technique.  "If you want assembly language, you know where to
find it.")

I always use technique one, if it works.  (So far, it always
has, for me.)

Falling-all-over-myself-disclaimer:  Obviously, these techniques
are all unportable, shamelessly violating both the information-
hiding intent of source-code null pointer constants, and also
everything comp.lang.c has been trying to teach anybody about
null pointers.  However, since accessing (for instance) a non-
maskable or power-up interrupt vector at location zero is
inherently unportable, it's obviously acceptable if unportable
code is used to do it.  (Phil understands this; I'm just
re-emphasizing the point.)

If you choose one of the simpler (but less safe) techniques for
accessing location 0 (and I wouldn't blame you for doing so),
you'll have to document that decision and be aware of future
compiler releases which might do things differently.

>Can someone summarize this, depending on what the real answers are, and
>include it in the FAQ in the section on NULL?  This might clear up (or
>confuse further) the distinction of NULL.

It's certainly true that broaching this essentially taboo subject
risks further confusion, and the question doesn't come up very
often, but it's worth a (small) mention in the FAQ list.

                                            Steve Summit
                                            scs@adam.mit.edu

gwyn@smoke.brl.mil (Doug Gwyn) (03/29/91)

In article <1991Mar27.194101.1685@grebyn.com> ckp@grebyn.com (Checkpoint Technologies) writes:
>	int i = 0;
>	char *p = (char *)i;
>So is it correct to interpret this code as unportable?

It's not only not guaranteed to work in all standard-conforming C
implementations, it may not even compile successfully, if "int" is
not a suitable type for holding a pointer representation.  (SOME
integer type must do so, but not necessarily "int", which for example
could be only 16 bits in a 24-bit address-space environment.)

Basically, you should try to not use integer types to hold values of
pointers.  Usually it is simpler to use a pointer type all along.
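Where an integer really is needed, later revisions of the standard (C99
onward) offer an optional type, uintptr_t in <stdint.h>, that is wide
enough to hold a converted void pointer.  The sketch below assumes an
implementation that provides it; round_trip_demo is an illustrative name:

```c
#include <stdint.h>

/* Returns 1 if a pointer survives a trip through uintptr_t. */
int round_trip_demo(void)
{
    int object = 42;
    void *original = &object;
    uintptr_t as_integer = (uintptr_t)original;  /* wide enough by definition */
    void *restored = (void *)as_integer;

    return restored == original;
}
```

Only the round trip is guaranteed; the integer value itself remains
implementation-defined, so nothing here says a null pointer converts
to 0.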

torek@elf.ee.lbl.gov (Chris Torek) (03/30/91)

In article <1991Mar28.071834.27272@athena.mit.edu> scs@adam.mit.edu writes:
>(Number four, not shown here, is to use an auxiliary assembly
>language file, as Chris described, which is the very safest
>technique.  "If you want assembly language, you know where to
>find it.")

Safest in one sense (you have the most control), but not in another.  I
claimed that those who looked up `cypripareunia' would find what I
(sometimes) think of mixing C and assembly.  Spilling the beans, for
those of you without copies of `Mrs. Byrne's Dictionary of Unusual,
Obscure, and Preposterous Words', `cypripareunia' is defined as `n.
sexual intercourse with a prostitute'.  In other words, it sounds like
fun but you find it is not, and it can be downright dangerous. :-)
-- 
In-Real-Life: Chris Torek, Lawrence Berkeley Lab CSE/EE (+1 415 486 5427)
Berkeley, CA		Domain:	torek@ee.lbl.gov

ravim@gtenmc.UUCP (Vox Populi) (04/02/91)

In article <1991Mar27.194101.1685@grebyn.com> ckp@grebyn.com (Checkpoint Technologies) writes:
 >In article <1991Mar26.235643.4498@ux1.cso.uiuc.edu> phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN) writes:
 >>Given that the compiler is supposed to translate the constant "0" to the
 >>appropriate value for a NULL pointer on the machine type, how does one
 >>get a pointer value whose representation happens to be all zeroes, but
 >>is a non-NULL pointer?
 >
 >void some_func(void) {
 >	int **ip;
 >
 >	ip = (int **) calloc(sizeof(int *), 1);

The same result (getting a pointer value to be all null bytes) can also be
achieved by declaring the pointer variable to be either static or/and global,
since static/global variables are automatically initialized to zeroes.

	-	Ravi Mandava

-- 
**********************   #include <stddisclaimer.h>  **************************
Ravi Mandava			e-mail :	ravim@gtenmc.gtetele.com
					  or    ravim@gtenmc.UUCP
*******************************************************************************

volpe@camelback.crd.ge.com (Christopher R Volpe) (04/02/91)

In article <1103@gtenmc.UUCP>, ravim@gtenmc.UUCP (Vox Populi) writes:
|>In article <1991Mar27.194101.1685@grebyn.com> ckp@grebyn.com (Checkpoint Technologies) writes:
|> >In article <1991Mar26.235643.4498@ux1.cso.uiuc.edu> phil@ux1.cso.uiuc.edu (Phil Howard KA9WGN) writes:
|> >>Given that the compiler is supposed to translate the constant "0" to the
|> >>appropriate value for a NULL pointer on the machine type, how does one
|> >>get a pointer value whose representation happens to be all zeroes, but
|> >>is a non-NULL pointer?
|> >
|> >void some_func(void) {
|> >	int **ip;
|> >
|> >	ip = (int **) calloc(sizeof(int *), 1);
|>
|>The same result (getting a pointer value to be all null bytes) can also be
|>achieved by declaring the pointer variable to be either static or/and global,
|>since static/global variables are automatically initialized to zeroes.

No, that won't work. Statics are not initialized with zero bit patterns.
They are initialized as if they were assigned the constant 0. Thus, pointer
variables get the Null Pointer, and floats get the bit representation for
0.0. 
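A small sketch of that rule, with illustrative names (static_ptr,
static_dbl, statics_are_null_and_zero) that are assumptions of this
example:

```c
#include <stddef.h>

/* No initializers: both objects are initialized as if assigned the
   constant 0, not filled with zero bytes. */
static char *static_ptr;
static double static_dbl;

/* Returns 1 when the pointer is null and the double is 0.0. */
int statics_are_null_and_zero(void)
{
    return static_ptr == NULL && static_dbl == 0.0;
}
```

This returns 1 on any conforming implementation, including one where
neither value is stored as all-zero bits.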
                       
==================
Chris Volpe
G.E. Corporate R&D
volpecr@crd.ge.com

throopw@sheol.UUCP (Wayne Throop) (04/08/91)

> ravim@gtenmc.UUCP (Vox Populi)
> The same result (getting a pointer value to be all null bytes) can also be
> achieved by declaring the pointer variable to be either static or/and global,
> since static/global variables are automatically initialized to zeroes.

As the cliche now goes.... 
"Bzzzzzzzzt!  Wrong!  But thank you for playing our game!"

The problem here is that pointers and floating point values are a
special case, and initializing them to zero doesn't guarantee a
byte-wise or bit-wise zero value.  Further, uninitialized static or
global variables of these types are mandated by X3J11 (the ANSI C
standard) to act like they'd been subjected to initializers to zero,
not to "bzero" or "all bits (or bytes) zeroed". 

Um... that is...
unless the phrase "can be achieved" meant "on some particular machines",
rather than "portably".
--
Wayne Throop  ...!mcnc!dg-rtp!sheol!throopw