[net.lang.c] Global ptrs init to NULL or 0000?

mike@whuxl.UUCP (BALDWIN) (11/03/85)

I've got a question:  can one assume that uninitialised global/static
pointers will init to NULL or will they be filled with 0 bytes?  This
obviously doesn't make any difference on machines where NULL pointers
have all 0 bits, but what about the ones that don't?  Since I don't
have access to one (how many are there?) I can't test it out.  Reading
X3J11C leads me to believe they will default to NULL:

	C.7.2:
		If no subsequent definition is encountered, the first
		tentative definition is taken to be a definition with
		initializer equal to 0.
	C.5.6:
		If such an object is not initialized explicitly, it is
		initialized implicitly as if every scalar member were
		assigned the integer constant 0.
		If there are fewer initializers in a list than there are
		members of an aggregate, the remainder of the aggregate
		is initialized implicitly as if every scalar member were
		assigned the integer constant 0.
	C.2.2.3:
		An integral constant expression with the value 0 may be
		assigned to or compared for equality to a pointer.  The
		integer constant 0 is converted to a pointer of the appropriate
		type that is guaranteed not to point to any object.

I read this as saying:
	static int *p, *q[3], *r[2] = { 0 };
is equivalent to
	static int *p = 0, *q[3] = { 0, 0, 0 }, *r[2] = { 0, 0 };

Big deal, huh?  Well, in all the C compilers I know of, any uninitialized
global/static is stuffed into the bss section and is set to all 0 bytes at
run time, but initialized data (even if init to 0) is put in the data section
(and is therefore in the executable).  Now, on machines with NULL equal to
some funny value, say 5551212, putting uninitialized ptrs into bss won't do.

So the big question is:  is it OK (portable, etc) to assume that declaring
a global/static ptr without initialization will set it to the machine's
idea of NULL, not all 0 bytes?  I say it's OK, can anyone test this out?
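
A minimal sketch of the distinction being asked about, assuming a hosted C
compiler. The function names are made up for illustration. The first check
is what the draft wording promises; the second only reports whether the
machine at hand happens to represent a null pointer as all 0 bytes, which
is the part that cannot be assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static int *p, *q[3], *r[2] = { 0 };

/* Promised by the language: every one of these pointers compares equal
   to a null pointer, exactly as if "= 0" had been written out. */
void check_statics_are_null(void)
{
    assert(p == NULL);
    assert(q[0] == NULL && q[1] == NULL && q[2] == NULL);
    assert(r[0] == NULL && r[1] == NULL);
}

/* Not promised: whether the bytes of a null pointer are all 0.
   Only representations are compared; no all-zero pointer value is
   ever read as a pointer.  Returns 1 if they match, 0 otherwise. */
int null_is_all_zero_bytes(void)
{
    unsigned char zeros[sizeof p];

    memset(zeros, 0, sizeof zeros);
    return memcmp(&p, zeros, sizeof p) == 0;
}
```

On the usual machines null_is_all_zero_bytes() returns 1, which is exactly
why the bss trick has worked so far; a program that is correct only when it
returns 1 is the non-portable kind.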

BTW, does anyone else get queasy with the idea that any "integral constant
expression with the value 0" can be used as NULL?  I mean, do we really want
to allow: p = 0xAA/0xAA - ((3*4 - 20/2) >> 1); ?  Why not just allow the exact
token 0?  (pcc and lint both accept that stmt, sigh)
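
For the record, the expression above works out as 0xAA/0xAA = 1 and
(3*4 - 20/2) >> 1 = (12 - 10) >> 1 = 1, so the whole thing is 1 - 1 = 0.
A small sketch of what the rule permits follows; the macro and function
names are invented for the example, and some compilers may warn about the
second initializer even though they accept it.

```c
#include <assert.h>
#include <stddef.h>

/* The expression from the post: an integral constant expression
   whose value is 0. */
#define ODD_ZERO (0xAA/0xAA - ((3*4 - 20/2) >> 1))

int *null_from_literal(void)
{
    int *p = 0;          /* the plain token 0 */
    return p;
}

int *null_from_expression(void)
{
    int *p = ODD_ZERO;   /* also a null pointer constant under the rule */
    return p;
}

void check_both_are_null(void)
{
    assert(ODD_ZERO == 0);
    assert(null_from_literal() == NULL);
    assert(null_from_expression() == NULL);
}
```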
-- 
						Michael Baldwin
						{at&t}!whuxl!mike

ark@alice.UucP (Andrew Koenig) (11/03/85)

> 
> I've got a question:  can one assume that uninitialised global/static
> pointers will init to NULL or will they be filled with 0 bytes?  This
> obviously doesn't make any difference on machines where NULL pointers
> have all 0 bits, but what about the ones that don't?

You are entitled to assume that an otherwise uninitialized global or
static item has the same value that it would have if you assigned 0 to it.

jdb@mordor.UUCP (John Bruner) (11/05/85)

One ramification of moving uninitialized pointers from the BSS
segment into the DATA segment is that the common practice of
defining global variables by putting the statement "int *p;"
in multiple files will cause load errors.  (There should be
exactly one definition; the rest should be "extern int *p;".)
A *lot* of C programs are written this way.  (Wasn't AT&T
forced to back away from a fix to the C loader [Sys Vr1?]
that prevented this sloppy practice?)
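
The arrangement described in the parentheses looks roughly like the sketch
below. The file names and the variable name shared_p are hypothetical, and
the three pieces are shown in one block only to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

/* shared.h (hypothetical): every file that uses the variable includes
   this declaration, which does not define anything. */
extern int *shared_p;

/* shared.c (hypothetical): the one and only definition.  With no
   initializer it still starts out as a null pointer. */
int *shared_p;

/* user.c (hypothetical): refers to the variable through the header. */
void check_shared_starts_null(void)
{
    assert(shared_p == NULL);
}
```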

This is just one example of how much inertia must be overcome
when implementing C on a machine where (foo *)0 does not have
the same representation as (int)0.

A less-common but related problem is the use of calloc(), which
returns memory which is guaranteed to be zero.  Programs that
calloc() pointers (usually within structures) and do not
initialize those pointers are making the nonportable assumption
that (foo *)0 is an all-zero bit pattern.
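
One way to avoid that assumption, sketched here with a made-up struct node
and function names: copy from a static object that has no initializer, so
the compiler supplies proper null pointers whatever their bit pattern.
Assigning NULL to each pointer member by hand after allocation would do
just as well.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
    struct node *next;
    char *name;
    int count;
};

/* Allocate a node whose pointer members are null pointers on any
   machine, without relying on calloc()'s all-zero bytes. */
struct node *node_new(void)
{
    /* Static, no initializer: every scalar member is initialized as
       if assigned 0, so the pointer members are null pointers. */
    static struct node blank;
    struct node *n = malloc(sizeof *n);

    if (n != NULL)
        *n = blank;
    return n;
}

void check_node_new(void)
{
    struct node *n = node_new();

    if (n != NULL) {
        assert(n->next == NULL && n->name == NULL && n->count == 0);
        free(n);
    }
}
```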
-- 
  John Bruner (S-1 Project, Lawrence Livermore National Laboratory)
  MILNET: jdb@mordor [jdb@s1-c.ARPA]	(415) 422-0758
  UUCP: ...!ucbvax!dual!mordor!jdb 	...!seismo!mordor!jdb

henry@utzoo.UUCP (Henry Spencer) (11/05/85)

> ... in all the C compilers I know of, any uninitialized
> global/static is stuffed into the bss section and is set to all 0 bytes at
> run time...
> 
> So the big question is:  is it OK (portable, etc) to assume that declaring
> a global/static ptr without initialization will set it to the machine's
> idea of NULL, not all 0 bytes?  ...

That's the way things are supposed to work according to X3J11.  I would be
a little wary of assuming that things really *do* work that way on those
few machines that have non-0 NULLs, with current compilers.  I'd say the
assumption is all right in general, unless you are really being maximally
paranoid.  If you expect to have to port your software to such a machine
next month, some paranoia may be in order.
-- 
				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,linus,decvax}!utzoo!henry

mikeb@inset.UUCP (Mike Banahan) (11/06/85)

In article <772@whuxl.UUCP> mike@whuxl.UUCP (BALDWIN) writes:
>I've got a question:  can one assume that uninitialised global/static
>pointers will init to NULL or will they be filled with 0 bytes?  This
>obviously doesn't make any difference on machines where NULL pointers
>have all 0 bits, but what about the ones that don't?  Since I don't
>have access to one (how many are there?) I can't test it out.  Reading
>X3J11C leads me to believe they will default to NULL:
>
>	/* Lots of X3J11 stuff */
>
>I read this as saying:
>	static int *p, *q[3], *r[2] = { 0 };
>is equivalent to
>	static int *p = 0, *q[3] = { 0, 0, 0 }, *r[2] = { 0, 0 };
>
>Big deal, huh?  Well, in all the C compilers I know of, any uninitialized
>global/static is stuffed into the bss section and is set to all 0 bytes at
>run time, but initialized data (even if init to 0) is put in the data section
>(and is therefore in the executable).  Now, on machines with NULL equal to
>some funny value, say 5551212, putting uninitialized ptrs into bss won't do.

Correct.

>So the big question is:  is it OK (portable, etc) to assume that declaring
>a global/static ptr without initialization will set it to the machine's
>idea of NULL, not all 0 bytes?  I say it's OK, can anyone test this out?

Well, either the compiler conforms to X3J11 or it doesn't. If it doesn't
it isn't compiling C so we can ignore it. If it does, then global pointers
will initialise to zero and we will all be happy. The poor b*****d who
has to implement the compiler/loader may not be happy, but that's what
(s)he is paid to do. The same is true, incidentally, of float, double and
so on. A lot of machines think that a float full of zero bits is not
(double)0, but uninitialised, and trap you to a core dump on the spot.
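
In code, the promise for floating types amounts to something like the
following sketch (the variable and function names are invented). The
comparisons must hold whatever bit pattern the machine uses for 0.0;
getting that pattern into the object is the compiler's and loader's job.

```c
#include <assert.h>

static float f;          /* no initializer: starts out as 0.0f */
static double d;         /* no initializer: starts out as 0.0  */
static double table[4];  /* every element starts out as 0.0    */

void check_floats_start_at_zero(void)
{
    assert(f == 0.0f);
    assert(d == 0.0);
    assert(table[0] == 0.0 && table[3] == 0.0);
}
```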

Remember, nobody has got an X3J11 compiler yet, so we can't really
expect to learn much by inspecting the output of any other compiler.

Mike Banahan, Technical Director, The Instruction Set Ltd.
mcvax!ukc!inset!mikeb

kenny@uiucdcsb.CS.UIUC.EDU (11/09/85)

OK, so what does

static union {
	int i;
	char *p;
	} foo;

get initialized to on a machine with a non-0 NULL?  Incidentally, something
like this appears in a LOT of UN*X programs, and is a MAJOR headache in
attempting to port to a machine with a non-0 NULL.

k**2
Kevin Kenny	kenny@Uiuc.ARPA
		kenny@Uiuc.CSNET
		...!ihnp4!pur-ee!uiucdcs!kenny

Opinions expressed herein will attach themselves to anyone that claims them.

guy@sun.uucp (Guy Harris) (11/11/85)

> OK, so what does
> 
> static union {
> 	int i;
> 	char *p;
> 	} foo;
> 
> get initialized to on a machine with a non-0 NULL?

To be pedantic, NULL is a #define and doesn't depend on the machine; you
mean "a machine where null pointers do not contain the same bit pattern as a
0 integral value."

On machines with pre-ANSI C compilers, it doesn't get initialized;

	8.6 Initialization

	...It is not permitted to initialize unions or automatic
	aggregates.

On machines with ANSI C compilers, it gets initialized to whatever bit
pattern a 0 integral value has, since initializing a union initializes only
its first member.  (Yes, this is a rule with limited practical use, but they
had to choose *some* rule, I guess.)
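
A short sketch of the first-member rule, assuming a compiler that follows
it. The union foo is the one from the question; bar is a hypothetical
variant with the members swapped, which is one way to get a null pointer
out of the implicit initialization instead.

```c
#include <assert.h>
#include <stddef.h>

/* The union from the question: only the first member, i, is
   implicitly initialized. */
static union {
    int i;
    char *p;
} foo;

/* Same members, pointer first: now p is the one initialized. */
static union {
    char *p;
    int i;
} bar;

void check_unions(void)
{
    assert(foo.i == 0);     /* promised by the first-member rule */
    assert(bar.p == NULL);  /* promised by the first-member rule */
    /* foo.p and bar.i are deliberately not examined: what they hold
       depends on how the machine represents a null pointer. */
}
```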

> Incidentally, something like this appears in a LOT of UN*X programs, and
> is a MAJOR headache in attempting to port to a machine with a non-0 NULL.

Which is a good reason why the language specification should have been
silent on the initial value of *any* variable not explicitly initialized.
If you were forced to initialize items with an explicit initialization,
there would be no question about whether a pointer value (even on machines
with non-zero null pointers) which wasn't declared with an initializer would
contain a null pointer or not; it might, but you could NOT count on it.
Furthermore, the question of "what would a union be initialized to" would
not exist either.

Furthermore, non-UNIX environments may have to go through some contortions
to deal with

	int	big_array[32767];

if they don't have UNIX-style automatic zeroing of a BSS area - they might
actually have to put 32767 "int"s worth of zeroes into the executable image.

	Guy Harris

gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (11/12/85)

> OK, so what does
> 
> static union {
> 	int i;
> 	char *p;
> 	} foo;
> 
> get initialized to on a machine with a non-0 NULL?

The first member of the union is initialized with the
appropriate form of 0, in this case (int)0.  That is
the main necessity for defining initializations of
unions.  The effect of trying to dereference the `p'
member of this union is indeterminate.

jsdy@hadron.UUCP (Joseph S. D. Yao) (11/12/85)

In article <139200016@uiucdcsb> kenny@uiucdcsb.CS.UIUC.EDU writes:
>OK, so what does
>static union {
>	int i;
>	char *p;
>	} foo;
>get initialized to on a machine with a non-0 NULL?  ...

Unions get initialized per the first element of the union.
For 100+ pages of argument over why, refer to the archives of
this newsgroup.		;-)
-- 

	Joe Yao		hadron!jsdy@seismo.{CSS.GOV,ARPA,UUCP}

gwyn@BRL.ARPA (VLD/VMB) (11/15/85)

I don't see much problem with the requirement that any defined
statically-allocated storage that has no explicit initializer
be initialized as though the constant 0 were assigned to it.
(This phrasing ensures that such pointers will start out with
the correct representation for null pointers (not necessarily
a zero bit pattern)).

As I am sure you are aware, a vast quantity of existing C code
assumes that such data is implicitly initialized to "zero"
(not necessarily with the same interpretation as X3J11), so
your example of
	int	big_array[32767];
would have to be initialized to zero somehow in any reasonable
existing implementation of C.  The UNIX .bss trick works only
for a limited set of architectures anyhow; if "zero" means a
non-zero bit pattern, the compiler or linker is going to have
to use an approach other than auto-zeroing of .bss upon
process startup.

I believe your point was that there may be overhead caused by
this initialization, even in the case that the
auto-init-to-zero values are not going to be used because the
code is going to store into the array before referencing it.

I agree that it would be much cleaner to declare that defined
data that is not explicitly initialized may not be referenced
until after something has been explicitly stored into it.  I
follow this practice in my own code (not counting modifications
to existing software that does not follow these rules).  But
that would clearly "invalidate" much existing code that has
(reasonably) been considered correct up to now.

mikeb@inset.UUCP (Mike Banahan) (11/15/85)

In article <139200016@uiucdcsb> kenny@uiucdcsb.CS.UIUC.EDU writes:
>
>OK, so what does
>
>static union {
>	int i;
>	char *p;
>	} foo;
>
>get initialized to on a machine with a non-0 NULL?  Incidentally, something
>like this appears in a LOT of UN*X programs, and is a MAJOR headache in
>attempting to port to a machine with a non-0 NULL.

This is an example of an initialised union; explicit initialisation of unions
is not permitted in X3J11. Almost quoting the draft standard ``static objects
not initialised explicitly are initialised as if every *scalar* member
were assigned the integer constant 0 ''.

Now a union is not a scalar object. But its members would seem to be.
Perhaps herein lies an ambiguity.

-- 
Mike Banahan, Technical Director, The Instruction Set Ltd.
mcvax!ukc!inset!mikeb