[comp.lang.c] NULL, zero, and readable code

bdm-thad@Walker-EMH.arpa (07/06/87)

Re: NULL vs. zero and readable code
 
I think the problem here is the definition of NULL.  NULL is not, repeat,
NOT, equal to zero, at least the base ten zero.  If I recall my high
school math, zero is a number, the crossover between positive and nega-
tive.  NULL, on the other hand, is the absence of any number.
 
ASCII in fact defines them differently:  NULL is hex 0 while zero is hex
30. Therefore, stdio.h should define NULL as 0x0, not 0 which would be
0x30.  I don't know how most compilers define it.  Aztec C, v3.20 for the
IBM, defines it as (void *)0 so I am afraid others may be wrong as well.
What do your compilers say, gang?
 
As for TRUE and FALSE, FALSE should be 0 or 0x30 and TRUE should be !FALSE.
 
Of course, this may be an exercise in academics:  We tend to do things the 
way they have always been done or the easiest way, not necessarily the 
right way.
 
--
Thad Humphries
The BDM Corporation, Seoul, Korea
DISCLAIMER:  I'm just a Poli Sci major.
 

chris@mimsy.UUCP (Chris Torek) (07/06/87)

In article <8170@brl-adm.ARPA> bdm-thad@Walker-EMH.arpa writes:
>I think the problem here is the definition of NULL.  NULL is not, repeat,
>NOT, equal to zero, at least the base ten zero.  If I recall my high
>school math, zero is a number, the crossover between positive and nega-
>tive.  NULL, on the other hand, is the absence of any number.

You are mixing definitions wildly.  C's NULL is an integer constant
zero.  A null set is one that contains no elements.  The integer or
real number zero is the crossover between positive and negative.
But C only cares about C's NULL.

>ASCII in fact defines them differently:  NULL is hex 0 while zero
>is hex 30.

Now you have brought in another irrelevancy.  There are C compilers
that do not use ASCII.  Besides, code 0/0 in ASCII is NUL, not
NULL.

>Therefore, stdio.h should define NULL as 0x0, not 0 which would be
>0x30.

This is all a joke perhaps?  C source can be entirely independent
of the base character set [*].  Any ASCII-specific code is your
own doing.  [*Excluding problems with character sets lacking, e.g.,
left brace or vertical bar.  Nonetheless, there are EBCDIC based
C compilers out there.]

Note that `0x0' is indeed an integer constant zero; it should work
as a definition for NULL.  But 0 is not equal to 0x30, although
'0' may be equal to 0x30 on your machine.
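
To pin that down, a throwaway test (just a sketch, not from any posting
here; the last result assumes an ASCII host):

	#include <stdio.h>

	int main(void)
	{
		printf("0   == 0x0:  %d\n", 0 == 0x0);    /* 1: the same integer constant */
		printf("0   == 0x30: %d\n", 0 == 0x30);   /* 0, on every machine */
		printf("'0' == 0x30: %d\n", '0' == 0x30); /* 1 on ASCII hosts only */
		return 0;
	}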

>I don't know how most compilers define it.  Aztec C, v3.20 for the
>IBM, defines it as (void *)0 so I am afraid others may be wrong as well.

This has been officially mandated as legal in the proposed standard.
I do not like it, but so it goes.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690)
Domain:	chris@mimsy.umd.edu	Path:	seismo!mimsy!chris

marc@pismo.cs.ucla.edu (Marc Kriguer) (07/06/87)

In article <8170@brl-adm.ARPA> bdm-thad@Walker-EMH.arpa writes:
>
>Re: NULL vs. zero and readable code
> 
>I think the problem here is the definition of NULL.  NULL is not, repeat,
>NOT, equal to zero, at least the base ten zero.

0 in base 10 = 0000000 (ten), which is equal to 0000000 in ANY base.

>ASCII in fact defines them differently:  NULL is hex 0 while zero is hex
>30. Therefore, stdio.h should define NULL as 0x0, not 0 which would be
>0x30.

No.  ASCII defines the character code for the CHARACTER '0' to be 0x30,
but that is NOT saying that zero is hex 30.  Just the CHARACTER.  When you
#define NULL 0
	you get 0, not '0'.
Thus NULL is being defined as 0  [or (char *) 0, if you prefer], not 0x30.
 _  _  _                        Marc Kriguer
/ \/ \/ \
  /  /  / __     ___   __       BITNET: REMARCK@UCLASSCF
 /  /  / /  \   /  /  /         ARPA:   marc@pic.ucla.edu
/  /  /  \__/\_/   \_/\__/      UUCP:   {backbones}!ucla-cs!pic.ucla.edu!marc

gwyn@brl-smoke.ARPA (Doug Gwyn ) (07/06/87)

In article <8170@brl-adm.ARPA> bdm-thad@Walker-EMH.arpa writes:
>As for TRUE and FALSE, FALSE should be 0 or 0x30 and TRUE should be !FALSE.
>DISCLAIMER:  I'm just a Poli Sci major.

And it shows..

lied@ihuxy.UUCP (07/08/87)

> ... NULL is not, repeat, NOT, equal to zero ...
> NULL, on the other hand, is the absence of any number...
> NULL is hex 0 while zero is hex 30. Therefore, stdio.h
> should define NULL as 0x0, not 0 which would be 0x30....
> As for TRUE and FALSE, FALSE should be 0 or 0x30 and TRUE should be !FALSE.
>
> DISCLAIMER:  I'm just a Poli Sci major.

If this is a serious posting, you have made a wise decision.

schouten@uicsrd.UUCP (07/08/87)

In article <1219@ius2.cs.cmu.edu> edw@ius2.cs.cmu.edu (Eddie Wyatt) writes:
-... use NULL instead of 0 because there are those implementations of C
-(which are wrongly implemented) that don't treat 0 the way they should.
			.
			.
			.
/* Written  1:29 am  Jul  6, 1987 by bdm-thad@Walker-EMH.arpa in uicsrd:comp.lang.c */
/* ---------- "NULL, zero, and readable code" ---------- */

Re: NULL vs. zero and readable code
 
I think the problem here is the definition of NULL.  NULL is not, repeat,
NOT, equal to zero, at least the base ten zero.  If I recall my high
school math, zero is a number, the crossover between positive and nega-
tive.  NULL, on the other hand, is the absence of any number.
			.
			.
			.
DISCLAIMER:  I'm just a Poli Sci major.
/* End of text from uicsrd:comp.lang.c */

I must have missed something, cuz I can't seem to find the root of this
discussion.  Sounds entertaining, though.  ("I'm just a Poli Sci major"
is the most appropriate disclaimer I've heard.)
But I feel obligated to encourage the use of NULL over 0.
Regardless of style questions (or "theoretical" discussions), I
have worked on systems where NULL was != 0, because 0 was a
valid address.  It's incredibly annoying to have to search for all
occurrences of 0 that are intended as a NULL pointer value.

Dale A. Schouten
UUCP:	 {ihnp4,seismo,pur-ee,convex}!uiucdcs!uicsrd!schouten
ARPANET: schouten%uicsrd@a.cs.uiuc.edu
CSNET:	 schouten%uicsrd@uiuc.csnet
BITNET:	 schouten@uicsrd.csrd.uiuc.edu

ron@topaz.rutgers.edu.UUCP (07/08/87)

I assume Thad's message is sarcasm but it's sure to cause people to
respond, so why should I be any different.

1.  NULL is not supposed to be a definition of the ASCII NUL character,
    but rather a mnemonic for an invalid pointer in C.

2.  On two's complement machines, 0 is zero in any base.

-Ron

What does ANSI C say about -0 on ones complement machines?

gwyn@brl-smoke.UUCP (07/09/87)

In article <13222@topaz.rutgers.edu> ron@topaz.rutgers.edu (Ron Natalie) writes:
>What does ANSI C say about -0 on ones complement machines?

It's not specifically addressed as a -0 issue.  However, enough rules for
the abstract machine arithmetic are provided to resolve practically any
question concerning -0 in various contexts.  Note that on reasonable 1's
complement architectures, one does not get a -0 as the result of a series
of arithmetic operations unless one of the original operands were -0,
which in C would have to be written as a bitlike entity (e.g. ~0 or 0xFFFF).
The C source expression "-0" (without the quotes) means 0, not -0.
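
(To see the difference, a quick sketch -- mine, not Doug's.  On the
2's complement machine you probably have, ~0 prints as all ones:)

	#include <stdio.h>

	int main(void)
	{
		int a = -0;	/* unary minus on the constant 0: value is plain 0 */
		int b = ~0;	/* all-ones bit pattern: -0 on a 1's complement
				   machine, -1 on a 2's complement one */

		printf("-0 evaluates to %d\n", a);
		printf("~0 bit pattern: %#x\n", (unsigned)b);
		return 0;
	}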

jack@swlabs.UUCP (Jack Bonn) (07/10/87)

In article <6090@brl-smoke.ARPA>, gwyn@brl-smoke.ARPA (Doug Gwyn ) writes:
> In article <13222@topaz.rutgers.edu> ron@topaz.rutgers.edu (Ron Natalie) writes:
> >What does ANSI C say about -0 on ones complement machines?
> 
>                                              Note that on reasonable 1's
> complement architectures, one does not get a -0 as the result of a series
> of arithmetic operations unless one of the original operands were -0

I don't believe this to be the case.  One's complement addition is the
same as two's complement addition, plus an END AROUND CARRY: any carry
out of the high bit is brought around to the lsb and added in.

Perhaps an example will elucidate for those less familiar with the
technique.  Let's assume a 16-bit word for simplicity.

    hex    decimal    comments

    FFFE        -1    Negation is simply complementation in one's comp.
   +0001         1    Same as two's comp for numbers > 0.
   =FFFF        -0    Sum of the above.  Note that no carry was generated,
                      so the END AROUND rule doesn't apply.

Note that we generated a -0 when none of the original operands were -0.

Now an example with a carry:

    hex    decimal    comments

    FFFE        -1    Negation is simply complementation in one's comp.
   +0002         2    Same as two's comp for numbers > 0.
  =1 0000             Sum of the above.  A carry was generated, so the
                      END AROUND rule applies.  Note that this is an
                      intermediate result and would never be made available.
   =0001         1    Result.  The carry is added in before the result
                      is made available.
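
(Both examples can be checked with a little C model of the rule; the
function name and 16-bit mask are just for illustration, and it assumes
ints wider than 16 bits:)

	#include <stdio.h>

	#define MASK 0xFFFFu	/* model a 16-bit word */

	/* One's complement add as described above: add, then fold any
	   carry out of the top bit back into the lsb (END AROUND CARRY). */
	unsigned oc_add(unsigned a, unsigned b)
	{
		unsigned sum = a + b;		/* up to 17 bits wide here */
		if (sum > MASK)			/* carry generated? */
			sum = (sum & MASK) + 1;	/* bring it around to the lsb */
		return sum & MASK;
	}

	int main(void)
	{
		printf("FFFE + 0001 = %04X\n", oc_add(0xFFFE, 0x0001)); /* FFFF: -0 */
		printf("FFFE + 0002 = %04X\n", oc_add(0xFFFE, 0x0002)); /* 0001 */
		return 0;
	}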
  
I don't know whether ALL machines will leave a -0 result.  But I know
that the CDC 6400 (and family) did.  Since the one's complement system was
claimed to be used for speed, I doubt that many manufacturers would add an
additional normalization step before storing results.  Do you have any 
examples of machines that do this normalization?  Anyway, is the CDC family
to be dismissed as un"reasonable"?
-- 
Jack Bonn, <> Software Labs, Ltd, Box 451, Easton CT  06612
seismo!uunet!swlabs!jack

gmt@arizona.edu (Gregg Townsend) (07/10/87)

In article <6090@brl-smoke.ARPA>, gwyn@brl-smoke.ARPA (Doug Gwyn ) writes:
>                                              Note that on reasonable 1's
> complement architectures, one does not get a -0 as the result of a series
> of arithmetic operations unless one of the original operands were -0
 
In article <266@swlabs.UUCP>, jack@swlabs.UUCP (Jack Bonn) replies:
> I don't know whether ALL machines will leave a -0 result.  But I know
> that the CDC 6400 (and family) did....  Do you have any examples that
> [normalize to avoid this]?  Anyway, is the CDC family
> to be dismissed as un"reasonable"?

I programmed the CDC 6000 series in assembly language for 12 years.  For an
addition, both operands had to be -0 to get a -0 result;  for subtraction it
had to be  (-0) - (+0).  I don't remember if you could even get -0 from integer
multiplication, but that wasn't in the original architecture anyway.

The way to think of this was that both operands of addition, or the wrong
one for subtraction, were complemented (negated) and then added.  Then the
result was complemented on the way out.  I assume this was just conceptual
and actually involved negative logic instead of extra gates.

So the example of (-1) + (+1) becomes	(sorry, I still think in octal!)
	-1  =  7776   compl ->  0001
	+1  =  0001   compl ->  7776
				----
			add ->	7777	compl -> 0000	yielding +0 result
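
(In C that recipe can be sketched like so -- a model of the description
only, not of the real hardware, with made-up names and a 12-bit word to
match the octal:)

	#include <stdio.h>

	#define MASK 07777u	/* 12-bit words, to match the octal above */

	/* add with end-around carry */
	unsigned eac_add(unsigned a, unsigned b)
	{
		unsigned sum = a + b;
		if (sum > MASK)
			sum = (sum & MASK) + 1;
		return sum & MASK;
	}

	/* Addition as described: complement both operands, add, then
	   complement the result on the way out. */
	unsigned cdc_add(unsigned a, unsigned b)
	{
		return ~eac_add(~a & MASK, ~b & MASK) & MASK;
	}

	int main(void)
	{
		printf("naive: 7776 + 0001 = %04o\n", eac_add(07776, 00001)); /* 7777: -0 */
		printf("cdc:   7776 + 0001 = %04o\n", cdc_add(07776, 00001)); /* 0000: +0 */
		return 0;
	}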

     Gregg Townsend / Computer Science Dept / Univ of Arizona / Tucson, AZ 85721
     +1 602 621 4325      gmt@Arizona.EDU       110 57 17 W / 32 13 47 N / +758m

guy%gorodish@Sun.COM (Guy Harris) (07/10/87)

> But I feel obligated to encourage the use of NULL over 0.
> Regardless of style questions (or "theoretical" discussions), I
> have worked on systems where NULL was != 0, because 0 was a
> valid address.  It's incredibly annoying to have to search for all
> occurrences of 0 that are intended as a NULL pointer value.

Any C implementation that doesn't convert the constant 0 to a pointer
of the appropriate type, in the appropriate contexts, is buggy,
broken, wrong, invalid, etc., etc., etc..  ANY IMPLEMENTATION WHERE
THE STATEMENTS

	if (p == NULL)
		(do something)

and

	if (p == 0)
		(do something)

where "p" is of type "char *" don't generate equivalent code (if the
two "(do something)"s are the same) is broken.  The same is true of

	p = 0;

and

	p = NULL;

and of

	foo((char *)0);

and

	foo((char *)NULL);

It doesn't matter *what* bit pattern is used to represent a NULL
pointer; NO C IMPLEMENTATION THAT REQUIRES NULL TO BE DEFINED TO BE
THAT BIT PATTERN IS VALID.  Period.  End of discussion.

If 0 is a valid address, and objects have 0 as their address (there
are implementations where 0 is a valid address in the sense that you
won't get a fault by referencing it, but where no objects are placed
at that location, and many of these represent a null pointer by an
all-zeroes bit pattern), then the compiler MUST recognize 0 in the
appropriate context (one where it is to be converted to a null
pointer of the appropriate type) and represent it as the bit pattern
used for null pointers in that implementation.

This may be counterintuitive, but that's just too bad; that's the way
C works, changing this would break too many programs written strictly
according to the rules, and it's not going to change.  (Besides, the
same thing happens when mixing the integral constant 0 and floating
point numbers, if 0.0 isn't represented by an all-zero bit pattern.)
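
(To make the parallel concrete -- a sketch along those lines, not
Guy's code: the same source constant 0 ends up as three different
representations.)

	#include <stdio.h>

	int main(void)
	{
		char  *p = 0;	/* compiler emits this implementation's
				   null-pointer bit pattern, whatever it is */
		double d = 0;	/* compiler emits the floating 0.0 pattern */
		int    i = 0;	/* plain integer zero */

		if (p == 0 && d == 0 && i == 0)
			printf("one constant, three representations\n");
		return 0;
	}
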
	Guy Harris
	{ihnp4, decvax, seismo, decwrl, ...}!sun!guy
	guy@sun.com

gwyn@brl-smoke.ARPA (Doug Gwyn ) (07/12/87)

In article <266@swlabs.UUCP> jack@swlabs.UUCP (Jack Bonn) writes:
>I don't know whether ALL machines will leave a -0 result.  But I know
>that the CDC 6400 (and family) did.

I wonder about this.  The CDC 1700, a 16-bit 1's complement machine using
the same cordwood modules as the larger CDCs (which means the same bit-slice
arithmetic modules), definitely did NOT produce -0 in normal arithmetic
unless one introduced a -0 as an operand.  That's why I was surprised when
I first heard people saying that -0 was a problem; I never had any trouble
with it in two years of CDC 1700 programming.

gwyn@brl-smoke.ARPA (Doug Gwyn ) (07/12/87)

In article <44200004@uicsrd> schouten@uicsrd.cs.uiuc.edu writes:
>I have worked on systems where NULL was != 0, because 0 was a valid address.

I thought we were discussing C.  No C data object is permitted to have
an address such that a pointer to it is indistinguishable from 0 cast
to the appropriate pointer type.
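
(In other words -- a quick sketch of the guarantee, not from the posting:)

	#include <assert.h>

	int main(void)
	{
		int x;
		int *p = &x;

		/* A pointer to any real object must compare unequal to
		   the constant 0 converted to that pointer type. */
		assert(p != (int *)0);
		assert(p != 0);		/* same test; no cast needed */
		return 0;
	}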

jack@swlabs.UUCP (Jack Bonn) (07/14/87)

In article <6104@brl-smoke.ARPA>, gwyn@brl-smoke.ARPA (Doug Gwyn ) writes:
> In article <266@swlabs.UUCP> jack@swlabs.UUCP (Jack Bonn) writes:
> >I don't know whether ALL machines will leave a -0 result.  But I know
> >that the CDC 6400 (and family) did.
> 
> I wonder about this.  The CDC 1700, a 16-bit 1's complement machine using
> the same cordwood modules as the larger CDCs (which means the same bit-slice
> arithmetic modules), definitely did NOT produce -0 in normal arithmetic
> unless one introduced a -0 as an operand.  That's why I was surprised when
> I first heard people saying that -0 was a problem; I never had any trouble
> with it in two years of CDC 1700 programming.

Apparently, judging from my mail (thanks guys), I was misinformed about 
the possibility of a -0 in the CDC 6400.  One has to be careful when 
believing professors.  The interesting thing is that the instruction 
set is such that the lack of a -0 never broke anything; at least in 
the work that I did with it.  And I believe that the code would have 
been the same regardless.

I have a question though.  How do they prevent a -0 from appearing
when you have high-level-language code like:

a = -b;

and b is 0?  Do they generate code to subtract from 0 rather than 
just complementing the bits?  Or is that the ONLY way to do it?  It 
has been 14 years since I tried to tame that beast, and some of the 
details elude me.
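
(Working the question through against Gregg Townsend's
complement-add-complement description above -- purely a sketch of that
model, not verified against the actual machine -- pure complementation
of b == 0 yields -0, while subtracting from zero under the model
yields +0:)

	#include <stdio.h>

	#define MASK 0xFFFFu

	unsigned eac_add(unsigned a, unsigned b)	/* end-around carry add */
	{
		unsigned sum = a + b;
		if (sum > MASK)
			sum = (sum & MASK) + 1;
		return sum & MASK;
	}

	/* a - b in that model: complement the minuend, add, then
	   complement the result on the way out. */
	unsigned cdc_sub(unsigned a, unsigned b)
	{
		return ~eac_add(~a & MASK, b) & MASK;
	}

	int main(void)
	{
		unsigned b = 0;

		printf("-b by complement: %04X\n", ~b & MASK);     /* FFFF: -0 */
		printf("-b as 0 - b:      %04X\n", cdc_sub(0, b)); /* 0000: +0 */
		return 0;
	}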
-- 
Jack Bonn, <> Software Labs, Ltd, Box 451, Easton CT  06612
seismo!uunet!swlabs!jack

howard@cpocd2.UUCP (Howard A. Landman) (07/24/87)

In article <6107@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>In article <44200004@uicsrd> schouten@uicsrd.cs.uiuc.edu writes:
>>I have worked on systems where NULL was != 0, because 0 was a valid address.
>
>I thought we were discussing C.  No C data object is permitted to have
>an address such that a pointer to it is indistinguishable from 0 cast
>to the appropriate pointer type.

This is true but doesn't contradict the previous statement.  The exact
statement in K & R is:

	...it is guaranteed that the assignment of the constant 0 to
	a pointer will produce a null pointer distinguishable from a
	pointer to any object.

Now let Null be a null pointer.  Just because ((char *) 0) is a null
pointer, you cannot conclude that ((int) Null) == 0.  The ineluctable
conclusion is that an implementation of C, in which ((int)((char *) 0))
!= 0, is legal.  This might mean that widely used tests such as

	if (p)

where p is a pointer, are simply wrong!  From K & R we see that if
statements are just tests for zero:

	"...the expression is evaluated and if it is non-zero, the
	first substatement is executed."  p.201

	"...relational expressions like i > j and logical expressions
	connected by && and || are defined to have value 1 if true,
	and 0 if false.  ...  (In the test part of an if, while, for,
	etc., ``true'' just means ``non-zero.'')" p.41

A further interesting question is whether all null pointers must be
equal.  K & R is silent on this question, although they don't mention
more than one.  If not, then just because ((char *) 0) is a null
pointer, you cannot conclude that ((char *) 0) == ((int *) 0), and tests
like

	if (p == NULL)

might even fail for some null pointers!  Not a very pleasant thought.

-- 
	Howard A. Landman
	...!{oliveb,...}!intelca!mipos3!cpocd2!howard
	howard%cpocd2%sc.intel.com@RELAY.CS.NET
	"..., precisely the opposite of what we now know to be true!"

guy%gorodish@Sun.COM (Guy Harris) (07/24/87)

> Now let Null be a null pointer.  Just because ((char *) 0) is a null
> pointer, you cannot conclude that ((int) Null) == 0.  The ineluctable
> conclusion is that an implementation of C, in which ((int)((char *) 0))
> != 0, is legal.  This might mean that widely used tests such as
> 
> 	if (p)
> 
> where p is a pointer, are simply wrong!

No, it CANNOT mean that!  The test

	if (p)

tests whether "p" is not equal to the zero *of the type that "p" is*!
In other words, if "p" is of an integral type, it tests whether "p"
is not equal to an integral zero; if "p" is of a floating-point type,
it tests whether "p" is not equal to a floating-point zero; and, if
"p" is of a pointer type, it tests whether "p" is not equal to a
"pointer zero" of that type - namely, a NULL pointer.

Were there a "typeof" "operator", another way to say this would be:

	if (p)

is equivalent to

	if (p != (typeof p)0)

> A further interesting question is whether all null pointers must be
> equal.  K & R is silent on this question, although they don't mention
> more than one.  If not, then just because ((char *) 0) is a null
> pointer, you cannot conclude that ((char *) 0) == ((int *) 0), and tests
> like
> 
> 	if (p == NULL)
> 
> might even fail for some null pointers!  Not a very pleasant thought.

If NULL is properly defined as "0", rather than improperly defined as
"(char *) 0", this test will not fail for any null pointers.
	Guy Harris
	{ihnp4, decvax, seismo, decwrl, ...}!sun!guy
	guy@sun.com

ron@topaz.rutgers.edu (Ron Natalie) (07/25/87)

> No, it CANNOT mean that!  The test
>	if (p)
> tests whether "p" is not equal to the zero *of the type that "p" is*!
>  is equivalent to
>	if (p != (typeof p)0)

Which is, of course, precisely equivalent to

	if (p != 0)

for ANY type of p.  You can work through the "usual" arithmetic
conversions and the special case of integer constant zero versus
pointers and prove that this is true.  Hence, for any of the relational
or equality operators, one never needs to cast the zero.
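
For instance (a throwaway sketch):

	#include <stdio.h>

	int main(void)
	{
		char *p = 0;

		/* All three tests compare p against the null pointer of
		   p's type; the 0 never needs a cast. */
		if (p == 0)
			printf("p == 0\n");
		if (p == (char *)0)
			printf("p == (char *)0\n");
		if (!p)
			printf("!p, i.e. if (p) is false\n");
		return 0;
	}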

-Ron

gwyn@brl-smoke.ARPA (Doug Gwyn ) (07/25/87)

In article <802@cpocd2.UUCP> howard@cpocd2.UUCP (Howard A. Landman) writes:
>This might mean that widely used tests such as
>	if (p)
>where p is a pointer, are simply wrong!

I couldn't follow your reasoning, but we've been over this MANY times
and you're mistaken.  I think your problem is that you think in terms
of converting pointers to integer quantities for the zero comparisons,
but that is the exact opposite of the type conversions required by the
language rules.