[comp.lang.c] Why NULL is 0

chris@mimsy.UUCP (Chris Torek) (03/09/88)

(You may wish to save this, keeping it handy to show to anyone who
claims `#define NULL 0 is wrong, it should be #define NULL <xyzzy>'.
I intend to do so, at any rate.)

Let us begin by postulating the existence of a machine and a compiler
for that machine.  This machine, which I will call a `Prime', or
sometimes `PR1ME', for obscure reasons such as the fact that it
exists, has two kinds of pointers.  `Character pointers', or objects
of type (char *), are 48 bits wide.  All other pointers, such as
(int *) and (double *), are 32 bits wide.

Now suppose we have the following C code:

 	main()
	{
 		f1(NULL);	/* wrong */
 		f2(NULL);	/* wrong */
 		exit(0);
 	}
 
 	f1(cp) char *cp; { if (cp != NULL) *cp = 'a'; }
 	f2(dp) double *dp; { if (dp != NULL) *dp = 2.2; }

There are two lines marked `wrong'.  Now suppose we were to define NULL
as 0.  Clearly both calls are then wrong: both pass `(int)0', when the
first should be a 48 bit (char *) nil pointer and the second a 32 bit
(double *) nil pointer.

Someone claims we can fix that by defining NULL as (char *)0.  Suppose
we do.  Then the first call is correct, but the second now passes a
48 bit (char *) nil pointer instead of a 32 bit (double *) nil pointer.
So much for that solution.

Ah, I hear another.  We should define NULL as (void *)0.  Suppose we
do.  Then at least one call is not correct, because one should pass
a 32 bit value and one a 48 bit value.  If (void *) is 48 bits, the
second is wrong; if it is 32 bits, the first is wrong.

Obviously there is no solution.  Or is there?  Suppose we change
the calls themselves, rather than the definition of NULL:

	main()
	{
		f1((char *)0);
		f2((double *)0);
		exit(0);
	}

Now both calls are correct, because the first passes a 48 bit (char *)
nil pointer, and the second a 32 bit (double *) nil pointer.  And
if we define NULL with

	#define NULL 0

we can then replace the two `0's with `NULL's:

	main()
	{
		f1((char *)NULL);
		f2((double *)NULL);
		exit(0);
	}

The preprocessor changes both NULLs to 0s, and the code remains
correct.

On a machine such as the hypothetical `Prime', there is no single
definition of NULL that will make uncasted, un-prototyped arguments
correct in all cases.  The C language provides a reasonable means
of making the arguments correct, but it is not via `#define'.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

g-rh@cca.CCA.COM (Richard Harter) (03/10/88)

Just a small addendum to Chris's excellent summary -- the current generation
of PR1ME C compilers use a 48 bit size for all pointers; but earlier versions
used 48 bits for char pointers and 36 for everything else.  I don't know if
all formats are the same -- but then I don't need to know that.  All I need
to know is that everything works properly if I cast all of my pointers right.

And this really works!  Recently I did a PRIMOS upgrade of our software.
I took 40,000 lines of C, developed and maintained under UNIX, and ported
it to PRIMOS with 0 pointer problems.  [There are sundry incompatibilities
in library routine calls, but that is another matter.]

Long live portable software.
-- 

In the fields of Hell where the grass grows high
Are the graves of dreams allowed to die.
	Richard Harter, SMDS  Inc.

pablo@polygen.uucp (Pablo Halpern) (03/11/88)

From article <10576@mimsy.UUCP>, by chris@mimsy.UUCP (Chris Torek):
> (You may wish to save this, keeping it handy to show to anyone who
> claims `#define NULL 0 is wrong, it should be #define NULL <xyzzy>'.
> I intend to do so, at any rate.)
> 
> Let us begin by postulating the existence of a machine and a compiler
> for that machine.  This machine, which I will call a `Prime', or
> sometimes `PR1ME', for obscure reasons such as the fact that it
> exists, has two kinds of pointers.  `Character pointers', or objects
> of type (char *), are 48 bits wide.  All other pointers, such as
> (int *) and (double *), are 32 bits wide.

I must admit, you have come up with a situation where the uncasted
value NULL would be an incorrect parameter to a function that didn't
have a prototype regardless of the definition of NULL. (If you missed
the original posting, please don't flame about this.  Ask someone to
mail the original to you.)  In the situation mentioned, the compiler
could not know how wide to make the null pointer when passing it to the
function.  Indeed, even with NULL defined as (void *) 0, dpANS does not
say that the size of a (void *) is the same as the size of any other
pointer.

However, if I were writing a C compiler, I would choose a size for all
pointers equal to the size of the largest possible pointer.  This would
allow code that passed uncasted NULL to work correctly, provided NULL
is a type as large as a pointer.  This is not because dpANS says it
should be so, but because so much code would break if it were not.
Perhaps ANSI should add the restriction that all pointer types must be
the same size in an effort to "codify common existing practice."

gwyn@brl-smoke.ARPA (Doug Gwyn ) (03/12/88)

In article <124@polygen.UUCP> pablo@polygen.uucp (Pablo Halpern) writes:
>Perhaps ANSI should add the restriction that all pointer types must be
>the same size in an effort to "codify common existing practice."

But it isn't existing practice.

If you mean, non-portable, buggy code is existing practice, well yes,
but so what?  That certainly shouldn't serve as the basis for a standard.

pardo@june.cs.washington.edu (David Keppel) (03/12/88)

In article <124@polygen.UUCP> pablo@polygen.uucp (Pablo Halpern) writes:
>However, if I were writing a C compiler, I would choose a size for all
>pointers equal to the size of the largest possible pointer.  This would
>allow code that passed uncasted NULL to work correctly, provided NULL
>is a type as large as a pointer.  This is not because dpANS says it
>should be so, but because so much code would break if it were not.
>Perhaps ANSI should add the restriction that all pointer types must be
>the same size in an effort to "codify common existing practice."

My guess is that you would want to do this only if you didn't care
whether your compiler produced efficient code or made "reasonable"
time/space efficiency tradeoffs.

I know we've gone over this a lot, and perhaps the topic should be
redirected to comp.arch or some such, but (if nobody minds too much)
could somebody who understands please tell me what kind of efficiency
you lose (in runtime, codespace, and dataspace) in trying to make
everything the most general pointer type (on existing architectures)?

I'd sure like to know.

 ;-D on  (Well it looked like luminous phosphor when I wrote it)  Pardo

louie@trantor.umd.edu (Louis A. Mamakos) (03/12/88)

In article <124@polygen.UUCP> pablo@polygen.uucp (Pablo Halpern) writes:
>However, if I were writing a C compiler, I would choose a size for all
>pointers equal to the size of the largest possible pointer.  

Please don't do this.  On, for example, our Unisys 1100 machine, a "regular"
pointer is 8 bytes long (2 words, 72 bits).  A pointer to a function is 
64 bytes long (8 words, 288 bits).  Yes, this is a word-addressable machine
with 4 9-bit bytes per word. The existing semantics work just fine if
you don't assume programmer brain-damage.

Louis A. Mamakos  WA3YMH    Internet: louie@TRANTOR.UMD.EDU
University of Maryland, Computer Science Center - Systems Programming

friedl@vsi.UUCP (Stephen J. Friedl) (03/13/88)

In article <124@polygen.UUCP>, pablo@polygen.uucp (Pablo Halpern) writes:
> However, if I were writing a C compiler, I would choose a size for all
> pointers equal to the size of the largest possible pointer.  This would
> allow code that passed uncasted NULL to work correctly, provided NULL
> is a type as large as a pointer.  This is not because dpANS says it
> should be so, but because so much code would break if it were not.
> Perhaps ANSI should add the restriction that all pointer types must be
> the same size in an effort to "codify common existing practice."

I think this is naive.  Presumably, the large "common" pointer
format would pass around the machine OK but it still must be
converted to the machine's native pointer types to actually
use -- how efficiently will this be done?  This might be like the 80x86
huge pointer calculations or the old promote-float-to-double
rules -- they make life a little easier for the (lazy?)
programmer, at a larger expense in time and compiler complexity.

Perhaps more reasonable is to promote all pointers to the same
large width while passing them on the stack and convert them back
on the other end: this would fix the alignment issues but would
still slow things down.

I don't favor this approach but I bring it up in the spirit of
this discussion.  The obvious answer is to be very rigorous about
casting your pointers and knowing when to do so.  This is a
bummer for the beginners but we all had to go through it unless
we develop on a VAX :-).
-- 
Life : Stephen J. Friedl @ V-Systems, Inc./Santa Ana, CA   *Hi Mom*
CSNet: friedl%vsi.uucp@kent.edu  ARPA: friedl%vsi.uucp@uunet.uu.net
uucp : {kentvax, uunet, attmail, ihnp4!amdcad!uport}!vsi!friedl

throopw@xyzzy.UUCP (Wayne A. Throop) (03/14/88)

> pablo@polygen.uucp (Pablo Halpern)
>> by chris@mimsy.UUCP (Chris Torek)
>> [... on some existing architectures, pointers "naturally" have many
>>      sizes or formats... ]
> However, if I were writing a C compiler, I would choose a size for all
> pointers equal to the size of the largest possible pointer.  This would
> allow code that passed uncasted NULL to work correctly, provided NULL
> is a type as large as a pointer.

This also assumes that the nil pointer has the same format for all
pointer types, and that passing protocols are also the same.  Which,
granted, might all be legislated so by the compiler implementor.  *BUT*
on some architectures, this would impose unacceptable performance
or space consumption tradeoffs.  For example, all pointers might have
to be made twice as large as they might be.  Or access times to unpack
peculiar pointer formats from the "standard" one mandated by the
compiler might be prohibitive.  For example, look at the 80x86 family of
architectures.  Making all pointers 32 bits long and mandating that
everybody use the "huge" memory model (or whatever that model is called
nowadays) suffers unacceptable performance degradation for programs that
could fit into a 16-bit address space.

> This is not because dpANS says it
> should be so, but because so much code would break if it were not.
> Perhaps ANSI should add the restriction that all pointer types must be
> the same size in an effort to "codify common existing practice."

Well.... I don't admit that it is, indeed, "common existing practice" to
assume that pointers are all the same size and the same format, any more
than it is "existing practice" that pointers will fit in ints.  Granted,
there is code that makes this assumption, but it is not so universal as
you assume, since all of these pointer assumptions have, in fact, been
false since nearly the very beginning of C.  I think that dpANS makes
the proper tradeoff, and insists that there be a prototype in scope, or
that an explicit cast occur.
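
To see that tradeoff concretely on the hypothetical `Prime', here is
a sketch, assuming a dpANS compiler, with f1() and f2() as in Chris's
original example: with prototypes in scope, the constant 0 in each
call is converted by the normal assignment rules, so even an uncasted
NULL argument becomes the right kind of nil pointer.

	#define NULL 0

	void f1(char *cp) { if (cp != NULL) *cp = 'a'; }
	void f2(double *dp) { if (dp != NULL) *dp = 2.2; }

	main()
	{
		f1(NULL);	/* converted to a 48 bit (char *) nil pointer */
		f2(NULL);	/* converted to a 32 bit (double *) nil pointer */
		return 0;
	}

Without the prototypes, of course, the explicit casts remain mandatory.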

--
How often have I said to you that when you have eliminated the impossible,
whatever remains, however improbable, must be the truth.
    --- Sir Arthur Conan Doyle {The Sign of the Four}
-- 
Wayne Throop      <the-known-world>!mcnc!rti!xyzzy!throopw

gregg@a.cs.okstate.edu (Gregg Wonderly) (03/15/88)

From article <10576@mimsy.UUCP>, by chris@mimsy.UUCP (Chris Torek):
> (You may wish to save this, keeping it handy to show to anyone who
> claims `#define NULL 0 is wrong, it should be #define NULL <xyzzy>'.
> I intend to do so, at any rate.)
> 
> 	[Great example of PR1ME casting problem.]
> 

Another great example is the not-so-loved Intel segmentation.  For most
Xenix and other pseudo-UNIXes on these pseudo-computers, the following
is true.

When you use small model, everything works great, because sizeof(int) ==
sizeof(any *).  If you move to middle model, everything still works
pretty well, except that now sizeof(int) != sizeof(int (*)()), although
sizeof(int) == sizeof(any data *).  Now move to large model, and
sizeof(int) != sizeof(any *).  This can really cause problems with
routines which accept NULL as a parameter, because if you do not cast it
to (??? *)NULL, things break, quite spectacularly.  Now for small and
large model, #define NULL ((char *)0) would work, but not for middle
model, because sizeof(char *) != sizeof(int (*)()).
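
A sketch of the kind of call that bites here; setup() is a
hypothetical routine taking one data pointer and one function pointer:

	int setup();

	main()
	{
		/* In middle model the function pointer is wider than
		 * an int, and in large model the data pointer is too,
		 * so both casts are required.
		 */
		setup((char *)0, (int (*)())0);
		exit(0);
	}

	setup(buf, func) char *buf; int (*func)(); { return 0; }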

Moral:   As always stated in the past, ``Use typecasts, they make your program
         portable, not ugly!''

scott@stl.stc.co.uk (Mike Scott) (03/15/88)

In article <10576@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
....
>of type (char *), are 48 bits wide.  All other pointers, such as
>(int *) and (double *), are 32 bits wide.
>
>Now suppose we have the following C code:
>
> 	main()
>	{
> 		f1(NULL);	/* wrong */
> 		f2(NULL);	/* wrong */
> 		exit(0);
> 	}
> 
> 	f1(cp) char *cp; { if (cp != NULL) *cp = 'a'; }
> 	f2(dp) double *dp; { if (dp != NULL) *dp = 2.2; }
>
>There are two lines marked `wrong'.  Now suppose we were to define NULL
>as 0.  Clearly both calls are then wrong: both pass `(int)0', when the
>first should be a 48 bit (char *) nil pointer and the second a 32 bit
>(double *) nil pointer.
>

He suggests the form:
>       main()
>       {
>		f1((char *)NULL);
>		f2((double *)NULL);
>		exit(0);
>	}
>
>The preprocessor changes both NULLs to 0s, and the code remains
>correct.
>

If I could add my 2 pennyworth.  From K&R p192, we read " it is
guaranteed that assignment of the constant 0 to a pointer will
produce a null pointer distinguishable from a pointer to any object".

From page 71: "Within a function, each argument is in effect a local
variable initialized to the value with which the function was called."

It follows at once from these that the integer 0 may be supplied in
any function call where a pointer is expected, and the compiler must
make sure that the proper translation to the correct sort of
null pointer is performed.

[I suppose one might argue about differences between 'assignment' and
'initialisation'. I think it's clear that initialisation of a pointer
to NULL (whatever NULL may be defined as) and assignment of NULL to
that same pointer have to give the same result!]

The real problem is that functions don't have to be declared before
being used.  Given an ANSI-style declaration, the compiler can sort out
the mess.  Until then, I suppose the only way to get portable code is
indeed to use explicit casts as in the "correct" code above.  But until
now, I had no idea that machines with different pointer lengths existed
[I'm rather blinkered by my PDP8/PDP11/VAX spectacles].  I don't really
see how one can guess the needs of various 'strange' machine
architectures that may exist :-(

(Usual disclaimers apply.)
-- 
Regards. Mike Scott (scott@stl.stc.co.uk <or> ...uunet!mcvax!ukc!stl!scott)
phone +44-279-29531 xtn 3133.

janc@palam.eecs.umich.edu (Jan Wolter) (03/16/88)

In article <124@polygen.UUCP> pablo@polygen.uucp (Pablo Halpern) writes:
>From article <10576@mimsy.UUCP>, by chris@mimsy.UUCP (Chris Torek):
>> 
>> Let us begin by postulating the existence of a machine and a compiler
>> for that machine.  This machine, which I will call a `Prime', or
>> sometimes `PR1ME', for obscure reasons such as the fact that it
>> exists, has two kinds of pointers.  `Character pointers', or objects
>> of type (char *), are 48 bits wide.  All other pointers, such as
>> (int *) and (double *), are 32 bits wide.
>
>     ... if I were writing a C compiler, I would choose a size for all
>pointers equal to the size of the largest possible pointer.  This would
>allow code that passed uncasted NULL to work correctly, provided NULL
>is a type as large as a pointer....

For efficiency reasons you probably don't want to get rid of the other
pointer types.  However, why not handle pointers in function calls something
like the way char's are handled in function calls?  When I pass a char to a
function, it is automatically cast to int before being passed.  Why not
automatically cast all pointer types to (char *) before passing them?  I'm
fairly sure C does guarantee that casting a pointer to a pointer to a smaller
object and back always gives you back the same pointer.  This adds slightly
to the overhead in function calls on the Prime, but the simplification of the
interface on just about every other living machine is probably worth it.
This, of course, would only be done if no function prototype is given.
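
The char analogy in miniature; f() is just for illustration:

	/* The caller widens the char argument to an int and the
	 * callee narrows it back.  The proposal would treat pointer
	 * arguments the same way, widening each to (char *) and back.
	 */
	f(c) char c; { return c; }

	main() { return f('a'); }	/* 'a' travels as an int */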

One other vaguely related question:  Which of the following produce a null
pointer?

	int zero = 0;
	char *p1 = 0;             /* this is a null pointer! */
	char *p2 = zero;          /* is this a null pointer? */
	char *p3 = (char *)zero;  /* what's this? */

As I read K&R, a null pointer is only produced when a *constant* 0 is assigned
to a pointer.  When an integer variable is assigned to a pointer, K&R seems to
suggest that a bitwise copy is done, which may not be the same thing at all.
This seems to be the only case in C where "a=(b=c)" is not equivalent to
"a=c,b=c".

While K&R says assignment is a bitwise copy, they say explicitly typecasting
an integer to a pointer gives a machine dependent result.  Thus it seems
possible that the p1, p2, and p3 could be three different pointers.  (Frankly,
the more I read on this subject, the more I think K&R didn't have their minds
entirely clear on this business either.)

				- Jan Wolter
				  janc@crim.eecs.umich.edu

dsill@NSWC-OAS.arpa (Dave Sill) (03/17/88)

In article <800@zippy.eecs.umich.edu> Jan Wolter <janc@palam.eecs.umich.EDU> writes:
>One other vaguely related question:  Which of the following produce a null
>pointer?
>
>	int zero = 0;
>	char *p1 = 0;             /* this is a null pointer! */

Yes.

>	char *p2 = zero;          /* is this a null pointer? */

Maybe.  K&R say assignments between pointers and ints are nonportable,
as are assignments between different types of pointers.

>	char *p3 = (char *)zero;  /* what's this? */

Exactly the same as p2.  K&R define a cast as performing the
conversions required to assign the operand to a variable of the type
of the cast.

>As I read K&R, a null pointer is only produced when a *constant* 0 is assigned
>to a pointer.

Almost: once generated, a null pointer can also be produced by
assignment from an existing null pointer.  I.e.,
	char *pc1, *pc2;
	pc1 = 0;
	pc2 = pc1;

>While K&R says assignment is a bitwise copy, they say explicitly typecasting
>an integer to a pointer gives a machine dependent result.  Thus it seems
>possible that the p1, p2, and p3 could be three different pointers.

No, at most two different pointers.  p1 is null.  p2 and p3 will be
the same nonportable pointer.

>(Frankly,
>the more I read on this subject, the more I think K&R didn't have their minds
>entirely clear on this business either.)

Maybe you should read some more.

=========
The opinions expressed above are mine.

"Give me a pointer and I'll dereference the world."
					-- David Keppel

dsill@NSWC-OAS.arpa (Dave Sill) (03/17/88)

I wrote:
>In article <800@zippy.eecs.umich.edu> Jan Wolter <janc@palam.eecs.umich.EDU> writes:
>>	char *p2 = zero;          /* is this a null pointer? */
>
>Maybe.  K&R say assignments between pointers and ints are nonportable,
>as are assignments between different types of pointers.
>
>>	char *p3 = (char *)zero;  /* what's this? */
>
>Exactly the same as p2.  K&R define a cast as performing the
>conversions required to assign the operand to a variable of the type
>of the cast.

K&R page 42:
"...	(type-name) expression  ...
The precise meaning of a cast is in fact as if *expression* were
assigned to a variable of the specified type..."

This leads one to the conclusion that
	char *p = zero;
and
	char *p = (char *)zero;
give the same result.  Why, then, does the former cause a warning
about an illegal combination of pointer and integer?  Is the sole
function of the cast in the latter to prevent such a warning?

>>(Frankly,
>>the more I read on this subject, the more I think K&R didn't have their minds
>>entirely clear on this business either.)
>
>Maybe you should read some more.

I'd suggest reading the dpANS-C, where this is all much better
defined.

=========
The opinions expressed above are mine.

"Meanings receive their dignity from words instead of giving it to them."
					-- Blaise Pascal

franka@mmintl.UUCP (Frank Adams) (03/17/88)

In article <800@zippy.eecs.umich.edu> janc@palam.eecs.umich.edu (Jan Wolter) writes:
>As I read K&R, a null pointer is only produced when a *constant* 0 is assigned
>to a pointer.  When an integer variable is assigned to a pointer, K&R seems to
>suggest that a bitwise copy is done, which may not be the same thing at all.
>This seems to be the only case in C where "a=(b=c)" is not equivalent to
>"a=c,b=c".

It isn't.  Try:

double d;
int i;
d = (i = 1.5);

In general, "a=(b=c)" is equivalent to "b=c,a=b".
-- 

Frank Adams                           ihnp4!philabs!pwa-b!mmintl!franka
Ashton-Tate          52 Oakland Ave North         E. Hartford, CT 06108

g-rh@cca.CCA.COM (Richard Harter) (03/18/88)

In article <636@acer.stl.stc.co.uk> scott@acer.UUCP (Mike Scott) writes:

>If I could add my 2 pennyworth.  From K&R p192, we read " it is
>guaranteed that assignment of the constant 0 to a pointer will
>produce a null pointer distinguishable from a pointer to any object".
>
>From page 71: "Within a function, each argument is in effect a local
>variable initialized to the value with which the function was called."
>
>It follows at once from these that the integer 0 may be supplied in
>any function call where a pointer is expected, and the compiler must
>make sure that the proper translation to the correct sort of
>null pointer is performed.
>
>[I suppose one might argue about differences between 'assignment' and
>'initialisation'. I think it's clear that initialisation of a pointer
>to NULL (whatever NULL may be defined as) and assignment of NULL to
>that same pointer have to give the same result!]
>
>The real problem is that functions don't have to be declared until
>used. Given an ANSII-type declaration, the compiler can sort out
>the mess. Until then, I suppose the only way to get portable code is
>indeed to use expicit casts as in the "correct" code above.

Actually, the NULL=0 question is part of a more general problem that
may not be addressed correctly in ANSI C prototypes.

In a single routine there is no problem with NULL=0 because that usage
is guaranteed by the language.  The problem arises when we call one
routine from another and pass arguments.  For everything to work right
the calling routine must pass the right number of arguments to the
called routine, with all arguments having the right type.  The problem
is that the compiler has no way of knowing what these are -- it takes
it on faith from the code present in the routine that has the call.
In the case of an uncasted 0, it sees an integer, and sets up the
call as though it were passing an integer.  This fails unless, by chance,
the architecture is such that a NULL pointer and an integer 0 happen
to look alike.

However the problem is more general, and I am not sure that ANSI
function prototypes handle it properly.  [I don't have the spec,
so I am winging it a bit here.]  The problem breaks into three parts.
Part 1 is to provide a means for both the calling routine and the
called routine to have a description of the calling sequence.  Part
2 is to ensure that they have the *same* description.  Part 3 is to
ensure synchronization, i.e. to ensure that the routines affected
change when the description changes.

Function prototypes address the first question.  The second question
is the tricky one.  The natural way to do this is to put the description
in one place, which is shared by both caller and called routines.  In
the UNIX/C environment this means putting them in a header file.  It
also has implications for the called routine.  Does the called routine
still have an inline calling sequence declaration?  Look at this:

#include "prototypes.h"
....
type_of_foo foo(a,b)
  type_of_a a;
  type_of_b b;
{....}

I don't know if this is what ANSI C expects, but it's not too hot if
it is, because we now have two descriptions of the calling sequence.
The right thing is simply

#include "prototypes.h"
....
type_of_foo foo(a,b) {....}

In fact, even type_of_foo should be omitted, since that is really part
of the description.

I have the (perhaps mistaken) impression that ANSI function prototypes
cannot be set up this way in the called routine.  If they can't, the next
best thing is for the compiler to check whether the function prototype
description in "prototypes.h" matches the actual declaration.

This may be even better from the standpoint of visibility, since the
contents of the header file containing the description are not visible
when you are looking at code referencing the description.

Another issue is that the called routine may not even have a copy of
the description -- in that case the value of the description file is
much less, since the routine can change without the description file
changing and vice versa.

Part 3 (synchronization) can be handled using make with the right
dependencies, if the other problems are met.

These are some of the issues, as I see them.  I'm not sure that  the
ANSI C function prototypes deal with them all 'properly', but I
haven't seen anything on the net discussing the impact of prototypes
on the called routines.
-- 

In the fields of Hell where the grass grows high
Are the graves of dreams allowed to die.
	Richard Harter, SMDS  Inc.

ok@quintus.UUCP (Richard A. O'Keefe) (03/18/88)

In article <25652@cca.CCA.COM>, g-rh@cca.CCA.COM (Richard Harter) writes:
> I don't know if this is what ANSI C expects, but it's not too hot if
> it is, because we now have two descriptions of the calling sequence.
> The right thing is simply
> 
> #include "prototypes.h"
> ....
> type_of_foo foo(a,b) {....}
> 
> In fact, even type_of_foo should be omitted, since that is really part
> of the description.
> 
It depends what you mean by "the right thing".
If you mean "whatever means I have less to write", fine.
If you mean "whatever makes C more like Pascal", fine.
If you mean "whatever will make the code easier to maintain",
you're dead wrong.

When as Joe Maintainer I come along and try to figure out what your function
foo() does, I want to be able to see what the arguments and result are,
which means that I want them *right* *there* with the rest of foo(), where
I can see them.  In fact, I want all the immediately useful information to
be there where I don't have to hunt all over file space to find it.
(Note that I said *immediately* useful.  If there are 10 pages of text
describing the thing, and a couple of hundred test cases, give me in the
source code the names of the files that contain them.)

This was one of the biggest blunders in Pascal:  if you declare a procedure
'forward' in Pascal, you don't even get the NAMES of the parameters at the
actual definition.  Which meant that competent Pascal programmers always
put a copy of the parameter list there as a comment.

This is one of the things the ANSI C committee got right.
There are a lot of such things.

gwyn@brl-smoke.ARPA (Doug Gwyn ) (03/20/88)

In article <25652@cca.CCA.COM> g-rh@CCA.CCA.COM.UUCP (Richard Harter) writes:
>I don't know if this is what ANSI C expects, but it's not too hot if
>it is, because we now have two descriptions of the calling sequence.

Yes, but since a diagnostic is required for such a declaration mismatch,
you get the opportunity to fix it before it's actually used.

lai@vedge.UUCP (David Lai) (03/29/88)

In article <2448@umd5.umd.edu>, louie@trantor.umd.edu (Louis A. Mamakos) writes:
> 
> On, for example, our Unisys 1100 machine, a "regular"
> pointer is 8 bytes long (2 words, 72 bits).  A pointer to a function is 
> 64 bytes long (8 words, 288 bits).  
The math is wrong: 64 bytes would be 16 words (576 bits), not 8.

Our news feeds are very late, so this reply may have been posted already.
-- 
The views expressed are those of the author, and not of Visual Edge or Usenet
David Lai (vedge!lai@oliver.cs.mcgill.edu || ...decvax!musocs!vedge!lai)

ray@micomvax.UUCP (Ray Dunn) (03/30/88)

In article <10576@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>
> [the definitive, correct explanation of NULL and pointers]
>

In article <636@acer.stl.stc.co.uk> scott@acer.UUCP (Mike Scott) totally
f*cks things up again by writing:
>
>If I could add my 2 pennyworth.  From K&R p192, we read " it is
>guaranteed that assignment of the constant 0 to a pointer will
                 ****************************
>produce a null pointer distinguishable from a pointer to any object".
>
>From page 71: "Within a function, each argument is in effect a local
>variable initialized to the value with which the function was called."
>
>It follows at once from these that the integer 0 may be supplied in
                                    *****************************
>any function call ......


Mike, is your slip showing clearly enough??

The compiler treats the *constant* zero specially, as being any required
number of bits to match the type of the pointer lvalue *in an assignment* -
i.e.  that syntactic thingamy which uses the "=" symbol!!!!

In other words, when ='ed to a pointer, the constant 0 is automatically
cast to the correct type.

In all other cases, 0 is just like any other int, constant or variable, and
must be cast to the correct type, as it occupies the number of bits defined
for ints in that implementation.
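
To see the distinction -- func() here is a hypothetical routine that
expects a (char *) argument:

	f()
	{
		char *p;

		p = 0;			/* constant 0 converted to a (char *) nil pointer */
		func(0);		/* no assignment, no conversion: a bare int 0 is passed */
		func((char *)0);	/* correct: the cast supplies the type */
	}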

Everybody - *LISTEN* - Chris is *RIGHT* - don't write any other explanation
- read Chris' article again, and *LEARN*.

Ray Dunn.  ..{philabs,mnetor,musocs}!micomvax!ray

davidsen@steinmetz.steinmetz.ge.com (William E. Davidsen Jr) (04/04/88)

There has been a great deal of misunderstanding of the use of zero and
pointers. It seems clear in K&R and practice that assignment of a zero
to a pointer produces a NULL pointer of the appropriate type. What is
incorrectly assumed is that zero *is* a NULL pointer.

People (usually ;>) have no problem with the idea that while assigning a
zero to a float gives it the value 0.0, in most implementations the
float value does not have all bits set to zero.

There are two places in which a zero will not work as expected:
  In expressions, adding an int expression to a pointer produces a
pointer result, while adding an int expression to an int constant (zero)
produces an int result.  Since there does not seem to be a good use for
treating the sum of two ints as an address in any remotely portable
program, this is of interest but not of concern.

  In procedure calls, however, on many machines the arguments must have
the correct type for portability, because either the sizes of int and
pointer are not the same, or the size and content of various
pointer types are not the same.

  If prototypes are used, then coding need not be as precise, since use
of 0 where 0L is needed, or zero where (char *)0 is needed will be
corrected by the compiler. Without prototypes the arguments will need to
be correct if the program is to be portable.
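
A sketch of both cases, with a hypothetical routine store():

	extern int store(long offset, char *where);

	caller()
	{
		store(0, 0);		/* prototype in scope: the compiler
					 * converts these to 0L and (char *)0 */
		store(0L, (char *)0);	/* what must be written without one */
	}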

There are many programs which "have worked for years" which are not
portable, because of this lack of typing on arguments. Most of these run
on any machine which has the size of int equal sizeof pointer, and all
pointers are the same in content. This includes the VAX and 68000
family. Other machines, such as some Data General models, Cray, small
Intel processors, SPARC, and some non-UNIX C compilers on any machine
will not accept this lack of explicit typing.
-- 
	bill davidsen		(wedu@ge-crd.arpa)
  {uunet | philabs | seismo}!steinmetz!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me

guy@gorodish.Sun.COM (Guy Harris) (04/05/88)

> Most of these run on any machine which has the size of int equal sizeof
> pointer, and all pointers are the same in content. This includes the VAX
> and 68000 family. Other machines, such as some Data General models, Cray,
> small Intel processors, SPARC, and some non-UNIX C compilers on any machine
> will not accept this lack of explicit typing.

Umm, I agree with your sentiments whole-heartedly, but I do have to point out
that on SPARC, sizeof (mumble *) == sizeof (frotz *) == sizeof int for all
values of "mumble" and "frotz", and that a "mumble *" contains the address of
the appropriate byte of the "mumble" in question (SPARC being big-endian) for
all values of "mumble".  For better or worse, on SPARC you can get away with
passing 0 or NULL to routines expecting a pointer.  (You'd better not try to
*dereference* that null pointer, though.)

Using tricks such as

	printf(fmt, args)
		char *fmt;
		char *args;
	{
		char *ap;

		...

		ap = (char *) &args;

		while ((c = *fmt++) != '\0') {

			switch (c) {

			case 'd':
				int_value = *((int *) ap);
				ap += sizeof(int);

		...
	}

won't work on SPARC - you have to use the "varargs" stuff - but that's a
different matter, as is the fact that SPARC requires (2,4,8)-byte alignment of
(2,4,8)-byte atoms (many other chips require some or all of these alignments as
well, including some CISCs such as the WE32100).

ark@alice.UUCP (04/05/88)

In article <10229@steinmetz.steinmetz.ge.com>, davidsen@steinmetz.UUCP writes:

> People (usually ;>) have no problem with the idea that while assigning a
> zero to a float gives it the value 0.0, in most implementations the
> float value does not have all bits set to zero.

I am overwhelmed by curiosity.  Can you give me examples of
three machine architectures in which the floating-point value
0 does not have all its bits set to zero?

djones@megatest.UUCP (Dave Jones) (04/05/88)

in article <10229@steinmetz.steinmetz.ge.com>, davidsen@steinmetz.steinmetz.ge.com (William E. Davidsen Jr) says:
> 
> There has been a great deal of misunderstanding of the use of zero and
> pointers. It seems clear in K&R and practice that assignment of a zero
> to a pointer produces a NULL pointer of the appropriate type. What is
> incorrectly assumed is that zero *is* a NULL pointer.
> 

... [ Lots of good, true stuff about pointers etc. ]

> There are many programs which "have worked for years" which are not
  ^^^^^ ^^^ ^^^^ ^^^^^^^^ ^^^^^  ^^^^ ^^^^^^ ^^^ ^^^^^  ^^^^^ ^^^ ^^^
> portable, because of this lack of typing on arguments. Most of these run
  ^^^^^^^^
> on any machine which has the size of int equal sizeof pointer, and all
> pointers are the same in content. This includes the VAX and 68000
> family. Other machines, such as some Data General models, Cray, small
> Intel processors, SPARC, and some non-UNIX C compilers on any machine
		    ^^^^^
> will not accept this lack of explicit typing.
> -- 
> 	bill davidsen		(wedu@ge-crd.arpa)
>   {uunet | philabs | seismo}!steinmetz!crdos1!davidsen
> "Stupidity, like virtue, is its own reward" -me


SPARC?  Really?  REALLY??   What have they done?  Please clarify.

This is news to me.  I was naively assuming that since the current
class of Sun workstations is MC68020, they would attempt to be sure
that 68000 C programs would be portable to SPARC. (Even non-portable ones
which "have worked for years.")

I'm stunned.  Really.  REALLY!!  Stunned.

Please tell me this was just an April Fool's joke that got here a
little late.


		Dave (int*) Jones

edw@IUS1.CS.CMU.EDU (Eddie Wyatt) (04/06/88)

Aside:

> People (usually ;>) have no problem with the idea that while assigning a
> zero to a float gives it the value 0.0, in most implementations the
> float value does not have all bits set to zero.

  I do believe the IEEE and VAX representations of float zero are bit
patterns consisting of all zeros.  So what are these implementations
that don't use a zero bit pattern?


-- 

Eddie Wyatt 				e-mail: edw@ius1.cs.cmu.edu

rbutterworth@watmath.waterloo.edu (Ray Butterworth) (04/06/88)

About the best way I've seen of convincing people of why you
can't simply use 0, or NULL, or 0L, or (void*)0, or any such
single token as a null pointer is to consider the following
function call:

auto char *bigstring;
bigstring = concatenate("This", " ", "is", " ", "it", ".", (char*)0);

where concatenate() mallocs enough memory, copies all its arguments
into one long string, and returns a pointer to that string.

It takes a variable number of arguments of which the last must be
a null (char *) pointer.

Now on any compiler for which a null character pointer is different
from the int 0, whether with a different size or a non-zero bit pattern,
there is absolutely no way that the compiler can automatically generate
a correct value for the last argument.  Even prototypes won't help
in this case.
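
For illustration, here is one way such a concatenate() might be
written with the old varargs.h interface -- a sketch, not the only
way.  The callee pulls each argument out with va_arg(ap, char *),
which is exactly why the terminator must really be a (char *) nil
pointer and not a bare 0:

	#include <varargs.h>

	extern char *malloc();
	extern char *strcat();
	extern int strlen();

	char *concatenate(va_alist)
		va_dcl
	{
		va_list ap;
		register char *s, *result;
		register unsigned len = 1;	/* room for the '\0' */

		va_start(ap);
		while ((s = va_arg(ap, char *)) != (char *)0)
			len += strlen(s);
		va_end(ap);

		if ((result = malloc(len)) == (char *)0)
			return ((char *)0);
		*result = '\0';
		va_start(ap);
		while ((s = va_arg(ap, char *)) != (char *)0)
			(void) strcat(result, s);
		va_end(ap);
		return (result);
	}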

You may use (char*)0, NULL_P(char*), (char*)NULL, or whatever
your favourite is.  But you must explicitly put the (char*) type
in there somehow or other.

If you simply use 0, NULL, 0L, or something that doesn't mention
(char*), it might just happen to work on YOUR compiler NOW, but
there is no reason you should assume that this code will work
anywhere else at any other time.  It is simply wrong.

And before someone suggests defining NULL as (char*)0, remember
that there are machines for which sizeof(char*) is not the same
as sizeof(long*).  All that does is give you something else that
is wrong but happens to work in a few more circumstances.  It is
still wrong.

davidsen@steinmetz.steinmetz.ge.com (William E. Davidsen Jr) (04/06/88)

  Although the size of int and pointers may be the same on SPARC, my
impression from limited testing is that the machine can't access
unaligned data, and that the compiler doesn't catch this.  Thus if I do
something dumb like:

	char m[5];
	a = foo(&m[0], &m[1]);
where:
	foo(x,y)
	  int *x, *y;

then at least one of the pointers will not be on a 4 byte boundary, and
therefore will not work as expected by people who use a VAX, 68000,
80x86, etc.  This was the reason for my caution about SPARC.

  If this is not the case, please correct me, as my access to a Sun4 was
limited, and other things might have caused the problem.
-- 
	bill davidsen		(wedu@ge-crd.arpa)
  {uunet | philabs | seismo}!steinmetz!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me

davidsen@steinmetz.steinmetz.ge.com (William E. Davidsen Jr) (04/06/88)

In article <1323@PT.CS.CMU.EDU> edw@IUS1.CS.CMU.EDU (Eddie Wyatt) writes:

|   I do believe the IEEE and VAX representations of float zero are bit
| patterns consisting of all zeros.  So what are these implementations
| that don't use a zero bit pattern?

  Honeywell 6000 and DPS series for sure (0.0 = 0400000000000), I believe
some IBM minis, and some Data General.  Like "pointer same as int," it
works on many machines but assumes a hardware dependency which may be
avoided by careful typing.
-- 
	bill davidsen		(wedu@ge-crd.arpa)
  {uunet | philabs | seismo}!steinmetz!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me

davidsen@steinmetz.steinmetz.ge.com (William E. Davidsen Jr) (04/06/88)

In article <432@goofy.megatest.UUCP> djones@megatest.UUCP (Dave Jones) writes:

| SPARC?  Really?  REALLY??   What have they done?  Please clarify.

  My understanding of SPARC, having read documentation and tried some
programs, is that SPARC requires that int, float, and long be aligned
rather than placed on an arbitrary boundary.  This means that just casting
a pointer to char or short to a pointer to {int long float double} will
not produce code which loads consecutive bytes starting at the first
character's address.

  If you might be doing something like building a raw data buffer, by:
	*(int *)bufptr = intval;	/* char *bufptr */
	bufptr += sizeof(int);
it is portable to any machine which doesn't require alignment, but for
true portability something like the following must be done:
	memcpy(bufptr, &intval, sizeof(int));
	bufptr += sizeof(int);

  I am not claiming or even implying that this is going to break any
large number of programs, just that SPARC has more restrictions on
addressing than the 68000 series.  I don't have dpANS here; there may be
a better routine to use than memcpy.

  It also means that structures are not packed the same way in some
cases, I assume, but I was not able to verify this.
-- 
	bill davidsen		(wedu@ge-crd.arpa)
  {uunet | philabs | seismo}!steinmetz!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me

davidsen@steinmetz.steinmetz.ge.com (William E. Davidsen Jr) (04/06/88)

In article <7792@alice.UUCP> ark@alice.UUCP writes:
}In article <10229@steinmetz.steinmetz.ge.com>, davidsen@steinmetz.UUCP writes:
}
}> People (usually ;>) have no problem with the idea that while assigning a
}> zero to a float gives it the value 0.0, in most implementations the
					      ^^^^
	Foot in mouth time... I think "some" would be more correct.
}> float value does not have all bits set to zero.
}
}I am overwhelmed by curiosity.  Can you give me examples of
}three machine architectures in which the floating-point value
}0 does not have all its bits set to zero?

No. I can tell you two series of systems which have non-zero 0.0
(Honeywell 6000 and DPS), two which I'm told have non-zero 0.0 (mid 70's
IBM minis and Data General). My reasoning is as follows:

  given:	there are machines which do have C and don't have
		float 0 expressed as all bits zero,

  and		the proposed C standard doesn't say that any machine
		running C must have float 0 be all bits zero,

  and		I have no reason to think that innovation and
		standards have stopped evolving,

  therefore:	I conclude that assuming float 0 to be all bits
		zero, or any other machine independent integer value,
		is non-portable, now and in the future.

  In light of the previous discussion, I was discussing "really
portable" as opposed to "portable to many machines."
-- 
	bill davidsen		(wedu@ge-crd.arpa)
  {uunet | philabs | seismo}!steinmetz!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me

sjs@spectral.ctt.bellcore.com (Stan Switzer) (04/07/88)

In reference to:
> > People (usually ;>) have no problem with the idea that while assigning a
> > zero to a float gives it the value 0.0, in most implementations the
> > float value does not have all bits set to zero.

And:
>>   I do believe the IEEE and VAX representations of float zero are bit
>> patterns consisting of all zeros.  So what are these implementations
>> that don't use a zero bit pattern?

The Honeywell 6000 Series and GE 625/635 machines represented NORMALIZED
float zero as o400000000000 (36 bits, msb ON).  This may date back to their
grand-uncle the 709.

BTW, all 0 bits WAS a float zero, but it wasn't normalized.  The results of
most float operations on unnormalized values were undefined.  I suspect that
additions and subtractions would lose precision in the process
of aligning the points.

hook@jvnca.csc.org (Ed Hook) (04/07/88)

In article <1323@PT.CS.CMU.EDU> edw@IUS1.CS.CMU.EDU (Eddie Wyatt) writes:
>Aside:
>
>> People (usually ;>) have no problem with the idea that while assigning a
>> zero to a float gives it the value 0.0, in most implementations the
>> float value does not have all bits set to zero.
>
>  I do believe the IEEE and VAX representations of float zero are bit
>patterns consisting of all zeros.  So what are these implementations that
>don't use a zero bit pattern?
>

   There's at least one architecture with this characteristic.  The Control
Data Cyber 205 (and its successor, the ETA^10) employs a 64-bit floating
point representation in which zero appears as

                    8000 0000 0000 0000 ;

in fact, any word whose most significant nybble is an '8' is = 0.0 in this
system - the example above is the flavor of 0.0 which is produced by any
computation whose result is zero (and, thus, is the most familiar example
of 0.0).

   I can't seem to find my copy of the IEEE Floating Point Standard & so can't
cite chapter & verse, BUT I don't believe that it requires a valid floating
point 0.0 to be represented by a bit pattern having all bits = 0. I say this
because it didn't appear to me that the 205's representation of floating point
quantities was in violation of the standard on this score, when I went through
the document recently. If I'm wrong in holding this opinion, I would appreciate
being set straight ...

Ed Hook
Control Data Corporation
John von Neumann National Supercomputer Center
Princeton, NJ

"If I said it, it's MY opinion ... at least for now ..."

pardo@june.cs.washington.edu (David Keppel) (04/09/88)

In article <6594@bellcore.bellcore.com> sjs@spectral.UUCP (Stan Switzer) writes:
>In reference to:
>> > People (usually ;>) have no problem with the idea that while assigning a
>> > zero to a float gives it the value 0.0, in most implementations the
>> > float value does not have all bits set to zero.
>
>And:
>>   I do believe the IEEE and VAX representations of float zero are bit
>> patterns consisting of all zeros.  So what are these implementations
>> that don't use a zero bit pattern?

[ some machines that use non-0's for 0.0 ]

According to my hardware book [DEC85] the F_floating and D_floating forms on
the VAX both represent 0.0 by having the sign and exponent bits <7:15> zero,
with any mantissa.  Similarly the G_floating type is 0.0 when <4:15> are zero
bits and the H_floating type 0.0 when <0:15> are zero bits.

Thus the following are perfectly valid F_floating representations for 0.0:

    00 00 00 00
    ff ff 00 7f
    43 21 00 12

And there are analogs for the other types.  This does mean (for the VAX)
that assigning at least 16 bits of zeroes (for F_floating and D_floating)
or 32 bits of zeroes (for G_floating and H_floating) gives 0.0, which I think
was the spirit of the original poster's point.  I would like to point out,
however, that it is very much *not* possible to compare bit patterns of a
float to *anything* in 8-, 16-, or 32-bit quantities and determine whether
the value is 0.0 *except* for the H_floating type (which is 16 bytes long).
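
In C terms: test floating values with a floating comparison, never a
bytewise one.  A sketch (bcmp() as on 4BSD):

	static char zeros[sizeof (float)];

	int is_zero(fp)
		float *fp;
	{
		/* Right: a real floating compare. */
		return (*fp == 0.0);
		/* Wrong on such machines -- a bytewise test can miss
		 * perfectly valid representations of 0.0:
		 * return (bcmp((char *)fp, zeros, sizeof (float)) == 0);
		 */
	}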

[DEC85] VAX ARCHITECTURE REFERENCE MANUAL (c) 1985 Digital Equipment
Corporation, Maynard Massachusetts.  Copied without permission.

	;-D on  ( VAX?  What's that? )  Pardo

news@ism780c.UUCP (News system) (04/09/88)

In article <6594@bellcore.bellcore.com> sjs@spectral.UUCP (Stan Switzer) writes:
>In reference to:
>The Honeywell 6000 Series and GE 625/635 machines represented NORMALIZED
>float zero as o400000000000 (36 bits, msb ON).  This may date back to its
>grand-uncle the 709.
>
>BTW, all 0 bits WAS a float zero, but it wasn't normalized.  The results of
>most float operations on unnormalized values was undefined.  I suspect that
>additions and subtractions would lose precision in the process
>of alligning the points.

Are you sure?  On the 709 all zero bits is the NORMALIZED zero.  Quoting from
the 709 reference manual on page 8 (I still have one): "A normal zero has no
bits in both the characteristic and the fraction."  The form o400000000000 is
the unnormalized form, and using this one is the way to get "funny" results.
My recollection is that the GE 635 (I no longer have that manual) did indeed
copy the 709.

     Marv Rubinstein -- Computer historian