[net.lang.c] sizeof ptrs,lint,etc

cottrell@nbs-vms.ARPA (02/06/85)

/*
As promised, here are more irrational viewpoints:
> > i DO believe that sizeof(foo *) should be sizeof(bar *). otherwise it's
> > just too confusing. more irrational viewpoints later.
> 
> It's too confusing only if you're easily confused.  Cast your d*mn pointers
> properly (like "lint" tells you to do) and lots of problems go away.
> If you absolutely *need* a pointer that can point to any one of ten
> things, and a union of ten pointer types won't do, use "char *" (or "void *"
> if/when it appears in the ANSI standard).
> 
> 	Guy Harris
> 	{seismo,ihnp4,allegra}!rlgvax!guy

Oh my brain hurts! (Is Guy out to get me or am I just paranoid? :-)
I hate lint. All it ever does is complain about code that I know works.
I don't like casting funxions to (void). I don't like casting arguments
to funxions. I don't like /*NOTREACHED*/. I do `if (exp) return exp;'
to avoid the braces when I really mean `if (exp) { exp; return;}'
I don't declare args as ptrs if I merely pass them on to another funxion.
Even the UNIX REVIEW they gave away at Uniforum says that void just makes
programs harder to read. What I do with lint is sweep it under the rug.

Chris Torek writes:
> NO! NO! and NO!
> 
> [please turn your volume control way up]  PASSING AN UNCASTED ZERO
> TO A ROUTINE THAT EXPECTS A POINTER IS NOT PORTABLE, AND IS JUST PLAIN
> WRONG.  GET THAT STRAIGHT *NOW*!
> 
> [you can turn your volume control back down]
> 
> The following code is NOT portable and probably fails on half the
> existing implementations of C:
> 
> 	#define NULL 0		/* this from <stdio.h> */
> 
> 	f() {
> 		g(NULL);
> 	}
> 
> 	g(p) int *p; {
> 		if (p == NULL)
> 			do_A();
> 		else
> 			do_B();
> 	}
> 
> The value ``f'' passes to ``g'' is the integer zero.  What that
> represents inside g is completely undefined.  It is not the nil
> pointer, unless your compiler just happens to work that way (not
> uncommon but not universal).  It may not even be the same size (in
> bits or bytes or decidigits or whatever your hardware uses).
> 
> One tiny little simple change fixes it:
> 
> 	f() {
> 		g((int *)NULL);
> 	}
> 
> It is now portable, and all that good stuff.  You can write the
> first and hope real hard, or you can write the second and know.
> 
> The point is that the zero value and the nil pointer are two completely
> different things, and the compiler happens to be obliged to convert the
> former to the latter in expressions where this is forced (e.g., casts
> or comparison with another pointer).  It is NOT forced in function
> calls (though under the ANSI standard it would be in some cases).
> (I claim that it IS forced in expressions such as if (p) where p is a
> pointer; this is "if (p != 0)" where type-matching p and 0 forces the
> conversion.)
> 
> (Now I WILL agree that if you have the option of making the nil pointer
> and the zero bit pattern the same, then you will have less trouble with
> existing programs if you do....)
> -- 
> (This line accidently left nonblank.)
> 
> In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690)
> UUCP:	{seismo,allegra,brl-bmd}!umcp-cs!chris
> CSNet:	chris@umcp-cs		ARPA:	chris@maryland

While I hate to disagree with such a Wizard as C.T. (I can just see my
brother yelling across uom "no he doesn't, Chris, he LIKES to argue...")
LET'S GET BACK TO BASICS!!! What have they done to the poor C language?
It used to be quite clean. It was originally developed on a pdp-11, ported
to an Interdata 8/32, Honeywell 6000, & IBM 370. On each of these machines,
	either sizeof(int) = sizeof(int *) = sizeof(??? *)
	or     sizeof(long)= sizeof(long*) = sizeof(??? *).
Of these, the h6000 is not byte addressable & so the bizarre pointer
format of K&R page 211 is used. Note that a pointer is always 36 bits
even tho half may be unused. So far, so good. Now somewhere along the line
someone broke the rule & decided that maybe pointers to different objects
should be different lengths. The pros? Possible storage savings. The cons?
Now I have to cast pointers. The universe is out of kilter! What about
pointers to pointers? Is sizeof(int **) = sizeof(char **)? We all want
C & UNIX to run everywhere, but let's not bend over backwards to
accommodate weird architectures. If space is sometimes wasted on a
weird machine, it is for conceptual simplicity. When & if a prog is
ported to a bizarre machine it will probably have to be tinkered with
anyway. ALL THINGS IN MODERATION, INCLUDING PORTABILITY.
   The nil pointer in C *IS* a bit pattern of some size all zeros. This
is not lisp. If you want to generate a cell called `nil' & explicitly
compare to `(??? *) &nil' be my guest. The syntax `if (p)' or `if (!p)'
suits me just fine.
*/

ron@brl-tgr.ARPA (Ron Natalie <ron>) (02/06/85)

> While I hate to disagree with such a Wizard as C.T. (I can just see my
> brother yelling across uom "no he doesn't, Chris, he LIKES to argue...)
> LET'S GET BACK TO BASICS!!! What have they done to the poor C language?
> It used to be quite clean. It was originally developed on a pdp-11, ported
> to an Interdata 8/32, Honeywell 6000, & IBM 370. On each of these machines,
> 	either sizeof(int) = sizeof(int *) = sizeof(??? *)
> 	or     sizeof(long)= sizeof(long*) = sizeof(??? *).
Yes, and on machines where only the second case is true, the example that
Chris gives will fail.  This happens on machines that have perfectly
reasonable architectures, but larger word sizes.  On the Denelcor HEP,
for instance, the word size is 64 bits and that is the size of int and
long.  64 bits is massive overkill for a pointer, so 32 bits is used, and
guess what?  Shorts are only 16 bits, so there is no integer type that
corresponds to the size of a pointer.  And, as on many other machines, long
pointers, short pointers, and char pointers all have different
representations (albeit they do have the same size).
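
A minimal sketch of checking these sizes instead of assuming them.  The
helper names int_matches_pointer_size and report_sizes are made up for
illustration, and what they report depends entirely on the implementation
the code is compiled for.

```c
#include <stdio.h>

/* Hypothetical helper: 1 if int and int * have the same size on this
 * implementation, 0 otherwise.  Equal size says nothing about whether
 * the two share a representation. */
int int_matches_pointer_size(void)
{
    return sizeof(int) == sizeof(int *);
}

/* Hypothetical helper: print the sizes under discussion.  The casts to
 * unsigned long keep the printf format valid on any implementation. */
void report_sizes(void)
{
    printf("sizeof(short)  = %lu\n", (unsigned long)sizeof(short));
    printf("sizeof(int)    = %lu\n", (unsigned long)sizeof(int));
    printf("sizeof(long)   = %lu\n", (unsigned long)sizeof(long));
    printf("sizeof(int *)  = %lu\n", (unsigned long)sizeof(int *));
    printf("sizeof(char *) = %lu\n", (unsigned long)sizeof(char *));
}
```

Calling report_sizes() on two different machines is a quick way to see
which of the assumed equalities actually hold on each.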

> We all want
> C & UNIX to run everywhere, but let's not bend over backwards to
> accommodate weird architectures.

If by using LINT and careful program coding you can achieve portability,
why not?  The whole point of C is to give the USER a chance at portability,
not to prohibit nor require it (this was one of the policy statements of
the ANSI C people).  The whole point of making programs portable is to avoid
problem areas, like blind assumptions about type sizes, rather than to
argue that certain machines should not be allowed to exist.

> If space is sometimes wasted on a
> weird machine, it is for conceptual simplicity. When & if a prog is
> ported to a bizarre machine it will probably have to be tinkered with
> anyway. ALL THINGS IN MODERATION, INCLUDING PORTABILITY.

So, it's up to you.  Go ahead and write non-portable code if you don't
want to have to worry about making it run on "weird" machines.   I don't
think we should tell the C implementers on these machines to do purposefully
inefficient implementations of C.  I want a reasonable compiler on these
machines, because I don't want them to give up on C as the systems programming
language.

>    The nil pointer in C *IS* a bit pattern of some size all zeros. This
> is not lisp. 

Yes, this is not lisp.  Sorry, but once again the nil pointer *IS NOT*
guaranteed to be of some size and all zeros.  It's not guaranteed to
have a size at all.  There is no nil pointer.  All you are guaranteed is
that you can assign zero to a pointer and compare a pointer to zero
reliably, and that whatever it maps zero to when it stores it will
never look like a valid pointer.  We could just as easily map "0" to -1
in a PDP-11 int pointer, since they are guaranteed to not be valid when
odd.  Some systems even have bits outside the word that indicate pointer
validity; those could be used.
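
A small sketch of the distinction, assuming a hosted C compiler.  The
helper names is_null and null_is_all_zero_bits are hypothetical.  The
first relies only on what the language guarantees; the second inspects
the stored bytes and merely reports what this particular implementation
happens to do.

```c
#include <string.h>

/* 1 if p is a null pointer.  Comparison against the constant 0 works
 * whatever bit pattern the implementation stores for a null pointer. */
int is_null(const void *p)
{
    return p == 0;
}

/* 1 if this implementation happens to store a null int pointer as
 * all-zero bytes, 0 otherwise.  Common, but not promised. */
int null_is_all_zero_bits(void)
{
    int *p = 0;                     /* assigning 0 yields a null pointer */
    unsigned char zeros[sizeof p];

    memset(zeros, 0, sizeof zeros);
    return memcmp(&p, zeros, sizeof p) == 0;
}
```

Portable code can depend on is_null(); anything that depends on the
answer from null_is_all_zero_bits() is tied to one implementation.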


IN CONCLUSION, one of the really nice things about C (which you are
also noticing), is that it allows you to stand on your head and be
portable or blatantly ignore the consequences when you "know" what is
going to happen.  This flexibility can be seen in all great works
such as the Constitution of the United States or Army Regulation
380-380.

-Ron

gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (02/06/85)

> I hate lint. All it ever does is complain about code that I know works.
> I don't like casting funxions to (void). I don't like casting arguments
> to funxions. I don't like /*NOTREACHED*/. I do `if (exp) return exp;'
> to avoid the braces when I really mean `if (exp) { exp; return;}'
> I don't declare args as ptrs if I merely pass them on to another funxion.
> Even the UNIX REVIEW they gave away at Uniforum says that void just makes
> programs harder to read. What I do with lint is sweep it under the rug.

If "lint" (assuming a modern version such as the one I use) complains
about your code, then if it works it is most likely an ACCIDENT of the
particular C implementation you are working with and would not work on
some other system with considerably different machine architecture.

Casting function return values to (void) not only documents the fact
that the function returns a value which you are discarding, but if
done as a natural action and not as a blind response to "lint" warnings
it also shows that you have considered the return value and have made
the decision that it is not needed from that invocation of the function.
Far too much C code fails to test for failure of functions such as
write() or even malloc().  Not considering what should be done in such
cases is pure sloppiness.
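
A minimal sketch of both habits, using only standard library calls.  The
helper names copy_string and note_progress are invented for illustration,
and fputs() stands in for write() so that the example stays within
standard C.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a heap copy of s, or NULL if malloc()
 * fails.  The failure case is handled rather than ignored. */
char *copy_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);

    if (copy == NULL)
        return NULL;
    memcpy(copy, s, n);
    return copy;
}

/* Hypothetical helper: emit a progress message.  The (void) cast
 * records a deliberate decision not to act on a failed fputs(). */
void note_progress(const char *msg)
{
    (void) fputs(msg, stderr);
}
```

The caller of copy_string() still has to test for NULL and free() the
result; the cast in note_progress() is the only place a return value is
thrown away, and it says so.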

Believe it or not, C strongly supports data typing; many of us in the
world of production software think that this is a Good Thing.  If you
need to pass a (struct foo *) to a function, then giving it an (int) or
a (struct bar *) is simply wrong.  If you write your code cleanly, you
should seldom need to cast function arguments.

/*NOTREACHED*/ should be unnecessary in many cases, were "lint" a bit
smarter.  However, how is "lint" to know that abort() and exit() never
return?  Presumably some other mechanism could be figured out to handle
these situations.  However, /*NOTREACHED*/, /*ARGSUSED*/, /*VARARGS2*/,
and other such pragmas add some useful documentation to the source code.

There is no rational defense for "return exp" when that is not what you
mean for your code to be saying to the reader.  (The same applies to
using 3 XORs to swap the contents of variables under normal circumstances.)

Undeclared data default to int, which for very sound reasons on many
architectures may not be handled the same as various pointer types.
By not declaring your pointers, you guarantee that your code will not
port to some machines.
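
A short sketch of the alternative, written with prototypes as in current
C.  The function names name_length and forward_name are hypothetical.
The second function does nothing but pass its argument along, yet it
still declares that argument as a pointer.

```c
#include <stddef.h>
#include <string.h>

/* Does the real work on the string. */
size_t name_length(const char *name)
{
    return strlen(name);
}

/* Only forwards its argument, but declares it with its real type so
 * the compiler (or lint) can check every caller. */
size_t forward_name(const char *name)
{
    return name_length(name);
}
```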

The quote from UNIX REVIEW is in Bill Tuthill's "C Advisor" column
and is his opinion.  I have heard others claim that (void) correctly
used makes code EASIER to read, or at least to maintain.

There should be no lint to sweep under the rug, if you have done your
coding properly.  If you find yourself sweeping lint under the rug,
then you haven't understood what strong typing and "lint" are all about.

> LET'S GET BACK TO BASICS!!! What have they done to the poor C language?
> It used to be quite clean. It was originally developed on a pdp-11, ported
> to an Interdata 8/32, Honeywell 6000, & IBM 370. On each of these machines,
> 	either sizeof(int) = sizeof(int *) = sizeof(??? *)
> 	or     sizeof(long)= sizeof(long*) = sizeof(??? *).
> Of these, the h6000 is not byte addressable & so the bizarre pointer
> format of K&R page 211 is used. Note that a pointer is always 36 bits
> even tho half may be unused. So far, so good. Now somewhere along the line
> someone broke the rule & decided that maybe pointers to different objects
> should be different lengths. The pros? Possible storage savings. The cons?
> Now I have to cast pointers. The universe is out of kilter! What about
> pointers to pointers? Is sizeof(int **) = sizeof(char **)? We all want
> C & UNIX to run everywhere, but let's not bend over backwards to
> accommodate weird architectures. If space is sometimes wasted on a
> weird machine, it is for conceptual simplicity. When & if a prog is
> ported to a bizarre machine it will probably have to be tinkered with
> anyway. ALL THINGS IN MODERATION, INCLUDING PORTABILITY.

You DON'T have to cast pointers if you use them right (with a few
exceptions, such as the "generic" aligned pointers returned from malloc()).

Who CARES whether sizeof(int **) == sizeof(char **)?  It doesn't
matter in correctly-written code.  (The only real exception is taken
care of by the varargs mechanism, which is clearly necessary to cope
with this on real machines.)
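
A sketch of that exception, using <stdarg.h>, the ANSI C successor to the
varargs mechanism named above (a substitution made here, not something
the post specifies).  The function total_length is hypothetical.  Because
nothing declares the types of the variable arguments, the terminating
null pointer has to be written with an explicit cast.

```c
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: total length of a list of strings.  The list
 * ends with a null char pointer, which callers must pass as (char *)0
 * so that it has the type va_arg() is told to fetch. */
size_t total_length(const char *first, ...)
{
    va_list ap;
    const char *s;
    size_t total = 0;

    va_start(ap, first);
    for (s = first; s != NULL; s = va_arg(ap, char *))
        total += strlen(s);
    va_end(ap);
    return total;
}
```

A call looks like total_length("ab", "cde", (char *)0).  Writing a bare 0
as the last argument would pass an int, which is exactly the mismatch the
thread is arguing about.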

The distinction among different data types should ALWAYS matter to
a conscientious programmer, even when the machine is forgiving.  If
your data types do not match, you have written incorrect code!  Now
that there really are architectures where effective C implementations
have to enforce the distinctions, there is additional reason for taking
care with this matter.

I consider it a BUG, and a sign of insufficient care taken in crafting
the original code, if the vast majority of one's C code (other than
obviously system-specific functions) does not port to another quite
different architecture UNCHANGED.  Of course, for this to work really
well the code has to be targeted at a carefully-specified environment
such as that described in the forthcoming ANSI C standard, the /usr/group
1984 standard, or the System V Interface Definition (all of which are
very much inter-compatible, thankfully).

>    The nil pointer in C *IS* a bit pattern of some size all zeros. This
> is not lisp. If you want to generate a cell called `nil' & explicitly
> compare to `(??? *) &nil' be my guest. The syntax `if (p)' or `if (!p)'
> suits me just fine.

Sorry, the null pointer is not necessarily a 0 bit pattern.  We
discussed this at length a few months ago.  There can be no such thing
as a "generic" null pointer on all reasonably interesting architectures.


In summary, strong data typing is not a conspiracy.  It is required
in order to be able to develop correct, reasonably portable,
applications in C.  I and others I know of have no trouble using data
types in accordance with the current (and forthcoming) rules.  Indeed,
we often find that by being careful in this regard, we can use "lint"
to catch errors that otherwise would go unnoticed (and that may even
work on the system upon which the code is first developed).  As
professionals we cannot afford to indulge in sloppy workmanship at the
expense of long-term maintenance costs and loss of product reliability.
As a dedicated amateur hacker, you clearly do not care about such
issues, so why not just be quiet, do your own thing, and let the rest
of us write our code in peace (knowing that we won't have much trouble
from it in the future)?

guy@rlgvax.UUCP (Guy Harris) (02/06/85)

> I hate lint. All it ever does is complain about code that I know works.

Correction: all it ever does is complain about code that has worked so
far on the machines you've dealt with.

> I don't like casting funxions to (void). I don't like casting arguments
> to funxions. I don't like /*NOTREACHED*/. I do `if (exp) return exp;'
> to avoid the braces when I really mean `if (exp) { exp; return;}'
> I don't declare args as ptrs if I merely pass them on to another funxion.
> Even the UNIX REVIEW they gave away at Uniforum says that void just makes
> programs harder to read. What I do with lint is sweep it under the rug.

Fine.  Just don't go bitching to the manufacturer of a particular machine
if your code doesn't work on their machine.  I can thank "lint" for having
caught quite a number of *logic* errors in code which manifested themselves
as "lint" errors.  Lots of people have no trouble getting code to pass
"lint"; their code is more trustworthy.

>    The nil pointer in C *IS* a bit pattern of some size all zeros.

*******t.  Read K&R.  It makes no such categorical statement.  There is
no other authority who can make such a statement.

> This is not lisp.

Nor is it BCPL.  If you want BCPL, you know where to find it.

> If you want to generate a cell called `nil' & explicitly compare to
> `(??? *) &nil' be my guest. The syntax `if (p)' or `if (!p)' suits me
> just fine.

Bogus.  The syntax

	if (p)

is equivalent to

	if (p != 0)

which, if you read K&R, compares "p" with the bit pattern which represents
a null pointer.
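
A minimal sketch of how pointer tests in a condition relate to explicit
comparisons with 0.  The helper names is_set and is_unset are made up for
illustration.

```c
#include <stddef.h>

/* 1 when p is non-null: `if (p)` is shorthand for `if (p != 0)`. */
int is_set(const int *p)
{
    if (p)
        return 1;
    return 0;
}

/* 1 when p is null: `if (!p)` is shorthand for `if (p == 0)`. */
int is_unset(const int *p)
{
    if (!p)
        return 1;
    return 0;
}
```

In both helpers the comparison is against a null pointer of the right
type, so neither depends on how that null pointer is stored.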

	Guy Harris
	{seismo,ihnp4,allegra}!rlgvax!guy