[comp.lang.misc] C Pointers

hankd@pur-ee.UUCP (Hank Dietz) (03/22/90)

In article <5919@brazos.Rice.edu> preston@titan.rice.edu (Preston Briggs) writes:
>In article <5200048@m.cs.uiuc.edu> robison@m.cs.uiuc.edu writes:
>>Can anyone clue me as to the basis for pointer paranoia?
...
>Let's do some pointer arithmetic.
>First we need some declarations:
>
>	typedef struct {
>		int i;		/* say 4 bytes */
>		char c;
>	    } Node, *Nodes;
>
>	Nodes x, y;
>	int i, j;
>
>And then some code (assume with me that things are initialized):
>
>	x = y + i;
>
>This means assign the sum of y and i*5.  Not too bad, 1 shift and 3 adds.

What's so bad about that?  It is exactly equivalent to:

	x = &(y[i]);

In fact, except for the lack of structures, you do this in "Good Ol'
Fortran" (Look Ma -- No Pointers!) when you write code like:

	CALL POINTER( Y(I) )

	SUBROUTINE POINTER(X)
	... operations on X or even on X(J) ...

Why is this Ok when it happens at a grungy Fortran subprogram interface and
despicable when it happens within a nice clean C assignment syntax?  Pointer
arithmetic makes as much sense as array subscripting, because that's exactly
what it is...  except pointers are specified in a more self-consistent way
and, unfortunately, *EVERYBODY* is taught array subscripting first.
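
A minimal sketch of that equivalence, assuming the Node type from the quoted declarations. The function names (element_by_arithmetic, element_by_subscript, byte_offset) are made up for illustration and appear nowhere in the original posts:

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int i;
    char c;
} Node;

/* Address of element i, written with pointer arithmetic. */
Node *element_by_arithmetic(Node *base, size_t i)
{
    return base + i;
}

/* Address of element i, written with array subscripting. */
Node *element_by_subscript(Node *base, size_t i)
{
    return &base[i];
}

/* Distance in bytes from base to element i.  The compiler scales i
   by sizeof(Node) in both forms above, so this is i * sizeof(Node). */
size_t byte_offset(Node *base, size_t i)
{
    assert(base != NULL);
    return (size_t)((char *)(base + i) - (char *)base);
}
```

Both functions return the same address for any i within the array. The scale factor is whatever sizeof(Node) is on the target; many compilers pad this struct to 8 bytes rather than 5, in which case the multiply is a single shift.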

>Let's do some more:
>
>	j = x - y;
>
>Hmmm.  A subtract and a DIVISION by 5!?

True, but so what?  If x and y are pointers to objects within the same array
(as C requires) I find it perfectly natural to get the number of objects
between them as the result of subtraction.  If you don't like it, then don't
use it -- few people ever do...  after all, you can't do this using Fortran's
array syntax... so I guess it is expendable.  ;-)
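
As a rough sketch of what the subtraction computes, assuming both pointers refer to elements of the same array (the helper names are hypothetical, not from the thread):

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int i;
    char c;
} Node;

/* Number of Node elements from first to last, as C defines x - y. */
ptrdiff_t element_distance(const Node *first, const Node *last)
{
    return last - first;
}

/* The same value computed by hand: the byte difference divided by
   the element size.  This is the division hidden behind x - y. */
ptrdiff_t element_distance_by_hand(const Node *first, const Node *last)
{
    ptrdiff_t bytes = (const char *)last - (const char *)first;

    /* Pointers into the same array are always a whole number of
       elements apart. */
    assert(bytes % (ptrdiff_t)sizeof(Node) == 0);
    return bytes / (ptrdiff_t)sizeof(Node);
}
```

For an array a of Node, element_distance(&a[2], &a[7]) yields 5 whatever sizeof(Node) happens to be. The division is by a compile-time constant, so a compiler is free to turn it into a shift when that constant is a power of two.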

>This stuff is great!  Plus costs 4 times what it should and minus
>becomes worse than division.  *And* it makes my code obscure to
>myself and others, not to mention hard to optimize.  Such a deal.

If, using your compiler techniques, pointer arithmetic is harder to analyze
than "Good Ol' Fortran" array references, then you are not doing very good
analysis of Fortran.  The only aspects in which C pointers are harder to
analyze than Fortran array references are:

[1] C pointers can be multi-level, i.e., pointers-to-pointers.
[2] C supports recursion, hence pointers/subscript expressions can be
    more difficult to decipher.

I think we can agree that [2] is desirable despite this.  IMHO, [1] wouldn't
be needed if C would allow first-class declarations of parameterized types,
e.g., "int a[n];" where n is a variable...  but lacking that, even [1] isn't
very offensive to me.

>Preston Briggs				looking for the great leap forward
>preston@titan.rice.edu

Well, I usually agree with you, Preston, but I think your "Fortran is easy
to understand" bias is showing on this one.  ;-)

						-hankd@ecn.purdue.edu

jlg@lambda.UUCP (Jim Giles) (03/22/90)

From article <14823@pur-ee.UUCP>, by hankd@pur-ee.UUCP (Hank Dietz):
> [...]                 The only aspects in which C pointers are harder to
> analyze than Fortran array references are:
> 
> [1] C pointers can be multi-level, i.e., pointers-to-pointers.
> [2] C supports recursion, hence pointers/subscript expressions can be
>     more difficult to decipher.

And: [3] C pointers might be aliased while Fortran arrays are not.  This makes
a more significant difference to optimization than your two points combined.
This is why C still tends to be slower than Fortran on most scientific
applications.
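
A small sketch of the aliasing problem in [3]. The function add_scalar is invented for illustration and is not taken from any of the posts:

```c
#include <assert.h>
#include <stddef.h>

/* Adds *scale to each of the n elements of dst.
   Nothing tells the compiler that scale does not point into dst, so
   it must assume a store to dst[k] can change *scale and reload it
   on every iteration instead of keeping it in a register. */
void add_scalar(int *dst, size_t n, const int *scale)
{
    size_t k;

    assert(dst != NULL && scale != NULL);
    for (k = 0; k < n; k++)
        dst[k] += *scale;
}
```

Called with a separate scalar, {1, 2, 3} plus 1 gives {2, 3, 4}. Called as add_scalar(a, 3, &a[0]), the first store changes the value being added, and the result is {2, 4, 5}. A compiler that wants to hoist the load of *scale out of the loop has to rule out the second case first.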

Aside from this, I don't like pointers because they are too low level.
I don't use GOTOs if a language provides high-level control structures
that are adequate to my needs.  I don't use pointers for the same reason.
If my algorithm manipulates arrays, that's the appropriate data structure
to use - NOT pointers.  C, Pascal, and Modula all make the mistake of
not allowing procedures to take array arguments of arbitrary size.
C's pointers are an undesirable 'hack' to get around this constraint.
The cost is that ALL array args to procedures are turned into pointers
by the procedure interface - not at all desirable because of [3]
above!
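
To illustrate the interface being described, here is a sketch with hypothetical functions sum and sum_tail (neither appears in the thread):

```c
#include <assert.h>
#include <stddef.h>

/* The parameter is written as an array, but C adjusts it to
   "const int *v".  The length does not travel with the argument
   and has to be passed separately. */
long sum(const int v[], size_t n)
{
    long total = 0;
    size_t k;

    assert(v != NULL || n == 0);
    for (k = 0; k < n; k++)
        total += v[k];
    return total;
}

/* Since v is only a pointer, a caller can pass any sub-range of an
   array, here everything after the first skip elements. */
long sum_tail(const int *v, size_t n, size_t skip)
{
    assert(skip <= n);
    return sum(v + skip, n - skip);
}
```

With int a[5] = {1, 2, 3, 4, 5}, sum(a, 5) is 15 and sum_tail(a, 5, 2) is 12. Inside sum the compiler sees only a pointer and a count, which is why the aliasing question in [3] comes up for every array argument.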

J. Giles

preston@titan.rice.edu (Preston Briggs) (03/22/90)

In article <14823@pur-ee.UUCP> hankd@pur-ee.UUCP (Hank Dietz) writes:
>In article <5919@brazos.Rice.edu> preston@titan.rice.edu (Preston Briggs) writes:

>>	x = y + i;

>>This means assign the sum of y and i*5.  Not too bad, 1 shift and 3 adds.

								I mean 2 adds.

>>	j = x - y;

>>Hmmm.  A subtract and a DIVISION by 5!?

>True, but so what?

I think a lot of C pointer fiends write a lot of code deliberately
using pointer arithmetic and then say proudly:

	"Look, only adds and subtracts!  Must be fast."

I was trying to say that lots of pointer arithmetic is an ugly
way to achieve no extra efficiency, and in fact to occasionally
get bit badly.

I'm familiar with the bite; it's happened to me.
I had a pretty little set manipulation package I used in my
register allocator.  One day I looked at the assembly.  A call to the
divide routine!?  "What's all this then!  I don't do division.
Division is for dumb people."  After noticing what was happening
and massively rewriting chunks of the allocator, we got a factor of
three improvement in code generation time.

>The only aspects in which C pointers are harder to
>analyze than Fortran array references are:
>
>[1] C pointers can be multi-level, i.e., pointers-to-pointers.
>[2] C supports recursion, hence pointers/subscript expressions can be
>    more difficult to decipher.

I think C's pointers are used for too much.
If we want array-like behaviour, we should use arrays.
If we want interesting, dynamic structures, we should use
recursive data types.  C uses pointers for both.  I would
rather not use pointers at all.

>Well, I usually agree with you, Preston, but I think your "Fortran is easy
>to understand" bias is showing on this one.  ;-)
>						-hankd@ecn.purdue.edu

Ah, the "F word".  I hate fortran!  I don't use it.  I (almost) never
have.  However, I do like constructs that are easy to optimize. :-)

--
Preston Briggs				looking for the great leap forward
preston@titan.rice.edu

jamiller@hpcupt1.HP.COM (Jim Miller) (03/23/90)

>I believe the whole thing can be illustrated with one example.  I was 
>writing a client as a programming assignment just yesterday, and I ran into
>a strange bug.  A variable I thought was limited to 0..4 was very occasionally
>getting values in the 10,000's.
>
>P.S.  For the guy who wanted bug stories, here's mine.
>
>Randy

Here's one of mine.

A user's program was bombing, but when using the debugger, it went away.
So I, the debugger writer, was called in; it was clearly the language's fault.

The problem, it turned out, was in the one routine where the user had turned
checking off -- they claimed they NEVER turned checking off.

It took me 2 days to find the truth, oh well.

They had an off-by-1 problem with an array index.  The debugger, using the
stack, changed the garbage on the stack enough so that it ran fine when
run with the debugger. 

The language was BASIC.  Clearly a good reason to avoid BASIC.

An even better reason to not ever trust the customer to tell the truth, the
whole truth and nothing but the truth :-)

   jim - nothing but the facts - miller