[comp.lang.c] Is this a bug, or is it just me?

johng@ecrcvax.UUCP (John Gregor) (12/15/87)

I think I have found a problem.  I know that operations of the same 
precedence can be reordered by the compiler, but I wouldn't think that
all three auto-increments would be done after all the additions.  This
happens on both a vax (4.3bsd) and a Sun-3.

#include <stdio.h>

int sum = 0;
int a[3] = {1, 4, 16};
int i = 0;

main () {
    sum = a[i++] + a[i++] + a[i++];
    printf ("sum: %d      i:%d\n", sum, i);
}

Results in:
sum: 3      i:3

Here is part of the code produced by the sun:

	movl	_i,d0
	lea	_a,a0
	movl	_i,d1
	lea	_a,a1
	movl	a1@(0,d1:l:4),d1
	addl	a0@(0,d0:l:4),d1
	movl	_i,d0
	lea	_a,a0
	addl	a0@(0,d0:l:4),d1
	movl	d1,_sum
	addql	#0x1,_i		    /* Here are the auto-increments */
	addql	#0x1,_i
	addql	#0x1,_i

Jeez, if it has to behave this way, you'd at least think it would optimize
the 3 addql's into 1.  [0.5 * :-)]

Am I missing some obscure part of K&R?  (I've looked.)  Or is this truly a
bug?

			 John Gregor
		johng%ecrcvax.UUCP@germany.CSNET

trt@rti.UUCP (Thomas Truscott) (12/17/87)

In article <464@ecrcvax.UUCP>, johng@ecrcvax.UUCP (John Gregor) writes:
>     sum = a[i++] + a[i++] + a[i++];

Gems like this are posted to Usenet every month or so (it seems),
and the usual response is "Poor grasshopper, read K&R page 50
and stop bothering us".

I wish the responses would include something like
"Why don't C compilers flag blatant errors such as the above?"
If we all said bad things about compilers that permitted such code
I bet the vendors would fix them!  Maybe even AT&T.

And come to think of it, why doesn't someone fix the 4.3BSD compiler
to do this?  After all 4.3BSD "lint" finds it.
Just steal the code from lint! (And catch "i = i++;" too.)
	Tom Truscott

john@bc-cis.UUCP (John L. Wynstra) (12/18/87)

In article <464@ecrcvax.UUCP> johng@ecrcvax.UUCP (John Gregor) writes:
>
>int a[3] = {1, 4, 16};
>int i = 0;
>
>main () {
>    sum = a[i++] + a[i++] + a[i++];

	The C compiler is free to implement this as it chooses.  It is not
a precedence/associativity question.

	K&R discusses this using the example "a[i] = i++;".  Which element
is assigned to, a[i] or a[i+1]?  Answer: it depends on the compiler.

	The correct way to do what you want is,

sum = a[i] + a[i+1] + a[i+2];
i += 3;

>
>Am I missing some obscure part of K&R?  (I've looked.)  Or is this truly a
>bug?
>
	No bug.  Just a confusion on your part between what operator precedence
and associativity mean and the runtime order of evaluation.  This is a
currently popular one in comp.lang.c.  An interesting note is that the
comma operator *does* guarantee the order of evaluation, left to right.

	So, if you think it just wouldn't be proper C if everything weren't
all on one line,

sum = a[i] + a[i+1] + a[i+2], i += 3;

	Operator precedence and associativity (within one level of precedence)
determine only the syntactic structure of the code.  I think of it as a
translation of the code into a generalized tree structure.  (Of course, an
actual tree need not be generated; there are other forms of intermediate
structure, and compilers need not conform to this ideal, but this is my
mental image.)

	What you then have is the syntactic structure (terminology borrowed from
linguistics) of the sentence (code fragment), not its semantic meaning.  That
is, we can `parse' the sentence [image taken from English 101: we can diagram
the sentence out], but we don't yet know what it all means; *that's* the
semantic analysis part of the compilation.

	The point is that the compiler is free to choose what is meant by such a
construction as "i=0, a[i++] + a[i++]": is it "a[0] + a[1]", or "a[0] + a[0]",
or even "a[1] + a[1]"?  The choice would seem to be up to the particular way
the code generation phase of the compiler works.  Or maybe it's the type of
compiler itself, e.g., LL vs. LR (or top-down vs. bottom-up)?  Eh!  I give up.
My humble knowledge of compilers is exceeded.  Who knows?

	The meaning should be unambiguous.  A recommended style of one
side-effect per statement (or per comma-separated expression) seems to
be in order.

--john
-- 
	John L. Wynstra
	US mail:	Apt. 9G, 43-10 Kissena Blvd., Flushing, N.Y., 11355
	UUCP:		john@bc-cis.UUCP { eg, rutgers!cmcl2!phri!bc-cis!john }

karl@haddock.ISC.COM (Karl Heuer) (12/19/87)

In article <1923@rti.UUCP> trt@rti.UUCP (Thomas Truscott) writes:
>"Why don't C compilers flag blatant errors such as the above?"

I imagine that at least some of the historical reasons for separation of cc
and lint are still valid.

noise@eneevax.UUCP (Johnson Noise) (12/21/87)

In article <1923@rti.UUCP> trt@rti.UUCP (Thomas Truscott) writes:
>In article <464@ecrcvax.UUCP>, johng@ecrcvax.UUCP (John Gregor) writes:
>>     sum = a[i++] + a[i++] + a[i++];
>
>Gems like this are posted to Usenet every month or so (it seems),
>and the usual response is "Poor grasshopper, read K&R page 50
>and stop bothering us".
>
>I wish the responses would include something like
>"Why don't C compilers flag blatant errors such as the above?"
>If we all said bad things about compilers that permitted such code
>I bet the vendors would fix them!  Maybe even AT&T.
>
>And come to think of it, why doesn't someone fix the 4.3BSD compiler
>to do this?  After all 4.3BSD "lint" finds it.
>Just steal the code from lint! (And catch "i = i++;" too.)
>	Tom Truscott

	There is a very good reason why this and other checks caught
by lint are not caught by cc: design philosophy.  Lint is a C program
checker and cc is a C compiler.  The two functions are separated into
different utilities so that cc can be made to run fast.  According to
the UNIX programmer's manual, lint is to be used after a program is
written, compiled and tested in order to find non-portable and potentially
dangerous constructs in the source deck.  This design philosophy is
meant to be consistent with the UNIX operating system as a whole.
	In addition, the C programming language is designed and implemented
with the assumption that the programmer is competent and has an
intimate understanding of the subtleties involved.  This is done to allow
greater flexibility and performance with respect to compilation and
code generation, i.e. C is a real language for real men.
	Other languages, such as Pascal, Modula 2 and Ada, are not designed
with this philosophy in mind and therefore do not yield the type
of performance that C enjoys.

rbp@investor.UUCP (Bob Peirce) (12/23/87)

> I think I have found a problem.  I know that operations of the same 
> precedence can be reordered by the compiler, but I wouldn't think that
> all three auto-increments would be done after all the additions.  This
> happens on both a vax (4.3bsd) and a Sun-3.
> 
> #include <stdio.h>
> 
> int sum = 0;
> int a[3] = {1, 4, 16};
> int i = 0;
> 
> main () {
>     sum = a[i++] + a[i++] + a[i++];
>     printf ("sum: %d      i:%d\n", sum, i);
> }
> 
> Results in:
> sum: 3      i:3
> 
> 			 John Gregor
> 		johng%ecrcvax.UUCP@germany.CSNET

I get "sum: 21    i:3" on an Altos 3068 (M68020).
-- 
Bob Peirce, Pittsburgh, PA				 412-471-5320
uucp: ...!{allegra, bellcore, cadre, idis, psuvax1}!pitt!investor!rbp
	    NOTE:  Mail must be < 30K  bytes/message