[comp.lang.c] Does your compiler get this program right?

lvc@cbnews.ATT.COM (Lawrence V. Cipriani) (11/23/88)

A friend of mine found a bug in his C compiler.  He found
the bug on a VAX; it also exists in some 3B compilers.  The
compiler on this SVR3 3b2 does it right and is shown at the
end.  The bug was that the increment to the float pointer
was happening twice; see the /* miscompiled line */ below.
I suspect other pcc derived compilers get it wrong too.

Larry Cipriani, AT&T Network Systems, Columbus OH,
Path: att!cbnews!lvc    Domain: lvc@cbnews.ATT.COM


----------------------fbug.c-----------------------

float list1[30], list2[10];

main()
{
	int i;
	float *f, *g;
	
	for (i = 0; i < 10; ++i) {
		list1[i] = (float) i;
		list2[i] = (float) 2 * i;
	}
	printf("List 1 is: ");
	pr_list(list1,10);
	printf("List 2 is: ");
	pr_list(list2,10);
	f = list1;
	g = list2;
	for (i = 0; i < 10; ++i) {
		*f++ += *g++;		/* miscompiled line */
	}
	printf("List 1 is: ");
	pr_list(list1,10);
	printf("List 2 is: ");
	pr_list(list2,10);
}

pr_list(list,num)
float list[];
int num;
{
	int i;

	for (i = 0; i < num; ++i) {
		if (i % 5 == 0)
			printf("\n");
		printf("%7.2f\t",list[i]);
	}
	printf("\n\n");
}

-------------------fbug.out------------------
List 1 is: 
   0.00	   1.00	   2.00	   3.00	   4.00	
   5.00	   6.00	   7.00	   8.00	   9.00	

List 2 is: 
   0.00	   2.00	   4.00	   6.00	   8.00	
  10.00	  12.00	  14.00	  16.00	  18.00	

List 1 is: 
   0.00	   3.00	   6.00	   9.00	  12.00	
  15.00	  18.00	  21.00	  24.00	  27.00	

List 2 is: 
   0.00	   2.00	   4.00	   6.00	   8.00	
  10.00	  12.00	  14.00	  16.00	  18.00

meyering@cs.utexas.edu (Jim Meyering) (11/24/88)

In article <2298@cbnews.ATT.COM> lvc@cbnews.ATT.COM (Lawrence V. Cipriani) writes:
	>A friend of mine found a bug in his C compiler.  He found

It's not a bug.

	[...deleted commentary, code]
	>*f++ += *g++;		/* miscompiled line */

The standard does not specify the order of evaluation for such
statements.  It's easier to see the ambiguity if you try to rewrite
it without the += notation.  Which do you choose?

 1) *f++ = *f++ + *g++;
 2) *f++ = *f + *g++;
 3) *f = *f++ + *g++;

It can't be (1) since the side effect, f++, may be realized only once,
but it's up to the compiler writer to choose between (2) and (3).

You might be interested to know that while the Sun3/os3.2
(or an HP, don't remember which) C compiler produced code
that gave the "correct" results for your code, when I replaced
that statement by the two:

	*f += *g++;      or      *f = *f + *g++;
	f++;                     f++;

I found that the size of the object code was actually reduced.
Chalk one up for readability *and* efficiency.

tim@crackle.amd.com (Tim Olson) (11/24/88)

In article <4082@cs.utexas.edu> meyering@cs.utexas.edu (Jim Meyering) writes:
| In article <2298@cbnews.ATT.COM> lvc@cbnews.ATT.COM (Lawrence V. Cipriani) writes:
| 	>A friend of mine found a bug in his C compiler.  He found
| 
| It's not a bug.
| 
| 	[...deleted commentary, code]
| 	>*f++ += *g++;		/* miscompiled line */
| 
| The standard does not specify the order of evaluation for such
| statements.  It's easier to see the ambiguity if you try to rewrite
| it without the += notation.  Which do you choose?
| 
|  1) *f++ = *f++ + *g++;
|  2) *f++ = *f + *g++;
|  3) *f = *f++ + *g++;
| 
| It can't be (1) since the side effect, f++, may be realized only once,
| but it's up to the compiler writer to choose between (2) and (3).

Your explanation doesn't address the bug that Mr. Cipriani points out
(which is that the side effect occurred twice in some compilers), and I
don't think the "ambiguity" you refer to exists.  The semantics of
compound operators require that the lvalue (*f++ in this case) be
evaluated only once, with the result of that one evaluation being used on
both sides of the assignment.  The result of a postfix ++ operator is
the value of the operand (which is the result used on both sides of the
assignment). After the result is obtained, the operand is incremented. 
Therefore, the expression should behave like:

	*f = *f + *g++; f++;

	-- Tim Olson
	Advanced Micro Devices
	(tim@crackle.amd.com)

chris@mimsy.UUCP (Chris Torek) (11/24/88)

>In article <2298@cbnews.ATT.COM> lvc@cbnews.ATT.COM
>(Lawrence V. Cipriani) writes:
>>A friend of mine found a bug in his C compiler. ...

[f and g are both pointers to float]
>>*f++ += *g++;		/* miscompiled line */
[f winds up being incremented twice]

In article <4082@cs.utexas.edu> meyering@cs.utexas.edu (Jim Meyering) writes:
>It's not a bug.

Ah, but it is.

>The standard does not specify the order of evaluation for such
>statements.

Please read the standard (which standard, anyway?) before making such
statements.

The operation `a += b' is semantically equivalent to the operation `a =
a + b', with the exception that the left hand side is evaluated exactly
once (rather than exactly twice).  Evaluating `*f++' once increments
`f' by one.  The lvalue produced by this evaluation (i.e., the original
*f) is converted to an rvalue, has the right hand side added, and the
result is stored in that same lvalue (i.e., the original *f).  At some
time during this process, f is incremented.  The increment is to have
finished by the next sequence point (dpANS).

>that gave the "correct" results for your code, when I replaced
>that statement by [*f += *g++; f++;]
>I found that the size of the object code was actually reduced.
>Chalk one up for readability *and* efficiency.

That merely indicates that the compiler used is not clever enough.
Unless `f' is declared volatile, the two statements are equivalent.
It is not clear to me that the latter version has any advantage in
readability.

(Incidentally, the bug relates to PCC's practise of assuming that
x op= y can be done with one instruction, which is false for
straightforward implementations of floating op= operators.  This
assumption is in a machine-dependent part of the compiler, where it can
be circumvented if necessary.  The 4.3BSD-tahoe pcc gets it right, even
to the extent of using `addf2 (rF)+,(rG)+' if the pointers are in
registers and the compiler is invoked with the `-f' [allow single
precision floating point] flag.)
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

cik@l.cc.purdue.edu (Herman Rubin) (11/24/88)

In article <4082@cs.utexas.edu>, meyering@cs.utexas.edu (Jim Meyering) writes:
> In article <2298@cbnews.ATT.COM> lvc@cbnews.ATT.COM (Lawrence V. Cipriani) writes:
> 	>A friend of mine found a bug in his C compiler.  He found
> 
> It's not a bug.

		[Much deleted]

> You might be interested to know that while the Sun3/os3.2
> (or an HP, don't remember which) C compiler produced code
> that gave the "correct" results for your code, when I replaced
> that statement by the two:
> 
> 	*f += *g++;      or      *f = *f + *g++;
> 	f++;                     f++;
> 
> I found that the size of the object code was actually reduced.
> Chalk one up for readability *and* efficiency.

On a machine for which the ++ notation is hardware and on which there
is memory-memory addition, this should be done by ONE operation, not
even two, such as

	ADDF2	(rG)+,(rF)+

where rF is the register holding the address f, and rG the one holding g.
Different architectures will need different numbers of instructions.  Is there a compiler for
the VAX which could make the optimization above?  I would expect any 
competent human programmer to do it.


-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet, UUCP)

maart@cs.vu.nl (Maarten Litmaath) (11/25/88)

In article <4082@cs.utexas.edu> meyering@cs.utexas.edu (Jim Meyering) writes:
\In article <2298@cbnews.ATT.COM> lvc@cbnews.ATT.COM (Lawrence V. Cipriani) writes:
\	>A friend of mine found a bug in his C compiler.  He found
\
\It's not a bug.

It IS a bug!

\	[...deleted commentary, code]
\	>*f++ += *g++;		/* miscompiled line */
\
\The standard does not specify the order of evaluation for such
\statements.

There is NO problem concerning evaluation order! Remember: `a += b' means
`a = a + b, BUT evaluate a only ONCE!'

\...
\	*f += *g++;      or      *f = *f + *g++;
\	f++;                     f++;
\
\I found that the size of the object code was actually reduced.
\Chalk one up for readability *and* efficiency.

Readability? Hahaha! No, really :-(
`*f++ += *g++' is perfectly readable and valid C. It's constructs like this
one that make C as powerful as it is. Are you a Pascal or Modula freak?
Consider `LongVariableName = LongVariableName + 3': do you find this gem
more readable? More typable?
If there are side effects, you cannot even use the `Pascal construct'.
-- 
fcntl(fd, F_SETFL, FNDELAY):          |Maarten Litmaath @ VU Amsterdam:
      let's go weepin' in the corner! |maart@cs.vu.nl, mcvax!botter!maart

dg@lakart.UUCP (David Goodenough) (11/25/88)

From article <2298@cbnews.ATT.COM>, by lvc@cbnews.ATT.COM (Lawrence V. Cipriani):
] A friend of mine found a bug in his C compiler.  He found
] the bug on a VAX; it also exists in some 3B compilers.  The
] compiler on this SVR3 3b2 does it right and is shown at the
] end.  The bug was that the increment to the float pointer
] was happening twice; see the /* miscompiled line */ below.
] I suspect other pcc derived compilers get it wrong too.
] 
Relevant parts of program:

] float list1[30], list2[10];
] 
] 	float *f, *g;
] 	
] 	f = list1;
] 	g = list2;
] 	for (i = 0; i < 10; ++i) {
] 		*f++ += *g++;		/* miscompiled line */
] 	}

We're using BSD4.3, w/ Greenhills C compiler. Output is as it should be,
so I would say that Greenhills doesn't have the bug. It has others .....
(I won't mention 4["Hello"] - we've hashed that out enough :-) :-) :-) )

Anyone else?
-- 
	dg@lakart.UUCP - David Goodenough		+---+
							| +-+-+
	....... !harvard!xait!lakart!dg			+-+-+ |
AKA:	dg%lakart.uucp@harvard.harvard.edu	  	  +---+

lvc@cbnews.ATT.COM (Lawrence V. Cipriani) (11/26/88)

In article <4082@cs.utexas.edu> meyering@cs.utexas.edu (Jim Meyering) writes:
>In article <2298@cbnews.ATT.COM> lvc@cbnews.ATT.COM (Lawrence V. Cipriani) writes:
>	>A friend of mine found a bug in his C compiler.  He found
>
>It's not a bug.
Never mind (he says embarrassed).  The reason I got confused about it
was that when the type was int instead of float it did what he expected.

-- 
Larry Cipriani, AT&T Network Systems, Columbus OH,
Path: att!cbnews!lvc    Domain: lvc@cbnews.ATT.COM

lvc@cbnews.ATT.COM (Lawrence V. Cipriani) (11/26/88)

>Never mind (he says embarrassed).  The reason I got confused about it
>was that when the type was int instead of float it did what he expected.
Double never mind.

-- 
Larry Cipriani, AT&T Network Systems, Columbus OH,
Path: att!cbnews!lvc    Domain: lvc@cbnews.ATT.COM

dg@lakart.UUCP (David Goodenough) (11/29/88)

From article <4082@cs.utexas.edu>, by meyering@cs.utexas.edu (Jim Meyering):
> In article <2298@cbnews.ATT.COM> lvc@cbnews.ATT.COM (Lawrence V. Cipriani) writes:
> 	>A friend of mine found a bug in his C compiler.  He found
> 
> It's not a bug.
> 
> 	[...deleted commentary, code]
> 	>*f++ += *g++;		/* miscompiled line */
> 
> The standard does not specify the order of evaluation for such
> statements.  It's easier to see the ambiguity if you try to rewrite
> it without the += notation.  Which do you choose?
> 
>  1) *f++ = *f++ + *g++;
>  2) *f++ = *f + *g++;
>  3) *f = *f++ + *g++;

Am I missing something, or is the standard broken? To my mind, the statement

	*f++ += *g++;

means only one thing: take the contents of pointer g, add it to the contents
of pointer f (i.e. add what g points to, to what f points to). Now make g and
f point to the next entries in whatever arrays they are pointing to. The whole
point behind the += operator is that the address on the left hand side
IS ONLY EVALUATED ONCE. Hence there is no ambiguity:

	*f += *g;
	f++;
	g++;

should be the only way the above statement can be done (with the caveat that
the order of the f++ and g++ is A: undetermined, B: irrelevant). However the
*f += *g MUST COME FIRST.

In particular, if I say

	extern char *zoot();

	*(zoot()) += '\001';

and zoot() gets called twice in evaluating the above statement, then I'm
going to ditch my C compiler because it's broken. If the standard says that
zoot() can be called twice, then I'm going to ignore the standard, because
IT'S broken.
-- 
	dg@lakart.UUCP - David Goodenough		+---+
							| +-+-+
	....... !harvard!xait!lakart!dg			+-+-+ |
AKA:	dg%lakart.uucp@harvard.harvard.edu	  	  +---+

gwyn@smoke.BRL.MIL (Doug Gwyn) (12/01/88)

In article <350@lakart.UUCP> dg@lakart.UUCP (David Goodenough) writes:
-In particular, if I say
-	extern char *zoot();
-	*(zoot()) += '\001';
-and zoot() gets called twice in evaluating the above statement, then I'm
-going to ditch my C compiler because it's broken. If the standard says that
-zoot() can be called twice, then I'm going to ignore the standard, because
-IT'S broken.

The Standard has this right.  The problem reported, *f++ += *g++; causing
f to be incremented twice, is clearly wrong.  In fact it's a bug in many
versions of PCC.