[comp.lang.c] short circuit evaluation & side-effects

mason@tmsoft.UUCP (02/04/87)

In K&R pg. 184 it says: "...Otherwise the order of evaluations of expressions
is undefined.  In particular the compiler considers itself free to compute
subexpressions in the order it believes most efficient, even if the
subexpressions involve side effects.  The order in which side effects take place
is undefined."
    As I understand it this means the results of:
(1)	x = *i++ * (1 - *i++);
is unpredictable.  ANSI added the unary + to fix this (as well as for numerical
analysis (i.e. floating point) calculations) so that:
(2)	x = +(*i++ * (1 - *i++));
does what one would expect.  I have seen notes here that suggested that (1)
could be equivalent to:
(3)	x = *i * (1 - *i); i += 2;

Does the unary + force the side effects to happen where expected?  Is unary +
still in the ANSI standard proposal?

The above quote would appear to require side effects to take place (although
the order is undefined) even if the compiler could short-circuit the rest of
the expression (except, of course, in the case of || and &&).
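
To make the || and && exception concrete, here is a minimal sketch. The
helper bump() and the two wrapper functions are names invented for this
illustration; the only point is that the right operand's side effect is
skipped whenever the left operand already decides the result.

```c
/* Counter that makes the side effect of bump() observable. */
static int calls = 0;

/* Hypothetical helper: records that it ran, then returns its argument. */
static int bump(int value)
{
    calls++;
    return value;
}

/* Number of times bump() runs while evaluating `lhs && bump(1)`. */
int calls_for_and(int lhs)
{
    calls = 0;
    if (lhs && bump(1)) {
        /* nothing to do; only the call count matters */
    }
    return calls;
}

/* Number of times bump() runs while evaluating `lhs || bump(1)`. */
int calls_for_or(int lhs)
{
    calls = 0;
    if (lhs || bump(1)) {
        /* nothing to do; only the call count matters */
    }
    return calls;
}
```

With a false left operand, && never calls bump(); with a true left operand,
|| never calls it.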
-- 
	../Dave Mason,	TM Software Associates	(Compilers & System Consulting)
	..!{utzoo!ryesone seismo!mnetor utcsri!ryesone}!tmsoft!mason

henry@utzoo.UUCP (Henry Spencer) (02/06/87)

>     As I understand it this means the results of:
> (1)	x = *i++ * (1 - *i++);
> is unpredictable.  ANSI added the unary + to fix this (as well as for
> numerical analysis (i.e. floating point) calculations) so that:
> (2)	x = +(*i++ * (1 - *i++));
> does what one would expect...

Hold it, you are attributing incorrect properties to unary +.  It does
nothing of the kind.  The sole and only relevant property of unary + is
that it inhibits regrouping of expressions, e.g. "a + +(b + c)" is
guaranteed to add b and c and then add the result to a, whereas "a+(b+c)"
might add a to c and then add the result to b.  Unary + does *not* define
a "sequence point" and hence does not promise anything about the timing of
side effects.
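
A rough sketch of why grouping matters at all for floating point. The
function names are made up, the values assume IEEE-style doubles, and the
grouping is forced with explicit temporaries so the example relies neither
on unary + nor on how a given compiler treats parentheses.

```c
/* Adds a and b first, then adds c to the result. */
double sum_left(double a, double b, double c)
{
    double t = a + b;   /* (a + b) is computed on its own */
    return t + c;
}

/* Adds b and c first, then adds the result to a. */
double sum_right(double a, double b, double c)
{
    double t = b + c;   /* (b + c) is computed on its own */
    return a + t;
}
```

Called with a = 1e30, b = -1e30, c = 1.0, the first grouping cancels the
large terms before adding c, while the second loses c when it is added to
-1e30, so the two results differ.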
-- 
Legalize			Henry Spencer @ U of Toronto Zoology
freedom!			{allegra,ihnp4,decvax,pyramid}!utzoo!henry

greg@utcsri.UUCP (Gregory Smith) (02/10/87)

In article <107@tmsoft.UUCP> mason@tmsoft.UUCP (Dave Mason) writes:
>    As I understand it this means the results of:
>(1)	x = *i++ * (1 - *i++);
>is unpredictable.

True.

>  ANSI added the unary + to fix this (as well as for numerical
>analysis (i.e. floating point) calculations) so that:
>(2)	x = +(*i++ * (1 - *i++));
>does what one would expect.

Well, you're wrong; this is the same as (1). But first I would like
to point out that 'does what one would expect' is an extremely vague
notion. In this case I presume you want x=i[0]*(1-i[1]) and then i += 2.
I have seen examples where people have asked why the expression doesn't
'do what I expect', and it is next to impossible to determine for
certain what is expected. In some cases it is only possible after a
high-level analysis of the problem; e.g. in this case I make the
reasonable assumption that you want to read both i[0] and i[1] rather
than reading i[0] twice and skipping i[1]. Admittedly this was only used
to support the guess obtained from considering a left-to-right
evaluation of the expression.

You are making two big assumptions: (a) the left side of the multiply
is done first; (b) the first increment is done before the second
use of i. The first is the most unreasonable; why should any one
side of a * operator be done before the other? Also, assumption (a)
is worthless without assumption (b).
BTW, I would hope that the right-hand side, (1-*i++), be done before
the left, *i++, since the former is more complex. Fewer registers
would be used this way.

If the operation which 'one would expect' can only be determined after
considering the role the statement is likely to perform, or whether
one interpretation would perform a more useful function than another,
then how can you expect the compiler to hit on the 'right' one?

>  I have seen notes here that suggested that (1)
>could be equivalent to:
>(3)	x = *i * (1 - *i); i += 2;

I think it can be evaluated as "x = i[k1] * (1 - i[k2]); i += 2;" where
(k1,k2) can come from { (0,0), (1,0), (0,1) }. Not very good odds.
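
To see how far apart those three readings are, each one can be spelled out
as well-defined C. This is only a sketch; the function names are invented
here, and each function covers just the value assigned to x (the "i += 2"
is the same in all three).

```c
/* (k1,k2) = (0,0): both reads see i[0]. */
int reading_00(const int *i)
{
    return i[0] * (1 - i[0]);
}

/* (k1,k2) = (1,0): the right operand reads first, so the left sees i[1]. */
int reading_10(const int *i)
{
    return i[1] * (1 - i[0]);
}

/* (k1,k2) = (0,1): the left operand reads first, so the right sees i[1]. */
int reading_01(const int *i)
{
    return i[0] * (1 - i[1]);
}
```

For i pointing at { 2, 5 } the three readings give -2, -5 and -8
respectively.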

The unary + operator forces its operand to be completely evaluated as a
subexpression, independently of the expression in which it is embedded.
Thus:
	x = +*i++ * +(1-*i++)
forces the two '*i++' expressions to read i[0] and i[1] (but not
necessarily in that order!!). Note that the unary +'s have been placed
adjacent (tree-wise) to the multiply operator. They could have been
placed adjacent to the ++ operators: *+i++ * (1- *+i++) or in lots
of other ways. Unary + operators cannot be used to force one of the
*i++'s to be done first (as far as I can see).

>
>Does the unary + force the side effects to happen where expected?
Unary + does not cause the activation of an AI module which tries
to decide what you expect.

Moral: write
(4)	x = i[0] * (1 - i[1]); i += 2;
This won't be much faster or slower, whether i is in a register or not,
and it is completely unambiguous.  In fact, on an NS32016, which lacks
auto-increment, it is faster.

Even with auto-increment, your expected code will first read *i++,
keep it hanging around somewhere while the next *i++ is subtracted from
1, and then multiply them. The expression in (4) allows the code
generator to read i[1] first, do the subtraction, then multiply by
i[0]. This is why you want the code generator to be able to rearrange
stuff.
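
One way to package (4) for reuse, as a sketch only: next_term() is a
hypothetical helper, not something from the original code. It takes the
address of the pointer so that the "i += 2" stays a separate, explicitly
ordered step.

```c
/* Hypothetical helper: returns i[0] * (1 - i[1]) for the pair that *ip
 * points at, then advances *ip past that pair. */
int next_term(const int **ip)
{
    const int *i = *ip;

    /* Both reads use fixed offsets, so the order in which the compiler
     * performs them cannot change the result. */
    int x = i[0] * (1 - i[1]);

    /* The increment is its own statement, after x has been computed. */
    *ip = i + 2;
    return x;
}
```

Given "const int *p" pointing at { 2, 5, 3, 4 }, two successive calls
next_term(&p) return -8 and then -9, and p ends up four elements along.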

-- 
----------------------------------------------------------------------
Greg Smith     University of Toronto      UUCP: ..utzoo!utcsri!greg
Have vAX, will hack...