[comp.lang.c] Short circuit evaluation/expression rearrangement

gram@uctcs.uucp (Graham Wheeler) (05/27/91)

Responses are still flowing in to my question. I have had two that are quite
explicit, so I thought I would post them as a refinement on the previous
summary.

From: worley@compass.COM

   i) Does ANSI C say that compilers can rearrange the order of expression 
	   evaluation?

To a certain extent -- see section 3.3 of the standard and the
accompanying rationale document.  The key point is that the dataflow
between the operators can't be changed, but the order in which the
operators is executed can be changed (as long as each value is computed
before it is consumed!).  For example, in "a*b + c*d", the compiler
may perform the two multiplications in any order, but in "a + b + c",
the first addition must be performed before the second, because it
associates as "(a + b) + c", showing that the second addition has as
an argument the result of the first.

   ii) Does it say that Boolean expressions must be evaluated with short-
	   circuit evaluation?

The && and || operators are required to be short-circuited -- see
sections 3.3.13 and 3.3.14 of the standard.  The &, |, and ^ operators
are required to not be short-circuited -- see sections 3.3.10-12 of
the standard.

===================================================================

From: der Mouse  <mouse@thunder.McRCIM.McGill.EDU>

> i) Does ANSI C say that compilers can rearrange the order of
>    expression evaluation?

In general, yes.  There are a few exceptions; notably, the &&, ||, ?:,
and , operators promise some things about the order in which their
operands are evaluated (and in some cases, about whether certain
operands are evaluated at all).

> ii) Does it say that Boolean expressions must be evaluated with
>     short-circuit evaluation?

When using the short-circuit operators && and ||, yes.

> We think the answers to both of these are yes, but we aren't sure.

Mostly.  The second one is yes; the first one is a qualified yes.

===================================================================
Thanks again folks - the question is certainly resolved (and I can go back
to writing `if (ptr && ptr->next)...' 8-) )

Graham Wheeler <gram@cs.uct.ac.za> | "That which is weak conquers the strong,
Data Network Architectures Lab     | that which is soft conquers the hard.
Dept. of Computer Science          | All men know this; none practise it"
University of Cape Town            |		Lao Tzu - Tao Te Ching Ch.78

gwyn@smoke.brl.mil (Doug Gwyn) (05/28/91)

In article <1991May27.113628.26880@ucthpx.uct.ac.za> gram@uctcs.uucp (Graham Wheeler) writes:
>From: worley@compass.COM
>... in "a + b + c",
>the first addition must be performed before the second, because it
>associates as "(a + b) + c", showing that the second addition has as
>an argument the result of the first.

The above is incorrect.  A conforming implementation is obliged to
perform the operations so that it doesn't create an exception where
one would not have occurred in evaluating the expression strictly
according to the precedence implied by the grammar, but within
those constraints it may perform the additions in any order.  To
force an order of evaluation (in Standard C), you must use parentheses.

barrett@Daisy.EE.UND.AC.ZA (Alan P Barrett) (05/30/91)

In article <16283@smoke.brl.mil>,
gwyn@smoke.brl.mil (Doug Gwyn) writes:
> >... in "a + b + c",
> >the first addition must be performed before the second, because it
> >associates as "(a + b) + c", showing that the second addition has as
> >an argument the result of the first.
> 
> The above is incorrect.  A conforming implementation is obliged to
> perform the operations so that it doesn't create an exception where
> one would not have occurred in evaluating the expression strictly
> according to the precedence implied by the grammar, but within
> those constraints it may perform the additions in any order.  To
> force an order of evaluation (in Standard C), you must use parentheses.

I thought that the compiler was always governed by the "as if" rule:
the compiler may do whatever it likes as long as a strictly conforming
program cannot tell the difference between what the programmer asked for
and what actually happened.  But what Doug says seems to contradict
that, so I must be wrong.

Please could somebody explain exactly when the "as if" rule does or
doesn't apply.  In particular, related to Doug's message, I have the
following two questions.

If evaluation of an expression strictly according to the precedence
implied by the grammar would cause an exception, is the compiler
permitted to evaluate the expression in some way that does not cause an
exception?

Do parentheses really and truly *force* an order of evaluation?

--apb
Alan Barrett, Dept. of Electronic Eng., Univ. of Natal, Durban, South Africa
RFC822: barrett@ee.und.ac.za             Bang: m2xenix!quagga!undeed!barrett

worley@compass.com (Dale Worley) (05/31/91)

   In article <16283@smoke.brl.mil>, gwyn@smoke.brl.mil (Doug Gwyn) writes:
   > >... in "a + b + c",
   > >the first addition must be performed before the second, because it
   > >associates as "(a + b) + c", showing that the second addition has as
   > >an argument the result of the first.
   > 
   > The above is incorrect.  A conforming implementation is obliged to
   > perform the operations so that it doesn't create an exception where
   > one would not have occurred in evaluating the expression strictly
   > according to the precedence implied by the grammar, but within
   > those constraints it may perform the additions in any order.  To
   > force an order of evaluation (in Standard C), you must use parentheses.

However, if the computation is floating-point, problems can arise
above and beyond exceptions.  The usual problem is loss of
significance.  See section 3.3 of the Rationale of the C standard for
an explanation, which makes it clear that (in Standard C) a + b + c
*must* be evaluated as (a + b) + c if a, b, and c are floating point.

(Also, beware that the Rationale claims that Fortran has the same
no-regrouping rule as the C standard.  That's not true.)

Dale Worley		Compass, Inc.			worley@compass.com
--
Dammit, we're all going to die, let's die doing something *useful*!
-- Hal Clement, on comments that space exploration is dangerous

andand@cia.docs.uu.se (Anders Andersson) (06/03/91)

In <16283@smoke.brl.mil> gwyn@smoke.brl.mil (Doug Gwyn) writes:

>In article <1991May27.113628.26880@ucthpx.uct.ac.za> gram@uctcs.uucp (Graham Wheeler) writes:
>>From: worley@compass.COM
>>... in "a + b + c",
>>the first addition must be performed before the second, because it
>>associates as "(a + b) + c", showing that the second addition has as
>>an argument the result of the first.

>The above is incorrect.  A conforming implementation is obliged to
>perform the operations so that it doesn't create an exception where
>one would not have occurred in evaluating the expression strictly
>according to the precedence implied by the grammar, but within
>those constraints it may perform the additions in any order.  To
>force an order of evaluation (in Standard C), you must use parentheses.

You are dead wrong. Parentheses will do you no good whatsoever. The right
for compilers to exploit associtivity and commutativity of operators was
*removed* in the ANSI standard. In the standard, (a + b) + c IS the same 
as a + b + c.
What you DO know:     The addition of a and b will be made before c is
                       added to the result.
What you DO NOT know: The order of evaluation of the expressions a, b and c.

That is, in the expression x/y + z/w + p/q, you cannot tell in which order
the divisions are performed, but you do know the order in which the 
additions are made.

-Anders Andersson (andand @ cia.docs.uu.se)

henry@zoo.toronto.edu (Henry Spencer) (06/04/91)

In article <andand.675959815@cia.docs.uu.se> andand@cia.docs.uu.se (Anders Andersson) writes:
> [a + b + c]
>What you DO know:     The addition of a and b will be made before c is
>                       added to the result.

More precisely, what you know is that the program will *behave as if* things
were done that way.  In particular, if the compiler can be sure that the
order of evaluation will not affect behavior, it can use any order it pleases.
For example, if the implementation ignores integer overflows, the additions
in a+b+c+d can be done in any order.
-- 
"We're thinking about upgrading from    | Henry Spencer @ U of Toronto Zoology
SunOS 4.1.1 to SunOS 3.5."              |  henry@zoo.toronto.edu  utzoo!henry

msb@sq.sq.com (Mark Brader) (06/06/91)

Wow, I get to correct Henry Spencer.
> > [a + b + c]
> > What you DO know:     The addition of a and b will be made before c is
> >                       added to the result.
> 
> More precisely, what you know is that the program will *behave as if* things
> were done that way.  In particular, if the compiler can be sure that the
> order of evaluation will not affect behavior, it can use any order it pleases.

Still more precisely, what you know is that *if* the program would *not*
cause an exception (e.g. overflow) if things were done that way, *then*
it will behave *as if* things were done that way.

Presuming that i and j are ints which are 16 bits long, the statement

		i = 20000 + 20000 + j;

causes undefined behavior, since 20000+20000 causes int overflow.  However,
since the behavior *is* undefined, the compiler is free to ignore the "as
if" rule and compile the statement in some other way, e.g. as if it read

		printf ("This is silly, but legal\n");

or, more practically, as if it read

		i = 20000 + (20000 + j);

which *is* valid for some values of j (e.g. -20000).

On the other hand, if the statement originally read

		i = 20000 + (-20000) + j;

then the compiler would *not* be free to compile it as if it was

		i = 20000 + ((-20000) + j);

unless int overflows are ignored and cause wraparound in the common manner.
In this example, the rewritten form could cause an overflow where the
original form could not; in my first example, the reverse is true.

These examples were contrived to match the a+b+c of the earlier messages;
here is a practical example.

		long p;
		extern short q, r;

		p = q * r;

which, I hear, is compiled on some systems as if it was written

		p = q * (long) r;

which is probably what the writer meant to say in the first place.
Again, this rewriting is legal because it cannot *cause* an exception
and it produces the same result if no exception would have occurred.

-- 
Mark Brader		    "People tend to assume that things they don't know
SoftQuad Inc., Toronto	     about are either safe or dangerous or useless,
utzoo!sq!msb, msb@sq.com     depending on their prejudices."    -- Tim Freeman

This article is in the public domain.

jon@maui.cs.ucla.edu (Jonathan Gingerich) (06/07/91)

In article <1991Jun5.231833.20542@sq.sq.com> msb@sq.sq.com (Mark Brader) writes:
>Wow, I get to correct Henry Spencer.
>> > [a + b + c]
>> > What you DO know:     The addition of a and b will be made before c is
>> >                       added to the result.
>> 
>> More precisely, what you know is that the program will *behave as if* things
>> were done that way.  In particular, if the compiler can be sure that the
>> order of evaluation will not affect behavior, it can use any order it pleases.
>
>Still more precisely, what you know is that *if* the program would *not*
>cause an exception (e.g. overflow) if things were done that way, *then*
>it will behave *as if* things were done that way.

No, Henry is correct.  How can you distinguish the evaluation of a rearranged
expression from an undefined evaluation which could be _anything_?

Jon.

msb@sq.sq.com (Mark Brader) (06/08/91)

>>>> [a + b + c]
>>>> What you DO know:     The addition of a and b will be made before c is
>>>>                       added to the result.
>>> 
>>> More precisely, what you know is that the program will *behave as if*
>>> things were done that way.  In particular, if the compiler can be
>>> sure that the order of evaluation will not affect behavior, it can
>>> use any order it pleases.
>>
>> Still more precisely, what you know is that *if* the program would *not*
>> cause an exception (e.g. overflow) if things were done that way, *then*
>> it will behave *as if* things were done that way.
> 
> No...  How can you distinguish the evaluation of a rearranged
> expression from an undefined evaluation which could be _anything_?

If you can distinguish it, then the compiler can't rearrange it --
*unless* your program, as written, would cause an exception.  If it
*would*, then we have undefined behavior and the compiler *can* rewrite
in any way it pleases -- including a way which would avoid the exception
and produce the answer you wanted, if such a way exists.

-- 
Mark Brader		     "It is impractical for the standard to attempt to
SoftQuad Inc., Toronto	      constrain the behavior of code that does not obey
utzoo!sq!msb, msb@sq.com      the constraints of the standard."  -- Doug Gwyn

This article is in the public domain.