[comp.lang.c] conditional expression evaluation question

george@rebel.UUCP (George M. Sipe) (01/13/87)

References:


I need to check a string, composed of byte triples, for a null area no
less than MINSKIP triples in length.  A pointer, cp, is initialized to
a triplet boundary.  After the test, it must remain on a triplet
boundary.  Initially, I wrote the following:

	while (cp < end && triples < MINSKIP)
		if ((*cp++ | *cp++ | *cp++) == 0) ++triples;
		else triples = 0;

After looking at it, I wasn't absolutely sure that it would perform as
expected.  My question is "Does C guarantee execution of portions of a
conditional expression, even when the result is known after partial
evaluation?".  In my example, if the first byte is non-null, then the
result is known to be false and there is no need to evaluate the
remaining subexpressions.  However, I am counting on the whole
expression being evaluated.  Assuming that C does guarantee full
execution of the conditional test, would I be taking a significant
chance that an optimizing compiler might be too tricky and
(incorrectly) fail to do the full evaluation?

For now, I am using the following ugly code:

	while (cp < end && triples < MINSKIP) {
		if ((*cp | *(cp+1) | *(cp+2)) == 0) ++triples;
		else triples = 0;
		cp += 3;
	}

ucscb.bitbug@ucbvax.berkeley.edu (01/13/87)

C guarantees that a conditional expression will NOT be fully
evaluated if the result of the expression is known after a
partial evaluation. You will have to do some restructuring
of your code if you were counting on full evaluation of a
conditional expression (as Pascal does).

tiemann@mcc-pp.UUCP (DZ-015 @ Information Retrieval) (01/13/87)

In article <207@rebel.UUCP>, george@rebel.UUCP (George M. Sipe) writes:
> I need to check a string, composed of byte triples, for a null area no
> less than MINSKIP triples in length.  A pointer, cp, is initialized to
> a triplet boundary.  After the test, it must remain on a triplet
> boundary.  Initially, I wrote the following:
> 
> 	while (cp < end && triples < MINSKIP)
> 		if ((*cp++ | *cp++ | *cp++) == 0) ++triples;
> 		else triples = 0;
> 
> After looking at it, I wasn't absolutely sure that it would perform as
> expected.  My question is "Does C guarantee execution of portions of a
> conditional expression, even when the result is known after partial
> evaluation?".  In my example, if the first byte is non-null, then the
> result is known to be false and there is no need to evaluate the
> remaining subexpressions.  However, I am counting on the whole
> expression being evaluated.  Assuming that C does guarantee full
> execution of the conditional test, would I be taking a significant
> chance that an optimizing compiler might be too tricky and
> (incorrectly) fail to do the full evaluation?
> 
> For now, I am using the following ugly code:
> 
> 	while (cp < end && triples < MINSKIP) {
> 		if ((*cp | *(cp+1) | *(cp+2)) == 0) ++triples;
> 		else triples = 0;
> 		cp += 3;
> 	}

I wouldn't have answered, but there was a wrong answer to the net, so
I thought I'd balance it out.

First of all, the bit-wise OR operator is not the same thing as the
logical disjuntion operator. The former does evaluate all its
arguments (though it may permute them as it sees fit (just like
+ or *)), while the latter does not. The answer to your question is
"No, C does not execute portions of a conditional expression which
are not needed", but you have not specified a conditional expression,
rather a logical one, which will be fully executed.

Michael Tiemann
tiemann@mcc.com

pinkas@mipos3.UUCP (Israel Pinkas) (01/13/87)

In article <207@rebel.UUCP> george@rebel.UUCP (George M. Sipe) writes:
>
>I need to check a string, composed of byte triples, for a null area no
>less than MINSKIP triples in length.  A pointer, cp, is initialized to
>a triplet boundary.  After the test, it must remain on a triplet
>boundary.  Initially, I wrote the following:
>
>	while (cp < end && triples < MINSKIP)
>		if ((*cp++ | *cp++ | *cp++) == 0) ++triples;
>		else triples = 0;
> ...
>For now, I am using the following ugly code:
>
>	while (cp < end && triples < MINSKIP) {
>		if ((*cp | *(cp+1) | *(cp+2)) == 0) ++triples;
>		else triples = 0;
>		cp += 3;
>	}

Continue to use you ugly code.  C guarantees (at least K&R and H&S do) that
boolean expressions are evaluated only as much as possible, and left to
right.  As such, the first time *cp++ evaluates non-zero, the expression is
guaranteed to result in a non-zero value, and evaluation stops.  (You can
do wierd things like ((expr1 && expr2) || expr3); instead of
if (expr1) then expr2; else expr3; iff expr2 does not return a zero value.)
The first piece of code will actually find the first run of 3 * MINSKIP
consecutive null bytes, not guaranteed to fall on any boundary.  (This
assumes that your compiler is a "true C" compiler.  But that is another
matter.)

-Israel

-- 
----------------------------------------------------------------------
UUCP:	{amdcad,decwrl,hplabs,oliveb,pur-ee,qantel}!intelca!mipos3!pinkas
ARPA:	pinkas%mipos3.intel.com@relay.cs.net
CSNET:	pinkas%mipos3.intel.com

cccmark@ucdavis.UUCP (Mark Nagel) (01/13/87)

The original question regarded the evaluation of a bitwise or expression
which would *always* be fully evaluated.  C only guarantees that logical
expressions will be short circuited.  There may be a problem with your
expression though in other ways.  C does not guarantee that an expression
with side effects will be executed in any special order.  Therefore, the
original expression:

	if (*cp++ | *cp++ | *cp++ == 0) {
	    ...
	}

will evaluate each *cp++ eventually, but I do not believe you can assume
the result will be the same as (*cp | *(cp+1) | *(cp+2)) followed by
cp += 3 on all machines.


- Mark Nagel

ucdavis!deneb!cccmark@ucbvax.berkeley.edu               (ARPA)
mdnagel@ucdavis                                         (BITNET)
...!{sdcsvax|lll-crg|ucbvax}!ucdavis!deneb!cccmark      (UUCP)

adam@prophet.bbn.com (01/13/87)

> For now, I am using the following ugly code:
> 
> 	while (cp < end && triples < MINSKIP) {
>		if ((*cp | *(cp+1) | *(cp+2)) == 0) ++triples;
>		else triples = 0;
>		cp += 3;
>	}

First of all I don't understand what is so ugly about this code.  The
other example with the *cp++'s was subtle and at worst confusing.
This example is not.  The only other objections to this second example
that you might have is that it compiles to significantly more code, or
that you don't like all the parentheses.  The first of these other
objections I cannot answer for you, but the second can be solved by
using array notation as follows:

	while (cp < end && triples < MINSKIP) {
		if ((cp[0] | cp[1] | cp[2]) == 0) ++triples;
		else triples = 0;
		cp = &cp[3];
	}


-- Adam Stock

henry@utzoo.UUCP (Henry Spencer) (01/14/87)

>		if ((*cp++ | *cp++ | *cp++) == 0) ++triples;
> ...
> ...  My question is "Does C guarantee execution of portions of a
> conditional expression, even when the result is known after partial
> evaluation?"...

Basically, the answer is "no".  You get very few guarantees about the
order of execution or even presence of execution in C.  It is possible
that X3J11 mandates complete execution, and a good many compilers will
do it, simply because other sloppy people have made similar unwise
assumptions in the past and it causes too much trouble to violate them.
But I would suggest that it is extremely unwise to count on the above
code doing what you think it does.  (For another example, are you sure
that "*cp++" is atomic, i.e. that the postincrement will be done before
the next part of the expression?  I would think that an unwise assumption
too.)  There is too much chance of a "clever" compiler fouling things up.

Incidentally, not everyone would agree that the above is cleaner than
using "*(cp+1)" etc.
-- 
				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,decvax,pyramid}!utzoo!henry

jpn@teddy.UUCP (John P. Nelson) (01/14/87)

>	while (cp < end && triples < MINSKIP)
>		if ((*cp++ | *cp++ | *cp++) == 0) ++triples;
>		else triples = 0;

Assuming that bitwise OR is really what you wanted, this code fragment will
probably break several existing compilers.  As for ANSI C, they introduced
the concept of sequence-points, at which points all side effects must have
taken effect.  Sequence-points occur at && || , :? operators (and the end
of the expression).  There are no sequence points in that bitwise OR
expression.  To quote the standard (under the ++ operator):  "The side
effect of updating the stored value of the operand may be delayed until the
next sequence point is reached."

This says to me that a ANSI conforming C compiler is free to optimize all
the increments to the end of the expression if it wishes to.  The above
code would be unportable.

tim@amdcad.UUCP (Tim Olson) (01/14/87)

George M. Sipe writes:
+---------------------------------------
|I need to check a string, composed of byte triples, for a null area no
|less than MINSKIP triples in length.  A pointer, cp, is initialized to
|a triplet boundary.  After the test, it must remain on a triplet
|boundary.  Initially, I wrote the following:
|
|	while (cp < end && triples < MINSKIP)
|		if ((*cp++ | *cp++ | *cp++) == 0) ++triples;
|		else triples = 0;
|
|After looking at it, I wasn't absolutely sure that it would perform as
|expected.  My question is "Does C guarantee execution of portions of a
|conditional expression, even when the result is known after partial
|evaluation?".  In my example, if the first byte is non-null, then the
+---------------------------------------

And James Buster Writes:
+---------------------------------------
|C guarantees that a conditional expression will NOT be fully
|evaluated if the result of the expression is known after a
|partial evaluation. You will have to do some restructuring
|of your code if you were counting on full evaluation of a
|conditional expression (as Pascal does).
+---------------------------------------

Well... C only guarantees short-circuit evaluation for the operators
'&&' and '||', not every conditional expression.  Note that the code that
Mr. Sipe includes uses only the bit-wise or '|' operator, so short
circuit evaluation is *not* guaranteed.  Since the ++ operators cause
side-effects, they must be evaluated.

However, there is another problem.  The X3J11 Draft (as I read it,
anyway) says that postfix operators (++ and --) may delay the
incrementing or decrementing side-effect until the next sequence point
is reached which, in this case, is the end of the condition.  Therefore,
the code generated could possibly test the same character 3 times (and
then increment the pointer 3 times!).  Therefore, it seems to me that
the only safe way to write this sequence is (as Mr. Sipes writes later):


	while (cp < end && triples < MINSKIP) {
		if ((*cp | *(cp+1) | *(cp+2)) == 0) ++triples;
		else triples = 0;
		cp += 3;
	}


	Tim Olson
	Advanced Micro Devices
	ihnp4!amdcad!tim

hoey@nrl-aic.arpa (Dan Hoey) (01/14/87)

    From: "George M. Sipe" <george%rebel.UUCP@seismo.arpa>
    Message-Id: <207@rebel.UUCP>
    Date: 12 Jan 87 21:40:40 GMT

    ... Does C guarantee execution of portions of a
    conditional expression, even when the result is known after partial
    evaluation?....

The semantics of the expression ``((*cp++ | *cp++ | *cp++) == 0)''
includes the side-effect of incrementing ``cp'' three times, and so
must be preserved by any correct optimizer.  Even if the optimizer, by
clever program analysis, could deduce that in some program ``cp'' was
already pointing to a block of zeroes, and so could optimize away the
entire test (or the entire loop), it would have to leave in the
increments.  I can't estimate the likelihood of someone writing an
opitimizer that is incorrect in this respect.

However, the language explicitly leaves the *order* of these
side-effects unspecified, except of course that each increment of
``cp'' must be preceded by the corresponding dereference.  However, the
optimizer is free to postpone all of the increments until after all the
dereferences, so you may end up executing something that acts like

	while (cp < end && triples < MINSKIP) {
		if ((*cp | *cp | *cp) == 0) ++triples;
		else triples = 0;
		cp += 3;
	}

which is not what you want.  I think the rewritten code is fine, though
I would tend to use cp[1] instead of *(cp+1), and I would also tend to
use a conditional-or (||) instead of logical-or (|) and
conditional-not (!) instead of a comparison (== 0).  But that's just
how I write.

Dan Hoey

ag0@k.cc.purdue.edu (Colin Jenkins) (01/14/87)

In article <207@rebel.UUCP> george@rebel.UUCP (George M. Sipe) writes:
>
>I need to check a string, composed of byte triples, for a null area no
>less than MINSKIP triples in length.  A pointer, cp, is initialized to
>a triplet boundary.  After the test, it must remain on a triplet
>boundary.  Initially, I wrote the following:
>
>	while (cp < end && triples < MINSKIP)
>		if ((*cp++ | *cp++ | *cp++) == 0) ++triples;
>		else triples = 0;
>
>After looking at it, I wasn't absolutely sure that it would perform as
>expected.  My question is "Does C guarantee execution of portions of a
>conditional expression, even when the result is known after partial
>evaluation?".  

Actually, you have only one conditional expression in your if statement.  The
"|" is a bitwise or and not a logical connective.  If that is what you
intended then you don't have to worry since you have only one conditional
expression.

The answer to you question about evaluation is no however.  From page 38 of 
K&R "The C Programming Language":

	"Expressions connected by the && or || are evaluated left to right,
	 and evaluation stops as soon as the truth or falsehood of the result
	 is known."


				Colin

philip@axis.UUCP (01/15/87)

In article <207@rebel.UUCP> george@rebel.UUCP (George M. Sipe) writes:
>References:
>
>My question is "Does C guarantee execution of portions of a
>conditional expression, even when the result is known after partial
>evaluation?".

C guarantees just the opposite.
See section 2.6 (page 38) of K&R.

Philip

ron@brl-sem.ARPA (Ron Natalie <ron>) (01/15/87)

In article <2316@brl-adm.ARPA>, adam@prophet.bbn.com writes:
> 		cp = &cp[3];

"cp += 3;"  is exactly equivalent.

lvc@danews.ATT.COM (Larry Cipriani) (01/15/87)

Tim Olson write:

> Well... C only guarantees short-circuit evaluation for the operators
> '&&' and '||', not every conditional expression.  Note that the code that
> Mr. Sipe includes uses only the bit-wise or '|' operator, so short
> circuit evaluation is *not* guaranteed...

A lot of C programmers don't know that both operands of & | and ^
*are guaranteed* to be evaulated.  Historically && and || are
more recent than & | and ^.  A lot of old code is laying around (at
least here) that should be changed to use the newer operators.

> 	Tim Olson
> 	Advanced Micro Devices
> 	ihnp4!amdcad!tim

-- 

Larry Cipriani   Cornet 353-4999 AT&T (614) 860-4999
{ihnp4|cbosgd}!cbsck!lvc   AT&T Network Systems rm 2B-220

wong@rtech.UUCP (J. Wong) (01/16/87)

In article <207@rebel.UUCP> george@rebel.UUCP (George M. Sipe) writes:
>I need to check a string, composed of byte triples, for a null area no
>less than MINSKIP triples in length.  A pointer, cp, is initialized to
>a triplet boundary.  After the test, it must remain on a triplet
>boundary.  Initially, I wrote the following:
>
>	while (cp < end && triples < MINSKIP)
>		if ((*cp++ | *cp++ | *cp++) == 0) ++triples;
>		else triples = 0;
>
> ... [problems with above] ...
>
>For now, I am using the following ugly code:
>
>	while (cp < end && triples < MINSKIP) {
>		if ((*cp | *(cp+1) | *(cp+2)) == 0) ++triples;
>		else triples = 0;
>		cp += 3;
>	}
This is indicative of a problem with C:  It allows you to 
shoot yourself in the foot, by letting you be overly clever.

The problem here is that the programmer is using the ++ operator
for its side-effect and the | operator to guarantee that 'cp' is
incremented by three after the test.  That is, he's trying to be
clever and use side-effects, probably because he thinks its more
efficient.  His resolution is no uglier than his first solution in 
that neither is very clear (although the second is closer to what
he wants) or necessarily more efficient.

The `real' solution is this:

	while (cp < end && triples < MINSKIP)
	{
		if (cp[0] == '\0' && cp[1] == '\0' && cp[2] == '\0')
			++triples;
		else
			triples = 0;
		cp += 3;
	}

This is clear and to the point, and if you think about, 
not even significantly less efficient (and probably on
the average just as efficient) as his orginal solution.



P.S., I'm surprised none of the responses I've seen so far
have dealt with the real issue, which shows how irrelevant
the discussion of esoteric details of order of evaluation
in C implementations is to the real problem.
-- 
				J. Wong		ucbvax!mtxinu!rtech!wong

****************************************************************
You start a conversation, you can't even finish it.
You're talking alot, but you're not saying anything.
When I have nothing to say, my lips are sealed.
Say something once, why say it again.		- David Byrne

m5d@bobkat.UUCP (Mike McNally ) (01/16/87)

My small contribution to this silly topic:

    x = (*cp++ | *cp++ | *cp++);                                (1)
    if (x) ... /* or !x or whatever */

In computing the value to be assigned to the variable `x', the compiler
cannot short-circuit the expression.  This should be clear; in the
simpler case

    x = (a | b | c);

if the variable `a' contains zero, the compiler must still OR the
contents of `b' and `c' to determine the result.  These are bitwise
logical operators.  Short-circuiting these makes no more sense than
short-circuiting a sequence of multiplies as soon as one of the
operands evaluates to `1'.

I of course agree that a potential problem exists with (1) above.  If
the evaluation of `*cp++' is not atomic, and the increment could be
delayed, the result may be equal to the first byte of the series.  In
any case, `cp' must be incremented by `3' and both bitwise-OR operations
must be performed.  Oh well, I guess an optimizing compiler that 
delays the increments might realize this and skip the OR's...this
is getting ridiculous.

--
****                                                         ****
**** At Digital Lynx, we're almost in Garland, but not quite ****
****                                                         ****

Mike McNally                                    Digital Lynx Inc.
Software (not hardware) Person                  Dallas  TX  75243
uucp: {texsun,killer,infotel}!pollux!bobkat!m5  (214) 238-7474

greg@utcsri.UUCP (Gregory Smith) (01/19/87)

In article <207@rebel.UUCP> george@rebel.UUCP (George M. Sipe) writes:
>I need to check a string, composed of byte triples, for a null area no
>less than MINSKIP triples in length.  A pointer, cp, is initialized to
>a triplet boundary.  After the test, it must remain on a triplet
>boundary.  Initially, I wrote the following:
>
>	while (cp < end && triples < MINSKIP)
>		if ((*cp++ | *cp++ | *cp++) == 0) ++triples;
>		else triples = 0;
>
> ... [problems with above] ...

Many have pointed out that the test clause should be something like
	cp[0] == 0 && cp[1] == 0 && cp[2] == 0
and the increment should be done separately with cp += 3 or cp = &cp[3].

Here's a solution using a pointer to an array. ( Watch carefully. At
no time do my fingers leave the keyboard...):

typedef char triple[3];
triple *tp,*end;	/* or char (*tp)[3], (*end)[3]; */
...
	while( tp < end && triples < MINSKIP ){
		if( (*tp)[0] == 0 && (*tp)[1] == 0 && (*tp)[2] == 0 )
			++triples;
		else triples = 0;
		++tp;		/* next triple */
	}

Note that (*tp)[1] should give the same code as cp[1], and ++tp should
be the same as cp += 3, so the difference is merely semantic, and the
question is whether you feel that the above makes more sense than
the cp solution. If a lot of things are done with these triples,
it can end up being somewhat cleaner, since a lot of '3' constants
become implicit in the addressing.

-- 
----------------------------------------------------------------------
Greg Smith     University of Toronto      UUCP: ..utzoo!utcsri!greg
Have vAX, will hack...