[net.lang.c] Expression Sequencing Query

tomc@oakhill.UUCP (Tom Cunningham) (09/06/86)

Sorry if this topic has been overly exercised already.  In the following
code fragment:

	/* a = b + b + b */
	a = ((b=1),b) + ((b=2),b) + ((b=3),b)

I expected the result to be 6.  With the Microsoft C compiler and the
compiler on the Sun 3, the result is 9.  Apparently the parenthetical
assignments are all getting done before the comma and addition.  Any
thoughts on this?

Tom Cunningham     "Good, fast, cheap -- select two."
USPS:  Motorola Inc.  6501 William Cannon Dr. W.  Austin, TX 78735-8598
UUCP:  {ihnp4,seismo,ctvax,gatech}!ut-sally!oakhill!tomc
Phone: 512-440-2953

eectrsef@titan.UUCP ( SA User Serv.) (09/19/86)

In article <760@oakhill.UUCP> tomc@oakhill.UUCP (Tom Cunningham) writes:
>	/* a = b + b + b */
>	a = ((b=1),b) + ((b=2),b) + ((b=3),b)
>
>I expected the result to be 6.  With the Microsoft C compiler and the
>compiler on the Sun 3, the result is 9.  Apparently the parenthetical
>assignments are all getting done before the comma and addition.  Any
>thoughts on this?
>
Tom, I agree, the result should be 6, as defined by K&R, but I have tried
it on a Cyber 180/830 running NOS VE and get 9; AT&T's 3B5 System V also
gets 9.  But a copy of the Small-C Compiler that I have ported
comes up with a 6.  Does this seem to imply that Small-C is a better
(more accurate) compiler than those that AT&T produces?  I find it
totally unacceptable that AT&T cannot produce a working C compiler.
I would like everyone to test it on as many machines as possible, to
see if we can find at least ONE other besides Small-C that works.

Mike Stump  ucbvax!hplabs!csun!csunb!beusemrs

jason@hpcnoe.UUCP (Jason Zions) (09/23/86)

> / tomc@oakhill.UUCP (Tom Cunningham) /  4:16 pm  Sep  5, 1986 /
> Sorry if this topic has been overly exercised already.  In the following
> code fragment:
> 
> 	/* a = b + b + b */
> 	a = ((b=1),b) + ((b=2),b) + ((b=3),b)
> 
[ Wants result to be 6; his micro compiler does that, his big system compiler
  produces 9. Doesn't know if it's a bug. ]
> Tom Cunningham     "Good, fast, cheap -- select two."
> UUCP:  {ihnp4,seismo,ctvax,gatech}!ut-sally!oakhill!tomc

Sorry. K&R does guarantee that the left side of a comma operator will be
evaluated sometime before the right side of the operator, but does not say
ANYTHING about interleaving evaluations of other expressions.

This was covered about a month ago, but not quite in the same context. The
upshot of it all was that interleaving of expression evaluation is indeed
legal, and some compilers (4.2BSD, I believe, is a notorious example) are
known to do so.

"Doctor, Doctor, I get bit every time I use multiple side-effects in the
 same statement!"

"So don't do that!"
--
This is not an official statement of Hewlett-Packard Corp., and does not 
necessarily reflect the views of HP. It is provided completely without warranty
of any kind. Lawyers take 3d10 damage and roll a saving throw vs. ego attack.

Jason Zions				Hewlett-Packard
Colorado Networks Division		3404 E. Harmony Road
Mail Stop 102				Ft. Collins, CO  80525
	{ihnp4,seismo,hplabs,gatech}!hpfcdc!hpcnoe!jason

david@ztivax.UUCP (09/23/86)

On Ultrix V1.2 (4.3bsd, more or less), the answer is...

NINE!  oops...

drw@cullvax.UUCP (Dale Worley) (09/24/86)

> In article <760@oakhill.UUCP> tomc@oakhill.UUCP (Tom Cunningham) writes:
> >	/* a = b + b + b */
> >	a = ((b=1),b) + ((b=2),b) + ((b=3),b)
> >
> >I expected the result to be 6.  With the Microsoft C compiler and the
> >compiler on the Sun 3, the result is 9.  Apparently the parenthetical
> >assignments are all getting done before the comma and addition.  Any
> >thoughts on this?

Harbison&Steele (7.11) makes it clear that an implementation must
evaluate one argument of a binary operator completely before starting
evaluation of the other argument.  Thus, the result should be 6.  I
don't know what the ANSI standard says.

Dec VAX Ultrix gives 9.

Lattice C 3.00 for MS-DOS gives 7!!!  (Yes, that's "7", not a typo!)

Dale

donn@utah-cs.UUCP (Donn Seeley) (09/25/86)

I sure thought that someone would finally read the manual and see where
the 'problem' was, but I guess I was wrong...  Section 7 of the C
reference manual:  '... [T]he order of evaluation of expressions is
undefined,' except in specific cases: '&&', '||', ',' and '?'.  These
cases together with the precedence rules define a partial ordering on
evaluations.  Let's look at an example:

	a = ((b=1),b) + ((b=2),b) + ((b=3),b)
	  1    2  3   4    5  6   7    8  9

I've numbered the operators in the expression to indicate the
subexpressions.  Here is the transitive closure of the set of ordered
pairs which defines the partial ordering:

 <2,1>  <2,4>  <3,4>  <5,1>  <5,6>  <6,1>  <6,7>  <8,1>  <8,9>  <9,7>
 <2,3>  <3,1>  <4,1>  <5,4>  <5,7>  <6,4>  <7,1>  <8,7>  <9,1>

(Notice that '[e]xpressions involving a commutative and associative
operator ... may be rearranged arbitrarily', which actually reduces the
number of orderings -- <4,7> isn't in the set for this reason.) Notice
that <2,5>, <2,8> and <5,8> are not in the set; the expressions 2, 5,
and 8 ('b=1', 'b=2' and 'b=3') may be evaluated in any order.  Thus 'a'
may have any value between 3 and 9, inclusive, after this statement is
executed.
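
The range of 3 to 9 can be checked mechanically. The following is only a sketch of the model described above, not anything a compiler does: it assumes the sole constraint is that each assignment precedes the read on the other side of its own comma, and it tries every schedule that respects that. The names `search` and `reachable_values` are invented for the illustration.

```c
/*
 * Model of  a = ((b=1),b) + ((b=2),b) + ((b=3),b)  where the only
 * ordering constraint is "each assignment before its own read".
 *
 * Events 0,1,2 are the assignments b=1, b=2, b=3.
 * Events 3,4,5 are the matching reads of b.
 */
enum { NEVENTS = 6 };

static void search(int done_mask, int b, int sum, unsigned *seen)
{
    if (done_mask == (1 << NEVENTS) - 1) {
        *seen |= 1u << sum;          /* record this reachable value of a */
        return;
    }
    for (int e = 0; e < NEVENTS; e++) {
        if (done_mask & (1 << e))
            continue;                /* event already scheduled */
        if (e < 3) {
            /* assignment b = e+1 */
            search(done_mask | (1 << e), e + 1, sum, seen);
        } else if (done_mask & (1 << (e - 3))) {
            /* read of b, legal only after its own assignment */
            search(done_mask | (1 << e), b, sum + b, seen);
        }
    }
}

/* Returns a bit set: bit v is 1 if some legal schedule yields a == v. */
unsigned reachable_values(void)
{
    unsigned seen = 0;
    search(0, 0, 0, &seen);
    return seen;
}
```

Under that assumption the returned set has exactly bits 3 through 9 set, which matches the claim in the text. A stricter model (for example one that forbids interleaving across a comma) would shrink the set.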

Actually my favorite order-of-evaluation bug appeared in some poor
user's code to add an array of N ints:

	int array[N] = { ... };
	int *R = &array[0];
	int sum = *R++ + *R++ + *R++ + *R++ + *R++ + ... + *R++;

This worked (believe it...  or not!) with the Ritchie compiler on
the PDP11 and failed miserably under the PCC on a VAX.
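
For comparison, a version of that sum which does not depend on evaluation order might look like the sketch below. The function name `sum_ints` and its parameters are made up for the illustration; the point is only that each element is fetched in its own statement, so there is one side effect on the index per full expression.

```c
#include <stddef.h>

/* Sum n ints starting at p, one element per loop iteration. */
int sum_ints(const int *p, size_t n)
{
    int sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += p[i];    /* the only side effects: sum here, i in the loop header */
    return sum;
}
```

Called as `sum_ints(array, N)`, this should give the same total on either compiler, since nothing in it is left to the compiler's choice of order.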

Wondering what code Lattice C generated to get a == 7,

Donn Seeley    University of Utah CS Dept    donn@utah-cs.arpa
40 46' 6"N 111 50' 34"W    (801) 581-5668    decvax!utah-cs!donn

PS -- I suppose you've noticed that I've oversimplified the treatment
of 4 and 7 in the example, since the commutative/associative rule
causes an ambiguity (e.g. 3 must be done before one of 4 or 7, since
3 may be reordered under 7 in the expression tree)...

mwm@eris.berkeley.edu (Mike Meyer) (09/26/86)

In article <353@cullvax.UUCP> drw@cullvax.UUCP (Dale Worley) writes:
>> In article <760@oakhill.UUCP> tomc@oakhill.UUCP (Tom Cunningham) writes:
>> >	/* a = b + b + b */
>> >	a = ((b=1),b) + ((b=2),b) + ((b=3),b)
>> >
>> >I expected the result to be 6.  With the Microsoft C compiler and the
>> >compiler on the Sun 3, the result is 9.  Apparently the parenthetical
>> >assignments are all getting done before the comma and addition.  Any
>> >thoughts on this?
>
>Harbison&Steele (7.11) makes it clear that an implementation must
>evaluate one argument of a binary operator completely before starting
>evaluation of the other argument.  Thus, the result should be 6.  I
>don't know what the ANSI standard says.

My reading of the ANSI hardcopy is that it doesn't say. I thought H&S
was a description, not a definition.

>Dec VAX Ultrix gives 9.

As do VAX 4.2 and 4.3.

>Lattice C 3.00 for MS-DOS gives 7!!!  (Yes, that's "7", not a typo!)

Some messy - and almost believable - arguments can be made that the
value of that expression is "implementation dependent". Rather than do
that, I'll just point out that *ANY* time++ you have a single
statement that changes a variable, then uses the variable in a
different place, you're asking for trouble. For that particular
expression, I'd expect the following, with increasing surprise as you
move down the list:

	6
	3, 9
	4, 5, 7, 8
	other integers representable on the machine
	NANs of various flavors
	dropped cores.

	<mike

++ Except for those cases that are in different operands of a logical
operator, as the evaluation order on those is known.

tom@hcrvx1.UUCP (Tom Kelly) (09/26/86)

In article <3926@utah-cs.UUCP> donn@utah-cs.UUCP (Donn Seeley) discusses
the example:

	a = ((b=1),b) + ((b=2),b) + ((b=3),b)

showing the partial order induced by the expression evaluation rules
of C language (as defined in K & R) and how various orders of
evaluation of the side effects are compatible with that partial
order.  His conclusion is that the "correct" answer is an integer
between 3 and 9 inclusive.  Various other people have posted the
answers obtained by various compilers.

The ANSI C committee (X3J11) has considered this question at some
length, mostly in the context of being able to write safe macros
that involve side-effects (prototypical example: getchar() + getchar().
Can the evaluation of the ?: operators in the "usual" implementation
of getchar be interleaved?).

The current draft (86-098: 9 July 1986) specifies in section 3.3 Expressions
(p. 31):

	Except as indicated by the syntax, or otherwise specified
	later (for the function-call operator (), the unary plus
	operator, &&, ||, ?: and comma operators), the order
	of evaluation of an expression is unspecified.  The
	implementation may evaluate subexpressions in any order,
	even if the subexpressions produce side effects.  The
	order in which side effects take place is unspecified,
	except that the evaluation of the operands of an
	operator that involves a sequence point shall not be
	interleaved with other evaluations.

Section 3.3.17 Comma Operator (p. 46)

	The left operand of a comma operator is evaluated as a
	void expression; there is a sequence point after its
	evaluation.  Then the right operand is evaluated; the
	result has its type and value.

This has the result (along with the definition of sequence point) of
specifying that the value of "a" will be 6, since once one of the
comma expressions is started, it must be fully evaluated before any
of the others are.  Of course, the resulting value of "b" is still
either 1, 2 or 3.
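
As a point of comparison, the same arithmetic can be written with one side effect per statement, which takes the question away from the compiler altogether. This is only a sketch; `sum_sequenced` is an invented name, not something from the draft.

```c
/*
 * Statement-by-statement version of
 *     a = ((b=1),b) + ((b=2),b) + ((b=3),b)
 * Every assignment to *b is finished before the next statement begins.
 */
int sum_sequenced(int *b)
{
    int a;

    *b = 1;  a  = *b;
    *b = 2;  a += *b;
    *b = 3;  a += *b;
    return a;
}
```

Written this way the result is 6, and the final value of `b` is fixed at 3 rather than being one of 1, 2 or 3.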

Tom Kelly  (416) 922-1937
Human Computing Resources Corp.
{utzoo, ihnp4, decvax}!hcr!tom

gbm@galbp.UUCP (Gary McKenney) (09/26/86)

> In article <760@oakhill.UUCP> tomc@oakhill.UUCP (Tom Cunningham) writes:
> >	/* a = b + b + b */
> >	a = ((b=1),b) + ((b=2),b) + ((b=3),b)
> >
> >I expected the result to be 6.  With the Microsoft C compiler and the
> >compiler on the Sun 3, the result is 9.  Apparently the parenthetical
> >assignments are all getting done before the comma and addition.  Any
> >thoughts on this?
> >
> Tom, I agree, the result should be 6, as defined by K&R, but I have tried
> it on a Cyber 180/830 running NOS VE and get 9; AT&T's 3B5 System V also
> gets 9.  But a copy of the Small-C Compiler that I have ported
> comes up with a 6.  Does this seem to imply that Small-C is a better
> (more accurate) compiler than those that AT&T produces?  I find it
> totally unacceptable that AT&T cannot produce a working C compiler.
> I would like everyone to test it on as many machines as possible, to
> see if we can find at least ONE other besides Small-C that works.
> 
> Mike Stump  ucbvax!hplabs!csun!csunb!beusemrs

You are both wrong.  All expressions in the innermost set of parentheses are
evaluated first, from left to right; therefore b = 3 before any addition occurs.
a = 3 + 3 + 3;

gbm

jsdy@hadron.UUCP (Joseph S. D. Yao) (09/27/86)

In article <111@titan.UUCP> eectrsef@titan.UUCP (Sean Eric Fagan - SA User Serv.) writes:
>In article <760@oakhill.UUCP> tomc@oakhill.UUCP (Tom Cunningham) writes:
>>	/* a = b + b + b */
>>	a = ((b=1),b) + ((b=2),b) + ((b=3),b)
>>I expected the result to be 6.  With the Microsoft C compiler and the
>>compiler on the Sun 3, the result is 9.
plus other comments tsk-tsk'ing this, etc.

X3J11 states that subexpressions may be evaluated in any [presumably
valid-jsdy] order, even if they produce side effects.  The order in
which side effects take place is unspecified.  Side effects only
have to be complete at what X3J11 calls "sequence points."  The
comma operator  i s  (in my [old] version) a sequence point, and the
standard seems to require that at each comma, the left operand be
evaluated and then the right, and the result be the latter.  The
example bears this out; but it is not as complicated as the above.
Addition is not called a sequence point.

HOWEVER

As has been said before, X3J11 has little or nothing to do with
contemporary C compilers.  It hasn't even been issued yet!  Not
in final form.  (So if the salesman knocks on your door with an
"ANSI C compiler," slam it!)  The bible has been K&R, which says
specifically in the sections on Precedence and Order of Evaluation
(K&R 2.12, Ref. 7.) the first two sentences above, plus:
"When side effects (...) takes [sic] place is left to the discretion
of the compiler, ...
"... writing code which depends on order of evaluation is a bad
programming practice in any language."

Guy & Steele, although slightly influenced by X3J11, echo:
(7.11) "It is, of course, bad programming style to have two side
effects on the same variable in the same expression, because the
order of the side effects is not defined; but the all-too-clever
programmer here has reasoned that the order of the side effects
doesn't matter, ..."

It looks like your C compiler decided to evaluate all the lefts
of the commas first, then the rights.  For what it's worth, this
resembles what G&S call "interleaving", which they deprecate in
the evaluation of function arguments.  (But not explicitly in
this case.)  Also for what it's worth, I would have hoped that
the compiler did what you thought.  HOWEVER (again), I've learned
that where the document doesn't explicitly specify some part of
the language, the definition of the language (for all practical
purposes) resides in the compiler.
-- 

	Joe Yao		hadron!jsdy@seismo.{CSS.GOV,ARPA,UUCP}
			jsdy@hadron.COM (not yet domainised)

jason@hpcnoe.UUCP (Jason Zions) (09/28/86)

Dale Worley (drw@cullvax.UUCP) writes:

    Harbison&Steele (7.11) makes it clear that an implementation must
    evaluate one argument of a binary operator completely before starting
    evaluation of the other argument.  Thus, the result should be 6.  I
    don't know what the ANSI standard says.

Unfortunately, he didn't read far enough into section 7.11, or he would have
seen the following passage:

    The original description of C specified that subexpressions may be
    evaluated in any order ... The matter of interleaving was not
    discussed... We advise implementors to adhere rigidly to the
    restrictions outlined here...

In this passage, H&S state very clearly that the restriction on interleaving
is one they have added, and that the restriction did not exist in prior
implementations.

In short, a C compiler implemented under K&R rules does indeed permit side-
effect evaluation in any order. Pure PCC-based compilers belong in this class.
I personally agree with H&S; it's an ugly thing to do. Of course, H&S also
say:
	We also advise programmers not to exploit these restrictions too
	heavily...

which I would contend the original example does. As I have said before, people
who use more than one side-effect on the same variable in a single statement
deserve everything they get.

Could one of the true C wizards (I'm still in training, sort of...) come up
with a general statement of "things to avoid doing" to keep from getting
bitten by this interleaving bug? The statement I make in the previous para-
graph is a first cut, but I recognize that it is both too restrictive and
insufficiently restrictive to avoid the problem.
--
Jason Zions				Hewlett-Packard
Colorado Networks Division		3404 E. Harmony Road
Mail Stop 102				Ft. Collins, CO  80525
	{ihnp4,seismo,hplabs,gatech}!hpfcdc!hpcnoe!jason

barada@maggot.applicon.UUCP (09/28/86)

/* Written  4:08 pm  Sep 24, 1986 by cullvax.UUCP!drw in maggot:net.lang.c */
> In article <760@oakhill.UUCP> tomc@oakhill.UUCP (Tom Cunningham) writes:
> >	/* a = b + b + b */
> >	a = ((b=1),b) + ((b=2),b) + ((b=3),b)
> >
> >I expected the result to be 6.  With the Microsoft C compiler and the
> >compiler on the Sun 3, the result is 9.  Apparently the parenthetical
> >assignments are all getting done before the comma and addition.  Any
> >thoughts on this?

Harbison&Steele (7.11) makes it clear that an implementation must
evaluate one argument of a binary operator completely before starting
evaluation of the other argument.  Thus, the result should be 6.  I
don't know what the ANSI standard says.

Dec VAX Ultrix gives 9.

Lattice C 3.00 for MS-DOS gives 7!!!  (Yes, that's "7", not a typo!)

Dale
/* End of text from maggot:net.lang.c */

I rearranged the above to

	a = (b,(b=1)) + (b,(b=2)) + (b,(b=3));

And my BSD4.2 VAX produced:

        movl    $1,_b
        movl    _b,r0
        movl    $2,_b
        movl    _b,r1
        addl2   r1,r0
        movl    $3,_b
        movl    _b,r1
        addl2   r1,r0
        movl    r0,_a

As you can see, the comma operator is evaluated right to left. I think that
this is a serious bug.

BTW, this code produces the proper answer of 6.
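
The listing can be followed by hand by transcribing it into C one instruction at a time. The sketch below is such a transcription; it assumes the usual VAX operand order (source first, destination second) and uses ordinary variables `r0` and `r1` as stand-ins for the registers. The function name `replay_listing` is invented. It does not try to settle which C subexpression each load corresponds to.

```c
int b;   /* corresponds to _b */
int a;   /* corresponds to _a */

void replay_listing(void)
{
    int r0, r1;      /* stand-ins for VAX registers r0 and r1 */

    b = 1;           /* movl  $1,_b  */
    r0 = b;          /* movl  _b,r0  */
    b = 2;           /* movl  $2,_b  */
    r1 = b;          /* movl  _b,r1  */
    r0 += r1;        /* addl2 r1,r0  */
    b = 3;           /* movl  $3,_b  */
    r1 = b;          /* movl  _b,r1  */
    r0 += r1;        /* addl2 r1,r0  */
    a = r0;          /* movl  r0,_a  */
}
```

Each store to `b` is followed immediately by the load that uses it, so the three values added are 1, 2 and 3.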

--
Peter Barada                                   | (617)-671-9905
Applicon, Inc. A division of Schlumberger Ltd. | Billerica MA, 01821

UUCP: {allegra|decvax|mit-eddie|utzoo}!linus!raybed2!applicon!barada
      {amd|bbncca|cbosgd|wjh12|ihnp4|yale}!ima!applicon!barada

	Sanity is yet another perversion of reality.

KLH@SRI-NIC.ARPA (Ken Harrenstien) (09/29/86)

Someone asked what other compilers produced for the expression:
	a = ((b=1),b) + ((b=2),b) + ((b=3),b);

I just tried it on KCC, the PDP-10 C compiler that I and others have
developed (not a commercial product, but available).  It returns 6.
Of course, if it didn't I would have fixed it!
	As a general observation on questions of this nature, I would
expect that compilers written since the publication of H&S have been
much better (cleaner, more consistent, more predictable, etc) than
their predecessors, because H&S is a much, MUCH better guide than K&R.
Unfortunately for future implementors, the ANSI draft is more like K&R
than H&S...
-------

bet@ecsvax.UUCP (Bennett E. Todd III) (09/29/86)

In article <353@cullvax.UUCP> drw@cullvax.UUCP (Dale Worley) writes:
>> In article <760@oakhill.UUCP> tomc@oakhill.UUCP (Tom Cunningham) writes:
>> >	/* a = b + b + b */
>> >	a = ((b=1),b) + ((b=2),b) + ((b=3),b)
>> >
>> >I expected the result to be 6.  With the Microsoft C compiler and the
>> >compiler on the Sun 3, the result is 9.
>
>Dec VAX Ultrix gives 9.
>
>Lattice C 3.00 for MS-DOS gives 7!!!  (Yes, that's "7", not a typo!)

Microsoft C 3.0 small memory model gives 6 and DeSmet C 2.51 small
memory model gives 7.

Looks like this falls in the "don't count on the order of evaluation of
subexpressions with side-effects" bucket, even though it doesn't look
illegal.

Certainly, anything that produces widely different answers under
different popular implementations of C should be avoided; it is all well
and good to try to say "such and so is RIGHT, and anything that does
different is wrong" but that doesn't help portability.

-Bennett
-- 

Bennett Todd -- Duke Computation Center, Durham, NC 27706-7756; (919) 684-3695
UUCP: ...{decvax,seismo,philabs,ihnp4,akgua}!mcnc!ecsvax!duccpc!bet

fgd3@jc3b21.UUCP (Fabbian G. Dufoe) (09/29/86)

In article <111@titan.UUCP>, eectrsef@titan.UUCP writes:
> In article <760@oakhill.UUCP> tomc@oakhill.UUCP (Tom Cunningham) writes:
> >	/* a = b + b + b */
> >	a = ((b=1),b) + ((b=2),b) + ((b=3),b)
> >
> >I expected the result to be 6.  With the Microsoft C compiler and the
> >compiler on the Sun 3, the result is 9.  Apparently the parenthetical
> >assignments are all getting done before the comma and addition.  Any
> >thoughts on this?
> >
> Tom, I agree, the result should be 6, as defined by K&R, but I have tried
> it on a Cyber 180/830 running NOS VE, and get 9, also AT&T's 3B5
> System V, gets 9. 
> I would like everyone to test it on as many machines as possible, to
> see if we can find at least ONE other besides Small-C that works.
> 
> Mike Stump  ucbvax!hplabs!csun!csunb!beusemrs

     I compiled the following code on an AT&T 3B2 (System V) and an
Amiga (Lattice 3.03):

main()
{
     int a, b;
     a = ((b=1),b) + ((b=2),b) + ((b=3),b);
     printf("%d\n", a);
     a = (b=1) + (b=2) + (b=3);
     printf("%d\n", a);
     a = (b=1), a += (b=2), a += (b=3);
     printf("%d\n", a);
}

     On the 3B2 it produced:

9
9
6

     On the Amiga, surprisingly, it produced:

7
7
6

     Both are wrong, but one can see where the 3B2 went wrong.  How in
the world did Lattice come up with 7?

Fabbian Dufoe
  350 Ling-A-Mor Terrace South
  St. Petersburg, Florida  33705
  813-823-2350

UUCP: ...akgua!akguc!codas!peora!ucf-cs!usfvax2!jc3b21!fgd3

jsdy@hadron.UUCP (Joseph S. D. Yao) (09/29/86)

Oops, minor citation correction.  Guy HARRIS reminds me:

In article <580@hadron.UUCP> jsdy@hadron.UUCP (Joseph S. D. Yao) writes:
>Guy & Steele, although slightly influenced by X3J11, echo:
should be:
>Harbison & Steele, although slightly influenced by X3J11, echo:

Maybe one of these days I'll get married & stop posting after
midnight ... being too busy with more important things ...
-- 

	Joe Yao		hadron!jsdy@seismo.{CSS.GOV,ARPA,UUCP}
			jsdy@hadron.COM (not yet domainised)

throopw@dg_rtp.UUCP (Wayne Throop) (09/30/86)

> drw@cullvax.UUCP (Dale Worley)
>> tomc@oakhill.UUCP (Tom Cunningham)

>> 	/* a = b + b + b */
>> 	a = ((b=1),b) + ((b=2),b) + ((b=3),b)
>>
>> I expected the result to be 6.

> Harbison&Steele (7.11) makes it clear that an implementation must
> evaluate one argument of a binary operator completely before starting
> evaluation of the other argument.  Thus, the result should be 6.

True.  But note what H&S say at the end of section 7.11:

    The matter of interleaving was not discussed [in the original
    description of C...]  We advise implementors to adhere [to the
    non-interleaving rule.]  We also advise programmers not to exploit
    [this rule] too cleverly.

And, as we have seen, their second bit of advice was well taken, since
their first bit of advice seems to have been ignored in almost all
implementations.

> I don't know what the ANSI standard says.

And the clincher is that ANSI didn't go along with H&S on this point.
They say that expressions separated by "sequence points" may not be
interleaved.  These sequence points occur between "full expressions".
Full expressions are initializers, expression statements, expressions in
"if", "while" and the like, and expressions in "return".  (I suspect
that (?:) and (,) expressions have sequence points also, but couldn't
find this on a trivial inspection of the draft standard.) Thus, ANSI C
says that 9 is as good as 6, and repudiates H&S.  So it goes.

--
Sometimes I think the only universal in the computing field is the
fetch-execute cycle.
    --- Alan J. Perlis
-- 
Wayne Throop      <the-known-world>!mcnc!rti-sel!dg_rtp!throopw

mdapoz@watrose.UUCP (Mark Dapoz) (09/30/86)

In article <4230@brl-smoke.ARPA> KLH@SRI-NIC.ARPA (Ken Harrenstien) writes:
>Someone asked what other compilers produced for the expression:
>	a = ((b=1),b) + ((b=2),b) + ((b=3),b);

I just tried this on my CP/M system using BDS C 1.5 and it gave me a 
value of 6.

     Mark Dapoz
  mdapoz@watrose.UUCP

pedz@bobkat.UUCP (Pedz Thing) (09/30/86)

In article <673@galbp.UUCP> gbm@galbp.UUCP (Gary McKenney) writes:
>> In article <760@oakhill.UUCP> tomc@oakhill.UUCP (Tom Cunningham) writes:
>> >	/* a = b + b + b */
>> >	a = ((b=1),b) + ((b=2),b) + ((b=3),b)
>> >

Let's change the expression from
	a = ((b=1),b) + ((b=2),b) + ((b=3),b)
to
	a = ((a1),r1) + ((a2),r2) + ((a3),r3)

where a1 stands for assignment (to b) 1 and r1 stands for reference
(to b) 1.  The following conditions must be met for proper code:
a1 before r1
a2 before r2
a3 before r3

That is all of the restrictions imposed by C.  Thus the following is
correct code:

a3, a2, a1, r1, r2, r3

which produces the answer of 3.  I think I can come up with correct
code which produces an answer anywhere from 3 to 9.

This same question comes up every few days it seems like.  I do not
see what is so confusing about it.  The simple law that C imposes no
restrictions on the ordering of most operators never seems to be
understood.

Perry
-- 
Perry Smith
ctvax ---\
megamax --- bobkat!pedz
pollux---/

blarson@usc-oberon.UUCP (Bob Larson) (10/01/86)

In article <8161@watrose.UUCP> mdapoz@watrose.UUCP (Mark Dapoz) writes:
>In article <4230@brl-smoke.ARPA> KLH@SRI-NIC.ARPA (Ken Harrenstien) writes:
>>Someone asked what other compilers produced for the expression:
>>	a = ((b=1),b) + ((b=2),b) + ((b=3),b);
Microware C 2.0 for os9/68k has correct answers for both int and register
int declarations of a and b.  In other words, a is between 6 and 9 inclusive
and b is between 1 and 3 inclusive.  (If anyone is curious, a happens to be
6 and b happens to be 3, but I don't see how this matters.)
-- 
Bob Larson
Arpa: Blarson@Usc-Eclb.Arpa	or	blarson@usc-oberon.arpa
Uucp: (ihnp4,hplabs,tektronix)!sdcrdcf!usc-oberon!blarson

fgd3@jc3b21.UUCP (Fabbian G. Dufoe) (10/03/86)

     Earlier this week I posted the following (Message-ID: <468@jc3b21.UUCP>):

>      I compiled the following code on an AT&T 3B2 (System V) and an
> Amiga (Lattice 3.03):
> 	
> main()
> {
>      int a, b;
>      a = ((b=1),b) + ((b=2),b) + ((b=3),b);
>      printf("%d\n", a);
>      a = (b=1) + (b=2) + (b=3);
>      printf("%d\n", a);
>      a = (b=1), a += (b=2), a += (b=3);
>      printf("%d\n", a);
> }
> 	
>      On the 3B2 it produced:
> 	
> 9
> 9
> 6

     This evening I received the following mail:

> While I disagree about the "required" result for the first expression,
> I have no doubt that 6 and 6 are required for the second two, and I
> believe that our 3B2 compilers give those results.  The value of an
> assignment or assignment-op is always the value of its left side.
> I would appreciate it if you would double-check your result and post
> a correction on the net.
> 
> Dave Kristol
> AT&T
> ...akgua!acguc!codas!sfbc!dmk

     OK, Dave.  I compiled and ran the program again on the 3B2.  It
produced 9, 9, and 6, just as I said the first time.  Now let me take issue
with the errors in your note.

     (1) There is no significant difference between the first two expressions.
Whatever reason you have to disagree about the required result for the
first should apply to the second as well.  Thus ((b=1),b) doesn't do
anything in this case which (b=1) doesn't do.

     (2) I reviewed K&R for some light on what result the first and second
expressions should produce.  I found the following sentence on page 49:
"As mentioned before, expressions involving one of the associative and
commutative operators (*, +, &, ^, |) can be rearranged even when
parenthesized."  And people call BASIC brain-damaged!  This means
the results of the first two expressions are unpredictable without
knowledge of the specific C compiler involved.  Furthermore, it implies the
following expression is unpredictable:

     a = (((b=1) + (b=2)) + (b=3))

Incredible as it may seem, the 3B2 thinks "a" will be 9.  (The Amiga,
with Lattice, thinks "a" will be 7.)  I believe that is a serious flaw in the
language definition.  Two C compilers, both correctly following the
definition in K&R, can compile the same legal C expression and come up
with different results.  Oh, well.  Who said C was a portable language?

     (3) You said you believe the 3B2 produces the results you thought
correct.  Why didn't you check?  Your reasoning seemed to be "I think the
correct results are 9, 6, and 6.  I think the 3B2 compiler is always right.
Therefore the results must have been 9, 6, and 6."  To my thinking, the
proper response would have been "Maybe there is something wrong here.  I'll
compile the program and see the results for myself.  Hmm...The results are
9, 9, and 6.  I wonder what's wrong with the compiler or my understanding
of C?"  Do you see the difference?

     This has certainly been an instructive exercise for me.  It is
important to understand peculiarities like that.  If I hadn't gone into it
this carefully I'd have been bitten later on when the result was critical.
Thanks for making me take such a hard look at the question.

Fabbian Dufoe
  350 Ling-A-Mor Terrace South
  St. Petersburg, Florida  33705
  813-823-2350

UUCP: ...akgua!akguc!codas!peora!ucf-cs!usfvax2!jc3b21!fgd3

gof@NOSC.ARPA (10/03/86)

>      a = ((b=1),b) + ((b=2),b) + ((b=3),b);

Someone posted that Microsoft C gives 9.  I think they were referring to pre v.3
(which was identical to Lattice), as Version 3.0 gives 6.

Jerry Fountain
crash!pnet01!gof@nosc

garys@bunker.UUCP (Gary M. Samuelson) (10/03/86)

Instead of

	a = ((b=1),b) + ((b=2),b) + ((b=3),b);

I would like to suggest using

	a = ((b=1),b) + ((b=4),b) + ((b=16),b);

This has the advantage that the result in a is unambiguous.
For example, the answer '7' could be 1+3+3 or 2+2+3, but with
the modified test case, the same order of evaluation would
produce 33 or 24.  Furthermore, the (supposedly) correct answer
'6' could be incorrectly generated by 2+2+2, instead of 1+2+3.
The modified test will produce 12 or 21, respectively.
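
The reason the modified test is unambiguous is that, with the values 1, 4 and 16, a sum of three reads is a three-digit base-4 number whose digits count how often each value was seen. A small decoder along those lines might look like this; the struct and function names are invented for the illustration.

```c
/* How many of the three reads of b saw each test value. */
struct read_counts {
    int ones;      /* reads that saw b == 1  */
    int fours;     /* reads that saw b == 4  */
    int sixteens;  /* reads that saw b == 16 */
};

/*
 * Decode the result a of
 *     a = ((b=1),b) + ((b=4),b) + ((b=16),b);
 * Returns 0 on success, -1 if a cannot be a sum of exactly three reads
 * (in which case *out holds no meaningful data).
 */
int decode_result(int a, struct read_counts *out)
{
    if (a < 0 || a >= 64)
        return -1;
    out->ones     = a % 4;         /* base-4 digit for 1  */
    out->fours    = (a / 4) % 4;   /* base-4 digit for 4  */
    out->sixteens = a / 16;        /* base-4 digit for 16 */
    if (out->ones + out->fours + out->sixteens != 3)
        return -1;
    return 0;
}
```

For example, 33 decodes to one read of 1 and two reads of 16, while 24 decodes to two reads of 4 and one of 16. The decoder tells you which values were read, not which of the three reads saw which value.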

Gary Samuelson

markp@valid.UUCP (Mark P.) (10/05/86)

> In article <673@galbp.UUCP> gbm@galbp.UUCP (Gary McKenney) writes:
> >> In article <760@oakhill.UUCP> tomc@oakhill.UUCP (Tom Cunningham) writes:
> >> >	/* a = b + b + b */
> >> >	a = ((b=1),b) + ((b=2),b) + ((b=3),b)
> >> >
> 
> Let's change the expression from
> 	a = ((b=1),b) + ((b=2),b) + ((b=3),b)
> to
> 	a = ((a1),r1) + ((a2),r2) + ((a3),r3)
> 
> where a1 stands for assignment (to b) 1 and r1 stands for reference
> (to b) 1.  The following conditions must be met for proper code:
> a1 before r1
> a2 before r2
> a3 before r3
> 
> That is all of the restrictions imposed by C.  Thus the following is
> correct code:
> 
> a3, a2, a1, r1, r2, r3
> 
> which produces the answer of 3.  I think I can come up with correct
> code which produces an answer anywhere from 3 to 9.
> 
> This same question comes up every few days it seems like.  I do not
> see what is so confusing about it.  The simple law that C imposes no
> restrictions on the ordering of most operators never seems to be
> understood.
> 
> Perry

Finally, a sensible response.  In other words, instead of everyone crying about
how the compilers do different things under ambiguous K&R guidelines, just
DON'T WRITE THIS TYPE OF CODE.  Face it: since there are compilers in
existence that don't generate code for expressions in exactly the same
order, engaging in this "creative" programming is NON-PORTABLE.

Expression rearrangement is an extremely valuable tool in optimization.
For instance, consider the following simple example:

   a= (b+c)+d

Assume that b is a memory variable, but that a and c and d are registers,
and furthermore that the processor has a delayed load (not an unreasonable
situation for a RISC).  The generated code would then look like:

   load rb, _b
   add ra, rc, rd	; load completes during this instruction
   add ra, ra, rb

Here we see the value of breaking the user's parentheses in associative/
commutative operator evaluation in order to save one instruction.

In another type of machine with multiple functional units, the possibilities
for expression optimization are endless.  I would not want to castrate the
potential of such machines just so some people could write "clever-looking"
code that is explicitly warned against in the ANSI standard.  And please
stop complaining about a situation that won't, can't, and shouldn't go away.

	Mark Papamarcos
	Valid Logic Systems
	hplabs!ridge!valid!markp

robison@uiucdcsb.cs.uiuc.edu (10/05/86)

>      (2) I reviewed K&R for some light on what result the first and second
> expressions should produce.  I found the following sentence on page 49:
> "As mentioned before, expressions involving one of the associative and
> commutative operators (*, +, &, ^, |) can be rearranged even when
> parenthesized."  And people call BASIC brain-damaged!  This means
> the results of the first two expressions are unpredictable without
> knowledge of the specific C compiler involved....
> ...I believe that is a serious flaw in the
> language definition.  Two C compilers, both correctly following the
> definition in K&R, can compile the same legal C expression and come up
> with different results.  Oh, well.  Who said C was a portable language?
>

1.   It is not a flaw in a language to specify certain constructs as undefined.
     If the C compilers were not allowed to re-arrange expressions, then the
     results could be exactly defined, at the cost of slowing the program down. 
     On page 50, K&R points out that "the best order strongly depends on machine
     architecture".  C was designed for writing efficient programs on a variety 
     of machines.
 
2.   The operations *,+,&,^,| are commutative and associative in mathematics.
     To write code depending on a particular evaluation order is a very poor
     practice.  The meaning of the code may be obvious to you, but other
     readers will most likely be misled.  Programmers should strive for clarity
     of expression.

3.   Determinism is not always a good thing, for example it can slow down 
     parallel processors.  Some languages (e.g. Dijkstra's guarded commands)
     may not give you the same result on the same machine.

4.   Programs are portable if written properly, i.e. pay heed to K&R's warnings.
     I used to port and maintain programs for a large computer company.
     The company has a "portable" language which supposedly runs identically
     on all its various processors.  Therefore the programmers should never
     have to worry about machine-dependencies.  Those programs were difficult
     to port, because the language did NOT run identically on all 
     processors, and the programmers never bothered to think about the
     implications.

5.   In contrast to (4), I have written a large program in C which runs on
     VAX's, PCs, and the CRAY.  The port to the PC/RT took 13 minutes, because
     I took time when writing to avoid portability problems.

In short, C's undefined constructs are not a flaw and do not deter writing
portable code.

Arch D. Robison					robison@uiucdcs
University of Illinois at Urbana-Champaign

Wilkinson@HI-MULTICS.ARPA (10/06/86)

On an Intel 286/310 running Xenix 3 updt 3 I get:
  a= ((b=1),b)+((b=2),b)+((b=3),b);          RESULT a = 9
  a= (b,(b=1))+(b,(b=2))+(b,(b=3));          RESULT a = 6
          Richard Wilkinson   (Wilkinson@HI-MULTICS)

len@geac.UUCP (Leonard Vanek) (10/07/86)

The one thing that is clear from all of the discussion on
the problem of expression sequencing is that one can never
be sure of the order in which an expression in C will be
evaluated.

Although I agree that it is asking for trouble to mix
side effects in with complicated expressions, I still
believe that it is a pity that C (even Ansi C) does not even
let the programmer tell it what order is desired by the use
of parentheses! To ignore parentheses in determining the
evaluation order (i.e. (a+b)+c does not guarantee that a and
b are added first) causes problems with round off errors,
not just side effects -- and is totally counter-intuitive.
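
The round-off concern can be made concrete with a small sketch. It assumes IEEE 754 style doubles, and the helper names are invented for illustration. Each addition sits in its own statement, so the order is fixed regardless of how a compiler treats parentheses:

```c
/* Hypothetical helper names. Each step is a separate statement, so the
   order of the two additions is fixed by statement sequencing. */

/* (a + b) + c */
double sum_left_first(double a, double b, double c)
{
    double t = a + b;
    t += c;
    return t;
}

/* a + (b + c) */
double sum_right_first(double a, double b, double c)
{
    double t = b + c;
    t = a + t;
    return t;
}
```

With a = 1e30, b = -1e30 and c = 1.0, the first grouping cancels the large terms before adding c and returns 1.0, while the second absorbs c into -1e30 and returns 0.0.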

---------------------------------------------------------------------
Leonard Vanek                       phone (416) 475-0525
Geac Computers International
350 Steelcase Rd. West              USENET ... !utzoo!yetti!geac!len
Markham Ontario L3R 1B3
Canada

Note: My Usenet path is subject to change on short notice!

chris@umcp-cs.UUCP (Chris Torek) (10/09/86)

In article <160@geac.UUCP> len@geac.UUCP (Leonard Vanek) writes:
>... I agree that it is asking for trouble to mix
>side effects in with complicated expressions, [but] I still believe
>that it is a pity that C (even Ansi C) does not even let the
>programmer tell it what order is desired by the use of parentheses!
>To ignore parentheses in determining the evaluation order (i.e.
>(a+b)+c does not guarantee that a and b are added first) causes
>problems with round off errors, not just side effects -- and is
>totally counter-intuitive.

If you want `a+b' to be done first, then `result + c', use

	result = a + b;
	result += c;

The solution is trivial, and the `problem' is well documented.  To
those to whom parentheses directly determine order of evaluation,
I suppose this is indeed troublesome.  As for myself, I never expect
parentheses to do more than override default precedence, so it is
not `totally counter-intuitive' to me.  Even in Fortran (where,
I believe, the language specification says that (A+B)+C requires
doing A+B first, then result+C) I would write

	RESULT = A + B
	RESULT = RESULT + C

---if that were what I meant.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 1516)
UUCP:	seismo!umcp-cs!chris
CSNet:	chris@umcp-cs		ARPA:	chris@mimsy.umd.edu

chris@umcp-cs.UUCP (Chris Torek) (10/09/86)

Incidentally, I once used a system in which parentheses determined
only precedence.  (I used it early in my programming ventures,
which may explain my mindset.)  It did something rather novel,
which might help all these C programmers who think parentheses
directly control evaluation order:  It removed any `unnecessary'
parentheses.

(The system, incidentally, was an HP 9825A `desk-top calculator',
as part of an HP 3060A board test system.)
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 1516)
UUCP:	seismo!umcp-cs!chris
CSNet:	chris@umcp-cs		ARPA:	chris@mimsy.umd.edu

coltoff@burdvax.UUCP (Joel Coltoff) (10/09/86)

In article <160@geac.UUCP>, len@geac.UUCP (Leonard Vanek) writes:
> The one thing that is clear from all of the discussion on
> the problem of expression sequencing is that one can never
> be sure of the order in which an expression in C will be
> evaluated.

Not always true. Logical expressions are evaluated left to right.

> I still believe that it is a pity that C (even Ansi C) does not even
> let the programmer tell it what order is desired by the use
> of parentheses! To ignore parentheses in determining the
> evaluation order (i.e. (a+b)+c does not guarantee that a and
> b are added first) causes problems with round off errors,
> not just side effects -- and is totally counter-intuitive.

Agreed. But keep this in mind: C can change the order in which it
evaluates expressions (we told you that when you took this job);
what it doesn't do is change the order in which it evaluates statements.
If you really want to do (a+b) + c then do it like this

	x = a + b;
	x += c;

Yes it uses more variables, generates more code and is less efficient but
it gets the correct answer. This is often much more important than
any of the other considerations we make when we write programs.

Now to throw in my few shekels. When you do something like

	a = (b=1,b) + (b=2,b) + (b=3,b);

or more realistically

	if ( a < b || ( c = ( x/y ) ) == 42 || d == data[i] )

you deserve to get burned. Sure the language lets you do things like
assign a value to c and compare it to 42 in the same expression but
keep in mind that that assignment isn't done on each pass through that
block of code.
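
A short sketch of that last point, using a made-up function name and a sentinel value of -1 as assumptions. Because `||` stops at the first true operand, the assignment to c (and the division) only happens when `a < b` is false:

```c
/* Hypothetical function: returns the value of c after the test.
   c starts at the sentinel -1 so the caller can see whether the
   assignment inside the condition was actually executed. */
int c_after_test(int a, int b, int x, int y)
{
    int c = -1;
    int taken = (a < b || (c = x / y) == 42);

    (void)taken;   /* the truth value itself is not needed here */
    return c;
}
```

Calling it with a < b leaves c at -1 and never evaluates x / y, so even y == 0 is harmless in that case.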

henry@utzoo.UUCP (Henry Spencer) (10/09/86)

> ... I still believe that it is a pity that C (even Ansi C) does not even
> let the programmer tell it what order is desired by the use
> of parentheses! ...

The problem, as has been mentioned before, is that parentheses are also
used to override the precedence rules for operators.  This is a use in
which one does *not* necessarily want to imply the forcing of evaluation
order.  There really is no entirely satisfactory solution except to use
two different constructs for the two different roles.  C opted long ago
to use parentheses for precedence overrides, and to require explicit
assignment to a temporary to force order of evaluation.  X3J11 has actually
added a few hooks for order control, but changing the meaning of parentheses
for that purpose wasn't considered wise.
-- 
				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,decvax,pyramid}!utzoo!henry

tps@sdchem.UUCP (Tom Stockfisch) (10/10/86)

In article <160@geac.UUCP> len@geac.UUCP (Leonard Vanek) writes:
>The one thing that is clear from all of the discussion on
>the problem of expression sequencing is that one can never
>be sure of the order in which an expression in C will be
>evaluated.
>
>...
>... To ignore parentheses in determining the
>evaluation order (i.e. (a+b)+c does not guarantee that a and
>b are added first) causes problems with round off errors,
>not just side effects -- and is totally counter-intuitive.

Allowing optimizing compilers the freedom to rearrange expressions is (I
am told) crucial to performance on many machines.  Good portability means
not only that a program will  w o r k  on any machine, but that it will
work  e f f i c i e n t l y  on any machine, and people won't be tempted
to waste time fine-tuning programs to many different machines (a maintenance
nightmare).  For non-numerical calculations, order of evaluation almost
never matters, except with poorly written expressions (e.g.  multiple
side effects).
For numerical calculations it would be very nice to be able
to specify order.  Breaking up the expression into individual ones is  n o t
a viable option, as serious numerical work can have rather long expressions
which would be rendered unreadable.

I think the best solution would be a unary operator with syntax like 'return',
so that the compiler would have to respect parentheses for order of evaluation
of the following statement.
Unfortunately, this would probably mean another key word or another obscure
overloading of a current one.  But a feature like this is really important for
numerical programming.  For instance, call the operator "respect".  Then

	respect	<expr> ;

would mean that <expr> must have its parentheses respected.  I think this is
much better than

	respect( <expr> );

because extra parentheses tend to muck things up in numeric work, altho of
course wimps (:-) who insist on using

	return(expr);

instead of

	return	expr;

could do likewise with "respect".
-- 

-- Tom Stockfisch, UCSD Chemistry

Schauble@MIT-MULTICS.ARPA (Paul Schauble) (10/10/86)

OK, so (b=1,b)+(b=2,b) is ambiguous.  Now, is it reasonable to ask lint
to flag things like this??

          Paul
          Schauble at MIT-Multics

peters@cubsvax.UUCP (Peter S. Shenkin) (10/10/86)

In article <umcp-cs.3769> chris@umcp-cs.UUCP (Chris Torek) writes:
>
>If you want `a+b' to be done first, then `result + c', use
>
>	result = a + b;
>	result += c;
>
>The solution is trivial, and the `problem' is well documented.
>						...I never expect
>parentheses to do more than override default precedence, so it is
>not `totally counter-intuitive' to me.

One of the attractions of C is its elegance and conciseness of expression;
having to declare a variable only for the purpose of defining order of
evaluation, even when the expression is extremely simple, is inelegant and 
inconcise, and the requirement to do so can easily double the size (as measured 
by the number of lines) of source code in numerical work where rounding error
is significant and such order has to be thoroughly thought through.

C wasn't originally designed for such applications, of course, but now that
we're going to be able to do single-precision arithmetic across function
calls there's going to be less and less reason to avoid using C;  unfortunately,
this parentheses thing is going to remain one of them.

I understand the reason for the accepted convention, and I accept that reason,
but even if it's necessary it's a necessary evil;  let's not make a virtue
out of it.  I wish there were some way of forcing order of execution, to 
this extent anyway, within a line.

Peter S. Shenkin	 Columbia Univ. Biology Dept., NY, NY  10027
{philabs,rna}!cubsvax!peters		cubsvax!peters@columbia.ARPA

henry@utzoo.UUCP (Henry Spencer) (10/10/86)

> For numerical calculations it would be very nice to be able
> to specify order.  Breaking up the expression into individual ones is  n o t
> a viable option...  I think the best solution would be a unary operator
> with syntax like 'return', so that the compiler would have to respect
> parentheses for order of evaluation of the following statement...
> 	
> 	respect	<expr> ;

X3J11 has not done quite this, but they have provided a way to specify a
"fence" within an expression.  The unary plus operator, originally provided
for relatively minor reasons of consistency, also does this job.  To force
the parenthesized addition in "(a+b)+c" to be done first, write "+(a+b)+c".
(Unary operators have higher priority than binary, so "+(a+b)" is an operand
of the last "+".)  The official definition of the effect is something like
"inhibits regrouping of subexpressions within its operand with subexpressions
outside it".  I can't say that I am in love with the syntax, but the facility
is there.
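
For reference, the spelling itself is ordinary C. The sketch below (function name invented) only shows that the expression compiles and, for small int operands, yields the usual sum. Whether a particular compiler treats the unary plus as a regrouping barrier depends on which draft or standard it follows; that is an assumption to check against the compiler's documentation, not something this snippet demonstrates:

```c
/* Hypothetical function name. Unary plus applied to a parenthesized
   sum, as in the text; for int operands the value is the same with
   or without it. */
int fenced_sum(int a, int b, int c)
{
    return +(a + b) + c;
}
```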
-- 
				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,decvax,pyramid}!utzoo!henry

chris@umcp-cs.UUCP (Chris Torek) (10/11/86)

In article <4504@brl-smoke.ARPA> Schauble@MIT-MULTICS.ARPA (Paul Schauble)
writes:
>OK, so (b=1,b)+(b=2,b) is ambiguous.  Now, is it reasonable to ask lint
>to flag things like this??

Yes.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 1516)
UUCP:	seismo!umcp-cs!chris
CSNet:	chris@umcp-cs		ARPA:	chris@mimsy.umd.edu

simon@its63b.ed.ac.uk (ECSC68 S Brown CS) (10/11/86)

In article <2076@ecsvax.UUCP> bet@ecsvax.UUCP (Bennett E. Todd III) writes:
>In article <353@cullvax.UUCP> drw@cullvax.UUCP (Dale Worley) writes:
>>> In article <760@oakhill.UUCP> tomc@oakhill.UUCP (Tom Cunningham) writes:
>>> >	/* a = b + b + b */
>>> >	a = ((b=1),b) + ((b=2),b) + ((b=3),b)
>>> >
>>> >I expected the result to be 6.  With the Microsoft C compiler and the
>>> >compiler on the Sun 3, the result is 9.
>>Dec VAX Ultrix gives 9.
>>Lattice C 3.00 for MS-DOS gives 7!!!  (Yes, that's "7", not a typo!)
>Microsoft C 3.0 small memory model gives 6 and DeSmet C 2.51 small
>memory model gives 7.
>
>Looks like this falls in the "don't count on the order of evaluation of
>subexpressions with side-effects" bucket, even though it doesn't look
>illegal.
>
>Certainly, anything that produces widely different answers under
>different popular implementations of C should be avoided; it is all well
>and good to try to say "such and so is RIGHT, and anything that does
>different is wrong" but that doesn't help portability.
>

Ok, this seems good advice for expressions where each subexpression
_must_ be evaluated at _some_ time.
But, how about
	c = (b=1,a==1) || (b=2,a==0) || (b=3,a==3) || (b=4,a==2);
as a quick nice way of saying
	switch (a) {
	    case 1: b=1; c=1; break;
	    case 0: b=2; c=1; break;
	    case 3: b=3; c=1; break;
	    case 2: b=4; c=1; break;
	    default: c=0;
	}
where the evaluation will "break off" at the point where one of
the comparisons first succeeds.

(I actually wanted to use something a bit like this a few days ago,
but now I'm not too sure it's that portable at all, considering
all the problems with "+").

Of course, compilers on which _this_ doesn't work should be
considered even *more* wrong than those as described above, but
that's not too constructive a complaint...


--
Simon Brown
Dept. of Computer Science, University of Edinburgh.
{seismo,ihnp4,decvax}!mcvax!ukc!cstvax(!its63b?)!simon

levy@ttrdc.UUCP (Daniel R. Levy) (10/15/86)

In article <559@cubsvax.UUCP>, peters@cubsvax.UUCP (Peter S. Shenkin) writes:
>>If you want `a+b' to be done first, then `result + c', use
>>
>>	result = a + b;
>>	result += c;
>>
>>The solution is trivial, and the `problem' is well documented.
>>						...I never expect
>>parentheses to do more than override default precedence, so it is
>>not `totally counter-intuitive' to me.
>One of the attractions of C is its elegance and conciseness of expression;
>having to declare a variable only for the purpose of defining order of
>evaluation, even when the expression is extremely simple, is inelegant and 
>inconcise, and the requirement to do so can easily double the size (as measured 
>by the number of lines) of source code in numerical work where rounding error
>is significant and such order has to be thoroughly thought through.
>
>C wasn't originally designed for such applications, of course, but now that
>we're going to be able to do single-precision arithmetic across function
>calls there's going to be less and less reason to avoid using C;  unfortunately,
>this parentheses thing is going to remain one of them.
>I understand the reason for the accepted convention, and I accept that reason,
>but even if it's necessary it's a necessary evil;  let's not make a virtue
>out of it.  I wish there were some way of forcing order of execution, to 
>this extent anyway, within a line.
>Peter S. Shenkin	 Columbia Univ. Biology Dept., NY, NY  10027

In C, you can put more than one statement on a line!  So it would be
feasible, if awkward, to do something like

	float a,b,c,d,e,t;

	t=a+b;t+=c;d*=t;d/=e;

for the FORTRAN

	D=(D*((A+B)+C))/E

Obviously it's possible but unnecessary to use four different lines:

	t=a+b;
	t+=c;
	d*=t;
	d/=e;
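
Wrapped in a function, the same sequence can be tried out directly. The function name is an assumption made for illustration; the body is the four statements above, one operation each:

```c
/* Hypothetical wrapper around the statement sequence in the post:
   computes (d * ((a + b) + c)) / e, one operation per statement. */
float fortran_order(float a, float b, float c, float d, float e)
{
    float t;

    t = a + b;   /* A+B first            */
    t += c;      /* then (A+B)+C         */
    d *= t;      /* then D*((A+B)+C)     */
    d /= e;      /* finally divide by E  */
    return d;
}
```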
-- 
 -------------------------------    Disclaimer:  The views contained herein are
|       dan levy | yvel nad      |  my own and are not at all those of my em-
|         an engihacker @        |  ployer or the administrator of any computer
| at&t computer systems division |  upon which I may hack.
|        skokie, illinois        |
 --------------------------------   Path: ..!{akgua,homxb,ihnp4,ltuxa,mvuxa,
	   go for it!  			allegra,ulysses,vax135}!ttrdc!levy

fgd3@jc3b21.UUCP (10/15/86)

In article <139200039@uiucdcsb>, robison@uiucdcsb.cs.uiuc.edu writes:
> 
> > the results of the first two expressions are unpredictable without
> > knowledge of the specific C compiler involved....
> > ...I believe that is a serious flaw in the
> > language definition.  Two C compilers, both correctly following the
> > definition in K&R, can compile the same legal C expression and come up
> > with different results.  Oh, well.  Who said C was a portable language?
> 
> 4. Programs are portable if written properly, i.e. pay heed to K&R's warnings.
>      I used to port and maintain programs for a large computer company.
>      The company has a "portable" language which supposedly runs identically
>      on all its various processors.  Therefore the programmers should never
>      have to worry about machine-dependencies.  Those programs were difficult
>      to port, because the language did NOT run identically on all 
>      processors, and the programmers never bothered to think about the
>      implications.
> 
> 5.   In contrast to (4), I have written a large program in C which runs on
>      VAX's, PCs, and the CRAY.  The port to the PC/RT took 13 minutes, because
>      I took time when writing to avoid portability problems.
> 
> 
> Arch D. Robison					robison@uiucdcs
> University of Illinois at Urbana-Champaign

     Your examples support my point: a _portable_ language is one which
runs identically on all implementations.  A language which permits the code
generated by its statements to be implementation-dependent is not portable.
When dealing with a non-portable language (like C) you can work around the
problem by avoiding those statements which are evaluated in an
implementation-dependent way.  But if portability is important to you--and
I believe it should be--then it is a flaw in the language definition to
permit the evaluation of statements which are syntactically correct to
depend on the implementation.

     In (4) above you describe a language which was supposed to be portable
but wasn't.  Failing to work around the non-portable features caused
problems.  In (5) above you describe a C program which was portable
specifically because you "took time when writing to avoid portability
problems."

Fabbian Dufoe
  350 Ling-A-Mor Terrace South
  St. Petersburg, Florida  33705
  813-823-2350

UUCP: ...akgua!akguc!codas!peora!ucf-cs!usfvax2!jc3b21!fgd3

chris@umcp-cs.UUCP (Chris Torek) (10/16/86)

In article <559@cubsvax.UUCP> peters@cubsvax.UUCP (Peter S. Shenkin) writes:
>One of the attractions of C is its elegance and conciseness of
>expression; having to declare a variable only for the purpose of
>defining order of evaluation, even when the expression is extremely
>simple, is inelegant and inconcise, ....

I wrote a nice long reply to this, but it fell into the news rubbish
heap because the `active' file was too long.  Naturally the posting
software did not bother to keep a copy of my text, since inews
never fails.  Right.  Summary: if you think C needs something to
force evaluation order, *implement something*.  Test it well.  If
it works, *then* lobby for its adoption.  If, for example, you
think altering the meaning of parentheses will not be harmful, try
it.  If you prefer `order bracketing' such as `[a + b] + c', try
that.  But whatever you try, TEST IT WELL before you cast it in
concrete (via X3J11 or whatnot).

00R0DHESI%bsu.csnet@CSNET-RELAY.ARPA (Rahul Dhesi) (10/16/86)

Peter S. Shenkin <cubsvax!peters@columbia.ARPA> writes:
<In article <umcp-cs.3769> chris@umcp-cs.UUCP (Chris Torek) writes:
<<	result = a + b;
<<	result += c;
<...
<...I wish there were some way of forcing order of execution, to 
<this extent anyway, within a line.

Try this:
     {int t = a + b; result = t + c;}  /* note:  force evaluation order */

This construct is general and it achieves the purpose.  The extra variable
exists only in the block and doesn't clutter up the rest of the program.  And 
its very presence shouts "Careful!" to the next person who modifies the code.
C doesn't cry wolf.  Other languages do. 


                                Rahul Dhesi <dhesi%bsu@csnet-relay.ARPA>
                                !seismo!csnet-relay.ARPA!bsu!dhesi
                                Yes, I know they go the wrong way.

greg@utcsri.UUCP (Gregory Smith) (10/17/86)

In article <483@jc3b21.UUCP> fgd3@jc3b21.UUCP writes:
>     Your examples support my point: a _portable_ language is one which
>runs identically on all implementations.  A language which permits the code
>generated by its statements to be implementation-dependent is not portable.

But if the statements produce the same result, despite being evaluated
in a different order, then the *program* is portable.

>When dealing with a non-portable language (like C) you can work around the
>problem by avoiding those statements which are evaluated in an
>implementation-dependent way.  But if portability is important to you--and
>I believe it should be--then it is a flaw in the language definition to
>permit the evaluation of statements which are syntactically correct
>depend on the implementation.

The following is a syntactically correct statement
1 = 1.0();
which obviously is semantically incorrect. So insert '... and semantically ..'
in the above assertion. Now one could argue that a[i]=i++; is
semantically incorrect in 'portable' C, even though it takes lint to
catch it.

The programmer can determine whether a statement will be portable
or not due to side-effects. If you make a compiler which does
not fiddle with the order of evaluation, then you may be making
*all* statements portable, but your compiler will generate significantly
less efficient code for a lot of statements which were already portable.

The main point is that the bounds of portability are well defined,
( at least they will be with ANSI C), so all you have to do is avoid
the non-portable stuff.

Why would anybody write x[i]=i++; or func( --i,--i) or j = i+i++; etc?
Even if you know it will not be ported and are clever enough to know what
it does on your current compiler, it is not obvious to someone reading
the code. In fact the reader will have to look at the assembler
produced, or write a test program. If you fix this by forcing order
of evaluation, we will just have a whole new set of rules to remember.
C operator hierarchy is already enough :-)
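
A sketch of the portable alternative to `x[i] = i++;`, with invented helper names. The statement has two plausible intents, and writing each as separate statements removes the dependence on evaluation order:

```c
/* Hypothetical helpers: each makes one of the two possible intents of
   `x[i] = i++;` explicit by using separate statements. */

/* Store the old index value at the old index, then advance. */
int store_then_advance(int x[], int i)
{
    x[i] = i;
    i++;
    return i;
}

/* Advance first, then store the old value at the new index. */
int advance_then_store(int x[], int i)
{
    int old = i;

    i++;
    x[i] = old;
    return i;
}
```

Either version reads unambiguously, which is the point: the reader no longer needs the assembler listing to know what was meant.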

From: mwm@eris.berkeley.edu (Mike Meyer)
>	Euclid: Maybe portable, but is there more than one implementation?

We have coders for PDP-11, 68000, vax, 6809, NS16K and possibly a few more.
You can still get burned by different word sizes, but there are tools
which can and should be used if you want portability.

Sorry if this is a little muddled, but current system load makes
editing a very time-consuming chore.

-- 
----------------------------------------------------------------------
Greg Smith     University of Toronto      UUCP: ..utzoo!utcsri!greg

dik@mcvax.uucp@ndmce.uucp (Dik T. Winter) (10/17/86)

In article <3769@umcp-cs.UUCP> chris@umcp-cs.UUCP (Chris Torek) writes:
>                                        Even in Fortran (where,
>I believe, the language specification says that (A+B)+C requires
>doing A+B first, then result+C) I would write
>
>	RESULT = A + B
>	RESULT = RESULT + C
>
And so should it.  Once I required the expression (A - B) - C evaluated
exactly that way.  It didn't do so.  However, after removing the
parentheses it was evaluated the way I wanted.  Counterintuitive?
-- 
dik t. winter, cwi, amsterdam, nederland
UUCP: {seismo,decvax,philabs,okstate,garfield}!mcvax!dik
  or: dik@mcvax.uucp
ARPA: dik%mcvax.uucp@seismo.css.gov

faustus@ucbcad.BERKELEY.EDU@ndmce.uucp (Wayne A. Christopher) (10/18/86)

In article <483@jc3b21.UUCP>, fgd3@jc3b21.UUCP writes:
>      Your examples support my point: a _portable_ language is one which
> runs identically on all implementations.  A language which permits the code
> generated by its statements to be implementation-dependent is not portable.

A language can't be called portable, but programs written in that language
can.  One measure of a language's usefulness is how easy it is to write portable
code in it.  By your definition, you could call any language "non-portable"
because you can always (at least if the language is used for real things)
write code that determines what kind of processor it is running on.

> When dealing with a non-portable language (like C) ...

I know of no other (useful) language which is as easy to write portable code
in as C.  Most other common languages such as lisp suffer from a lack of
a standard such as K&R which all compilers follow (at least in theory).  (I
know, there is a lisp standard now...)

> But if portability is important to you--and
> I believe it should be--then it is a flaw in the language definition to
> permit the evaluation of statements which are syntactically correct to
> depend on the implementation.

Most of these cases where things are implementation-dependent are situations
where the hardware must dictate how things are to be done if they are to be
done in an optimal manner.  For instance, order of argument evaluation must
depend on the implementation because different machines like different
stack setups.  Anyway, there is 'lint', so it should be very easy to write
portable C code if you try.
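
A small sketch of the argument-order case, using made-up names (`next_id`, `pack`, `pack_in_order`). Because `next_id` has a side effect, the order in which two calls are evaluated as arguments would be visible in the result; sequencing the calls in separate statements makes the outcome the same everywhere:

```c
/* Hypothetical example. next_id() has a side effect, so the order in
   which two calls are evaluated is visible to the caller. */
static int counter = 0;

static int next_id(void)
{
    return ++counter;
}

static int pack(int first, int second)
{
    return first * 10 + second;
}

/* Not portable: pack(next_id(), next_id()) may yield 12 or 21,
   depending on which argument the compiler evaluates first. */

/* Portable: sequence the calls with separate statements. */
int pack_in_order(void)
{
    int first, second;

    counter = 0;
    first = next_id();
    second = next_id();
    return pack(first, second);
}
```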

	Wayne

dant@tekla.UUCP (10/18/86)

Simon Brown ( simon@its63b.ed.ac.uk (ECSC68 S Brown CS)) writes:
>But, how about
>	c = (b=1,a==1) || (b=2,a==0) || (b=3,a==3) || (b=4,a==2);
>as a quick nice way of saying
>	switch (a) {
>	    case 1: b=1; c=1; break;
>	    case 0: b=2; c=1; break;
>	    case 3: b=3; c=1; break;
>	    case 2: b=4; c=1; break;
>	    default: c=0;
>	}
>where the evaluation will "break off" at the point where one of
>the comparisons first succeeds.
>
>(I actually wanted to use something a bit like this a few days ago,
>but now I'm not too sure it's that portable at all, considering
>all the problems with "+").

These two statements are NOT identical.  To make them the same the 
default case would need to be changed to:

	    default: b=4; c=0;

The switch is superior to the assignment for three reasons:

1) You probably don't want to change b in the default case.

2) Efficiency.  In all but the first case the assignment statement
will assign several values to b before reaching the true condition.

3) Maintainability.  The switch makes it obvious what's happening.
The assignment has potential in the Obfuscated C Contest.
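
The difference in the default case can be checked with a small sketch. The function names and the pointer parameter are assumptions made for illustration; the bodies follow the two forms quoted above:

```c
/* The chained form from the quoted post. *b is overwritten by every
   clause that gets evaluated, including the ones whose test fails. */
int chain_form(int a, int *b)
{
    return (*b = 1, a == 1) || (*b = 2, a == 0) ||
           (*b = 3, a == 3) || (*b = 4, a == 2);
}

/* The switch form. *b is left alone when no case matches. */
int switch_form(int a, int *b)
{
    int c;

    switch (a) {
    case 1: *b = 1; c = 1; break;
    case 0: *b = 2; c = 1; break;
    case 3: *b = 3; c = 1; break;
    case 2: *b = 4; c = 1; break;
    default: c = 0;
    }
    return c;
}
```

For a value such as a == 7 both return 0, but the chained form leaves b at 4 while the switch leaves b untouched. For the four listed values of a the two agree on both b and c.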

 Dan Tilque		UUCP:		tektronix!dadla!dant
			CSnet:		dant%dadla@tektronix
			ARPAnet:	dant%dadla%tektronix@csnet-relay

 "This is a bust!" she yelled, as she ripped open her coat, boldly
 displaying her ample authority.	-- R. J. Wilcek

		 From _Son_of_"It Was a Dark and Stormy Night"_

JOSH@ibm-sj.ARPA (Joshua W. Knight) (10/18/86)

There have been lots of lamentations about C not providing a way to
force evaluation order.  The original subject (with multiple "side
effect" assignments to the same variable) isn't really the issue here.
One legitimate concern is truncation and such in numerical calculations.
The ANSI C draft standard provides the unary plus operator for coercing
evaluation order.  Thus

    a = +(b + c) + +(d) ;

should force the sum of b+c to be calculated and added to d. This is
probably less pleasing to the eye but, as has been pointed out before,
parentheses already have a meaning in C, and it is explicitly NOT one
that forces order of evaluation.

Of course I speak only for myself, not my employer.

			Josh Knight
			IBM T.J. Watson Research Center
josh@ibm.com, josh@yktvmh.BITNET

peters@cubsvax.UUCP (10/20/86)

In article <brl-smok.4642> 00R0DHESI%bsu.csnet@CSNET-RELAY.ARPA (Rahul Dhesi) writes:
>Peter S. Shenkin <cubsvax!peters@columbia.ARPA> writes:
><In article <umcp-cs.3769> chris@umcp-cs.UUCP (Chris Torek) writes:
><<	result = a + b;
><<	result += c;
><...
><...I wish there were some way of forcing order of execution, to 
><this extent anyway, within a line.
>
>Try this:
>     {int t = a + b; result = t + c;}  /* note:  force evaluation order */

So many have replied to me on this subject, both here and by netmail, that
I'd like to point out publicly what someone reminded me of by mail.

The draft ANSI standard specifies that unary "+" guarantees that a
parenthesized expression following it will be evaluated without interleaving
or other change of order;  thus, using the example previously bandied about, 
the usage:
	a = +( (b=1), b ) + +( (b=4), b ) + +( (b=16), b )
will guarantee that a is given a value of 21.

I rather like this;  it allows the code writer to specify where he thinks
order of evaluation is important, and allows the compiler to optimize
the order where it's not. 

We now have the following comparison:
	Fortran: parentheses specify both precedence of operators and
	   order of evaluation of operators with equal precedence, even
	   when the operators are commutative;
	C:  parentheses specify only precedence of operators.  Commutative
	   operators can have their arguments evaluated in any order, but
	   the default order, which will be compiler-dependent, can be
	   overridden with the unary "+".

You guys are giving us numerical folks less and less excuse to keep using
Fortran (break my heart)....  'Nuff said?

Peter S. Shenkin	 Columbia Univ. Biology Dept., NY, NY  10027
{philabs,rna}!cubsvax!peters		cubsvax!peters@columbia.ARPA

asw@tony.UUCP (Tony Williams) (10/20/86)

In article <11900001@maggot> barada@maggot.applicon.UUCP writes:

>> >	a = ((b=1),b) + ((b=2),b) + ((b=3),b)
>
>I rearraged the above to
>
>	a = (b,(b=1)) + (b,(b=2)) + (b,(b=3));

   this isn't a rearragement (sic), but a different statement.
>
>And my BSD4.2 VAX produced:
>
>        movl    $1,_b
>        movl    _b,r0
>        movl    $2,_b
>        movl    _b,r1
>        addl2   r1,r0
>        movl    $3,_b
>        movl    _b,r1
>        addl2   r1,r0
>        movl    r0,_a
>									 
>As you can see, the comma operator is evaluated right to left. I think that

No, it is not.

>this is a serious bug.

Ah yes, but the bug is in your code, not the compiler.
>
>BTW, this code produces the proper answer of 6.
>
There is no single proper answer.

The comma operator is defined to evaluate the left operand, discard it,
then evaluate the right operand, and return the resulting value, i.e.
that of the RIGHT operand.  So,
   (b,b=1)
evaluates b and discards it, evaluates b=1, assigning one to b, and returns
the result, which is the new value of b.  Your statement is therefore
equivalent to
	a = (b=1) + (b=2) + (b=3);
which is equivalent to
	a = 1 + 2 + 3;
followed or preceded by the assignments to b in ANY order.
The compiler has elided the evaluations of b, as the result is to be
discarded and there are no side-effects.

In summary, the compiler *is* allowed to reorder operands of operators
like +, but you are *not* allowed to reorder the operands of ","
and expect the same result.
---------------------------------------------------------------------------
Tony Williams					|Informatics Division
UK JANET:	asw@uk.ac.rl.vd			|Rutherford Appleton Lab
Usenet:		{... | mcvax}!ukc!rlvd!asw	|Chilton, Didcot
ARPAnet:	asw%vd.rl.ac.uk@ucl-cs.arpa	|Oxon OX11 0QX, UK

c@hpdsd.UUCP (C Compiler Project(Debbie Coutant)) (10/21/86)

hpdsd:net.lang.c / faustus@ucbcad.BERKELEY.EDU@ndmce.uucp (Wayne A. Christopher) /  2:09 pm  Oct 17, 1986 /
In article <483@jc3b21.UUCP>, fgd3@jc3b21.UUCP writes:
>      Your examples support my point: a _portable_ language is one which
> runs identically on all implementations.  A language which permits the code
> generated by its statements to be implementation-dependent is not portable.

A language can't be called portable, but programs written in that language
can.  One measure of how useful a language is, is how easy it is to write
portable code in it.  By your definition, you could call any language
"non-portable" because you can always (at least if the language is used for
real things) write code that determines what kind of processor it is
running on.

> When dealing with a non-portable language (like C) ...

I know of no other (useful) language in which it is as easy to write portable
code as in C.  Most other common languages, such as Lisp, suffer from the lack
of a standard such as K&R which all compilers follow (at least in theory).  (I
know, there is a Lisp standard now...)

> But if portability is important to you--and
> I believe it should be--then it is a flaw in the language definition to
> permit the evaluation of statements which are syntactically correct to
> depend on the implementation.

Most of these cases where things are implementation-dependent are situations
where the hardware must dictate how things are to be done if they are to be
done in an optimal manner.  For instance, order of argument evaluation must
depend on the implementation because different machines like different
stack setups.  Anyway, there is 'lint', so it should be very easy to write
portable C code if you try.

	Wayne
----------

c@hpdsd.UUCP (C Compiler Project(Debbie Coutant)) (10/21/86)

      There is a new operator in the proposed ANSI C standard that forces
      expression grouping: the unary plus operator.  The unary plus is
      only useful for expressions that involve more than one use of the
      same operator, for example:   a+b+c
      With the unary plus operator, a+b+c can be forced to be evaluated as
      (a+b)+c by writing:  +(a+b) + c
      By ANSI's definition this will force (a+b) to be evaluated first.
      It looks like somebody already thought of the order-of-evaluation
      problem and attempted to 'fix' it in the ANSI proposal.

rbutterworth@watmath.UUCP (Ray Butterworth) (10/21/86)

> The ANSI C draft standard provides the unary plus operator for coercing
> evaluation order.  Thus
>     a = +(b + c) + +(d) ;
> should force the sum of b+c to be calculated and added to d. This is
> probably less pleasing to the eye but, as has been pointed out before,
> parentheses already have a meaning in C, and it is explicitly NOT one
> that forces order of evaluation.

Does X3J11 (or any other C "standard") say anything about the
order of evaluation of (possibly redundant) cast expressions?

e.g.      ( ((double)(a+b)) + ((double)(c+d)) )

where a, b, c, and d may or may not be type (double)?

It would certainly be prettier than the unary "+" operator,
and certainly more obvious that the programmer really did want
the given grouping.