[comp.lang.c] Autoincrement question

schaefer@ogcvax.UUCP (12/04/87)

(I realize this might be similar to another question asked recently, but ...)

Another student here at OGC recently came to me with a question about the
C autoincrement operator.  The following program is representative of the
code he wrote, which did not do what he expected:

    struct foo { struct foo *tmp; char junk[32]; } foolist[4];

    main ()
    {
	struct foo *bar;

	bar = foolist;
	/* Do something with bar */
	bar->tmp = bar++;		/* This is the problem line */
	/* Do something else */
    }

This was compiled with the 4.3 BSD UNIX C compiler (not ANSI conformant).
What he really wanted was the equivalent of
	bar->tmp = bar;
	bar++;
This program DOES do what he expected the above to accomplish:

    main ()
    {
	struct foo *bar;

	bar = foolist;
	/* Do something with bar */
	(bar++)->tmp = bar;
	/* Do something else */
    }

I know HOW the results of the two programs differ (I looked at the assembly
code) but I was wondering if someone could explain WHY they differ.  Obviously,
when bar++ appears on the right side of the assignment, the old value of bar
is saved, then the increment is computed, and lastly bar->tmp is evaluated and
the assignment done.

The real question is:  Is the order of evaluation in a statement like
	bar->tmp = bar++;
well-defined, or is it implementation-dependent?  And if it is well-defined,
WHERE is it defined?  Reference please.

-- 
Bart Schaefer			CSNET:	schaefer@cse.ogc.edu
				UUCP:	...{tektronix,verdix}!ogcvax!schaefer
"A band of BIG, DUMB, LOUDMOUTHED, BUNGLING OGRES is a GREAT ASSET to the
neighborhood.  It keeps out the RIFF-RAFF."		-- Wormy

gwyn@brl-smoke.ARPA (Doug Gwyn ) (12/05/87)

In article <1507@ogcvax.UUCP> schaefer@ogcvax.UUCP (Barton E. Schaefer) writes:
>	bar->tmp = bar++;		/* This is the problem line */

The order of evaluation of the operands of "=" is unspecified.
If the address of the lvalue on the left-hand side is constructed before
the postfix expression on the right is evaluated, one result is obtained;
if the expression is evaluated before the lvalue address, the new value
of "bar" will be used.

Don't take gambles like this.
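
For reference, a minimal sketch of the unambiguous two-statement form,
using the struct from the original post.  The helper name
link_self_and_advance is made up for illustration and does not come from
the thread.

```c
#include <assert.h>
#include <stddef.h>

struct foo { struct foo *tmp; char junk[32]; };

/* Hypothetical helper: store the current element's own address in its
 * tmp field, then step to the next element.  The read of bar and the
 * increment of bar are in separate statements, so the compiler has no
 * freedom to interleave them. */
struct foo *link_self_and_advance(struct foo *bar)
{
    assert(bar != NULL);
    bar->tmp = bar;   /* both sides see the old value of bar */
    bar++;            /* the increment happens strictly afterwards */
    return bar;
}
```

Called on &foolist[0], this leaves foolist[0].tmp pointing at foolist[0]
and returns &foolist[1], whatever the compiler.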

ark@alice.UUCP (12/06/87)

In article <1507@ogcvax.UUCP>, schaefer@ogcvax.UUCP writes:
> The real question is:  Is the order of evaluation in a statement like
> 	bar->tmp = bar++;
> well-defined, or is it implementation-dependent?

It is implementation-dependent.

swarbric@tramp.Colorado.EDU (SWARBRICK FRANCIS JOHN) (12/06/87)

There has been a big argument in the FidoNet C ECHO about this.  Someone had
stated that i += i++ is ambiguous in C, and is not defined.  There is supposedly
a section in K&R (I don't have it) that says a[i] = i++ is ambiguous.  So I
would assume that bar->tmp = bar++ is also undefined (as well as
(bar++)->tmp = bar).  So you should probably put the bar++ on the next line.

Frank Swarbrick
swarbrick@tramp.Colorado.EDU
...!hao!boulder!tramp!swarbric

carroll@snail.CS.UIUC.EDU (12/06/87)

	Quotes from K&R:
A.16 (page 212)
    "The order in which side effects take place is also unspecified"

2.12 (page 50)
    "One unhappy situation is typified by the statement
	a[i] = i++;
     The question is whether the subscript is the old value of i or the new.
     The compiler can do this in different ways, and generate different
     answers, depending on its interpretation"

Alan M. Carroll		amc@woodshop.cs.uiuc.edu	carroll@s.cs.uiuc.edu
...{ihnp4,convex}!uiucdcs!woodshop!amc
Quote of the day :
	"I've consulted all the sages I could find in Yellow Pages,
	 but there aren't many of them" - AP & EW

jbs@eddie.MIT.EDU (Jeff Siegal) (12/06/87)

In article <3333@sigi.Colorado.EDU> swarbric@tramp.Colorado.EDU (SWARBRICK FRANCIS JOHN) writes:
>[...]There is supposedly
>a section in K&R (I don't have it) that says a[i] = i++ is ambiguous.
>[...]

Page 50 (end of Chapter 2)

"One unhappy situation is typified by the statement

	a[i] = i++;

The question is whether the subscript is the old value of i or the
new...."

Summary: Don't use the target of an auto-increment or auto-decrement
operator elsewhere in the same expression.
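
A short sketch of what following that rule looks like for the K&R
example, assuming the intent of a[i] = i++ was "store the old i in slot
old i, then advance".  The function name fill_with_indices is invented
for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: store each index into its own slot, a[i] = i.
 * The subscript use and the increment live in separate statements,
 * so the question of "old i or new i" never arises. */
void fill_with_indices(int *a, size_t n)
{
    size_t i = 0;

    assert(a != NULL || n == 0);
    while (i < n) {
        a[i] = (int)i;  /* i is only read here */
        i++;            /* i is only modified here */
    }
}
```

The loop costs one extra line compared with a[i] = i++, and it means the
same thing on every compiler.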

Jeff Siegal

rcvie@tuvie (ELIN Forsch.z.) (12/07/87)

In article <1507@ogcvax.UUCP>, schaefer@ogcvax.UUCP (Barton E. Schaefer) writes:
> (I realize this might be similar to another question asked recently, but ...)
> 
> Another student here at OGC recently came to me with a question about the
> C autoincrement operator.  The following program is representative of the
> code he wrote, which did not do what he expected:
> 
>     struct foo { struct foo *tmp; char junk[32]; } foolist[4];
> 
>     main ()
>     {
> 	struct foo *bar;
> 
> 	bar = foolist;
> 	/* Do something with bar */
> 	bar->tmp = bar++;		/* This is the problem line */
> 	/* Do something else */
>     }
> 

This is really dangerous programming. The points at which the left and the
right "bar" are evaluated are implementation defined. The problem is similar to
one a friend of mine had some time ago. He tried to pack as much as possible
into the control part of a while loop, using the following statement:

while (a[i]=b[i++])
  ;

Things were even worse here, because the program behaved differently depending
on whether it was compiled with the optimization option or not. Without
optimization everything worked as expected, but in the optimized version "i"
was incremented after the assignment only on the first pass; on all later
passes it was incremented after the evaluation of "b[i]" but before the
assignment. Nevertheless this behaviour was permitted by both K&R and ANSI.
The only thing you can rely on is that the *operand* of the increment
operator is evaluated before it is incremented. One way to achieve the desired
behaviour is, as you suggested yourself, to write:

> What he really wanted was the equivalent of
> 	bar->tmp = bar;
> 	bar++;

and not (for the same reasons stated above):

> 	(bar++)->tmp = bar;

If there is any necessity to have the whole semantic in one *expression*, use
the comma operator, as

bar->tmp = bar, bar++;

This operator *guarantees* the sequential evaluation of its operands from
left to right.
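
The same guarantee can be applied to the while loop mentioned earlier.
This is only a sketch: it assumes int arrays, a zero element terminating
b, and enough room in a.  The name copy_through_zero is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy b into a up to and including the first
 * zero element.  Returns the number of elements copied, counting the
 * zero.  Assumes b is zero-terminated and a is large enough. */
size_t copy_through_zero(int *a, const int *b)
{
    size_t i = 0;

    assert(a != NULL && b != NULL);
    /* The comma finishes the assignment a[i] = b[i] before the right
     * operand runs; in a[i++] != 0 the variable i appears only once,
     * so there is nothing left for the compiler to reorder. */
    while (a[i] = b[i], a[i++] != 0)
        ;
    return i;
}
```

With b holding 3, 4, 0, 7 the function copies 3, 4, 0 and returns 3,
with or without the optimizer.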

In real life: Dipl.Ing. Dietmar Weickert
              ALCATEL Austria - ELIN Research Center
              Floridusg. 50
          A - 1210 Vienna / Austria

kers@otter.HP.COM (Christopher Dollin) (12/08/87)

david@elroy.Jpl.Nasa.Gov (David Robinson) says:

>What are the current arguments against having the ANSI standard
>defining the order of evaluation of the assignment operator?
>I can understand that the evaluation of one side being left
>as implementation dependent for efficiency purposes but a
>requirement that "right hand side fully evaluated, then left
>hand side fully evaluated BEFORE assignment" would solve the
>"a[i++] = i * 2;" problem without unduly restricting the
>compiler writers.

Well, I've written compilers (not for C), and you'd be surprised at how much
of a restriction that sort of ordering can be if you're trying to generate
good code. Depending on the target machine's instruction set, it can really
cripple your code generator. The problem is that 127 times out of 128, the
order really doesn't matter, and the compiler can use this to generate the
"best" code for that sequence; the other 1 time isn't really worth catering
for.

I also think that depending on implicit control flow rules such as that
above can produce opaque code. But then, who bothers about maintaining C anyway
:-)

Regards,
Kers                                | "Why Lisp when you can talk Poperly?"

jfh@killer.UUCP (John Haugh) (12/09/87)

In article <7507@alice.UUCP>, ark@alice.UUCP writes:
> In article <1507@ogcvax.UUCP>, schaefer@ogcvax.UUCP writes:
> > The real question is:  Is the order of evaluation in a statement like
> > 	bar->tmp = bar++;
> > well-defined, or is it implementation-dependent?
> 
> It is implementation-dependent.

No, it is guaranteed to produce different results on all machines.  I think
that qualifies it for being implementation independent, doesn't it ;-))) ?

The obvious answer to how to code that is

	bar->tmp = bar;		or	(bar + 1)->tmp = bar;
	bar++;				bar++;

I'm not going to pretend to know _what_ you intended.  Your example is not
only barfed from a C point of view, I wouldn't want to see it.

- John.
-- 
John F. Haugh II                  SNAIL:  HECI Exploration Co. Inc.
UUCP: ...!ihnp4!killer!jfh                11910 Greenville Ave, Suite 600
      ...!ihnp4!killer!rpp386!jfh         Dallas, TX. 75243
"Don't Have an Oil Well?  Then Buy One!"  (214) 231-0993

franka@mmintl.UUCP (Frank Adams) (12/09/87)

In article <7593@eddie.MIT.EDU> jbs@eddie.MIT.EDU (Jeff Siegal) writes:
>Summary: Don't use the target of an auto-increment or auto-decrement
>operator elsewhere in the same expression.

Slightly stronger: don't use the target of an assignment operator elsewhere
in the same expression.  Auto-increment and auto-decrement operators are
assignment operators.

(Actually, you can use such results, provided that the assignment and the
other use are within different arguments of a comma operator.  That is, both
"f(i) , i++;" and "i++ , f(i);" are unambiguous.  But don't do this unless
(a) you know exactly what you are doing, and (b) you really need to.)
-- 

Frank Adams                           ihnp4!philabs!pwa-b!mmintl!franka
Ashton-Tate          52 Oakland Ave North         E. Hartford, CT 06108

jfh@killer.UUCP (John Haugh) (12/15/87)

In article <2610@mmintl.UUCP>, franka@mmintl.UUCP (Frank Adams) writes:
> In article <7593@eddie.MIT.EDU> jbs@eddie.MIT.EDU (Jeff Siegal) writes:
> >Summary: Don't use the target of an auto-increment or auto-decrement
> >operator elsewhere in the same expression.
 
This is correct, but not completely.  Consider the other operators which
return results, but aren't ++ or --.  Things like, +=, -=, and so on.  Even
just plain '=' is a candidate.

> Slightly stronger: don't use the target of an assignment operator elsewhere
> in the same expression.  Auto-increment and auto-decrement operators are
> assignment operators.

Way too strong.  What you (seem) to be saying is `don't do "a = a + 1"',
although I don't think anyone is going to stop doing that.  The expression

	a += a++ + 5

is still very legal.  *a += a++ + 5 is a different matter because the value
of a is being used more than once at the same time (i.e., the order of
evaluation is undefined).  So even a[b++] = b; is undefined: the value of b
is both side-effected and used in the same statement.  The statement

	(a[b + 1] = b), b++;

is legit, since the ',' operator ensures the left side is evaluated first,
and this removes the `undefined' nature of the beast.

- John.

> (Actually, you can use such results, provided that the assignment and the
> other use are within different arguments of a comma operator.  That is, both
> "f(i) , i++;" and "i++ , f(i);" are unambiguous.  But don't do this unless
> (a) you know exactly what you are doing, and (b) you really need to.)

I don't even know what you are saying here.  Unless f() is a macro, your
argument is not even needed.  The arguments are evaluated _prior_ to the
function being called.  Since you didn't pass the address of i, and assuming
the address is not hanging around someplace, you can't change i from inside
the function call.  Given the lack of f() being a macro, f(++i) and f(i++)
are both `safe'.  Maybe in a different language it might be risky, but not
in C.
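
A small sketch of the f(i++) case, assuming the ANSI rule that every
side effect of evaluating the arguments is complete before the called
function starts.  The names counter, observe and call_with_postincrement
are invented here.  The sketch deliberately lets the callee see the
variable through file scope, which is the "address hanging around
someplace" situation, to show that the result is still fixed.

```c
#include <assert.h>

static int counter;   /* file scope, so the callee can see it too */

/* Hypothetical callee: packs its argument and the current value of
 * counter into one number so both can be inspected afterwards. */
static int observe(int arg)
{
    return arg * 100 + counter;
}

/* Sets counter to start, then calls observe(counter++).  The argument
 * is the old value of counter; the increment has already been stored
 * by the time observe begins to run. */
int call_with_postincrement(int start)
{
    int seen;

    counter = start;
    seen = observe(counter++);
    assert(counter == start + 1);
    return seen;
}
```

call_with_postincrement(5) returns 506: the argument passed was 5, and
counter was already 6 inside observe.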

- John.
-- 
John F. Haugh II                  SNAIL:  HECI Exploration Co. Inc.
UUCP: ...!ihnp4!killer!jfh                11910 Greenville Ave, Suite 600
      ...!ihnp4!killer!rpp386!jfh         Dallas, TX. 75243
"Don't Have an Oil Well?  Then Buy One!"  (214) 231-0993

karl@haddock.ISC.COM (Karl Heuer) (12/17/87)

In article <2464@killer.UUCP> jfh@killer.UUCP (John Haugh) writes:
|In article <2610@mmintl.UUCP>, franka@mmintl.UUCP (Frank Adams) writes:
|> Slightly stronger: don't use the target of an assignment operator
|> [including ++] elsewhere in the same expression.
|
|Way too strong.  ...  The expression "a += a++ + 5" is still very legal.

Yes, but it only works because the final result happens to be independent of
the order of the two side effects.  "a *= a++ + 5" is ambiguous, for example.

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

dant@tekla.TEK.COM (Dan Tilque;1893;92-789;LP=A;60aC) (12/17/87)

John Haugh writes:
>The expression  a += a++ + 5  is still very legal.  

Yes, it is legal in the sense that the compiler won't reject it.  However,
it's also undefined.  That expression is the same as 

	a = a + a++ + 5

Now, when is the ++ done; before or after the a to the right of the = is
evaluated?

The rest of the points you made were good; this was just a bad example.

---
Dan Tilque
dant@tekla.tek.com  or dant@tekla.UUCP

mrd@sun.mcs.clarkson.EDU (Michael R. DeCorte) (12/18/87)

   From: John Haugh <killer!jfh@ndsuvm1.bitnet>
   The expression

       a += a++ + 5

   is still very legal.

Sorry but this is not legal.  Look at the original expression with the
a's subscripted and assume that a = 10 for the discussion.

a += a++ + 5
1    2

The right hand side will evaluate to 15, no problem.  The += will
take a's current value (10) and add 15 to it yielding 25, no problem.
Now the problem.  a2 must be incremented but you don't know if
it will occur before the assignment to a1 or after.  If it occurs
before then this is equivalent to

t = a + a + 5;
a++;
a = t;

a will equal 25.  If the increment occurs after the assignment then
you have

t = a + a + 5;
a = t;
a++;

a will equal 26. (the a=t here is not needed but just put in
for clarity and this is the code that the compiler will generate)

The statement that you should NEVER have an expression where a single
variable has more than one post(pre)-increment(decrement) is correct.
You should also NEVER post(pre)-increment(decrement) a variable that
occurs on both sides of the equation.
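
The two orderings can be written out as ordinary, well-defined functions
to check the arithmetic.  This is a sketch only: the function names are
invented, and the range check exists just to keep integer overflow out
of the picture.  Starting from a = 10, the two orderings give 25 and 26.

```c
#include <assert.h>

/* Ordering 1: the store from a++ lands before the store from +=. */
int increment_then_store(int a)
{
    int t;

    assert(a >= 0 && a < 1000);   /* small inputs only */
    t = a + a + 5;   /* the value that += is going to store */
    a++;             /* side effect of a++ */
    a = t;           /* store from +=, overwriting the increment */
    return a;
}

/* Ordering 2: the store from += lands first, then the increment. */
int store_then_increment(int a)
{
    int t;

    assert(a >= 0 && a < 1000);   /* small inputs only */
    t = a + a + 5;
    a = t;           /* store from += */
    a++;             /* side effect of a++, applied to the new value */
    return a;
}
```

A compiler is free to behave like either function when given
a += a++ + 5, which is why the expression cannot be relied on.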

Michael DeCorte
mrd@clutx.clarkson.edu
mrd@clutx.bitnet

mwm@eris.BERKELEY.EDU (Mike (My watch has windows) Meyer) (12/19/87)

In article <10898@brl-adm.ARPA> mrd@sun.mcs.clarkson.EDU (Michael R. DeCorte) writes:
<The statement that you should NEVER have an expression where a single
<variable has more than one post(pre)-increment(decrement) is correct.

Well, almost. The following expression is deterministic:

	a[i++] && a[i++] && a[i++]

<You should also NEVER post(pre)-increment(decrement) a variable that
<occurs on both sides of the equation.

This is correct - the key words are "both sides of an equation".

I haven't been able to come up with a simple summary, but I think it's
better stated as:

	No expression should change the value of a variable more than once,
or reference a variable and make an assignment to it, unless the
assignment operator is at the top of the parse tree. Expressions
connected by operators that guarantee order of evaluation (`&&', `||'
and `,') can be treated disjointly.
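
A sketch of the a[i++] && a[i++] && a[i++] case, to show what
"deterministic" buys you.  The helper name next_three_nonzero is
hypothetical, and the caller is assumed to guarantee that at least three
elements are available starting at a[*ip].

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if the three elements starting at
 * a[*ip] are all nonzero, else 0.  *ip is advanced once for every
 * element actually examined.  Each && guarantees that its left operand
 * is finished, side effects included, before its right operand starts,
 * and evaluation stops at the first zero. */
int next_three_nonzero(const int *a, size_t *ip)
{
    size_t i;
    int all;

    assert(a != NULL && ip != NULL);
    i = *ip;
    all = a[i++] && a[i++] && a[i++];
    *ip = i;
    return all;
}
```

For the array 1, 0, 1 starting at index 0 the result is 0 and the index
ends at 2: the third a[i++] is never evaluated, so its increment never
happens.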

	<mike
--
Take a magic carpet to the olden days			Mike Meyer
To a mythical land where everybody lays			mwm@berkeley.edu
Around in the clouds in a happy daze			ucbvax!mwm
In Kizmiaz ... Kizmiaz					mwm@ucbjade.BITNET

franka@mmintl.UUCP (Frank Adams) (12/24/87)

In article <2464@killer.UUCP> jfh@killer.UUCP (John Haugh) writes:
>In article <2610@mmintl.UUCP>, franka@mmintl.UUCP (Frank Adams) writes:
>> Slightly stronger: don't use the target of an assignment operator elsewhere
>> in the same expression.  Auto-increment and auto-decrement operators are
>> assignment operators.
>
>Way too strong.  What you (seem) to be saying is `don't do "a = a + 1"',

Oops.  Try: don't use the target of an assignment operator elsewhere
in the same expression, except in the expression to be assigned to the
variable.  Don't use the target of auto-increment and auto-decrement
operators elsewhere in the same expression.

>The expression
>
>	a += a++ + 5
>
>is still very legal.

I don't think so.  It may be that the fine details of expression order
evaluation in the ANSI draft make this unambiguous, but since I don't have a
copy handy I can't check it.  I would *not* trust compilers in general
enough to consider it portable, and I think it is in fact legal for
compilers to generate code which makes this statement equivalent to "a++"
instead of "a += 5" -- the store of the incremented value can be the last
code executed for the statement.

>> (Actually, you can use such results, provided that the assignment and the
>> other use are within different arguments of a comma operator.  That is, both
>> "f(i) , i++;" and "i++ , f(i);" are unambiguous.  But don't do this unless
>> (a) you know exactly what you are doing, and (b) you really need to.)
>
>Given the lack of f() being a macro, f(++i) and f(i++) are both `safe'.

Indeed they are; nor did I say otherwise.

This was, perhaps, not a very good example.  Try f(&i) instead of f(i) in
both cases.
-- 

Frank Adams                           ihnp4!philabs!pwa-b!mmintl!franka
Ashton-Tate          52 Oakland Ave North         E. Hartford, CT 06108

rushfort@esunix.UUCP (Kevin Rushforth) (01/21/88)

in article <548@tuvie>, rcvie@tuvie (ELIN Forsch.z.) says:
> If there is any necessity to have the whole semantic in one *expression*, use
> the comma operator, as
> 
> bar->tmp = bar, bar++;
> 
> This operator *guarantees* the sequential evaluation of its operands from
> left to right.

Not quite.  While it is true that bar will be evaluated before bar++,
it is bar++, not bar, that will be assigned to bar->tmp.  There is no
guarantee as to whether bar++ or *bar (as used in bar->tmp) will be
evaluated first.  In fact, this expression is identical to:

	bar->tmp = bar++;

which was the original expression that caused the problem.  Better stick to
the two statement version:

bar->tmp = bar;
bar++;

Or the following one statement version:

bar->tmp = bar + 1;
-- 
                Kevin C. Rushforth
                Evans & Sutherland Computer Corporation

UUCP Address:   {ihnp4,ucbvax,decvax,allegra}!decwrl!esunix!rushfort
Alternate:      {bellcore,cbosgd,ulysses}!utah-cs!esunix!rushfort

rushfort@esunix.UUCP (Kevin Rushforth) (01/21/88)

in article <618@esunix.UUCP>, rushfort@esunix.UUCP (Kevin Rushforth) says:
> in article <548@tuvie>, rcvie@tuvie (ELIN Forsch.z.) says:
>> If there is any necessity to have the whole semantic in one *expression*, use
>> the comma operator, as
>> 
>> bar->tmp = bar, bar++;
>> 
>> This operator *guarantees* the sequential evaluation of its operands from
>> left to right.
> 
> Not quite.  While it is true that bar will be evaluated before bar++,
> it is bar++, not bar, that will be assigned to bar->tmp.

Oops.  When I wrote the above nonsense, I had incorrectly parsed the
expression as:

	bar->tmp = (bar, bar++);	/* This won't work */

Instead of:

	bar->tmp = bar, bar++;		/* This will work fine */

Well, since it is obviously past my bedtime, I will shut-up now.
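
A sketch of the working form, relying on the fact that assignment binds
more tightly than the comma operator.  The struct is the one from the
original post; the helper name link_and_advance_comma is made up for
illustration.

```c
#include <assert.h>
#include <stddef.h>

struct foo { struct foo *tmp; char junk[32]; };

/* Hypothetical helper doing the whole job in one expression.
 * Because = binds tighter than the comma, the statement parses as
 *     (bar->tmp = bar), (bar++)
 * and the comma operator completes the assignment before the
 * increment begins. */
struct foo *link_and_advance_comma(struct foo *bar)
{
    assert(bar != NULL);
    bar->tmp = bar, bar++;
    return bar;
}
```

Adding parentheses around the right-hand side, as in
bar->tmp = (bar, bar++), puts the increment back inside the assignment
and brings back the original problem.
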
-- 
                Kevin C. Rushforth
                Evans & Sutherland Computer Corporation

UUCP Address:   {ihnp4,ucbvax,decvax,allegra}!decwrl!esunix!rushfort
Alternate:      {bellcore,cbosgd,ulysses}!utah-cs!esunix!rushfort