[comp.std.c] No sequence points in assignment

karzes@mfci.UUCP (Tom Karzes) (09/13/89)

I have a question about assignment expressions.  According to the standard,
an assignment expression does not introduce a sequence point, although the
side effect of updating the stored value of the left operand must occur
between the previous sequence point and the next sequence point.  This
seems to imply that in an expression with multiple assignments, the actual
assignments may occur in any order provided the stored values can be
determined and the assignments all take place between the previous
sequence point and the next sequence point.  If this interpretation is
correct, it seems to me that it can lead to some counterintuitive
results.  For example, consider the following statement:

    x = a + (x = b);

Could the assignment for (x = b) be performed after the outer
assignment?  Since it knows that the value of (x = b) is b,
can't it delay the assignment of b to x?  I.e., could this
statement be validly translated as "evaluate b, evaluate a,
add, assign the result to x, assign the previous result, b,
to x"?
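
The two outcomes the question has in mind can be written out with every
store in its own statement, so that each ordering is itself well defined.
This is only a sketch: the function names are invented for illustration,
and these are just two of the translations a compiler might pick, not
anything the standard prescribes.

```c
/* Hypothetical translation 1: the store for (x = b) lands first,
 * then the outer assignment overwrites it. */
static int inner_store_first(int a, int b)
{
    int x;
    x = b;        /* side effect of the inner (x = b) */
    x = a + b;    /* outer assignment; the value of (x = b) is b */
    return x;
}

/* Hypothetical translation 2: the order described in the question. */
static int inner_store_last(int a, int b)
{
    int x;
    int t = b;    /* evaluate b; this is the value of (x = b) */
    x = a + t;    /* evaluate a, add, assign the result to x */
    x = t;        /* delayed store of b to x */
    return x;
}
```

With a = 3 and b = 4, the first ordering leaves 7 in x and the second
leaves 4.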

jeenglis@nunki.usc.edu (Joe English) (09/13/89)

karzes@mfci.UUCP (Tom Karzes) writes:
>I have a question about assignment expressions.  According to the standard,
>an assignment expression does not introduce a sequence point, although the
>side effect of updating the stored value of the left operand must occur
>between the previous sequence point and the next sequence point.  This
>seems to imply that in an expression with multiple assignments, the actual
>assignments may occur in any order provided the stored values can be
>determined and the assignments all take place between the previous
>sequence point and the next sequence point.  If this interpretation is
>correct, it seems to me that it can lead to some counterintuitive
>results.  For example, consider the following statement:
>
>    x = a + (x = b);
>
>Could the assignment for (x = b) be performed after the outer
>assignment?  Since it knows that the value of (x = b) is b,
>can't it delay the assignment of b to x?  I.e., could this
>statement be validly translated as "evaluate b, evaluate a,
>add, assign the result to x, assign the previous result, b,
>to x"?


As far as I know, yes.  I have a question, though:  Does it really
matter?  There seems to be a lot of traffic lately asking questions
like:  "Will (obviously buggy and weird code) work like you would
expect it to, or is the compiler allowed to do something other than
What I Mean?" 

I fail to see how these questions are relevant, unless you're trying to
write incredibly hairy expressions like using a temporary variable
multiple times in a single expression.  (Which you probably shouldn't
be doing in the first place!)

Just wondering,

--Joe English

  jeenglis@nunki.usc.edu

jeffrey@algor2.algorists.com (Jeffrey Kegler) (09/13/89)

In article <1021@m3.mfci.UUCP> karzes@mfci.UUCP (Tom Karzes) writes:
>Consider the following statement:
>
>    x = a + (x = b);
>
>Could the assignment for (x = b) be performed after the outer
>assignment?

In ANSI C (3.3), this is undefined behavior, meaning anything (core
dump, generation of random number, etc.) can happen.  In all
likelihood, and in Classic C, you get the assignments performed in
whatever order amuses the compiler.

In any case, the practical implications are the same.  Anyone coming
across statements like this in code should rearrange them to introduce
a sequence point.  Breaking the statement above into two statements, or
rearranging it around a comma operator, will do.
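
A minimal sketch of both rewrites, assuming the intended meaning was
"store b in x, then add a to it".  The function names are hypothetical.

```c
/* Rewrite 1: two statements.  The end of each full expression
 * is a sequence point. */
static int two_statements(int a, int b)
{
    int x;
    x = b;
    x = a + x;
    return x;
}

/* Rewrite 2: the comma operator puts a sequence point between
 * the evaluation of its left and right operands. */
static int comma_operator(int a, int b)
{
    int x;
    x = b, x = a + x;
    return x;
}
```

Both versions modify x at most once between consecutive sequence points,
and both leave a + b in x.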
-- 

Jeffrey Kegler, Independent UNIX Consultant, Algorists, Inc.
jeffrey@algor2.ALGORISTS.COM or uunet!algor2!jeffrey
1762 Wainwright DR, Reston VA 22090

karzes@mfci.UUCP (Tom Karzes) (09/14/89)

In article <1989Sep13.005247.20121@algor2.algorists.com> jeffrey@algor2.UUCP (Jeffrey Kegler) writes:
-In article <1021@m3.mfci.UUCP> karzes@mfci.UUCP (Tom Karzes) writes:
->Consider the following statement:
->
->    x = a + (x = b);
->
->Could the assignment for (x = b) be performed after the outer
->assignment?
-
-In ANSI C (3.3), this is undefined behavior, meaning anything (core
-dump, generation of random number, etc.) can happen.  In all
-likelihood, and in Classic C, you get the assignments performed in
-whatever order amuses the compiler.

Yes, this is what I was looking for.  I simply didn't look in the right
place (I looked under the description of assignment operators).  The
second paragraph in section 3.3 says:

    Between the previous and next sequence point an object shall have
    its stored value modified at most once by the evaluation of an
    expression.  Furthermore, the prior value shall be accessed only
    to determine the value to be stored.

Presumably a good compiler would give a warning in the obvious case
where there are multiple assignments to a scalar with no intervening
sequence points.
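
As a sketch of what the quoted paragraph permits, the functions below
(names invented for illustration) keep each object to one modification
per sequence-point interval.  The comments name the forms that the
paragraph rules out.

```c
/* Allowed: i is modified once, and its prior value is read only
 * to compute the value being stored. */
static int increment(int i)
{
    i = i + 1;          /* not: i = ++i + 1;  (i modified twice) */
    return i;
}

/* Allowed: the read of i for the stored value and the update of i
 * are in separate statements. */
static int store_then_advance(int a[], int i)
{
    a[i] = i;           /* not: a[i++] = i;  (prior value of i read for
                           something other than computing its new value) */
    i++;
    return i;
}
```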

karzes@mfci.UUCP (Tom Karzes) (09/14/89)

In article <5059@merlin.usc.edu> jeenglis@nunki.usc.edu (Joe English) writes:
-karzes@mfci.UUCP (Tom Karzes) writes:
->
->    x = a + (x = b);
->
->Could the assignment for (x = b) be performed after the outer
->assignment?
-
-As far as I know, yes.  I have a question, though:  Does it really
-matter?  There seems to be a lot of traffic lately asking questions
-like:  "Will (obviously buggy and weird code) work like you would
-expect it to, or is the compiler allowed to do something other than
-What I Mean?" 
-
-I fail to see how these questions are relevant, unless you're trying to
-write incredibly hairy expressions like using a temporary variable
-multiple times in a single expression.  (Which you probably shouldn't
-be doing in the first place!)

I believe it is important to be able to take any sequence of input code
to a compiler and be able to classify its behavior in one of the following
ways:

    1.  It is legal, and the results will be the following, or
        one of the following.

    2.  It is syntactically legal, but is illegal code because it
        violates a certain restriction.  The compiler may give an error
        or warning if it can detect it at compile time, otherwise the
        results are undefined and the compiler is free to do as it wishes.

    3.  It is syntactically illegal.  A compile time error should
        be given.

This is a major goal of any good language standard, and is invaluable to
anyone attempting to implement a language.  It also serves as a useful
guide to anyone using a given language to help them determine whether
they are writing truly portable code or whether they just "got lucky"
on the machine they were using.

The lack of a complete specification in the original C reference manual
has been the cause of much grief on the part of compiler writers and C
programmers alike.  There are many places where you have to read between
the lines, and many places where you have no choice but to try it on the
pcc compiler to see what it does.  For this reason, it is extremely
difficult to write a C compiler from scratch and have it support all
of the little undocumented features of a pcc derived compiler (unless
you happen to know about all of those features in advance).

I'm not trying to justify the example I gave as being reasonable code,
and in fact in my opinion it is not.  However, it is not unreasonable
to expect to be able to answer the question of whether or not it is
legal, and if so what its defined behavior is.  Trying to arbitrarily
draw a line between "reasonable" and "unreasonable" is an unachievable,
and hence useless, goal.  It is hopeless to try to guarantee that you
have documented and supported all "reasonable" constructs in a language
and that anything not explicitly addressed is probably "unreasonable".
This just invites people to see what it does on their system and to
then assume it is supported behavior.  This is one of the mistakes
that the original C implementors made (and in fact I'm not certain
that Dennis appreciates this mistake even to this day, although I
may be wrong about that).

bill@twwells.com (T. William Wells) (09/14/89)

In article <5059@merlin.usc.edu> jeenglis@nunki.usc.edu (Joe English) writes:
: As far as I know, yes.  I have a question, though:  Does it really
: matter?  There seems to be a lot of traffic lately asking questions
: like:  "Will (obviously buggy and weird code) work like you would
: expect it to, or is the compiler allowed to do something other than
: What I Mean?"
:
: I fail to see how these questions are relevant, unless you're trying to
: write incredibly hairy expressions like using a temporary variable
: multiple times in a single expression.  (Which you probably shouldn't
: be doing in the first place!)

One good reason for asking such questions is to check one's knowledge
of C. Imagine this scenario: discover or think up some C weirdness;
think about it for a while to understand it; post to the newsgroup to
see if the understanding is correct.

Some of these questions also come from people who have discovered or
been told that "X" doesn't work and now want to understand why.

No doubt there are also people out there who just want to get away
with some outrageous silliness and want agreement that what they wish
should be what is. But, while those people tend to be vociferous and
so very noticeable, I don't think that they are in the majority of
people asking such questions.

---
Bill                    { uunet | novavax | ankh | sunvice } !twwells!bill
bill@twwells.com

rfg@ics.uci.edu (Ronald Guilmette) (09/18/89)

In article <5059@merlin.usc.edu> jeenglis@nunki.usc.edu (Joe English) writes:
>karzes@mfci.UUCP (Tom Karzes) writes:
>>I have a question about assignment expressions.  According to the standard,
>>an assignment expression does not introduce a sequence point, although the
>>side effect of updating the stored value of the left operand must occur
>>between the previous sequence point and the next sequence point.  This
>>seems to imply that in an expression with multiple assignments, the actual
>>assignments may occur in any order provided the stored values can be
>>determined and the assignments all take place between the previous
>>sequence point and the next sequence point.  If this interpretation is
>>correct, it seems to me that it can lead to some counterintuitive
>>results.  For example, consider the following statement:
>>
>>    x = a + (x = b);
>>
>>Could the assignment for (x = b) be performed after the outer
>>assignment?

>
>As far as I know, yes.  I have a question, though:  Does it really
>matter?  There seems to be a lot of traffic lately asking questions
>like:  "Will (obviously buggy and weird code) work like you would
>expect it to, or is the compiler allowed to do something other than
>What I Mean?" 
>
>I fail to see how these questions are relevant...

If you don't understand why such questions *are* relevant to this newsgroup
then maybe you are reading the wrong newsgroup.  You obviously don't
understand what the standardization effort is all about.

Consider this.  You are given a piece of code and asked to port it to your
company's new ZLOP-19 micro-supercomputer.  The guy who wrote the code (let's
call him Fred) left the company six months ago.  You recompile the code on
the ZLOP-19 and run it.  It issues a prompt and core dumps.  You spend an
hour with your favorite debugger and track the problem to the following
statement:

	x = a + (x = b);

Now you know this program worked fine on the good ol' VAX, so what is the
problem?  Should you (a) tell your boss that Fred was a turkey who wrote
non-ANSI (and non-portable) code and then change the code or (b) contact your
compiler vendor to file a bug report?

Only the "standard" can help you decide on the proper action.

// rfg

diamond@csl.sony.co.jp (Norman Diamond) (09/18/89)

In article <1026@m3.mfci.UUCP> karzes@mfci.UUCP (Tom Karzes) quotes:

>The second paragraph in section 3.3 says:
>    Between the previous and next sequence point an object shall have
>    its stored value modified at most once by the evaluation of an
>    expression.  Furthermore, the prior value shall be accessed only
>    to determine the value to be stored.

Even if the expression contains two assignments to the same object?
(As did the example which led to this thread.)  Even without the
optimizer turned on, the compiler is REQUIRED to notice and delete
all but one assignment to the same object?

(Uh, remember that other thread about whether two pointers are equal,
and it depends on whether they point to the same object, i.e. things
like ring numbers have to be masked out.)  Wow, a non-optimizing
compiler is going to have to do a lot of alias checking in order to
meet section 3.3.

-- 
Norman Diamond, Sony Corporation (diamond@ws.sony.junet)
  The above opinions are inherited by your machine's init process (pid 1),
  after being disowned and orphaned.  However, if you see this at Waterloo or
  Anterior, then their administrators must have approved of these opinions.

karzes@mfci.UUCP (Tom Karzes) (09/19/89)

In article <10851@riks.csl.sony.co.jp> diamond@riks. (Norman Diamond) writes:
-In article <1026@m3.mfci.UUCP> karzes@mfci.UUCP (Tom Karzes) quotes:
-
->The second paragraph in section 3.3 says:
->    Between the previous and next sequence point an object shall have
->    its stored value modified at most once by the evaluation of an
->    expression.  Furthermore, the prior value shall be accessed only
->    to determine the value to be stored.
-
-Even if the expression contains two assignments to the same object?
-(As did the example which led to this thread.)  Even without the
-optimizer turned on, the compiler is REQUIRED to notice and delete
-all but one assignment to the same object?
-
-(Uh, remember that other thread about whether two pointers are equal,
-and it depends on whether they point to the same object, i.e. things
-like ring numbers have to be masked out.)  Wow, a non-optimizing
-compiler is going to have to do a lot of alias checking in order to
-meet section 3.3.

This paragraph is merely describing restrictions on the side effects a
user may have in an expression.  An ANSI conforming C compiler certainly
isn't required to detect every situation in which this is violated.  It
would be nice if it could detect some of the cases at compile time, but
at best it would only be able to catch a few simple instances (most likely
involving direct scalar references).  The point is that a compiler writer
doesn't have to worry about the behavior of cases like this because they
are explicitly disallowed by the standard.
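
The aliasing point can be sketched as follows.  The function names are
hypothetical, and the first function is shown only to illustrate why a
compiler cannot diagnose every violation: whether it breaks the 3.3
restriction depends on the arguments passed at run time.

```c
/* Well defined only when p and q designate different objects.
 * If p == q, the same object is modified twice with no intervening
 * sequence point, and the compiler generally cannot know that at
 * compile time. */
static void store_both(int *p, int *q, int a, int b)
{
    *p = a + (*q = b);
}

/* Sequenced version: well defined even when p == q. */
static void store_both_sequenced(int *p, int *q, int a, int b)
{
    *q = b;
    *p = a + b;
}
```

Called with two distinct objects, both functions store b through q and
a + b through p.  Only the second may safely be called with p and q
pointing at the same object.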

jeffrey@algor2.algorists.com (Jeffrey Kegler) (09/20/89)

Those without a taste for semantic nit-picking should skip the following,
which contains nothing else of value.

karzes> In article <1033@m3.mfci.UUCP> karzes@mfci.UUCP (Tom Karzes) writes:
diamond> In article <10851@riks.csl.sony.co.jp> diamond@riks. (Norman Diamond) writes:
dpANS 3.3> In article <1026@m3.mfci.UUCP> karzes@mfci.UUCP (Tom Karzes) quotes:

dpANS 3.3> Between the previous and next sequence point an object shall have
dpANS 3.3> its stored value modified at most once by the evaluation of an
dpANS 3.3> expression.  Furthermore, the prior value shall be accessed only
dpANS 3.3> to determine the value to be stored.

diamond> Even if the expression contains two assignments to the same
diamond> object?  (As did the example which led to this thread.)  Even
diamond> without the optimizer turned on, the compiler is REQUIRED to
diamond> notice and delete all but one assignment to the same object?
diamond> Wow, a non-optimizing compiler is going to have to do a lot
diamond> of alias checking in order to meet section 3.3.

karzes> This paragraph is merely describing restrictions on the side
karzes> effects a user may have in an expression.

I am glad someone else (Norman Diamond) had the same problem reading
those two sentences in 3.3 as I did.  They are ambiguous depending on
whether they are taken to restrict the program or the implementation.
Footnote 31 and Appendix A.6.2 make quite clear that the intention was
to restrict the program, but they are not supposed to be part of the
standard.

It is clear what the dpANS intended to say, and it seems equally clear
that what it intended to say is the only thing that makes sense.
However, I do not believe the standard here says what it intends.

dpANS 1.6> If a "shall" or "shall not" requirement that appears outside of
dpANS 1.6> a constraint is violated, the behavior is undefined.  ...
dpANS 1.6> Constraints -- syntactic and semantic restrictions by
dpANS 1.6> which the exposition of language elements is to be
dpANS 1.6> interpreted.

Are the above two sentences from 3.3 "semantic restrictions", in which
case the burden is on the implementation?  I think so.

The definition of "shall" in 1.6 here is much too hard to interpret.
I think wherever undefined behavior is allowed the standard should
explicitly say so.  This requires only an editorial effort, since
Appendix A.6.2 lists all such cases.
-- 

Jeffrey Kegler, Independent UNIX Consultant, Algorists, Inc.
jeffrey@algor2.ALGORISTS.COM or uunet!algor2!jeffrey
1762 Wainwright DR, Reston VA 22090