pardo@june.cs.washington.edu (David Keppel) (01/26/88)
( This isn't related to anything, but I've been unable to figure )
( out a mail path to the guy who asked, and I wanted to answer   )

>> ;-D on (My favorite sintax: switch(x+=*a+++*b){...}) Pardo
>
>Boy, you've got that. That's a sin, or should be.
>What, precisely, does x+=*a+++*b mean?

Well, actually it is ambiguous. There are 2 (or more?) ways to parse
it. I'm not sure the behavior is defined from lexer to lexer as to
which one is preferred. Here are some possible interpretations:

    tmp = *a + ++*b; /* means increment *b and add that to *a */
    x += tmp;
    switch (x) {...}

the (an) other is

    tmp = *a++ + *b; /* means add *a to *b and increment a */
    x += tmp;
    switch (x) {...}

Not, of course that this is the most obfuscated code possible, but
it does get the point across.

    ;-D on (What point?) Pardo
ark@alice.UUCP (01/26/88)
In article <4080@june.cs.washington.edu>, pardo@uw-june.UUCP writes:
> ( This isn't related to anything, but I've been unable to figure )
> ( out a mail path to the guy who asked, and I wanted to answer )
>
> >> ;-D on (My favorite sintax: switch(x+=*a+++*b){...}) Pardo
> >
> >Boy, you've got that. That's a sin, or should be.
> >What, precisely, does x+=*a+++*b mean?
>
> Well, actually it is ambiguous. There are 2 (or more?) ways to parse
> it. I'm not sure the behavior is defined from lexer to lexer as to
> which one is preferred.

It's not ambiguous: C lexical analysis is defined by "maximal munch."

There is one issue, though: whether this is run on a C compiler in
which =* is a token (ancient). If so, then += is probably defined as
two tokens as well. On such an ancient compiler, it means

    x + =* a ++ + * b

which is illegal. On more modern compilers, it means

    x += * a ++ + * b

with no ambiguity.
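The "maximal munch" rule described above can be sketched as a toy scanner. This is only an illustration, not how any particular compiler is written: the function name munch and the tiny operator table OPS are made up for this example, and only the operators needed for the expression in this thread are listed.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical operator table for this sketch only. Longer spellings
 * come first, so the first match is also the longest match. */
static const char *const OPS[] = { "+=", "++", "+", "*", "=" };

/* Split src into tokens and write them, separated by single spaces,
 * into out (capacity cap). Returns the number of tokens, or -1 on an
 * unknown character or if out is too small. */
int munch(const char *src, char *out, size_t cap)
{
    size_t used = 0;
    int count = 0;

    if (cap == 0)
        return -1;
    out[0] = '\0';

    while (*src != '\0') {
        size_t len = 0;

        if (*src == ' ') {          /* blanks only separate tokens */
            src++;
            continue;
        }

        if (isalnum((unsigned char)*src) || *src == '_') {
            /* name or number: take every character that still fits */
            while (isalnum((unsigned char)src[len]) || src[len] == '_')
                len++;
        } else {
            size_t i;
            for (i = 0; i < sizeof OPS / sizeof OPS[0]; i++) {
                size_t n = strlen(OPS[i]);
                if (strncmp(src, OPS[i], n) == 0) {
                    len = n;        /* longest operator that matches here */
                    break;
                }
            }
            if (len == 0)
                return -1;          /* character not in the toy table */
        }

        /* room for a separator, the token, and the terminating NUL */
        if (used + len + 2 > cap)
            return -1;
        if (count > 0)
            out[used++] = ' ';
        memcpy(out + used, src, len);
        used += len;
        out[used] = '\0';

        src += len;
        count++;
    }
    return count;
}
```

Given the string "x+=*a+++*b" and a large enough buffer, this sketch produces the eight tokens "x += * a ++ + * b", which is the modern reading given above. The scanner never looks ahead to see whether a shorter token would make a nicer parse; it just takes the longest match at each position.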
bright@Data-IO.COM (Walter Bright) (01/28/88)
In article <4083@june.cs.washington.edu> pardo@uw-june.UUCP (David Keppel) writes:
<<What, precisely, does x+=*a+++*b mean?
<Well, actually it is ambiguous. There are 2 (or more?) ways to parse
<it. I'm not sure the behavior is defined from lexer to lexer as to
<which one is preferred. Here are some possible interpretations:
< x += *a++ + *b; /* means add *a to *b and increment a */
<the (an) other is
< x += *a + ++*b; /* means increment *b and add that to *a */
It's not ambiguous since the following rule is always applied:
Tokens are formed from the longest possible sequence of characters
that could form a token.
Therefore, (x+=*a+++*b) always parses as (x += * a ++ + * b). If there weren't
this rule, there would be all kinds of problems, as in (x + = * a + ++ * b).
Note that the += was parsed as two separate tokens! Obviously impractical.
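One way to check which reading a given compiler uses is to run the expression and look at what changed. The sketch below is only an illustration; the struct munch_result, the function run_expression, and the sample values are invented for this example.

```c
#include <stddef.h>

/* What the expression left behind (names are hypothetical). */
struct munch_result {
    int x;             /* value of x afterwards */
    ptrdiff_t a_step;  /* how many elements the pointer a moved */
    int b_value;       /* value of *b afterwards */
};

struct munch_result run_expression(void)
{
    int avals[2] = { 3, 100 };
    int bval = 10;
    int *a = avals;
    int *b = &bval;
    int x = 1;
    struct munch_result r;

    x+=*a+++*b;        /* the expression from the thread, unspaced */

    r.x = x;
    r.a_step = a - avals;
    r.b_value = bval;
    return r;
}
```

Under the longest-token reading (x += *a++ + *b) the result is x == 14, a advanced by one element, and *b still 10. The other reading (x += *a + ++*b) would instead give x == 15, a unmoved, and *b == 11, so the two are easy to tell apart.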
al@gtx.com (0732) (01/29/88)
In article <4080@june.cs.washington.edu> pardo@uw-june.UUCP (David Keppel) writes:
->>What, precisely, does x+=*a+++*b mean?
->
->Well, actually it is ambiguous. There are 2 (or more?) ways to parse
->it. I'm not sure the behavior is defined from lexer to lexer as to
->which one is preferred. Here are some possible interpretations:
->
-> tmp = *a + ++*b; /* means increment *b and add that to *a */
-> x += tmp;
-> switch (x) {...}
->
->the (an) other is
->
-> tmp = *a++ + *b; /* means add *a to *b and increment a */
-> x += tmp;
-> switch (x) {...}
->
->Not, of course that this is the most obfuscated code possible, but
->it does get the point across.

Actually, it is not ambiguous. RTFK&R. The issue is whether x+++y
means (x++)+y or x+(++y). According to K&R, p 179, the answer can only
be (x++)+y. The rule is "the next token is taken to include the longest
string of characters which could possibly constitute a token." I think
there is an implicit assumption of left-to-right parsing. Notice that
this makes 1+++b syntactically illegal, even though it might reasonably
be interpreted as 1+(++b).

I'll stick my neck out and say I don't think there ARE any syntactic
ambiguities in C. (there are, of course, the semantic ones resulting
from undefined order of evaluation). The potential ambiguities such
as the dangling else and those involving the comma operator are solved
by fiat.

Are there any syntactic ambiguities K&R didn't resolve?

 ----------------------------------------------------------------------
| Alan Filipski, GTX Corp, 2501 W. Dunlap, Phoenix, Arizona 85021, USA |
| {ihnp4,cbosgd,decvax,hplabs,amdahl}!sun!sunburn!gtx!al (602)870-1696 |
 ----------------------------------------------------------------------
mnc@m10ux.UUCP (Michael Condict) (02/03/88)
In <548@gtx.com>, al@gtx.com (Alan Filipski) writes:
> I'll stick my neck out and say I don't think there ARE any syntactic
> ambiguities in C. (there are, of course, the semantic ones resulting
> from undefined order of evaluation). The potential ambiguities such
> as the dangling else and those involving the comma operator are solved
> by fiat.
>
> Are there any syntactic ambiguities K&R didn't resolve?
>
>  ----------------------------------------------------------------------
> | Alan Filipski, GTX Corp, 2501 W. Dunlap, Phoenix, Arizona 85021, USA |
> | {ihnp4,cbosgd,decvax,hplabs,amdahl}!sun!sunburn!gtx!al (602)870-1696 |
>  ----------------------------------------------------------------------

Yes, there are, at least if you mean context-free ambiguities. That is,
without a symbol table or other context-sensitive techniques, the
following is ambiguous in C:

    { t (x); ... }

If ``typedef ... t;'' appeared previously then this is a declaration of
a var named x of type t, otherwise it is a call to a function t with
argument x.

This is more than a theoretical problem, since it implies that any
parser for C must include a symbol table package. Thus if you wish to
provide a black-box C parser module to be used in multiple applications,
either each application must agree to use the symbol table package
embedded in the parser, or the application must redundantly provide its
own symbol table. Neither prospect is very appealing to me.

--
Michael Condict {ihnp4|vax135|cuae2}!m10ux!mnc
AT&T Bell Labs (201)582-5911 MH 3B-416
Murray Hill, NJ
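The typedef case above can be shown in one file, since a block-scope typedef hides an outer function of the same name. This is a minimal sketch; the counter calls and the function names parsed_as_call and parsed_as_declaration are made up for illustration.

```c
static int calls = 0;

/* At file scope, t is an ordinary function. */
static void t(int x)
{
    calls += x;
}

/* Here "t (x);" is an expression statement: a call to the function t. */
int parsed_as_call(void)
{
    int x = 5;

    calls = 0;
    t (x);
    return calls;
}

/* Here the block-scope typedef hides the function, so the very same
 * tokens "t (x);" declare a variable x of type t (the parentheses are
 * just a parenthesized declarator). */
int parsed_as_declaration(void)
{
    typedef int t;
    t (x);

    x = 7;
    return x;
}
```

In this sketch parsed_as_call() returns 5 and parsed_as_declaration() returns 7. The token sequence "t (x);" is identical in both bodies; only the earlier declaration of t decides what it means, which is why the parser has to consult a symbol table.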
wsmith@uiucdcsb.cs.uiuc.edu (02/05/88)
In <548@gtx.com>, al@gtx.com (Alan Filipski) writes:
> I'll stick my neck out and say I don't think there ARE any syntactic
> ambiguities in C. (there are, of course, the semantic ones resulting
> from undefined order of evaluation). The potential ambiguities such
> as the dangling else and those involving the comma operator are solved
> by fiat.
>
> Are there any syntactic ambiguities K&R didn't resolve?
>
>  ----------------------------------------------------------------------
> | Alan Filipski, GTX Corp, 2501 W. Dunlap, Phoenix, Arizona 85021, USA |
> | {ihnp4,cbosgd,decvax,hplabs,amdahl}!sun!sunburn!gtx!al (602)870-1696 |
>  ----------------------------------------------------------------------

main()
{
    int a,b;

    a = 0; b = 0;

    printf("%0d",a+++b);
}

Is this (a++)+b or a+(++b)?

Bill Smith
pur-ee!uiucdcs!wsmith
wsmith@a.cs.uiuc.edu
john@viper.Lynx.MN.Org (John Stanley) (02/08/88)
In article <165600034@uiucdcsb> wsmith@uiucdcsb.cs.uiuc.edu writes:
>
>main()
>{
> int a,b;
>
> a = 0; b = 0;
>
> printf("%0d",a+++b);
>}
>
>Is this (a++)+b or a+(++b)?
>
>Bill Smith
>pur-ee!uiucdcs!wsmith
>wsmith@a.cs.uiuc.edu

Because the C parser is defined as always taking the largest number
of characters that can be interpreted as a single token, and because
it scans from left to right, the answer should always be:

    ((a++)+b)

---
John Stanley (john@viper.UUCP)
Software Consultant - DynaSoft Systems
UUCP: ...{amdahl,ihnp4,rutgers}!meccts!viper!john
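The answer above can be checked by looking at both the sum and the values left in a and b. The following is a small sketch; the struct triple and the function eval_unspaced are hypothetical names chosen for this example.

```c
/* Result of evaluating a+++b on copies of the arguments. */
struct triple {
    int sum;   /* value of the expression */
    int a;     /* a afterwards */
    int b;     /* b afterwards */
};

struct triple eval_unspaced(int a, int b)
{
    struct triple r;

    r.sum = a+++b;   /* scanned as a ++ + b, i.e. (a++)+b */
    r.a = a;
    r.b = b;
    return r;
}
```

With the values from the quoted program (a = 0, b = 0) the sum is 0, a becomes 1, and b stays 0. Under the a+(++b) reading the sum would have been 1 and b would have changed instead, so the quoted program would print 1 rather than 0.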
daniel@sco.COM (daniel edelson) (02/09/88)
In article <4083@june.cs.washington.edu> pardo@uw-june.UUCP (David Keppel) writes:
<
...
<<< ;-D on (My favorite sintax: switch(x+=*a+++*b){...}) Pardo
<<
<<What, precisely, does x+=*a+++*b mean?
<
<Well, actually it is ambiguous. There are 2 (or more?) ways to parse
<it. I'm not sure the behavior is defined from lexer to lexer as to
<which one is preferred. Here are some possible interpretations:
<
< tmp = *a++ + *b; /* means add *a to *b and increment a */
< x += tmp;
< switch (x) {...}
<
<the (an) other is
<
< tmp = *a + ++*b; /* means increment *b and add that to *a */
< x += tmp;
< switch (x) {...}
<
...
< ;-D on (What point?) Pardo
It is ambiguous in that there are two valid expressions that
could look that way. But correct compilers would only give
one of those two parsings. There is a little-used rule
in K&R that says that when forming tokens, you use the longest
string that could be a token. Thus, the expression a+++b
would be (correctly) parsed as a++ + b. Parsing it
as a + ++b would violate the longest-token rule.
Things become even more obfiscurated in ANSI C with the
introduction of unary plus.
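As a rough illustration of how white space and unary plus interact with the longest-token rule, the sketch below evaluates two spaced variants of the same characters. The struct spelled and both function names are invented for this example, and the second function needs a compiler that accepts ANSI C unary plus.

```c
/* Result of one spelling, evaluated on copies of the arguments. */
struct spelled {
    int sum;   /* value of the expression */
    int a;     /* a afterwards */
    int b;     /* b afterwards */
};

/* "a+ ++b": the blank ends the first token early, so this is
 * a + (++b). */
struct spelled with_preincrement(int a, int b)
{
    struct spelled r;

    r.sum = a+ ++b;
    r.a = a;
    r.b = b;
    return r;
}

/* "a+ + +b": every + is its own token, so this is a + (+(+b)),
 * one binary plus followed by two unary pluses. */
struct spelled with_unary_plus(int a, int b)
{
    struct spelled r;

    r.sum = a+ + +b;
    r.a = a;
    r.b = b;
    return r;
}
```

Starting from a = 1 and b = 10, with_preincrement gives a sum of 12 and leaves b at 11, while with_unary_plus gives a sum of 11 and changes nothing. The same three plus signs therefore have different meanings depending only on where the blanks fall.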
--
uucp: {uunet|decvax!microsoft|ucbvax!ucscc|ihnp4|amd}!sco!daniel
ARPA: daniel@sco.COM Inter: daniel@saturn.ucsc.edu--------------
pingpong: a dink to the right side with lots of top spin | Disclaimed |
fishing: flies in morning and evening, day - spinners | as usual |