jwp@sdcsvax.UUCP (John Pierce) (01/16/86)
Lex in some instances improperly handles expressions involving
trailing context.  The problem is demonstrated by the rule "ab?/[\nb]":

    %{
    %}
    %%
    ab?/[\nb]   {printf("test: yytext=%s\n",yytext);}
    .           {printf("dot: yytext=%s\n",yytext);}
    \n          {printf("newline\n");}
    %%

Given the input "abc", this produces:

    test: yytext=ab
    dot: yytext=c

This is incorrect.  "ab" matches "ab?", but "c" does not match
"/[\nb]"; thus the rule cannot be matched that way.  "a" matches "ab?",
and "b" matches "/[\nb]", so the rule does apply, but the output is
still wrong: input that matches the trailing context part of a rule is
not supposed to be part of yytext for that rule.  Thus, what should be
produced is:

    test: yytext=a
    dot: yytext=b
    dot: yytext=c

This problem is known to exist for 4.{1,2,3beta}BSD VAXen, Sun 2.0,
Pyramids, Celerities (4.2), and V.2 3B20s.  No Version 7 or earlier
systems were tested.

I do not have a fix for this.  The problem *looks* as though the code
in /usr/lib/lex/ncform is at fault somewhere around the loops

    while (lsp-- > yylstate) {
        ...
        ...
        while (yyback((*lsp)->yystops, -*yyfnd) != 1 && lsp > yylstate)
        ...

but it's unclear to me that that is the case.  Using dbx, I was able to
obtain the correct result by forcing yyback() to cause an extra
iteration of the inner loop.

I believe I have found other [not quite analogous] cases where the '?'
operator coupled with "trailing context" causes incorrect results (I
have not yet thoroughly tested them).  This leads me to suspect the
construction of the state tables in such cases, rather than the ncform
code.

                John Pierce, Chemistry, UC San Diego
                jwp@sdcsvax.arpa
                ucbvax!sdcsvax!jwp
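
As a cross-check on the expected output given above, here is a minimal
sketch of the matching rules in Python.  It is not lex's algorithm and
says nothing about where the bug lies; it only models the documented
behavior.  The names RULES and tokenize are hypothetical, and using a
regex lookahead "(?=...)" as a stand-in for the "/" operator is an
assumption that happens to hold for this rule, where the trailing
context is always exactly one character.

```python
import re

# Hypothetical model of the three rules, in the order they appear in the
# spec. Each entry: (name, pattern, length of trailing context consumed
# when choosing the longest match). "(?=...)" stands in for lex's "/".
RULES = [
    ("test", re.compile(r"ab?(?=[\nb])"), 1),
    ("dot", re.compile(r"."), 0),
    ("newline", re.compile(r"\n"), 0),
]


def tokenize(text):
    """Return (rule name, yytext) pairs for the input text."""
    out = []
    pos = 0
    while pos < len(text):
        best = None  # (total match length, rule name, yytext)
        for name, pattern, context_len in RULES:
            m = pattern.match(text, pos)
            if m is None:
                continue
            # The longest match wins, counting the trailing context;
            # strict ">" means ties go to the earlier rule.
            total = len(m.group()) + context_len
            if best is None or total > best[0]:
                best = (total, name, m.group())
        _, name, yytext = best
        out.append((name, yytext))
        # Only yytext is consumed; the trailing context is rescanned.
        pos += len(yytext)
    return out


if __name__ == "__main__":
    for name, yytext in tokenize("abc"):
        print("%s: yytext=%s" % (name, yytext))
```

Run on the input "abc", this model selects the "test" rule with yytext
"a" and leaves "b" and "c" to the "." rule, which agrees with the
expected output stated in the report.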