drt@notecnirp.Princeton.EDU (David Tarditi) (08/10/89)
The previous article by Heirach may contain some errors. The principal error is the confusion of possible lookahead sets for a reduction in a state with the set of expected tokens. The possible lookaheads for a reduction may be larger than the set of expected tokens, given the left context up to the point of the error in the parse. This is because Yacc is an LALR parser and combines states containing identical sets of items. Thus the lookahead for a reduction may contain some terminals for which a reduction is possible, but which will be detected as an error later in the parse before the terminal is even shifted. This problem exists even without default reductions being present. Default reductions merely add the complication, as mentioned before, of ending up in a state where possible shifts have been lost. Here is an example showing how the possible lookaheads for a reduction may be larger than the expected tokens: Consider the following example: S -> A S -> a A b A -> x The LALR(1) set of items is (1) S -> .A , $ goto on A to (2) S -> .a A b, $ shift on a to (4) A -> .x , $ shift on x to (3) (2) S -> A. , $ reduce on $ (3) A -> x. , b/$ reduce on b or $ (4) S -> a .A b , $ goto on A to (5) A -> .x , b shift on x to (3) (5) S -> a A . b , $ shift on b to (6) (6) S -> a A b., $ reduce on $ Consider the input string: x q The actions would be shift on x from state 1 to state 3 error in state 3 - possible lookahead is $ or b, but actually only $ Now look at the input string: a x q shift on a from state 1 to state 4 shift on x from state 4 to state 3 error in state 3 - possible lookahead is $ or b, but actually only b Without default reductions, the set of expected tokens is easy to calculate. Take each token in the lookahead, and parse until either the token is shifted or an error is encountered (whichever happens first). If an error is encountered, it is not possible to shift the token here, and the token is not expected. Heirich does have the right idea for solving the problem with default actions. Grab the state we are in before any reduce actions called for by the current terminal are performed. This can be done by grabbing the state immediately after yylex returns a token, as Heirich points out. When an error is detected, compute a list of states that would be passed through while performing default reductions. Start with the saved state and perform default reductions until none are possible. Then take the list and combine all the possible tokens which could be shifted in each state to get a list of expected tokens. This requires adding code to save the stack for the LR parser and to turn off the execution of semantic actions during the test parsing. ------------------------------- David Tarditi (drt@notecnirp.princeton.edu) -- -- Send compilers articles to compilers@ima.isc.com or, perhaps, Levine@YALE.EDU { decvax | harvard | yale | bbn }!ima. Meta-mail to ima!compilers-request. Please send responses to the author of the message, not the poster.