sankar@Neon.Stanford.EDU (Sriram Sankar) (06/19/91)
Hello!  I'm interested in references/pointers, and also in your views
and opinions, on the following:

1. Tools/algorithms for parsing of general context-free grammars
2. LL-parsing tools/algorithms
3. Advantages and disadvantages of LL and LR parsing as compared to
   each other.

I'll post a summary if there is a good response and sufficient
interest.  Please mail your response to sankar@cs.stanford.edu even if
you think your contribution is trivial.  Thank you very much.

Sriram.

[The only algorithm I know of for parsing general context-free
grammars is Earley's.  LL and LR(n) algorithms are too weak.  -John]
--
Send compilers articles to compilers@iecc.cambridge.ma.us or
{ima | spdcc | world}!iecc!compilers.  Meta-mail to compilers-request.
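[As a concrete illustration of the general algorithm mentioned above,
here is a minimal sketch of an Earley recognizer.  The grammar
encoding and the function name are made up for the example; a real
tool would also build parse trees and share chart items more cleverly.]

```python
def earley_recognize(grammar, start, tokens):
    """Return True if tokens is in the language of the grammar.

    grammar: dict mapping each nonterminal to a list of right-hand
    sides, each a tuple of symbols.  A symbol is a nonterminal (a key
    of the dict) or a terminal (anything else).
    """
    n = len(tokens)
    # chart[i] holds Earley items (lhs, rhs, dot, origin)
    chart = [set() for _ in range(n + 1)]
    for rhs in grammar[start]:
        chart[0].add((start, rhs, 0, 0))
    for i in range(n + 1):
        added = True
        while added:                    # run predictor/completer to a fixed point
            added = False
            for lhs, rhs, dot, origin in list(chart[i]):
                if dot < len(rhs) and rhs[dot] in grammar:
                    # Predictor: expand the nonterminal after the dot
                    for prod in grammar[rhs[dot]]:
                        if (rhs[dot], prod, 0, i) not in chart[i]:
                            chart[i].add((rhs[dot], prod, 0, i))
                            added = True
                elif dot == len(rhs):
                    # Completer: advance items waiting on this nonterminal
                    for l2, r2, d2, o2 in list(chart[origin]):
                        if d2 < len(r2) and r2[d2] == lhs:
                            if (l2, r2, d2 + 1, o2) not in chart[i]:
                                chart[i].add((l2, r2, d2 + 1, o2))
                                added = True
        if i < n:
            # Scanner: match the next input token against terminals
            for lhs, rhs, dot, origin in chart[i]:
                if dot < len(rhs) and rhs[dot] not in grammar \
                        and rhs[dot] == tokens[i]:
                    chart[i + 1].add((lhs, rhs, dot + 1, origin))
    return any(lhs == start and dot == len(rhs) and origin == 0
               for lhs, rhs, dot, origin in chart[n])
```

Note that this handles the ambiguous grammar S -> S S | a, which no
LL(k) or LR(k) parser can accept, in cubic time at worst.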
mauney@eos.ncsu.edu (Dr. Jon Mauney) (06/20/91)
>I'm interested in references/pointers and also your views and opinions
>on the following:
>
>1. Tools/algorithms for parsing of general context-free grammars

%A Susan L. Graham
%A Michael A. Harrison
%A Walter L. Ruzzo
%T An improved context-free recognizer
%J TOPLAS
%V 2
%N 3
%P 415-462
%D July 1980

Describes a general parsing algorithm, including a good discussion of
the relationship between their algorithm, Earley's, and the
Cocke-Kasami-Younger algorithm.

>2. LL-parsing tools/algorithms

The standard compiler texts do a pretty good job.  For gory detail on
LL, including how to test language-equivalence, see Aho & Ullman, "The
Theory of Parsing, Translation, and More Parsing" :-), volume 2.  The
LL(1) parser generator with syntax-error handling described in Fischer
& LeBlanc, "Crafting a Compiler [in C]," is available, as is (I
believe) the one described by Holub in his book.

>3. Advantages and disadvantages of LL and LR parsing as compared to
>   each other.

The LL(1) parsing algorithm is simpler, and so perhaps easier to
understand and implement.  LL(1) tables tend to be a little smaller,
and you should be able to make the parser loop a hair faster than LR
and its popular variants.  But parsing speed is not usually an
important factor, and these days the table size should be irrelevant.

The form of LL(1) grammars is more restrictive: no left recursion, no
common prefixes.  So it is easier to bash a grammar into LALR form
than into LL form.  But once you know what LL(1) requires, it really
is not difficult to make most grammars LL(1).  Mostly it is a matter
of putting things into an idiomatic form; and people who program in C
should have no problems with strange-looking idioms.

Once a grammar is LL, calls to semantic actions can be placed
anywhere.  In LR, actions are normally associated with reductions
only.  LR parsers must restrict the placement of semantic actions,
because in some cases it is not yet clear which production is being
matched, and therefore which action should be done.
These are exactly the places where LL(1) requires you to factor out
the common prefixes.

Recovery from syntax errors is easier with LL(1), because the
information on the stack is straightforward: a list of the grammar
symbols expected to match the rest of the input.  With LR methods, the
stack cleverly encodes information in states, and you must cleverly
decode the states in order to figure out how to repair the error.  A
question that periodically arises on the net is "how do I produce a
list of tokens that could be accepted from a given parse stack in
YACC?"  The answer is "very painfully," but for an LL(1) parser it
should be simple.
--
Jon Mauney, parsing partisan
Computer Science Dept.
N.C. State University
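[Two sketches may make the LL(1) points above concrete.  First, the
"idiomatic form": the left-recursive grammar E -> E '+' T | T,
T -> num becomes the LL(1) grammar E -> T E', E' -> '+' T E' | empty,
and a recursive-descent parser can then run a semantic action at any
point in a production -- here, folding in each term as soon as it is
seen.  The token representation and names are made up for the
example; the tail call on E' is written as the usual loop.]

```python
def parse_sum(tokens):
    """Evaluate a '+'-separated list of integers, LL(1)-style."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def term():                      # T -> num
        nonlocal pos
        value = peek()
        if not isinstance(value, int):
            raise SyntaxError(f"expected a number, got {value!r}")
        pos += 1
        return value

    def expr():                      # E -> T E'
        nonlocal pos
        value = term()               # action: remember the left operand
        while peek() == '+':         # E' -> '+' T E'  (iterated)
            pos += 1                 # consume the '+'
            value += term()          # action: fold in the next term
        return value                 # E' -> empty

    result = expr()
    if peek() is not None:
        raise SyntaxError(f"trailing input at {peek()!r}")
    return result
```

A left-recursive E -> E '+' T would loop forever here, which is why
LL(1) forbids it; the E' form accepts exactly the same language.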
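[And second, the "simple" computation for LL(1): since the stack holds
plain grammar symbols, the tokens acceptable next are just the FIRST
set of the top symbol, walking down the stack past nullable symbols.
The grammar encoding here is illustrative, assuming FIRST sets and the
nullable set have already been computed by the parser generator.]

```python
def expected_tokens(stack, first, nullable):
    """Tokens acceptable next, given an LL(1) parse stack.

    stack:    list of grammar symbols, top of stack at the end.
    first:    dict mapping each nonterminal to the set of terminals
              that can begin a string derived from it.
    nullable: set of nonterminals that can derive the empty string.
    """
    expected = set()
    for symbol in reversed(stack):       # walk from the stack top down
        if symbol not in first:          # a terminal: only it can match
            expected.add(symbol)
            return expected
        expected |= first[symbol]
        if symbol not in nullable:       # this symbol cannot be skipped
            return expected
    return expected                      # whole stack nullable
```

For an LR stack of states, by contrast, one must consult the action
table for the top state (and, for defaulted or merged entries, decode
further), which is the painful part alluded to above.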