johnl@ima.UUCP (Compilers mailing list) (07/13/86)
(Article from Dick Dunn = nbires!rcd) This article moves off on a tangent from Guido van Rossum's article about the Acta Informatica paper on ELL(1) notation and my followup to van Rossum's article. I think that the main significance of ELL(1) notation (or any other major notational extension for programming-language grammars) is that it gives you a form for the grammar which is simultaneously easier to read (and write) and easier to process. For example, it's unnatural for a human reader or writer to turn simple iteration into recursion. It is similarly a nuisance for the parser generator to discover the types of recursion which can be replaced by iteration by a transformation of the grammar. The solution is to introduce iteration as a primitive in the grammar notation; don't make your notation the bottleneck. The same argument can be applied to other common constructs. For example, a common language form is a list with separators. The RE form of this is A (B A)* where A is the list item and B is the separator. This is a nuisance if A is not a single item; it may force you to introduce a nonterminal just to make it a simple item. Why not introduce into the grammar notation a form for "this separated by that"? Other useful notation: - Option--replace (A|) (read "A or nothing") with [A]. This gets rid of the funny "nothing" that you can't see and is common in other notation anyway. - Iteration, one or more--replace A A* with A+. As with the separated list, this eliminates the double appearance of the item in the rule. At first glance it may seem that having an item appear twice in a rule is not that big a deal. However, if you have semantics that have to be attached to items, things get cluttered much faster--and the number of entries in the parse tables increases in direct correspondence to the "clutter" in the rules. The sense of the notational changes is to make the form of a language defi- nition more like a programming language (at a higher level). They allow a much more readable, compact notation. With a little care, they don't complicate the parser generation process any. In fact, they may simplify it since the parser generator need not search for "optimizations" to remove non-essential recursion, fold duplicate states, and the like. --- Dick Dunn {hao,ucbvax,allegra}!nbires!rcd (303)444-5710 x3086 ...Simpler is better. -- Send compilers mail to ima!compilers or, in a pinch to Levine@YALE.EDU Plausible paths are { ihnp4 | decvax | cbosgd | harvard | yale | bbncca}!ima Please send responses to the originator of the message -- I cannot forward mail accidentally sent back to compilers. Meta-mail to ima!compilers-request