[mod.compilers] Opinions sought on combining r.e.'s with LL

johnl@ima.UUCP (Compilers mailing list) (06/29/86)
I just read an article in Acta Informatica about parser generation for
"ELL(1) grammars."

--- Start explanation of ELL(1) grammars ---

An ELL(1) grammar (Extended LL(1) grammar) is a context free grammar
where the right hand sides of rules are not strings of grammar symbols
but regular expressions over the set of grammar symbols.  Regular
expressions in this case use only ( ) | and *; this has not much to
do with r.e.'s as used in Unix programs like ed or (e)grep.  Many common
extensions to grammar notation use this form of regular expressions, or
further operators like + and ?; the operators mentioned here are the
essential ones.

Such a grammar is only called ELL(1) if it has the ELL(1) property,
which is similar to the LL(1) property, and basically means that you can
always choose which path next to take based on the next input token
(unlike LR(1), where this decision can be delayed until a handle has
been recognized).

ELL(1) languages are languages for which an ELL(1) grammar exists; these
are the same as the LL(1) languages, so the ELL(1) paradigm does not add
essential power to the mechanism [I think], unlike LR(1) which really
contains more languages.  However, an ELL(1) grammar for a particular
language is easier to write than an LL(1) grammar, as generally fewer
non-terminals are needed (notably by the replacement of many forms of
recursion by iteration).  Also, the parse tree generated by parsing a
program is often more usable for code generation and other activities
such as static semantic checks or structured editing (my hobby).

--- End explanation ---

All this triggered a different idea in my mind.  The ELL(1) property
requires (amongst others) that the regular expressions correspond to
Deterministic Finite Automata.  In general r.e.'s correspond to
Nondeterministic Finite Automata.  However, my trusty Aho&Ullman tells
me that *every* NFA can be transformed to a DFA (with possibly
exponentially many more states) that recognizes the same language.  This
makes me feel that a larger groups of grammars could be called ELL(1) by
proper redefinition of that term, and that generation of parsing tables
should still be possible.  (The method given in my reference is not
easily adapted to do this, however, at least I don't see trivially how
to do it.)  (The class of languages is still the same.)

And now ready for the questions:
	- has anyone actually used ELL(1) grammars?  experiences?
	- any ideas on how to implement proper error recovery?
	- has anybody done table generation for the change I propose?
	- does the idea of ELL(1) appeal to you?  if not, why not?

Answers sollicited, discussion more than welcome!

	Guido van Rossum, CWI, Amsterdam <guido@mcvax.uucp>
-- 
Send compilers mail to ima!compilers or, in a pinch to Levine@YALE.EDU
Plausible paths are { ihnp4 | decvax | cbosgd | harvard | yale | bbncca}!ima

Please send responses to the originator of the message -- I cannot forward
mail accidentally sent back to compilers.  Meta-mail to ima!compilers-request