markh@csd4.csd.uwm.edu (Mark William Hopkins) (09/19/90)
Most of you should know that a grammar defined with productions whose
right hand sides are allowed to be regular expressions over the set of
terminals and non-terminals is just a fancy Context-Free Grammar.
Now ... let's define a Context-Free Expression (CFE)...
CFE's use the alternation ("|"), concatenation ("."), and iteration ("*"),
just like regular expressions. They have the null expression, and the 'fail'
expression (with null.X = X = X.null, fail.X = fail = X.fail, and
fail| X = X = X | fail).
But in addition, they can be defined by recursive systems of equations
where each equation is of the form:
Non-terminal = Regular-expression in Terminals and Non-terminals.
For instance:
list(a, b, c, d) would be the solution, Z, to the system:
Z = T (d T)*. T = c | a Z b.
If a, b, c, and d are context-free expressions, then so is Z (and so is T).
A CFG-squared grammar is a phrase structure grammar, where each production
is of the form:
NonTerminal -> CFE in Terminals and NonTerminals.
Here is your challenge:
[1] Develop an algorithm that converts each CFG-squared grammar into an
equivalent CFG-grammar.
[2] Define a suitable extension of the LR(k) item-set construction for
CFG-squared grammars. You might be interested to know that this
has been done for CFG-grammars where the right-hand sides are allowed
to be regular expressions. Use this definition to come up with an
enhanced LR(k) parser-generator.
[3] Define LR(k) CFG-squared grammars in analogy to LR(k) CFG grammars.
Develop a theory, write a book :).
If you actually come up with a parser generator algorithm, try it out on
this sample grammar:
Block -> list(Statement, null, '{', '}').
Statement -> 'v' '=' Expression.
Expression -> list('p', 'i', '(', ')').
You can see where all this is ultimately leading to: a new powerful AND
efficient parser-generator technology, that leads to efficiently
'generatable' concise grammar specifications...