markh@csd4.csd.uwm.edu (Mark William Hopkins) (09/19/90)
Most of you should know that a grammar defined with productions whose right hand sides are allowed to be regular expressions over the set of terminals and non-terminals is just a fancy Context-Free Grammar. Now ... let's define a Context-Free Expression (CFE)... CFE's use the alternation ("|"), concatenation ("."), and iteration ("*"), just like regular expressions. They have the null expression, and the 'fail' expression (with null.X = X = X.null, fail.X = fail = X.fail, and fail| X = X = X | fail). But in addition, they can be defined by recursive systems of equations where each equation is of the form: Non-terminal = Regular-expression in Terminals and Non-terminals. For instance: list(a, b, c, d) would be the solution, Z, to the system: Z = T (d T)*. T = c | a Z b. If a, b, c, and d are context-free expressions, then so is Z (and so is T). A CFG-squared grammar is a phrase structure grammar, where each production is of the form: NonTerminal -> CFE in Terminals and NonTerminals. Here is your challenge: [1] Develop an algorithm that converts each CFG-squared grammar into an equivalent CFG-grammar. [2] Define a suitable extension of the LR(k) item-set construction for CFG-squared grammars. You might be interested to know that this has been done for CFG-grammars where the right-hand sides are allowed to be regular expressions. Use this definition to come up with an enhanced LR(k) parser-generator. [3] Define LR(k) CFG-squared grammars in analogy to LR(k) CFG grammars. Develop a theory, write a book :). If you actually come up with a parser generator algorithm, try it out on this sample grammar: Block -> list(Statement, null, '{', '}'). Statement -> 'v' '=' Expression. Expression -> list('p', 'i', '(', ')'). You can see where all this is ultimately leading to: a new powerful AND efficient parser-generator technology, that leads to efficiently 'generatable' concise grammar specifications...