[comp.lang.c] Is right recursive C grammar necessary?

rsc@erc3ba.UUCP (Rich Chomiczewski) (06/15/88)

The ANSI C yacc parser written by Jeff Lee contains right recursive rules.
For example:

	declaration_specifiers
		: storage_class_specifier
	 	| storage_class_specifier declaration_specifiers
			etc.
		;

Is this right recursive grammar required by the ANSI C committee?
Can the above rule be written left recursive and still conform
to the ANSI C standard?
___
Rich (erc3ba!rsc)

markhall@pyramid.pyramid.com (Mark Hall) (06/15/88)

>In article <428@erc3ba.UUCP> rsc@erc3ba.UUCP (Rich Chomiczewski) writes:
>
>The ANSI C yacc parser written by Jeff Lee contains right recursive rules.
>For example:
>
>	declaration_specifiers
>		: storage_class_specifier
>	 	| storage_class_specifier declaration_specifiers
>			etc.
>		;
>
>Is this right recursive grammar required by the ANSI C committee?
>Can the above rule be written left recursive and still conform
>to the ANSI C standard?
>___
>Rich (erc3ba!rsc)

When I first read this, I read ``right-recursive'' as ``left-recursive''
and immediately went into my canned speech about how all left-recursion
can be eliminated in any grammar.  Hopefully, my Cancel msg caught the
missive before anyone read it.

The answer is still `NO'.  The GRAMMAR isn't required by 
anyone, but the LANGUAGE it recognizes IS important.  There is
certainly a left-recursive grammar that recognizes the above 
C-language construct.  This time it is easy to eliminate the 
recursion:

	ds : scs 
	    | scs ds

recognizes the same language as

	ds : ds scs
	    | scs

and this can be proven using induction on the number of scs
productions seen (just act like scs is a terminal and builds
syntax trees for the string ``scs scs scs scs'' for both
grammars).

The production was probably written this way in the first 
place to imply some sort of associative precedence.  This has to
do with the PARSE TREE (implicit or explicit) built during the parse.
Note that, for many constructs, a left recursive grammar 
is useful, because for expressions in Pascal like:
	a - b - c
they naturally build the `correct' parse tree, ie, the one that 
interprets the above as:
	(a - b) - c
a right-recursive production implies a parse tree which makes
the above have the `meaning':
	a - (b - c)
If storage_class_specifiers are right-associative, you probably
want to keep the right-recursive grammar, even though a left-recursive
grammar could recognize the same language (let me point out that
there are ways to build the correct parse tree with a left recursive 
grammar, but you probably just want to use the YACC parse stack 
as your implicit tree, so don't bother).

Note that for LR grammars, it is important to use left-recursion 
whenever it's convenient because that uses less stack
space at parse time.  To be precise, a string of s_c_s productions
of length N requires O(N) stack space to recognize using a right 
recursive grammar, but O(1) space for a left recursive grammar.

smryan@garth.UUCP (Steven Ryan) (06/16/88)

In article <428@erc3ba.UUCP> rsc@erc3ba.UUCP (Rich Chomiczewski) writes:
>Is this right recursive grammar required by the ANSI C committee?
>Can the above rule be written left recursive and still conform
>to the ANSI C standard?

A left recursive grammar can be automatically transformed into right recursive
grammar and vice versa--the defined language remains context-free or regular.

Right recursive or Griebach form is useful for writing a recursive-descent
parser or using an LL(k) parse generator. Left or right recursive can be
used for xxLR(k).

The production was probably right recursive to ensure storage-class-specifiers
precede the type-specifier (I don't know what the entire grammar is). A left
recursive rule would require two productions.

A -> xA  |  y

would produce the same language as

A -> A' y
A' -> A'x  | null

sm ryan
april come she will/may she will stay/june she'll change her tune/
july she will fly/august die she must/september i'll remember.

scjones@sdrc.UUCP (Larry Jones) (06/16/88)

In article <428@erc3ba.UUCP>, rsc@erc3ba.UUCP (Rich Chomiczewski) writes:
> 
> The ANSI C yacc parser written by Jeff Lee contains right recursive rules.
> For example:
> 
> 	declaration_specifiers
> 		: storage_class_specifier
> 	 	| storage_class_specifier declaration_specifiers
> 			etc.
> 		;
> 
> Is this right recursive grammar required by the ANSI C committee?
> Can the above rule be written left recursive and still conform
> to the ANSI C standard?

This is a common confusion.

Strictly speaking,  any grammar can be written with either left or
right recursion -- it doesn't make any difference.  All a grammar does is
indicate which sequences of tokens are valid.  The grammar does not say
anything about how the sequence is interpreted, no precedence, no
associativity, no nothing.  How the grammar is written does affect whether
it satisfies various properties, however, and that brings us to the next
part of the discussion.

When writing a parser, it is frequently necessary to have a grammar that
satisfies a particular property such as LL, SLR, or LALR.  Also, it is
usually handy to arrange things so that the resulting parse tree DOES
accurately specify precedence and associativity.  Common computer language
grammars can almost always be written in a way that does this.  However,
how you write the grammar to arrange this depends on the type of parser
you are writing.  For example, an LALR parser such as those generated by
yacc generates a rightmost derivation (the rightmost nonterminal is replaced
at each step) although it does it in reverse (from the bottom up) whereas
an LL parser such as one you might write by hand generates a leftmost
derivation (from the top down).

The practical result of this difference is that if you are writing an LL
parser, you want your grammar to be right recursive since left recursive
grammers do not possess the LL property.  If you are using yacc to generate
a grammar, you want your grammar to be left recursive since that minimizes
the parse stack depth.  So it's just one more example of what you want
depending on what you're doing.

----
Larry Jones                         UUCP: ...!sdrc!scjones
SDRC                                AT&T: (513) 576-2070
2000 Eastman Dr.                    BIX:  ltl
Milford, OH  45150
"When all else fails, read the directions."

smryan@garth.UUCP (Steven Ryan) (06/17/88)

In article <433@dmk3b1.UUCP> dmk@dmk3b1.UUCP (David Keaton) writes:
>     No, and in fact if you're using yacc you want left recursion
>instead, wherever reasonable.  This is because the yacc-generated parser
>has to keep all the states around until the appropriate rules get
>reduced, and with right recursion this can make the stack get huge.

This depends on how much memory you have. On VM it shouldn't matter,
unless YACC is incredibly stupid and Fortranish by using a fixed size
stack. You shouldn't have to stand a grammar on its head for anything
advertised as an LR(1) parse generator.

>     If you were using a recursive descent parser or an LL(1) parser,
>then you would use right recursion to avoid ambiguity.

Not ambiguity, but infinite recursion if you're not careful. An LL(k)
parser needs to know the first k terminals of each production:

      A -> A b.

You need the first k terminals of A b which needs the first k terminals of A
which needs the first k terminals of A b which needs the first k terminals ...

This is not an ambiguity, but it does require cleverness. Or right recursion.

markhall@pyramid.pyramid.com (Mark Hall) (06/18/88)

In article <746@garth.UUCP> smryan@garth.UUCP (Steven Ryan) writes:
>In article <433@dmk3b1.UUCP> dmk@dmk3b1.UUCP (David Keaton) writes:
>>     No, and in fact if you're using yacc you want left recursion
>>instead, wherever reasonable.  This is because ... the stack [will]
>>get huge.
>
>... On VM it shouldn't matter,
>unless YACC is incredibly stupid and Fortranish by using a fixed size
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>stack. 
>

So.... you've never seen the fabled ``yacc stack overflow''
message.  I have.  Of course, it's easy to make the stack size
bigger, by recompiling the y.tab.c file.  But you never know when IT
is going to happen.

-Mark Hall (smart mailer): markhall@pyramid.pyramid.com
	   (uucp paths  ): 
		{amdahl|decwrl|sun|seismo|lll-lcc}!pyramid!markhall

smryan@garth.UUCP (Steven Ryan) (06/19/88)

My only experience with Yacc so far make me wish I had used recursive descent.
I had to chop up an LL(3) grammar just to soothe its ineffectual lookahead
computation.

If Yacc uses a fixed size stack, it is lousy program. It should be fixed
instead of making its users workaround and study its internals. (So much for
modular software.)

Actually, I think Yacc and its ilk are like driver licenses--they give
the general public freedom but think of all those traffic fatalities.

                                   Make it simple--not simplistic.
                                               smryan