[comp.lang.c] LL

quasar@krazykat.ctt.bellcore.com (Laurence R. Brothers) (05/09/89)

First: Please reply via email, since I don't read this group.

I want a program that produces parse trees of c programs. In the
absence of one, I'll take a good grammar and write the program myself.

Since I want the parse tree to at least vaguely resemble the intuitive
structure of the code, I'd prefer a LL grammar that would allow a
recursive-descent parser to be easily written. I realize
that you can't have a true LL(k) grammar for C, but I'd like one
that's close, at least.

Alternatively, I'd like a yacc style LR grammar, but one with
reaonable rules for things like expressions. I currently have
a yacc grammar, but about 64 rules fire just to parse a literal
expression.

For an example of what I don't want, this is the parse of the c
statement "Return y;" in the yacc grammar I have now.

       decl_or_stmt
        stmt
         bal_stmt
          basic_stmt
           return_stmt
            Return
            list_exp
             exp
              assign_exp
               cond_exp
                or_oper_exp
                 and_oper_exp
                  bitor_oper_exp
                   bitxor_oper_exp
                    bitand_oper_exp
                     equ_oper_exp
                      rel_oper_exp
                       shift_oper_exp
                        add_oper_exp
                         mult_oper_exp
                          unary_oper_exp
                           cast_exp
                            prefix_exp
                             postfix_exp
                              primary_exp
                               primary_p2_exp
                                primary_p1_exp
                                 identifier
            ;


	    Laurence R. Brothers (quasar@ctt.bellcore.com)
Bellcore -- Computer Technology Transfer -- Knowledge-Based Systems Development
		 "...to seek the helpless future...."

jos@and.nl (J. Horsmeier) (06/17/91)

Hi there, 

anyone out there having (knowing of) a LL(1) grammar for `C'. ANSI or K&R or
both would be fine. 

I'm just interested in the syntax, no semantics embedded in the syntax of the
grammar (e.g. operator precedence). This would greatly reduce the complexity
of the syntax (I think). I'm currently working on a low profile project:
Reference document generation, given `C' source files. 

The thing is supposed to generate `frames' of reference manuals. We are
working with a home brew LL(1) parser generator, including a lex look alike
NFA->DFA regular expression scanner. For several reasons (interpretation
versus compilation, incremental parser building etc.), we do not want to use
lex/yacc.  Ambiguities (shift/reduce, non factored rules etc.) are okay, the
generator can deal with them ...

thanx in advance,


Jos

J.A. Horsmeier AND Software B.V.        phone : +31 10 4367100
               Westersingel 106/108     fax   : +31 10 4367110
               3015 LD  Rotterdam NL    e-mail: jos@and.nl
[I know of several yacc grammars, but no LL(1) grammar other than the one
implicitly in the Ritchie C compiler. -John]
-- 
Send compilers articles to compilers@iecc.cambridge.ma.us or
{ima | spdcc | world}!iecc!compilers.  Meta-mail to compilers-request.

djones@megatest.UUCP (Dave Jones) (06/18/91)

From article <91-06-021@comp.compilers>, by jos@and.nl (J. Horsmeier):
> ... brew LL(1) parser generator ...
> ... For several reasons (interpretation
> versus compilation, incremental parser building etc.), we do not want to use
> lex/yacc.


Huh? There's nothing that can be done with an LL parser
that can't be done with an LR parser. What do you mean by "interpretation
versus compilation, incremental parser building etc"?

mauney@eos.ncsu.edu (Dr. Jon Mauney) (06/18/91)

I have an LL(1) grammar for C.  It is in the format accepted by my parser
generator (as described in Fischer and LeBlanc "Crafting a Compiler" It
should be easy to convert to any other format.

I don't use this grammar much, so I can't claim it's perfect.  It does
have one place where I deliberately punted: the grammar will accept any
expression as a statement label.  The problem is a common-prefix between
labels and expressions, and I was not willing to factor it up through 13
levels of operator precedence.  If you collapse the expression grammar to
1 level of precedence, it should be easy to factor labels. 

I'd be happy to mail it to you.  Just send me a reply so I know I've got
your address correct.
--
Jon Mauney,  parsing partisan
Computer Science Dept.
N.C. State University
-- 
Send compilers articles to compilers@iecc.cambridge.ma.us or
{ima | spdcc | world}!iecc!compilers.  Meta-mail to compilers-request.

mauney@eos.ncsu.edu (Dr. Jon Mauney) (06/18/91)

I have an LL(1) grammar for C.
It is in the format accepted by my parser generator
(described in Fischer & LeBlanc's "Crafting a Compiler [in C]"

I don't use the grammar much, so I make no claims
that it is fully debugged.  But I'm willing to share it.
--
Jon Mauney,  parsing partisan
Computer Science Dept.
N.C. State University
mauney@csc.ncsu.edu

henry@zoo.toronto.edu (Henry Spencer) (06/18/91)

In article <91-06-021@comp.compilers> jos@and.nl (J. Horsmeier) writes:
>anyone out there having (knowing of) a LL(1) grammar for `C'...

Parsing C using LL(1) is tricky.  (Actually the same is true of LR(1)...)
C is pretty intractable without some special-purpose fudges in the scanner,
which don't fit cleanly into a normal LL(1)/LR(1) skeleton.  Although I
would not call it definitely impossible, I would be very surprised to see
a workable LL(1) grammar for C without such help.

If you want an example of why, consider the following legal C program:

	typedef int t;

	main(){
		t i = sizeof(t);

	t:	i = (t)(i+sizeof(i));
	}

C is *almost* LL(1), but typedefs, labels, casts, and sizeof make for some
interesting parsing problems.  It's doable, but it tends to require ad-hoc
mechanisms rather than just machine-processed grammars.
-- 
"We're thinking about upgrading from    | Henry Spencer @ U of Toronto Zoology
SunOS 4.1.1 to SunOS 3.5."              |  henry@zoo.toronto.edu  utzoo!henry

EGNILGES@pucc.Princeton.EDU (Ed Nilges) (06/19/91)

In article <91-06-021@comp.compilers>, jos@and.nl (J. Horsmeier) writes:

>Hi there,
>
>anyone out there having (knowing of) a LL(1) grammar for `C'. ANSI or K&R or
>both would be fine.

The 2nd Edition of the C programming book includes a grammar for C in
an appendix which, although not LL(1), could be hacked-with until
you get unambiguous FIRST sets.  This is not the most fun thing in
the world (I speak from the experience of hacking the grammar in the
IBM reference manual for the REXX language into LL(1)-hood) but if
you cannot find it you must make it.

jos@and.nl (J. Horsmeier) (06/19/91)

In article <18959@prometheus.megatest.UUCP> djones@megatest.UUCP (Dave Jones) writes:
>From article <91-06-021@comp.compilers>, by jos@and.nl (J. Horsmeier):
>> ... brew LL(1) parser generator ...
>> ... For several reasons (interpretation
>> versus compilation, incremental parser building etc.), we do not want to use
>> lex/yacc.
>
>
>Huh? There's nothing that can be done with an LL parser
>that can't be done with an LR parser. What do you mean by "interpretation
>versus compilation, incremental parser building etc"?

Hi there, 

I know. I do have an ANSI C LALR(1) yac description of the grammar, but
I have to use an interactive parser generator tool (home brew), which allows
a user to fiddle/diddle with a grammar on the fly. Parser and scanner generation
are both wrapped up in one. They want to use the thing for 'automated' 
reference documentation generation. It has some real nice features in it,
but it's LL(1) :-(

The LL(1) grammar is not supposed to be a full fletched 'C' grammar, e.g.
operator precedence may be left out. Recognizing user defined types is done
by the scanner etc. I am certainly *not* very optimistic about the whole
project, but we'll give it a try.

Hope this clarifies things a bit,

Jos

ps

I got a mail this morning: someone is willing to give me a LL(1) C grammar
description, I'll let you know about the results.


|O   J.A. Horsmeier AND Software B.V.        phone : +31 10 4367100   O|
|O                  Westersingel 106/108     fax   : +31 10 4367110   O|
|O                  3015 LD  Rotterdam NL    e-mail: jos@and.nl       O|

volpe@camelback.crd.ge.com (Christopher R Volpe) (06/19/91)

In article <18959@prometheus.megatest.UUCP>, djones@megatest.UUCP (Dave
Jones) writes:
|>Huh? There's nothing that can be done with an LL parser
|>that can't be done with an LR parser. What do you mean by "interpretation
|>versus compilation, incremental parser building etc"?
          
Isn't is just much more convenient to deal with a LR grammar? It's true
that any LR(1) grammar can be converted to an equivalent LL(1) grammar 
that accepts (generates) the same language, but (to me) they are very unnatural
looking and it's harder to tell what's going on. I've LL(1)-ized
grammars before
and the resulting productions don't look much like anything one could associate
meaningful and intuitive action routines to. Am I missing something?

==================
Chris Volpe
G.E. Corporate R&D
volpecr@crd.ge.com

jos@and.nl (J. Horsmeier) (06/19/91)

In article <20726@crdgw1.crd.ge.com> volpe@camelback.crd.ge.com (Christopher R Volpe) writes:
>In article <18959@prometheus.megatest.UUCP>, djones@megatest.UUCP (Dave
>Jones) writes:
>|>Huh? There's nothing that can be done with an LL parser
>|>that can't be done with an LR parser. What do you mean by "interpretation
>|>versus compilation, incremental parser building etc"?
>          
>Isn't is just much more convenient to deal with a LR grammar? It's true
>that any LR(1) grammar can be converted to an equivalent LL(1) grammar 
[...]
>and the resulting productions don't look much like anything one could associate
>meaningful and intuitive action routines to. Am I missing something?
>
>==================
>Chris Volpe
>G.E. Corporate R&D
>volpecr@crd.ge.com

Nope, you've missed nothing, you're right and so is Dave Jones. 
But I have to work with a parser generator that just eats LL(1) grammars. sigh! :-(

On the other hand, it *is* a nice thing to work with.

Jos

ps

You guys outthere are actually helping me, I got a mial (mial?) mail
with a few pointers to LL(1) C grammars (at last!).

|O   J.A. Horsmeier AND Software B.V.        phone : +31 10 4367100   O|
|O                  Westersingel 106/108     fax   : +31 10 4367110   O|
|O                  3015 LD  Rotterdam NL    e-mail: jos@and.nl       O|