[comp.lang.c] YACC grammar for C

karsh@nicmad.UUCP (11/28/86)

[]

  Pardon me if this has been discussed here before.

  I tried to write a yacc grammar for the C language by using the
grammar in K&R appendix A.  I typed in the grammar from the book almost
verbatim.  I've found the following anomaly:

     Under the definition for "function-body" there is a reference to
     the non-terminal "type-decl-list".  However, "type-decl-list" is
     not defined anywhere.

  Is there any difference between "type-decl-list" and "struct-decl-list"?
I.e. can I get away with:

         type-decl-list : struct-decl-list ;

  By the way, does anybody know if there are any known bugs in the K&R
C grammar?  What's a good reference for information on the precise
syntax of the C language?

   Does anybody out there have a yacc grammar for the C language?  How
about for the ANSI C language?  Could you send me a copy?

                          Thanks,

                          Bruce Karsh
                          {ihnp4,uwvax}!nicmad!karsh

greg@utcsri.UUCP (Gregory Smith) (11/29/86)

In article <1338@nicmad.UUCP> karsh@nicmad.UUCP (Bruce Karsh) writes:
>  I tried to write a yacc grammar for the C language by using the
>grammar in K&R appendix A.  I typed in the grammar from the book almost
>verbatim.  I've found the following anomaly:
>
>..."type-decl-list" is not defined anywhere.
>
>  Is there any difference between "type-decl-list" and "struct-decl-list"?
>I.e. can I get away with:
>
>         type-decl-list : struct-decl-list ;
No.
type-decl-list:
	<nothing> /* since type-decl-list is not optional in function-body */
	type-decl-list type-declaration
type-declaration:
	register(opt) type-specifier(opt) declarator-list ;
declarator-list:
	declarator
	declarator-list , declarator

"declarator" is defined on page 216. "struct-decl-list" does not allow
'register', demands the type-specifier, and allows bitfields, so you
can't use that.

As a side issue, K&R uses a right-recursive definition, e.g.
parameter-list can be "identifier, parameter-list". Since you are using
yacc, I hope you are turning these around to a left-recursive form,
i.e. "parameter-list, identifier".
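
One reason the left-recursive form suits yacc: an LR parser can reduce a
left-recursive list as each element arrives, while a right-recursive list
has to be shifted in full before the first reduction, so the parse stack
grows with the length of the list. The C sketch below only counts stack
symbols for the two forms of a comma-separated identifier list. It models
the stack and is not yacc itself; the function names are invented for
illustration.

```c
#include <assert.h>

/* Left-recursive:   list : ID | list ',' ID ;
   Each ID is folded into `list` as soon as it has been shifted. */
int max_depth_left(int n_ids)
{
    int depth = 0, max = 0, i;

    for (i = 0; i < n_ids; i++) {
        if (i > 0)
            depth++;              /* shift ',' */
        depth++;                  /* shift ID */
        if (depth > max)
            max = depth;
        depth = 1;                /* reduce: 1 or 3 symbols become `list` */
    }
    return max;
}

/* Right-recursive:  list : ID | ID ',' list ;
   Nothing can be reduced until the last ID is on the stack. */
int max_depth_right(int n_ids)
{
    int depth = 0, i;

    for (i = 0; i < n_ids; i++) {
        if (i > 0)
            depth++;              /* shift ',' */
        depth++;                  /* shift ID */
    }
    return depth;                 /* peak; the reductions only shrink it */
}

/* Hypothetical self-check: five identifiers. */
void check_depths(void)
{
    assert(max_depth_left(5) == 3);
    assert(max_depth_right(5) == 9);
}
```

In this model the left-recursive list never needs more than three stack
slots, while the right-recursive one needs 2n-1 for n identifiers.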

>  By the way, does anybody know if there are any known bugs in the K&R
>C grammar?

On page 218, 'function-declarator' is given as

	declarator ( parameter-list(opt) )

which is bogus - the way to declare a function
which returns a pointer to an array of 10 ints is

int	(*f())[10];

Therefore if this function has parameters a,b it is defined using

int (*f(a,b))[10]
int a,b;
{ etc...

...which is legal but cannot be produced by the K&R grammar. I think
this function can be declared in some other way which fits into the K&R
grammar, on UNIX compilers.  I would stay away from this, since such
declarations are confusing enough without using 'slang', and because
ANSI C apparently will not allow the 'slang' form.
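
For anyone who wants to try the declaration, here is a small sketch of a
function returning a pointer to an array of 10 ints. It uses a
prototype-style parameter list, which is an assumption made so that current
compilers accept it; the old-style form shown above puts "int a,b;" between
the declarator and the body. The names table and check_f are invented for
the example.

```c
#include <assert.h>

/* Backing storage for the returned pointer (hypothetical name). */
static int table[10];

/* f: function taking (a, b), returning pointer to array of 10 ints.
   The parameter list sits directly after the identifier, inside the
   parentheses that also hold the '*'. */
int (*f(int a, int b))[10]
{
    int i;

    for (i = 0; i < 10; i++)
        table[i] = a + b * i;
    return &table;
}

/* Hypothetical self-check showing how the result is used. */
void check_f(void)
{
    int (*p)[10] = f(1, 2);

    assert((*p)[0] == 1);
    assert((*p)[9] == 19);
    assert(sizeof *p == 10 * sizeof(int));
}
```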

Anyhow...
function-declarator:
	identifier ( parameter-list(opt) )
	function-declarator ( )
	( function-declarator )
	* function-declarator
	function-declarator [ constant-expression ]

The first two lines enforce the restriction that the parameter list
may only be inserted in the 'correct' set of parentheses. Again, I think
existing UNIX compilers do not enforce this restriction, but ANSI C will.
The grammar above also demands that 'identifier' be declared as
'function-returning-something' while the K&R version does not.

Furthermore, "function-definition" should be

	static(opt) type-specifier(opt) function-declarator function-body
	^^^^^^^^^^^
At the top of page 214 there is a statement which indicates that the
grammar given is intended to aid comprehension, which would explain why
they would want to simplify some of the weirder parts at the expense of
absolute accuracy.

While I'm on the subject, I might as well add the only other corrections
that I have scribbled into Appendix A:

Page 194, section 8.4, third line: delete "and storage class".

Page 215, third line, delete "and the conditional operator".
	Add "The conditional operator ?: groups right-to-left."
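
A quick way to see the right-to-left grouping of ?: is to compare an
unparenthesized chain against both explicit groupings. This is only a
sketch; the function names are made up for the example.

```c
#include <assert.h>

/* Unparenthesized chain, as the compiler groups it. */
int pick(int a, int c)       { return a ? 1 : c ? 2 : 3; }

/* Explicit right-to-left grouping. */
int pick_right(int a, int c) { return a ? 1 : (c ? 2 : 3); }

/* Explicit left-to-right grouping, for contrast. */
int pick_left(int a, int c)  { return (a ? 1 : c) ? 2 : 3; }

/* Hypothetical self-check: a = 1, c = 0 separates the two readings. */
void check_pick(void)
{
    assert(pick(1, 0) == pick_right(1, 0));   /* both give 1 */
    assert(pick_left(1, 0) == 2);             /* the other grouping gives 2 */
}
```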
-- 
----------------------------------------------------------------------
Greg Smith     University of Toronto      UUCP: ..utzoo!utcsri!greg
Have vAX, will hack...

henry@utzoo.UUCP (Henry Spencer) (11/30/86)

>    Does anybody out there have a yacc grammar for the C language?  How
> about for the ANSI C language?  Could you send me a copy?

It would be best to work from the syntax in the X3J11 draft standard,
which is probably trivially yaccable.  No, I don't have it on-line.
-- 
				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,decvax,pyramid}!utzoo!henry

lenoil@apple.UUCP (Robert Lenoil) (12/04/86)

In article <1338@nicmad.UUCP> karsh@nicmad.UUCP (Bruce Karsh) writes:
>  By the way, does anybody know if there are any known bugs in the K&R
>C grammar?  What's a good reference for information on the precise
>syntax of the C language?
>
>   Does anybody out there have a yacc grammar for the C language?  How
>about for the ANSI C language?  Could you send me a copy?

Harbison and Steele's book (the exact name I'm not sure of now; I've lent
mine to a friend, but it's something like "C: A Reference Manual") contains
an appendix with a full LR(1) grammar for C.  I've never tried YACCing it,
so I don't know if it works.

jeff@gatech.EDU (Jeff Lee) (12/04/86)

>Harbison and Steele's book (the exact name I'm not sure of now; I've lent
>mine to a friend, but it's something like "C: A Reference Manual") contains
>an appendix with a full LR(1) grammar for C.  I've never tried YACCing it,
>so I don't know if it works.

The grammar in the back of Harbison and Steele is a real LALR(1) grammar
and it YACC's just fine. They have even gone to the trouble of using "open"
and "close" statements to remove the 1 shift/reduce conflict you normally
get when compiling if-then-else statements.
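
The conflict in question is the "dangling else": after "if (a) if (b) s1",
the parser can either shift the "else" or reduce the inner if. C settles it
by pairing the else with the nearest unmatched if, and a grammar split into
open and closed statement forms encodes that choice directly. The C sketch
below shows only the language-level behavior, not the H&S grammar; the
function names are invented, and some compilers warn about the missing
braces, which are left out on purpose.

```c
#include <assert.h>

/* Indented to show how the compiler actually pairs the else. */
int classify(int a, int b)
{
    int r = 0;

    if (a)
        if (b)
            r = 1;
        else            /* belongs to `if (b)`, the nearest unmatched if */
            r = 2;
    return r;
}

/* Hypothetical self-check. */
void check_classify(void)
{
    assert(classify(0, 0) == 0);   /* outer test fails: else is not taken */
    assert(classify(1, 0) == 2);   /* inner test fails: else is taken */
    assert(classify(1, 1) == 1);
}
```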

As a sideline, does anyone know what tool they used to generate their
grammar? The grammar contains all the split states that one would expect
to come from an LALR(1) grammar that has been translated from an extended
BNF. The split states were named strange things with numbers in them, also.
(I haven't had a copy of H&S for over a year now so this is from memory).
-- 
Jeff Lee
CSNet:	Jeff @ GATech		ARPA:	Jeff%GATech.CSNet @ CSNet-Relay.ARPA
uucp:	...!{akgua,allegra,hplabs,ihnp4,linus,seismo,ulysses}!gatech!jeff

michael@orcisi.UUCP (12/06/86)

> The grammar in the back of Harbison and Steele is a real LALR(1) grammar
> and it YACC's just fine. They have even gone to the trouble of using "open"
> and "close" statements to remove the 1 shift/reduce conflict you normally
> get when compiling if-then-else statements.

Has someone typed this in?  I'm sure someone must have.
How about posting it to comp.sources?

montnaro@chenengo.UUCP (12/09/86)

Mod.sources, Volume 1, has a yacc/lex grammar for ANSI C circa 11/84. This
should be a reasonable starting point.

Skip Montanaro

ARPA: montanaro%desdemona.tcpip@ge-crd.arpa
UUCP: seismo!rochester!steinmetz!desdemona!montanaro
GE DECnet: csbvax::mrgate!montanaro@desdemona@smtp@tcpgateway