[comp.lang.c] ANSI C grammar

bright@dataio.Data-IO.COM (Walter Bright) (01/06/87)

Anybody got a copy of the ANSI C grammar that came over the net a while
ago (mebbe a year)? I threw mine away, figuring I would never need it...

Please email if possible.
Thanks in advance,
			Walter Bright

jeff@gatech.edu (Jeff Lee) (12/10/87)

At long last, I finally decided to update the old ANSI C grammar
that I posted about 3 years ago and that Arnold posted for me (again)
about two years ago. This has been updated to reflect the changes made
up to the current draft of November 9, 1987.

I made a small fix in the grammar since it had a bug in it. See the
README file for a description of when, where, what, why,...

Enjoy,
	Jeff Lee
------------------------------------------------------------------------------
#! /bin/sh
# This is a shell archive, meaning:
# 1. Remove everything above the #! /bin/sh line.
# 2. Save the resulting text in a file.
# 3. Execute the file with /bin/sh (not csh) to create:
#	README
#	Makefile
#	gram.y
#	scan.l
#	main.c
# This archive created: Wed Dec  9 16:06:32 1987
export PATH; PATH=/bin:/usr/bin:$PATH
echo shar: "extracting 'README'" '(1367 characters)'
if test -f 'README'
then
	echo shar: "will not over-write existing file 'README'"
else
cat << \SHAR_EOF > 'README'
The files in this directory contain the ANSI C grammar from the November 9,
1987 draft of the proposed standard. It comes with a bozo little scanner and
a main program to call it to test the grammar (sort of). The scanner is known
NOT to be an ANSI draft compatible scanner. It is there just for the purpose
of testing things a little bit. Lines will not be hacked together in the right
spots and there is no preprocessor on the front end.

There is a bug in the grammar as distributed by the ANSI committee that is
fixed in this grammar. On lines 250 and 252, I replaced the type_specifier_list
that used to be there by a specifier_qualifier_list. This seems to be
consistent with the changes that were made on the grammar for this release.
Other than that, there is one shift/reduce error. This is the typical
if/then/else error because people don't want to ugly up their grammar and YACC
still creates the correct parser. You can find out how to remove this error
using any good compiler book, but if anyone is interested, write and I'll
send you the productions that remove it (probably diffs).

Pound away at it, junior compiler hackers, and don't forget to send in your
comments and queries on the new draft.

Jeff Lee	School of Information and Computer Science, Georgia Tech
Internet:	jeff@gatech.edu
uucp:		...!{decvax,hplabs,ihnp4,linus,rutgers}!gatech!jeff
SHAR_EOF
fi
echo shar: "extracting 'Makefile'" '(249 characters)'
if test -f 'Makefile'
then
	echo shar: "will not over-write existing file 'Makefile'"
else
cat << \SHAR_EOF > 'Makefile'
YFLAGS	= -dv
CFLAGS	= -O
LFLAGS	=

SRC	= gram.y scan.l main.c
OBJ	= gram.o scan.o main.o

a.out	: $(OBJ)
	cc $(CFLAGS) $(OBJ)

scan.o	: y.tab.h

shar	:
	shar -v README Makefile gram.y scan.l main.c >C-shar

clean	:
	rm -f a.out y.tab.h y.output *.o
SHAR_EOF
fi
echo shar: "extracting 'gram.y'" '(7639 characters)'
if test -f 'gram.y'
then
	echo shar: "will not over-write existing file 'gram.y'"
else
cat << \SHAR_EOF > 'gram.y'
%token IDENTIFIER CONSTANT STRING_LITERAL SIZEOF
%token PTR_OP INC_OP DEC_OP LEFT_OP RIGHT_OP LE_OP GE_OP EQ_OP NE_OP
%token AND_OP OR_OP MUL_ASSIGN DIV_ASSIGN MOD_ASSIGN ADD_ASSIGN
%token SUB_ASSIGN LEFT_ASSIGN RIGHT_ASSIGN AND_ASSIGN
%token XOR_ASSIGN OR_ASSIGN TYPE_NAME

%token TYPEDEF EXTERN STATIC AUTO REGISTER
%token CHAR SHORT INT LONG SIGNED UNSIGNED FLOAT DOUBLE CONST VOLATILE VOID
%token STRUCT UNION ENUM ELIPSIS

%token CASE DEFAULT IF ELSE SWITCH WHILE DO FOR GOTO CONTINUE BREAK RETURN

%start translation_unit
%%

primary_expr
	: identifier
	| CONSTANT
	| STRING_LITERAL
	| '(' expr ')'
	;

postfix_expr
	: primary_expr
	| postfix_expr '[' expr ']'
	| postfix_expr '(' ')'
	| postfix_expr '(' argument_expr_list ')'
	| postfix_expr '.' identifier
	| postfix_expr PTR_OP identifier
	| postfix_expr INC_OP
	| postfix_expr DEC_OP
	;

argument_expr_list
	: assignment_expr
	| argument_expr_list ',' assignment_expr
	;

unary_expr
	: postfix_expr
	| INC_OP unary_expr
	| DEC_OP unary_expr
	| unary_operator cast_expr
	| SIZEOF unary_expr
	| SIZEOF '(' type_name ')'
	;

unary_operator
	: '&' | '*' | '+' | '-' | '~' | '!'
	;

cast_expr
	: unary_expr
	| '(' type_name ')' cast_expr
	;

multiplicative_expr
	: cast_expr
	| multiplicative_expr '*' cast_expr
	| multiplicative_expr '/' cast_expr
	| multiplicative_expr '%' cast_expr
	;

additive_expr
	: multiplicative_expr
	| additive_expr '+' multiplicative_expr
	| additive_expr '-' multiplicative_expr
	;

shift_expr
	: additive_expr
	| shift_expr LEFT_OP additive_expr
	| shift_expr RIGHT_OP additive_expr
	;

relational_expr
	: shift_expr
	| relational_expr '<' shift_expr
	| relational_expr '>' shift_expr
	| relational_expr LE_OP shift_expr
	| relational_expr GE_OP shift_expr
	;

equality_expr
	: relational_expr
	| equality_expr EQ_OP relational_expr
	| equality_expr NE_OP relational_expr
	;

and_expr
	: equality_expr
	| and_expr '&' equality_expr
	;

exclusive_or_expr
	: and_expr
	| exclusive_or_expr '^' and_expr
	;

inclusive_or_expr
	: exclusive_or_expr
	| inclusive_or_expr '|' exclusive_or_expr
	;

logical_and_expr
	: inclusive_or_expr
	| logical_and_expr AND_OP inclusive_or_expr
	;

logical_or_expr
	: logical_and_expr
	| logical_or_expr OR_OP logical_and_expr
	;

conditional_expr
	: logical_or_expr
	| logical_or_expr '?' expr ':' conditional_expr
	;

assignment_expr
	: conditional_expr
	| unary_expr assignment_operator assignment_expr
	;

assignment_operator
	: '=' | MUL_ASSIGN | DIV_ASSIGN | MOD_ASSIGN | ADD_ASSIGN | SUB_ASSIGN
	| LEFT_ASSIGN | RIGHT_ASSIGN | AND_ASSIGN | XOR_ASSIGN | OR_ASSIGN
	;

expr
	: assignment_expr
	| expr ',' assignment_expr
	;

constant_expr
	: conditional_expr
	;

declaration
	: declaration_specifiers ';'
	| declaration_specifiers init_declarator_list ';'
	;

declaration_specifiers
	: storage_class_specifier
	| storage_class_specifier declaration_specifiers
	| type_specifier
	| type_specifier declaration_specifiers
	| type_qualifier
	| type_qualifier declaration_specifiers
	;

init_declarator_list
	: init_declarator
	| init_declarator_list ',' init_declarator
	;

init_declarator
	: declarator
	| declarator '=' initializer
	;

storage_class_specifier
	: TYPEDEF | EXTERN | STATIC | AUTO | REGISTER
	;

type_specifier
	: VOID | CHAR | SHORT | INT | LONG
	| FLOAT | DOUBLE | SIGNED | UNSIGNED
	| struct_or_union_specifier
	| enum_specifier
	| TYPE_NAME
	;

struct_or_union_specifier
	: struct_or_union '{' struct_declaration_list '}'
	| struct_or_union identifier '{' struct_declaration_list '}'
	| struct_or_union identifier
	;

struct_or_union
	: STRUCT
	| UNION
	;

struct_declaration_list
	: struct_declaration
	| struct_declaration_list struct_declaration
	;

struct_declaration
	: specifier_qualifier_list struct_declarator_list ';'
	;

specifier_qualifier_list
	: type_specifier
	| type_specifier specifier_qualifier_list
	| type_qualifier
	| type_qualifier specifier_qualifier_list
	;

struct_declarator_list
	: struct_declarator
	| struct_declarator_list ',' struct_declarator
	;

struct_declarator
	: declarator
	| ':' constant_expr
	| declarator ':' constant_expr
	;

enum_specifier
	: ENUM '{' enumerator_list '}'
	| ENUM identifier '{' enumerator_list '}'
	| ENUM identifier
	;

enumerator_list
	: enumerator
	| enumerator_list ',' enumerator
	;

enumerator
	: identifier
	| identifier '=' constant_expr
	;

type_qualifier
	: CONST | VOLATILE
	;

declarator
	: direct_declarator
	| pointer direct_declarator
	;

direct_declarator
	: identifier
	| '(' declarator ')'
	| direct_declarator '[' ']'
	| direct_declarator '[' constant_expr ']'
	| direct_declarator '(' parameter_type_list ')'
	| direct_declarator '(' ')'
	| direct_declarator '(' identifier_list ')'
	;

pointer
	: '*'
	| '*' specifier_qualifier_list
	| '*' pointer
	| '*' specifier_qualifier_list pointer
	;

parameter_type_list
	: parameter_list
	| parameter_list ',' ELIPSIS
	;

parameter_list
	: parameter_declaration
	| parameter_list ',' parameter_declaration
	;

parameter_declaration
	: declaration_specifiers declarator
	| declaration_specifiers
	| declaration_specifiers abstract_declarator
	;

identifier_list
	: identifier
	| identifier_list ',' identifier
	;

type_name
	: specifier_qualifier_list
	| specifier_qualifier_list abstract_declarator
	;

abstract_declarator
	: pointer
	| direct_abstract_declarator
	| pointer direct_abstract_declarator
	;

direct_abstract_declarator
	: '(' abstract_declarator ')'
	| '[' ']'
	| '[' constant_expr ']'
	| direct_abstract_declarator '[' ']'
	| direct_abstract_declarator '[' constant_expr ']'
	| '(' ')'
	| '(' parameter_type_list ')'
	| direct_abstract_declarator '(' ')'
	| direct_abstract_declarator '(' parameter_type_list ')'
	;

initializer
	: assignment_expr
	| '{' initializer_list '}'
	| '{' initializer_list ',' '}'
	;

initializer_list
	: initializer
	| initializer_list ',' initializer
	;

statement
	: labeled_statement
	| compound_statement
	| expression_statement
	| selection_statement
	| iteration_statement
	| jump_statement
	;

labeled_statement
	: identifier ':' statement
	| CASE constant_expr ':' statement
	| DEFAULT ':' statement
	;

compound_statement
	: '{' '}'
	| '{' statement_list '}'
	| '{' declaration_list '}'
	| '{' declaration_list statement_list '}'
	;

declaration_list
	: declaration
	| declaration_list declaration
	;

statement_list
	: statement
	| statement_list statement
	;

expression_statement
	: ';'
	| expr ';'
	;

selection_statement
	: IF '(' expr ')' statement
	| IF '(' expr ')' statement ELSE statement
	| SWITCH '(' expr ')' statement
	;

iteration_statement
	: WHILE '(' expr ')' statement
	| DO statement WHILE '(' expr ')' ';'
	| FOR '(' ';' ';' ')' statement
	| FOR '(' ';' ';' expr ')' statement
	| FOR '(' ';' expr ';' ')' statement
	| FOR '(' ';' expr ';' expr ')' statement
	| FOR '(' expr ';' ';' ')' statement
	| FOR '(' expr ';' ';' expr ')' statement
	| FOR '(' expr ';' expr ';' ')' statement
	| FOR '(' expr ';' expr ';' expr ')' statement
	;

jump_statement
	: GOTO identifier ';'
	| CONTINUE ';'
	| BREAK ';'
	| RETURN ';'
	| RETURN expr ';'
	;

translation_unit
	: external_declaration
	| translation_unit external_declaration
	;

external_declaration
	: function_definition
	| declaration
	;

function_definition
	: declarator compound_statement
	| declarator declaration_list compound_statement
	| declaration_specifiers declarator compound_statement
	| declaration_specifiers declarator declaration_list compound_statement
	;

identifier
	: IDENTIFIER
	;
%%

#include <stdio.h>

extern char yytext[];
extern int column;

yyerror(s)
char *s;
{
	fflush(stdout);
	printf("\n%*s\n%*s\n", column, "^", column, s);
}
SHAR_EOF
fi
echo shar: "extracting 'scan.l'" '(4234 characters)'
if test -f 'scan.l'
then
	echo shar: "will not over-write existing file 'scan.l'"
else
cat << \SHAR_EOF > 'scan.l'
D			[0-9]
O			[0-7]
L			[a-zA-Z_]
H			[a-fA-F0-9]
E			[Ee][+-]?{D}+
FS			(f|F|l|L)
IS			(u|U|l|L)*

%{
#include <stdio.h>
#include "y.tab.h"

void count();
%}

%%
"/*"			{ comment(); }

"auto"			{ count(); return(AUTO); }
"break"			{ count(); return(BREAK); }
"case"			{ count(); return(CASE); }
"char"			{ count(); return(CHAR); }
"const"			{ count(); return(CONST); }
"continue"		{ count(); return(CONTINUE); }
"default"		{ count(); return(DEFAULT); }
"do"			{ count(); return(DO); }
"double"		{ count(); return(DOUBLE); }
"else"			{ count(); return(ELSE); }
"enum"			{ count(); return(ENUM); }
"extern"		{ count(); return(EXTERN); }
"float"			{ count(); return(FLOAT); }
"for"			{ count(); return(FOR); }
"goto"			{ count(); return(GOTO); }
"if"			{ count(); return(IF); }
"int"			{ count(); return(INT); }
"long"			{ count(); return(LONG); }
"register"		{ count(); return(REGISTER); }
"return"		{ count(); return(RETURN); }
"short"			{ count(); return(SHORT); }
"signed"		{ count(); return(SIGNED); }
"sizeof"		{ count(); return(SIZEOF); }
"static"		{ count(); return(STATIC); }
"struct"		{ count(); return(STRUCT); }
"switch"		{ count(); return(SWITCH); }
"typedef"		{ count(); return(TYPEDEF); }
"union"			{ count(); return(UNION); }
"unsigned"		{ count(); return(UNSIGNED); }
"void"			{ count(); return(VOID); }
"volatile"		{ count(); return(VOLATILE); }
"while"			{ count(); return(WHILE); }

{L}({L}|{D})*		{ count(); return(check_type()); }

0[xX]{H}+{IS}?		{ count(); return(CONSTANT); }
0{O}*{IS}?		{ count(); return(CONSTANT); }
{D}+{IS}?		{ count(); return(CONSTANT); }
'(\\.|[^\\'])+'		{ count(); return(CONSTANT); }

{D}+{E}{FS}?		{ count(); return(CONSTANT); }
{D}*"."{D}+({E})?{FS}?	{ count(); return(CONSTANT); }
{D}+"."({E})?{FS}?	{ count(); return(CONSTANT); }

\"(\\.|[^\\"])*\"	{ count(); return(STRING_LITERAL); }

"..."			{ count(); return(ELIPSIS); }
">>="			{ count(); return(RIGHT_ASSIGN); }
"<<="			{ count(); return(LEFT_ASSIGN); }
"+="			{ count(); return(ADD_ASSIGN); }
"-="			{ count(); return(SUB_ASSIGN); }
"*="			{ count(); return(MUL_ASSIGN); }
"/="			{ count(); return(DIV_ASSIGN); }
"%="			{ count(); return(MOD_ASSIGN); }
"&="			{ count(); return(AND_ASSIGN); }
"^="			{ count(); return(XOR_ASSIGN); }
"|="			{ count(); return(OR_ASSIGN); }
">>"			{ count(); return(RIGHT_OP); }
"<<"			{ count(); return(LEFT_OP); }
"++"			{ count(); return(INC_OP); }
"--"			{ count(); return(DEC_OP); }
"->"			{ count(); return(PTR_OP); }
"&&"			{ count(); return(AND_OP); }
"||"			{ count(); return(OR_OP); }
"<="			{ count(); return(LE_OP); }
">="			{ count(); return(GE_OP); }
"=="			{ count(); return(EQ_OP); }
"!="			{ count(); return(NE_OP); }
";"			{ count(); return(';'); }
"{"			{ count(); return('{'); }
"}"			{ count(); return('}'); }
","			{ count(); return(','); }
":"			{ count(); return(':'); }
"="			{ count(); return('='); }
"("			{ count(); return('('); }
")"			{ count(); return(')'); }
"["			{ count(); return('['); }
"]"			{ count(); return(']'); }
"."			{ count(); return('.'); }
"&"			{ count(); return('&'); }
"!"			{ count(); return('!'); }
"~"			{ count(); return('~'); }
"-"			{ count(); return('-'); }
"+"			{ count(); return('+'); }
"*"			{ count(); return('*'); }
"/"			{ count(); return('/'); }
"%"			{ count(); return('%'); }
"<"			{ count(); return('<'); }
">"			{ count(); return('>'); }
"^"			{ count(); return('^'); }
"|"			{ count(); return('|'); }
"?"			{ count(); return('?'); }

[ \t\v\n\f]		{ count(); }
.			{ /* ignore bad characters */ }

%%

yywrap()
{
	return(1);
}

comment()
{
	char c, c1;

	ECHO;	/* leading "/*" */

loop:
	while ((c = input()) != '*' && c != 0)
		putchar(c);

	putchar(c);	/* the  '*' */

	if (c != 0 && (c1 = input()) != '/' && c1 != 0)
	{
		unput(c1);
		goto loop;
	}

	if (c != 0 && c1 != 0)
		putchar(c1);
}

int column = 0;

void count()
{
	int i;

	for (i = 0; yytext[i] != '\0'; i++)
		if (yytext[i] == '\n')
			column = 0;
		else if (yytext[i] == '\t')
			column += 8 - (column % 8);
		else
			column++;

	ECHO;
}

int check_type()
{
/*
* pseudo code --- this is what it should check
*
*	if (yytext == type_name)
*		return(TYPE_NAME);
*
*	return(IDENTIFIER);
*/

/*
*	it actually will only return IDENTIFIER
*/

	return(IDENTIFIER);
}
SHAR_EOF
fi
echo shar: "extracting 'main.c'" '(48 characters)'
if test -f 'main.c'
then
	echo shar: "will not over-write existing file 'main.c'"
else
cat << \SHAR_EOF > 'main.c'
main()
{
	int yyparse();

	return(yyparse());
}
SHAR_EOF
fi
exit 0
#	End of shell archive
-- 
Jeff Lee
Internet:	jeff@gatech.edu
uucp:		...!{decvax,hplabs,ihnp4,linus,rutgers}!gatech!jeff

hjelm@g.gp.cs.cmu.edu (Mark Hjelm) (06/04/90)

Now that there is a real standard, are there any significant changes
to the grammars in K&R II or Harbison/Steele II (or to the language
specification in general, for that matter)?  Assuming I translate them
directly into a YACC source file, is there any reason to choose one
over the other?  I don't have a copy of the standard yet.  I assume it
has a grammar specification.  Is it similar to (the same as) one of
these two?  Lastly, has anyone produced an up-to-date YACC grammar
(and LEX scanner) already?  If so, please post it or send e-mail.  I
have one that was posted several years ago that I will modify, if
there isn't anything more recent.


Thanks,
Mark Hjelm

hjelm@cs.cmu.edu

mjp@saturn.wustl.edu (Mark J. Patton) (02/22/91)

I apologize in advance if this question has been asked recently, but I am
very new to this group. However, an answer would be of tremendous help. 

Can someone please tell me where I can find a listing of the Grammar for
ANSI C? A publication, a book, or any reference for that matter would be
great. Even better, does anyone know where I could find the ANSI C grammar
already in "yaccable" form (i.e. ready for input to yacc)?

All responses are greatly appreciated.

---
Mark Patton
ESSRL Washington University
mjp@saturn.wustl.edu

 

jik@athena.mit.edu (Jonathan I. Kamens) (02/22/91)

  If you are new to this group, then you should read the monthly Frequently
Asked Questions posting before posting any questions.

  In particular, your question about a C grammar is question 89 of the FAQ
posting, and it is answered in that posting.  I've included the question and
answer below.

  If the posting has expired at your site, feel free to send me E-mail and I'll
send it to you, or wait until the first of the month, when it should be posted
again.

-- 
Jonathan Kamens			              USnail:
MIT Project Athena				11 Ashford Terrace
jik@Athena.MIT.EDU				Allston, MA  02134
Office: 617-253-8085			      Home: 617-782-0710

89. Where can I get a YACC grammar for C?

A:  The definitive grammar is of course the one in the ANSI standard.
    Several copies are floating around; keep your eyes open.  There is
    one on uunet.uu.net (192.48.96.2) in net.sources/ansi.c.grammar.Z .
    FSF's GNU C compiler contains a grammar, as does the appendix to
    K&R II.

    References: ANSI Sec. A.2 .

gross@speedy.ada.cci.de (Arno Gross) (04/25/91)

We are looking for an ANSI-C grammar which is prepared for a tool
like YACC. Who has such a grammar or knows where we can get it from?

Any help would be greatly appreciated.

                  ________________
                 /   A. Gross   / \
                /   A. Miethe  /   \
               /   CCI GmbH   /     \
              /   Lohberg 10 /       \
             /  Postf. 1225 /     /   \
            /______________/    /  \   \
            \ 4470 Meppen  \  / \   /   /
             \  Tel.:       \ \  \/    /
              \  05931/      \ \/     /
               \  805 433     \      /   
                \ miethe@cci.de\    /
                 \ gross@cce.de \  /
                  \______________\/
 
A. Gross and A. Miethe