src@cup.portal.com (Steve R Calwas) (05/16/89)
I am looking for a public domain or inexpensive YACC grammer definition for C and/or C++. LEX definitions of same are also desired. Anyone having information about sources for these would be much appreciated. E-mail or post. (For example, is a YACC grammer for C a part of the ANSI committee's output?) Thanks for any help. Steve Calwas src@cup.portal.com Santa Clara, CA ...!sun!cup.portal.com!src
james@sparrmsuucp (James Buchanan) (05/19/89)
In article <18416@cup.portal.com>, src@cup.portal.com (Steve R Calwas) writes: > I am looking for a public domain or inexpensive YACC grammer definition for > C and/or C++. LEX definitions of same are also desired. Anyone having > information about sources for these would be much appreciated. E-mail or > post. (For example, is a YACC grammer for C a part of the ANSI committee's > output?) Thanks for any help. > Here's an ANSI C yacc file. By the time this got to me the author's name had been stripped off, but rumour has it that he/she is involved with the ANSI C committee. ****** CUT here - gram.y ******* %token IDENTIFIER CONSTANT STRING_LITERAL SIZEOF %token PTR_OP INC_OP DEC_OP LEFT_OP RIGHT_OP LE_OP GE_OP EQ_OP NE_OP %token AND_OP OR_OP MUL_ASSIGN DIV_ASSIGN MOD_ASSIGN ADD_ASSIGN %token SUB_ASSIGN LEFT_ASSIGN RIGHT_ASSIGN AND_ASSIGN %token XOR_ASSIGN OR_ASSIGN TYPE_NAME %token TYPEDEF EXTERN STATIC AUTO REGISTER %token CHAR SHORT INT LONG SIGNED UNSIGNED FLOAT DOUBLE CONST VOLATILE VOID %token STRUCT UNION ENUM ELIPSIS RANGE %token CASE DEFAULT IF ELSE SWITCH WHILE DO FOR GOTO CONTINUE BREAK RETURN %start file %% primary_expr : identifier | CONSTANT | STRING_LITERAL | '(' expr ')' ; postfix_expr : primary_expr | postfix_expr '[' expr ']' | postfix_expr '(' ')' | postfix_expr '(' argument_expr_list ')' | postfix_expr '.' identifier | postfix_expr PTR_OP identifier | postfix_expr INC_OP | postfix_expr DEC_OP ; argument_expr_list : assignment_expr | argument_expr_list ',' assignment_expr ; unary_expr : postfix_expr | INC_OP unary_expr | DEC_OP unary_expr | unary_operator cast_expr | SIZEOF unary_expr | SIZEOF '(' type_name ')' ; unary_operator : '&' | '*' | '+' | '-' | '~' | '!' ; cast_expr : unary_expr | '(' type_name ')' cast_expr ; multiplicative_expr : cast_expr | multiplicative_expr '*' cast_expr | multiplicative_expr '/' cast_expr | multiplicative_expr '%' cast_expr ; additive_expr : multiplicative_expr | additive_expr '+' multiplicative_expr | additive_expr '-' multiplicative_expr ; shift_expr : additive_expr | shift_expr LEFT_OP additive_expr | shift_expr RIGHT_OP additive_expr ; relational_expr : shift_expr | relational_expr '<' shift_expr | relational_expr '>' shift_expr | relational_expr LE_OP shift_expr | relational_expr GE_OP shift_expr ; equality_expr : relational_expr | equality_expr EQ_OP relational_expr | equality_expr NE_OP relational_expr ; and_expr : equality_expr | and_expr '&' equality_expr ; exclusive_or_expr : and_expr | exclusive_or_expr '^' and_expr ; inclusive_or_expr : exclusive_or_expr | inclusive_or_expr '|' exclusive_or_expr ; logical_and_expr : inclusive_or_expr | logical_and_expr AND_OP inclusive_or_expr ; logical_or_expr : logical_and_expr | logical_or_expr OR_OP logical_and_expr ; conditional_expr : logical_or_expr | logical_or_expr '?' logical_or_expr ':' conditional_expr ; assignment_expr : conditional_expr | unary_expr assignment_operator assignment_expr ; assignment_operator : '=' | MUL_ASSIGN | DIV_ASSIGN | MOD_ASSIGN | ADD_ASSIGN | SUB_ASSIGN | LEFT_ASSIGN | RIGHT_ASSIGN | AND_ASSIGN | XOR_ASSIGN | OR_ASSIGN ; expr : assignment_expr | expr ',' assignment_expr ; constant_expr : conditional_expr ; declaration : declaration_specifiers ';' | declaration_specifiers init_declarator_list ';' ; declaration_specifiers : storage_class_specifier | storage_class_specifier declaration_specifiers | type_specifier | type_specifier declaration_specifiers ; init_declarator_list : init_declarator | init_declarator_list ',' init_declarator ; init_declarator : declarator | declarator '=' initializer ; storage_class_specifier : TYPEDEF | EXTERN | STATIC | AUTO | REGISTER ; type_specifier : CHAR | SHORT | INT | LONG | SIGNED | UNSIGNED | FLOAT | DOUBLE | CONST | VOLATILE | VOID | struct_or_union_specifier | enum_specifier | TYPE_NAME ; struct_or_union_specifier : struct_or_union identifier '{' struct_declaration_list '}' | struct_or_union '{' struct_declaration_list '}' | struct_or_union identifier ; struct_or_union : STRUCT | UNION ; struct_declaration_list : struct_declaration | struct_declaration_list struct_declaration ; struct_declaration : type_specifier_list struct_declarator_list ';' ; struct_declarator_list : struct_declarator | struct_declarator_list ',' struct_declarator ; struct_declarator : declarator | ':' constant_expr | declarator ':' constant_expr ; enum_specifier : ENUM '{' enumerator_list '}' | ENUM identifier '{' enumerator_list '}' | ENUM identifier ; enumerator_list : enumerator | enumerator_list ',' enumerator ; enumerator : identifier | identifier '=' constant_expr ; declarator : declarator2 | pointer declarator2 ; declarator2 : identifier | '(' declarator ')' | declarator2 '[' ']' | declarator2 '[' constant_expr ']' | declarator2 '(' ')' | declarator2 '(' parameter_type_list ')' | declarator2 '(' parameter_identifier_list ')' ; pointer : '*' | '*' type_specifier_list | '*' pointer | '*' type_specifier_list pointer ; type_specifier_list : type_specifier | type_specifier_list type_specifier ; parameter_identifier_list : identifier_list | identifier_list ',' ELIPSIS ; identifier_list : identifier | identifier_list ',' identifier ; parameter_type_list : parameter_list | parameter_list ',' ELIPSIS ; parameter_list : parameter_declaration | parameter_list ',' parameter_declaration ; parameter_declaration : type_specifier_list declarator | type_name ; type_name : type_specifier_list | type_specifier_list abstract_declarator ; abstract_declarator : pointer | abstract_declarator2 | pointer abstract_declarator2 ; abstract_declarator2 : '(' abstract_declarator ')' | '[' ']' | '[' constant_expr ']' | abstract_declarator2 '[' ']' | abstract_declarator2 '[' constant_expr ']' | '(' ')' | '(' parameter_type_list ')' | abstract_declarator2 '(' ')' | abstract_declarator2 '(' parameter_type_list ')' ; initializer : assignment_expr | '{' initializer_list '}' | '{' initializer_list ',' '}' ; initializer_list : initializer | initializer_list ',' initializer ; statement : labeled_statement | compound_statement | expression_statement | selection_statement | iteration_statement | jump_statement ; labeled_statement : identifier ':' statement | CASE constant_expr ':' statement | DEFAULT ':' statement ; compound_statement : '{' '}' | '{' statement_list '}' | '{' declaration_list '}' | '{' declaration_list statement_list '}' ; declaration_list : declaration | declaration_list declaration ; statement_list : statement | statement_list statement ; expression_statement : ';' | expr ';' ; selection_statement : IF '(' expr ')' statement | IF '(' expr ')' statement ELSE statement | SWITCH '(' expr ')' statement ; iteration_statement : WHILE '(' expr ')' statement | DO statement WHILE '(' expr ')' ';' | FOR '(' ';' ';' ')' statement | FOR '(' ';' ';' expr ')' statement | FOR '(' ';' expr ';' ')' statement | FOR '(' ';' expr ';' expr ')' statement | FOR '(' expr ';' ';' ')' statement | FOR '(' expr ';' ';' expr ')' statement | FOR '(' expr ';' expr ';' ')' statement | FOR '(' expr ';' expr ';' expr ')' statement ; jump_statement : GOTO identifier ';' | CONTINUE ';' | BREAK ';' | RETURN ';' | RETURN expr ';' ; file : external_definition | file external_definition ; external_definition : function_definition | declaration ; function_definition : declarator function_body | declaration_specifiers declarator function_body ; function_body : compound_statement | declaration_list compound_statement ; identifier : IDENTIFIER ; %% #include <stdio.h> #include "lex.yy.c" extern char yytext[]; extern int column; main() { int yyparse(); return(yyparse()); } yyerror(s) char *s; { fflush(stdout); printf("\n%*s\n%*s\n", column, "^", column, s); } ********************* p.s. We've been having some trouble with the address our mailer deamon is putting on our e-mail. The correct address is the one below, if the header says comething different, please ignore it. James Buchanan james@sparrms.UUCP Spar Aerospace Ltd 1700 Ormont Drive (416) 745-9680 Weston, Ontario, CANADA M9L 2W7
james@sparrmsuucp (James Buchanan) (05/19/89)
In article <18416@cup.portal.com>, src@cup.portal.com (Steve R Calwas) writes: > I am looking for a public domain or inexpensive YACC grammer definition for > C and/or C++. LEX definitions of same are also desired. Anyone having > information about sources for these would be much appreciated. E-mail or > post. (For example, is a YACC grammer for C a part of the ANSI committee's > output?) Thanks for any help. Here's the LEX file. Again this is written by some anonymous Guru rumoured to be involved with the ANSI C committee. ************ CUT HERE scan.l *********** D [0-9] L [a-zA-Z_] H [a-fA-F0-9] E [Ee][+-]?{D}+ FS (f|F|l|L) IS (u|U|l|L)* %{ #include <stdio.h> #include "y.tab.h" void count(); %} %% "/*" { comment(); } "auto" { count(); return(AUTO); } "break" { count(); return(BREAK); } "case" { count(); return(CASE); } "char" { count(); return(CHAR); } "const" { count(); return(CONST); } "continue" { count(); return(CONTINUE); } "default" { count(); return(DEFAULT); } "do" { count(); return(DO); } "double" { count(); return(DOUBLE); } "else" { count(); return(ELSE); } "enum" { count(); return(ENUM); } "extern" { count(); return(EXTERN); } "float" { count(); return(FLOAT); } "for" { count(); return(FOR); } "goto" { count(); return(GOTO); } "if" { count(); return(IF); } "int" { count(); return(INT); } "long" { count(); return(LONG); } "register" { count(); return(REGISTER); } "return" { count(); return(RETURN); } "short" { count(); return(SHORT); } "signed" { count(); return(SIGNED); } "sizeof" { count(); return(SIZEOF); } "static" { count(); return(STATIC); } "struct" { count(); return(STRUCT); } "switch" { count(); return(SWITCH); } "typedef" { count(); return(TYPEDEF); } "union" { count(); return(UNION); } "unsigned" { count(); return(UNSIGNED); } "void" { count(); return(VOID); } "volatile" { count(); return(VOLATILE); } "while" { count(); return(WHILE); } {L}({L}|{D})* { count(); return(check_type()); } 0[xX]{H}+{IS}? { count(); return(CONSTANT); } 0[xX]{H}+{IS}? { count(); return(CONSTANT); } 0{D}+{IS}? { count(); return(CONSTANT); } 0{D}+{IS}? { count(); return(CONSTANT); } {D}+{IS}? { count(); return(CONSTANT); } {D}+{IS}? { count(); return(CONSTANT); } '(\\.|[^\\'])+' { count(); return(CONSTANT); } {D}+{E}{FS}? { count(); return(CONSTANT); } {D}*"."{D}+({E})?{FS}? { count(); return(CONSTANT); } {D}+"."{D}*({E})?{FS}? { count(); return(CONSTANT); } \"(\\.|[^\\"])*\" { count(); return(STRING_LITERAL); } ">>=" { count(); return(RIGHT_ASSIGN); } "<<=" { count(); return(LEFT_ASSIGN); } "+=" { count(); return(ADD_ASSIGN); } "-=" { count(); return(SUB_ASSIGN); } "*=" { count(); return(MUL_ASSIGN); } "/=" { count(); return(DIV_ASSIGN); } "%=" { count(); return(MOD_ASSIGN); } "&=" { count(); return(AND_ASSIGN); } "^=" { count(); return(XOR_ASSIGN); } "|=" { count(); return(OR_ASSIGN); } ">>" { count(); return(RIGHT_OP); } "<<" { count(); return(LEFT_OP); } "++" { count(); return(INC_OP); } "--" { count(); return(DEC_OP); } "->" { count(); return(PTR_OP); } "&&" { count(); return(AND_OP); } "||" { count(); return(OR_OP); } "<=" { count(); return(LE_OP); } ">=" { count(); return(GE_OP); } "==" { count(); return(EQ_OP); } "!=" { count(); return(NE_OP); } ";" { count(); return(';'); } "{" { count(); return('{'); } "}" { count(); return('}'); } "," { count(); return(','); } ":" { count(); return(':'); } "=" { count(); return('='); } "(" { count(); return('('); } ")" { count(); return(')'); } "[" { count(); return('['); } "]" { count(); return(']'); } "." { count(); return('.'); } "&" { count(); return('&'); } "!" { count(); return('!'); } "~" { count(); return('~'); } "-" { count(); return('-'); } "+" { count(); return('+'); } "*" { count(); return('*'); } "/" { count(); return('/'); } "%" { count(); return('%'); } "<" { count(); return('<'); } ">" { count(); return('>'); } "^" { count(); return('^'); } "|" { count(); return('|'); } "?" { count(); return('?'); } [ \t\v\n\f] { count(); } . { /* ignore bad characters */ } %% yywrap() { return(1); } comment() { char c, c1; loop: while ((c = input()) != '*' && c != 0) putchar(c); if ((c1 = input()) != '/' && c != 0) { unput(c1); goto loop; } if (c != 0) putchar(c1); } int column = 0; void count() { int i; for (i = 0; yytext[i] != '\0'; i++) if (yytext[i] == '\n') column = 0; else if (yytext[i] == '\t') column += 8 - (column % 8); else column++; ECHO; } int check_type() { /* * pseudo code --- this is what it should check * * if (yytext == type_name) * return(TYPE_NAME); * * return(IDENTIFIER); */ /* * it actually will only return IDENTIFIER */ return(IDENTIFIER); } ***************** p.s. We've been having some trouble with the address our mailer deamon is putting on our e-mail. The correct address is the one below, if the header says comething different, please ignore it. James Buchanan james@sparrms.UUCP Spar Aerospace Ltd 1700 Ormont Drive (416) 745-9680 Weston, Ontario, CANADA M9L 2W7
arnold@mathcs.emory.edu (Arnold D. Robbins {EUCC}) (05/22/89)
In article <229@sparrmsuucp> james@sparrmsuucp (James Buchanan) writes: > [... ANSI C yacc and lex files ommitted. ] > > Again this is written by some anonymous Guru rumoured >to be involved with the ANSI C committee. Let's set the record straight. The yacc and lex files were done by Jeff Lee (jeff@gatech.edu), based on drafts which he had in front of him. I supplied him with the drafts since I am on the mailing list and shared an office with him at the time. Once they were done, I posted the files to the net; they are in the comp.sources.unix archives somewhere, a fair ways back. These were posted approximately 3.5 to 4 years ago, as I posted them when I was at Georgia Tech, and I left there a little over 3 years ago. Obscure quote: Kol ha'omer davar b'shem omro, maevi geulah la'olam. -- Pirkei Avot. -- Arnold Robbins -- Emory University Computing Center | Unix is a Registered DOMAIN: arnold@unix.cc.emory.edu | Bell of AT&T Trademark UUCP: gatech!emoryu1!arnold PHONE: +1 404 727-7636 | Laboratories. BITNET: arnold@emoryu1 FAX: +1 404 727-2599 | -- Donn Seeley
jima@hplsla.HP.COM (Jim Adcock) (05/25/89)
I used these as the starting point of my ObjC to C++ translator -- which I did a year or two ago. If I remember right, the issue of fields is not addressed, and if one is going to do anything more than go/nogo testing on a presumed example of C text, then one needs to address the traditional hack of passing typedef info back to lex so that type names can be distinguished from variable names. Otherwise, these seemed to work the way one would expect. I can provide the source to my translator program if you want [weak] examples of how to address these issues.