mpledger@cti1.UUCP (Mark Pledger) (10/03/90)
Could somebody please give me a hand with a lex & yacc problem?  I am
trying to build a small subset of the SQL language.  For purposes of this
question my yacc grammar is defined below.

    %start cmds
    %%
    cmds      : select ';'
                    {
                        fprintf(stdout,"SQL syntax correct.\n");
                    }
              ;
    select    : SELECT sel_list { }
              ;
    sel_list  : sel_type { }
              | sel_list ',' sel_type { }
              ;
    sel_type  : attr_name { }
              ;
    attr_name : IDENTIFIER { }
              ;
    %%
    ... and so on

My lex specifications are listed below.

    delimiters  [ \t]
    ws          {delimiters}+
    newline     [\n]
    letter      [A-Za-z_]
    int         [0-9]+
    string      {letter}+
    squote      \'[^\'\n]*\'
    dquote      \"[^\"\n]*\"
    quotes      (squote|dquote)
    %%
                /* RULES */
    {quotes}    {
                    yylval.string = yytext;
                    return(QUOTES);
                }
    ";"         {
                    yylval.string = yytext;
                    return(EOC);
                }
    ","         {
                    yylval.string = yytext;
                    return(',');
                }
    "q"         {   /* interactive quit command */
                    yylval.string = yytext;
                    return( 0 );
                }
    {newline}   {
                    lineno++;
                }
    {int}       {
                    yylval.integer = atoi(yytext);
                    return(INT);
                }
    {ws}        ;   /* white space */
    {string}    {   /* normal token */
                    yylval.string = yytext;
                    return(symlookup());
                }
    "/*"        {
                    skipcomment();
                }
    .           {   /* for testing only */
                    ECHO;
                    exit(0);
                }
    %%
    ... and so on

This lex & yacc grammar works only part of the time.  I have specified in
the yacc grammar to allow multiple attr_name's after the keyword SELECT.
In the examples below, I get an error when I don't think I should.  I have
run the lex code as a separate program and it returns all valid tokens --
regardless of white space.  However, when using the yacc grammar, white
space becomes significant.

    select sno;         <--- this works ok

    select sno,pno;     <--- yyparse() returns a 1 if yacc grammar is

            (1)     sel_list : sel_type ',' sel_list
                             | sel_type
                             ;
                    sel_type : attr_name;
                    attr_name: IDENTIFIER;

                        <--- but works ok if yacc grammar is

            (2)     sel_list : sel_list ',' sel_type
                             | sel_type
                             ;
                    sel_type : attr_name;
                    attr_name: IDENTIFIER;

    select sno, pno;    <--- this works ok if yacc grammar is
                             specified as (1) above.
I have read that using left recursion keeps the stack space smaller for
yacc, so I am attempting to use it.  But specifications (1) and (2) seem to
force a different syntax requirement (i.e., a space after the comma).  Why
is this happening?

Also, when yacc does work for 2 attribute names, it does not work for 3, 4,
5, 6, ... attributes.  Why?  I figured that if yacc worked for 2 attribute
names it would surely work for 5, 6, or 10.  But I found out it doesn't.

I have pored over three books (O'Reilly's Lex & Yacc, the dragon book, and
An Introduction to Compiler Construction under Unix) and cannot find
anything similar.  Am I missing something?

By the way, I am using GNU's flex and bison programs, but will be uploading
them to a 3b2 running SYS5 v.3.2 as soon as I get these darn things
debugged.

Any help would be appreciated.  Thanks in advance.

--
Sincerely,

Mark Pledger

--------------------------------------------------------------------------
CTI                              |              (703) 685-5434 [voice]
2121 Crystal Drive               |              (703) 685-7022 [fax]
Suite 103                        |
Arlington, DC  22202             |              mpledger@cti.com
--------------------------------------------------------------------------
ronald@atcmp.nl (Ronald Pikkert) (10/04/90)
From article <289@cti1.UUCP>, by mpledger@cti1.UUCP (Mark Pledger):
<>
<> Could somebody please give me a hand with a lex & yacc problem.  I am
<> trying to build a small subset of the SQL language.  For purposes of this
<> question my yacc grammar is defined below.
<>
<> %%                      /* RULES */
<> {quotes}        {
<>                      yylval.string = yytext;
                        yylval.string = strdup(yytext);
                                        ^^^^^^
<>                      return(QUOTES);
<>                 }
<>

Always use dynamically allocated memory to hold matched patterns in
the input.

-
Ronald Pikkert                 E-mail: ronald@atcmp.nl
@ AT Computing b.v.            Tel:    080 - 566880
Toernooiveld
6525 ED  Nijmegen
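The reason for the strdup() advice can be sketched in plain C.  In flex,
yytext points into the scanner's own buffer and is only guaranteed to hold
the current token until the next call to yylex(); a yacc parser may already
have fetched a lookahead token by the time an action uses the saved
pointer.  The sketch below is only an illustration under assumptions:
token_buf stands in for yytext, and scan_into_buffer() and copy_lexeme()
are hypothetical helpers, not part of lex or yacc.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in for the scanner's yytext: one buffer reused for every token
 * (an assumption made for illustration only). */
static char token_buf[64];

/* Hypothetical scanner step: overwrite the shared buffer with the next
 * lexeme and return a pointer to it, the way yytext is handed out. */
static char *scan_into_buffer(const char *lexeme)
{
    strncpy(token_buf, lexeme, sizeof token_buf - 1);
    token_buf[sizeof token_buf - 1] = '\0';
    return token_buf;
}

/* Portable equivalent of strdup(): copy the lexeme into fresh heap
 * storage so it survives later scans.  The caller frees the result. */
static char *copy_lexeme(const char *text)
{
    size_t len = strlen(text) + 1;
    char *copy = malloc(len);

    if (copy != NULL)
        memcpy(copy, text, len);
    return copy;
}
```

If a pointer returned by scan_into_buffer("sno") is saved and the next
token "," is then scanned, the saved pointer reads "," while a copy made
with copy_lexeme() still reads "sno".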
bryang@chinet.chi.il.us (Bryan Glennon) (10/11/90)
From article <289@cti1.UUCP>, by mpledger@cti1.UUCP (Mark Pledger):
>
> Could somebody please give me a hand with a lex & yacc problem.  I am
> trying to build a small subset of the SQL language.  For purposes of this
> question my yacc grammar is defined below.
>

There are (I think) two problems here.  The first is in the lexical
analyzer, which returns the token EOC when it sees a semicolon.  Unless EOC
is defined as ';', this will be a problem, since the cmds rule expects to
see select ';':

>";"         {
>                yylval.string = yytext;
>                return(EOC);
>            }

>cmds      : select ';'

I changed the analyzer to return ';' when it sees a semicolon.

The second problem lies in the definition of cmds.  It does not recurse, so
it will not know what to do with statements other than the first.  Anyway,
here is the original rule, and the changes I made:

>cmds      : select ';'
>                {
>                    fprintf(stdout,"SQL syntax correct.\n");
>                }
>          ;

cmds      :
          | cmds select ';'
                {
                    fprintf(stdout,"SQL syntax correct.\n");
                }
          ;

My version seems to work, although I had to guess as to what the symbol
lookup routine does.  If you want to see my complete test source, drop me a
line and I will e-mail it to you.

BTW, you don't have to allocate dynamic memory to hold the matched
patterns, since yytext is a global.  What you do have to be careful of (and
I think this is what the response was referring to) is that you don't
return the address of local storage that is created within the analyzer.

Hope this helps...

Bryan

...chinet!bryang            "Hey, Rock!  Watch me pull a
...chinet!bpgc!bryan         rabbit outta my hat!"  -Bullwinkle
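As a rough cross-check of the two points above, the revised grammar can be
mimicked by a small hand-written recognizer in C.  This is only a sketch,
not what yacc generates: the token codes are hypothetical stand-ins for the
values yacc would assign, and parse_sel_list() and parse_cmds() are
illustrative names.  It accepts any number of statements, each with any
number of comma-separated identifiers, and rejects input in which the
scanner has returned EOC where the grammar expects ';'.

```c
#include <stddef.h>

/* Hypothetical token codes; a real build takes these from the header
 * that yacc generates.  ',' and ';' are passed as character values. */
enum { END = 0, SELECT = 258, IDENTIFIER = 259, EOC = 260 };

/* sel_list : sel_type | sel_list ',' sel_type
 * Consumes one or more identifiers separated by commas.
 * Returns the position after the list, or NULL on a syntax error. */
static const int *parse_sel_list(const int *tok)
{
    if (*tok != IDENTIFIER)
        return NULL;
    tok++;
    while (*tok == ',') {
        tok++;
        if (*tok != IDENTIFIER)
            return NULL;
        tok++;
    }
    return tok;
}

/* cmds : (empty) | cmds select ';'
 * Returns the number of statements accepted, or -1 on a syntax error. */
static int parse_cmds(const int *tok)
{
    int count = 0;

    while (*tok != END) {
        if (*tok != SELECT)
            return -1;
        tok = parse_sel_list(tok + 1);
        if (tok == NULL || *tok != ';')   /* EOC here is a syntax error */
            return -1;
        tok++;
        count++;
    }
    return count;
}
```

Since the recognizer sees only token codes, white space cannot affect the
result once the scanner delivers the right tokens; a token stream for
"select sno,pno,qno; select sno;" yields a count of 2.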