mpledger@cti1.UUCP (Mark Pledger) (10/03/90)
Could somebody please give me a hand with a lex & yacc problem. I am
trying to build a small subset of the SQL language. For purposes of this
question my yacc grammar is defined below.
%start cmds
%%
cmds      : select ';'
                {
                    fprintf(stdout,"SQL syntax correct.\n");
                }
          ;
select    : SELECT sel_list
                {
                }
          ;
sel_list  : sel_type
                {
                }
          | sel_list ',' sel_type
                {
                }
          ;
sel_type  : attr_name
                {
                }
          ;
attr_name : IDENTIFIER
                {
                }
          ;
%%
... and so on
My lex specifications are listed below.
delimiters [ \t]
ws {delimiters}+
newline [\n]
letter [A-Za-z_]
int [0-9]+
string {letter}+
squote \'[^\'\n]*\'
dquote \"[^\"\n]*\"
quotes (squote|dquote)
%% /* RULES */
{quotes}    {
                yylval.string = yytext;
                return(QUOTES);
            }
";"         {
                yylval.string = yytext;
                return(EOC);
            }
","         {
                yylval.string = yytext;
                return(',');
            }
"q"         {   /* interactive quit command */
                yylval.string = yytext;
                return( 0 );
            }
{newline}   {
                lineno++;
            }
{int}       {
                yylval.integer = atoi(yytext);
                return(INT);
            }
{ws}        ;   /* white space */
{string}    {   /* normal token */
                yylval.string = yytext;
                return(symlookup());
            }
"/*"        {   skipcomment();
            }
.           {   /* for testing only */
                ECHO; exit(0);
            }
%%
... and so on
This lex & yacc grammar works only part of the time. I have specified
in the yacc grammar to allow multiple attr_name's after the keyword SELECT.
Using the examples below, I seem to be getting an error when I
don't think I should. I have run the lex code as a separate program and it
returns all valid tokens -- regardless of white space. However, when using
the yacc grammar, white space becomes significant.
select sno;        <--- this works ok
select sno,pno;    <--- yyparse() returns a 1 if yacc grammar is
    (1) sel_list : sel_type ',' sel_list
                 | sel_type
                 ;
        sel_type : attr_name;
        attr_name: IDENTIFIER;
                   <---- but works ok if yacc grammar is
    (2) sel_list : sel_list ',' sel_type
                 | sel_type
                 ;
        sel_type : attr_name;
        attr_name: IDENTIFIER;
select sno, pno;   <--- this works ok if yacc grammar is
                        specified as (1) above.
I have read that using left recursion keeps the stack space smaller for
yacc, so I am attempting to use it. But specifications (1) and (2) seem to
force a different syntax requirement (i.e., a space after the comma). Why
is this happening? Now when yacc does work for 2 attribute names, it does
not work for 3, 4, 5, 6, ... attributes. Why? I figured that if yacc
worked for 2 attribute parameters it would surely work for 5, 6, or 10. But
I found out it doesn't.
I have pored over three books (O'Reilly's Lex & Yacc, the dragon book,
and An Introduction to Compiler Construction under Unix) and cannot find
anything similar. Am I missing something? By the way I am using GNU's flex
and bison programs, but will be uploading them to a 3b2 running SYS5 v.3.2
as soon as I get these darn things debugged.
Any help would be appreciated. Thanks in advance.
--
Sincerely,
Mark Pledger
--------------------------------------------------------------------------
CTI | (703) 685-5434 [voice]
2121 Crystal Drive | (703) 685-7022 [fax]
Suite 103 |
Arlington, DC 22202 | mpledger@cti.com
--------------------------------------------------------------------------
ronald@atcmp.nl (Ronald Pikkert) (10/04/90)
From article <289@cti1.UUCP>, by mpledger@cti1.UUCP (Mark Pledger):
<>
<> Could somebody please give me a hand with a lex & yacc problem. I am
<> trying to build a small subset of the SQL language. For purposes of this
<> question my yacc grammar is defined below.
<>
<> %% /* RULES */
<> {quotes} {
<> yylval.string = yytext;
   yylval.string = strdup(yytext);
                   ^^^^^^
<> return(QUOTES);
<> }
<>
Always use dynamically allocated memory to hold matched patterns in the
input.
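The reasoning behind this advice can be sketched in plain C, outside of
lex. The scanner reuses the storage behind yytext for every token, so a
pointer saved in yylval refers to whatever token was matched last. In the
sketch below, token_buf, next_token() and copy_text() are hypothetical
names made up for the illustration: token_buf stands in for yytext, and
copy_text() does what strdup() does on systems that provide it.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the scanner's yytext storage: one static
 * array that every call to next_token() overwrites. */
static char token_buf[64];

/* Copy the next comma- or space-separated word from *input into
 * token_buf and advance *input past it.
 * Returns token_buf, or NULL at end of input. */
static char *next_token(const char **input)
{
    size_t n = 0;

    while (**input == ',' || **input == ' ')
        (*input)++;
    if (**input == '\0')
        return NULL;
    while (**input != '\0' && **input != ',' && **input != ' '
           && n < sizeof token_buf - 1)
        token_buf[n++] = *(*input)++;
    token_buf[n] = '\0';
    return token_buf;
}

/* Private copy of a token's text, equivalent to strdup().
 * The caller owns the result and must free() it. */
static char *copy_text(const char *s)
{
    size_t len = strlen(s) + 1;
    char *p = malloc(len);

    if (p != NULL)
        memcpy(p, s, len);
    return p;
}
```

Scanning "sno,pno" with this sketch, a pointer saved from the first call
to next_token() reads "pno" once the second token has been scanned, while
a copy_text() copy taken after the first call still reads "sno". A yacc
parser usually reads one token of lookahead before it runs an action, so
a saved yytext pointer can already be stale by the time the action uses it.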
-
Ronald Pikkert E-mail: ronald@atcmp.nl
@ AT Computing b.v. Tel: 080 - 566880
Toernooiveld
6525 ED Nijmegen
bryang@chinet.chi.il.us (Bryan Glennon) (10/11/90)
From article <289@cti1.UUCP>, by mpledger@cti1.UUCP (Mark Pledger):
>
> Could somebody please give me a hand with a lex & yacc problem. I am
> trying to build a small subset of the SQL language. For purposes of this
> question my yacc grammar is defined below.
>

There are (I think) two problems here. The first is in the lexical
analyzer, which returns the token EOC when it sees a semicolon. Unless EOC
is defined as ';', this will be a problem, since the cmds rule expects to
see select ';':

>";" {
>        yylval.string = yytext;
>        return(EOC);
>    }

>cmds : select ';'

I changed the analyzer to return ';' when it sees a semicolon.

The second problem lies in the definition of cmds. It does not recurse, so
it will not know what to do with statements other than the first. Anyway,
here is the original rule, and the changes I made:

>cmds : select ';'
>        {
>            fprintf(stdout,"SQL syntax correct.\n");
>        }
>     ;

cmds :
     | cmds select ';'
        {
            fprintf(stdout,"SQL syntax correct.\n");
        }
     ;

My version seems to work, although I had to guess as to what the symbol
lookup routine does. If you want to see my complete test source, drop me a
line and I will e-mail it to you.

BTW, you don't have to allocate dynamic memory to hold the matched
patterns, since yytext is a global. What you do have to be careful of (and
I think this is what the response was referring to) is that you don't
return the address of local storage that is created within the analyzer.

Hope this helps...

Bryan
...chinet!bryang        "Hey, Rock! Watch me pull a
...chinet!bpgc!bryan     rabbit outta my hat!" -Bullwinkle
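
For readers without flex and bison at hand, the two changes described in
this reply can be sketched as a small hand-written recognizer in C. This
is not the yacc-generated parser and not the complete test source
mentioned above; it only mirrors the shape of the revised rules. The
names next_tok(), parse_cmds() and the TOK_* codes are hypothetical, and
the keyword check (lowercase "select" only) is an assumption, since the
thread never shows symlookup(). The scanner returns ';' and ',' as their
own character codes, which is what a literal such as ';' in a yacc rule
expects, and the outer loop plays the role of the recursive cmds rule.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical token codes. Named tokens sit above the character range,
 * as yacc arranges; ';' and ',' are returned as their own codes. */
enum { TOK_EOF = 0, TOK_SELECT = 258, TOK_IDENTIFIER = 259, TOK_BAD = 260 };

static const char *src;   /* remaining input */

/* Return the next token, skipping blanks, tabs and newlines. */
static int next_tok(void)
{
    char word[32];
    size_t n = 0;

    while (isspace((unsigned char)*src))
        src++;
    if (*src == '\0')
        return TOK_EOF;
    if (*src == ';' || *src == ',')
        return *src++;                 /* literal token: its character code */
    if (!isalpha((unsigned char)*src) && *src != '_')
        return TOK_BAD;
    while (isalpha((unsigned char)*src) || *src == '_') {
        if (n < sizeof word - 1)       /* keep at most 31 characters */
            word[n++] = *src;
        src++;
    }
    word[n] = '\0';
    return strcmp(word, "select") == 0 ? TOK_SELECT : TOK_IDENTIFIER;
}

/* Return the number of statements accepted, or -1 on a syntax error. */
static int parse_cmds(const char *text)
{
    int count = 0;
    int tok;

    src = text;
    while ((tok = next_tok()) != TOK_EOF) {   /* cmds : | cmds select ';' */
        if (tok != TOK_SELECT)
            return -1;
        do {                                  /* sel_list : sel_list ',' sel_type */
            if (next_tok() != TOK_IDENTIFIER)
                return -1;
            tok = next_tok();
        } while (tok == ',');
        if (tok != ';')                       /* would never match a separate EOC code */
            return -1;
        count++;
    }
    return count;
}
```

With this sketch, parse_cmds("select sno,pno;") and
parse_cmds("select sno, pno;") both return 1, because white space is
consumed by the scanner before the parser sees anything, and a string
holding two statements returns 2.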