[comp.lang.c] yacc-a-dy YACC yacc, Help me HACK!

mpledger@cti1.UUCP (Mark Pledger) (10/03/90)

     Could somebody please give me a hand with a lex & yacc problem.  I am 
trying to build a small subset of the SQL language.  For purposes of this 
question my yacc grammar is defined below. 

%start cmds

%%

cmds        : select ';'
            { 
            fprintf(stdout,"SQL syntax correct.\n");
            }
            ;

select      : SELECT sel_list 
            {
            }
            ;

sel_list    : sel_type
            {
            }
            | sel_list ',' sel_type
            {
            }
            ; 

sel_type    : attr_name  
            {
            }
            ;

attr_name   : IDENTIFIER
            {
            }
            ;

%%

... and so on


     My lex specifications are listed below.

delimiters           [ \t]
ws                   {delimiters}+
newline              [\n]
letter               [A-Za-z_]
int                  [0-9]+
string               {letter}+
squote               \'[^\'\n]*\'
dquote               \"[^\"\n]*\"
quotes               (squote|dquote)

%% /* RULES */
{quotes}             { 
                     yylval.string = yytext;
                     return(QUOTES);         
                     }

";"                  { 
                     yylval.string = yytext;
                     return(EOC);         
                     }

","                  { 
                     yylval.string = yytext;
                     return(',');         
                     }

"q"                  {        /* interactive quit command */
                     yylval.string = yytext;
                     return( 0 );  
                     }

{newline}            { 
                     lineno++;
                     }

{int}                { 
                     yylval.integer = atoi(yytext);
                     return(INT);         
                     }

{ws}                 ;        /* white space */

{string}             {        /* normal token */
                     yylval.string = yytext;
                     return(symlookup()); 
                     }    

"/*"                 { skipcomment();
                     }

.                    {        /* for testing only */
                     ECHO; exit(0);
                     }

%% 

... and so on



     This lex & yacc grammar works only part of the time.  I have specified 
in the yacc grammar to allow multiple attr_name's after the key word SELECT.  
Using the examples below, I seem to be getting an error when I 
don't think I should.  I have run the lex code as a separate program and it 
returns all valid tokens -- regardless of white space.  However when using 
the yacc grammar, white space becomes significant. 


      select sno;             <--- this works ok 


      select sno,pno;         <--- yyparse() returns a 1 if yacc grammar is
                           
                        (1)   sel_list : sel_type ',' sel_list
                                       | sel_type
                                       ;
                              sel_type : attr_name;
                              attr_name: IDENTIFIER;
                           

                              <---- but works ok if yacc grammar is

                        (2)   sel_list : sel_list ',' sel_type
                                       | sel_type
                                       ;
                              sel_type : attr_name;
                              attr_name: IDENTIFIER;


      select sno, pno;        <--- this works ok if yacc grammar is
                                   specified as (1) above.


     I have read that using left recursion keeps the stack space smaller for 
yacc, so I am attempting to use it.  But specifications (1) and (2) seem to 
force a different syntax requirement (i.e., a space after the comma).  Why 
is this happening?  Now when yacc does work for 2 attribute names, it does 
not work for 3, 4, 5, 6, ... attributes.  Why?  I figured that if yacc 
worked for 2 attribute parameters it would surely work for 5, 6, or 10.  But 
I found out it doesn't. 

     I have pored over three books (O'Reilly's Lex & Yacc, the dragon book, 
and An Introduction to Compiler Construction under Unix) and cannot find 
anything similar.  Am I missing something?  By the way I am using GNU's flex
and bison programs, but will be uploading them to a 3b2 running SYS5 v.3.2. 
as soon as I get these darn things debugged. 

     Any help would be appreciated.  Thanks in advance.




-- 
Sincerely,


Mark Pledger

--------------------------------------------------------------------------
CTI                              |              (703) 685-5434 [voice]
2121 Crystal Drive               |              (703) 685-7022 [fax]
Suite 103                        |              
Arlington, DC  22202             |              mpledger@cti.com
--------------------------------------------------------------------------

ronald@atcmp.nl (Ronald Pikkert) (10/04/90)

From article <289@cti1.UUCP>, by mpledger@cti1.UUCP (Mark Pledger):
<> 
<>      Could somebody please give me a hand with a lex & yacc problem.  I am 
<> trying to build a small subset of the SQL language.  For purposes of this 
<> question my yacc grammar is defined below. 
<> 
<> %% /* RULES */
<> {quotes}             { 
<>                      yylval.string = yytext;
                        yylval.string = strdup(yytext);
                                        ^^^^^^
<>                      return(QUOTES);         
<>                      }
<> 

Always use dynamically allocated memory to hold matched patterns in the
input.
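
To illustrate why, here is a minimal sketch in plain C.  It is not flex's
real internals: token_buf, scan_token, save_by_pointer and save_by_copy are
made-up names standing in for yytext and the scanner, and the only assumption
is that the scanner reuses one buffer for every token it matches.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for yytext: one buffer, reused for every token. */
static char token_buf[64];

/* Pretend to scan a token: overwrite the shared buffer with the new text. */
static void scan_token(const char *text)
{
    assert(text != NULL);
    strncpy(token_buf, text, sizeof token_buf - 1);
    token_buf[sizeof token_buf - 1] = '\0';
}

/* What "yylval.string = yytext" does: keep a pointer into the shared buffer. */
static const char *save_by_pointer(void)
{
    return token_buf;
}

/* The suggested fix: keep a private copy (same effect as strdup).
   The caller must free() the result; NULL means malloc failed. */
static char *save_by_copy(void)
{
    char *copy = malloc(strlen(token_buf) + 1);
    if (copy != NULL)
        strcpy(copy, token_buf);
    return copy;
}
```

If you call scan_token("sno"), save the token both ways, and then call
scan_token("pno"), the pointer from save_by_pointer() now reads "pno" while
the copy from save_by_copy() still reads "sno".  With a parser that reads one
token ahead before running an action, that is the situation the saved value
can end up in.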


-
Ronald Pikkert                 E-mail: ronald@atcmp.nl
@ AT Computing b.v.            Tel:    080 - 566880
Toernooiveld
6525 ED  Nijmegen

bryang@chinet.chi.il.us (Bryan Glennon) (10/11/90)

From article <289@cti1.UUCP>, by mpledger@cti1.UUCP (Mark Pledger):

> 
>      Could somebody please give me a hand with a lex & yacc problem.  I am 
> trying to build a small subset of the SQL language.  For purposes of this 
> question my yacc grammar is defined below. 
> 

There are (I think) two problems here.  The first is in the lexical analyzer,
which returns the token EOC when it sees a semicolon.  Unless EOC is defined
as ';', this will be a problem, since the cmds rule expects to see select ';':

>";"                  { 
>                     yylval.string = yytext;
>                     return(EOC);         
>                     }

>cmds        : select ';'

I changed the analyzer to return ';' when it sees a semicolon.
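
A rough sketch of why the two differ, in plain C.  The value 258 for EOC is
only an assumption for illustration (the real number comes from the generated
y.tab.h), and the three function names are made up; the point is just that a
quoted character in a rule stands for that character's own code, which a
named token does not share.

```c
#include <assert.h>

/* Assumption: named tokens are numbered above the character range.
   258 is illustrative only; the real value comes from y.tab.h. */
enum { EOC = 258 };

/* Hypothetical lexer action for ";" as posted: returns the named token. */
static int lex_semicolon_as_posted(void)
{
    return EOC;
}

/* Changed action: returns the character itself. */
static int lex_semicolon_changed(void)
{
    return ';';
}

/* A rule written with ';' expects the character code, compared as a plain int. */
static int matches_literal_semicolon(int token)
{
    return token == ';';
}
```

Here matches_literal_semicolon(lex_semicolon_as_posted()) is 0 and
matches_literal_semicolon(lex_semicolon_changed()) is 1.  The other way to
line things up would be to keep returning EOC and write the rule as
"select EOC" instead.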

The second problem lies in the definition of cmds.  It does not recurse, so
it will not know what to do with statements other than the first.  Anyway,
here is the original rule, and the changes I made:

>cmds        : select ';'
>            { 
>            fprintf(stdout,"SQL syntax correct.\n");
>            }
>            ;

cmds        :
            | cmds select ';'
            { 
            fprintf(stdout,"SQL syntax correct.\n");
            }
            ;
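
To show what the recursive rule buys you, here is a hand-written recognizer
in plain C.  This is only a sketch of the language the two versions of cmds
accept, not what yacc generates; the token codes and the function names
(match_select, match_cmds, match_single_cmd) are hypothetical, and the input
is assumed to be an END-terminated array of token codes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token codes; a real parser takes these from y.tab.h. */
enum { END = 0, SELECT = 258, IDENTIFIER = 259 };

/* select   : SELECT sel_list
   sel_list : sel_type | sel_list ',' sel_type   (sel_type is an IDENTIFIER)
   Returns the number of tokens consumed, or 0 if no select starts here. */
static size_t match_select(const int *tok)
{
    size_t i = 0;
    if (tok[i++] != SELECT) return 0;
    if (tok[i++] != IDENTIFIER) return 0;
    while (tok[i] == ',') {
        i++;
        if (tok[i++] != IDENTIFIER) return 0;
    }
    return i;
}

/* cmds : (empty) | cmds select ';'   -- zero or more statements.
   Returns 1 if the whole token array is accepted, else 0. */
static int match_cmds(const int *tok)
{
    while (*tok != END) {
        size_t n = match_select(tok);
        if (n == 0 || tok[n] != ';') return 0;
        tok += n + 1;                 /* skip the statement and its ';' */
    }
    return 1;
}

/* The original rule, cmds : select ';'   -- exactly one statement. */
static int match_single_cmd(const int *tok)
{
    size_t n = match_select(tok);
    return n != 0 && tok[n] == ';' && tok[n + 1] == END;
}
```

Given the tokens for "select sno,pno; select sno;", match_cmds() accepts the
input while match_single_cmd() rejects it at the second SELECT.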

My version seems to work, although I had to guess as to what the symbol lookup
routine does.  If you want to see my complete test source, drop me a line and
I will e-mail it to you.

BTW, you don't have to allocate dynamic memory to hold the matched patterns,
since yytext is a global.  What you do have to be careful of (and I think this
is what the response was referring to) is that you don't return the address of
local storage that is created within the analyzer.  

Hope this helps...
                              Bryan

...chinet!bryang                            "Hey, Rock!  Watch me pull a
...chinet!bpgc!bryan                         rabbit outta my hat!" 
                                                     -Bullwinkle