[comp.lang.misc] lex & yacc questions

mpledger@cti1.UUCP (Mark Pledger) (08/20/90)

Help!  I have three lex & yacc questions which I need help on.  They
are listed below.  (Please note that I have posted this article to
a few news groups for wider coverage.)


PROBLEM 1 ------------------------------------ :-(

When I compile then link using the command line options below, I
always get the "environ" referencing error.  The only way I've
found to get around this is to compile and link without the "-c"
compiler option.  Why is this happening and what is "environ"?  I
think it's the char *environ[] string which is passed to each program
from the shell.  Is this correct?

CTI-> cc -c main.c lex.yy.c

CTI-> ld main.o lex.yy.o -ll -lc 2>&1 > ubtest.err

undefined			first referenced
 symbol  			    in file
environ              /lib/libc.a
ld fatal: Symbol referencing errors. No output written to a.out
CTI->


PROBLEM 2 ------------------------------------ :-(

When I run yylex() from the sample code below, if no matching integer
is found it prints the yytext[] anyway.  Why?  I have it doing nothing
in the lex definition file (also shown below).  What's causing it to
print out the token found?


[lex source code]

delim                [ \t\n]
ws                   {delim}+
letter               [A-Za-z'_']
digit                [0-9]
number               {digit}+(\.{digit}+)?(E[+\-]?{digit}+)?
var                  {letter}({letter}|{digit})*

%%
{ws}                 { /* do nothing */ }
"APPLICATION"        { return(APP); }
"IF"                 { return(IF); }
"FORM"               { return(FORM); } 
{var}                { /* do nothing */ }
{number}             { /* do nothing */ }
"/*"                 { skipcomments(); }
%%
   
skipcomments()

   ...


[C source code]

while (1)
   {
   switch( yylex() )
      {
      case APP:      /* APPLICATION key word found */
         printf("\n%s",yytext);
         break;

      case FORM:     /* FORM key word found */
         printf("\n%s",yytext);
         break;

      case IF:       /* IF key word found */
         printf("\n%s", yytext);
         break;

      case UVAR:     /* user var word found */
         break;

      case  0:       /* end of file reached */
         printf("\nEOF reached.\n");
         exit(0);
         break;

      } /* switch */

   } /* while */

}  /* main() */


[screen dump from program's output]

CTI-> ld main.o lex.yy.o -ll -lc 2>&1 > ubtest.err
CTI-> ubtest < ubtest.txt                         
CTI-> 

APPLICATION(==);(==);
EOF reached.
CTI->...usr2/mpledger/ub> 



PROBLEM 3 ------------------------------------ :-(

I asked this before but didn't get a response.  I'm trying to write a
grammar for a context, case-insensitive grammar (Unify RDBMS command
language).  I know that you can specify lower or upper case letters
by defining letters in the lex rules section.  In my example
I am using letters  [A-Za-z'_'] to match upper or lower case characters
and possibly the underscore.  My question is this: how can you get
lex to match a reserved word you have declared, whether it's upper case
or not?  For example, Unify has the reserved command word "application".
I wish to scan for this word using lex and return to yyparse() whether
the reserved word "application" is found as "application", "APPLICATION",
"Application", "APPlication", etc.  What do you specify in lex to do
this?  I tried using "APPLICATION" in the rules section (e.g.
"APPLICATION" { return(APP); } where APP is a #define).  However, this
only works if the token found is already capitalized.




Any help with the above-mentioned problems would be greatly appreciated.
I have read the dragon book and recently bought O'Reilly's book Lex &
Yacc (which I don't think is very good!).

Thanks in advance.


-- 
Sincerely,


Mark Pledger

--------------------------------------------------------------------------
CTI                              |              (703) 685-5434 [voice]
2121 Crystal Drive               |              (703) 685-7022 [fax]
Suite 103                        |              
Arlington, DC  22202             |              mpledger@cti.com
--------------------------------------------------------------------------
             "To boldly go where no 'C' has gone before" 
--------------------------------------------------------------------------

diamond@tkou02.enet.dec.com (diamond@tkovoa) (08/21/90)

In article <267@cti1.UUCP> mpledger@cti1.UUCP (Mark Pledger) writes:

>CTI-> cc -c main.c lex.yy.c
>CTI-> ld main.o lex.yy.o -ll -lc 2>&1 > ubtest.err
Try:   ld main.o lex.yy.o -ll -lc /lib/crt0.o 2>&1 > ubtest.err
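
As for what "environ" is: on Unix systems it is conventionally a char **
that points at a NULL-terminated array of "NAME=value" strings, and it is
typically defined by the C startup code rather than by libc.a itself.  That
would explain the undefined symbol when ld is run by hand without crt0.o.
Letting cc drive the link (cc main.o lex.yy.o -ll) avoids the problem,
since cc supplies the startup file itself.  A minimal sketch of walking
such an array follows; the helper name count_env is made up for
illustration.

```c
#include <stddef.h>

/* On Unix-like systems the startup code normally defines this symbol. */
extern char **environ;

/* Hypothetical helper: count the entries in a NULL-terminated
   vector of "NAME=value" strings such as environ. */
size_t count_env(char **envp)
{
    size_t n = 0;

    if (envp == NULL)
        return 0;
    while (envp[n] != NULL)
        n++;
    return n;
}
```

In a normally linked program, count_env(environ) gives the number of
environment variables the shell passed in.
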

>When I run yylex() from the sample code below, if no matching integer
>is found it prints the yytext[] anyway.  Why?
I think lex is defined as printing out input text if nothing else is
done to it.  You need to make a rule to dispose of unwanted text.
(I'm not sure though.  Have you looked in a lex book?)

>I am using letters  [A-Za-z'_'] to match upper or lower case characters
>and possibly the underscore.
As written, that class also matches the apostrophe.  Try [A-Za-z_] with
no apostrophes.

>My question is this, how can you get lex to match a reserved word you
>have declared, whether it's upper case or not.
>"application", "APPLICATION", "Application", "APPlication", etc.
[Aa][Pp][Pp][Ll][Ii][Cc][Aa][Tt][Ii][Oo][Nn]
If input other than reserved words is also case insensitive, then it is
probably faster to fold all input text into one case before lexing.
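
The folding idea could be sketched in C roughly as below.  This is only an
illustration, assuming ASCII input held in a NUL-terminated buffer;
fold_upper is a hypothetical helper, and how it gets hooked into the
scanner's input depends on the lex implementation.

```c
#include <ctype.h>

/* Hypothetical helper: fold a NUL-terminated buffer to upper case in
   place, so that a rule such as "APPLICATION" matches any capitalization.
   Returns its argument for convenience. */
char *fold_upper(char *s)
{
    char *p;

    for (p = s; *p != '\0'; p++)
        *p = (char)toupper((unsigned char)*p);
    return s;
}
```

Note that folding everything also changes the case of identifiers and any
quoted text, so it only suits input that is case-insensitive throughout.
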
-- 
Norman Diamond, Nihon DEC     diamond@tkou02.enet.dec.com
This is me speaking.  If you want to hear the company speak, you need DECtalk.