[comp.unix.questions] Using Lex

john@basho.uucp (John Lacey) (08/10/90)

Normally, of course, one wants a scanner (and a parser) to work from 
a file, perhaps stdin.  Sigh.  Well, I want one that works from a string.

I am using Flex 2.3, and Bison 1.11.  I tried the following few #define's:

#undef  YY_INPUT
#define YY_INPUT(buf,result,max_size) \
{ \
for ( result = 0;  *ch_this && result < max_size; result ++ ) \
   buf[result] = *ch_this++; \
}

#define YY_USER_INIT \
   if ( scan_init ) { \
      if ( yy_flex_debug ) \
         printf ( "-- initializing for scan %d\n", scan_init ); \
      ch_this = inbuffer; \
      scan_init = 0; }

with the following couple of definitions and declarations in the scanner:

static char * ch_this;
extern char * inbuffer;
extern int scan_init;

and with inbuffer and scan_init defined in the code that calls yyparse().
This didn't work.  Well, actually, it works the first time yyparse() is 
called, but not again.  Now, YY_USER_INIT is used inside an if statement
that checks yy_init, so I moved it out of there in the scanner skeleton
so that YY_USER_INIT is seen every time the scanner is called.  Still 
no go.
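
The refill logic in the YY_INPUT macro above can also be written as an ordinary function, which is easier to test outside the scanner. This is only a minimal sketch: the names scan_string_begin and scan_string_read are hypothetical and are not part of flex. Inside a scanner, YY_INPUT could then be defined as "result = scan_string_read(buf, max_size)".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper names; none of these come from flex itself. */
static const char *scan_pos = "";   /* next unread character */

/* Point the reader at a new NUL-terminated string before each parse. */
void scan_string_begin(const char *s)
{
    assert(s != NULL);
    scan_pos = s;
}

/* Copy up to max_size characters into buf and return the count.
 * A return of 0 means end of input, which is what YY_INPUT reports
 * by setting result to 0. */
int scan_string_read(char *buf, int max_size)
{
    int n = 0;

    while (*scan_pos != '\0' && n < max_size)
        buf[n++] = *scan_pos++;
    return n;
}
```

Resetting the pointer may not be enough on its own for a second parse. The flex manual says that once the scanner has reached end of input, it must be restarted (it describes yyrestart() for this) before it will read again, and later flex releases provide yy_scan_string() for scanning a string directly. Whether either applies to Flex 2.3 should be checked against its own documentation.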

Has anyone done this, or see a way to do it, or know a way to do it, or ....

Thanks.

-- 
John Lacey, 
   E-mail:  ...!osu-cis!n8emr!uncle!basho!john  (coming soon: john@basho.uucp)
   V-mail:  (614) 436--3773, or 487--8570
"What was the name of the dog on Rin-tin-tin?"  --Mickey Rivers, ex-Yankee CF

ptb@ittc.wec.com (Pat Broderick) (08/10/90)

In article <1990Aug10.012927.5558@basho.uucp>, john@basho.uucp (John Lacey) writes:
> Normally, of course, one wants a scanner (and a parser) to work from 
> a file, perhaps stdin.  Sigh.  Well, I want one that works from a string.
> ...

Recently I had occasion to do something similar.  What we did was
roughly as follows:

   - strings to be parsed are maintained in memory 
   - to parse a string a global pointer known to lex is set to point at
     the beginning of the string
   - the input() macro was redefined in terms of this pointer (standard
     uses getc(yyin))

The things needed might look something like:

LEX:

# define input() (((yytchar=yysptr>yysbuf?U(*--yysptr):getc(yyin))==10?(yylineno++,yytchar):yytchar)==EOF?0:yytchar)   /* standard defn from lex */

# define input() (((yytchar=yysptr>yysbuf?U(*--yysptr):(*yynyy++))==10?(yylineno++,yytchar):yytchar)==EOF?0:yytchar)   /* modified defn to use string: getc(yyin) replaced by (*yynyy++) */

extern char *yynyy;			/* will pt to start of string */


Function invoking parser:

char *yynyy;			/* globally visible */

....

    yynyy = start_of_string;
    yyparse();
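
The same idea (pushed-back characters first, then the string) can be sketched as plain C functions. The names str_begin, str_input, str_unput and PUSHBACK_MAX are hypothetical; str_next plays the role of yynyy and the pushback array plays the role of lex's yysbuf. One deliberate difference from the macro above: the macro advances yynyy even when it is sitting on the terminating NUL, while this sketch stops there.

```c
#include <assert.h>

#define PUSHBACK_MAX 16             /* hypothetical limit for this sketch */

static const char *str_next = "";   /* plays the role of yynyy */
static char pushback[PUSHBACK_MAX]; /* plays the role of yysbuf */
static int pushed = 0;              /* number of pushed-back characters */

/* Start reading from a new NUL-terminated string. */
void str_begin(const char *s)
{
    assert(s != 0);
    str_next = s;
    pushed = 0;
}

/* Return the next character, or 0 at end of string. */
int str_input(void)
{
    if (pushed > 0)
        return (unsigned char)pushback[--pushed];
    if (*str_next == '\0')
        return 0;                   /* do not step past the terminator */
    return (unsigned char)*str_next++;
}

/* Push one character back; the next str_input() returns it. */
void str_unput(int c)
{
    assert(pushed < PUSHBACK_MAX);
    pushback[pushed++] = (char)c;
}
```

Calling str_begin(start_of_string) before yyparse() would correspond to the assignment to yynyy shown above.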


This works fine for us, hope it helps.
-- 
Patrick T. Broderick           |ptb@ittc.wec.com |
                               |uunet!ittc!ptb   |
                               |(412)733-6265    |

bogatko@lzga.ATT.COM (George Bogatko) (08/10/90)

In article <1990Aug10.012927.5558@basho.uucp>, john@basho.uucp (John Lacey) writes:
> Has anyone done this, or see a way to do it, or know a way to do it, or ....

Put these lines in your lex file after the #include lines

%{
#include <stdio.h>
#include "y.tab.h"

extern char *mis_ptr;

#undef input
#undef unput
# define input() (*mis_ptr=='\n'?0:*mis_ptr++)
# define unput(c) (*--mis_ptr=(c) )
%}


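Now declare a char buffer called myinputstring:

char myinputstring[100];

and do the following in main:

char *mis_ptr;
main()
{
	/* fgets keeps the trailing newline, which input() above treats as end of input */
	while (fgets(myinputstring, sizeof(myinputstring), stdin) != NULL)
	{
		mis_ptr = myinputstring;
		yylex();
	}
}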

I think you get the picture now?


GB

jal@valha1.ATT.COM (Joseph A. Leggio) (08/12/90)

From article <1990Aug10.012927.5558@basho.uucp>, by john@basho.uucp (John Lacey):
> Normally, of course, one wants a scanner (and a parser) to work from 
> a file, perhaps stdin.  Sigh.  Well, I want one that works from a string.
> 

> Has anyone done this, or see a way to do it, or know a way to do it, or ....
> 
> -- 
> John Lacey, 

I have used these "input" and "unput" routines in
many programs where I wanted complete control of the
input stream.  The example here uses fgets to fill
a character array from stdin, but you could fill it
from any source you wish.  You need only point the pointer "p"
to the start of the array each time you read a new line.

Only restriction: unput cannot back up past the start of a line.
(I have not found this to be a problem as I do not usually try
to match patterns which span multiple lines.)

I use System V Release 3 AT&T lex; "flex" might work the same.  Look
for the #defines for "input" and "unput" in your code.
==================================================
%%
	Lex reg-expr's go here
%%
#define BUFFER_SIZE 1024

char *p;
char buf[BUFFER_SIZE];

main(){
    p = buf;        /* point "p" to start of buf for first line     */
    while( fgets(buf, sizeof(buf), stdin) != NULL ) { /* read line  */
        yylex();                                      /* parse line */
        p = buf;    /* point "p" back to start of buf for next line */
    }
    exit(0);
}

#ifdef input
#undef input
#endif
#ifdef unput
#undef unput
#endif

/* replacement "input" routine for lex, uses char array "buf"  */
char input()
{
    if ( p < buf + ( BUFFER_SIZE - 1 ) )
            return(*p++);
    else
            return((char)0);
}

/* replacement "unput" routine for lex, uses char array "buf"  */
unput(c)
char c;
{
    if ( p > buf )
        *(--p) = c;     
}
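
The restriction mentioned above (unput cannot back up past the start of the line) can be made visible to the caller by having unput report whether it succeeded. This is a sketch only: line_load, line_input, line_unput and LINE_BUF_SIZE are hypothetical names, and lex itself does not look at unput's return value.

```c
#include <assert.h>
#include <string.h>

#define LINE_BUF_SIZE 1024          /* hypothetical size for this sketch */

static char line[LINE_BUF_SIZE];
static char *cur = line;            /* next character to hand out */

/* Copy a new line into the buffer and rewind the read pointer. */
void line_load(const char *s)
{
    assert(strlen(s) < LINE_BUF_SIZE);
    strcpy(line, s);
    cur = line;
}

/* Return the next character; the terminating NUL comes back as 0. */
int line_input(void)
{
    if (cur < line + (LINE_BUF_SIZE - 1))
        return (unsigned char)*cur++;
    return 0;
}

/* Push c back. Returns 1 on success, 0 if already at the start of
 * the line, in which case nothing is changed. */
int line_unput(int c)
{
    if (cur <= line)
        return 0;
    *--cur = (char)c;
    return 1;
}
```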

=============================================================
Joe Leggio WB2HOL
AT&T Customer Software Services
Valhalla, NY
att!valha1!jal