[comp.sys.sgi] re-cursing lex

Dan Karron@UCBVAX.BERKELEY.EDU (04/29/91)

How do I get lex to: 

1) read from a user-provided buffer instead of a file?

2) Get lex to throw away output it can't match (instead of sinking OUTPUT
to /dev/null?) Better still would be to write a catchall pattern that
would make an error string for stuff it could not match otherwise.

3) Get lex to recurse, i.e., call lex again with input from another buffer.

I don't want to get involved with yacc, as that is too complicated for 
my simple mind.

All that I want to do is identify strings like

VARIABLE="value"

from a file or the environment string and

-VARIABLE="value" 

from the command line argument vector.

Once recognized, then parse out the value, or if in error, say something
about it.

I would like to take a uniform approach to reading values from a file,
environment string, and command line argument.

| karron@nyu.edu (e-mail alias )         Dan Karron, Research Associate      |
| Phone: 212 263 5210 Fax: 212 263 7190  New York University Medical Center  |
| 560 First Avenue                       Digital Pager <1> (212) 397 9330    |
| New York, New York 10016               <2> 10896   <3> <your-number-here>  |

jmb@patton.wpd.sgi.com (Jim Barton) (05/02/91)

In article <9104290656.AA16311@karron.med.nyu.edu>, Dan Karron@UCBVAX.BERKELEY.EDU writes:
|> From: Dan Karron@UCBVAX.BERKELEY.EDU
|> Newsgroups: comp.sys.sgi
|> Subject: re-cursing lex
|> Message-ID: <9104290656.AA16311@karron.med.nyu.edu>
|> Date: 29 Apr 91 06:56:24 GMT
|> Organization: The Internet
|> 
|> How do I get lex to: 
|> 
|> 1) read from a user-provided buffer instead of a file?
|> 
|> 2) Get lex to throw away output it can't match (instead of sinking OUTPUT
|> to /dev/null?) Better still would be to write a catchall pattern that
|> would make an error string for stuff it could not match otherwise.
|> 
|> 3) Get lex to recurse, i.e., call lex again with input from another buffer.

You can manage 1 and 3 in lex by redefining the standard input/output functions.
I actually think it's documented in that old hoary stuff from AT&T. The
functions are actually macros: "input", "output" and "unput". #undef these
at the top of your *.l file, and provide real functions to do what you want.
For instance, I wrote a lex program that could deal with "include" files
(even nested includes).
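
A minimal sketch of that idea in C, under stated assumptions; it is not the
include-file program mentioned above. The names push_buffer, buf_input and
buf_unput, and the two size limits, are hypothetical. In a *.l file one would
#undef input and unput and then define them in terms of these functions.
Pushing a second buffer while the first is only partly read gives the nested
behavior asked about in question 3.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPTH    16     /* hypothetical limit on nested buffers */
#define PUSHBACK_MAX 256    /* hypothetical limit on unput characters */

/* One entry per buffer being scanned; the top entry is the active one. */
static const char *buf_stack[MAX_DEPTH];
static int buf_depth = 0;

/* Characters handed back by the scanner, returned before anything else. */
static char pushback[PUSHBACK_MAX];
static int pushback_len = 0;

/* Start scanning a NUL-terminated buffer; the current one resumes afterwards.
 * Returns 0 on success, -1 if buffers are nested too deeply. */
int push_buffer(const char *text)
{
    assert(text != NULL);
    if (buf_depth >= MAX_DEPTH)
        return -1;
    buf_stack[buf_depth++] = text;
    return 0;
}

/* Stand-in for lex's input(): next character, or 0 when all input is used up. */
int buf_input(void)
{
    if (pushback_len > 0)
        return (unsigned char)pushback[--pushback_len];

    while (buf_depth > 0) {
        const char *p = buf_stack[buf_depth - 1];
        if (*p != '\0') {
            buf_stack[buf_depth - 1] = p + 1;
            return (unsigned char)*p;
        }
        buf_depth--;        /* buffer exhausted; fall back to the outer one */
    }
    return 0;
}

/* Stand-in for lex's unput(): hand one character back to the input. */
void buf_unput(int c)
{
    if (pushback_len < PUSHBACK_MAX)
        pushback[pushback_len++] = (char)c;
}
```

The buffers themselves are never modified, so an environment string or an
argv element can be pushed directly.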

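Number 2 could be handled pretty easily by including a final catch-all rule
with an empty action. Since "." does not match a newline, add "\n" if
newlines should be thrown away as well:

	.|\n	{ }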
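
For the "error string" variant of question 2, the catch-all action could
collect the stray characters instead of dropping them. A rough sketch in C
follows; note_unmatched, flush_unmatched, the message wording and the ERR_MAX
limit are all made up for illustration and are not part of lex.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define ERR_MAX 128             /* hypothetical limit on collected text */

static char err_text[ERR_MAX];  /* unmatched characters seen so far */
static size_t err_len = 0;

/* Call from the catch-all action, e.g.  .|\n  { note_unmatched(yytext[0]); } */
void note_unmatched(int c)
{
    if (err_len + 1 < ERR_MAX) {
        err_text[err_len++] = (char)c;
        err_text[err_len] = '\0';
    }
}

/* Format the pending error into out and clear it.
 * Returns 1 if a message was written, 0 if nothing was pending. */
int flush_unmatched(char *out, size_t outsz)
{
    assert(out != NULL && outsz > 0);
    if (err_len == 0)
        return 0;
    snprintf(out, outsz, "unrecognized input: \"%s\"", err_text);
    memset(err_text, 0, sizeof err_text);
    err_len = 0;
    return 1;
}
```

Calling flush_unmatched from the action of each real rule would report one
message per run of garbage rather than one per character.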

I would suggest you use 'flex' instead. Flex is 1000% faster than lex, builds
faster scanners, has more features, is easier to use, doesn't have the weird
bugs in lex, and is free. The way you redefine input/output is much easier.
You can pick it up from any of the public domain archives.

-- Jim Barton
   Silicon Graphics Computer Systems
   jmb@sgi.com