[comp.unix.wizards] lex and yacc help desired

allen@jetson.UUCP (Allen Wade) (04/23/91)

Hello,

I am fairly new to Lex and Yacc and I am trying to develop a
language definition for a small report processor. I have the book
Introduction to Compiler Construction by Schreiner and Friedman,
but I feel a little lost as to how to start. My input lines will
generally look like this:

|MEDINA|ANTOINIO|01/14/62|M|(312)778-2540|60629|GARFIELD|LAREN|GP|DC101A|

I am able to figure out the regular expressions for Lex, but I am confused
about how Lex and Yacc work together.

Any help will be most appreciated.

__________
Allen Wade        
Consumer Health Services        Boulder, CO
{uunet,boulder}!chs!allen
chs!allen@boulder.Colorado.EDU
uunet!isis!awade
awade@nyx.cs.du.edu

reg@pinet.aip.org (Dr. Richard Glass) (04/24/91)

You may also want to look at "lex & yacc" by Mason and Brown, from
O'Reilly & Associates, Inc. (a Nutshell Handbook).

Ricky Glass
-- 
Ricky Glass
(reg@pinet.aip.org)
Def: Recursion, See recursion  ("The Devils DP Dictionary" - )

pg@bsg.com (Peter Garst) (04/25/91)

In article <1991Apr23.164744.25927@mnemosyne.cs.du.edu> awade@isis.cs.du.edu (allen wade) writes:
>
>I am able to figure out the regular expressions for Lex, but I am confused
>about how Lex and Yacc work together.
>

When you are writing a lex/yacc interface, the work is on the lex side;
the yacc side is hidden in the code generated by yacc, which calls
yylex() each time the parser needs another token.

On the lex side, you generally need to do two things when you have found
a token that you want to pass to yacc: 

1) set the variable "yylval" to the value of the token, and
2) return the macro defining the token number.

For example, if you have recognized an integer and want to pass it to yacc,
you will do this in a lex action:

	yylval	= atoi(yytext);	/* Set yylval to the number found	*/
	return(NUMBER);		/* Tell yacc you found a number		*/

In your grammar file you can then get the value of the number with one of
the $n variables in an action associated with a rule.
The easiest way to make sure NUMBER is defined is to include the scanner
generated by lex in the last part of the yacc grammar file. You can put
'#include "put-your-scanner-name-here"' (lex writes lex.yy.c by default)
after the second %% in the yacc file. Then the "%token NUMBER" in the
declarations section will define NUMBER for your scanner.
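
As a rough illustration of the two steps above, here is a minimal sketch in
plain C. It is not lex or yacc output: yylex() is written by hand to stand in
for the scanner lex would generate, and sum_numbers() is a made-up stand-in
for the loop inside yyparse(). The names "input" and "sum_numbers" and the
value 257 for NUMBER are assumptions for the example only.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Placeholder token code; yacc would generate this from "%token NUMBER". */
#define NUMBER 257

int yylval;                 /* value of the current token, read by the parser */
static const char *input;   /* hypothetical: text still to be scanned */

/* Hand-written stand-in for the scanner that lex would generate. */
int yylex(void)
{
    assert(input != NULL);
    while (*input == ' ' || *input == '\t')
        input++;                                 /* skip blanks */
    if (*input == '\0')
        return 0;                                /* 0 means end of input */
    if (isdigit((unsigned char)*input)) {
        char *end;
        yylval = (int)strtol(input, &end, 10);   /* 1) set yylval */
        input = end;
        return NUMBER;                           /* 2) return the token number */
    }
    return (unsigned char)*input++;   /* any other character is its own token */
}

/* Stand-in for yyparse(): pulls tokens and adds up every NUMBER it sees. */
int sum_numbers(const char *text)
{
    int tok, total = 0;

    input = text;
    while ((tok = yylex()) != 0)
        if (tok == NUMBER)
            total += yylval;
    return total;
}
```

Calling sum_numbers("12 + 30") returns 42: the scanner hands back NUMBER
twice, with yylval holding 12 and then 30, and the '+' comes through as a
single-character token that this toy "parser" ignores.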

Debugging yacc grammars and parsers can be difficult and time consuming;
we have a symbolic debugger (and yacc upgrade) which makes it much easier.
See our recent comp.newprod posting, or get in touch for more information.

Peter Garst
pg@bsg.com
Bloomsbury Software Group, PO Box 390018, Mountain View, CA 94039
(415) 964-3486

mike@odgate.odesta.com (Mike J. Kelly) (04/26/91)

In article <1991Apr23.164744.25927@mnemosyne.cs.du.edu> allen wade writes:
>I am fairly new to Lex and Yacc and I am trying to develop a
>language definition for a small report processor.
> My input lines will generally look like this:
>
>|MEDINA|ANTOINIO|01/14/62|M|(312)778-2540|60629|GARFIELD|LAREN|GP|DC101A|
>

You could use Yacc to parse this, but it's really overkill.  scanf(3)
would work as well, or at worst, just read the line into a string and
use the string package to separate out the components; look at strtok(3)
and strchr(3).  
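
A minimal sketch of the strchr(3) route, assuming the record layout shown in
the original post (a leading and a trailing '|'). The function name
split_record and its signature are invented for this example. It uses
strchr rather than strtok because strtok treats a run of delimiters as one,
so an empty field such as "||" would silently shift the later columns.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Split a '|'-delimited record in place.  Stores pointers to the fields in
 * fields[] (at most max of them) and returns how many were found.  Empty
 * fields are kept as empty strings.
 */
size_t split_record(char *line, char *fields[], size_t max)
{
    size_t n = 0;
    char *p, *bar;

    assert(line != NULL && fields != NULL);
    line[strcspn(line, "\n")] = '\0';      /* drop a trailing newline, if any */
    p = (*line == '|') ? line + 1 : line;  /* skip the leading delimiter */

    while (n < max && (bar = strchr(p, '|')) != NULL) {
        *bar = '\0';                       /* terminate this field */
        fields[n++] = p;
        p = bar + 1;                       /* next field starts after the bar */
    }
    if (n < max && *p != '\0')
        fields[n++] = p;                   /* last field when no closing bar */
    return n;
}
```

The buffer passed in must be writable (for example, one filled by fgets),
since each '|' is overwritten with a terminating NUL.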

If you really want to use Yacc, what I'd do is write a simple lex which
returns four types of tokens: TOK_DATE, TOK_PHONE, TOK_NUM and
TOK_STRING.  Then your yacc grammar is:

statement:	TOK_STRING '|' TOK_STRING '|' TOK_DATE '|' TOK_STRING
		'|' ...
		{
			last_name = $1;		/* first field */
			first_name = $3;	/* second field */
			bday = $5;		/* third field */
			.
			.
			.
		}
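
For the scanner side, the four token classes named above could be told
apart roughly as follows. This is only a sketch in plain C rather than a lex
specification: classify_field and has_shape are hypothetical helpers, the
token values are placeholders for what yacc would assign, and the date and
phone shapes are guessed from the single sample line in the original post.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Placeholder token codes; yacc would normally assign these via %token. */
enum { TOK_DATE = 258, TOK_PHONE, TOK_NUM, TOK_STRING };

/*
 * Return 1 if s has the same shape as pattern: a '9' in the pattern
 * matches any digit, every other pattern character matches itself.
 */
static int has_shape(const char *s, const char *pattern)
{
    for (; *pattern != '\0'; s++, pattern++) {
        if (*pattern == '9') {
            if (!isdigit((unsigned char)*s))
                return 0;
        } else if (*s != *pattern) {
            return 0;
        }
    }
    return *s == '\0';              /* s must not be longer than the pattern */
}

/* Decide which token a single field (without its '|' delimiters) would be. */
int classify_field(const char *field)
{
    assert(field != NULL);
    if (has_shape(field, "99/99/99"))           /* assumed: MM/DD/YY */
        return TOK_DATE;
    if (has_shape(field, "(999)999-9999"))      /* assumed: (AAA)EEE-NNNN */
        return TOK_PHONE;
    if (*field != '\0' && strspn(field, "0123456789") == strlen(field))
        return TOK_NUM;                         /* all digits, e.g. a zip code */
    return TOK_STRING;                          /* anything else */
}
```

With the sample record, "01/14/62" comes back as TOK_DATE, "(312)778-2540"
as TOK_PHONE, "60629" as TOK_NUM, and names or codes such as "MEDINA" and
"DC101A" as TOK_STRING.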

rembo@unisoft.UUCP (Tony Rems) (04/30/91)

In article <1991Apr25.204908.21654@odgate.odesta.com> mike@odgate.odesta.com (Mike J. Kelly) writes:
>In article <1991Apr23.164744.25927@mnemosyne.cs.du.edu> allen wade writes:
>>I am fairly new to Lex and Yacc and I am trying to develop a
>>language definition for a small report processor.
>> My input lines will generally look like this:
>>
>>|MEDINA|ANTOINIO|01/14/62|M|(312)778-2540|60629|GARFIELD|LAREN|GP|DC101A|
>>
>
>You could use Yacc to parse this, but it's really overkill.  scanf(3)
>would work as well, or at worst, just read the line into a string and
>use the string package to separate out the components; look at strtok(3)
>and strchr(3).  
>

Actually, this seems ideally suited to awk or, preferably, perl :).
You could split these fields up with one split command in perl.
Then to really use perl right, you could stick it into an
associative array (I'm sure Mr. Wall is smiling), and you can
do any sort of operations on it you'd like.

Try perl today!

-Tony