[comp.lang.c] Lex Macros

dymm@wvucswv.UUCP (11/14/87)

Dear Friends,

I would like some help in interpreting some of the macros used
in Lex.  Below is the relevant portion of the C code produced by Lex.
The macros are quite "slick", but also somewhat difficult to understand.

Could someone please describe - in detail - how the macros
"input()"  and  "unput()"   operate?????

Send mail to the address below or, if you think your answer is worthy of
the attention of others, post it to the bb.
Thanks for your help.


David Dymm			Software Engineer

USMAIL: Bell Atlantic Knowledge Systems,
	13 Beechurst Ave., Morgantown, WV 26505
PHONE:	304 291-2651 (8:30-4:30 EST)
USENET:  {allegra,bellcore, cadre,idis,psuvax1}!pitt!wvucsb!wvucswv!dymm


	***  C Code From Lex.c  ***
    .
    .
# define U(x) x
# define YYLMAX 200

# define output(c) putc(c,yyout)

# define input() (((yytchar=yysptr>yysbuf?U(*--yysptr):getc(yyin))==10? \
    	    	    (yylineno++,yytchar):yytchar)==EOF?0:yytchar)
		    
# define unput(c) {yytchar= (c);if(yytchar=='\n')yylineno--;*yysptr++=yytchar;}

extern char *yysptr, yysbuf[];
int yytchar;
FILE *yyin ={stdin}, *yyout ={stdout};
extern int yylineno;

chris@mimsy.UUCP (Chris Torek) (11/16/87)

In article <48@wvucswv.UUCP> dymm@wvucswv.UUCP asks what the `input'
and `unput' macros lex generates do.

Try considering them as functions:

	/*
	 * Retrieve an input character.
	 */
	int
	input()
	{
		int c;

		/* retrieve pushback, if any, else next char */
		if (yysptr > yysbuf)
			c = *--yysptr;
		else
			c = getc(yyin);
		/* increment input line number if newline */
		if (c == '\n')
			yylineno++;
		/* return 0 for eof, anything else as is */
		return (c == EOF ? 0 : c);
	}

	unput(c)
		int c;
	{

		/* if pushing back a newline, back the line number down */
		if (c == '\n')
			yylineno--;
		*yysptr++ = c;
	}

The nonsense with the global `yytchar' is to avoid referring to
macro arguments or using `getc' more than once.  It would be nice
if lex used EOF for EOF rather than 0, but not many people need
to scan NULs.
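
To illustrate the remark about EOF and NUL, here is a small sketch that is
not part of the original post.  It reuses the function form of input()
given above; the helper chars_before_zero() and the use of tmpfile() are
assumptions made only for the demonstration.  It shows that a NUL byte in
the input looks the same to the caller as end of file.

```c
#include <stdio.h>

/* Stand-ins for the lex globals; the buffer size here is an assumption. */
static char yysbuf[200];
static char *yysptr = yysbuf;
static FILE *yyin;
static int yylineno = 1;

/* Function form of input(), following the rewrite in the post above. */
static int input(void)
{
    int c;

    if (yysptr > yysbuf)
        c = *--yysptr;          /* take pushed-back character first */
    else
        c = getc(yyin);         /* otherwise read from the file */
    if (c == '\n')
        yylineno++;
    return c == EOF ? 0 : c;    /* EOF is reported as 0 */
}

/*
 * Hypothetical helper: write n bytes to a temporary file, then count how
 * many calls to input() succeed before the first 0 comes back.
 * Returns -1 if the temporary file cannot be created.
 */
static int chars_before_zero(const char *bytes, size_t n)
{
    int count = 0;

    yyin = tmpfile();
    if (yyin == NULL)
        return -1;
    fwrite(bytes, 1, n, yyin);
    rewind(yyin);
    yysptr = yysbuf;
    while (input() != 0)
        count++;
    fclose(yyin);
    return count;
}
```

With the five bytes 'a', 'b', NUL, 'c', 'd' the loop stops after two
characters, although three more bytes remain in the file; with "abcd" it
stops after four, at the real end of file.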
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

andrew@frip.UUCP (11/20/87)

	"I would like some help in interpreting some of the macros used
	in Lex.  Below is the relevant portion of the C code produced
	by Lex.  The macros are quite "slick", but also somewhat
	difficult to understand.  Could someone please describe - in
	detail - how the macros "input()"  and  "unput()"
	operate?????"

Yysbuf is the unget buffer.  It's organized as a stack, growing and
shrinking at the right (high address).  To unget a character, append
it; to get a character, first look to see if yysbuf is nonempty, and if
so take (and remove) the last character.

Yysptr points to the position beyond the last character in the stack.
When yysptr==yysbuf, the stack is empty.
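
The stack layout described here can be sketched in isolation as follows.
This is only an illustration: the names push(), pop(), depth() and
is_empty() are hypothetical, and the buffer size is assumed (the posted
code only declares yysbuf[] as extern).

```c
#include <stddef.h>

#define YYLMAX 200                  /* assumed size of the unget buffer */

static char yysbuf[YYLMAX];
static char *yysptr = yysbuf;       /* one past the top of the stack */

/* Append a character at the high-address end, as unput() does. */
static void push(int c)
{
    *yysptr++ = (char)c;
}

/* Remove and return the last character, as input() does. */
static int pop(void)
{
    return *--yysptr;
}

/* Number of characters currently pushed back. */
static size_t depth(void)
{
    return (size_t)(yysptr - yysbuf);
}

/* The stack is empty exactly when yysptr has returned to yysbuf. */
static int is_empty(void)
{
    return yysptr == yysbuf;
}
```

Pushing 'a' then 'b' and popping twice gives 'b' then 'a', so characters
must be ungotten in the reverse of the order in which they should be read
again.  Like the posted unput(), this sketch does no overflow check.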

So:

# define input() (((yytchar=yysptr>yysbuf?U(*--yysptr):getc(yyin))==10? \
    	    	    (yylineno++,yytchar):yytchar)==EOF?0:yytchar)

yysptr>yysbuf means the unget stack is nonempty, so set yytchar to the
last character in that stack and remove it by decrementing yysptr.
Otherwise the unget stack is empty, so invoke getc to read the next
character from the input file.  Either way, if the character obtained
is '\n' (decimal 10 in ASCII), increment the line number in yylineno;
if it's EOF, return 0, else return the character.
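
The grouping is easier to follow with every implicit parenthesis written
out.  The version below is a sketch with the same tokens as the posted
macro plus explicit grouping and line continuations; the definitions of
the globals, the initial yylineno of 1, and the helper
lines_after_reading() are assumptions added so it can be tried on its own.

```c
#include <stdio.h>

#define U(x) x

/* Stand-ins for the lex globals; the buffer size is an assumption. */
static char yysbuf[200];
static char *yysptr = yysbuf;
static FILE *yyin;
static int yytchar;
static int yylineno = 1;

/*
 * Step 1: yytchar = (stack nonempty ? pop : getc)
 * Step 2: if yytchar == 10, bump yylineno; either way yield yytchar
 * Step 3: if that value is EOF yield 0, else yield yytchar
 */
#define input() \
    ((((yytchar = ((yysptr > yysbuf) ? U(*--yysptr) : getc(yyin))) == 10) \
        ? (yylineno++, yytchar) \
        : yytchar) == EOF ? 0 : yytchar)

/*
 * Hypothetical helper: read the whole text through input() and return
 * the final value of yylineno, or -1 if no temporary file is available.
 */
static int lines_after_reading(const char *text)
{
    yyin = tmpfile();
    if (yyin == NULL)
        return -1;
    fputs(text, yyin);
    rewind(yyin);
    yysptr = yysbuf;
    yylineno = 1;
    while (input() != 0)
        ;                       /* discard characters until 0 comes back */
    fclose(yyin);
    return yylineno;
}
```

Reading "a\nb\n" leaves yylineno at 3 (two newlines seen, starting from
1), and reading "abc" leaves it at 1.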

# define unput(c) {yytchar= (c);if(yytchar=='\n')yylineno--;*yysptr++=yytchar;}

Just append the character to the unget stack, and decrement the line
number if ungetting a newline.
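
As a check on both explanations, the sketch below pushes two characters
back with unput() and reads them again with input(), using the macros as
posted (with a line continuation added).  The global definitions and the
function roundtrip() are assumptions made for the demonstration.

```c
#include <stdio.h>

#define U(x) x

/* Stand-ins for the lex globals; the buffer size is an assumption. */
static char yysbuf[200];
static char *yysptr = yysbuf;
static FILE *yyin;
static int yytchar;
static int yylineno = 1;

#define input() (((yytchar=yysptr>yysbuf?U(*--yysptr):getc(yyin))==10? \
                    (yylineno++,yytchar):yytchar)==EOF?0:yytchar)

#define unput(c) {yytchar= (c);if(yytchar=='\n')yylineno--;*yysptr++=yytchar;}

/*
 * Hypothetical check: returns 1 if the characters come back in reverse
 * order of pushing and yylineno ends where it started.
 */
static int roundtrip(void)
{
    int first, second;

    yyin = stdin;               /* not read here: the stack is nonempty */
    yysptr = yysbuf;
    yylineno = 5;

    unput('\n');                /* yylineno drops to 4 */
    unput('x');                 /* no effect on yylineno */

    first = input();            /* pops 'x' */
    second = input();           /* pops '\n', yylineno back to 5 */

    return first == 'x' && second == '\n'
        && yylineno == 5 && yysptr == yysbuf;
}
```

Because unput() expands to a brace-enclosed block rather than an
expression, writing "if (cond) unput(c); else ..." does not compile: the
semicolon after the closing brace ends the if statement before the else.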

  -=- Andrew Klossner   (decvax!tektronix!tekecs!andrew)       [UUCP]
                        (andrew%tekecs.tek.com@relay.cs.net)   [ARPA]