[comp.unix.questions] Yacc & Lex problem

Alistair_Frith.wgc1@rx.xerox.com (03/06/91)

Please don't bother to read this if you are not fairly well versed in Yacc as
it will only bore you.

I am currently working on a software metrics project using Yacc and Lex to
produce a program which will parse 'C' source code and return various
metrics such as number of function calls / distinct paths through a function /
local and global variables etc.

I did this by putting the grammar, courtesy of Kernighan & Ritchie (2nd edition),
into Yacc.

The problem is that the program thus produced won't parse nested if statements.
The dangling 'else' syndrome makes it fall over.

i.e.   it will parse

if (a)                  if (a)                  if (a)
        b;                      b;                      if (b)
else                    else                                    c;
        c;                      if (c)          etc;
etc;                                    d;
                        etc;

but not

if (a)				if (a)
	b;				if (b)
else						c;
	if (c)				else
		d;				d;
	else			etc;
		e;
etc;


The relevant bits of the K&R grammar are:

statement
	:	conditional_statement
	.
	.
	.
	;

conditional_statement
	:	IF LBRA expression RBRA statement
	|	IF LBRA expression RBRA statement ELSE statement
	;

Non-terminals are in lower case; terminals returned by Lex are in capitals.

This grammar gives one shift/reduce conflict, as I expected, but I and several
of my colleagues would also expect Yacc to resolve the conflict by shifting the
else. The manuals seem to confirm this, and so does the verbose output of Yacc.
But as far as I can see, the program is reducing both ifs before shifting the
else onto the stack!
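
For comparison, the "shift the else" resolution described in the manuals is
what a hand-written recursive-descent parser does when it takes an else as
soon as it sees one. Below is a minimal sketch in C (not Yacc output),
assuming a hypothetical one-character token stream: 'i' stands for IF, 'e' for
ELSE, and any other letter for a condition or a simple statement. The names
parse(), statement(), take() and emit() are illustrative only.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical token stream: 'i' = IF, 'e' = ELSE, any other letter is a
   condition or a simple statement.  Not related to the real Lex tokens. */
static const char *in;     /* next unread token            */
static char out[256];      /* bracketed form of the parse  */

/* Append to the result without overrunning the buffer. */
static void emit(const char *s)
{
    strncat(out, s, sizeof out - strlen(out) - 1);
}

/* Return the next token, or '?' if the input has run out. */
static char take(void)
{
    return *in ? *in++ : '?';
}

static void statement(void)
{
    if (*in == 'i') {
        in++;                              /* IF */
        emit("(if ");
        char cond[2] = { take(), '\0' };   /* condition */
        emit(cond);
        emit(" ");
        statement();                       /* then-branch */
        if (*in == 'e') {                  /* "shift": take ELSE at once, */
            in++;                          /* so it joins the nearest IF  */
            emit(" else ");
            statement();
        }
        emit(")");
    } else {
        char simple[2] = { take(), '\0' }; /* simple statement */
        emit(simple);
    }
}

/* Parse one statement and return its bracketed structure. */
const char *parse(const char *tokens)
{
    in = tokens;
    out[0] = '\0';
    statement();
    return out;
}
```

With this sketch, parse("iaibced"), i.e. if (a) if (b) c; else d;, returns
"(if a (if b c else d))": the else attaches to the inner if, which is the
behaviour the shift resolution is supposed to give.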

I have tried re-vamping the conditional statement to read

cond_state
	:	IF LBRA expression RBRA statement
	|	cond_state else_part
	;

else_part
	:	ELSE statement
	;

and various convolutions along that theme.

I have also tried a suggestion of changing it to

statement
	:	x_statement
	|	complex_if
	;

x_statement
	:	simple_if
	.
	.
	.
	;

simple_if
	:	IF LBRA expr RBRA statement
	;

complex_if
	:	IF LBRA expr RBRA x_statement ELSE statement
	;

and various convolutions

but the conflict persists, and the program will not parse valid 'C'.

I have asked various people, but there is little Yacc knowledge here.
I have read several manuals, but they all say the same thing and tell me that
the conflict will be resolved in the way I want.
I have looked at the verbose Yacc output, but even that seems to say the same as
the manuals.

And so I turn to you . . .		:-)

	Any ideas?

		Alistair.

nntp@informatik.uni-erlangen.de (nntp@faui45) (03/08/91)

> [yacc example deleted]

Have you tried compiling with -DYYDEBUG and setting
extern int yydebug;
to non-zero? This will be useful in determining what a
yacc-parser does at run time.

The first thing I would look at: Does your lexical analyzer
really recognize the 'else'? Are you sure?? Really ??? (How?)
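
One way to answer the "How?" is to print the token code the scanner produces
for each lexeme and check that 'else' comes out as ELSE rather than as an
identifier. Below is a minimal sketch in C, assuming a hypothetical
hand-written classify() in place of the Lex-generated scanner and made-up
token codes (a real build would take them from the header Yacc generates).
IDENTIFIER, token_name() and dump_tokens() are illustrative names as well.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical token codes; a real build takes them from Yacc's header. */
enum { IF = 257, ELSE, IDENTIFIER };

/* Stand-in for the keyword part of the scanner: whole-word match only. */
int classify(const char *lexeme)
{
    if (strcmp(lexeme, "if") == 0)   return IF;
    if (strcmp(lexeme, "else") == 0) return ELSE;
    return IDENTIFIER;
}

/* Name a token code so that the trace is readable. */
const char *token_name(int token)
{
    switch (token) {
    case IF:   return "IF";
    case ELSE: return "ELSE";
    default:   return "IDENTIFIER";
    }
}

/* Print one trace line per lexeme: the lexeme, then its token name. */
void dump_tokens(const char *const *lexemes, int count)
{
    for (int i = 0; i < count; i++)
        printf("%-10s -> %s\n", lexemes[i],
               token_name(classify(lexemes[i])));
}
```

If a trace like this shows 'else' arriving as an identifier (for instance
because a more general rule matched first), the parser never sees an ELSE
token at all, whatever the grammar says.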

If this doesn't help, I'm willing to help you if you send me
the source (via email).
So far as I can see, your grammar is correct (but I saw only a few
relevant lines!)

BTW, such questions are usually discussed in comp.lang.misc, comp.lang.c
or comp.compilers.
			See you(r postings/mail)  Ingo