[comp.compilers] Error messages and Yacc

nielses@imada.dk (Soee Niels) (07/29/89)

We are using yacc to write a compiler, and we are having trouble
giving the user meaningful error messages.
Yacc knows which symbols are valid in every state, and we want to
use this information to write something like:
LEFTPAR or SEMICOLON expected.
One possible solution to the problem is a function "yyexpected"
that could return a list of possible terminal symbols.
Does any of you have this function, or is there another
solution to the problem?

Thank you in advance.

My email address is niels@ifad.dk, but maybe the answer is of
interest to the newsgroup.

[This is not as easy as it looks, because yacc combines states
wherever possible, which means that the legitimate follow set in each
state is often a lot larger than the grammar would make it.  There
has been a lot of work done on improving yacc's notoriously weak
error handling, and nobody has asked about this for a while, so I'd
be interested in hearing what's new in yacc error recovery.  -John]
--
Send compilers articles to compilers@ima.isc.com or, perhaps, Levine@YALE.EDU
{ decvax | harvard | yale | bbn }!ima.  Meta-mail to ima!compilers-request.
Please send responses to the author of the message, not the poster.

djones@megatest.uucp (Dave Jones) (08/01/89)

> We are using yacc to write a compiler, and we are having trouble
> giving the user meaningful error messages.
> Yacc knows which symbols are valid in every state, and we want to
> use this information to write something like:
> LEFTPAR or SEMICOLON expected.
> One possible solution to the problem is a function "yyexpected"
> that could return a list of possible terminal symbols.
> Does any of you have this function, or is there another
> solution to the problem?
> 
> Thank you in advance.

We bashed this all out a few months ago. The result of the discussion:
It is not as easy (nor as useful) as it would at first seem.
There were two incorrect solutions posted, and I found another
incorrect attempt in a book called
_Introduction_to_Compiler_Construction_with_Unix_. It is a very easy
algorithm to botch.

The problem is that yacc uses default reductions. That is, when yacc is
in a state in which the next token cannot be shifted, but in which (if
the input token is not considered) at least one reduction is possible,
yacc always does some reduction, willy nilly. At "compiler-compile
time", one reduction is selected as the "default", and it is used when
the compiler runs if the look-ahead token matches none of the state's
other actions. Why? It saves space in the tables.

The result is that when yacc finally gets around to discovering
that there is an error, several states which might have shifted
a legal token may have been used in reductions, and thus popped
from the stack. The information necessary to reconstruct the set
of legal tokens has therefore been lost (and erroneous semantic
actions may have been done).
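
To make the effect concrete, here is a minimal sketch in C. Everything
in it is an assumption made for illustration: the toy grammar, the
state numbers, and the names can_shift, default_goto, settle,
naive_expected and simulated_expected are hypothetical and are not
taken from yacc's real tables. The toy uses only empty rules, so the
useful state ends up buried under newer states rather than popped, but
a naive scan of the top-of-stack state goes wrong in the same way.

```c
#include <assert.h>

/* Hypothetical toy grammar, loosely modeled on a Pascal block:
 *   block     : opt_const opt_type opt_var BEGIN END ;
 *   opt_const : | CONST ID ;
 *   opt_type  : | TYPE ID ;
 *   opt_var   : | VAR ID ;
 * Only states 0..3 (before anything has been shifted) are modeled. */
enum token { T_CONST, T_TYPE, T_VAR, T_BEGIN, T_END, T_ID, NTOK };
enum { NSTATE = 4, NONE = -1 };

/* can_shift[s][t] is 1 if state s shifts token t. */
static const int can_shift[NSTATE][NTOK] = {
    /* 0: before opt_const */ { 1, 0, 0, 0, 0, 0 },
    /* 1: before opt_type  */ { 0, 1, 0, 0, 0, 0 },
    /* 2: before opt_var   */ { 0, 0, 1, 0, 0, 0 },
    /* 3: before BEGIN     */ { 0, 0, 0, 1, 0, 0 },
};

/* default_goto[s] is the state entered after the default (empty-rule)
 * reduction in state s, or NONE if s has no default reduction. */
static const int default_goto[NSTATE] = { 1, 2, 3, NONE };

/* Follow default reductions, ignoring whether the lookahead is legal,
 * until the token can be shifted or no default reduction is left. */
int settle(int state, int lookahead)
{
    assert(state >= 0 && state < NSTATE);
    assert(lookahead >= 0 && lookahead < NTOK);
    while (!can_shift[state][lookahead] && default_goto[state] != NONE)
        state = default_goto[state];
    return state;
}

/* The botched approach: list the tokens shiftable in one state only. */
unsigned naive_expected(int state)
{
    unsigned mask = 0;
    int t;
    for (t = 0; t < NTOK; t++)
        if (can_shift[state][t])
            mask |= 1u << t;
    return mask;
}

/* One possible repair: starting from the state in which the lookahead
 * was first read, try every token and keep those that end in a shift. */
unsigned simulated_expected(int state)
{
    unsigned mask = 0;
    int t;
    for (t = 0; t < NTOK; t++)
        if (can_shift[settle(state, t)][t])
            mask |= 1u << t;
    return mask;
}
```

With ID as the offending token in state 0, settle(0, T_ID) runs through
three default reductions and stops in state 3, where naive_expected
reports only BEGIN. simulated_expected(0) reports CONST, TYPE, VAR and
BEGIN. A real parser would also have to handle reductions that pop
states, which means saving or copying the stack before the first
default reduction; this sketch does not attempt that.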

When we went over this before, I posted several examples, mostly
in Pascal grammars, which demonstrated the problem. For example,
in a state where VAR, TYPE, and so forth are legal in the
declaration section, the incorrect algorithms would say,
"BEGIN expected."

But them grapes is sour, anyway. You probably would not like the result if
you had it. The messages would often list tiresomely large sets
of tokens.

What you really want is a message which says something like,
"Declaration list or compound statement expected," along with a pointer
to the places in the manual where declarations and compound statements
are explained. You'd still need to do something about the default
reductions, though.
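
One way such messages might be organized is a hand-written table keyed
by parser state, sketched below in C. The struct, the table contents,
the state numbers and the function find_hint are all hypothetical, and
the manual references are placeholders. The sketch assumes the caller
passes the state in which the offending token was first read, that is,
before any default reductions were taken.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record: one hand-written message per parser state. */
struct state_hint {
    int state;             /* parser state number (made up here)     */
    const char *expected;  /* message in terms of grammar constructs */
    const char *manual;    /* placeholder pointer into the manual    */
};

static const struct state_hint hints[] = {
    { 0, "Declaration list or compound statement expected",
         "manual: declarations; compound statements" },
    { 3, "Compound statement expected",
         "manual: compound statements" },
};

/* Return the hint for a state, or NULL if none has been written. */
const struct state_hint *find_hint(int state)
{
    size_t i;
    assert(state >= 0);
    for (i = 0; i < sizeof hints / sizeof hints[0]; i++)
        if (hints[i].state == state)
            return &hints[i];
    return NULL;
}
```

An error routine could print the expected and manual strings when
find_hint returns an entry, and fall back to a generic "syntax error"
when it returns NULL.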

I would be interested in hearing more ideas on this subject.

peter@aries5.waterloo.edu (Peter Bumbulis) (08/04/89)

In article <1989Aug1.032302.1341@esegue.uucp> djones@megatest.uucp (Dave Jones) writes:
>
>What you really want is a message which says something like,
>"Declaration list or compound statement expected," along with a pointer
>to the places in the manual where declarations and compound statements
>are explained. You'd still need to do something about the default
>reductions, though.
>
>I would be interested in hearing more ideas on this subject.

In "A syntax-error-handling technique and its experimental analysis",
TOPLAS 5(4):656-679, 1983, Sippu and Soisalon-Soininen discuss how to
generate such messages.

Peter