[comp.compilers] Obsession with lexical and syntactic issues

worley@compass.com (Dale Worley) (11/16/89)

> Who >really< cares about syntax anyway?

Well, the customer, for one.

Syntax-related issues in compilers "in the real world" are *not*
trivial.  For instance:

- Constant revisions/changes to the grammar

This is where tools that turn a grammar into a parser win big over
hand-coded parsers -- the language *is* going to go through 23
revisions before the compiler goes out the door.  Sure, it started out
as "ANSI Standard", but the customer would like a couple of extra
features...

- Verifying that the new language features don't introduce ambiguities
into the language

Unless you design your language with an eye to making all
constructions *obviously* different, you will introduce ambiguities.
I've been involved with compilers for C, Algol 68, and a Cobol-like
business application language, and seen enough about Fortran and PL/1,
to know that this sort of problem is always biting you.

- Automatically producing good error recovery from syntax errors

This is clearly a major research area, and it is important in any
compiler that someone is actually going to use to write programs.
Even how error messages "ought" to be presented is still an open
question.

- Language designers steadfastly refuse to make LALR(1) languages

I've never yet seen a major programming language that was truly
LALR(1).  And usually the points where they depart from LALR(1) are
seriously ugly -- consider the problem of tagging all the typedef
names in a C program in a truly ANSI Standard way.
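
To make the typedef problem concrete: in C, "T * x;" declares a pointer
if T is a typedef name and is a multiplication expression otherwise, so
a context-free grammar alone cannot decide. A common workaround is to
have the lexer consult a table of typedef names that the parser keeps up
to date. The sketch below is only an illustration of that feedback idea;
the names classify, typedef_names, TYPE_NAME and IDENTIFIER are
hypothetical, and it ignores keywords, scoping and everything else a
real C front end must handle.

```python
import re

# One token per match: optional whitespace, then an identifier or a single character.
TOKEN_RE = re.compile(r"\s*(?:([A-Za-z_]\w*)|(.))")

def classify(source, typedef_names):
    """Tokenize `source`, tagging each identifier by looking it up in a typedef table.

    `typedef_names` stands in for a set that the parser would update
    whenever it processes a typedef declaration (an assumption of this sketch).
    """
    tokens = []
    for ident, punct in TOKEN_RE.findall(source):
        if ident:
            kind = "TYPE_NAME" if ident in typedef_names else "IDENTIFIER"
            tokens.append((kind, ident))
        elif punct.strip():
            # Anything that is not an identifier or whitespace is punctuation here.
            tokens.append(("PUNCT", punct))
    return tokens

# The same characters yield different token streams depending on the table.
as_declaration = classify("T * x;", {"T"})
as_expression = classify("T * x;", set())
```

The grammar can then stay LALR(1) because it sees two distinct token
kinds, at the price of a lexer that depends on what the parser has
already seen.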

- Building a lex/parse system that will Do What I Mean

Lexer and parser specifications are *still* more complicated than
they "ought to be".  Anything that can be done to put more wisdom in
the generator will provide immediate payoff.

Another reason that this is still an important area of study is that
we write many parsers, whereas, at least in theory, a good global
optimizer should be usable with little change in compilers for many
languages.

Dale Worley		Compass, Inc.			worley@compass.com
-- 
Send compilers articles to compilers@esegue.segue.boston.ma.us
{spdcc | ima | lotus}!esegue.  Meta-mail to compilers-request@esegue.
Please send responses to the author of the message, not the poster.

gateley@m2.csc.ti.com (John Gateley) (11/19/89)

In article <1989Nov15.193343.2017@esegue.segue.boston.ma.us> worley@compass.com (Dale Worley) writes:
> [These are syntactic issues which arise in the real world.]
>- Constant revisions/changes to the grammar
>- Verifying that the new language features don't introduce ambiguities
>into the language
>- Automatically producing good error recovery from syntax errors
>- Language designers steadfastly refuse to make LALR(1) languages
>- Building a lex/parse system that will Do What I Mean

Lisp is a counter-example to most of the above points: Its grammar
is very simple (and modifications are made through the read-table
abstraction, instead of the parser). New language features don't
change the grammar, they are added as special forms or macros.
It is at least LALR(1), and parse systems are very simple. Lexers
are a bit harder, depending on how many special characters you use,
and whether you implement the Common Lisp read-table, but are not
more complex than conventional language lexers.
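
As a rough illustration of how small such a parser can be, here is a
sketch of a reader for parenthesized lists and atoms that needs only one
token of lookahead. It is not the Common Lisp reader: there is no
read-table, no quote syntax, no strings and no numbers, and the names
tokenize and read are just choices made for this example.

```python
def tokenize(text):
    """Split text into '(' , ')' and whitespace-separated atoms."""
    return text.replace("(", " ( ").replace(")", " ) ").split()

def read(tokens, pos=0):
    """Parse one expression starting at tokens[pos]; return (value, next_pos)."""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    tok = tokens[pos]
    if tok == "(":
        items = []
        pos += 1
        # Read sub-expressions until the matching close parenthesis.
        while pos < len(tokens) and tokens[pos] != ")":
            item, pos = read(tokens, pos)
            items.append(item)
        if pos >= len(tokens):
            raise SyntaxError("missing ')'")
        return items, pos + 1  # skip the ')'
    if tok == ")":
        raise SyntaxError("unexpected ')'")
    return tok, pos + 1  # an atom, kept as a plain string

tree, end = read(tokenize("(defun f (x) (* x x))"))
```

Here tree is a nested Python list mirroring the list structure. Adding a
new special form or macro changes what the lists mean, not how they are
read, which is the point made above.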

Good error recovery is still hard though, even with Lisp's simple syntax.

John
gateley@m2.csc.ti.com