[comp.compilers] LEX and YACC -- Are there better parser generators?

grosch@uunet.UU.NET (Josef Grosch) (11/12/89)

      Rex, Lalr and Ell - Compiler Construction Tools
      ===============================================

     Rex (Regular EXpression tool) is a scanner  generator  whose
specifications  are  based  on  regular expressions and arbitrary
semantic actions written in one of  the  target  languages  C  or
Modula-2.  As  scanners sometimes have to consider the context to
unambiguously recognize a token the right context can  be  speci-
fied by an additional regular expression and the left context can
be handled by so-called  start  states.  The  generated  scanners
automatically  compute the line and column position of the tokens
and offer an efficient mechanism  to  normalize  identifiers  and
keywords  to upper or lower case letters. The scanners are table-
driven and run at a speed of 180,000 to 195,000 lines per  minute
on a MC 68020 processor.

     Lalr is a LALR(1) parser generator accepting grammars  writ-
ten  in  extended BNF notation which may be augmented by semantic
actions expressed by statements of the target language. The  gen-
erator  provides  a  mechanism  for  S-attribution,  that is syn-
thesized attributes can be computed during parsing.  In  case  of
LR-conflicts  unlike  other tools Lalr provides not only informa-
tion about an internal state consisting of a set of items but  it
prints a derivation tree which is much more useful to analyze the
problem. Conflicts can be resolved by specifying  precedence  and
associativity of operators and productions. The generated parsers
include automatic  error  recovery,  error  messages,  and  error
repair.  The  parsers  are  table-driven  and  run  at a speed of
400,000 lines per minute. Currently parsers can be  generated  in
the target languages C and Modula-2.

     Ell is a LL(1) parser generator accepting the same  specifi-
cation  language  as  Lalr except that the grammars must obey the
LL(1) property. It  is  possible  to  evaluate  an  L-attribution
during parsing. The generated  parsers  include  automatic  error
recovery,  error  messages,  and  error  repair  like  Lalr.  The
parsers are implemented following the  recursive  descent  method
and  reach a speed of 750,000 lines per minute. The possible tar-
get languages are again C and Modula-2.

     All the tools are implemented in Modula-2.
A comparison of the above tools with the corresponding UNIX
tools shows that significant improvements in terms of error handling
as well as efficiency have been achieved:
Rex generated scanners are 4 times faster than those of LEX.
Lalr generated parsers are 2-3 times faster than those of YACC.
Ell generated parsers are 4 times faster than those of YACC.
The input languages of the tools are improvements of the LEX and YACC
inputs. The tools also understand LEX and YACC syntax with the help of
preprocessors.

The tools are publicly copyable. They have been developed in 1987.
They have been tested by generating scanners and parsers for
e. g. Pascal, Modula, Oberon, Ada and found stable.

The tools have been developed using our own Modula-2 compiler called
MOCKA on a MC 68020 based UNIX workstation. They have been ported
to the SUN workstation and been compiled successfully using the SUN
Modula-2 compiler. They also run on VAX/BSD UNIX and VAX/ULTRIX machines.
This should assure a reasonable level of portability.

I would be pleased to send the programs to everybody that is interested.
To minimize the costs (at least for me) I suggest to send a blank SUN
streamer tape. 1/2" magtapes are possible, too, but cause more work for me.
I will return the tape with the sources organized as a UNIX file tree
(tar format). The tape will contain user manuals for each tool.

Literature:

J. Grosch: "Generators for High-Speed Front-Ends",
LNCS 371, pp. 81-92, Berlin: Springer Verlag, 1989.

More papers about Rex and Lalr will be published soon.

PS: We have completed first versions of more tools:
Ast handles abstract syntax trees or attributed trees and graphs
(something like IDL).
Ag generates attribute evaluators for ordered attribute grammars (OAGs).


Josef Grosch
GMD - Forschungsstelle an der Universitaet Karlsruhe
Haid-und-Neu-Str. 7
D-7500 Karlsruhe 1
West Germany

Phone: +721-662226
Email: grosch@gmdka.UUCP
-- 
Send compilers articles to compilers@esegue.segue.boston.ma.us
{spdcc | ima | lotus}!esegue.  Meta-mail to compilers-request@esegue.
Please send responses to the author of the message, not the poster.

grosch@uunet.UU.NET by unido.informatik.uni-dortmund.de with uucp via EUnet for uunet (Josef Grosch) (11/12/89)

      Rex, Lalr and Ell - Compiler Construction Tools
      ===============================================

     Rex (Regular EXpression tool) is a scanner  generator  whose
specifications  are  based  on  regular expressions and arbitrary
semantic actions written in one of  the  target  languages  C  or
Modula-2.  As  scanners sometimes have to consider the context to
unambiguously recognize a token the right context can  be  speci-
fied by an additional regular expression and the left context can
be handled by so-called  start  states.  The  generated  scanners
automatically  compute the line and column position of the tokens
and offer an efficient mechanism  to  normalize  identifiers  and
keywords  to upper or lower case letters. The scanners are table-
driven and run at a speed of 180,000 to 195,000 lines per  minute
on a MC 68020 processor.

     Lalr is a LALR(1) parser generator accepting grammars  writ-
ten  in  extended BNF notation which may be augmented by semantic
actions expressed by statements of the target language. The  gen-
erator  provides  a  mechanism  for  S-attribution,  that is syn-
thesized attributes can be computed during parsing.  In  case  of
LR-conflicts  unlike  other tools Lalr provides not only informa-
tion about an internal state consisting of a set of items but  it
prints a derivation tree which is much more useful to analyze the
problem. Conflicts can be resolved by specifying  precedence  and
associativity of operators and productions. The generated parsers
include automatic  error  recovery,  error  messages,  and  error
repair.  The  parsers  are  table-driven  and  run  at a speed of
400,000 lines per minute. Currently parsers can be  generated  in
the target languages C and Modula-2.

     Ell is a LL(1) parser generator accepting the same  specifi-
cation  language  as  Lalr except that the grammars must obey the
LL(1) property. It  is  possible  to  evaluate  an  L-attribution
during parsing. The generated  parsers  include  automatic  error
recovery,  error  messages,  and  error  repair  like  Lalr.  The
parsers are implemented following the  recursive  descent  method
and  reach a speed of 750,000 lines per minute. The possible tar-
get languages are again C and Modula-2.

     All the tools are implemented in Modula-2.
A comparison of the above tools with the corresponding UNIX
tools shows that significant improvements in terms of error handling
as well as efficiency have been achieved:
Rex generated scanners are 4 times faster than those of LEX.
Lalr generated parsers are 2-3 times faster than those of YACC.
Ell generated parsers are 4 times faster than those of YACC.
The input languages of the tools are improvements of the LEX and YACC
inputs. The tools also understand LEX and YACC syntax with the help of
preprocessors.

The tools are publicly copyable. They have been developed in 1987.
They have been tested by generating scanners and parsers for
e. g. Pascal, Modula, Oberon, Ada and found stable.

The tools have been developed using our own Modula-2 compiler called
MOCKA on a MC 68020 based UNIX workstation. They have been ported
to the SUN workstation and been compiled successfully using the SUN
Modula-2 compiler. They also run on VAX/BSD UNIX and VAX/ULTRIX machines.
This should assure a reasonable level of portability.

I would be pleased to send the programs to everybody that is interested.
To minimize the costs (at least for me) I suggest to send a blank SUN
streamer tape. 1/2" magtapes are possible, too, but cause more work for me.
I will return the tape with the sources organized as a UNIX file tree
(tar format). The tape will contain user manuals for each tool.

Literature:

J. Grosch: "Generators for High-Speed Front-Ends",
LNCS 371, pp. 81-92, Berlin: Springer Verlag, 1989.

More papers about Rex and Lalr will be published soon.

PS: We have completed first versions of more tools:
Ast handles abstract syntax trees or attributed trees and graphs
(something like IDL).
Ag generates attribute evaluators for ordered attribute grammars (OAGs).


Josef Grosch
GMD - Forschungsstelle an der Universitaet Karlsruhe
Haid-und-Neu-Str. 7
D-7500 Karlsruhe 1
West Germany

Phone: +721-662226
Email: grosch@gmdka.UUCP

-- 
Send compilers articles to compilers@esegue.segue.boston.ma.us
{spdcc | ima | lotus}!esegue.  Meta-mail to compilers-request@esegue.
Please send responses to the author of the message, not the poster.

fjb@metaware.metaware.com (Fred Bourgeois) (11/12/89)

In article <1989Nov5.235947.2907@esegue.segue.boston.ma.us> you write:
> Here's the real question: are there better parser generators than
> yacc and lex which are easily available on Unix systems, ie,
> available on a broad range of systems, robust, and either free or
> not too expensive.  ...

Might I suggest the TWS (Translator Writing System) from MetaWare.
The error handling is better than anything you will find in YACC
based systems.  The parser generated is also efficient.  Further
information can be found in various texts on compilers, including
Barret & Couch.  It is, however, not free (or cheap :-).  You can
also write or call MetaWare for more details.

Fred Bourgeois       [Insert standard disclaimers here]         MetaWare, Inc.
metaware!fjb@uunet.uu.net               |                 2161 Delaware Avenue
...!{acad,amdcad}!metaware!fjb          |            Santa Cruz, CA 95060-5706
-- 
Send compilers articles to compilers@esegue.segue.boston.ma.us
{spdcc | ima | lotus}!esegue.  Meta-mail to compilers-request@esegue.
Please send responses to the author of the message, not the poster.

daven@ibmpcug.co.uk (D R Newman) (11/12/89)

Of course there are better parsers, especially the Declarative Clause
Grammars in Prolog. If you use a compiled Prolog, like LPA, you can even
use the parser with other languages.

OTOH, if you are only parsing a restricted language, not a natural one like
English, you don't need something so powerful - a LISP augmented transition
network will do.
[From D R Newman <daven@ibmpcug.co.uk>]
-- 
Send compilers articles to compilers@esegue.segue.boston.ma.us
{spdcc | ima | lotus}!esegue.  Meta-mail to compilers-request@esegue.
Please send responses to the author of the message, not the poster.