[comp.lang.misc] Parsers using OOP . What objects to use ? Examples ?

sshws@convx1.lerc.nasa.gov (Herb Schilling) (06/12/91)

This is a subject that has been talked about in other forums but I
just recently "re-discovered" this newsgroup so I thought I would ask
about it here. 

We would like to implement a parser using Turbo Pascal. It seems that 
using OOP concepts could naturally be applied to the problem. It 
would help with developing, debugging and extending the code. Has anybody 
out there done or thought about writing a parser using
objects ? I would be interested in reading about the structure of the objects. 
Examples, of course, would be even better.

By the way, I saw the article in the latest Computer Language magazine.
This was good and I called the author to get more information but I am
interested more in the parsing of something like complex mathematical 
expressions than the compilation of assembler statements.

--
Herb Schilling , NASA Lewis Research Center , 21000 Brookpark Road,
Cleveland, Ohio 44135 . (216) 433-5058       sshws@convx1.lerc.nasa.gov

sra@ecs.soton.ac.uk (Stephen Adams) (06/13/91)

In article <1991Jun11.185402.21089@eagle.lerc.nasa.gov> sshws@convx1.lerc.nasa.gov (Herb Schilling) writes:

 > We would like to implement a parser using Turbo Pascal. It seems that 
 > using OOP concepts could naturally be applied to the problem. It 
 > would help with developing, debugging and extending the code. Has anybody 
 > out there done or thought about writing a parser using
 > objects ? I would be interested in reading about the structure of the objects. 

I know of two references that would interest you.  The first
deals with writing a parser in a way that makes your typical
huge recursive descent parser become a collection of
modules.  The second is more general, discussing further
issues of modularity in a compiler.  I would re-read both if
I had to write a language processor in Turbo Pascal /
Modula-2 / C++ etc.


@ARTICLE{ko:lazy,
  AUTHOR	= "Kai Koskimies",
  TITLE		= "Lazy Recursive Descent Parsing for
		   Modular Language Implementation",
  JOURNAL	= "{Software---Practice and Experience}",
  YEAR		= 1990,
  MONTH		= aug,
  VOLUME	= 20,
  NUMBER	= 8,
  PAGES		= "749-772",
}

@INPROCEEDINGS{ko:soft,
  AUTHOR	= "Kai Koskimies",
  TITLE		= "Software Engineering Aspects in Language Implementation",
  CROSSREF	= "CCHSC88",
  PAGES		= "39-51",
}
@PROCEEDINGS{CCHSC88,
  KEY		= "CCHSC88",
  EDITOR	= "D. Hammer",
  TITLE		= "Compiler Compilers and High Speed Compilation",
  BOOKTITLE	= "Compiler Compilers and High Speed Compilation",
  YEAR		= 1988,
  MONTH		= oct,
  SERIES	= "Lecture Notes in Computer Science",
  VOLUME	= 371,
  CITY		= "Berlin",
  PUBLISHER	= S-V,
}

--
Stephen Adams                        S.R.Adams@ecs.soton.ac.uk
Computer Science
University of Southampton
Southampton SO9 5NH, UK

paj@mrcu (Paul Johnson) (06/14/91)

In article <1991Jun11.185402.21089@eagle.lerc.nasa.gov> sshws@convx1.lerc.nasa.gov (Herb Schilling) writes:

>We would like to implement a parser using Turbo Pascal. It seems that
>using OOP concepts could naturally be applied to the problem.  [...]
>Has anybody out there done or thought about writing a parser using
>objects ?

Yup.  Look at the Eiffel libraries.  Briefly:

The Eiffel library lexer is a lex(1) style tool which will either
build the lexer on the fly or load an existing one in from a file
(Eiffel makes storing and reloading complex objects a trivial matter).
The parser library provides a base class CONSTRUCT and derived classes
SEQUENCE (for sequences of 0-n other constructs), AGGREGATE (for
grammer productions of the form A ::= BCD), CHOICE (for productions of
the form A ::= B|C|D), TERMINAL for lexical tokens and KEYWORD for
keywords.  The SEQUENCE, AGGREGATE and CHOICE classes are the
interesting ones.  They store the current lexer position (for later
backtracking) and then read the sequence of lexer tokens and try to
match them.  If they succeed they set a flag and return.  If they fail
they clear the flag, tell the lexer to back up and then return.  To
use these classes you derive your own which contain functions to
generate a list of the expected objects.

Of course there is a bit more than this.  There is a procedure
`commit' which is used for syntax error detection (meaningful syntax
error messages are generated automatically).  CONSTRUCT also contains
deferred functions `preaction' and `postaction' which get called once
the parsing has been done.  You put your semantic actions in them.

Paul.


-- 
Paul Johnson |            Isn't modern education wonderful: one size fits all!
-------------^------------------v-------------------------v-------------------
GEC-Marconi Research is not 	| Telex: 995016 GECRES G  | Tel: +44 245 73331
responsible for my opinions.	| Inet: paj@gec-mrc.co.uk | Fax: +44 245 75244