[comp.lang.modula2] Modula-2 to C

iwm@doc.ic.ac.uk (Ian Moor) (06/18/91)

GMD have produced a set of compiler-writing tools in Modula-2, which
include a Modula-2 to C translator. The distribution includes C sources
produced by the translator for bootstrap purposes. I think I remember
the documentation advising against using it for program development,
i.e. it will translate only correct code.

For UK readers, the tools are on the doc.ic archive. Here is the README
file from the distribution:

*************************************************************************
*                                                                       *
*  Compiler Construction Tool Box                                       *
*  ==============================                                       *
*                                                                       *
*  Copyright (c) 1989 by                                                *
*                                                                       *
*  Gesellschaft fuer Mathematik und Datenverarbeitung                   *
*  (German National Research Center for Computer Science)               *
*  Forschungsstelle fuer Programmstrukturen                             *
*  an der Universitaet Karlsruhe                                        *
*                                                                       *
*  All rights reserved. GMD assumes no responsibility for the use       *
*  or reliability of its software.                                      *
*                                                                       *
*************************************************************************


Direct requests, comments, questions, and error reports to:

   Josef Grosch
   GMD Forschungsstelle
   Vincenz-Priessnitz-Str. 1
   D-7500 Karlsruhe 1

   Phone: +721-662226
   Email: grosch@gmdka.uucp   or   uunet!unido!gmdka!grosch


Distribution Format:

The compiler construction tool box is distributed on a DC300A data cartridge
(streamer tape) or on a 1/2" magnetic tape (1600 bpi) in tar format.

To read the tape use:   tar -xvfb /dev/rst0 20
                        tar -xvb 20
                        or similar commands

The directories and their contents are as follows:

directory       contents
--------------------------------------------------
README          this file
Makefile        sketches the compilation and installation of the tools
rex             Scanner Generator
lalr            LALR(1) Parser Generator
ell             Recursive Descent Parser Generator
bnf             Transforms Grammars from Extended BNF to Plain BNF
struct          Common Front-End of Lalr, Ell, and Bnf
reuse           Library of Reusable Modules
specs           Example Specifications for the Above Tools
cg              Common Program implementing Ast and Ag
                Ast = Generator for Abstract Syntax Trees
                Ag  = Attribute Evaluator Generator
l2r             Transforms Lex  input to Rex  input
y2l             Transforms Yacc input to Lalr input
hexa            contains the scanner and parser tables of Rex and Struct
                (= front-end of Lalr, Ell, and Bnf) converted from binary to
                ascii hexadecimal representation
bin             shell scripts (my version)
lib             executables, table and data files (for SUN 3/SunOS 4.0)
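
The README does not describe the file layout used in the hexa directory.
As a rough illustration of what "converted from binary to ascii hexadecimal
representation" can mean, here is a minimal C sketch that encodes a byte
table as hex digits and decodes it again. The function names to_hex and
from_hex are hypothetical and are not part of the tool box.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Encode n bytes as 2*n lower-case hex digits plus a terminating '\0'.
   The caller must provide room for 2*n + 1 characters in out. */
void to_hex(const unsigned char *in, size_t n, char *out) {
    static const char digits[] = "0123456789abcdef";
    size_t i;
    assert(in != NULL && out != NULL);
    for (i = 0; i < n; i++) {
        out[2 * i]     = digits[in[i] >> 4];
        out[2 * i + 1] = digits[in[i] & 0x0f];
    }
    out[2 * n] = '\0';
}

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Decode a hex string into at most cap bytes.
   Returns the number of bytes written, or -1 on malformed input. */
long from_hex(const char *in, unsigned char *out, size_t cap) {
    size_t len, i;
    assert(in != NULL && out != NULL);
    len = strlen(in);
    if (len % 2 != 0 || len / 2 > cap) return -1;
    for (i = 0; i < len / 2; i++) {
        int hi = hex_value(in[2 * i]);
        int lo = hex_value(in[2 * i + 1]);
        if (hi < 0 || lo < 0) return -1;
        out[i] = (unsigned char)(hi * 16 + lo);
    }
    return (long)(len / 2);
}
```

A text encoding of this kind contains only printable characters, so the
tables can be moved between machines like any other source file.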


The names of the subdirectories indicate the following types of information:

sub directory   contents
--------------------------------------------------
doc             documentation, user's manual, manual pages
		suffix me	: roff format, me macros
		suffix ps	: postscript format
		suffix 1	: roff format, man macros
src             source files in Modula-2
mod             source files in Modula-2
m2c             source files in C (generated from the Modula-2 sources)
c               source files in C (hand-written)
lib             data files, module skeletons
test            test environment for a tool


Installation:

To compile and install the programs visit the directories listed below,
look at the README file, and execute an appropriate make command.
See also the Makefile at the global level.

Source Language Modula-2:

   reuse/c
   reuse/mod
   rex/src
   struct/src
   bnf/src
   ell/src
   lalr/src
   struct/doc
   l2r
   y2l
   cg/src

Source Language C:

   reuse/c
   reuse/m2c
   rex/m2c
   struct/m2c
   bnf/m2c
   ell/m2c
   lalr/m2c
   struct/doc
   l2r
   y2l
   cg/m2c



              Compiler Construction Tool Box
              ==============================

     Rex (Regular EXpression tool) is a scanner generator whose
specifications are based on regular expressions and arbitrary
semantic actions written in one of the target languages C or
Modula-2. As scanners sometimes have to consider the context to
recognize a token unambiguously, the right context can be specified
by an additional regular expression and the left context can be
handled by so-called start states. The generated scanners
automatically compute the line and column position of the tokens
and offer an efficient mechanism to normalize identifiers and
keywords to upper or lower case letters. The scanners are
table-driven and run at a speed of 180,000 to 195,000 lines per
minute on an MC 68020 processor.
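
To make the position tracking and case normalization concrete, here is a
small hand-written scanner in C. It is only a sketch of those two features:
it is not table-driven, it is not output of Rex, and the names Token,
Scanner, scanner_init and next_token are assumptions made for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

enum { TOK_EOF, TOK_IDENT, TOK_NUMBER, TOK_OTHER };

typedef struct {
    int kind;
    int line, column;   /* 1-based position of the token's first character */
    char text[32];      /* token text; identifiers are folded to upper case */
} Token;

typedef struct {
    const char *p;      /* next unread character */
    int line, column;   /* position of *p */
} Scanner;

void scanner_init(Scanner *s, const char *input) {
    assert(s != NULL && input != NULL);
    s->p = input;
    s->line = 1;
    s->column = 1;
}

/* Consume one character and keep the position counters up to date. */
static int advance(Scanner *s) {
    int c = (unsigned char)*s->p++;
    if (c == '\n') { s->line++; s->column = 1; }
    else           { s->column++; }
    return c;
}

Token next_token(Scanner *s) {
    Token t = {0};
    size_t n = 0;
    while (*s->p != '\0' && isspace((unsigned char)*s->p)) advance(s);
    t.line = s->line;
    t.column = s->column;
    if (*s->p == '\0') { t.kind = TOK_EOF; return t; }
    if (isalpha((unsigned char)*s->p)) {
        t.kind = TOK_IDENT;
        while (isalnum((unsigned char)*s->p)) {
            int c = advance(s);
            /* normalize while copying; overlong names are truncated */
            if (n + 1 < sizeof t.text) t.text[n++] = (char)toupper(c);
        }
    } else if (isdigit((unsigned char)*s->p)) {
        t.kind = TOK_NUMBER;
        while (isdigit((unsigned char)*s->p)) {
            int c = advance(s);
            if (n + 1 < sizeof t.text) t.text[n++] = (char)c;
        }
    } else {
        t.kind = TOK_OTHER;
        t.text[n++] = (char)advance(s);
    }
    t.text[n] = '\0';
    return t;
}
```

Folding the case once in the scanner means that later keyword and symbol
table lookups can use a plain strcmp.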

     Lalr is an LALR(1) parser generator accepting grammars written
in extended BNF notation, which may be augmented by semantic
actions expressed by statements of the target language. The
generator provides a mechanism for S-attribution, that is,
synthesized attributes can be computed during parsing. In case of
LR conflicts, unlike other tools, Lalr provides not only
information about an internal state consisting of a set of items
but also prints a derivation tree, which is much more useful for
analyzing the problem. Conflicts can be resolved by specifying
precedence and associativity of operators and productions. The
generated parsers include automatic error recovery, error
messages, and error repair. The parsers are table-driven and run
at a speed of 560,000 lines per minute. Currently parsers can be
generated in the target languages C and Modula-2.
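
Two of the ideas above can be shown in a few lines of C: a value stack that
carries synthesized attributes through reductions, and the use of precedence
and associativity to choose between shifting and reducing. The sketch below
is a hand-written operator-precedence evaluator, not an LALR(1) automaton
and not code generated by Lalr. The operator set, the '^' power operator and
the name eval are assumptions made for this example, and whether Lalr
applies exactly this rule is not stated in the README.

```c
#include <assert.h>
#include <ctype.h>

typedef struct { int prec; int right_assoc; } OpInfo;

/* Precedence table; prec 0 means "not an operator". */
static OpInfo op_info(char op) {
    OpInfo i = {0, 0};
    switch (op) {
    case '+': case '-': i.prec = 1; break;
    case '*': case '/': i.prec = 2; break;
    case '^':           i.prec = 3; i.right_assoc = 1; break;
    }
    return i;
}

/* Semantic action of a reduction: compute the synthesized value. */
static long apply(char op, long l, long r) {
    long res = 1;
    switch (op) {
    case '+': return l + r;
    case '-': return l - r;
    case '*': return l * r;
    case '/': assert(r != 0); return l / r;
    default:  while (r-- > 0) res *= l; return res;   /* '^' */
    }
}

#define MAX_DEPTH 64

/* Evaluate a well-formed expression of non-negative integers. */
long eval(const char *s) {
    long vals[MAX_DEPTH];   /* value stack: one synthesized attribute each */
    char ops[MAX_DEPTH];    /* operators that were shifted, not yet reduced */
    int nv = 0, no = 0;
    for (;;) {
        long v = 0;
        char look;
        while (*s == ' ') s++;
        assert(isdigit((unsigned char)*s) && nv < MAX_DEPTH);
        while (isdigit((unsigned char)*s)) v = v * 10 + (*s++ - '0');
        vals[nv++] = v;                       /* shift an operand */
        while (*s == ' ') s++;
        look = *s;                            /* lookahead: operator or end */
        while (no > 0) {
            OpInfo top = op_info(ops[no - 1]);
            OpInfo la = op_info(look);
            /* reduce if the stacked operator binds tighter, or equally
               tight and left-associative; otherwise shift */
            int reduce = look == '\0' || top.prec > la.prec ||
                         (top.prec == la.prec && !top.right_assoc);
            long l, r;
            if (!reduce) break;
            r = vals[--nv];
            l = vals[--nv];
            vals[nv++] = apply(ops[--no], l, r);
        }
        if (look == '\0') break;
        assert(op_info(look).prec > 0 && no < MAX_DEPTH);
        ops[no++] = look;                     /* shift the operator */
        s++;
    }
    assert(nv == 1 && no == 0);
    return vals[0];
}
```

With these tables "10-4-3" reduces the first subtraction before shifting the
second, while "2^3^2" keeps shifting and reduces from the right.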

     Ell is an LL(1) parser generator accepting the same
specification language as Lalr, except that the grammars must obey
the LL(1) property. It is possible to evaluate an L-attribution
during parsing. Like those of Lalr, the generated parsers include
automatic error recovery, error messages, and error repair. The
parsers are implemented following the recursive descent method
and reach a speed of 810,000 lines per minute. The possible
target languages are again C and Modula-2.
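
The following hand-written C sketch shows what recursive descent with an
L-attribution looks like. It is not output of Ell. The grammar is an
assumption chosen for this example:

   expr       -> term rest(term.val)
   rest(inh)  -> '+' term rest(inh + term.val)
               | '-' term rest(inh - term.val)
               | (empty, result is inh)
   term       -> number | '(' expr ')'

The inherited attribute inh is passed as a parameter from left to right, and
each procedure returns its synthesized value. Errors are only counted here,
which is far simpler than the recovery and repair described above.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

typedef struct {
    const char *p;   /* current position; *p is the one-character lookahead */
    int errors;      /* number of syntax errors seen */
} Parser;

static void skip_blanks(Parser *ps) {
    while (*ps->p == ' ') ps->p++;
}

static long parse_expr(Parser *ps);

/* term -> number | '(' expr ')' */
static long parse_term(Parser *ps) {
    long v = 0;
    skip_blanks(ps);
    if (*ps->p == '(') {
        ps->p++;
        v = parse_expr(ps);
        skip_blanks(ps);
        if (*ps->p == ')') ps->p++;
        else ps->errors++;
    } else if (isdigit((unsigned char)*ps->p)) {
        while (isdigit((unsigned char)*ps->p)) v = v * 10 + (*ps->p++ - '0');
    } else {
        ps->errors++;
    }
    return v;
}

/* rest(inh): the lookahead alone selects the alternative (LL(1)). */
static long parse_rest(Parser *ps, long inh) {
    long operand;
    skip_blanks(ps);
    if (*ps->p == '+') {
        ps->p++;
        operand = parse_term(ps);
        return parse_rest(ps, inh + operand);
    }
    if (*ps->p == '-') {
        ps->p++;
        operand = parse_term(ps);
        return parse_rest(ps, inh - operand);
    }
    return inh;   /* empty alternative: the inherited value is the result */
}

/* expr -> term rest(term.val) */
static long parse_expr(Parser *ps) {
    long first = parse_term(ps);
    return parse_rest(ps, first);
}

/* Parse the whole text; *errors receives the number of syntax errors. */
long evaluate(const char *text, int *errors) {
    Parser ps;
    long v;
    assert(text != NULL && errors != NULL);
    ps.p = text;
    ps.errors = 0;
    v = parse_expr(&ps);
    skip_blanks(&ps);
    if (*ps.p != '\0') ps.errors++;   /* trailing input */
    *errors = ps.errors;
    return v;
}
```

Passing the running value down through rest keeps subtraction
left-associative even though the grammar is right-recursive.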


Ast - A Generator for Abstract Syntax Trees

- generates abstract data types (program modules) to handle trees
- the trees may be attributed
- besides trees graphs are handled as well
- nodes may be associated with arbitrarily many attributes of arbitrary type
- specifications are based on extended context-free grammars
- common notation for concrete and abstract syntax
   as well as for attributed trees and graphs
- an extension mechanism provides single inheritance
- trees are stored as linked records
- generates efficient program modules
- generates modules in Modula-2 or C
- provides many tree operations (procedures):
   - node constructors combine aggregate notation and storage management
   - ascii graph reader and writer
   - binary graph reader and writer
   - reversal of lists
   - top down and bottom up traversal
   - interactive graph browser
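
As an illustration of trees stored as linked records, here is a small C
sketch with node constructors, list reversal, and both traversal orders.
It is written by hand and does not reproduce the modules Ast generates; the
node kinds and the names mk_num, mk_add, mk_list, reverse_list, count_nodes
and free_tree are assumptions made for this example.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { NODE_NUM, NODE_ADD, NODE_LIST } NodeKind;

typedef struct Node Node;
struct Node {
    NodeKind kind;
    union {
        long value;                          /* NODE_NUM */
        struct { Node *left, *right; } add;  /* NODE_ADD */
        struct { Node *head, *next; } list;  /* NODE_LIST, next == NULL ends it */
    } u;
};

/* Storage management is hidden inside the constructors. */
static Node *new_node(NodeKind kind) {
    Node *n = malloc(sizeof *n);
    assert(n != NULL);   /* sketch only: abort if memory runs out */
    n->kind = kind;
    return n;
}

Node *mk_num(long value) {
    Node *n = new_node(NODE_NUM);
    n->u.value = value;
    return n;
}

Node *mk_add(Node *left, Node *right) {
    Node *n = new_node(NODE_ADD);
    n->u.add.left = left;
    n->u.add.right = right;
    return n;
}

Node *mk_list(Node *head, Node *next) {
    Node *n = new_node(NODE_LIST);
    n->u.list.head = head;
    n->u.list.next = next;
    return n;
}

/* Reverse a chain of NODE_LIST cells in place; returns the new first cell. */
Node *reverse_list(Node *list) {
    Node *prev = NULL;
    while (list != NULL) {
        Node *next;
        assert(list->kind == NODE_LIST);
        next = list->u.list.next;
        list->u.list.next = prev;
        prev = list;
        list = next;
    }
    return prev;
}

/* Top-down traversal: visit a node, then its children. */
int count_nodes(const Node *n) {
    if (n == NULL) return 0;
    switch (n->kind) {
    case NODE_NUM:  return 1;
    case NODE_ADD:  return 1 + count_nodes(n->u.add.left)
                             + count_nodes(n->u.add.right);
    case NODE_LIST: return 1 + count_nodes(n->u.list.head)
                             + count_nodes(n->u.list.next);
    }
    return 0;
}

/* Bottom-up traversal: release the children before the node itself. */
void free_tree(Node *n) {
    if (n == NULL) return;
    if (n->kind == NODE_ADD) {
        free_tree(n->u.add.left);
        free_tree(n->u.add.right);
    } else if (n->kind == NODE_LIST) {
        free_tree(n->u.list.head);
        free_tree(n->u.list.next);
    }
    free(n);
}
```

Nested constructor calls such as mk_add(mk_num(1), mk_num(2)) read much like
an aggregate and allocate the records as a side effect. List reversal is
handy when a parser builds a list by prepending elements, which leaves them
in reverse order.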


Ag - An Attribute Evaluator Generator

- processes ordered attribute grammars (OAGs)
- processes higher order attribute grammars (HAGs)
- operates on abstract syntax
- is based on tree modules generated by Ast
- the tree structure is fully known
- terminals and nonterminals may have arbitrarily many attributes
- attributes can have any target language type
- allows tree-valued attributes
- differentiates input and output attributes
- allows attributes local to rules
- allows elimination of chain rules
- offers an extension mechanism (single inheritance)
- attributes are denoted by unique selector names
   instead of nonterminal names with subscripts
- attribute computations are expressed in the target language
- attribute computations are written in a functional style
- attribute computations can call external functions
- non-functional statements and side-effects are possible
- allows writing compact, modular, and readable specifications
- AGs can consist of several modules
- the context-free grammar is specified only once
- checks an AG for completeness of the attribute computations
- checks for unused attributes
- checks an AG for the classes SNC, DNC, OAG, LAG, and SAG
- the evaluators are directly coded using recursive procedures
- generates efficient evaluators
- generates evaluators in Modula-2 (or C)
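
To show what an evaluator "directly coded using recursive procedures" can
look like, here is a hand-written C sketch. It is not output of Ag, and the
tree type, the attribute names and the function eval_attributes are
assumptions made for this example. One inherited attribute (depth) is set on
the way down, and two synthesized attributes (value, height) are computed on
the way back up. A generator like Ag derives the visit order from the
attribute dependencies; here the order is simply fixed by hand.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { T_NUM, T_ADD, T_MUL } Kind;

typedef struct Tree {
    Kind kind;
    long num;                    /* T_NUM: the literal */
    struct Tree *left, *right;   /* T_ADD, T_MUL: the operands */
    /* attributes, stored in the node */
    int depth;                   /* inherited: distance from the root */
    long value;                  /* synthesized: value of the subtree */
    int height;                  /* synthesized: height of the subtree */
} Tree;

/* One recursive visit per node: inherited attributes arrive as arguments
   and are stored before the children are visited; synthesized attributes
   are computed from the children afterwards. */
void eval_attributes(Tree *t, int depth) {
    int lh, rh;
    assert(t != NULL);
    t->depth = depth;
    if (t->kind == T_NUM) {
        t->value = t->num;
        t->height = 0;
        return;
    }
    assert(t->left != NULL && t->right != NULL);
    eval_attributes(t->left, depth + 1);
    eval_attributes(t->right, depth + 1);
    t->value = (t->kind == T_ADD) ? t->left->value + t->right->value
                                  : t->left->value * t->right->value;
    lh = t->left->height;
    rh = t->right->height;
    t->height = 1 + (lh > rh ? lh : rh);
}
```

Calling eval_attributes(root, 0) on the tree for 2 + 3 * 4 leaves 14 in the
root's value field and 2 in its height field.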


A comparison of the above tools with the corresponding UNIX
tools shows that significant improvements in terms of error handling
as well as efficiency have been achieved:
Rex-generated scanners are 4 times faster than those of LEX.
Lalr-generated parsers are 2-3 times faster than those of YACC.
Ell-generated parsers are 4 times faster than those of YACC.
The input languages of the tools are improvements of the LEX and YACC
inputs. The tools also understand LEX and YACC syntax with the help of
the preprocessors l2r and y2l.

The tool box is publicly copyable. It has been under development since 1987.
It has been tested by generating scanners and parsers for languages such as
Pascal, Modula, Oberon, and Ada, and has been found stable.

The tool box is implemented in Modula-2. It has been developed using our
own Modula-2 compiler called MOCKA on a MC 68020 based UNIX workstation.
It has been ported to the SUN workstation and been compiled successfully
using the SUN Modula-2 compiler. The tools also run on VAX/BSD UNIX and
VAX/ULTRIX machines. This should assure a reasonable level of portability
for the Modula-2 code. Meanwhile the sources exist in C, too.

I would be pleased to send the programs to anybody who is interested.
To minimize the costs (at least for me), I suggest sending a blank SUN
streamer tape. 1/2" magtapes are possible, too. I will return the tape
with the sources organized as a UNIX file tree (tar format).
The tape will contain user manuals (troff format) for each tool.

References:

1.   J. Grosch, `Generators for High-Speed Front-Ends', LNCS,
     371, 81-92 (Oct. 1988), Springer Verlag.

2.   W. M. Waite, J. Grosch and F. W. Schroeer, `Three Compiler
     Specifications', GMD-Studie Nr. 166, GMD Forschungsstelle an
     der Universitaet Karlsruhe, Aug. 1989.

3.   J. Grosch, `Efficient Generation of Lexical Analysers',
     Software-Practice & Experience, 19, 1089-1103 (Nov. 1989).
--
Ian W Moor
  Internet: iwm@doc.ic.ac.uk
  JANET: iwm@uk.ac.ic.doc
           
 Department of Computing,  That which you call a crime when one man does it,
 Imperial College.         you call government when done by many.
 180 Queensgate          
 London SW7 UK.          

david@oahu.cs.ucla.edu (David Dantowitz) (06/28/91)

The GMD tools work VERY well and have saved us an untold amount of translation time.

-- 
David Dantowitz
david@cs.ucla.edu

Singing Barbershop when I'm not computing...