[comp.compilers] compiler generators.

VMDOS@TECMTYVM.MTY.ITESM.MX (Ing. Pablo Tejeda Zeron) (09/12/90)

   I would like to know whether any public domain compiler generators
   exist for DOS, UNIX or OS/2. Could you help me?
   I'm using YACC to generate the parser, but I also want to generate
   intermediate code or target code.

Thank you.
Pablo Tejeda.
-- 
Send compilers articles to compilers@esegue.segue.boston.ma.us
{ima | spdcc | world}!esegue.  Meta-mail to compilers-request@esegue.

mike@vlsivie.at (Inst.f.Techn.Informatik) (09/25/90)

In article <90255.105510VMDOS@tecmtyvm.mty.itesm.mx>, VMDOS@TECMTYVM.MTY.ITESM.MX (Ing. Pablo Tejeda Zeron) writes:
> 
>    I would like to know whether any public domain compiler generators
>    exist for DOS, UNIX or OS/2. Could you help me?
>    I'm using YACC to generate the parser, but I also want to generate
>    intermediate code or target code.

Several approaches are possible. The more conventional one is a code
generator generator, which helps in writing (portable) back ends. One such
beast is the GNU C compiler (gcc). It has been successfully used for a C++
compiler, and front ends for Modula-[23] and Fortran are currently being
written. But this still requires _you_ to generate the intermediate code
(RTL) from which gcc works.
 
The other approach is based on high-level semantics. Peter Lee wrote
one such generator, MESS. This approach is still in an experimental state,
so you will have difficulty finding systems which are in production
use.

			bye,
				mike

Michael K. Gschwind, Institute for VLSI-Design, Technical University, Vienna
mike@vlsivie.at
mike@vlsivie.uucp
e182202@awituw01.bitnet
Voice: (++43).1.58801 8144
Fax:   (++43).1.569697

mike@thor.acc.stolaf.edu (Mike Haertel) (10/01/90)

In article <1852@tuvie> mike@vlsivie.at (Inst.f.Techn.Informatik) writes:
>Several approaches are possible. The more conventional one is a code
>generator generator, which helps in writing (portable) back ends. One such
>beast is the GNU C compiler (gcc). It has been successfully used for a C++
>compiler, and front ends for Modula-[23] and Fortran are currently being
>written. But this still requires _you_ to generate the intermediate code
>(RTL) from which gcc works.

I beg to differ, but the bulk of the RTL generating pass of gcc is
language-independent and takes a tree representation as its input.

The tree representation is largely language-independent.  Writing
a new language front end for gcc involves writing a parser and type
checker since the RTL generating pass assumes its input trees are
completely type checked.  There may be subtle assumptions made about
the details of the input trees to RTL generation, but I can't say for
sure.
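
As an illustration of the split described here, the following is a minimal
sketch in Python, not gcc's actual tree or RTL format. The node classes
(Const, Var, BinOp), the `type` annotation, and the three-address output
are all assumptions made up for the example. The point is only that the
lowering pass trusts the type annotations the front end has already
attached and knows nothing about the source language.

```python
# Illustrative only: a toy typed tree and a toy linear IR.
# None of these names come from gcc; they are assumptions for the sketch.
from dataclasses import dataclass
from itertools import count


@dataclass
class Const:
    value: int
    type: str = "int"


@dataclass
class Var:
    name: str
    type: str = "int"


@dataclass
class BinOp:
    op: str
    left: object
    right: object
    type: str = "int"   # filled in by the (language-specific) type checker


def lower(node, code, temps):
    """Append three-address instructions for `node`; return the operand holding its value."""
    if isinstance(node, Const):
        return str(node.value)
    if isinstance(node, Var):
        return node.name
    if isinstance(node, BinOp):
        # The lowering pass trusts the annotations; it does no checking itself.
        left = lower(node.left, code, temps)
        right = lower(node.right, code, temps)
        dest = f"t{next(temps)}"
        code.append(f"{dest} = {left} {node.op} {right}  ; {node.type}")
        return dest
    raise TypeError(f"unknown tree node: {node!r}")


def lower_expr(tree):
    """Lower one expression tree; return (instruction list, result operand)."""
    code = []
    result = lower(tree, code, count(1))
    return code, result


# a + b * 2, as a front end for any language might build it
tree = BinOp("+", Var("a"), BinOp("*", Var("b"), Const(2)))
code, result = lower_expr(tree)
print("\n".join(code))
```

A front end for a different language would only have to build the same
kind of annotated tree; `lower` itself would not change.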
--
Mike Haertel <mike@acc.stolaf.edu>
[My impression is that the tree routines would need some work for languages
that aren't semantically very similar to C, but I haven't looked very hard.
 -John]

mike@vlsivie.at (Inst.f.Techn.Informatik) (10/03/90)

In article <1990Oct1.044328.8051@acc.stolaf.edu>, mike@thor.acc.stolaf.edu (Mike Haertel) writes:
> I beg to differ, but the bulk of the RTL generating pass of gcc is
> language-independent and takes a tree representation as its input.

Yes, this is what I remember also. But this leaves you with having to
generate the tree. As you have pointed out, you have to type-check the
tree; there are also things like writing symbol tables and all this
really _boring_ stuff.  The truth is, you don't have to generate RTL,
only a tree representation which can then be translated to RTL.
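
To make the "boring stuff" concrete, here is a minimal sketch in Python of
a scoped symbol table and a bottom-up type check over a tree. It is a toy
under stated assumptions: the tuple encoding of tree nodes, the type names
"int" and "real", and the rule that both operands must have the same type
are invented for the example and are not how gcc represents or checks
anything.

```python
# Toy symbol table plus a bottom-up type check over a tuple-encoded tree.
# Node shapes (assumed for this sketch):
#   ("const", value)   ("var", name)   (op, left, right) with op in + - *


class SymbolTable:
    """One scope; `parent` links to the enclosing scope, if any."""

    def __init__(self, parent=None):
        self.parent = parent
        self.names = {}

    def declare(self, name, type_):
        if name in self.names:
            raise NameError(f"{name} redeclared in the same scope")
        self.names[name] = type_

    def lookup(self, name):
        # Search this scope, then each enclosing scope in turn.
        scope = self
        while scope is not None:
            if name in scope.names:
                return scope.names[name]
            scope = scope.parent
        raise NameError(f"{name} is not declared")


def check(node, scope):
    """Return the type of `node`, or raise if the tree is ill-typed."""
    kind = node[0]
    if kind == "const":
        return "real" if isinstance(node[1], float) else "int"
    if kind == "var":
        return scope.lookup(node[1])
    if kind in ("+", "-", "*"):
        left, right = check(node[1], scope), check(node[2], scope)
        if left != right:
            raise TypeError(f"operands of {kind} differ: {left} vs {right}")
        return left
    raise ValueError(f"unknown node kind: {kind}")


outer = SymbolTable()
outer.declare("n", "int")
inner = SymbolTable(parent=outer)
inner.declare("x", "real")

print(check(("+", ("var", "n"), ("const", 1)), inner))
```

A language with implicit coercions would insert a conversion node in
`check` instead of raising; that choice belongs to the front end, not to
the later translation to RTL.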

> [My impression is that the tree routines would need some work for languages
> that aren't semantically very similar to C, but I haven't looked very hard.
>  -John]

That is my impression too. In fact, Michael Tiemann made hacks to
support C++. But maybe we can get some information from somebody who is
_writing_ a new front end?
It also seems to be necessary to hack some parts of gcc proper to
accommodate statically nested procedures and other things unknown in
C/C++.

Michael K. Gschwind, Institute for VLSI-Design, Technical University, Vienna
mike@vlsivie.at
mike@vlsivie.uucp
e182202@awituw01.bitnet
Voice: (++43).1.58801 8144
Fax:   (++43).1.569697

moss@cs.umass.edu (Eliot Moss) (10/03/90)

I agree with Mike's comments, for the most part. We are doing the Modula-3
compiler based on gcc, and we did need a new parser and type checker, but can
use most of the existing tree stuff. Since Modula-3 has constructs not
expressible in C, a few modest extensions were required to the tree format,
but not many. We will also need some back end extensions. These are required
mostly for exception handling and garbage collection support. Some of it will
be in gcc 2.0, since the extended C for that includes nested functions (which
Pascal and Modula-* need for implementing their nested procedures) and
exception handling (though Stallman may implement the exception handling for C
differently from the way we have done for Modula-3). Further, the extensions
we have made will not disturb the C compiler, so they will (after appropriate
blessing by Stallman and associates) be incorporated into the base compiler
(the .h files, etc.). Hope this clarifies the effort involved in getting new
languages into the gcc framework ....			Eliot
--

		J. Eliot B. Moss, Assistant Professor
		Department of Computer and Information Science
		Lederle Graduate Research Center
		University of Massachusetts
		Amherst, MA  01003
		(413) 545-4206; Moss@cs.umass.edu

moss@cs.umass.edu (Eliot Moss) (10/04/90)

In case my previous posting did not make it clear, I agree with the
statement that the amount of work needed on the tree part of gcc to
incorporate a "new" programming language depends on its semantic
similarity to C. Note, though, that what really matters is whether the
"new" language has similar fundamental data types and control structures. Type
checking rules, coercions, etc., can vary quite a bit without affecting the
tree work that much. Another way of putting it is that the approach can be
expected to work well for imperative languages in the C, Pascal, Modula, Ada,
etc., tradition.

In handling Modula-3 the most difficult things appear to be exception handling
(not too bad), garbage collection (more difficult), and use before declaration
(which means you need to process entire modules before resolving definitions).
This latter item does not affect the tree data structure all that much, but
rather the front-end control structure, which must build an entire tree,
resolve definitions, and then generate RTL, rather than generating (and
discarding) tree stuff one statement at a time. (There are other ways of doing
the two passes, but the essential difference is between a one-pass and
two-pass front-end; the back end does many "passes" of course, but on smaller
collections of RTL, and each pass is a specialist.)
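
The one-pass versus two-pass difference can be sketched in a few lines of
Python. This is a toy under stated assumptions, not the Modula-3 front
end: a "module" is just a list of hypothetical ("proc", name, callees)
tuples, and "generating code" means appending a string. The one-pass
version resolves each call as it is seen and so rejects a forward
reference; the two-pass version collects every declaration before
resolving anything.

```python
# Toy model of use before declaration.
# module = [("proc", name, [names of procedures it calls]), ...]


def one_pass(module):
    """Resolve each call as soon as it is seen; fails on forward references."""
    known, out = set(), []
    for _, name, calls in module:
        known.add(name)
        for callee in calls:
            if callee not in known:
                raise NameError(f"{name} uses {callee} before its declaration")
            out.append(f"call {callee} from {name}")
    return out


def two_pass(module):
    """Collect all declarations first, then resolve and emit."""
    known = {name for _, name, _ in module}   # pass 1: see the whole module
    out = []
    for _, name, calls in module:             # pass 2: resolve, then generate
        for callee in calls:
            if callee not in known:
                raise NameError(f"{name} uses undeclared {callee}")
            out.append(f"call {callee} from {name}")
    return out


# `main` calls `helper`, which is declared later in the module.
module = [("proc", "main", ["helper"]), ("proc", "helper", [])]

print(two_pass(module))
try:
    one_pass(module)
except NameError as err:
    print("one-pass front end:", err)
```

The cost of the second form is that the whole module must be held in
memory before any code is emitted, which is the control-structure change
described above.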

Hope this clarifies things a bit more. I think that the gcc base could readily
be used to build (for example) an Ada compiler, but it is probably not a good
starting point for (say) Lisp or Smalltalk ....			Eliot Moss
--

		J. Eliot B. Moss, Assistant Professor
		Department of Computer and Information Science
		Lederle Graduate Research Center
		University of Massachusetts
		Amherst, MA  01003
		(413) 545-4206; Moss@cs.umass.edu