[comp.lang.modula2] Modula2 front end for gcc?

tynor@pyr.gatech.EDU (Steve Tynor) (07/11/89)

Is anyone out there working on a Modula2 front end for gcc (Gnu cc)? As
much as I like gcc, I prefer a more strongly typed (fascist?) language
- especially one like Modula2 or Ada which has a strong distinction
between specification and implementation. I'd be happy with a Modula2
which omits coroutines (I can live with unix fork()). I'd volunteer to
do it myself, but it's not clear to me where to start (nor how much
work it'll be).  I'll gladly lend a hand to anyone already started or
contemplating starting...  Is anyone else out there interested?

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Never put off until tomorrow what you can avoid altogether.   
                     
    Steve Tynor
    Georgia Tech Research Institute
    tynor@gitpyr.gatech.edu

bowen@cs.Buffalo.EDU (Devon E Bowen) (07/13/89)

In article <8728@pyr.gatech.EDU> tynor@pyr.UUCP (Steve Tynor) writes:
>Is anyone out there working on a Modula2 front end for gcc (Gnu cc)?

There's work being done on this here. The compiler here is going to
lex and parse the modula-2 code and generate RTL. Then it'll be turning
the RTL over to the tail end of gcc for optimization and code generation.
So this isn't a modula-2 to C converter but rather a portable modula-2
compiler using the gcc techniques (and code).

As far as its characteristics, it will have all the features described
by Wirth (yes, even coroutines). It will also include some things wanted
by the locals here. Some of these things are:

	1) short type.
	2) variable number of parameters passed to a function.
	3) linking with object files from other langs given a
	   definition module for that object file (ie, will
	   link with C libraries).
	4) assertion function for program correctness.

Our department is big on modula-2, so the main drive behind this project
is for instructional use. This means we are keeping strictly to the specs
from Wirth and only making extensions to it that don't change the meaning
of the language. It also means the warning and error messages will attempt
to be instructional.

I really shouldn't even be talking about this at this stage since it's
only started within the last month, but it is being actively worked on
daily. The lexer and parser are done. Current work is with the symbol
tables and then code generation. We don't expect to have much for another
5 months or so. At that time we'll be looking for beta-testers. Look for
more info in the appropriate gnu newsgroup.


Devon Bowen (KA2NRC)		FAX:	   (716) 636-3464
University at Buffalo		BITNET:    bowen@sunybcs.BITNET
				Internet:  bowen@cs.Buffalo.EDU
UUCP: ...!{watmath,boulder,decvax,rutgers}!sunybcs!bowen

dcw@doc.ic.ac.uk (Duncan C White) (07/13/89)

In article <7894@cs.Buffalo.EDU> bowen@sunybcs.UUCP (Devon E Bowen) writes:
>In article <8728@pyr.gatech.EDU> tynor@pyr.UUCP (Steve Tynor) writes:
>>Is anyone out there working on a Modula2 front end for gcc (Gnu cc)?
>
>There's work being done on this here. The compiler here is going to
>lex and parse the modula-2 code and generate RTL. Then it'll be turning
>the RTL over to the tail end of gcc for optimization and code generation.
>So this isn't a modula-2 to C converter but rather a portable modula-2
>compiler using the gcc techniques (and code).

Sounds good... If anyone's interested, I have an MSc student, Peter Klein, who
has just finished a project I was supervising: a portable Modula-2 <--> C
translator.  Peter focused more on the conversion strategies than on the error-
checking, so its type-checking is a bit too generous at the moment.
It produces K&R (not ANSI) "C" code, with the following (very common)
extensions: it uses 'void', and passes structures on the stack to achieve the
effect of value-parameters.  (Arrays are wrapped up inside structures and
passed in the same way).
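
To illustrate the struct-wrapping trick, here is a minimal sketch (not taken
from Peter's translator) of how a Modula-2 value parameter of array type might
come out in C.  The names Vec, SumAndClear and ValueParamDemo are made up for
the example, and it uses ANSI prototypes for readability although the
translator itself emits K&R C.

```c
#include <assert.h>

/* Hypothetical translation of:  TYPE Vec = ARRAY [0..3] OF INTEGER;
   The array is wrapped in a struct so that C will copy it on a call. */
typedef struct { int a[4]; } Vec;

/* PROCEDURE SumAndClear(v: Vec): INTEGER;   (v is a value parameter) */
int SumAndClear(Vec v)
{
    int i, sum = 0;
    for (i = 0; i < 4; i++) {
        sum += v.a[i];
        v.a[i] = 0;              /* changes the callee's copy only */
    }
    return sum;
}

/* Small self-check: the caller's array is untouched after the call. */
void ValueParamDemo(void)
{
    Vec v = { { 1, 2, 3, 4 } };
    assert(SumAndClear(v) == 10);
    assert(v.a[0] == 1 && v.a[3] == 4);
}
```

A bare C array parameter would decay to a pointer, so the callee's writes
would leak back to the caller; the wrapper struct is what forces the copy.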

It does not support the following aspects of Modula-2:
o	more than one nested level of local modules, (which are sick :-)
o	coroutines: Peter had some thoughts on an implementation of this using
	setjmp() and longjmp()..

It has word-sized sets (for the moment... until I get around to changing this!)

It "de-localizes" any number of levels of nested procedures and a single level
of nested modules.
It does this by moving local types and constants outside (then resolves any
conflict of names), and passing "outer scope variables" to the "inner"
procedures as extra (var) parameters.
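
A rough sketch of what this de-localizing could look like for one nested
procedure.  The Modula-2 fragment in the comment, the names Outer and Add, and
the Outer_ prefix used to avoid name clashes are all assumptions made for
illustration, not output from the translator.

```c
#include <assert.h>
#include <stddef.h>

/* Modula-2 source being translated (hypothetical):

     PROCEDURE Outer(n: INTEGER): INTEGER;
       VAR total: INTEGER;
       PROCEDURE Add(k: INTEGER);
       BEGIN total := total + k END Add;
     BEGIN
       total := 0; Add(n); Add(n); RETURN total
     END Outer;
*/

/* Add, moved out to file scope.  The outer-scope variable "total" is
   passed in as an extra VAR parameter, i.e. by address. */
static void Outer_Add(int k, int *total)
{
    assert(total != NULL);
    *total = *total + k;
}

int Outer(int n)
{
    int total = 0;
    Outer_Add(n, &total);        /* was: Add(n) */
    Outer_Add(n, &total);
    return total;
}
```

With these definitions Outer(3) yields 6, the same result the nested version
would give.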

It's perfectly happy with all type definitions, including pointer types and
opaque types (those were a swine to translate, Peter found), and is equally
happy with open array parameters (implemented with a hidden "HIGH" parameter
which is what the HIGH(s) function uses).
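
The hidden "HIGH" parameter can be sketched as follows.  The procedure Sum and
the parameter name s_high are hypothetical; the sketch also passes the array
as a const pointer, which is a simplification that only holds when the body
never assigns to the open array.

```c
#include <assert.h>
#include <stddef.h>

/* PROCEDURE Sum(s: ARRAY OF INTEGER): INTEGER;
   The hidden parameter s_high carries HIGH(s), the last valid index,
   so the element count is s_high + 1. */
int Sum(const int *s, unsigned s_high)
{
    unsigned i;
    int total = 0;
    assert(s != NULL);
    for (i = 0; i <= s_high; i++)      /* FOR i := 0 TO HIGH(s) DO */
        total += s[i];
    return total;
}
```

A call such as Sum(a) on a three-element array a would then be emitted as
Sum(a, 2).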

It compiles a single definition or implementation/main module when requested,
translating a definition module into an include file, and an implementation/
main module into a .c file.  When processing imports, it re-parses each
definition module - rather than leaving a symbol file behind.
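
As an illustration of the definition-module-to-include-file split, here is a
single-file sketch with the two halves marked.  The post does not say how
opaque types are actually translated; the pointer-to-incomplete-struct idiom
below is the usual C approach and is an assumption, as are the module name
Stack and all of its procedures.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- would live in Stack.h, from DEFINITION MODULE Stack ---- */
typedef struct Stack_Rec *Stack;        /* TYPE Stack;  (opaque) */
Stack Stack_New(void);
void  Stack_Push(Stack s, int x);
int   Stack_Pop(Stack s);
void  Stack_Dispose(Stack s);

/* ---- would live in Stack.c, from IMPLEMENTATION MODULE Stack ---- */
struct Stack_Rec { int data[16]; int top; };

Stack Stack_New(void)
{
    Stack s = malloc(sizeof *s);
    if (s != NULL)
        s->top = 0;
    return s;                           /* NULL if allocation failed */
}

void Stack_Push(Stack s, int x)
{
    assert(s != NULL && s->top < 16);
    s->data[s->top++] = x;
}

int Stack_Pop(Stack s)
{
    assert(s != NULL && s->top > 0);
    return s->data[--s->top];
}

void Stack_Dispose(Stack s)
{
    free(s);
}
```

Clients that include only the header part can hold and pass a Stack but
cannot see its fields, which matches the Modula-2 visibility rule.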

Because it produces C, linking with routines written in C is simple: write a
definition module specifying the procedures you will write in C, create a
dummy implementation module with BEGIN ENDs for all procedures, compile both
parts, and finally fill in all the function bodies in the .c file produced.

I am still testing this translator out, but if anyone would like to look at
it, I'd be happy to send them the source code & documents.  On a strictly
as-is basis!!

>As far as its characteristics, it will have all the features described
>by Wirth (yes, even coroutines).

May I ask how it does this?  setjmp()/longjmp()?  Or assembly support?

> .. It will also include some things wanted
>by the locals here. Some of these things are:
>
>	1) short type.
>	2) variable number of parameters passed to a function.
>	3) linking with object files from other langs given a
>	   definition module for that object file (ie, will
>	   link with C libraries).
>	4) assertion function for program correctness.
>

Presumably when you pass a variable no of parameters to a function, type-
checking goes out of the window?

I like the assertion function built into the language.

As to linking with C, that's absolutely essential IMHO.. the lowest level
of our homegrown I/O routines (well, does ANYONE use InOut??) is written in
C.

>Our department is big on modula-2, so the main drive behind this project
>is for instructional use. This means we are keeping strictly to the specs
>from Wirth and only making extensions to it that don't change the meaning
>of the language. It also means the warning and error messages will attempt
>to be instructional.
>
>I really shouldn't even be talking about this at this stage since it's
>only started within the last month, but it is being actively worked on
>daily. The lexer and parser are done. Current work is with the symbol
>tables and then code generation. We don't expect to have much for another
>5 months or so. At that time we'll be looking for beta-testers. Look for
>more info in the appropriate gnu newsgroup.

If you want any beta-testers, let me know...

	Duncan


bowen@cs.Buffalo.EDU (Devon E Bowen) (07/15/89)

In article <944@gould.doc.ic.ac.uk> dcw@doc.ic.ac.uk (Duncan C White) writes:
>I am still testing this translator out, but if anyone would like to look at
>it, I'd be happy to send them the source code & documents.  On a strictly
>as-is basis!!

I'm very interested. Please let me know where I can get a copy.

>>As far as its characteristics, it will have all the features described
>>by Wirth (yes, even coroutines).
>
>May I ask how it does this?  setjmp()/longjmp()?  Or assembly support?

This is, of course, OS/architecture dependent. We plan to implement it as
such by leaving it in the separate Processes module. We will then write
the module according to the OS/architecture. We plan to support true
parallel processing on the Encore Multimax and coroutines on BSD and SunOS.
The Multimax port will be simple since the OS already provides for shared
memory and semaphores. For the BSD and SunOS modules, we'll probably use
assembly to save and rewrite the call frame as needed. This is the last
of our concerns at this stage, though. I am very interested in talking to
Peter about his setjmp/longjmp ideas. Can I get an address for him?

>>	2) variable number of parameters passed to a function.
>
>Presumably when you pass a variable no of parameters to a function, type-
>checking goes out of the window?

Right. This is being included to take advantage of C's variable-argument
I/O routines such as printf. The cleanest way to do this is to define a new
procedure type (maybe "cprocedure") that tells the system to just accept
any parameters with no type checking. This still leaves it as an extension
that will not break the standard language.
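
For readers less familiar with the C side, here is a sketch of the kind of
variable-argument routine such a "cprocedure" declaration would have to call.
SumInts is a made-up example, and it uses ANSI <stdarg.h> where BSD code of
this era may well use <varargs.h> instead.

```c
#include <assert.h>
#include <stdarg.h>

/* Adds up "count" int arguments.  The callee has no way to verify the
   number or types of what follows "count"; it simply trusts the caller,
   which is why no type checking is possible on the Modula-2 side. */
int SumInts(int count, ...)
{
    va_list ap;
    int i, total = 0;

    assert(count >= 0);
    va_start(ap, count);
    for (i = 0; i < count; i++)
        total += va_arg(ap, int);      /* assumes each argument is an int */
    va_end(ap);
    return total;
}
```

Passing a REAL where SumInts expects an int would compile without complaint
and misbehave at run time, so the extension trades safety for access to
routines like printf.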

>I like the assertion function built into the language.

Thanks. This will initially only be included as an external module. It
can then be added as a compiler-known function later. This is needed in
case our compiler eventually ends up used in introductory classes (which
are currently taught on Macs).

>As to linking with C, that's absolutely essential IMHO..

Couldn't agree with you more (especially since I'm primarily a C hack).

>If you want any beta-testers, let me know...

I'll be posting here when ready. I've gotten a lot of interest by mail
as well.


Devon Bowen (KA2NRC)		FAX:	   (716) 636-3464
University at Buffalo		BITNET:    bowen@sunybcs.BITNET
				Internet:  bowen@cs.Buffalo.EDU
UUCP: ...!{watmath,boulder,decvax,rutgers}!sunybcs!bowen

avi@taux01.UUCP (Avi Bloch) (07/16/89)

In article <7943@cs.Buffalo.EDU> bowen@sunybcs.UUCP (Devon E Bowen) writes:
>In article <944@gould.doc.ic.ac.uk> dcw@doc.ic.ac.uk (Duncan C White) writes:
>>>	2) variable number of parameters passed to a function.
>>
>>Presumably when you pass a variable no of parameters to a function, type-
>>checking goes out of the window?
>
>Right. This is being included to take advantage of C's variable-argument
>I/O routines such as printf. The cleanest way to do this is to define a new
>procedure type (maybe "cprocedure") that tells the system to just accept
>any parameters with no type checking. This still leaves it as an extension
>that will not break the standard language.
>
Another approach that we used here was to define a pseudo-module, i.e. a module
that doesn't actually exist but is built into the compiler (like module SYSTEM).
All objects imported from this module have the following properties:

	i.	The object must be a procedure.
	ii.	No type checking is performed on parameters.
	iii.	The procedure can receive any number of parameters.
	iv.	The procedures can be called either as a procedure or as a
		function.
	v.	Values returned by such procedures are of type WORD, i.e. can be
		coerced to any type of the same size.

Cheers,
Avi Bloch
National Semiconductor (Israel)
avi%taux01@nsc
-- 
	Avi Bloch
National Semiconductor (Israel)
6 Maskit st. P.O.B. 3007, Herzlia 46104, Israel		Tel: (972) 52-522263
avi%taux01@nsc

GRANGERG@VTVM1.BITNET (Greg Granger) (07/18/89)

IMHO the implementation of a variable number of parameters should be
as follows:

  -Introduce a new data type that is compatible with all other data
types and can only be used in a formal parameter list (similar to
ARRAY OF WORD).  Call this new type DESCRIPTOR.
  -Variables of type DESCRIPTOR will be records containing the
variable's type, location and size.  Variables of type RECORD and
ARRAY will require additional type information.  Fields for element
width and type can be added for ARRAYs. RECORD types can be
implemented as a linked list of DESCRIPTOR.

In this way the parameter ARRAY OF DESCRIPTOR would allow you to pass
a variable number of parameters, and still retain limited (user
enforced) type checking.
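
One way to picture the DESCRIPTOR record and the "user enforced" checking is
the C sketch below.  The field names, the TypeTag values and the procedure
SumNumbers are invented for illustration; only the type/location/size triple
comes from the proposal above, and the extra fields for ARRAY and RECORD
types are left out.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { T_INTEGER, T_REAL, T_CHAR } TypeTag;

/* One DESCRIPTOR: the variable's type, location and size. */
typedef struct {
    TypeTag type;
    void   *location;
    size_t  size;
} Descriptor;

/* PROCEDURE SumNumbers(args: ARRAY OF DESCRIPTOR): REAL;
   The callee inspects each tag itself: numeric arguments are added,
   anything else is skipped. */
double SumNumbers(const Descriptor *args, size_t count)
{
    double total = 0.0;
    size_t i;

    for (i = 0; i < count; i++) {
        switch (args[i].type) {
        case T_INTEGER:
            assert(args[i].size == sizeof(int));
            total += *(const int *)args[i].location;
            break;
        case T_REAL:
            assert(args[i].size == sizeof(double));
            total += *(const double *)args[i].location;
            break;
        default:
            break;                      /* not a number: ignore it */
        }
    }
    return total;
}
```

The compiler would build the descriptor array at each call site; the callee
decides at run time what to do with each element.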

Possible problems:  References could be determined at compile time,
except of course for location values.  Parameters of type DESCRIPTOR
could not be passed to other procedures, because it would be difficult
(impossible?) to determine references at compile time.

Still it seems that this approach would allow greater flexibility and
consistency with M2.

                                                      Greg

bowen@cs.Buffalo.EDU (Devon E Bowen) (07/19/89)

In article <INFO-M2%89071714450923@UCF1VM> Modula2 List <INFO-M2%UCF1VM.bitnet@lilac.berkeley.edu> writes:
>  -Variables of type DESCRIPTOR will be records containing the
>variable's type, location and size.  Variables of type RECORD and
>ARRAY will require additional type information.  Fields for element
>width and type can be added for ARRAYs. RECORD types can be
>implemented as a linked list of DESCRIPTOR.

This would be better if the implementation were modula-2 on both sides,
but the purpose of adding this feature in our case is to link with
C libraries. And C expects its parameters to be passed by value. So
descriptors are not possible.


Devon Bowen (KA2NRC)		FAX:	   (716) 636-3464
University at Buffalo		BITNET:    bowen@sunybcs.BITNET
				Internet:  bowen@cs.Buffalo.EDU
UUCP: ...!{watmath,boulder,decvax,rutgers}!sunybcs!bowen