dcw@doc.ic.ac.uk (Duncan C White) (07/16/87)
I have a proposal which I would like people to consider and comment on. I will outline it as a series of observations, leading to the proposal:

1). every system worth talking about has a C compiler.
2). often, the C compiler is heavily optimised.
3). C is often referred to as a high-level assembler. What people omit is the important word 'portable'.
4). Compilers conventionally produce assembler code.
5). Why not bring all these together, and write a portable M2 compiler which generates C as its intermediate code? This would imply no tedious mucking around with new backends for different processors. This is the approach used by C++...

So, there is the proposal... Is the translation likely to be incredibly difficult? What about processes? Libraries? Any other pitfalls I should watch out for? Anyone else interested in cooperating on such a project?
[Is it completely irrelevant, or crackpot... all suggestions considered]

Personally, I would much prefer to write such a compiler in C [on the grounds that it then becomes easier to port the compiler to other systems].

Obviously, you'd get a free M2->C translator for the one-off conversion...

        Duncan.

-----------------------------------------------------------------------------
JANET address : dcw@uk.ac.ic.doc| Snail Mail :  Duncan White,
--------------------------------|               Dept of Computing,
  This space intentionally      |               Imperial College,
  left blank......              |               180 Queen's Gate,
  (paradoxical excerpt from     |               South Kensington,
  IBM manuals)                  |               London SW7
----------------------------------------------------------------------------
Tel: UK 01-589-5111 x 4982/4991
----------------------------------------------------------------------------
crb@SUN.COM (Chuck Bilbe) (07/17/87)
In regard to Duncan White's recent note proposing a Modula-2 to C translation scheme --- no, it isn't a crackpot scheme. That would make me a crackpot. I envisioned and supervised just such a project at Hewlett-Packard Logic Systems Division in 1984. We built the compiler by "retargeting" the Zurich M2M compiler (written in Pascal) and driving it with a shell script under HP-UX. Here are some of the things we found out:

PRO
    Up and running in two (!) months.
    Object code quality not shabby at all. C makes a fine assembly language.
    Use of C register variables provided good support for a simulated Stack Pointer and simulated Frame Pointer.
    "Intermediate" object code (i.e. C) transportable to virtually any machine.
    Semantic complexity of the "code generator" (C producer) is quite low.
    It was easy to arrange for inter-language (C <==> Modula-2) calling.

CON
    Extremely slow (we're talking molasses in January) compilation rates.
    Virtually no support for decent high-level debugging (info lost in translation).
    Difficult, politically, to convince people it was a "real" compiler.
    Since C doesn't support nested scopes, activation records couldn't be on the C stack, and a "parallel" stack had to be implemented as a linked list of structures.
    Since C doesn't support name conflicts at outer level, symbolic info was lost because names had to be hoked-up (e.g. InOut.WriteLn might be "InOut00002WriteLn") so the linker wouldn't complain.
    We had some difficulty in translating function calls within expressions while still preserving the original evaluation ordering.
    We never did figure out (nor were we really interested in) a decent implementation of coroutines.
    We had some difficulties with the C compiler, stretching it in ways it wasn't used to (overflowing name tables, too many labels, etc.).
    Don't expect the resulting C code to be understandable or readable by humans. It isn't. Of course, that's often true of human-produced C code (:-)

Overall, it was worth doing; I do not consider it a way to achieve a production compiler, but it is not a bad way to bootstrap a real one.

-- Chuck Bilbe
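[Editorial illustration, not part of the original post.] The "parallel" stack described above, a linked list of activation-record structures kept off the C stack, might look roughly like the sketch below. It is a guess at the general shape only, not the HP compiler's actual output: `struct frame`, `push_frame`, `pop_frame`, the `slots` layout, and the example procedures `Outer` and `Outer_Add` are all hypothetical names. The example assumes a Modula-2 procedure `Outer(n)` with a local `total` and a nested procedure `Add(k)` that does `total := total + k`.

```c
#include <stdio.h>
#include <stdlib.h>

/* One activation record per Modula-2 procedure call, kept off the C stack.
   (Hypothetical layout: a real translator would size slots per procedure.) */
struct frame {
    struct frame *dynamic_link; /* caller's record: next entry in the parallel stack */
    struct frame *static_link;  /* record of the lexically enclosing procedure */
    int slots[4];               /* locals and parameters of this activation */
};

struct frame *frame_top = NULL; /* simulated frame pointer */

/* Push a new record onto the parallel stack. */
struct frame *push_frame(struct frame *static_link) {
    struct frame *f = calloc(1, sizeof *f);
    if (f == NULL) {
        fprintf(stderr, "out of memory\n");
        exit(EXIT_FAILURE);
    }
    f->dynamic_link = frame_top;
    f->static_link = static_link;
    frame_top = f;
    return f;
}

/* Pop the most recent record. */
void pop_frame(void) {
    struct frame *f = frame_top;
    frame_top = f->dynamic_link;
    free(f);
}

/* Nested PROCEDURE Add(k): total := total + k.
   'total' lives in slot 0 of the enclosing Outer record. */
void Outer_Add(struct frame *enclosing, int k) {
    struct frame *f = push_frame(enclosing);
    f->slots[0] = k;                          /* parameter k */
    f->static_link->slots[0] += f->slots[0];  /* non-local access via static link */
    pop_frame();
}

/* PROCEDURE Outer(n): total := 0; Add(n); Add(n); RETURN total */
int Outer(int n) {
    struct frame *f = push_frame(NULL);
    int result;
    f->slots[0] = 0;                          /* local total */
    f->slots[1] = n;                          /* parameter n */
    Outer_Add(f, f->slots[1]);
    Outer_Add(f, f->slots[1]);
    result = f->slots[0];
    pop_frame();
    return result;
}
```

With these definitions, `Outer(21)` returns 42 and leaves the parallel stack empty. The nested procedure becomes an ordinary top-level C function that reaches its parent's variables through an explicit static link, which is one way around C's lack of nested procedures.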
lmjm@doc.ic.ac.UK (07/17/87)
    Date: 16 Jul 87 16:31:05 GMT
    From: Duncan C White <eagle!icdoc!dcw@ucbvax.berkeley.edu>
    Organization: Dept. of Computing, Imperial College, London, UK.

> 1). every system worth talking about has a C compiler.

True, true.

> 2). often, the C compiler is heavily optimised.

Not quite so true - but getting better - eg: GCC.

> 3). C is often referred to as a high-level assembler. What people omit is the
> important word 'portable'.

Hmmmmmmmm

> 4). Compilers conventionally produce assembler code.

Most Unix compilers do anyway.

> 5). Why not bring all these together, and write a portable M2 compiler which
> generates C as its intermediate code. This would imply no tedious mucking
> around with new backends for different processors. This is the approach used
> by C++...

When I last had to port an M2 compiler to a new machine I thought about this. The recent message from Chuck Bilbe covers the problems I was able to think of.

I instead chose a slightly different route. Most Unix machines (not all) support a two pass C compiler commonly called PCC (Portable C Compiler). The second pass is the code generator and is also used by a Pascal and a Fortran compiler. I took the ETH 4 pass compiler and converted its 4th pass to output a file suitable for PCC's code generator to use. This means the M2 compiler is pretty portable - so long as you have PCC on your target machine - but dodges many of the problems of going to C.

Since the machine it was targeted for was blindingly fast I never bothered with optimisation, but I get some for free by using the standard C optimiser (which works on the code generated by PCC). Given the structure of the ETH compiler, adding in simple lifetime analysis is relatively straightforward.

This is a simplified description but should give you the gist of it.

Obviously this approach is not as generally useful as Duncan's. But let's face it, there are a lot of Unix boxes out there now.

An idea I was toying with would be to do something similar to the above scheme, except going to the GNU optimising C compiler, as this is likely to become available on a wide range of machines.

        Lee.
--
UKUUCP SUPPORT  Lee McLoughlin
        "What you once thought was only a nightmare is now a reality!"
Janet: lmjm@uk.ac.ic.doc, lmcl@uk.ac.ukc
DARPA: lmjm@doc.ic.ac.uk (or lmjm%uk.ac.ic.doc@cs.ucl.ac.uk)
Uucp:  lmjm@icdoc.UUCP, ukc!icdoc!lmjm
steve@hubcap.UUCP (Steve ) (07/17/87)
In article <8707171628.AA02971@odysseus.sun.com>, crb@SUN.COM (Chuck Bilbe) writes:
> In regard to Duncan White's recent note proposing a Modula-2 to C translation
> scheme --- no, it isn't a crackpot scheme. That would make me a crackpot.

I had two graduate students do a Modula-2 to C preprocessor in the spring of '86. Our experiences were much the same as Chuck described. It was a great exercise for the students (two very good ones).

Steve

steve@hubcap.clemson.edu (aka D. E. Stevenson), dsteven@clemson.csnet
Department of Computer Science, (803)656-5880.mabell
Clemson University, Clemson, SC 29634-1906
bpendlet@esunix.UUCP (Bob Pendleton) (07/20/87)
in article <482@ivax.doc.ic.ac.uk>, dcw@doc.ic.ac.uk (Duncan C White) says:
>
> 4). Compilers conventionally produce assembler code.
>

This statement is only true in the UNIX(TM) world. It still just plain blows me away when I encounter people who have never used a compiler that didn't generate assembly code.

My experience, from measurements made on assemblers I've written and used, is that 40 to 60 percent of assembly time is lexical analysis. Formatting and writing assembly code in a compiler can add 10 or more percent to compilation times; in non-optimizing compilers it can be 25%. Generating a linkable file directly is a major performance win. Why does the UNIX world tolerate such slow compilation?

What Duncan suggests will work. But it will be sloooow.

>
>         Duncan.
>

--
Bob Pendleton @ Evans & Sutherland
UUCP Address: {decvax,ucbvax,ihnp4,allegra}!decwrl!esunix!bpendlet
Alternate:    {ihnp4,seismo}!utah-cs!utah-gr!uplherc!esunix!bpendlet
I am solely responsible for what I say.
abbas@CORWIN.CCS.NORTHEASTERN.EDU (07/20/87)
I saw your note regarding the M2 to C translator. This is to inform you that one of my students has developed such a system, and we are putting the final touches to it. There will also be a technical report on it, discussing the issues and how we resolved them. For example, how would you unnest modules and procedures?!! If you are interested, I will send you a copy of the Tech. report when it is ready (very soon).

The system runs on the Macintosh and translates to LightspeedC; the design enables us to get translation to other flavors of C very easily without modifying the code at all!

--Abbas Birjandi

US Mail: College of Computer Science
         Northeastern University
         360 Huntington Ave, Boston MA, 02115
tel:     (617) 964-3077
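[Editorial illustration, not part of the original post.] The post does not say how the Northeastern system unnests modules. One ingredient any such scheme needs is a flat, collision-free C name for each module-qualified identifier, along the lines of the "InOut00002WriteLn" example given earlier in this thread. The sketch below shows that idea only. The function name `flatten_name`, the five-digit width, and the reading of the number as a per-module serial are assumptions made for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Build a flat C-level name from a module name, a serial number, and a
   Modula-2 identifier, e.g. ("InOut", 2, "WriteLn") -> "InOut00002WriteLn".
   Returns 0 on success, -1 if the result does not fit in 'out'.
   (Hypothetical helper; the serial-number scheme is an assumption.) */
int flatten_name(char *out, size_t out_size,
                 const char *module, unsigned serial, const char *ident) {
    int n = snprintf(out, out_size, "%s%05u%s", module, serial, ident);
    if (n < 0 || (size_t)n >= out_size) {
        return -1; /* encoding error or truncated output */
    }
    return 0;
}
```

Called as `flatten_name(buf, sizeof buf, "InOut", 2, "WriteLn")`, this fills `buf` with `InOut00002WriteLn`. The embedded number keeps, say, module `A` with identifier `BC` distinct from module `AB` with identifier `C`, at the cost of the symbolic information a debugger would want.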
ronald@csuchico.UUCP (Ronald Cole) (08/07/87)
In article <482@ivax.doc.ic.ac.uk>, dcw@doc.ic.ac.uk (Duncan C White) writes:
>
> I have a proposal which I would like people to consider, and comment on:
> I will outline it as a series of observations, leading to the proposal:
...
> 5). Why not bring all these together, and write a portable M2 compiler
> which generates C as its intermediate code.

Duncan,

I have been working on this project in my spare time for the last year now. My compiler is a four pass compiler implemented using yacc and lex in C. Translating Modula-2 to C is a lot easier than translating C to Modula-2. I am generating a very portable C as per J.E. Lapin's new portable C book. If you are interested in more information, send me email.

Ron
--
Ronald Cole                             | uucp:     ihnp4!csun!csuchic!ronald
AT&T 3B5 System Administrator           | PhoneNet: ronald@csuchico.edu
@ the #_1_ party school in the nation:  | voice     (916) 895-4635
California State University, Chico
"It's O.K." -Hal Landon Jr., Eraserhead