greg@ait.trl.oz (Greg Aumann) (02/19/90)
It is becoming more and more obvious that Minix needs a C compiler with source that can be distributed without restrictions and modified easily. Problems caused by the current ACK compiler are that it is difficult to get bugs fixed. There is little hope of seeing desirable extensions such as ANSI conformance etc. Also you cannot look at the source and learn about compilers. The original intention of using minix to teach OS courses or for self study could apply equally well to compilers if the source were available and readable. Note when I write compiler I also mean to include an assembler and a loader.

The ideal minix compiler would be small so that it would fit in 64k (probably with multiple passes), it would have the same front end for the ST and the PC, be distributed on the minix disks in source form (or at least on the net in source form), and it would be easy to modify, to add say floating point and ANSI conformance for a start, and to port to new architectures.

The ACK compiler kit is out as it is much too big and expensive. The source generated by the kit is also of little or no use as it is unmodifiable. Gcc is also too big (although this may not apply in a few more years).

It seems that there are really two good candidates. One is for some minix fiend to write one and release it. This may be the intention of Bruce Evans, who has released binaries for a PC compiler that he is working on. A second alternative is the Sozobon C compiler. I don't know a great deal about it, but my understanding is that it generates code for the 68000, is fairly portable, and fits the above criteria.

Would someone who knows more about the Sozobon compiler please comment on its suitability for minix and how easy it would be to modify the backend to generate code for the 8086 and 80386 etc. Also could Bruce please comment on what he intends for his compiler.

Personally I think the best solution would be to find an existing compiler that is close to what we want and modify it.
This is because writing a compiler from scratch is a very large task and shouldn't be underestimated. Would anyone who knows of other possibly suitable compilers please also comment.

This article is an invitation for comments. Hopefully it will end in a consensus on how to get the sort of compiler that we want and need for minix.

Greg Aumann
-------------------------------------------------------------------------
Artificial Intelligence Systems      ACSnet: greg@trlamct.trl.oz
Telecom Research Laboratories        Internet: greg@trlamct.trl.oz.au
Melbourne, AUSTRALIA                 Voice: +61 3 541 6222
croes@fwi.uva.nl (Felix A. Croes) (02/19/90)
In article <1050@trlluna.trl.oz> greg@ait.trl.oz (Greg Aumann) writes:
>It is becoming more and more obvious that Minix needs a C compiler with
>source that can be distributed without restrictions and modified
>easily. Problems caused by the current ACK compiler are that it is
>difficult to get bugs fixed. There is little hope of seeing desirable
>extensions such as ANSI conformance etc. Also you cannot look at the
>source and learn about compilers. The original intention of using
>minix to teach OS courses or for self study could apply equally well to
>compilers if the source were available and readable. Note when I write
>compiler I also mean to include an assembler and a loader.

The current compiler will be replaced by an ANSI C compiler in 2.0 - again, not
in source. On all other points, I agree.

[description of ideal Minix compiler deleted]

The ideal Minix compiler would be public domain (of course).

>The ACK compiler kit is out as it is much too big and expensive. The
>source generated by the kit is also of little or no use as it is
>unmodifiable. Gcc is also too big (although this may not apply in a
>few more years).

Gcc is too big, period. The ACK idea is fine, when trimmed down to what it
really is all about: using EM as an intermediate language.

[description of possible candidates deleted]

>
>Personally I think the best solution would be to find an existing
>compiler that is close to what we want and modify it. This is because
>writing a compiler from scratch is a very large task and shouldn't be
>underestimated. Would anyone who knows of other possibly suitable
>compilers please also comment.

It seems to me that writing a compiler is never underestimated by anyone in this
newsgroup; rather, the reverse applies. I propose the following: step by step,
replace the existing ACK compiler by a public domain version.

A friend of mine is presently working on an ANSI C front end. Another friend is
working on a 68000 code generator. I have already written a loader for Minix ST
(shouldn't be too difficult to port it to the PC, once asld is split into a loader
and an assembler), and I am thinking about writing a C++ front end.

Comments?

--
Felix Croes (croes@fwi.uva.nl)
HBO043%DJUKFA11.BITNET@CUNYVM.CUNY.EDU (Christoph van Wuellen) (02/19/90)
As to compilers with source:

The Sozobon compiler/optimizer/assembler/linker is freely available. I
have tested it and found only two errors (but: they were catastrophic when
hosting the compiler on a Sun386i workstation).
BUT: I see little chance to hack an INTEL 8088 version from it.
(and: I won't bet it fits into 64K).

2.) I have written a 68000 compiler derived from some raw material I got
by email (original author: M. Brandt). It now handles signed and unsigned
char/short/long and float/double, but double being a synonym for float.
I have compiled MINIX with it successfully during a MINIX port I've
completed now, but I feel there are some problems left. In a few weeks
I will send the compiler to the referees. It does everything in core,
avoiding intermediate files; it is optimizing and maps frequently occurring
expressions onto registers, thus yielding 1008 (Version 2.1) dhrystones
with or without the register attribute.

3.) I agree, we should have the source code of our compilers.

4.) Perhaps the ACK code generators are not the best, implementing virtual
stack machines on register CPUs.

/Christoph van Wuellen
DN5@psuvm.psu.edu (02/19/90)
There was a book recently released called (I believe) _Compiler Construction
in C_. It contains source for a YACC clone, a LEX clone, and a C compiler.
As this is a book for a compiler course, perhaps the author would be
agreeable to having his compiler ported over to Minix?

Note: I have not seen the book, only saw mention of it in comp.compilers
and people there seemed to be impressed with it. As soon as I can afford
it, I plan to get a copy.

D. Jay Newman
dn5@psuvm.psu.edu
stailey@iris613.gsfc.nasa.gov (Ken Stailey) (02/19/90)
In article <429@fwi.uva.nl> croes@fwi.uva.nl (Felix A. Croes) writes:
>In article <1050@trlluna.trl.oz> greg@ait.trl.oz (Greg Aumann) writes:
>The current compiler will be replaced by an ANSI C compiler in 2.0 - again, not
>in source. On all other points, I agree.
>

Will the new compiler be available for ST MINIX too?

INET stailey@iris613.gsfc.nasa.gov
UUCP {backbone}!dftsrv!iris613!stailey
ZZASSGL%cms.manchester-computing-centre.ac.uk@nsfnet-relay.ac.uk (02/20/90)
Perhaps a first step would be to port one of the many Small C
compilers onto Minix. OK, you would not be able to compile Minix but
at least it gives everyone a base to work from.

Geoff. UTS Sys Admin mcc
nfs@notecnirp.Princeton.EDU (Norbert Schlenker) (02/20/90)
I'm going to batch some comments regarding this thread.

|In article <429@fwi.uva.nl> croes@fwi.uva.nl (Felix A. Croes) writes:
|In article <1050@trlluna.trl.oz> greg@ait.trl.oz (Greg Aumann) writes:
|>It is becoming more and more obvious that Minix needs a C compiler with
|>source that can be distributed without restrictions and modified
|>easily. Problems caused by the current ACK compiler are that it is
|>difficult to get bugs fixed. There is little hope of seeing desirable
|>extensions such as ANSI conformance etc. Also you cannot look at the
|>source and learn about compilers. The original intention of using
|>minix to teach OS courses or for self study could apply equally well to
|>compilers if the source were available and readable. Note when I write
|>compiler I also mean to include an assembler and a loader.
|
|The current compiler will be replaced by an ANSI C compiler in 2.0 - again, not
|in source. On all other points, I agree.
|
|[description of ideal Minix compiler deleted]
|The ideal Minix compiler would be public domain (of course).

Here's another vote in favour of all of the above. As for the ANSIness of the
2.0 compiler, that is a secondary consideration. The big problem with the
Minix compilers is that source is not really available and that there is no
real facility for bug fixes. I have always received polite responses to my
bug reports from Andy and/or Ceriel; almost invariably, the bugs have been
reported previously by others and are fixed in the next release. BUT the next
release is just too far away, much too far away. I have almost resorted to
cross-compilation under DOS, but have resisted so far. I know that many others
have simply given up on the ACK compiler.

|Gcc is too big, period. The ACK idea is fine, when trimmed down to what it
|really is all about: using EM as an intermediate language.

Agreed.

|...
|A friend of mine is presently working on an ANSI C front end. Another friend is
|working on a 68000 code generator. I have already written a loader for Minix ST
|(shouldn't be too difficult to port it to the PC, once asld is split into a loader
|and an assembler), and I am thinking about writing a C++ front end.
|...
|Felix Croes (croes@fwi.uva.nl)

Fine ideas. I believe that the 2.0 cc will have separate assembler and loader,
after which I think the process becomes much simpler. Felix's loader would be
ported fairly easily (at least it looked that way to me when I saw it). An
assembler, while by no means trivial in a general sense (e.g. if you want
useful macros), isn't hard once you have an object file format to translate
to. And once we have that, a code generator that maps EM code to assembly
cannot be too far behind (aren't all you 80386 owners using Minix tired of the
inability to use 32 bit facilities?). As for cpp/cem, I would happily leave
them as is, not wanting personally to get involved in all that grotty stuff.
I can easily imagine Minix cc (PC version) being almost entirely replaced in
short (and reverse) order.

In article <11528@nigel.udel.EDU> INFO-MINIX@UDEL.EDU writes:
|As to compilers with source:
|The Sozobon compiler/optimizer/assembler/linker is freely available. I
|have tested it and found only two errors (but: they were catastrophic when
|hosting the compiler on a Sun386i workstation).
|BUT: I see little chance to hack an INTEL 8088 version from it.
|(and: I won't bet it fits into 64K).

Having looked at the Sozobon compiler myself, I have to say that I see little
hope as well. I had hopes of using it, but the code generation and register
allocation, nominally cleanly separated in the code, are actually spread
throughout the compiler. There also seems to be heavy dependence on the large
number of registers and orthogonality of the 680x0 instruction set (not that I
think those are bad things - but 80x86 CPUs don't qualify in either regard).
The Sozobon compiler is also not ANSI, and making it ANSI looks like a lot of
work.

With regard to the size, that's not really a problem. I couldn't get PC Minix
cc to compile the Sozobon compiler, but there is no problem compiling it under
DOS (using Microsoft C) and the total executable size is only ~90K (with all
the debugging code included). Quite a creditable piece of work.

|2.) I have written a 68000 compiler derived from some raw material I got
|by email (original author: M. Brandt). It now handles signed and unsigned
|char/short/long and float/double, but double being a synonym for float.
|I have compiled MINIX with it successfully during a MINIX port I've
|completed now, but I feel there are some problems left. In a few weeks
|I will send the compiler to the referees. It does everything in core,
|avoiding intermediate files; it is optimizing and maps frequently occurring
|expressions onto registers, thus yielding 1008 (Version 2.1) dhrystones
|with or without the register attribute.

Interesting. How hard would it be to add code generation for the 80x86
family?

Norbert
R21014%UQAM.bitnet@ugw.utcs.utoronto.ca (Luc Dupuy) (02/20/90)
On Mon, 19 Feb 90 13:24:22 GMT <DN5@PSUVM.PSU.EDU> said:
>There was a book recently released called (I believe) _Compiler Construction
>in C_. It contains source for a YACC clone, a LEX clone, and a C compiler.
>As this is a book for a compiler course, perhaps the author would be
>agreeable to having his compiler ported over to Minix?
>
>Note: I have not seen the book, only saw mention of it in comp.compilers
>and people there seemed to be impressed with it. As soon as I can afford
>it, I plan to get a copy.
>
> D. Jay Newman
> dn5@psuvm.psu.edu

Could you give a more precise reference for the book: author, editor,
publisher, etc.? I would appreciate it. Thank you very much.

Kind regards,
luc dupuy
Centre d'analyse de textes par ordinateur
universite du quebec a montreal
r21014@uqam.bitnet
ghelmer@dsuvax.uucp (Guy Helmer) (02/20/90)
In article <24304@princeton.Princeton.EDU>, nfs@notecnirp.Princeton.EDU (Norbert Schlenker) writes:
> I'm going to batch some comments regarding this thread.
> ....
>
> |2.) I have written a 68000 compiler derived from some raw material I got
> |by email (original author: M. Brandt). It now handles signed and unsigned
> |char/short/long and float/double, but double being a synonym for float.
> |I have compiled MINIX with it successfully during a MINIX port I've
> |completed now, but I feel there are some problems left. In a few weeks
> |I will send the compiler to the referees. It does everything in core,
> |avoiding intermediate files; it is optimizing and maps frequently occurring
> |expressions onto registers, thus yielding 1008 (Version 2.1) dhrystones
> |with or without the register attribute.
>
> Interesting. How hard would it be to add code generation for the 80x86
> family?
>
> Norbert

It would be a good challenge. The code generator is well separated from
the rest of the compiler. I think it would be tough to get really hot
code, but with the 80x86 register set it's always been hard to get hot
code out of a compiler.

I'm waiting for the compiler to come through the referees list, and as soon
as it does, I'll merge the changes into my copy and get an Intel code
generator in it.

--
Guy Helmer                               ...!uunet!loft386!dsuvax!ghelmer
Dakota State University Computing Services            helmer@sdnet.bitnet
Software Engineering: "'How to program if you cannot.'" - Dijkstra
HBO043%DJUKFA11.BITNET@cunyvm.cuny.edu (Christoph van Wuellen) (02/21/90)
(about how difficult it would be to add 80x86 code generation for my compiler)

I dislike Intel processors very much, so I won't do anything on it.

I have put some assumptions in the expression parser that longs are more or
less compatible with pointers (when doing pointer additions, the ints are cast
to long, the longs are multiplied and the result is supposed to be an
acceptable offset for a pointer).

My personal opinion is: forget about the 8088, 8086 and 80286, and use the
80386 in 32-bit mode.

P.S. I would like to send the compiler to the referees and have to uuencode
the compressed tar file. How many compression bits may I use (13?).

/Christoph van Wuellen, Bochum, Germany.
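
The pointer-addition scheme described in this post can be sketched in C roughly as follows. This is only an illustration of the arithmetic, not code from the compiler: the function names `scaled_offset` and `add_scaled` are hypothetical, and the sketch assumes a target where a `long` is wide enough to serve as a pointer offset, which holds on the 68000 but not for the 16-bit pointers of an 8086 small model.

```c
#include <assert.h>

/* Hypothetical helper: compute the byte offset for "pointer + index".
 * Both the int index and the element size are widened to long first,
 * then multiplied, as the post describes. */
long scaled_offset(int index, int elem_size)
{
    assert(elem_size > 0);              /* element sizes are positive */
    return (long)index * (long)elem_size;
}

/* Hypothetical helper: apply the long offset to a byte pointer.
 * This is the step that assumes a long is an acceptable pointer offset. */
char *add_scaled(char *base, int index, int elem_size)
{
    return base + scaled_offset(index, elem_size);
}
```

For an array of `long`, `add_scaled((char *)a, 2, sizeof a[0])` should land on the same address as `&a[2]`. A port to a CPU whose pointers are narrower than `long` would presumably have to revisit exactly this widening step.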
HBO043%DJUKFA11.BITNET@cunyvm.cuny.edu (Christoph van Wuellen) (02/21/90)
To those who write a new code generator (68000) for the ACK compiler:

Recently I tuned a code generator for a 68000 C compiler, and so I expect a
leap of 150..200 Dhrystones/sec from the following single change:

- There is frequently the situation of multiplying two longs which were cast
  from short. The discussion on the speedup from changing _mli.s makes clear
  how important this pattern is (e.g. with pointer additions). I found it
  trivial to map this onto a 68000 muls instruction, avoiding a library call
  (there is a similar pattern for the mulu instruction).

This change let my compiler jump from 850 to 1000 dhrystones; an optimal
structure assignment strategy let it jump from 700 to 850 dhrystones (but
structure assignments are not that important outside of dhrystone).

So, just in case the person hacking on a new code generator might miss this:
will he (she?) take care of muls instructions? (The code generator is the only
place to do it, since it is the only part that knows about CPU instructions.)

/Christoph van Wuellen
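
The source pattern described in this post looks roughly like the C below. It is a sketch under stated assumptions: the fixed-width types from `<stdint.h>` stand in for the 16-bit `short` and 32-bit `long` of a 68000 compiler of the time, and the function names are hypothetical. The 68000 `MULS` and `MULU` instructions multiply two 16-bit operands into a 32-bit result, so a code generator that recognises this shape can emit a single instruction instead of calling a general 32-bit multiply routine.

```c
#include <assert.h>
#include <stdint.h>

/* Signed case: both operands are 16 bits wide and are widened before
 * the multiply, so the exact product always fits in 32 bits. This is
 * the shape that corresponds to a 16x16->32 signed multiply. */
int32_t mul_widened_signed(int16_t a, int16_t b)
{
    return (int32_t)a * (int32_t)b;
}

/* Unsigned case: the analogous shape for a 16x16->32 unsigned multiply. */
uint32_t mul_widened_unsigned(uint16_t a, uint16_t b)
{
    return (uint32_t)a * (uint32_t)b;
}

/* Small self-check covering the extreme operand values. */
void check_mul_widened(void)
{
    assert(mul_widened_signed(-300, 200) == -60000);
    assert(mul_widened_signed(INT16_MIN, INT16_MIN) == 1073741824);
    assert(mul_widened_unsigned(65535u, 65535u) == 4294836225u);
}
```

The point of the self-check is that no pair of 16-bit inputs can overflow the 32-bit result, which is what makes the single-instruction mapping safe for this pattern and not for a multiply of two arbitrary longs.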
hbetel@watserv1.waterloo.edu (Heather) (02/23/90)
In article <11550@nigel.udel.EDU> ZZASSGL%cms.manchester-computing-centre.ac.uk@nsfnet-relay.ac.uk writes:
>
>
>Perhaps a first step would be to port one of the many Small C
>compilers onto Minix. OK, you would not be able to compile Minix but
>at least it gives everyone a base to work from.

The problem with that is that it seems to me that one of the worst parts of a
compiler is its lexical box, and that is the fundamental difference between an
ANSI C and a small C compiler. This is a pretty messy part to change. It tends
to involve very large and complex finite state machines. (Read "not nice to
change, esp. when you didn't write it in the first place".) The difference in
code generators should be fairly minor, so by porting a small-C we have not
won much.

The other point in my mind may be closed-minded, but I think that if you can't
make a compiler of your own, then you probably can't do too good a job of
heavily modifying someone else's. Then again, I can see how one might say the
same of operating systems...
evans@ditsyda.oz (Bruce Evans) (02/25/90)
In article <24304@princeton.Princeton.EDU> nfs@notecnirp.UUCP (Norbert Schlenker) writes:
>Here's another vote in favour of all of the above. As for the ANSIness of the
>2.0 compiler, that is a secondary consideration. The big problem with the

Leave the ANSI compilers to the big boys, or use gcc. It is a lot of work to
write a compiler, and much harder when a detailed standard has to be met.

>Minix compilers is that source is not really available and that there is no
>real facility for bug fixes. I have always received polite responses to my
>bug reports from Andy and/or Ceriel; almost invariably, the bugs have been
>reported previously by others and are fixed in the next release. BUT the next

Gcc is impressive:) in this respect as in others. There have been about 4
versions in the last year and you can read a 180K list of changes for bugs
and improvements over that period. I doubt gcc had relatively more bugs than
ACK a year ago.

>release is just too far away, much too far away. I have almost resorted to
>cross-compilation under DOS, but have resisted so far. I know that many others
>have simply given up on the ACK compiler.

Felix Croes writes:
>|Gcc is too big, period. The ACK idea is fine, when trimmed down to what it
>|really is all about: using EM as an intermediate language.

The small compilers don't seem to use much technology (lex, yacc, rtl or
intermediate languages). This keeps them small at the expense of flexibility
and time to write. Mine avoids intermediate steps for speed too.

---

Here is a short review of various worthwhile "free" compilers that I know
about. I run a 32-bit Minix system on a 386. I wrote a compiler for this
(bcc) and use it for most things. I use gcc for code with ANSI C, bitfields
or floating point, and to get better error reports and (rarely) faster
binaries. Most of the free compilers are for the 68000. I ran them but could
not test the output. Quite likely I did not set them up to best advantage.

PDC (68000)
-----------

/* PDC Compiler - A Freely Distributable C Compiler for the Amiga
 *                Based upon prior work by Matthew Brandt and Jeff Lydiatt.
 *
 * PDC Compiler 3.30 Copyright (C) 1989 Paul Petersen and Lionel Hummel.
 * PDC Software Distribution (C) 1989 Lionel Hummel and Paul Petersen.
 *
 * This code is freely redistributable upon the conditions that this
 * notice remains intact and that modified versions of this file not be
 * distributed as part of the PDC Software Distribution without the express
 * consent of the copyright holders.
 */

This appears to have the same base as Christoph van Wuellen's compiler. I got
it in a zoo package with a lot of other stuff (output from du):

30      PDC/Bind
65      PDC/CCX
44      PDC/Dasm
154     PDC/LibSrc/Math
16      PDC/LibSrc/Misc
18      PDC/LibSrc/Startup
46      PDC/LibSrc/StdIO
42      PDC/LibSrc/StdLib
62      PDC/LibSrc/StringLib
63      PDC/LibSrc/SysIO
417     PDC/LibSrc
26      PDC/Libr
69      PDC/Make
1       PDC/PDC/manx_include
492     PDC/PDC
148     PDC/bin
1300    PDC

PDC/bin contains a compiler (PDC) (only). This contains a preprocessor but
not a compiler driver. It was compiled with bcc after a little editing.

Sozobon (68000)
---------------

/* Copyright (c) 1988 by Sozobon, Limited. Author: Johann Ruegg
 *
 * Permission is granted to anyone to use this software for any purpose
 * on any computer system, and to redistribute it freely, with the
 * following restrictions:
 * 1) No charge may be made other than reasonable charges for reproduction.
 * 2) Modified versions must be clearly marked as such.
 * 3) The authors are not responsible for any harmful consequences
 *    of using this software, even if they result from defects in it.
 */

I have an incomplete version from comp.sources.atari.st.

208     soz/hcc
136     soz/top
139     soz/bin
486     soz

The bin directory contains a compiler (hcc) and a peephole optimizer (top).
These were compiled a while ago; I forget how.

gcc (alliant, convex, i386, i860, m68k, m88k, mips, nsc32k, pyr, sparc, spur, tahoe, vax)
----------------------

Gcc was written by Richard Stallman and many elves. The copyright is readily
available and too big to include here.

1770    gcc/config
8309    gcc

This includes about 1MB of objects. It is a really good compiler, but too big
to run on an 80286 or worse.

bcc (6809, 8086, 80386)
-----------------------

This is not exactly free (yet). Binaries are free.

24      as/work
17      as/bin
17      as/obj
12      as/6809
413     as
2       ld/6809
93      ld
56      sc/.examples
2       sc/6809
549     sc

The sizes include objects and some junk. This once ran and compiled itself
(in 4 minutes) in 40K text+data and 16K heap+stack on a 6809. It should be
easily portable to a 68000 at the expense of poor code generation (1 data
register. The 80*86 code is harmed surprisingly little by this.)

---

To see how much space these take, I compiled kernel/tty.c - the biggest
program in the kernel. I reduced the stack allocation for everything to find
the minimum.

  text   data    bss   stack
127852  13248   5200  160000  PDC
 81128   3688   2652  225000  hcc (sozobon)
498384   7336  18224  270000  cc1 (main pass of gcc)
 50316   2072  13484  170000  cpp (preprocessing pass of gcc)
 64656   5472  11700  135000  sc (mine)
 58864   5100   6832   30000  sc (16-bit, needs separate cpp to fit)

Everything except gcc was compiled with bcc, so these sizes would shrink 20%
with a better compiler or one making more space optimizations. Gcc was
compiled with itself and suffers a 10% size penalty from my assembler not
being able to determine branch lengths.

Compile times and word counts for the output (tty.s) (with no optimization)
were

real  user+sys  lines  words  chars
15      13.89    2774   5319  40168  PDC
 9       6.46    2383   4641  33872  hcc (might have -O)
16       9.11    2279   5809  39672  cpp+cc1
 3:-)    2.41    3121   5831  34140  sc (mine)
 3       1.64                        sc (16-bit, on preprocessed file)
 8       5.22                        cpp (ACK cpp pass for 16-bit sc)
27      22.34                        Minix cc

Differences in the disk cache size and state make the real times
untrustworthy.
-- Bruce Evans evans@ditsyda.oz.au
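
The size figures in the post above can be totalled with a few lines of C. This is a sketch, not something from the original post: the struct and function names are hypothetical, the numbers are copied from the text/data/bss/stack table, and the 64K check assumes a 16-bit Minix process with separate instruction and data spaces, where text must fit in one 64K segment and data, bss and stack together in another.

```c
#include <assert.h>

/* One row of the size table: all values in bytes. */
struct footprint {
    const char *name;
    long text, data, bss, stack;
};

/* Figures copied from the table in the post. */
const struct footprint measured[] = {
    { "PDC",         127852L, 13248L,  5200L, 160000L },
    { "hcc",          81128L,  3688L,  2652L, 225000L },
    { "cc1",         498384L,  7336L, 18224L, 270000L },
    { "cpp (gcc)",    50316L,  2072L, 13484L, 170000L },
    { "sc",           64656L,  5472L, 11700L, 135000L },
    { "sc (16-bit)",  58864L,  5100L,  6832L,  30000L },
};

/* Assumed limit: one 64K segment each for code and for data. */
#define SEGMENT_LIMIT 65536L

/* Total memory needed by one compiler pass. */
long total_bytes(const struct footprint *f)
{
    return f->text + f->data + f->bss + f->stack;
}

/* Nonzero if the pass fits under the assumed separate I&D limits. */
int fits_separate_id(const struct footprint *f)
{
    return f->text <= SEGMENT_LIMIT
        && f->data + f->bss + f->stack <= SEGMENT_LIMIT;
}
```

Under that assumption only the 16-bit `sc` row passes the check, which would be consistent with the table's remark that it needs a separate cpp to fit.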