tassos@rti.rti.org (Tassos Markas) (04/11/89)
I'm looking for architecture-independent compilers.  This compiler should
accept a high-level language (I would prefer C) and produce microcode that
can be easily retargeted for any system architecture.

Thanks in advance,

Tassos Markas                          (919) 541-7020
Research Triangle Institute
Research Triangle Park, NC 27709
tassos@rti.rti.org [128.109.139.2]
{decvax,ihnp4}!mcnc!rti!tassos

[This topic comes up from time to time.  I don't think there is any such
thing, though I note that the OSF has an RFT out for exactly this sort of
facility to make it possible to ship one set of object code to be run on
many different architectures. -John]
--
Send compilers articles to ima!compilers or, in a pinch, to Levine@YALE.EDU
Plausible paths are { decvax | harvard | yale | bbn}!ima
Please send responses to the originator of the message -- I cannot forward
mail accidentally sent back to compilers.  Meta-mail to ima!compilers-request
jac@paul.rutgers.edu (J. A. Chandross) (04/17/89)
tassos@rti.UUCP (Tassos Markas) writes:
> I'm looking for architecture-independent compilers.
> This compiler should accept a high-level language (I would prefer C) and
> produce microcode that can be easily retargeted for any system architecture.

I would suggest that anyone interested in microcode compilation read the
proceedings of the past few Micro conferences (Proceedings of the
International Conference on Microprogramming).

Microcode compilation is a very hard problem for non-traditional
architectures.  (I know, I've tried.)  If you have something fairly
straightforward, i.e. a non-VLIW, you may be able to use one of the
existing retargetable microcode compilers.  Otherwise, write it from
scratch.  It's a lot easier than coercing something designed for a
relatively simple machine into working for your machine.

As far as "easily retargeted for any system architecture" goes, beware of
anyone who tries to sell you such a system.  This isn't something that is
trivially done.

Jonathan A. Chandross
Internet: jac@paul.rutgers.edu
UUCP: rutgers!paul.rutgers.edu!jac
schow@bnr-public.uucp (Stanley Chow) (04/18/89)
In article <Apr.16.22.18.25.1989.11912@paul.rutgers.edu> jac@paul.rutgers.edu (J. A. Chandross) writes:
>tassos@rti.UUCP (Tassos Markas) writes:
>> I'm looking for architecture-independent compilers.
>> This compiler should accept a high-level language (I would prefer C) and
>> produce microcode that can be easily retargeted for any system architecture.
>
>Microcode compilation is a very hard problem for non-traditional
>architectures.  (I know, I've tried.)  If you have something fairly
>straightforward, i.e. a non-VLIW, you may be able to use one of the
>existing retargetable microcode compilers.  Otherwise, write it from
>scratch.  It's a lot easier than coercing something designed for a
>relatively simple machine into working for your machine.

I agree with Chandross that in many cases it is better to write the
microcode from scratch.  Considering the amount of microcode in a machine
(especially the really time-critical bits), it is often quicker to rewrite
the microcode than to write (or port) a truly optimizing compiler for a
new architecture.

If you have a lot of *critical* microcode (by a lot, I mean tens of K
words), you should question the need.  [Remember, this is a proponent of
microcode talking.]

Stanley Chow          ..!utgpu!bnr-vpa!bnr-fos!schow%bnr-public
(613) 763-2831

Clever disclaimers are hard to come by; I will save them for the articles
that need them.  For now: I speak only for myself.
schow@bnr-public.uucp (Stanley Chow) (04/20/89)
In article <10441@polyslo.CalPoly.EDU> cquenel@polyslo.CalPoly.EDU (34 more school days) writes:
>
>What if your machine only runs micro-code?  (This is not an idle
>question.)  The term I've heard coined recently is "superscalar".
>If one were to write a compiler for a superscalar machine, it
>seems that one might want to design it a lot like a micro-code
>compiler.
>
>This is NOT an argument for a "retargetable" (ha ha ha ha ha) micro-code
>compiler, just a micro-code compiler.

In my mind, microcode means the stuff that implements the instructions
used by compilers.  Usually, the microcode also has lots of strange
encodings with parallelism and is limited to a small addressing range.
Since I don't know what a superscalar machine looks like, I really can't
say much about it.

If your target architecture has "small" instructions that the compiler
must string together to do "big" operations, then you probably want to
get a RISC compiler.

Stanley Chow          ..!utgpu!bnr-vpa!bnr-fos!schow%bnr-public

Opinion?  Did I say something in that posting?  Wow!  Please, can I let
that opinion represent just me?  I promise to tell everyone I am the sole
representee.
weaver@prls.UUCP (Michael Weaver) (04/22/89)
In article <10441@polyslo.CalPoly.EDU> cquenel@polyslo.CalPoly.EDU (34 more school days) writes:
>
>What if your machine only runs micro-code?  (This is not an idle
>question.)  The term I've heard coined recently is "superscalar".
>If one were to write a compiler for a superscalar machine, it
>seems that one might want to design it a lot like a micro-code
>compiler.
>
>This is NOT an argument for a "retargetable" (ha ha ha ha ha) micro-code
>compiler, just a micro-code compiler.

If your machine runs only microcode, it will generally be much simpler to
generate code for it than for a machine that uses microcode to implement
an instruction set.  The reason for this is quite simple: in the latter
case the hardware designers have a pretty good idea what the only program
the machine will run will look like, and may introduce some odd features
(such as OR-ing an address with data to form a branch target address) if
they make the machine cheaper or faster.

Michael Weaver
Signetics/Philips Components
811 East Arques Avenue
Sunnyvale CA 94086 USA
Phone: (408) 991-3450
Usenet: ...!mips!prls!weaver
jac@paul.rutgers.edu (J. A. Chandross) (04/22/89)
cquenel@polyslo.CalPoly.EDU (34 more school days) writes:
>What if your machine only runs micro-code?  (This is not an idle
>question.)

weaver@prls.UUCP (Michael Weaver) writes:
> If your machine runs only microcode, it will generally be much simpler
> to generate code for it than a machine that uses microcode to implement
> an instruction set.

This is indeed the case.  Instruction sets are generally written once but
executed many, many times.  To deliver the highest performance you will
likely want to write the code by hand.  Besides, most microcoded
instruction sets, even the VAX, are relatively simple compared to the
features afforded by a true VLIW (i.e. horizontally microcoded) machine.

However, if you want to generate user-customizable instruction sets, or
have user programs written entirely in microcode, you will run into the
problem of how to generate the microcode from a high-level language.  It
is bad enough having to debug the hardware with hand-written programs;
forcing users to write in microcode means the top executives of your
company are going to be selling real estate in 6 months.

Programming disadvantages aside, however, high-performance microcoded
machines are likely to be the wave of the future.  It is only with
microcoded machines that you can take maximal advantage of your hardware.
The RISC machines have merely proven what microarchitects have known since
time immemorial: keep it single-cycle; don't put a feature in if it will
slow things down (even if your marketing people insist); don't put it in
if you can make better use of the hardware; use parallelism to improve
performance; keep the hardware busy all of the time; etc.  And the devil
take anyone who wants to program it by hand.  (Of course, there are
additional issues for microprogrammed machines, like leaving out
pipelining because it makes it hard to write compilers for the machine as
well as introducing needless complexity, handling branches intelligently,
etc.)
I'll construct a hypothetical machine to show what sort of performance
gains it delivers and to demonstrate the demands it places on the
compiler:

    2 ALUs:               conventional design, driveable in parallel
    4 increment/decrement units:
                          add/subtract {1,2,4,nothing} to a register
    memory access unit:   {read,write} {8,16,32} bits
                          offset is {register, constant, none}
    branch unit:          jump, call subroutine, return from subroutine
    registers:            64 always accessible
                          64 accessible only through ALU A
                          64 accessible only through ALU B

The most efficient code will use all these resources at the same time.
Any compiler that will generate code for such a machine will require some
sort of data-flow analysis to determine how the various fields (i.e. an
ALU op, branch, etc.) can be compacted together to produce optimal code.
For instance, the sequence:

    while (foo->next != NULL) {
        foo = foo->next;
        bar++;
    }

could compile into code like:

            R0 = foo
            R1 = offset for next
    loop:   alu_1(compare(R0, NULL))
            branch(equal, done)
            R0 = read(R0 + R1, Long)
            increment(R2, 1)
            goto loop
    done:

But this is extremely inefficient.  Instead, we can compact it to a
2-instruction loop:

    loop:   alu_1(compare(R0, NULL))   branch(equal, done)
            R0 = read(R0 + R1, Long)   increment(R2, 1)   goto loop
    done:

Now, when you add in the complexity of folding in the instructions before
and after the loop, the compiler must understand a great deal about the
target machine.  After all, you now have scheduling problems.  Recall that
some registers are only accessible on certain ALUs.  (These would be used
to store commonly used constants.)  You can also have resource conflicts
if various fields in your instruction are overlapped.  For instance, you
might discover that you typically do 1 ALU operation and a memory
operation, or 2 ALU operations.  This would allow you to overlap the field
for a memory operation with one of the ALU fields.  The problem grows as
you add hardware.
However, you can get performance from this sort of machine that you
couldn't get out of a RISC chip.  While the compiler problems are large,
they are not insurmountable.  Compilers have been written that generate
tolerable code for machines like this.  You need look no farther than the
Multiflow or the ELI-512 for proof.

It is not clear to me exactly what model the current crop of commercial
retargetable microcode compilers use.  The research ones, i.e. the only
ones that reveal their private parts to the world, tend to take a
simplistic view of the world.  I suspect that the commercial ones are more
hype than substance, although I would be delighted to be proven wrong.

Jonathan A. Chandross
Internet: jac@paul.rutgers.edu
UUCP: rutgers!paul.rutgers.edu!jac
aarons@syma.sussex.ac.uk (Aaron Sloman) (04/28/89)
tassos@rti.UUCP (Tassos Markas) writes:
> Date: 14 Apr 89 01:20:12 GMT
> Organization: Research Triangle Institute, RTP, NC
>
> I'm looking for architecture-independent compilers.
> This compiler should accept a high-level language (I would prefer C) and
> produce microcode that can be easily retargeted for any system architecture.

The Poplog two-level virtual machine, with the two levels linked by a
machine-independent and language-independent compiler, provides relatively
easy portability for incremental compilers for a range of high-level
languages, though not C at present.  It also makes it relatively easy to
add a new high-level language that immediately runs (with a rich
environment) on all the target architectures.  I append a more detailed
description.  I hope it is of some interest, and I apologise for its
length.  Although Sussex University has a commercial interest in Poplog, I
have tried to avoid raising any commercial issues.

---------------------

Poplog provides development tools for a range of languages: Common Lisp,
Prolog, ML and POP-11 (a Lisp-like language with a more readable
Pascal-like syntax).  It also provides tools for adding new incremental
compilers.  It might be possible to add an incremental compiler for C,
though it would not run very fast.  However, from late 1989 we expect to
give users access to a C-like extension to POP-11 that is used for
developing and porting Poplog.  It is not quite as fast as C but provides
far more facilities.

Before I describe porting I need to explain how the running system works.
The mechanisms described below were designed and implemented by John
Gibson, at Sussex University.  All the languages in Poplog compile to a
common virtual machine, the Poplog VM, which is then compiled to native
machine code.
First, an over-simplified description:

The Poplog system allows different languages to share a common store
manager and common data-types, so that a program in one language can call
another and share data-structures.  There is also a common interface to
the host operating system, and an "external" interface to non-Poplog
languages (C, Fortran, etc.).

The Poplog languages are incrementally compiled for rapid development and
testing: individual procedures can rapidly be compiled, tested, modified
and re-compiled, and are immediately and automatically linked in to the
rest of the system, old versions being garbage-collected if no longer
pointed to.

The languages are all implemented using a set of tools for adding new
incremental compilers.  These tools include procedures for breaking up a
text stream into items, and tools for planting VM instructions when
procedures are compiled.  These tools are used by the Poplog developers to
implement the four Poplog languages, but are also available for users to
implement new languages suited to particular applications.  All this makes
it possible to build a range of portable incremental compilers for
different sorts of programming languages.

POP-11, PROLOG, COMMON LISP and ML all compile to a common internal
representation, and share machine-specific run-time code generators.  Thus
several different machine-independent "front ends" for different languages
can share a machine-specific "back end" which compiles to native machine
code, which runs far more quickly than if the new language had been
interpreted.

The actual story is more complicated: there are two Poplog virtual
machines, a high-level and a low-level one, both of which are
language-independent and machine-independent.  The high-level VM has
powerful instructions, which makes it convenient as a target language for
compilers for high-level languages.
This includes special facilities to support Prolog operations, dynamic and
lexical scoping of variables, procedure definitions, procedure calls,
suspending and resuming processes, and so on.  Because these are quite
sophisticated operations, the mapping from the Poplog VM to native machine
code is still fairly complex.  So there is a machine-independent and
language-independent intermediate compiler which compiles from the
high-level VM to a low-level VM, doing a considerable amount of
optimisation on the way.  A machine-specific back end then translates the
low-level VM to native machine code, except when porting or re-building
the system.  In the latter case the final stage is translation to assembly
language.  (See diagram below.)

The bulk of the core Poplog system is written in an extended dialect of
POP-11, with provision for C-like addressing modes, for efficiency.  We
call it SYSPOP.  The system sources, mostly written in SYSPOP, are also
compiled to the high-level VM, and then to the low-level VM.  But instead
of then being translated to machine code, the low-level instructions are
automatically translated to assembly-language files for the target
machine.  This is much easier than producing object files, because there
is a fairly straightforward mapping from the low-level VM to assembly
language, and the programs that do the translation don't have to worry
about formats for object files: we leave that to the assembler and linker
supplied by the manufacturer.  In fact, the system sources need facilities
not available to users, so the two intermediate virtual machines are
slightly enhanced for SYSPOP.

The following diagram summarises the situation.
    {POP-11, COMMON LISP, PROLOG, ML, SYSPOP, etc}
              |
              |  Compile to              [language specific]
              V
    [High level VM]                      (extended for SYSPOP)
              |
              |  Optimise & compile to
              V
    [Low level VM]                       (modified for SYSPOP)
              |
              |  Compile (translate) to  [machine specific]
              V
    [Native machine instructions]        [or assembler - for SYSPOP]

So, for ordinary users compiling or re-compiling procedures during
software development, the built-in machine-code generator is used and
compilation is very fast, with no linking required.  For rebuilding the
whole system, the back end is changed to generate assembler, and
rebuilding is much slower.  But it does not need to be done very often.

All the compilers and translators are implemented in Poplog (mostly in
POP-11).  Only the last stage is machine-specific.  The low-level VM is at
a level that makes it possible, on the VAX for example, to generate
approximately one machine instruction per low-level VM instruction.  So
writing the code generator for something like a VAX or M68020 was
relatively easy.  For a RISC machine the task is a little more
complicated.

Porting to a new computer requires the run-time "back end", i.e. the
low-level VM compiler, to be changed, and also the system-building tools
which output assembly-language programs for the target machine.  There are
also a few hand-coded assembly files which have to be re-written for each
machine.  Thereafter all the high-level languages have incremental
compilers for the new machine.

(The machine-independent system-building tools perform rather complex
tasks, such as creating a dictionary of procedure names and system
variables that have to be accessible to users at run time.  So besides
translating system source files, the tools create additional assembler
files and also check for consistency between the different system source
files.)
The Poplog VM provides a varied, extendable set of data-types and
operations thereon, including facilities for logic programming; list,
record and array processing; 'number crunching'; sophisticated control
structures (e.g. co-routines); 'active variables'; and 'exit actions',
that is, instructions executed whenever a procedure exits, whether
normally or abnormally.  Indefinite-precision arithmetic, ratios and
complex numbers are accessible to all the languages that need them.  Both
dynamic and lexical scoping of variables are provided.  A tree-structured
"section" mechanism (partly like packages) gives further support for
modular design.  External modules (e.g. programs in C or Fortran) can be
dynamically linked in and unlinked.  A set of facilities for accessing the
operating system is also provided.

The VM facilities are relatively easy to port to a range of computers and
operating systems because the core system is mostly implemented in SYSPOP,
and is largely machine-independent.  Only the machine-dependent portions
mentioned above (e.g. the run-time code generator, and the translator from
low-level VM to assembler), plus a small number of assembler files, need
be changed for a new machine (unless the operating system is also new).
Since the translators are all written in a high-level AI language,
altering them is relatively easy.

Porting requires compiling all the SYSPOP system sources to generate the
corresponding new assembler files, then moving them and the hand-made
assembler files to the new machine, where they are assembled and then
linked.  The same process is used to rebuild the system on an existing
machine when new features are added deep in the system.  Much of the
system is in source libraries compiled as needed by users, and modifying
those components does not require re-building.

Using this mechanism, an experienced programmer with no prior knowledge of
Poplog or the target processor was able to port Poplog to a RISC machine
in about 7 months.
But for the usual crop of bugs in the operating system, assembler, and
other software of the new machine, the actual porting time would have been
shorter.  In general, extra time is required for user testing, producing
system-specific documentation, tidying up loose ends, etc.  Thus 7 to 12
months of work ports incremental compilers for four sophisticated
languages, a screen editor, and a host of utilities.  Any other languages
implemented by users using the compiler-building tools should also run
immediately.  So in principle this mechanism allows a fixed amount of work
to port an indefinitely large number of incremental compilers.

Additional work will be required if the operating system is different from
Unix or VMS, or if a machine-specific window manager has to be provided.
This should not be necessary for workstations supporting X windows.
POPLOG is too big for 80286-based PCs.  Currently it runs on:

    VAX (VMS/Unix)
    Sun2, Sun3, Sun4 (SPARC), Sun386i (Road-runner)
    HP 9000 300-series workstations with HP-UX
    Apollo 680?0 workstations with BSD Unix
    Sequent Symmetry with Dynix
    Orion 1/05 (with Clipper) -- this version is not supported at present

Aaron Sloman,
School of Cognitive and Computing Sciences,
Univ of Sussex, Brighton, BN1 9QN, England

INTERNET: aarons%uk.ac.sussex.cogs@nsfnet-relay.ac.uk
          aarons%uk.ac.sussex.cogs%nsfnet-relay.ac.uk@relay.cs.net
JANET:    aarons@cogs.sussex.ac.uk
BITNET:   aarons%uk.ac.sussex.cogs@uk.ac
          or aarons%uk.ac.sussex.cogs%ukacrl.bitnet@cunyvm.cuny.edu
UUCP:     ...mcvax!ukc!cogs!aarons
          or aarons@cogs.uucp