johnl@ima.UUCP (12/12/87)
The argument about what kind of output a compiler should produce is one that I've always enjoyed! I personally see no need to ever generate assembler source. I'll go through some of the arguments here:

"Reasons" for generating assembler source:

1) It's easier.
- It is not easier. It can even be harder, depending on the syntactic requirements of the assembler you are going to use. The code generator in the compiler has to know about the addressing modes of the target machine, the ranges of displacements supported, the number of registers, and so on. After it has figured out which instruction it wants to generate, it is no harder to assemble the bits and emit them than it is to build up some strings describing the instruction.

2) The compiler writer and others want to see the assembler code.
- True, but you should have a disassembler (they're not hard to write) around anyway. You need it for when you don't have the source anymore, and you need it in your debugger as well.

3) In some cases you want to edit the compiler output.
- Hah! You're asking for a maintenance nightmare. If you have a requirement like that, then write that bit of code directly in assembler.

4) You don't want the compiler to have to know about the object file format.
- Perhaps, but it's never been a very major part of any of my compilers. Also, having the compiler write the object file itself may be the only way to get some information into the object files; modifying the assembler (assuming you have the source for it) is just more work.

5) You can get the assembler to do some of the work for you.
- Sometimes true (I recall the case of the PDP-11 UN*X C compiler wanting to let the assembler do the branch length optimization). I can't really argue against this one, but I also don't feel it's terribly important. Some C compilers also take advantage of the two-pass nature of the assembler to generate code based on later information without having the compiler itself use multiple passes.
Backpatching code that is stored in an in-memory buffer is just as easy.

Reasons for NOT generating assembler source:

1) It takes disk space to store the (often bulky) assembler source.

2) It takes more time to do the compilation/assembly pair.

3) It results in more code in total (compiler + assembler). With some of the newer languages/systems, it is becoming possible to get away without an assembler.

4) Getting some info (e.g. symbolic information) into the object file can be a nuisance.

5) This isn't really a generic disadvantage, but it's quite significant for the UN*X compilers: the entire C source file ends up as one assembler source file, which ends up as one "unit" in the object file. This means that it isn't possible to have the linker selectively load functions. (For some 8086 compilers, addressing is relative to the beginning of the first function in the source file, so selective loading would be next to impossible.) If the compiler generates the object files by itself, there is no problem with making each function a separate "unit".

6) If a standard assembler is used, it simply may not have capabilities needed for the language you are compiling. For example, can it identify a function as an initialization routine that should be automatically called on startup if any function from this source file is ever called?

--
Chris Gray
Myrias Research, Edmonton
+1 403 428 1616
{uunet!mnetor,ubc-vision,watmath,vax135}!alberta!myrias!cg
--
Send compilers articles to ima!compilers or, in a pinch, to Levine@YALE.ARPA
Plausible paths are { ihnp4 | decvax | cbosgd | harvard | yale | bbn}!ima
Please send responses to the originator of the message -- I cannot forward mail accidentally sent back to compilers.
Meta-mail to ima!compilers-request
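
As a rough illustration of the backpatching remark under point 5 of the first list (emitting bytes into an in-memory buffer and filling in forward branches later), here is a minimal Python sketch. It is not from the original post. The class and method names (CodeBuffer, emit, jmp, define, resolve) are hypothetical, chosen only for this example. The only real encodings used are the x86 ones for JMP rel32 (0xE9), NOP (0x90) and RET (0xC3); a real code generator would of course handle many more instructions and short/long branch selection.

```python
import struct


class CodeBuffer:
    """Hypothetical in-memory code buffer with forward-branch backpatching."""

    def __init__(self):
        self.code = bytearray()
        self.labels = {}   # label name -> offset in self.code
        self.fixups = []   # (offset of 4-byte displacement field, label name)

    def emit(self, *values):
        # Append raw instruction bytes directly; no assembler text involved.
        self.code.extend(values)

    def jmp(self, label):
        # x86 JMP rel32: opcode 0xE9, then a 32-bit little-endian
        # displacement measured from the end of the instruction.
        self.emit(0xE9)
        self.fixups.append((len(self.code), label))
        self.emit(0, 0, 0, 0)  # placeholder, patched in resolve()

    def define(self, label):
        # Record the current offset as the address of the label.
        self.labels[label] = len(self.code)

    def resolve(self):
        # Backpatch every recorded displacement now that labels are known.
        for field, label in self.fixups:
            disp = self.labels[label] - (field + 4)
            struct.pack_into("<i", self.code, field, disp)
        self.fixups.clear()
        return bytes(self.code)


buf = CodeBuffer()
buf.jmp("done")        # forward reference: "done" is not defined yet
buf.emit(0x90, 0x90)   # two NOPs that the jump skips over
buf.define("done")
buf.emit(0xC3)         # RET
image = buf.resolve()
print(image.hex())     # e9020000009090c3
```

The point of the sketch is only that a single pass plus a list of fixups is enough to handle "later information" such as the address of a label that has not been seen yet, without a second pass over the source.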