[comp.compilers] More on compiling to assembler

johnl@ima.UUCP (12/12/87)
The argument about what kind of output a compiler should produce is one that
I've always enjoyed. I personally see no need to ever generate assembler
source. I'll go through some of the arguments here:

"Reasons" for generating assembler source:

1) it's easier.
   - It is not easier, and it can even be harder, depending on the syntactic
     requirements of the assembler you are going to use. The code generator
     in the compiler already has to know about the addressing modes of the
     target machine, the supported displacement ranges, the number of
     registers, and so on. Once it has figured out which instruction it
     wants to generate, it is no harder to assemble the bits and emit them
     than it is to build up some strings describing the instruction.

2) the compiler writer and others want to see the assembler code.
   - True, but you should have a disassembler (they're not hard to write)
     around anyway. You need it for when you don't have the source anymore,
     and you need it in your debugger as well.

3) in some cases you want to edit the compiler output.
   - hah! You're asking for a maintenance nightmare. If you have a requirement
     like that, then write that bit of code directly in assembler.

4) you don't want the compiler to have to know about the object file format.
   - Perhaps, but it's never been a very major part of any of my compilers.
     Also, having the compiler write the object file itself may be the only
     way to get some information into it; modifying the assembler (assuming
     you have the source for it) is just more work.

5) you can get the assembler to do some of the work for you.
   - Sometimes true (I recall the case of the PDP-11 UN*X C compiler wanting
     to let the assembler do the branch length optimization). I can't really
     argue against this one, but I also don't feel it's terribly important.
     Some C compilers also take advantage of the two-pass nature of the
     assembler to generate code based on later information without having
     the compiler itself use multiple passes. Backpatching code that is
     stored in an in-memory buffer is just as easy.
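
To illustrate points 1 and 5 above, here is a minimal sketch in C of emitting
instruction bytes into an in-memory buffer and backpatching a forward branch.
The opcodes (OP_NOP, OP_JMP), the one-byte signed displacement, and all of the
names (CodeBuf, emit_byte, emit_jump_placeholder, patch_jump,
demo_forward_jump) are made up for illustration; they do not describe any real
machine or any particular compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes for a made-up target, not a real instruction set.
   OP_JMP is followed by one signed 8-bit displacement, measured from the
   byte after the displacement. */
enum { OP_NOP = 0x00, OP_JMP = 0x10 };

typedef struct {
    uint8_t bytes[256];   /* generated code, held in memory */
    size_t  len;          /* number of bytes emitted so far */
} CodeBuf;

static void emit_byte(CodeBuf *cb, uint8_t b)
{
    assert(cb->len < sizeof cb->bytes);
    cb->bytes[cb->len++] = b;
}

/* Emit a jump whose target is not known yet. Returns the offset of the
   displacement byte so that it can be filled in later. */
static size_t emit_jump_placeholder(CodeBuf *cb)
{
    emit_byte(cb, OP_JMP);
    emit_byte(cb, 0);              /* placeholder displacement */
    return cb->len - 1;
}

/* Backpatch the displacement at disp_at so the jump lands on target.
   Returns 0 on success, -1 if the displacement does not fit in 8 bits. */
static int patch_jump(CodeBuf *cb, size_t disp_at, size_t target)
{
    long disp = (long)target - (long)(disp_at + 1);
    if (disp < -128 || disp > 127)
        return -1;
    cb->bytes[disp_at] = (uint8_t)(int8_t)disp;
    return 0;
}

/* Example: a forward jump over three NOPs, patched once the end of the
   skipped code is known. */
static int demo_forward_jump(CodeBuf *cb)
{
    size_t fixup = emit_jump_placeholder(cb);
    emit_byte(cb, OP_NOP);
    emit_byte(cb, OP_NOP);
    emit_byte(cb, OP_NOP);
    return patch_jump(cb, fixup, cb->len);
}
```

Calling demo_forward_jump on an empty buffer should leave the five bytes
10 03 00 00 00 in it: the jump opcode, a displacement of 3 that skips the
three NOPs, and the NOPs themselves. A real code generator would also need a
longer branch form for the case where patch_jump reports that the displacement
does not fit, which is the branch length optimization mentioned above.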

Reasons for NOT generating assembler source:

1) It takes disk space to store the (often bulky) assembler source.

2) It takes more time to do the compilation/assembly pair.

3) It results in more code in total (compiler + assembler). With some of
   the newer languages/systems, it is becoming possible to get away without
   an assembler.

4) Getting some info (e.g. symbolic information) into the object file can
   be a nuisance.

5) This isn't really a generic disadvantage, but it's quite significant for
   the UN*X compilers: the entire C source file ends up as one assembler
   source file, which ends up as one "unit" in the object file. This means
   that it isn't possible to have the linker selectively load functions.
   (For some 8086 compilers, addressing is relative to the beginning of the
   first function in the source file, so selective loading would be next to
   impossible.) If the compiler generates the object files by itself, there
   is no problem with making each function a separate "unit".

6) If a standard assembler is used, it may simply lack capabilities
   needed for the language you are compiling. For example, can it identify
   a function as an initialization routine that should be automatically
   called on startup if any function from this source file is ever called?
--
Chris Gray		Myrias Research, Edmonton	+1 403 428 1616
	{uunet!mnetor,ubc-vision,watmath,vax135}!alberta!myrias!cg