[comp.compilers] Compilers producing assembly language.

johnl@ima.UUCP (11/24/87)

This may or may not reopen an old debate that I don't know about...

How many compilers in "the real world" produce assembly language instead
of relocatable binary?  I know that almost all standard (i.e. from a vendor)
Unix compilers first produce assembly language.  I don't know about some
of the more exotic Unix machines such as UTS, Cray Unix, or systems where
the C compiler was first written for a different OS (e.g. DG).  What about
third-party Unix compilers, e.g. Green Hills or Tartan Labs?

What I'm really after is:

1) Are there a lot of Unix compilers that don't produce assembly?

2) Are there common non-Unix compilers that do produce assembly?

3) [The $64,000 question:] Given that one's assembler (like 'as' on
   Unix) does not have a lot of extra overhead (macros etc.), is there
   still that big a win in generating relocatable binary directly?
-- 
Arnold Robbins
ARPA, CSNET:	arnold@emory.ARPA	BITNET: arnold@emory
UUCP: { decvax, gatech, }!emory!arnold	DOMAIN: arnold@emory.edu (soon)
[I've heard arguments either way.  Most assemblers on non-Unix systems are
chock full of features and hence so slow as to be unsuitable for the last
pass of a compiler, so the question never comes up.  Other than some of the
C compilers for the PC, which optionally run through the assembler so as to
allow in-line assembler to be passed through, I've never seen a non-Unix
compiler that produces assembler.  -John]
--
Send compilers articles to ima!compilers or, in a pinch, to Levine@YALE.ARPA
Plausible paths are { ihnp4 | decvax | cbosgd | harvard | yale | cca}!ima
Please send responses to the originator of the message -- I cannot forward
mail accidentally sent back to compilers.  Meta-mail to ima!compilers-request

johnl@ima.UUCP (11/29/87)

The only other compiler I have heard of that generates intermediate
assembler was the CDC FTN compiler.  My experience is years old, but at the
time I dealt with it (mid '70s) FTN generated assembler.  It had its own
stripped-down assembler that it used to assemble things; I think that this
one could access the symbol table from previous passes.  There was a
command-line option to use the system assembler, however, which also let
you get the assembler file in a form that you could work on and
hand-optimize.

Of course, many student and internal compilers generate assembler.  We've
got an internal compiler around here; the project to make it generate
object code directly never seems to get off the ground.  It's cheaper and
quicker to buy more iron to run it on.
---
Craig Jackson
UUCP: {harvard!axiom,linus!axiom,ll-xn}!drilex!dricej
BIX:  cjackson

gla@nixpbe.UUCP (12/02/87)

Well, there are a lot of generic UNIX compilers that generate object
code directly: for example, those for the Pyramid 90x family, likewise
the Nixdorf Targon/35, and soon the whole Nixdorf Targon family.
There are good reasons to do so.

First, there is a performance reason.
Lexical analysis takes about 30% of the time in any compile-and-assemble
run.  This can be saved if the assembly step is integrated.

Second, a lot of assemblers have reserved words.  These must then
either be forbidden in the source language, or the compiler must
prefix each name with something, e.g. an underscore (sketched in
code below).  Other curious restrictions can mean extra work for the
code generator: an Intel 8086 assembler, e.g., treats an EXTERN
inside a segment/procedure differently from one outside.

Third, and most important, you cannot pass correct debugging information,
e.g. source line numbers and symbol attributes.

Fourth, the assembler is far too powerful.  It will keep all labels
ever generated and will seldom have an option to forget local
symbols, etc.
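
The underscore workaround in the second point is the one traditional
Unix C compilers use.  A minimal sketch in C (the function name is
invented for illustration):

    #include <stdio.h>

    /* Emit the assembler label for a source-language identifier.
       Prefixing every name with '_' guarantees that no identifier
       can collide with an assembler reserved word, opcode, or
       register name. */
    static void emit_label(FILE *out, const char *name)
    {
        fprintf(out, "_%s:\n", name);
    }

This is why Unix object files traditionally show C's main() as _main.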

On the other hand, there are also good reasons to emit assembler source
and use the normal assembler.

First, maintenance is far easier if you can edit the code output.

Second, you must have such a beast anyhow, so why duplicate the effort?

Third, you get an assembly listing that is not an approximation
but the true listing.  This is especially important for real-time
software or device controllers.

Any more questions?

Rainer Glaschick
Microprocessor Tools Group
Nixdorf Computer AG, Paderborn, Germany
(personal communication, of course)

UUCP:   {seismo,mcvax}!unido!nixpbe!glaschick
NERV:   glaschick.pad

Phone:  nat-5251-14-6153
FAX:    nat-5251-14-6108

S-mail: Rainer Glaschick
	Nixdorf Computer AG
	Entwicklungstechnik
	Pontanusstr. 55
	D-4790 Paderborn
	W-Germany
[Some of the objections to assembler are not always valid, e.g. most Unix
assemblers pass through lots of debugging symbols and have local symbols.
On the other hand, it took me a while to figure out why I was having such
trouble on an 8086 with a symbol called AL.  -John]

johnl@ima.UUCP (12/03/87)

Probably most, if not all, of the "real world" compilers that produce
assembler are derived, culturally if not physically, from the Unix compilers.
Elsewhere, traditional assemblers were so expensive that generating assembler
code was not a great idea.  One can argue that it isn't a great idea anyway:
if you know what instruction you want to emit, why not emit it?  There isn't
a whole lot of difficulty in turning emit("mov (r0)+, (r1)+") into
emit(012021).
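
To make that concrete, here is a sketch in C (mine, not from the
posting; the flag and stream are invented, and relocation records and
object headers are omitted) of a back end whose one emit routine can
produce either form:

    #include <stdio.h>

    static FILE *out;       /* assembler text or object file */
    static int binary;      /* set by a compiler option */

    /* Emit one instruction as text or as the finished word.
       012021 is the PDP-11 encoding of "mov (r0)+, (r1)+". */
    static void emit(unsigned word, const char *text)
    {
        if (binary) {
            unsigned char b[2] = { word & 0377, (word >> 8) & 0377 };
            fwrite(b, 1, 2, out);           /* emit(012021) */
        } else {
            fprintf(out, "\t%s\n", text);   /* emit("mov (r0)+, (r1)+") */
        }
    }

The call site then reads emit(012021, "mov (r0)+, (r1)+"), knowing
both forms.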

However, there are some arguments in favor.  One is that generating assembler
keeps the knowledge of object-module format -- usually complex -- in one
place, with a simpler text format used as the interchange medium.  This line
of argument may weaken a bit if complex symbol-table info has to be passed
from compiler into object module somehow.  Another argument for assembler
output is that on modern machines the assembler may do quite a bit of work,
like picking
the precise form for span-dependent instructions or arranging code to match
pipeline constraints.  (A friend of mine once commented that the assembler
for the 68000 was more complicated than the code generator for it!)  Putting
all that into each compiler isn't trivial, and is a bit stupid given that
the assembler has to exist anyway.  This is especially true if the language
has an in-line-assembler facility which requires a full assembler somewhere
in the pipeline anyway.  Yet another consideration is that it's a whole lot
easier to deal with assembler output for compiler debugging.

The Minix system in fact uses (slightly compacted) assembler as its *object
module* format, with the assembler and linker rolled into one for building
the final code.  This is a rather extreme approach, but an interesting one,
provided that the assembler-linker is fast.

My personal opinion is that on a system with an efficient assembler,
generating binary is probably still a win, but not a very big one, and
generating assembler is definitely easier.  There are better things for
compiler writers to spend their time on than reinventing the assembler.

				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,decvax,pyramid}!utzoo!henry
[My limited experience agrees with this.  For example, the Wizard C compiler,
forerunner of Turbo C, could generate either assembler or object.  Originally
there were separate versions of pass 2, depending on which you wanted.  In
the last release its author combined them, and the combined version was only
10% bigger than the previous object-only version. -John]

johnl@ima.UUCP (12/07/87)

In article <767@ima.ISC.COM> harvard!drilex!dricej writes:
>The only other compiler I have heard of that generates intermediate
>assembler was the CDC FTN compiler.  ...

NO!  The CDC Fortran compiler can generate assembly OUTPUT if you wish (sort
of like the -S option for cc), but internally it generates an intermediate
language, then has the Common Code Generator turn that directly into binary
code.  Also, you can have a routine be in assembly by just using the
COMPASS mnemonics (i.e., have 'IDENT' in column 11); ftn is smart enough to
realize that you are using assembly and call the assembler.
My experience with this is about 45 seconds old; I just got off of a CDC
Cyber 170/750.

 Sean Eric Fagan          Office of Computing/Communications Resources
 (213) 852 5742           Suite 2600
 1GTLSEF@CALSTATE.BITNET  5670 Wilshire Boulevard
                          Los Angeles, CA 90036
{litvax, rdlvax, psivax, hplabs, ihnp4}!csun!aeusesef
[So how fast is the compiler?]

atbowler@orchid.waterloo.edu (Alan T. Bowler [SDG]) (12/09/87)

I have 2 major objections to using the assembler as the object deck
writer.
1) Efficiency.  It just takes time to run the 2 or 3 passes of
   the assembler over all that ASCII text so that it can
   recompute a binary number (i.e. the instruction value)
   that the compiler knew when it wrote the text.
   In practice, I've found you can write one set of routines to
   emit code and build an object deck, and use the same set of
   routines to write your assembler (see the sketch after point 2).
   This way you still write the routines that know the object
   deck format only once, but you get the faster compiler.
2) If you do go the route of writing assembler input, you are
   under tremendous pressure to make the assembler "fast".  This
   usually results in a lot of things like a good macro processor
   and multiple user-defined relocation counters being left out.
   The result is a fast assembler that is very difficult to use
   for actual assembler programming, because its primary mission
   is to be the object deck writer for the compiler instead of
   a tool for the programmer.  If the assembler is only used to
   process the small percentage of code written by programmers
   in assembler, you can afford to let it be somewhat slower
   and support the features that allow a programmer to avoid
   the error-prone dog work.
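
A minimal sketch in C of the scheme in point 1 (the routine names and
the deck format shown are invented for illustration): one small module
owns the object-deck format, and both the compiler back end and the
assembler call it, so the format knowledge is still written only once.

    #include <stdio.h>

    /* The one module that knows the object-deck layout.  The records
       written here are placeholders; a real deck would be binary. */
    void deck_begin(FILE *f, const char *module)
    {
        fprintf(f, "MODULE %s\n", module);
    }

    void deck_word(FILE *f, unsigned w)          /* one code word */
    {
        fprintf(f, "TEXT %06o\n", w);
    }

    void deck_reloc(FILE *f, unsigned off, const char *sym)
    {
        fprintf(f, "RELOC %u %s\n", off, sym);   /* patch off by sym */
    }

    void deck_end(FILE *f)
    {
        fprintf(f, "END\n");
    }

The compiler's code generator calls deck_word()/deck_reloc() directly
as it selects instructions; the stand-alone assembler parses its text
input and then calls exactly the same routines, so only the compiler
skips the binary-to-text-to-binary round trip.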


rcd@ico.isc.COM (12/10/87)

In article <26100001@nixpbe.UUCP>, gla@nixpbe.UUCP writes:
...
> First, there is a performance reason.
> Lexical analysis takes about 30% of the time in any compile-and-assemble
> run.  This can be saved if the assembly step is integrated.

This is a rather rash generalization--the numbers probably run from 5% to
over 60% of the time.

It is also a weak argument against the compiler generating assembly language.
If you're spending that much time lexing, either you've got a very fast
compiler system (in which case it doesn't matter much) or you've done an
abominable job on the lexer (in which case you can rework the lexer).
Bear in mind that the lexer of concern is the one for the assembler, not
the one for the compiler proper--and lexing assembly language <<ought>> to
be easy (thus amenable to simple tuning).
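
For a sense of how easy, here is a sketch (mine, not Dick's) of
essentially the whole token loop for a Unix-style assembler syntax
(character constants, strings, and comments omitted):

    #include <ctype.h>
    #include <stdio.h>

    enum tok { T_EOF, T_NL, T_NAME, T_NUM, T_PUNCT };
    static char text[64];

    /* Return the next token, leaving its spelling in text[].  Names,
       numbers, single-character punctuation, newline, end of file --
       that is nearly the entire lexical structure. */
    static enum tok next(FILE *f)
    {
        int c, i = 0;
        while ((c = getc(f)) == ' ' || c == '\t')
            ;                                 /* skip blanks */
        if (c == EOF)  return T_EOF;
        if (c == '\n') return T_NL;
        if (isalpha(c) || c == '_' || c == '.') {  /* name, opcode, directive */
            do {
                if (i < 63) text[i++] = c;
                c = getc(f);
            } while (isalnum(c) || c == '_' || c == '.');
            ungetc(c, f); text[i] = '\0'; return T_NAME;
        }
        if (isdigit(c)) {                     /* decimal, octal, 0x... */
            do {
                if (i < 63) text[i++] = c;
                c = getc(f);
            } while (isalnum(c));
            ungetc(c, f); text[i] = '\0'; return T_NUM;
        }
        text[0] = c; text[1] = '\0';          /* ',' '(' ')' '+' ... */
        return T_PUNCT;
    }

There is almost nothing here to tune, which is the point.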

[This is not to say that it doesn't happen.  I recall the early ETH Pascal
compilers spending something over 50% of their time in source input and
lexical analysis.  It was a combination of careless lexical analysis, a
mediocre text I/O system, and a very fast rest-of-the-compiler with trivial
work in the code generation.]
[I've heard that an astounding fraction of PCC's time is usually spent in
the printf() calls to write out the assembler code.  -John]
-- 
Dick Dunn      UUCP: {hao,nbires,cbosgd}!ico!rcd       (303)449-2870
   ...CAUTION:  I get mean when my blood-capsaicin level gets low.


rcodi@yabbie.rmit.oz.au (Ian Donaldson) (12/14/87)

In article <782@ima.ISC.COM>, rcd@ico.isc.COM writes:
> [I've heard that an astounding fraction of PCC's time is usually spent in
> the printf() calls to write out the assembler code.  -John]

Yes, and the Berkeley Pascal compiler is similar.  I once gprof'd both
pcc and pc and found this, and then went and changed all the printf's
that had no variable part (i.e. no %'s) into basically puts().
There were a lot of them.  It did speed things up significantly.
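
The change is mechanical.  In C (the instruction text here is just
illustrative):

    #include <stdio.h>

    /* Before: printf scans its format string for '%' on every call,
       even though this format is constant. */
    static void emit_move_slow(FILE *f)
    {
        fprintf(f, "\tmovl\tr0,r1\n");
    }

    /* After: a constant format containing no %'s can become fputs(),
       which simply copies the bytes out. */
    static void emit_move_fast(FILE *f)
    {
        fputs("\tmovl\tr0,r1\n", f);
    }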

With Pascal, in an environment where separate compilation is inconvenient
(due to portability problems with other compilers and OS's), one is faced
with most of the compilation time being spent in the assembly phase with
most versions of the Berkeley compiler, at least for the 68000 family.
(I haven't seen real Berkeley pc on other than Vax or 68k machines.)

This is due to the span-dependent instruction resolution (SDI) being
done over the entire (often 20000-line) assembly file.   (SDI's are
long/medium/short branch optimizations).  
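
To see why doing this over one huge file is expensive, here is a
sketch in C of the classic relaxation algorithm (simplified; the
growth size and displacement range are the 68000's, and real
assemblers are subtler): start every branch short, then repeatedly
lengthen any branch whose displacement no longer fits.

    struct branch {
        int at;        /* offset of the branch instruction */
        int target;    /* offset it branches to */
        int is_long;   /* already promoted to the long form? */
    };

    /* A short 68000 branch holds an 8-bit displacement. */
    static int fits_short(int disp)
    {
        return disp >= -128 && disp <= 127;
    }

    /* Each promotion grows the code by 2 bytes and shifts every later
       offset, possibly pushing other branches out of range -- hence
       the rescan until nothing changes.  Over a 20000-line file the
       rescans add up. */
    static void resolve_sdi(struct branch *b, int n)
    {
        int changed = 1;
        while (changed) {
            changed = 0;
            for (int i = 0; i < n; i++) {
                if (b[i].is_long || fits_short(b[i].target - b[i].at))
                    continue;
                b[i].is_long = 1;
                changed = 1;
                for (int j = 0; j < n; j++) {
                    if (b[j].at > b[i].at)     b[j].at     += 2;
                    if (b[j].target > b[i].at) b[j].target += 2;
                }
            }
        }
    }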

In the Sun version of /bin/as, there is an option (-J) that turns off
SDI optimizations so that back-references are efficient and forward
references are inefficient but will always work.  The code that comes
out is a bit slow, but compilation time is dramatically reduced.

In the Sun documentation for their assembler (SunOS 3.2 or 3.3) there
is mention of a ".proc" directive that breaks up the assembler file
into separate modules, so that SDI optimization can be done more quickly
in a localized area such as a procedure or function.  Unfortunately,
the compilers don't yet generate such a directive.

At the moment, the Sun Berkeley Pascal compiler (SunOS 3.3) is about 4
times -slower- than a very old (1984 vintage) Silicon Valley Software
(SVS) Pascal compiler on the same machine.  With an 1800-line program,
I get the following figures:

	pc -g -H -C test.p		17.2u 1.3s 0:24   Current Sun/Berkeley
	spc -g test.p 			 4.3u 0.8s 0:08   very old SVS

(Both compiles generate run-time check code and debug code; I didn't
use -O with pc since spc doesn't have an equivalent.)  The code tested
here is very typical -- 18000-line programs yield much the same result.

This is interesting, because the SVS compiler is also multi-pass, and
the first passes of the two compilers take about the same time as each
other.  This means that the parsing/lexical analysis is of similar
quality in each.

Berkeley pc does the following passes:  pc0-f1-pc2-[c2]-as-pc3
(on the Sun it's slightly different now, but basically the same).
SVS Pascal does the following passes:   pascal-code-ulinker

pc0 and pascal are about the same speed and do pretty much the same things,
i.e. generate a binary description for the code generator (f1-pc2, code).

This means that the speed differences come in at the code-generation
level.  SVS's code is generated by "code" in one step after parsing.
The "ulinker" stage is a module linker that also converts the SVS object
format into UNIX .o format, and -also- scans the SVS Pascal library
for library routines.  The result only has to be linked with the
C library to be complete.  The ulinker stage is necessary because
of the UCSD nature of SVS Pascal -- module interdependencies must
be checked.  (It's done differently on the ns32000 version.)

The Berkeley pc compiler uses 3-4 passes in code generation 
(f1-pc2-[c2]-as-pc3), and this is where the big difference in speed lies.
You've got 2-3 parsers in this (one in pc2, one in c2, one in as), and
3-4 output routines (one in f1, one in pc2, one in c2 and one in as).
All this time is spent converting from binary to text and back again,
several times.

The code that pops out of the end of the two compilers is comparable in
quality when both generate 68000 code (our -old- version of SVS only
generated 68000 code; I suspect that newer SVS releases would reverse
this trend).

Prior to SunOS 3.0, the code that popped out of the end of the two
compilers was far better in SVS's case; now it is slightly the other way
around, probably due to a major overhaul of the code generator in pc0.
(I saw the source of the old version of pc distributed with the Vax 4.2
release, the one that came with SunOS 2.0-2.2; the 68k code in there was
abominable in parts -- almost no register variables were used.)

Ian D

johnl@ima.ISC.COM (12/16/87)

This discussion of compiler output is taking on the flavor of "my
compilation technique is better than yours." The issue of whether a
compiler should produce assembly language or machine code output
depends on the application:

1) Operating System Compiler Development

The goal of the operating system compiler development group
is to write a compiler that executes quickly and eventually
produces a machine language equivalent of the source code.
One can assume that the object file format and assembler
source code are available and can be incorporated into the
design of the compiler. It is not really necessary to produce
assembly language output.

2) Third Party Compiler Development

The goal of third party compiler development is to produce
a compiler that does something different or better than the
compilers supplied with the operating system. A major difference
between third party and operating system compiler development
is that the object file format and assembler source code may
not be available to the third party developers. Therefore, it
may be cost effective (in terms of development time and $$$) to
produce assembly language and let the assembler do the rest.

3) Academic Compiler Development

There are dozens of reasons for writing a compiler in an academic
environment. If the purpose is to explore compilation techniques
then assembly language or pseudo code output may be acceptable. If
the purpose is to explore compiler execution time optimization or
object code production then object code generation may be required.

4) Other Applications - Please Specify: __________

There are certainly many more reasons.  The point is that certain
techniques are appropriate in certain cases and inappropriate in others.
My personal favorite, given a good macro assembler, is to output
three-address code and subsequently assemble it using a macro library
written for the particular machine.  It is easy to verify the compiler
output and it is quite portable.  This approach resulted in acceptable
compilation times in an application where the compiler was used much
less frequently than the code it produced.
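
As an illustration of that last scheme (mine, not the poster's; the
opcode and temporary names are invented), the compiler's entire output
routine can be as small as:

    #include <stdio.h>

    /* Write one three-address operation as a macro call; a hand-written
       macro library for each target machine expands it into real
       instructions. */
    static void tac(FILE *f, const char *op, const char *dst,
                    const char *a, const char *b)
    {
        fprintf(f, "\t%s\t%s,%s,%s\n", op, dst, a, b);
    }

Compiling x := y + z * 3 would then emit something like

	MUL	T1,Z,=3
	ADD	X,Y,T1

which is easy to check by eye, and retargeting means rewriting only
the macro library.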

        Anthony F. Stuart, {uunet|sundc}!rlgvax!tony
        CCI, 11490 Commerce Park Drive, Reston, VA 22091