[comp.compilers] Assemblers can be fast

johnl@ima.ISC.COM (12/17/87)

Several people have implied recently that assemblers are necessarily
slow because they require two or more scans of the source being assembled.
I recently wrote an assembler for the NS32000.  In the process, I concluded
that much of what I have read about how assemblers ought to be designed is
out of date.  In particular, I do not believe assemblers ought to look at
the source more than once.  I doubt that what I am about to describe is new.
However, the slow speed and various other properties of some very popular
assemblers (especially for MS-DOS) lead me to believe that the traditional
two-pass assembler is still common today.

The first phase of my assembler scans the source, emitting what it can of
the object, and creating data structures which allow the final, correct,
object to be emitted once the whole source has been seen.  In particular,
a directed graph is built whose nodes are symbols.  An edge exists from
node A to node B if A is defined by an expression which references B.
Expressions which cannot be resolved when they are seen in the source
are stored in a compact, tokenized, reverse Polish form.
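
To make that concrete, here is a rough sketch in C of the kind of data
structures involved (illustrative only; the names and details are invented
rather than taken from the actual assembler):

    /* One RPN token of a deferred expression. */
    enum tok_kind { TOK_CONST, TOK_SYMREF, TOK_ADD, TOK_SUB, TOK_MUL };

    struct token {
        enum tok_kind kind;
        long value;                /* constant, or symbol-table index */
    };

    struct expr {                  /* compact reverse-Polish form */
        int ntok;
        struct token *tok;
    };

    struct symbol {                /* a node of the directed graph */
        char *name;
        long addr;                 /* preliminary address from phase one */
        int resolved;              /* nonzero once the final value is known */
        int visiting;              /* guards against circular definitions */
        struct expr *defn;         /* defining expression, if any */
        struct symbol **refs;      /* edges to the symbols it references */
        int nrefs;
    };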

During the first phase, all forward branch displacements are assumed to
require four bytes.  Phase two reduces the length of displacements, where
possible, to one or two bytes.  For this purpose, preliminary addresses of
labels are recorded during phase one; the addresses are in general too large
because the labels may be preceded by displacements which can be reduced.
Because reducing one displacement can never increase another, the
approximate addresses are sufficient for determining displacement sizes.
At the end of phase two, the final addresses of all labels are known.
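
Again purely as an illustration (not the real code), the shrink step can be
as simple as comparing each recorded forward branch against the preliminary
label addresses:

    /* Every forward branch was emitted with a 4-byte displacement in
       phase one; label_addr[] holds the preliminary (over-large)
       addresses.  Shrinking can only bring a branch and its target
       closer together, so a displacement that fits when computed from
       the preliminary addresses also fits when computed from the
       final ones. */

    struct branch {
        long site;     /* preliminary address of the displacement field */
        int label;     /* index of the target label */
        int size;      /* chosen displacement size: 1, 2 or 4 bytes */
    };

    void reduce_displacements(struct branch *b, int nb, const long *label_addr)
    {
        int i;
        for (i = 0; i < nb; i++) {
            long disp = label_addr[b[i].label] - b[i].site;
            if (disp >= -128 && disp <= 127)
                b[i].size = 1;
            else if (disp >= -32768 && disp <= 32767)
                b[i].size = 2;
            else
                b[i].size = 4;
            /* accumulating the bytes saved (4 - size) ahead of each label
               is what turns the preliminary addresses into final ones */
        }
    }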

Phase three evaluates all expressions which were unresolved during phase
one and back patches the object.  If expression E references unresolved
symbol S, then the expression for S is recursively evaluated before
proceeding with the evaluation of E.  Thus, chains of symbols which have
forward references to other symbols may be unbounded in length; the
traditional two-pass assembler cannot handle such chains.
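
In the same illustrative spirit (eval_expr() is assumed to exist and to
evaluate a stored RPN expression, calling back into eval_symbol() whenever
it meets a symbol reference; none of this is the actual code), the
recursive resolution might look like:

    long eval_expr(struct expr *e);                /* assumed to exist */
    void error(const char *msg, const char *sym);  /* assumed to exist */

    long eval_symbol(struct symbol *s)
    {
        if (s->resolved)
            return s->addr;
        if (s->visiting)
            error("circular definition of symbol", s->name);
        s->visiting = 1;
        s->addr = eval_expr(s->defn);              /* may recurse */
        s->visiting = 0;
        s->resolved = 1;
        return s->addr;
    }

    /* For each back-patch record (object offset, expression, width):
       evaluate the expression, seek to the offset, rewrite the bytes. */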

Since users request listings infrequently, I decided it was permissible
to reread the source to produce a listing (though the source does not
need to be parsed).  An optional fourth pass merges the object with the
source for this purpose.

I assume the two-pass assembler evolved because it could assemble very
large programs on computers with very limited memory.  Since we no longer
use machines with small memories, I think the approach I took makes more
sense for a modern assembler.
[It's true, assemblers used to run in real memory just like any other
program.  I've used very credible assemblers that ran in 4K.  But if you
have infinite memory, sure, what the heck, buffer it all in memory.  The
scheme in your pass two is the same one that existing Unix assemblers use
now to do branch optimization.  It's not quite optimal, but the extra work
to ensure optimality isn't worth it.  But be prepared for users to punch
you in the nose when they add two lines to their assembler program and
the assembler says "you lose, can't buffer it in 640K."  -John]
--
Send compilers articles to ima!compilers or, in a pinch, to Levine@YALE.EDU
Plausible paths are { ihnp4 | decvax | cbosgd | harvard | yale | bbn}!ima
Please send responses to the originator of the message -- I cannot forward
mail accidentally sent back to compilers.  Meta-mail to ima!compilers-request

johnl@ima.UUCP (12/20/87)

> [It's true, assemblers used to run in real memory just like any other
> program.  I've used very credible assemblers that ran in 4K.  But if you
> have infinite memory, sure, what the heck, buffer it all in memory.  The
> scheme in your pass two is the same one that existing Unix assemblers use
> now to do branch optimization.  It's not quite optimal, but the extra work
> to ensure optimality isn't worth it.  But be prepared for users to punch
> you in the nose when they add two lines to their assembler program and
> the assembler says "you lose, can't buffer it in 640K."  -John]

Real computers aren't limited to 640K :-). And if anybody wrote a 640K
assembly language program all in one file, and really expected my
assembler to assemble the darn thing, I'd have to say that they got
what they deserved :-). There's always a limit, somewhere -- if it
ain't 640K, it's the amount of space available for your symbol table,
or what have you. You're never going to be able to assemble
infinite-size source files.

But, seriously... one-pass assembling isn't extremely difficult. What
you do is, on the first pass, emit object code with "fillers" for
expressions you couldn't evaluate, and in memory, you keep a symbol
table and a list of the expressions that couldn't be evaluated, along
with their locations (NOT a copy of the entire source program); then
you go back and patch the object code with the right addresses (a
fixup over the object, not another pass over the source).  It's easy
enough to do for something like a 6502, where there's no long-branch
vs. short-branch to worry about (the only thing to worry about is
zero-page vs. absolute, and most assemblers wimp out by saying that if
the expression isn't defined the first time it's hit, then it's
absolute).
    For more complex architectures, more work is required, but the
principle is the same.  Note that you do NOT have to buffer the entire
thing in RAM (just some of it; the object itself can be buffered on disk),
and yes, there is a limit, but there's always a limit, somewhere -- I
regularly use a 2-pass assembler-editor package that takes up 12K on an
8-bit machine, and what I'm always running out of is symbol table
entries (darned symbol table is limited to 8K, total).
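
For a target like that, with fixed-size operands, the bookkeeping is tiny.
A minimal sketch in C (the names are made up, and eval_expr() stands in for
whatever expression evaluator the assembler already has; it is not meant as
anybody's actual implementation):

    #include <stdio.h>

    struct fixup {                 /* one unresolved operand */
        long objpos;               /* byte offset of the operand in the object */
        struct expr *expr;         /* the expression we couldn't evaluate yet  */
        struct fixup *next;
    };

    long eval_expr(struct expr *e);    /* assumed: all symbols known by now */

    /* After the single pass over the source, walk the fixup list and patch
       the object file in place; the source is never reread. */
    void apply_fixups(FILE *obj, struct fixup *f)
    {
        for (; f != NULL; f = f->next) {
            long v = eval_expr(f->expr);
            fseek(obj, f->objpos, SEEK_SET);
            fputc(v & 0xff, obj);            /* 6502 operands: low byte, */
            fputc((v >> 8) & 0xff, obj);     /* then high byte           */
        }
    }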

Still, on the great assembly-language vs. direct-to-object debate, I
must conclude that emitting assembly language slows things down
a lot. This is especially true on microcomputers, with their slow disk
drives, which is why most microcomputer compilers directly emit object
files.
--
Eric Green  elg@usl.CSNET       P.O. Box 92191, Lafayette, LA 70509
{ihnp4,cbosgd}!killer!elg,      {ut-sally,killer}!usl!elg

johnl@ima.UUCP (12/23/87)

In article <813@ima.ISC.COM> Green Eric Lee <ihnp4!usl!usl-pc!jpdres10> writes:
>Still, on the great assembly-language vs. direct-to-object debate, I
>must conclude that emitting assembly language slows things down
>a lot. This is especially true on microcomputers, with their slow disk
>drives, which is why most microcomputer compilers directly emit object
>files.

One solution that I have worked on is to emit a hybrid assembler / object
file. For instance, if the assembler source were

	mov	d7,-(a7)

then the object for this can be generated by the compiler as a 16-bit word,
whereas for something like

	bcc	L42

the compiler emits the assembler as shown.

Now you have the advantage of assembler / HLL mixing, and IF you want to
mess with the assembler, a separate translator will convert from the
hybrid language to full assembly in a twinkle; however you still keep the
advantage of much less disk I/O: the 'mov' line above is 14 bytes of text,
against 2 bytes for the assembled output.  Averaged over a whole file (not
every line can be pre-assembled), this probably means about a 60% saving
in the size of the file being processed.
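
One way to picture such a hybrid file (purely illustrative; this is not a
description of an actual format) is as a stream of tagged records, where a
back-end translator copies binary records straight through and assembles
only the text ones:

    /* Illustrative record layout for a hybrid assembler / object file. */
    enum { HREC_BIN = 0,        /* payload is raw object bytes               */
           HREC_ASM = 1 };      /* payload is assembler text, e.g. "bcc L42" */

    struct hrec {
        unsigned char tag;      /* HREC_BIN or HREC_ASM */
        unsigned short len;     /* number of payload bytes that follow */
        /* payload follows the header */
    };

    /* The translator reads records in order: HREC_BIN payloads go straight
       to the output object; HREC_ASM payloads are fed through the ordinary
       assembler, which resolves the remaining labels and branch sizes. */
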
--
		dg@wrs.UUCP - David Goodenough

rcodi@yabbie.rmit.oz.au (Ian Donaldson) (01/05/88)

in article <813@ima.ISC.COM>, Eric Green (elg@usl.CSNET) says:
> Real computers aren't limited to 640K :-). And if anybody wrote a 640K
> assembly language program all in one file, and really expected my
> assembler to assemble the darn thing, I'd have to say that they got
> what they deserved :-). There's always a limit, somewhere -- if it ...

Compilers that emit assembly language -often- create megabyte-sized files
for the assembler.

> But, seriously... one-pass assembling isn't extremely difficult. What
> you do is, on the first pass, emit object code with "fillers" for
> expressions you couldn't evaluate, and in memory, you keep a symbol
> ...

This is easy if you don't have to worry about instructions that
change their size depending on the expression value (e.g. long, short
and medium branch instructions).  Many assemblers spend time optimizing
such branches, and this generally requires an extra pass to do it properly.
The gains of optimizing such branches are really worthwhile.

Ian D


andrew@frip.gwd.tek.com (Andrew Klossner) (01/16/88)

> "This [one-pass assembly with fixup chains] is easy if you
> don't have to worry about instructions that change their size
> depending on the expression value (e.g. long, short and medium
> branch instructions).  Many assemblers spend time optimizing
> such branches, and this generally requires an extra pass to do
> it properly.  The gains of optimizing such branches are really
> worthwhile."

Branch length optimization is a win, but you don't need multiple passes
to do it.  I recently wrote a one-pass assembler with various sorts of
span length optimization.  There was nothing particularly clever
involved; I just told myself firmly that I wasn't going to do multiple
passes, and after some thought the necessary algorithms became clear.
It helped, but was not vital, that I had a large virtual memory space
and so didn't need to bother with temporary files.

In general, a good guiding principle is that it's much quicker to
review your generated object code in conjunction with in-memory tables
than it is to make another pass over the source code.

  -=- Andrew Klossner   (decvax!tektronix!tekecs!andrew)       [UUCP]
                        (andrew%tekecs.tek.com@relay.cs.net)   [ARPA]