[comp.compilers] Simplistic Assemblers?

johnl@ima.ISC.COM (12/16/87)

I realize that this forum is "of, by, and for" compiler gurus, and so I
shouldn't expect expertise in assembler technology.  In case some of the
participants are interested, I would like to take a moment to dispute the
claims made here about simplistic assemblers being better for the human
assembler language programmer than complex assemblers.

By analogy, a C compiler could be made easier for the programmer to use if
we removed features like automatic variables, structures, unions, multiple
subscripts on arrays, the preprocessor, and especially those pesky pointers
which nobody can ever seem to get right.  While we're at it, we can get rid
of a bunch of other complexities by eliminating separate compilation.  Hey,
you can still write a Turing machine in the remaining language, so we
haven't removed anything that was absolutely critical, right?

If this seems like a ridiculous notion, you're right.  But that is a close
approximation to what has been gutted from the typical Unix (tm) assembler
program.  Today's assemblers are generally less capable than those which
were designed for the first-generation computers of the 1950s.

There is a good reason that the OS/360 assembler required four passes
(which still wasn't always enough; the OS/VS assembler can take an
unlimited number of passes).  The OS/360 assembler was designed for use by
human programmers, in a time when assembler language programmers weren't
considered to be a subhuman lifeform.

There is a modern aphorism that "It's much harder to program in assembler
than in a high-level language".  That's no surprise, considering that the
typical programmer has neither training nor experience in assembler coding.
We further hobble him by providing stone ax assembler tools, and then we
have the gall to complain that the resulting program looks like it was
butchered with a stone ax.

Assembler language will always be difficult (an understatement) to "port".
But assembler language doesn't HAVE to be difficult to program in, and the
programs don't HAVE to be difficult to read and maintain.  IF reasonable
programming tools are available to the programmer (and the programmer has
the training and experience to use those tools).
--
Doug Pardee -- Edge Computer Corp., Scottsdale, AZ -- uunet!ism780c!edge!doug,
{ames,hplabs,sun,amdahl,ihnp4,allegra}!oliveb!edge!doug,    mot!edge!doug

[My understanding was that the four-pass assembler did so because it was
shoehorned into 44K; the 256K assembler usually managed in two.  It is also
true that there are some spiffy assemblers, most notably the ones for the
IBM 370 series, that have enormously powerful macro languages that let you
do all sorts of astounding stuff.  For example, I know of at least two IBM
computers for which IBM never wrote an assembler, rather defining macros
for the 370 assembler and slightly postprocessing the object deck.

On the other hand, there is nothing that says that the powerful macro language
necessarily has to be bound up with the assembler. Indeed, I know people who
use the 370 assembler as a macro processor, getting their results directly on
the listing or in the object "deck," never generating an executable program,
which suggests to me that we have an overintegrated program. Unix systems
traditionally have a minimal assembler and a separate macro processor, which
seems usually to be sufficient. I can't help but notice that in all of the
swell IBM assemblers, they never did stuff like optimizing span-dependent
branches on machines like the 1130 and more recently the ROMP, which Unix
assemblers have always done. In fact the IBM ROMP assembler had a warning "you
could have used a short branch instead of a long one here," but it never
occurred to them to have it do anything about it.

It has always seemed to me that the main place that assembler loses is in data
structuring -- typed data and pointers buy you a lot, particularly in the
error checking department, and it'd take an awful lot of macro magic to give
an assembler that. -John]
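
The span-dependent branch handling mentioned in the note above can be
sketched roughly as follows. This is a minimal illustration, not the code of
any particular Unix or IBM assembler: the Item record, the 2- and 4-byte
encodings, and the -128..127 reach of the short form are all assumptions made
up for the example. Every branch starts short and is widened only when its
displacement does not fit; widening one branch can push another out of range,
so the pass repeats until nothing changes.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical encodings: a short branch takes 2 bytes, a long one 4.
SHORT, LONG = 2, 4
# Hypothetical reach of the short form, measured from the end of the branch.
SHORT_RANGE = range(-128, 128)


@dataclass
class Item:
    """One chunk of the program: plain bytes, or a branch if target is set."""
    size: int = 0                  # byte size (ignored for branches)
    label: Optional[str] = None    # label defined at the start of this item
    target: Optional[str] = None   # label branched to, if this is a branch


def size_branches(items: List[Item]) -> List[int]:
    """Start every branch short and widen until all displacements fit."""
    sizes = [SHORT if it.target else it.size for it in items]
    changed = True
    while changed:
        changed = False
        # Recompute every address from the current size guesses.
        addr, pc = [], 0
        for s in sizes:
            addr.append(pc)
            pc += s
        labels = {it.label: addr[i] for i, it in enumerate(items) if it.label}
        for i, it in enumerate(items):
            if it.target and sizes[i] == SHORT:
                disp = labels[it.target] - (addr[i] + SHORT)
                if disp not in SHORT_RANGE:
                    sizes[i] = LONG   # branches only ever grow, so this ends
                    changed = True
    return sizes


program = [
    Item(target="end"),                 # jumps over 200 bytes: must be long
    Item(size=200),
    Item(label="loop", target="loop"),  # jumps to itself: short is enough
    Item(size=2, label="end"),
]
print(size_branches(program))
```

Because a branch is never shrunk once widened, the loop is guaranteed to
stop after at most one pass per branch.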
--
Send compilers articles to ima!compilers or, in a pinch, to Levine@YALE.ARPA
Plausible paths are { ihnp4 | decvax | cbosgd | harvard | yale | bbn}!ima
Please send responses to the originator of the message -- I cannot forward
mail accidentally sent back to compilers.  Meta-mail to ima!compilers-request

jiml@cs.wisc.edu (James E. Leinweber) (12/17/87)

In article <789@ima.isc.com> Doug Pardee writes:
>There is a modern aphorism that "It's much harder to program in assembler
>than in a high-level language".  That's no surprise ...

John Levine adds:
>It has always seemed to me that the main place that assembler loses is in data
>structuring -- typed data and pointers buy you a lot, particularly in the
>error checking department, and it'd take an awful lot of macro magic to give
>an assembler that.

Most people won't dispute that having a compiler do some of the
work such as register allocation and strong type checking really
does make programming easier. Even if you are using a fancy
assembler. Or maybe especially; programs which are mostly
sophisticated macros such as fancy assembler, troff packages, or
FORTH are notoriously hard to read. Personally, I suspect this is
due to the difficulty of reproducing the train of thought the author
had while writing the code. Working in a very expressive medium
makes it easy to embody an idiosyncratic approach into the code.

Intermediate between assemblers and higher level languages is an
alternative which is often overlooked.  In the late '60s Don Knuth
called them "structured assemblers".  IBM is using one to do space
shuttle software.  Ed Ream published a lovely article in Dr. Dobb's
last year, showing how to use a rich subset of C as an assembly
language.  His key point was that data structures, procedures, and
variables all have straightforward translations to assembly code.  In
the absence of optimization, all you need to give you full assembly
language power with structured syntax is enough restrictions on the
variable declarations that you end up doing the register allocation
yourself.

Jim Leinweber		jiml@uwslh.uucp		jiml%uwslh.uucp@cs.wisc.edu
 ...!{rutgers, ucbvax, ihnp4, ...}!uwvax!uwslh!jiml
State Laboratory of Hygiene @ Univ. of Wisconsin - Madison; (608) 262-8092
[There used to be a lot of structured assemblers; I used one for the 360
called PL/360 which had a distinct Algol flavor.  There used to be something
called LIL at Bell Labs which was sort of souped down C.  The guy who wrote
LIL wrote a tech report in which he said that he considered LIL a failure
because there wasn't anything it did that you couldn't do about as well in
C.  -John]

peter@sugar.UUCP (Peter da Silva) (12/20/87)

[ Referenced article complained about the simplicity of AS and how much more
  useful some high-zoot macro jobbadoo was ]

Have you ever heard of preprocessors?

I found the UNIX "as" assembler plus the "m4" preprocessor much more useful
than DEC's fully-featured Macro-11. For EXACTLY the same program on the same
processor: John-James FIG-Forth for the PDP-11. Some of the features of PDP-11
"as" are very nice indeed. Much more useful than combining two programs in one
piece (macro pre-processor and assembler).

Don't you prefer:

	head(type,,docol,)		/ not entirely sure of the number
	over; plus; swap		/ of commas in the macro here.
1:	pdo; i; emit; ploop; 1b-.
	semis

to:

TYPE::	HEAD	<TYP,E>,,DOCOL		; Ditto.
	.WORD	OVER, PLUS, SWAP
XTYPAA:	.WORD	PDO, I, EMIT, PLOOP, XTYPAA-$
	.WORD	SEMIS
 
-- Peter da Silva  `-_-'  ...!hoptoad!academ!uhnix1!sugar!peter
-- Disclaimer: These U aren't mere opinions... these are *values*.

johnl@ima.ISC.COM (01/06/88)

In article <789@ima.ISC.COM> <ames!oliveb!edge!doug> writes:
>I would like to take a moment to dispute the
>claims made here about simplistic assemblers being better for the human
>assembler language programmer than complex assemblers.
> <....lots of detail omitted....>
>
I agree with this poster, and disagree with all the people who've
responded with things like:
	What you really need is a good macro processor SEPARATE from the 
		assembler.
	For example, 'M4' is wonderful, and does everything an 
		assembler programmer can possibly want.

This is absurd. I've had the misfortune of porting 68010 assembler 
programs from a Motorola development system (with a 'real' assembler)
to the Sun Unix environment (with Sun's Unix 'assembler' and M4).

(The reason for this was budget; management wanted all development on 
the same system, and couldn't at first see why we should pay for a real
assembler when we had one that came with Unix. The TARGET system remained
the same.)

M4 is NOT a full function macro processor. 

My biggest problem was that it doesn't know how to translate a character 
into its ascii value. We had a convention of storing strings as a sequence 
of characters with the high bit set on the last character. Strings were 
defined by a macro. How do you write that macro in M4 ? (Yes, there IS 
a kludge around...not a nice one.)

This was not the only missing function !

M4 is not integrated with the assembler. How do I force a string 
(in the data segment or bss, not the text segment) to start on a word
or long word boundary, when the assembler has no alignment directive
available (except in the text segment) ? 
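
The arithmetic behind that alignment question is small enough to show. The
sketch below is only an illustration of the padding calculation; the function
names are invented for the example, and it assumes the caller already knows
the current offset in the data segment, which is exactly the information a
macro processor running ahead of the assembler does not have.

```python
def padding_for(offset: int, alignment: int) -> int:
    """Fill bytes needed so that offset + padding is a multiple of alignment."""
    if alignment <= 0:
        raise ValueError("alignment must be positive")
    return -offset % alignment


def align_up(offset: int, alignment: int) -> int:
    """Next offset at or after `offset` that sits on the given boundary."""
    return offset + padding_for(offset, alignment)


# A string that would start at offset 5 needs 1 fill byte to reach a word
# (2-byte) boundary and 3 fill bytes to reach a long word (4-byte) boundary.
print(padding_for(5, 2), padding_for(5, 4))
```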

And then there's the problem of an assembler that's designed to assemble
only the output of a C compiler, not ordinary human generated assembler.
It is NOT fun to have the assembler fail to detect your typos, and
generate semi-random code. I could cope with an assembler which lacks
explanatory error messages, if it would at least catch all my errors !
I could not cope with the Sun assembler, which I understand is a typical 
Unix assembler, possibly better than some of the others.

Management eventually accepted my recommendation that they buy a real
assembler for the Sun...it cost much less than the time I was wasting
wrestling with Sun's assembler.

In all fairness: this was a purchased program, written in an environment
where macro facilities were taken for granted. If I'd been writing it 
from scratch ON THE SUN I would have used C style strings, in spite of 
the extra memory, and I wouldn't have made data structures dependent 
on word alignment. Similarly, I would have designed around all the other
problems. But then, poor tools have always placed arbitrary restrictions
on program design !

I don't think having M4 separate from the assembler was the real cause
of any of my problems. It exacerbated the weaknesses of both tools by 
making it impossible for them to support each other. (Assembler programmers
frequently write macros to replace missing assembler directives like
'place this point at a page boundary'; include a good selection of
assembler directives and integration won't be needed.)

				Arlie Stephens
				Don't try to reply; this account dies today.
[Unless people have other actual interesting experience to report, we can
probably consider this topic finished.  -John]

peter@sugar.UUCP (Peter da Silva) (01/24/88)

In article <842@ima.ISC.COM>,  writes:
> M4 is NOT a full function macro processor. 
> 
> My biggest problem was that it doesn't know how to translate a character 
> into its ascii value. We had a convention of storing strings as a sequence 
> of characters with the high bit set on the last character. Strings were 
> defined by a macro. How do you write that macro in M4 ? (Yes, there IS 
> a kludge around...not a nice one.)

You shouldn't need to. I ported a FORTH from MACRO-11 to M4 plus AS, and
had to do this very thing. I don't recall how I did it (this was something
like 3 years ago), but I'm pretty sure it did something like:

	head(dup...

generated:

	byte 3|200, 'd', 'u', 'p'|200

Doesn't look like a kludge to me...
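
For anyone who wants to see the expansion spelled out, here is a rough sketch
of it as a small generator rather than as M4. It is not the macro recalled
above: the function names are made up, and the output format simply mirrors
the "byte" line as quoted from memory. The length byte and the last character
both get the high bit (octal 200) set.

```python
HIGH_BIT = 0o200  # octal 200, i.e. bit 7 of the byte


def head_bytes(name: str) -> bytes:
    """Length byte with the high bit set, then the name, last char flagged."""
    data = name.encode("ascii")      # rejects characters above 127
    if not 0 < len(data) < 128:
        raise ValueError("name must be 1..127 ASCII characters")
    out = bytearray([len(data) | HIGH_BIT])
    out += data[:-1]
    out.append(data[-1] | HIGH_BIT)
    return bytes(out)


def head_directive(name: str) -> str:
    """Render the same bytes as an assembler line in the style quoted above."""
    head_bytes(name)                 # reuse the validation
    chars = [f"'{c}'" for c in name]
    chars[-1] += "|200"
    return f"\tbyte {len(name)}|200, " + ", ".join(chars)


print(head_directive("dup"))
print(head_bytes("dup").hex())
```

The first function shows what ends up in memory; the second shows that the
text the assembler sees needs no character-to-ASCII conversion in the macro
processor, since the assembler evaluates 'p'|200 itself.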

> M4 is not integrated with the assembler. How do I force a string 
> (in the data segment or bss, not the text segment) to start on a word
> or long word boundary, when the assembler has no alignment directive
> available (except in the text segment) ? 

Your assembler should have that function. The pdp-11 version of as did.

> And then there's the problem of an assembler that's designed to assemble
> only the output of a C compiler, not ordinary human generated assembler.

Could be a problem, yes. It didn't seem too bad for me. All my interface
code for the file I/O was in as and it came together pretty well. It
did catch typos in mnemonics pretty well...

> I could not cope with the Sun assembler, which I understand is a typical 
> Unix assembler, possibly better than some of the others.

It sounds like the Sun 68000 assembler is NOT better than the typical "as".
As should provide all the functions you mentioned. Perhaps the pdp-11 as
was more heavily used by the programmers who developed it. I believe that
quite a bit of 5th edition was written in it.
[true, including the notorious `fc' Fortran.  -John]
-- Peter da Silva  `-_-'  ...!hoptoad!academ!uhnix1!sugar!peter