[comp.arch] using assembler

smryan@garth.UUCP (Steven Ryan) (07/27/88)

>Given real world system sizes, budgets, deadlines, and MAINTENANCE
>constraints as your "requirements,"  assembly language will "fit" only
>the smallest, most toy-like, never-to-be-used-for-real prototypes.

Leaping into the fray, I will place a vote for assembly. In a previous
incarnation, I worked on a Fortran compiler written in a macro assembler.
Today I'm working on a compiler written in Pascal. The assembly-written
compiler was easier to maintain for two reasons:

(1)  Using macros, the assembler was more extensible than Pascal.

(2)  Your manager won't let you write an assembly program without block
     comments, an interface description, and comments on nearly every line.

In the case of assembly, nobody pretends the code is self-documenting,
and so everybody is required to maintain the appropriate documentation.

The Pascal code is presumed self-documenting, which is a load of cow doo-doo.
Added to that, old code is commented out rather than deleted. [SCCS? We
don't need no steenking SCCS.]

As a newcomer to Unix, I'm disgusted by the lack of documentation, both
embedded in the code and in the user manuals/interface descriptions.
[Interfaces? We don't need no steenking interfaces.] And I thought CDC was
bad. I still don't know what all the VI/EX commands are--none of the ATT
manuals I have describes them fully. I can't use EMACS because nobody has
bothered defining it or making the documentation available.

Assembly is so difficult that nobody deludes himself about the difficulty.

Unix won't go belly up because of ATT's pigging--it'll die when Kernighan,
Ritchie, Thompson, et al retire and the next generation is completely
unable to decipher the supposedly simple and easy operating system.

ken@gatech.edu (Ken Seefried iii) (07/27/88)

In article <1086@garth.UUCP> smryan@garth.UUCP (Steven Ryan) writes:
>>Given real world system sizes, budgets, deadlines, and MAINTENANCE
>>constraints as your "requirements,"  assembly language will "fit" only
>>the smallest, most toy-like, never-to-be-used-for-real prototypes.
>
>Leaping into the fray, I will place a vote for assembly. In a previous
>incarnation, I worked on a Fortran compiler written in a macro assembler.
>Today I'm working on a compiler written in Pascal. The assembly-written
>compiler was easier to maintain for two reasons:
>
>(1)  Using macros, the assembler was more extensible than Pascal.

Hard to believe, but I will concede that it is possible.

>(2)  Your manager won't let you write an assembly program without block
>     comments, an interface description, and comments on nearly every line.

Funny, neither my professors nor my employers let me write C or Modula-2
without them either.  I don't see this as being endemic to assembler.

>In the case of assembly, nobody pretends the code is self-documenting,
>and so everybody is required to maintain the appropriate documentation.
>
>The Pascal code is presumed self-documenting, which is a load of cow doo-doo.
>Added to that, old code is commented out rather than deleted. [SCCS? We
>don't need no steenking SCCS.]

Sounds like a lack of programmer discipline to me.  I have no illusions about
the self-documenting aspects of Pascal; you're right, they don't exist.
C is much worse.  But it is the programmer's fault if any program is not
thoroughly documented, almost line by line.

Also, any team that is working on a large software project and not using
SCCS (or better, RCS), deserves the disaster that results.

>As a newcomer to Unix, I'm disgusted by the lack of documentation, both
>embedded in the code and in the user manuals/interface descriptions.

Join the club...;'}

>Assembly is so difficult that nobody deludes himself about the difficulty.

And C is a cakewalk?  Fine, I'll tell you what...let's get together some
time and program, say, a MIPS machine, you in asm, and me in C.
I'll even tie one hand behind my back and program in FORTRAN.  And
we'll see how far each gets (the MIPS is a RISC machine.  Even trivial
operations require non-trivial amounts of asm code).

That's the problem with assembly in my perception.  It makes EVERYTHING
difficult, especially on some of the new architectures.  At least with
C or Pascal, some things are easy.

>... [about unix] ...the supposedly simple and easy operating system.

I've never heard this from anyone but marketing types...


	ken seefried iii	...!{akgua, allegra, amd, harpo, hplabs, 
	ken@gatech.edu		inhp4, masscomp, rlgvax, sb1, uf-cgrl, 
	ccastks@gitvm1.bitnet	unmvax, ut-ngp, ut-sally}!gatech!ken

	soon to be open: ...!gatech!spooge!ken (finally ;'})

gillies@p.cs.uiuc.edu (07/28/88)

I have a hard time believing people who claim that assembly is more
modifiable than a high level language.  Consider:

(a) Changing an *int* declaration to a *long int* declaration.

(b) Changing a *float* variable to a *double* variable.  Changing an
*int* to a *float*.

(c) Extensively revising a struct {} data type to add and remove
fields, then recompiling.

(d) Increasing the local storage of a subroutine from 5 words (smaller
than the register set) to 50 words (probably larger than the register
set).

(e) Changing a subroutine interface specification (adding & removing
arguments) & making sure all the calls are updated appropriately.

With a high level language (C/Modula-2), these changes are instant
and trivial.  In assembly, even if you plan ahead to be "tricky" with
macros, etc., the changes are nontrivial.
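
To make (c) concrete -- offsets invented, 4-byte ints and longs assumed:

        /* In C, one edit and a recompile; the compiler recomputes
         * every offset and sizeof for you. */
        struct widget {
                int     id;
                long    serial;         /* newly inserted field */
                char    name[16];
        };

        /* The usual assembly equivalent is a hand-kept offset table
         * (shown here as #defines).  Inserting `serial' means fixing
         * every equate after it by hand, and a missed fix is silent
         * corruption, not a compile-time error. */
        #define W_ID            0
        #define W_SERIAL        4       /* new */
        #define W_NAME          8       /* was 4 */
        #define W_SIZE          24      /* was 20 */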

Let's just say "It is possible to make changes to assembly language
programs", but I don't want to hear about how easy it is.


Don Gillies, Dept. of Computer Science, University of Illinois
1304 W. Springfield, Urbana, Ill 61801      
ARPA: gillies@cs.uiuc.edu   UUCP: {uunet,ihnp4,harvard}!uiucdcs!gillies

henry@utzoo.uucp (Henry Spencer) (07/29/88)

In article <1086@garth.UUCP> smryan@garth.UUCP (Steven Ryan) writes:
>Assembly is so difficult that nobody deludes himself about the difficulty.

Ho ho ho ha ha ha HO HO HO HA HA HA!

If only it were so. :-(
-- 
MSDOS is not dead, it just     |     Henry Spencer at U of Toronto Zoology
smells that way.               | uunet!mnetor!utzoo!henry henry@zoo.toronto.edu

chris@mimsy.UUCP (Chris Torek) (07/29/88)

In article <76700041@p.cs.uiuc.edu> gillies@p.cs.uiuc.edu writes:
>I have a hard time believing people who claim that assembly is more
>modifiable than a high level language.

I think they mean that the assemblER (the thing that reads the program
source and generates the program object) is more flexible/modifiable
than a higher-level language compiler.  This I find easy to believe.

Before I had decent HLL compilers available to me (when I was
programming on a TRS-80 model I with bits of homebrew hardware and
software: now *that* is primitive :-) ) I did some fairly extensive
Z80 assembly programming.  The only way I could keep track of what I
was doing was to stick to a particular discipline.  Within that, I
was at least as restricted as by a compiler.

Given that (voluntary) set of restrictions, none of the following
were particularly difficult:

>(c) Extensively revising a struct {} data type to add and remove
>fields, then recompiling.

(Done by using various label conventions.  It would have been nice had
I had something to do this for me.)

>(d) Increasing the local storage of a subroutine from 5 words (smaller
>than the register set) to 50 words (probably larger than the register
>set).

(Use the stack.  Painful; stack-relative references are *hard* on a Z80.
The typical trick is to use IX or IY; this is slow.)

>(e) Changing a subroutine interface specification (adding & removing
>arguments) & making sure all the calls are updated appropriately.

(A matter of checking all the calls.  Typically pointers were in HL and
integers in BC, or [if small] in B, C, or A.  If many arguments, some
went on the stack, or in a `structure' with a descriptor in HL.)

>With a high level language (C/Modula-2), these changes are instant
>and trivial.

Not all of them, perhaps, but the point is that the HLL helps me to do
this: it makes it much easier, and that is why I use(d) it.  (The
TRS-80 has been off for years, now; it has a bad 4116 somewhere in the
expansion interface, and I have not the inclination to hunt it down,
nor do I have any idea where, or even whether, these 16K chips are
still available.)
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

albaugh@dms.UUCP (Mike Albaugh) (07/29/88)

Piggy-backing on article <12729@mimsy.UUCP>, by chris@mimsy.UUCP (Chris Torek):

	This is my first attempt to post anything in news, so please bear
with me. Apologies also to Chris Torek for doing it by replying to his message.
It seemed an easy way to get into an on-going discussion. I have two comments
and a question.

	First, in re: assembly. While I thoroughly concur that it should
be used sparingly, I cannot condone an outright prohibition. I'd rather
have to maintain an average C program than an average assembly program, but
I'd rather wrestle a rabid weasel than attempt to modify the sort of C that
all too often results when someone dogmatically refuses to 'descend' to
assembly, and instead pulls every trick the last version of his compiler let
him get away with. If you're so sure you know the right five assembly
instructions, just use them. Don't write ten lines of totally opaque C that
will break with the next release of the compiler (see below).
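
An invented but representative specimen (nobody's real code; 32-bit
longs assumed):

        /* "Clever" C standing in for a few assembly instructions:
         * pull the exponent field out of an IEEE single-precision
         * float by punning through a pointer cast -- exactly the
         * kind of thing a smarter optimizer in the next release
         * may legitimately break. */
        int
        float_exponent(fp)
                float *fp;
        {
                return (int)((*(long *)fp >> 23) & 0xff);
        }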

	Second, in re: hardware Blitters. I find it at least mildly amusing
that in a group where I might expect to see some discussion of algorithms to
run conventional (SISD) code on multiple processors and/or functional units,
someone is seriously proposing that it is a BAD IDEA to take an explicit,
programmer-given, clean partitioning of the task and throw it away. Next I
suppose I'll hear that "Real men don't use disk controllers, they bit-boff the
read/write heads" :-)

	Last, the question. In the last two years I have increasingly run
across cases where new releases of the compiler have introduced bugs in my code.
I'm not talking about tighter type checking catching bogus constructs, I'm
talking about pushing the wrong length of data if a parameter is an expression
containing (0+fred) where fred is a short and the 0 resulted from folding some
#define's. Or would you believe forgetting how to multiply or divide (this seems
especially difficult for C compilers for the 68000). Anyway, what is the
orthodox way to deal with the fact that your 50,000 lines of debugged, tested
code become 50,000 lines of un-debugged, un-tested code every time a new
release of the compiler comes out? I know about regression testing, but the last
time it ONLY munged the multiply that was concerned with the average up-time
of a machine, so I didn't hear about it until a machine in the field had been
running continuously for over 32767 minutes. Any suggestions?
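
For concreteness, a boiled-down reconstruction of the sort of call I
mean (names invented):

        #include <stdio.h>

        #define BASE    0               /* result of folding other #defines */

        short fred = 7;

        void
        log_value(v)
                int v;                  /* (0+fred) promotes to int, so the
                                         * argument should be passed as one;
                                         * the buggy compiler pushed the
                                         * wrong length */
        {
                printf("%d\n", v);
        }

        int
        main()
        {
                log_value(BASE + fred); /* expands to (0+fred) */
                return 0;
        }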

| Mike Albaugh (weitek!dms!albaugh) voice: (408)434-1709
| Atari Games Corp (Arcade Games, no relation to the makers of the ST)
| 675 Sycamore Dr. Milpitas, CA 95035

chris@mimsy.UUCP (Chris Torek) (07/30/88)

[I have redirected this to comp.misc, as perhaps I should have done earlier]

In article <517@dms.UUCP> albaugh@dms.UUCP (Mike Albaugh) writes:
>	First, in re: assembly. While I thoroughly concur that it should
>be used sparingly, I cannot condone an outright prohibition.

Certainly---for instance, having /sys/vax/locore.s written in assembly,
rather than in `asm' statements with intermixed C code, is clearly
proper.  Things such as

	#define VECTOR(x) (((void (**)())VEC_BASE)[x])
	...
		void my_keyintr(), (*old_keyintr)();
		ipl_t ipl;

                ipl = iplraise(IPL_KEYINTR);    /* block keyboard interrupts */
                old_keyintr = VECTOR(17);       /* save the old vector... */
                VECTOR(17) = my_keyintr;        /* ...and install ours */
                setipl(ipl);                    /* restore the old priority */

are somewhat `iffy', although I prefer the C code in this particular
case.  Stuff like

	f()
	{
                long *savedstack = &savedstack - 7;     /* frame-layout magic */
		...

is, for me, on the other side of that line.

>... Anyway, what is the orthodox way to deal with the fact that your
>50,000 lines of debugged, tested code become 50,000 lines of
>un-debugged, un-tested code every time a new release of the compiler
>comes out? I know about regression testing, but the last time it ONLY
>munged the multiply that was concerned with the average up-time of a
>machine, so I didn't hear about it until a machine in the field had
>been running continuously for over 32767 minutes. Any suggestions?

I doubt you will find any infallible method.  The obvious answer is
a more strict regression test: add such multiplication to your test
suite.  Indeed, add anything that has ever triggered a compiler bug
to your test sets.  (This can quickly become unwieldy :-) )
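
For instance, the up-time multiply might have been caught by a probe
along these lines (invented, not from any real suite):

        /* regression probe: a 16x16 multiply widened to long */
        #include <stdio.h>

        int
        main()
        {
                short a = 123, b = 45;
                long prod;

                prod = (long)a * b;
                if (prod != 5535L) {
                        printf("16x16 multiply miscompiled: got %ld\n", prod);
                        return 1;
                }
                return 0;
        }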

Knuth likes the `exercise every line of code' test: instrument the
compiler, and arrange to have everything in it run at least once
(including, of course, the error recovery code).  This usually requires
external hooks, and it helps immensely to have a compiler that does the
instrumentation automatically.
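
Lacking such a compiler, the hooks can be as crude as a macro scattered
through the source by hand; a sketch (all names invented):

        /* hand-rolled coverage hooks: drop one COVER(n) per basic
         * block, run the test suite, then ask what never ran */
        #include <stdio.h>

        #define NPOINTS 512
        unsigned char covered[NPOINTS];

        #define COVER(n)        (covered[n] = 1)

        void
        report_coverage()
        {
                int i;

                for (i = 0; i < NPOINTS; i++)
                        if (!covered[i])
                                printf("point %d never executed\n", i);
        }
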
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

rick@pcrat.UUCP (Rick Richardson) (07/31/88)

In article <517@dms.UUCP> albaugh@dms.UUCP (Mike Albaugh) writes:
>especially difficult for C compilers for the 68000). Anyway, what is the
>orthodox way to deal with the fact that your 50,000 lines of debugged, tested
>code become 50,000 lines of un-debugged, un-tested code every time a new
>release of the compiler comes out? I know about regression testing, but the last
>time it ONLY munged the multiply that was concerned with the average up-time
>of a machine, so I didn't hear about it until a machine in the field had been
>running continuously for over 32767 minutes. Any suggestions?

Never, ever, switch tools on production quality software unless you have
to.  If you absolutely have to change tools, treat the software as you
would the first time it is sent to system test -- take nothing for granted.

-- 
		Rick Richardson, PC Research, Inc.

(201) 542-3734 (voice, nights)   OR     (201) 389-8963 (voice, days)
uunet!pcrat!rick (UUCP)			rick%pcrat.uucp@uunet.uu.net (INTERNET)

albaugh@dms.UUCP (Mike Albaugh) (08/02/88)

In article <12756@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>a more strict regression test: add such multiplication to your test
>suite.  Indeed, add anything that has ever triggered a compiler bug
>to your test sets.  (This can quickly become unwieldy :-) )

No kidding? The point of capping "ONLY" was that this was the only 16x16
multiply in the whole set of 6 modules that was hit. In a friend's case, only
one of 20 32x32 multiplies was bungled. These bugs seem very sensitive to
context, moving from one expression to another in a routine when, for example:
 fred() { int a; short b; ...} is changed to:
 fred() { short b; int a; ...} WHETHER or NOT a and b are involved in the
eventual "victim" multiply. Also, the 32x32 bug which got my friend was not
one of the classic "corner" cases. Should I check all possible 32x32 multiplies
across all possible permutations of variable declarations? Perhaps you were
under the impression that I have the sources to the compiler. I wish that
were the case, but I don't.

From article <546@pcrat.UUCP>, by rick@pcrat.UUCP (Rick Richardson):
> Never, ever, switch tools on production quality software unless you have
> to.  If you absolutely have to change tools, treat the software as you
> would the first time it is sent to system test -- take nothing for granted.

I am a lowly programmer :-).  People like system administrators decide when
to change releases of the compiler. They do so based on considerations like
how much disk space they can spare to hold old versions and how many other
people are screaming for the latest release to fix the OLD bugs that broke
THEIR code. They also have to deal with vendors who refuse to even listen
to bug-reports on any but the latest version.

	One of my responsibilities here is to maintain/extend a set of
tools and utility subroutines for use in embedded systems. While we no
longer manufacture the games for which they were originally written, there
is enough commonality to warrant re-use of code (I thought this was an
acceptable practice, using proven code rather than re-inventing the wheel :-)
But the release of the compiler that compiled the original code (circa 1982)
won't even run on the latest release of VMS (which is where I work).

> would the first time it is sent to system test -- take nothing for granted.
>                                                   ^^^^^^^^^^^^^^^^^^^^^^^^
	If you mean that literally, there is no point in using even an
assembler, let alone a compiler. And even if I coded in hex, the loader
could always "get" me :-) I have had cases where the inclusion of an
fprintf(stderr... was enough to get rid of the bug, but if the
fprintf was removed, #if'd or even if ( debugsw == 1) ..., the bug
would re-appear. Is Heisenberg writing compilers these days :-)

	One of the things that worries me most about Ada is that so much
of the code in many "mission critical" systems will be in the form of
"packages" which, like my compiler, are "guaranteed" but not testable.
So we'll know who to sue for the next shuttle explosion or WWIII :-)


| Mike Albaugh (weitek!dms!albaugh) voice: (408)434-1709
| Atari Games Corp (Arcade Games, no relation to the makers of the ST)
| 675 Sycamore Dr. Milpitas, CA 95035
| The opinions expressed are my own (My boss is on vacation)

hjm@cernvax.UUCP (hjm) (08/03/88)

In article <519@dms.UUCP> albaugh@dms.UUCP (Mike Albaugh) writes:
> Is Heisenberg writing compilers these days :-)
>
Answer: probably! :-)

	Hubert Matthews

zs01+@andrew.cmu.edu (Zalman Stern) (08/15/88)

> *Excerpts from ext.nn.comp.arch: 27-Jul-88 Re: using assembler Ken Seefried*
> *iii@gatech. (2803)*

> And C is a cakewalk?  Fine, I'll tell you what...let's get together some
> time and program, say, a MIPS machine, you in asm, and me in C.
> I'll even tie one hand behind my back and program in FORTRAN.  And
> we'll see how far each gets (the MIPS is a RISC machine.  Even trivial
> operations require non-trivial amounts of asm code).

I am not sure which instructions you are referring to as requiring non-trivial
amounts of code to synthesize. Are they things I am likely to use quite often?

Here are some reasons why the MIPS R2000/R3000 should be quite reasonable for
assembly language programming:

Simplicity:

    3 hours with "MIPS R2000 RISC Architecture" by Gerry Kane and I understood
    (or "groked" if you prefer) the R2000 and R2010 (floating point unit).
    There were none of those "What would I use that instruction for?" type
    questions going through my mind.
Orthogonal register set:

    One register is pretty much as good as another. None of this "Where am I
    going to put the CX register so I can do a shift" crud you run into on the
    earlier Intel beasties.
Abundance of registers:

    You get ten or twelve (depending on whether or not you count v0 and v1)
    unsaved registers to use as temporaries. (Actually, you can add in four to
    that for the argument passing registers a0-a3.)
Arguments passed in registers:

    Many routines will not need to allocate a stack frame at all. This frees
    you from having to deal with the calling convention a lot of the time.
Single cycle instructions:

    You don't have to have an instruction timing table handy to write efficient
    code. Almost every instruction takes one cycle. The only exceptions I know
    of are multiply/divide, loads/stores, and branches. (And of course floating
    point.)
Intelligent assembler:

    The assembler removes the burden of scheduling delay slots from the programmer.
    The assembler can also synthesize addressing modes for the programmer.
Of course I don't write entire programs in assembly. (For many reasons, most of
which can be summed up by saying "Assembly language is just the wrong level of
abstraction.") I occasionally find it necessary to write a routine or two in
assembly either because high level languages can't do what I need, or because I
need extreme speed. Examples of where this has come up in practice are dynamic
loading and DES encryption.

We have a dynamic loading system which uses a "link snapping" mechanism. This
means that when you call a routine that hasn't been loaded yet, you wind up in
some trampoline code that loads the routine, fixes the original reference to
the routine to point to the newly loaded code, and finally jumps to the new
routine. Since there is no way to jump to a routine in C, this trampoline code
must be written in assembly.
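
In outline (a toy sketch: the final transfer is faked as an ordinary C
call, and all names are invented):

        #include <stdio.h>

        typedef int (*func_t)();

        static int
        the_routine(x)                  /* stands in for the loaded code */
                int x;
        {
                return x * 2;
        }

        static func_t
        load_routine()                  /* toy stand-in for the loader */
        {
                printf("loading...\n");
                return the_routine;
        }

        static int trampoline();
        static func_t slot = trampoline;        /* the snappable reference */

        static int
        trampoline(x)
                int x;
        {
                slot = load_routine();  /* fix the reference... */
                return slot(x);         /* ...and transfer; forwarding
                                         * arbitrary arguments like this is
                                         * what C can't do in general, hence
                                         * the assembly */
        }

        int
        main()
        {
                printf("%d\n", slot(21));       /* snaps the link, prints 42 */
                printf("%d\n", slot(21));       /* goes direct now */
                return 0;
        }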

In the DES case, assembly can win big because DES is essentially a bunch of bit
manipulations on a small block of data (64 bits if I remember correctly.) In
assembly, the entire block of data can be loaded into the register file and
manipulated. The lack of loads and stores during the manipulation makes the
encryption run much faster. (I have yet to run into a C compiler that is tense
enough to do this. Maybe someday, one will exist.) Most people have decided
that the portability loss of assembly is not worth the speed gain for DES code.
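
For flavor, one of the standard bit-swap steps sketched in C (not from
any particular implementation; the block lives in two 32-bit halves):

        void
        swap_step(block)
                unsigned long block[2];         /* left and right halves */
        {
                register unsigned long left = block[0], right = block[1];
                register unsigned long temp;

                /* exchange bits between the halves through a mask;
                 * left, right, and temp can all stay in registers */
                temp = ((left >> 4) ^ right) & 0x0f0f0f0fL;
                right ^= temp;
                left ^= temp << 4;

                block[0] = left;
                block[1] = right;
        }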

I have never actually programmed on the MIPS machine. I have however written
assembly code for the IBM RT which has some of the same features. (Notably
passing arguments in registers.) I have had a much easier time on the RT than
on either the VAX, the 68000, or the 8086. (Granted the 68020 and the 80386 fix
a few of my complaints with these processor families.)

In short, a processor's machine language ought to be simple, regular, and damn
fast.

Sincerely,
Zalman Stern
Internet: zs01+@andrew.cmu.edu     Usenet: I'm soooo confused...
Information Technology Center, Carnegie Mellon, Pittsburgh, PA 15213-3890