[comp.lang.c] inline assembly?

chris@mimsy.UUCP (Chris Torek) (03/09/88)

(Or, Here We Go Again)

In article <703@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes:
>Suppose that you have a situation in a program where the implementation of
>your compiler has done something bad, like use 10 instructions and 6 memory
>references where 2 instructions and no memory references can do the job.
>The overhead of a subroutine call is likely to be more than what can be
>saved; a goto construction may not work if part of the problem is that the
>compiler will mess up register use (very common).

Then you do something very implementation dependent and fix the problem
however you have to fix it, and your program runs 10 times faster.  And
when you move it to a different machine your implementation-dependent
code no longer works, but that is all right because you were careful
about how you wrote it, and you fall back on the portable version that
in this case happens to generate nearly optimal code so that the only
thing you could do by fiddling with it is make it run 1.0001 times
faster.
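
A minimal sketch of that arrangement, assuming GCC-style inline assembly on x86 as the nonportable fast path. The function names and the choice of a 32-bit rotate are illustrative only and do not come from the thread. On any other compiler or machine the same call falls back to the portable C version.

```c
#include <stdint.h>

/* Portable version: works with any conforming C compiler. */
static uint32_t rotate_left32_portable(uint32_t x, unsigned n)
{
    n &= 31;                                  /* keep shift counts in range */
    return (x << n) | (x >> ((32 - n) & 31));
}

/* Same operation, with an implementation-dependent fast path that is
 * used only where it is known to work (GCC-compatible compiler, x86). */
static uint32_t rotate_left32(uint32_t x, unsigned n)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* x86 rotate instruction; the count is passed in CL. */
    __asm__("roll %%cl, %0" : "+r"(x) : "c"(n & 31) : "cc");
    return x;
#else
    return rotate_left32_portable(x, n);      /* everywhere else */
#endif
}
```

Callers use only rotate_left32(); whether it is worth keeping the fast path is a matter of measuring both versions on the machine in question.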

So whatever *is* the problem?  When you need to do something that is
not portable, because it is worth it, you do something nonportable.  By
definition, it is not portable, so why should you care in what way it
is not portable?  There is always *some* way; if necessary, you can
compile, link, disassemble, edit, assemble, and relink---all
automatically.  Such a way need only exist; it need not be portable,
because by definition it cannot be portable.

(The portable fix for the problem, of course, is to tweak the compiler
so that it generates nearly-optimal code.  This also has a much bigger
potential payoff.  Inline assembly is merely a band-aid.)
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

wesommer@athena.mit.edu (William E. Sommerfeld) (03/09/88)

In article <10573@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>(The portable fix for the problem, of course, is to tweak the compiler
>so that it generates nearly-optimal code.  This also has a much bigger
>potential payoff.  Inline assembly is merely a band-aid.)

I second that motion.

The original MIT/Athena implementation of DES (for our Kerberos
authentication system) was extensively hand-tweaked, with some
VAX-specific inline assembly, by Steve Miller.  [1]

I spent some time looking at what he did; the bulk of his tweaks
involved replacing `extzv' (a bitfield extract, generated because the
VAX doesn't believe in unsigned shifts) with `rotl' (rotate) when it
was preceded or followed by a `bicl' (bit clear) with a constant
mask; this produced about a 10-20% improvement on the MicroVAX II.
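
The reason the substitution is safe can be shown in C. This is only a stand-in for the VAX instructions, not Steve Miller's code or the actual GCC patterns; the function names are made up for illustration. Extracting a field by shifting right and masking gives the same result as rotating left by 32 minus the field position and then applying the same mask, because the mask throws away the bits that the rotate wrapped around.

```c
#include <stdint.h>

/* 32-bit rotate left, written portably. */
static uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31;
    return (x << n) | (x >> ((32 - n) & 31));
}

/* Field extract the obvious way: logical right shift, then mask.
 * Assumes 1 <= size <= 31 and pos + size <= 32. */
static uint32_t extract_shift(uint32_t x, unsigned pos, unsigned size)
{
    uint32_t mask = (UINT32_C(1) << size) - 1;
    return (x >> pos) & mask;
}

/* Same field via rotate plus mask: rotating left by (32 - pos) moves
 * bit `pos' down to bit 0, and the mask clears everything above the
 * field, including the bits that wrapped around. */
static uint32_t extract_rotate(uint32_t x, unsigned pos, unsigned size)
{
    uint32_t mask = (UINT32_C(1) << size) - 1;
    return rotl32(x, 32 - pos) & mask;
}
```

When the code already masks with a constant next to the extract, the rotate form costs no extra instruction, which is presumably where the saving came from.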

It was relatively trivial to add a pair of peephole optimization
patterns to Stallman's GNU C compiler to cause it to do just what
Steve Miller did.

\begin{gcc plug}
GCC seems to be extremely portable and very easy to tweak to produce
better code.  If a few more ports get done, the excuse "but the
compiler produces bad code" will probably not be acceptable any more.
\end{gcc plug}

					- Bill

[1] The assembly was put in `#ifdef VAXASM' so it was possible to
compile a version on the VAX which ran just the straight C code.  For
those people who get our DES library but don't have GCC, we'll include
the VAX assembler output of the version compiled with GCC.