[net.arch] compiler optimization vs. microcode optimization

johnl@ima.UUCP (John R. Levine) (07/02/86)

In article <1739@milano.UUCP> janssen@milano.UUCP writes:
>I was talking to some VMS FORTRAN developers at DECUS last year, and
>they were griping a bit about having to redo the code generation for
>the 8600 series VAX.  It seems that the microcode was re-written, and
>now the fastest code sequences for something like DO-loops were
>actually in terms of the "high-level" machine instructions, whereas
>before the FORTRAN compiler avoided those instructions because it was
>faster to do the operation in terms of a number of simpler
>instructions.

This sort of thing has been happening forever.  Back in 1969, when I was
using a 360/91, there was a particularly egregious problem in that the '91
didn't implement the 360's decimal instructions in hardware, and the software
simulation was very slow.  This meant that some otherwise plausible code from
PL/I programs, which used decimal instructions in I/O formatting (or something
like that -- it's been a while) ground to a halt.  On the 91, the fastest way
to do a block move was, as I recall, with a series of double precision floating
point loads and stores rather than with the machine's block-move instruction.  And the load multiple
instruction, which loads registers from memory, was only faster than a series
of regular load instructions if you were loading at least 4 registers.  You
get the idea.

More recently, the timings among the Intel 8086, 186, and 286 vary quite a lot.
For example, on an 8086 it's almost always faster to simulate a multiplication
by a constant by a series of shifts and adds, while on the 286 the multiply
wins.

I suppose this means that you're better off optimizing for a RISC since you
can hope that there's no level of confusing microcode between you and the
hardware, but then you start thinking about levels of pipelining and cache
architectures (the 91 ran loops much faster if they were less than 64 bytes
long and were word aligned) and I get confused again.  A while ago, some
friends of mine at Yale were thinking about a compiler which would give you
both the object code for your program and an optimized machine architecture
to run it on.  Maybe they had the right idea after all.
-- 
John R. Levine, Javelin Software Corp., Cambridge MA +1 617 494 1400
{ ihnp4 | decvax | cbosgd | harvard | yale }!ima!johnl, Levine@YALE.EDU
The opinions expressed herein are solely those of a 12-year-old hacker
who has broken into my account and not those of any person or organization.