[comp.arch] Compilers and RISC

Moss@cs.umass.edu (02/10/90)

In article <19233@dartvax.Dartmouth.EDU> jskuskin@eleazar.dartmouth.edu (Jeffrey Kuskin) writes:

Yes, but how much do we benefit from the richer instruction sets, even
if all the instructions are hardwired and execute at 1 cycle/instruction?
Isn't one of the RISC folks' main arguments for simple instruction sets
that current compilers don't effectively exploit the complex addressing
modes and instructions supported in CISC chips?  Perhaps someone would
like to speculate on what progress the next decade will bring in
compiler technology...

To which I respond .... 

While I will not claim to be a true expert in these matters, the effect of
RISC on compilers appears to be more subtle than that. A simpler instruction
set makes the task of *instruction selection* easier, but simpler instructions
are not the *only* RISC tenet. A load/store architecture with somewhat more
registers than most CISCs is another typical RISC attribute, and dealing with
that aspect effectively requires really good register allocation -- perhaps
better than you need for a CISC. One actually needs inter-procedural as
opposed to merely "global" (all of a procedure; local = within a basic block)
optimizations to exploit RISC fully. My impression is that RISC chips have
mostly shifted the burden but have not really simplified the job of writing a
compiler. You should also take into account the additional complexities of
dealing with instruction scheduling (results of a load are generally not
available immediately, etc.) and delay slot filling. I do see one *possible*
advantage accruing from all of this, though, which is that instruction
scheduling, delay slot filling, and register allocation may be somewhat more
machine independent in concept and more amenable to being table driven across
architectures than instruction selection generally proves to be. Another
interesting point is that there are now fast optimal techniques for
instruction selection within basic blocks (under some moderately reasonable
conditions) -- see the recent article in TOPLAS (Ganapathi is one of the
authors if I recall correctly) on optimal code generation from trees.
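
To make the delay slot point concrete, here is a minimal sketch (in C,
with an instruction representation invented purely for illustration) of
filling a one-instruction load delay slot within a basic block.  A real
scheduler works over a dependence DAG and treats memory ordering more
carefully than this does.

    #include <stdio.h>
    #include <string.h>

    struct insn {
        const char *text;   /* printable form                */
        int def;            /* register written, -1 if none  */
        int use1, use2;     /* registers read, -1 if unused  */
        int is_load;
    };

    static int uses(struct insn *i, int r)
    {
        return r >= 0 && (i->use1 == r || i->use2 == r);
    }

    /* true when a and b touch no common register and are not both
       memory operations, so one may be moved past the other */
    static int independent(struct insn *a, struct insn *b)
    {
        return !uses(b, a->def) && !uses(a, b->def) &&
               (a->def < 0 || a->def != b->def) &&
               !(a->is_load && b->is_load);
    }

    /* When insn i+1 consumes the result of load i, search the rest
       of the block for an instruction to hoist into the slot. */
    static void fill_delay_slots(struct insn *b, int n)
    {
        int i, j, k;
        struct insn t;

        for (i = 0; i + 1 < n; i++) {
            if (!b[i].is_load || !uses(&b[i + 1], b[i].def))
                continue;                 /* no stall after this load */
            for (j = i + 2; j < n; j++) {
                for (k = i + 1; k < j; k++)
                    if (!independent(&b[k], &b[j]))
                        break;
                if (k < j)
                    continue;             /* b[j] cannot be moved up */
                t = b[j];
                memmove(&b[i + 2], &b[i + 1], (j - i - 1) * sizeof t);
                b[i + 1] = t;             /* b[j] now hides the load */
                break;
            }
        }
    }

    int main(void)
    {
        static struct insn b[] = {
            { "load  r1, 0(r9)",  1, 9, -1, 1 },
            { "add   r2, r1, r1", 2, 1,  1, 0 },
            { "add   r3, r4, r5", 3, 4,  5, 0 },
        };
        int i;

        fill_delay_slots(b, 3);
        for (i = 0; i < 3; i++)
            puts(b[i].text);
        return 0;
    }

On this input the independent add is hoisted between the load and its
consumer, which is exactly the transformation the hardware's load delay
demands of the compiler.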

[Note: this discussion originated in comp.arch, but I am following up there as
well as comp.compilers.]
--
		J. Eliot B. Moss, Assistant Professor
		Department of Computer and Information Science
		Lederle Graduate Research Center
		University of Massachusetts
		Amherst, MA  01003
		(413) 545-4206; Moss@cs.umass.edu
-- 
Send compilers articles to compilers@esegue.segue.boston.ma.us
{spdcc | ima | lotus}!esegue.  Meta-mail to compilers-request@esegue.
Please send responses to the author of the message, not the poster.

dgb@cs.washington.edu (David Bradlee) (02/11/90)

In article <1990Feb9.161153.4190@esegue.segue.boston.ma.us>, Moss@cs.umass.edu writes:
> My impression is that RISC chips have mostly shifted the burden but have
> not really simplified the job of writing a compiler. You should also take
> into account the additional complexities of dealing with instruction
> scheduling and delay slot filling. 

I agree completely that RISCs have shifted the compiler burden, especially
considering machines with multiple functional units, etc.  Then, there's
the Intel i860, which gives compilers lots of opportunities for work.

> I do see one *possible* advantage accruing from all of this, though, which
> is that instruction scheduling, delay slot filling, and register
> allocation may be somewhat more machine independent in concept and more
> amenable to being table driven across architectures than instruction
> selection generally proves to be. 

This is probably true for register allocation by itself, given that RISCs
typically have general-purpose register sets, but it remains to be seen
for instruction scheduling and for the combination of the two.  Part of
the problem is the specification of the information needed for scheduling
(especially for something like the i860).  Then there's the question of
whether a particular scheduling technique will be good enough for all your
desired targets.  Perhaps not.  

The real point here is that RISCs have changed compiler needs, but there
are still plenty of needs.
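
To give a feel for what "specification of the information needed for
scheduling" means at its simplest, here is a sketch (in C) of the kind
of per-target table a retargetable scheduler might consult.  The
opcodes, units, and latencies are invented placeholders, not a real
i860 or 88000 description, which would also have to express dual-issue
constraints, pipelining, and result buses.

    #include <stdio.h>

    enum unit { U_ALU, U_LOAD, U_FADD, U_FMUL };

    struct op_info {
        const char *name;
        enum unit unit;     /* functional unit the op occupies      */
        int latency;        /* cycles until the result may be used  */
    };

    /* one such table per target */
    static const struct op_info target[] = {
        { "add",  U_ALU,  1 },
        { "load", U_LOAD, 2 },
        { "fadd", U_FADD, 3 },
        { "fmul", U_FMUL, 4 },
    };

    int main(void)
    {
        int i, n = sizeof target / sizeof target[0];

        for (i = 0; i < n; i++)
            printf("%-4s  unit %d  result ready %d cycle(s) after issue\n",
                   target[i].name, (int)target[i].unit, target[i].latency);
        return 0;
    }

The open question is whether one scheduling algorithm, driven only by
tables like this, can do well on targets as different as the 88000 and
the i860.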

	Dave Bradlee
	University of Washington
	dgb@cs.washington.edu

[Keep in mind that the IBM 801 project, the original RISC work, closely
involved John Cocke, Fran Allen, and other compiler experts.  The PL.8
compiler that was part of that effort is still a serious contender for
world's best optimizing compiler.  It has been retargeted for lots of
different machines, evidently without a whole lot of work.  The Berkeley
project, as far as I can tell, involved no compiler people at all, which
appears to me to be the reason that they invented register windows, being
unaware of how good a job of register management a compiler can do.  -John]

pardo@june.cs.washington.edu (David Keppel) (02/12/90)

In article <1990Feb11.040548.223@esegue.segue.boston.ma.us> the
compilers moderator (John Levine) writes:
>  The Berkeley
>project as far as I can tell involved no compiler people at all, which
>appears to me to be the reason that they invented register windows, being
>unaware of how good a job of register management a compiler can do.  -John]

Somebody -- probably Wirth -- has commented that a huge portion of a
compiler's size and a huge number of its bugs come from the optimizer.
Although I doubt the Berkeley people were making the tradeoff
consciously, I consider a *>>SIMPLE<<* hardware scheme a Good Thing if
it can replace a complicated software scheme and deliver nearly the
same performance.

Note that this is NOT the RISC argument.  This is the CISC argument,
except that I am arguing that it is OK for the performance to get
WORSE when you use hardware.  It's a provocative position and I won't
try to defend register windows.
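
For concreteness, the mechanism in question: each call slides a pointer
through a large register file so that the caller's "out" registers
become the callee's "in" registers, and most calls touch no memory at
all.  A toy simulation (sizes invented; real windowed machines also
have local and global registers, and trap to spill when the windows
wrap around):

    #include <stdio.h>

    #define WINDOWS 8
    #define OVERLAP 8   /* one window's outs are the next one's ins */

    static int regfile[(WINDOWS + 1) * OVERLAP];
    static int cwp = 0;         /* current window pointer */

    static int *in_reg(int i)  { return &regfile[cwp * OVERLAP + i]; }
    static int *out_reg(int i) { return &regfile[(cwp + 1) * OVERLAP + i]; }

    int main(void)
    {
        *out_reg(0) = 42;   /* caller writes an argument          */
        cwp++;              /* "call" just slides the window...   */
        printf("callee sees %d in its in register 0\n", *in_reg(0));
        cwp--;              /* ...and "return" slides it back     */
        return 0;
    }

The hardware is simple precisely because it does in one increment what
a compiler would otherwise do with save/restore code and careful
register allocation.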

	;-D on  ( A trend to go in circles )  Pardo
-- 
		    pardo@cs.washington.edu
    {rutgers,cornell,ucsd,ubc-cs,tektronix}!uw-beaver!june!pardo
[I always thought that the Intel 432 was an extreme example of why helpful
hardware can be a bad idea.  Bug-wise, consider the size of the errata list in
any modern CISC chip, and that according to a recent comp.arch posting there
are no known bugs in the current Moto 88000 chips. -John]

colwell@multiflow.com (Robert Colwell) (02/13/90)

In article <1990Feb11.221418.2634@esegue.segue.boston.ma.us> pardo@june.cs.washington.edu (David Keppel) writes:
>Somebody -- probably Wirth -- has commented that a huge portion of a
>compiler's size and a huge number of its bugs come from the optimizer.
>Although I doubt the Berkeley people were making the tradeoff
>consciously, I consider a *>>SIMPLE<<* hardware scheme a Good Thing if
>it can replace a complicated software scheme and deliver nearly the
>same performance.

This hasn't been my experience at all.  What I've found is that bad
compilers have lots of bugs, and good ones don't.  Good ones are those
written by good people who know what they're doing.  All else being 
equal, simple things have fewer bugs than more complicated things, 
but as usual, all else is NOT equal.  Good compiler writers won't 
settle for mediocre performance, and they won't shy away from complex
algorithms in the process.  But this doesn't mean they'll end up with
a buggier compiler; it means they have to earn their pay.  So what?
Hardware folks have been doing this for decades.

>[I always thought that the Intel 432 was an extreme example of why helpful
>hardware can be a bad idea.  Bug-wise, consider the size of the errata list in
>any modern CISC chip, and that according to a recent comp.arch posting there
>are no known bugs in the current Moto 88000 chips. -John]

???Why is the 432 an "extreme example of why helpful hardware can be a
bad idea"?  And I have a problem accepting your "Bug-wise" sentence, too.
If you really want to compare errata sheets of RISC and CISC processors
and then draw some conclusion (an exercise I don't necessarily find
meaningful), then you need to compare how long it takes each to get to
zero errors.  You can't just take a snapshot comparison and conclude
anything.  How many mask revs did it take Moto to get the 88K to a
clean sheet?  Even if we knew that, and we knew how long the (say) 
68040 will take, it's hard to extrapolate from two data points.

It would be fun data to kick around, though, if the micro makers
would cough up the info.  Tell 'em it's for usenet, they'll understand. :-)

Bob Colwell               ..!uunet!mfci!colwell
Multiflow Computer     or colwell@multiflow.com
31 Business Park Dr.
Branford, CT 06405     203-488-6090

amos@nsc.nsc.com (Amos Shapir) (02/13/90)

In article <1990Feb12.180813.6585@esegue.segue.boston.ma.us> colwell@multiflow.com (Robert Colwell) writes:
|In article <1990Feb11.221418.2634@esegue.segue.boston.ma.us> pardo@june.cs.washington.edu (David Keppel) writes:
|
|>[I always thought that the Intel 432 was an extreme example of why helpful
|>hardware can be a bad idea.  Bug-wise, consider the size of the errata list in
|>any modern CISC chip, and that according to a recent comp.arch posting there
|>are no known bugs in the current Moto 88000 chips. -John]
|
|You can't just take a snapshot comparison and conclude
|anything.  How many mask revs did it take Moto to get the 88K to a
|clean sheet?  Even if we knew that, and we knew how long the (say) 
|68040 will take, it's hard to extrapolate from two data points.

There's another factor: the design process has been largely automated
in recent years, and newer designs contain far fewer bugs than earlier
ones.  I don't know about the 88K and 68K, but the NS32532 reached
maturity in 3 revisions, while the NS32016, which is much less
complicated, took about 20 -- and that was just a few years ago.

-- 
	Amos Shapir
National Semiconductor, 2900 Semiconductor Dr.
Santa Clara, CA 95052-8090  Mailstop E-280
amos@nsc.nsc.com or amos@taux01.nsc.com 

dgb@cs.washington.edu (David Bradlee) (02/14/90)

> [Keep in mind that the IBM 801 project, the original RISC work, closely
> involved John Cocke, Fran Allen, and other compiler experts.  The PL.8
> compiler that was part of that effort is still a serious contender for
> world's best optimizing compiler.  ...

The PL.8 compiler project was certainly a valuable effort.  Its
register allocation strategy, in particular, has been widely used in
various forms.  However, very little has been published concerning
instruction scheduling issues in the PL.8 project.  If anyone knows of
papers discussing scheduling issues for a PL.8 target with multiple
functional units and floating point (e.g. the Motorola 88000), I would
certainly be interested in hearing about it.
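
For readers who haven't seen it, PL.8 modeled register allocation as
coloring an interference graph (Chaitin and colleagues).  A minimal
sketch of just the coloring step, greedy in numbering order rather
than Chaitin's simplify-and-select order, over a graph invented for
illustration:

    #include <stdio.h>

    #define N 4   /* virtual registers v0..v3 */
    #define K 2   /* machine registers r0..r1 */

    /* interfere[a][b] is nonzero when va and vb are live at the same
       time and so must be assigned different machine registers */
    static const int interfere[N][N] = {
        { 0, 1, 1, 0 },
        { 1, 0, 0, 1 },
        { 1, 0, 0, 0 },
        { 0, 1, 0, 0 },
    };

    int main(void)
    {
        int color[N];
        int v, u, c, used;

        for (v = 0; v < N; v++) {
            used = 0;                   /* bitmask of taken colors */
            for (u = 0; u < v; u++)
                if (interfere[v][u])
                    used |= 1 << color[u];
            color[v] = -1;
            for (c = 0; c < K; c++)
                if (!(used & (1 << c))) {
                    color[v] = c;
                    break;
                }
            if (color[v] < 0)
                printf("v%d: spill to memory\n", v);
            else
                printf("v%d -> r%d\n", v, color[v]);
        }
        return 0;
    }

The scheduling question is harder because coloring says nothing about
*when* values are live; reordering instructions changes the
interference graph under the allocator's feet, which is one reason
combining the two phases is still open.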

	Dave Bradlee
	Department of Computer Science and Engineering, FR-35
	University of Washington
	Seattle, WA  98195
	dgb@cs.washington.edu