[comp.arch] RISC vs CISC again

mash@mips.COM (John Mashey) (06/02/89)

In article <40718@bbn.COM> slackey@BBN.COM (Stan Lackey) writes:

>In order to go superscalar, future RISC chips will end up having new,
>incompatible architectures....
This is one opinion.....another opinion is NO.
I know for sure that at least 3 of the existing merchant RISCs have projects,
going for quite a while now, aimed at compatible next-generation
designs.  That's 3 out of [MIPS, SPARC, 88K, i860, i960, 29K, Clipper];
the number is probably more like 5, I just don't know for sure.  Also, I have
little visibility into a few of these, or into the proprietary ones like
HP PA or ROMP, so the number may be higher.
At least several of these groups expect to be able to do this; it's hard to say
for sure until the implementations start appearing.
At least one (and probably more) of the groups were thinking about
more aggressive pipelines when they designed the ISA of their first chips.

>I strongly suspect the RISC guys are, as we speak, analyzing stuff like:
Actually, such analysis has been going on for years, as can be seen even in
the publication record, and there has been plenty of discussion of superscalar
machines for longer than that in the architecture community.
....
>Let me guess what they're finding.  Correct me if I'm wrong.

> Load/store followed by increment/decr of the address is common.
> Load followed by dependent use of the data is common.
> Decrement of register, followed by test (compare or just zero), followed
>   by branch, is common.
> Addition to register followed by its use as an address.
> Bunch of loads in a row, bunch of stores in a row.
>[I could probably think of more with more time.]
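For what it's worth, here is a purely illustrative fragment (not taken from
any measured workload) that hits several of those pairs at once:

	/* illustrative only: a word-copy loop */
	void copywords(int *dst, int *src, int n)
	{
		while (n-- > 0)			/* decrement count, test, branch */
			*dst++ = *src++;	/* load + increment, store + increment */
	}
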
Some of these are indeed common [from having looked at lots of code, not from
having the numbers in front of me.]  On the other hand, so are lots of
pairs that correspond to no CISC instruction I've ever seen:
for example, code like:  if (*a++ == 200) ....
generates:
	lw	r1,0(a)		# load the word *a
	li	r2,200		# constant 200 into another register
	bne	r1,r2,notequal	# compare the two registers and branch
	add	a,4		# increment a, in the branch delay slot
and of course, if (*++a == 200) generates:
	lw	r1,4(a)		# load *(a+4): the pre-increment folded into the offset
	li	r2,200		# constant 200 into another register
	bne	r1,r2,notequal	# compare the two registers and branch
	add	a,4		# increment a, in the branch delay slot

1) the first pair looks like: load a word, and while doing it, put a
	constant in a different register
2) the second pair looks like:
	compare 2 registers and branch, and while doing it, increment
	a different register
Such sequences are examples of things seen in common code;
don't ask what the frequencies are (I don't know, and right now, data
on instruction-pair or quad frequencies is not something people are
giving away :-)  In any case, there are lots of pairs that don't look
like anything I've ever seen, at least in {S/360, PDP-11, VAX, 3Bxx,
68K, X86, etc }.
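Just as a sketch of why such pairs matter to a superscalar, a hypothetical
2-wide issue machine (no claim about any real design) might group the four
instructions above roughly as:
	# one possible 2-wide issue grouping (sketch only):
	#   cycle 1:   lw   r1,0(a)           li   r2,200
	#   cycle 2:   bne  r1,r2,notequal    add  a,4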
Note that in such RISC code sequences, reorganizers are already working
hard to avoid having the immediately-following code use loaded data.
Also, as seen above, the increment that might go with a load/store has
often been moved all over the place, including (as in the second case)
rearranging offset calculations for convenience, such as turning
pre-increment into post-increment.
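To make the first point concrete, here is a contrived fragment (register
names arbitrary) showing the kind of load-use separation a reorganizer does:
	# straight translation: the loaded value is wanted immediately
	lw	r1,0(p)
	add	r3,r1,r4	# uses r1 in the very next slot
	# after reorganization: an independent nearby instruction is
	# pulled in between to cover the load delay
	lw	r1,0(p)
	li	r5,200		# unrelated work fills the load delay slot
	add	r3,r1,r4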
>
>Look familiar?  This prompted my first reaction when RISC started
>getting popular: Where do they go from here?
I know for sure that at least some people were thinking about more
aggressive pipelines when they designed their ISAs.
>Yes, it's an interesting
>technology opportunity.  But I'd rather implement the next generation
>machines as CISC's rather than RISC's, IF I HAVE TO STAY INSTRUCTION
>SET COMPATIBLE.  In other words, I think that when they learn how to
>implement a fast CISC, it will go faster than the same-technology
>RISC.  Why?  Because for the RISC to keep up, it will have to execute
>an average of two instructions per cycle, and one instruction is a lot
>easier to implement than two, even with auto-increment addressing.
Hmmm.  Rather than fight it out on opinions, how about suggesting
some hypotheses that can be tested, and some outcomes that would
tend to prove or disprove them?  Let me take a stab at restating
this in various ways, to get statements that one can actually test.

I think what I heard might be any of the following hypotheses:

H-1: at some point, using similar technology, people will build
upward-compatible CISCs that are faster than upward-compatible RISCs,
that are available at roughly the same time.

or, maybe it's:

H-2: in the next generation, using similar technology,
people will build upward-compatible CISCs that are faster than
upward-compatible RISCs, that are available at roughly the same time.

H-2 is more attractive, in that it's testable in the near future,
like 1990 or 1991 at the latest, but maybe this was not what was meant?

>Of course, there is always the possibility to do what they did the
>first time: come out with a new architecture that matches the
>technology window exactly.  Note recent product announcements: this
>just keeps happening!  It's just that when you do this, expect a
>limited lifetime of the architecture.
Whenever you do the first instance of an architecture, it certainly
makes sense to understand and use the current technology window.
Beyond that, thoughtful folks think a ways ahead.  At least several
of the current competing groups have done serious thinking
towards implementations in differing technologies and with
different pipelines.  Remember that at least some of the hot ideas
existed a long time ago in supercomputers [the main thing they omit
being the different tradeoffs of implementing in VLSI instead.]

One thing that should be fun in the next round: if you think the
last round has spawned arguments about better ways to do things,
even with pipelines kept purposely simple for risk-avoidance,
or simplicity, or time-to-market, wait till you see the arguments
(and hype) that will hit with the next round...   In particular,
this next time we'll get to debate esoterica seldom seen
outside SIGARCH, ASPLOS, etc.  At least, I think this time we've
all got better measurement tools, and so a lot of data should
become available by the time the dust settles.
-- 
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	{ames,decwrl,prls,pyramid}!mips!mash  OR  mash@mips.com
DDD:  	408-991-0253 or 408-720-1700, x253
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086