[comp.arch] CISCy RISC? RISCy CISC?

sbw@naucse.UUCP (Steve Wampler) (10/17/88)

(I just know I'm going to regret this, but hey, it's late.)

Just what is it about RISC vs. CISC that really sets them apart?
With my very naive understanding, it really seems that the big
difference is that RISC models will let one get into the high
speed technologies faster (which we really haven't seen out on
the market place yet).  Other than that, I doubt I would care
whether my machine is RISC or CISC, if I can even tell them apart.

A case in point.  I know of a not-yet-announced machine (perhaps
never to be announced machine) that has just about the largest
instruction set I can imagine (not to mention the 15+ addressing
modes).  However, the machine has many of the features that give
some RISC chips their performance - zillions of registers, big
I and D caches, etc., and gets most instructions down to 1 cycle
per instruction (some of the more complex instructions 'appear'
to run faster because they work on multiple bytes at a time).
The result is a 12.5MHz machine that runs 25000 (claimed)
dhrystones using what I would call a 'throwaway' C compiler.
The manufacturers 'know' they can push the clock up to 30MHz,
which (my estimate, this time) would give >40000 dhrystones.
(Hey, I'm into software, I don't know what's true in hardware.)

Now, I've missed most of the RISC/CISC wars, but these seem to
me to be very fine numbers, at least compared with the
uVAXen I've played with (all of which cost more).
How do they compare to current RISCs?  I'd bet pretty much the same.
I personally couldn't care which machine I'd own (not that I can
afford any).  When the really fast chips come in, I bet the RISC
machines are the first to come out, but still, is there something
that will keep CISC from catching up?
-- 
	Steve Wampler
	{....!arizona!naucse!sbw}

abali@raksha.eng.ohio-state.edu (Bulent Abali) (10/18/88)

In article <973@naucse.UUCP> sbw@naucse.UUCP (Steve Wampler) writes:
>A case in point.  I know of a not-yet-announced machine (perhaps
>never to be announced machine)....

>The result is a 12.5MHz machine that runs 25000 (claimed)
>dhrystones using what I would call a 'throwaway' C compiler.
>The manufacturers 'know' they can push the clock up to 30MHz,
>which (my estimate, this time) would give >40000 dhrystones.

I designed a new machine on a napkin which cranks a humble
100,000 dhrystones at 8 MHz. It is faster than
Amdahl 5890 (43,000 dhrys), IBM 3090 (31,000 dhrys), 
and Cray XMP (18,000 dhrys).

bcase@cup.portal.com (Brian bcase Case) (10/20/88)

>(I just know I'm going to regret this, but hey, it's late.)

Nah, we're as gentle as a spring rain in this group.  :-)

>Just what is it about RISC vs. CISC that really sets them apart?
>With my very naive understanding, it really seems that the big
>difference is that RISC models will let one get into the high
>speed technologies faster (which we really haven't seen out on
>the market place yet).  Other than that, I doubt I would care
>whether my machine is RISC or CISC, if I can even tell them apart.

You are right!  *Who cares* what it is, as long as it meets your
needs.  This is the old concept of the "High-level language machine"
about which Patterson wrote before the RISC I came out.  The point
is that a HLL machine can be perfectly well implemented at a VERY
low level as long as the user sees a HLL machine.

BUT, RISC is easier to make go fast than CISC.  Some, but only some,
of the advantage is that RISCs are starting fresh while most CISCs
must be backward compatible.  Even so, there are no new CISC designs
being done, that I know of.  The point is, if you have a choice, you'd
be dumb to design a CISC instead of a RISC, at least with what we
know at this time.

The real difference:  Optimizing compilers can do a great job of
optimizing at a low level, but not at a higher level; i.e., RISC
instructions implement the right primitives, CISC implements groups
of operations at once thus preventing the compiler from breaking
them up so that the individual parts can be eliminated, factored out
of loops, reused, etc.  Now, given that the compiler wants simple,
composable primitives, we notice with glee that these are exactly the
things that can be implemented in a uniform pipeline! Wow!  *Synergy*
The whole is greater than the sum of the parts.  There is much more to
it, things like registers and exposed parallelism, but I think this
pretty much sums it up (if possible).
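
A minimal sketch of the point about composable primitives, using a toy
cost model rather than measurements of any real machine.  The function
names, the fused "load [base + offset*scale + i]" operation, and the
per-operation counts are all assumptions made for illustration.  When the
address computation is locked inside one complex operation, the invariant
part is redone on every iteration; when it is exposed as primitives, the
invariant part can be computed once outside the loop.

```python
def sum_with_fused_op(memory, base, offset, scale, n):
    """Each iteration issues one hypothetical fused load of
    memory[base + offset*scale + i]; the invariant work is repeated."""
    total, ops = 0, 0
    for i in range(n):
        address = base + offset * scale + i  # multiply, add, add
        total += memory[address]             # load
        ops += 4                             # address-related work only
    return total, ops


def sum_with_primitives(memory, base, offset, scale, n):
    """The same loop built from simple operations, with the
    loop-invariant multiply and add hoisted out of the loop."""
    total, ops = 0, 0
    start = base + offset * scale            # multiply, add: done once
    ops += 2
    for i in range(n):
        total += memory[start + i]           # add, load
        ops += 2
    return total, ops


memory = list(range(100))
fused_total, fused_ops = sum_with_fused_op(memory, 10, 3, 4, 8)
prim_total, prim_ops = sum_with_primitives(memory, 10, 3, 4, 8)
print(fused_total, fused_ops)
print(prim_total, prim_ops)
```

Both versions compute the same sum; only the count of address-related
operations differs, and the gap widens as the loop gets longer.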

Would you buy a book called "Understanding RISC" if someone wrote it?
I hope so!

>A case in point.  I know of a not-yet-announced machine that has just
>about the largest instruction set I can imagine (not to mention the 15+
>addressing modes).  However, [it] has features that give RISC chips
>their performance - zillions of registers, big I & D caches, etc., and most
>instructions down to 1 cycle per instruction.  The result is a 12.5MHz machine
>that runs 25000 (claimed) dhrystones using what I would call a 'throwaway'
>C compiler.  The manufacturers can push the clock to 30MHz, which would give
>>40000 dhrystones.

Well, if it's true it's true.  If dhrystones scale with clock rate on this
machine that would give 60K dhrystones.  This is not bad for 30 MHz.  2000
dhrystones per MHz is better than the typical 1100-1500 per.  My guess,
though, is that most of the instructions in the set do not contribute
commensurate with their implementation cost.  Studies have shown again and
again that it's hard to beat an instruction set with load, store, add, cmp,
delayed branches (maybe compare-and-branch if that fits in the pipe), and
call, at least for systems code.  Very few other instructions, but some,
contribute more than 1% to *overall* performance.  Did these guys really
study the frequencies of execution and total time taken?  Perhaps they
did.  I think I know who you are talking about, but I can't say anything.
Would this processor come from Germany?  And many RISCs and re-implemented
CISCs will be at more than 30 MHz soon (?).
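
The scaling arithmetic above can be checked with a short sketch.  It
assumes, as the post does, that dhrystones scale linearly with clock
rate, which real machines may not do once memory speed becomes the
limit.  The function names are illustrative only.

```python
def dhrystones_per_mhz(dhrystones, clock_mhz):
    """Dhrystones delivered per MHz of clock."""
    return dhrystones / clock_mhz


def scaled_dhrystones(dhrystones, clock_mhz, new_clock_mhz):
    """Project a dhrystone figure to a new clock, assuming linear scaling."""
    return dhrystones_per_mhz(dhrystones, clock_mhz) * new_clock_mhz


# Figures from the thread: 25000 (claimed) dhrystones at 12.5 MHz.
per_mhz = dhrystones_per_mhz(25000, 12.5)
projected = scaled_dhrystones(25000, 12.5, 30)
print(per_mhz)    # 2000.0
print(projected)  # 60000.0
```

So the ">40000" estimate in the original post is conservative if the
linear assumption holds.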

>...but these seem to me to be very fine numbers, at least compared with the
>uVAXen I've played with.  How do they compare to current RISCs?  I'd bet
>pretty much the same.

Yup, pretty much the same.  I suspect they implemented the simple instructions
as a RISC would, and *that* is the reason that its performance is good, not
the existence of all the other instructions.  But there is always room for
a discovery or two.

>When the really fast chips come in, I bet the RISC machines are the first
>to come out, but still, is there something that will keep CISC from
>catching up?

There are many techniques that can be applied to both RISC and CISC
machines to make them fast.  However, some of them are MUCH more difficult
to implement when the length and format of instructions isn't known a priori
and when instructions can have multiple effects and require multiple cycles.
Thus, the CISC guys will initially implement the simple, RISC-like subsets
of their instruction sets using a uniform pipeline.  The other instructions
will run at the old speeds or a little better.  But, these techniques don't
fix things like two-address instructions vs. three-address instructions,
they can't add more registers (although I know someone is going to do that
to their machine, but it is not simple!), and they can't add some of the
exposed parallelism that RISCs can have.  Or a CISC can have a decoded-
instruction cache, but this can add latency when the next needed instruction
isn't in the decoded instruction cache, which is much smaller (since the
decoded instructions are much bigger than the encoded ones) than a regular
instruction cache.
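
As a rough illustration of the two-address vs. three-address point, the
sketch below lowers "dest = src1 + src2" both ways.  The tuple encoding
and the "mov"/"add" mnemonics are made up for this example and do not
correspond to any particular instruction set.  A two-address add
overwrites its first operand, so an extra move is needed whenever the
destination is a different register from both sources.

```python
def lower_three_address(dest, src1, src2):
    """Three-address form: add dest, src1, src2."""
    return [("add", dest, src1, src2)]


def lower_two_address(dest, src1, src2):
    """Two-address form: 'add d, s' means d = d + s."""
    if dest == src1:
        return [("add", dest, src2)]
    if dest == src2:
        return [("add", dest, src1)]  # addition is commutative
    # Neither source may be overwritten, so copy one into dest first.
    return [("mov", dest, src1), ("add", dest, src2)]


print(lower_three_address("c", "a", "b"))
print(lower_two_address("c", "a", "b"))
print(lower_two_address("a", "a", "b"))
```

The extra move only appears when both sources are still needed
afterwards, which is exactly the case a clever pipeline cannot remove
without changing the instruction set.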

The point is that RISCs are probably going to be smaller, faster, and
cheaper than CISCs, and implementing out-of-order execution and pattern
matching to allow multiple instructions per cycle will probably be much
easier.

Stay tuned....  We'll all see how this turns out!

malcolm@Apple.COM (Malcolm Slaney) (10/21/88)

Brian (bcase) Case writes:
>
>BUT, RISC is easier to make go fast than CISC.  Some, but only some,
>of the advantage is that RISCs are starting fresh while most CISCs
>must be backward compatible.  Even so, there are no new CISC designs
>being done, that I know of.  

Ummm, I think Symbolics and TI would probably argue with this statement.
Both companies have recently introduced Lisp machines on a chip....and nobody
will ever call a conventional Lisp machine architecture a RISC.

It could be argued that this is just a reimplementation of an old machine but
as near as I can tell the people designing these chips believe that their
architecture is optimum.  Since neither Symbolics nor TI has a large number
of people selling binary-only code for these machines I don't think binary
compatibility is an issue.

Rumor has it that the people that used to be at Lisp Machines International 
(LMI) have started a new company to build yet another Lisp machine on a chip.

							Malcolm
P.S.  An interesting question is whether Symbolics/TI/LMI will fail because
the market is too small to support a processor designed for Lisp and GC or
because CISCs are a mistake.

doug@edge.UUCP (Doug Pardee) (10/21/88)

An all-in-one posting from a not-exactly-unbiased source...

>For an example of an architecture that's 68000 compatible and RISCy to
>the point of executing most instructions in a single clock cycle, look
>no farther than the Edge computer.  However, if you want this on a 
>single chip, instead of a bunch of gate arrays, you'll have to wait.

Just so's everyone will know... we had a clash with Leading Edge computers
over the name, and our lawyers advised that since until recently all of our
units went to OEMs who put their own names on them, the "Edge" name isn't
well known enough to be worth fighting for.  So last month, with Leading
Edge's blessing, we became <fanfare please> Edgcore Technology, Inc.

No, I don't like the name Edgcore.  But then, I thought that "Apple" was a
really stupid name for a computer and that "IBM Personal Computer" was
an awfully unoriginal name.  :-)

 -----  Next subject and author....

>Even so, there are no new CISC designs
>being done, that I know of.

I presume that this means "instruction set designs", not computer designs.
There have been a ton of new systems using old instruction sets recently;
for example Amdahl just came out with a stupendous box using the 370 set.

As for designing new instruction sets, the fact that CISC has attained a
level of stability while new RISC instruction sets seem to appear each month
is hardly a point in favor of RISC.  New instruction sets usually appear
because there are perceived major flaws in the existing sets.  Edge chose to
implement the 680x0 instruction set in our box because we considered it to
be a very practical and powerful set and we didn't think we could do much
better with an Edge-designed set.  We *did* add a few instructions that
we thought were missing, though...

Intel, on the other hand, found it worthwhile to design a new instruction
set for the 80386.  I can't imagine why :-)

 -----  Next subject and author....

>The problem is that many of the current generation of risc machines do
>not support interlocked instructions.
> ...
>... if I want to build mainframes based on multiple high
>performance risc chips, I find myself up the creek.

The incorrect assumption here is that you would want to build a mainframe
using RISC technology -- that RISC technology has anything to offer at
that price/cost level.

As we at Edgcore have shown, it is both possible and practical to implement
CISC instruction sets at speeds faster than RISC.  But -- it doesn't all fit
on one chip.  Yet.

In a mainframe design, who cares if it fits on one chip?  Jeez, in our E2000
system we need an entire triple-high VME card jam-packed with surface-mount
parts just to hold the *caches* that we need to have to keep from starving
the CPU.  The complexity and board area of the CPU itself is insignificant
compared to that required by mainframe-sized multi-level memory systems.

Nobody's going to be building a "single board mainframe" with today's DRAM
and SRAM technologies.  If/when those RAM technologies do reach that point,
it's likely that our 1-instruction-per-clock-cycle 680x0 will fit on one chip
too.  Of course, by then there'll be a different definition of "mainframe"...
-- 
Doug Pardee, Edgcore Technology (formerly Edge Computer), Scottsdale, AZ  
{ames,hplabs,sun,amdahl,allegra}!oliveb!edge!doug    uunet!ism780c!edge!doug

bcase@cup.portal.com (Brian bcase Case) (10/23/88)

>we became <fanfare please> Edgcore Technology, Inc.

Trumpets playing...  A few of us at Apple wanted to get one of your boxes
to run Macintosh:  we wanted to see just how it would feel to use a
Mac at that speed.  We conjectured that scrolling would be too fast to
tolerate!  "Wait! there goes my text!  Stop!"

>ME>Even so, there are no new CISC designs
>ME>being done, that I know of.
>I presume that this means "instruction set designs", not computer designs.
>There have been a ton of new systems using old instruction sets recently;
>for example Amdahl just came out with a stupendous box using the 370 set.

Yes, I did mean to say new CISC instruction set designs.  Sigh, I have
to be more careful.

>As for designing new instruction sets, the fact that CISC has attained a
>level of stability while new RISC instruction sets seem to appear each month
>is hardly a point in favor of RISC.  New instruction sets usually appear
>because there are perceived major flaws in the existing sets.

You are right, sometimes new things come out because old things are
inherently flawed, but sometimes new things come out because we have
thought of better ways!  This doesn't necessarily describe the activity
in the RISC arena, but it will in the future.  Thus, I perceive the
introduction of new RISC instruction sets as an advantage!  It does at 
least one important thing:  it makes you realize that you shouldn't
rely on one instruction set if you hope to be able to take advantage of
the state of the art in price/performance.  Sure, you can have 30 MIPS
370s, if you are willing to pay $10 Mega.  Me?  I'll stick with something
a little closer to the cost of a car.  :-)  Counting on the instruction
set being the same 3 years from now seems like a limiting thing to me.
Surely (*stop* calling me Shirley) we can think of something better
than RISC (Multiflow is an example).

Notice that this is where the MIIL concept comes back in.  I disagree
with the comment someone else made claiming that the issue of MIILs is
a red herring.  We need something other than source to insulate programs
from architecture tweaks.

>As we at Edgcore have shown, it is both possible and practical to implement
>CISC instruction sets at speeds faster than RISC.  But -- it doesn't all fit
>on one chip.  Yet.

I would bet, a fair amount, that by the time you have one processor on a
chip, there will be single-chip RISC multiprocessors; maybe not for sale
at Fry's, but somewhere.  I don't know how the bandwidth requirements will
be satisfied, but they will exist.

>In a mainframe design, who cares if it fits on one chip?  Jeez, in our E2000
>system we need an entire triple-high VME card jam-packed with surface-mount
>parts just to hold the *caches* that we need to have to keep from starving
>the CPU.  The complexity and board area of the CPU itself is insignificant
>compared to that required by mainframe-sized multi-level memory systems.

Yes, everyone has the same problem:  the memory hierarchy.

>Nobody's going to be building a "single board mainframe" with today's DRAM
>and SRAM technologies.  If/when those RAM technologies do reach that point,
>it's likely that our 1-instruction-per-clock-cycle 680x0 will fit on one chip
>too.  Of course, by then there'll be a different definition of "mainframe"...

Yes, it depends on the definition of mainframe.  It could be argued that,
for some types of computations (dhrystones?  :-), single-board computers
exist that are faster than mainframes.

henry@utzoo.uucp (Henry Spencer) (10/25/88)

In article <10358@cup.portal.com> bcase@cup.portal.com (Brian bcase Case) writes:
>Yes, it depends on the definition of mainframe.  It could be argued that,
>for some types of computations (dhrystones?  :-), single-board computers
>exist that are faster than mainframes.

A couple of years ago, some irreverent soul pointed out that the Atari ST
does more dhrystones per dollar than any other machine on the market --
far more than the Crays, for example.
-- 
The dream *IS* alive...         |    Henry Spencer at U of Toronto Zoology
but not at NASA.                |uunet!attcan!utzoo!henry henry@zoo.toronto.edu