[comp.sys.amiga.hardware] RISC Amiga What's RISC?

yorkw@pasture.ecn.purdue.edu (Willis F York) (11/10/90)

Well Here's a Novice Question.

I know RISC means Reduced Instruction Set Computer (or close).

But what does this mean?
The CPU chip has fewer commands (oops, instructions) that it knows how
to run? What's the big deal about that? Or can it do those few REAL fast?
-------
I'm just attempting to increase my knowledge of computers in general,
and these heavy techie talks confuse me.

--
yorkw@ecn.purdue.edu  Willis F York    
----------------------------------------------
Macintosh... Proof that a Person can use a Computer all day and still
not know ANYTHING about computers. 

johnhlee@flute.cs.cornell.edu (John H. Lee) (11/10/90)

In article <yorkw.658182081@pasture.ecn.purdue.edu> yorkw@pasture.ecn.purdue.edu (Willis F York) writes:
>I know RISC means Reduced INstruction Set Computer (Or Close)
>
>But what does this mean?
>The CPU chip has Fewer Commands (oops Instructions) that it knows how 
>to run? What's the big deal about that? OR an it to those few REAL fast?

No question is ever stupid, and novices aren't the only ones to ask this
particular one.

A CISC (Complex Instruction Set Computer) CPU like the beloved 68000 usually
requires several cycles (4-16 and more, for example) of its clock to execute
a single instruction, primarily because of the complexity needed to decode
instructions that basically do *everything*, like 32-bit multiplies, divides,
saving all registers on the stack, jumping to a subroutine while saving the
return address on the stack, etc.  This means a 7MHz 68000 is executing nowhere
near 7 million instructions per second.

A RISC CPU is based on the philosophy that if we cut down on the instructions
supported by the CPU so that every instruction uses only *1* clock cycle,
then we can make things go much faster and do more than a CISC at the same
clock speed.  Moreover, this makes a RISC simpler, smaller, and easier to
design and manufacture.  So RISCs support and optimize only a small set of core
instructions and replace each complex instruction with a sequence of simpler
ones.  That costs extra instructions, but analysis of most programs shows that
the majority of the instructions executed are simple things like MOVEs and
branches, so the overall speed increase is very healthy.  Ideally, a RISC CPU
at 7MHz is indeed executing 7 million instructions per second.
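
The arithmetic behind that comparison can be sketched in a few lines of Python.
This is only an illustration: the function name native_mips is made up here,
and the 8 cycles/instruction average for the CISC part is an assumed figure
picked from inside the 4-16 range mentioned above, not a measured one.

```python
def native_mips(clock_mhz, cycles_per_instruction):
    """Millions of instructions per second = clock rate (MHz) / average CPI."""
    if cycles_per_instruction <= 0:
        raise ValueError("cycles_per_instruction must be positive")
    return clock_mhz / cycles_per_instruction


# Assumed averages, for illustration only: 8 cycles/instruction for the CISC
# part, 1 cycle/instruction for the idealized RISC part, both clocked at 7 MHz.
cisc_mips = native_mips(7.0, 8.0)
risc_mips = native_mips(7.0, 1.0)

print(f"7 MHz CISC at 8 cycles/instruction: {cisc_mips:.3f} MIPS")
print(f"7 MHz RISC at 1 cycle/instruction:  {risc_mips:.3f} MIPS")
```

With those assumed numbers the CISC part comes out at 0.875 MIPS against 7 MIPS
for the idealized RISC part.  Note that this counts instructions, not useful
work, since the RISC program needs more instructions to do the same job.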

RISC CPUs also tend to be available in faster parts, because of their simpler
design.

Another advantage is that most programs are compiled from a higher-level
language, and it is difficult for compilers and compiler writers to utilize
the more complex instructions anyways.  Almost all compilers will use simpler
instructions instead of the complex ones.

There are disadvantages, though.  First off, programs compiled for RISC
systems are larger, obviously.  I've seen 30% larger on average quoted.  Also,
to achieve 1 cycle/instruction, compilers still have to do some optimizations
and handle special cases.  So all is not bliss.

Additionally, the CISC camp has been busy using some RISC techniques so that
now some CISC CPUs are nearing the 1 cycle/instruction mark without the
disadvantages of RISC CPUs (I think the 68040 is reportedly 1.4 cycles/
instruction for some instruction mixes.)  But for now, RISCs are faster,
simpler, and cheaper.
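
To see how the "more instructions" and "fewer cycles per instruction" effects
pull against each other, here is a rough Python sketch of the usual
run time = instructions x cycles/instruction / clock rate relation.  The
numbers are assumptions for illustration: a 25 MHz clock for both chips, a
million-instruction CISC program, and the 30% size penalty treated as if it
were 30% more *executed* instructions (which need not hold for real programs).

```python
def run_time_seconds(instruction_count, cycles_per_instruction, clock_hz):
    """Run time = instructions executed * average CPI / clock rate."""
    if clock_hz <= 0:
        raise ValueError("clock_hz must be positive")
    return instruction_count * cycles_per_instruction / clock_hz


CLOCK_HZ = 25_000_000          # assumed: same clock for both chips
CISC_INSTRUCTIONS = 1_000_000  # assumed program length on the CISC chip

# CISC at the 1.4 cycles/instruction figure quoted for the 68040.
cisc_time = run_time_seconds(CISC_INSTRUCTIONS, 1.4, CLOCK_HZ)

# RISC at 1.0 cycles/instruction, executing an assumed 30% more instructions.
risc_time = run_time_seconds(CISC_INSTRUCTIONS * 1.3, 1.0, CLOCK_HZ)

print(f"CISC: {cisc_time:.3f} s   RISC: {risc_time:.3f} s")
```

Under these assumptions the two land close together (about 0.056 s against
0.052 s), which is why a CISC nearing 1 cycle/instruction narrows the gap.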

-------------------------------------------------------------------------------
The DiskDoctor threatens the crew!  Next time on AmigaDos: The Next Generation.
	John Lee		Internet: johnhlee@cs.cornell.edu
The above opinions are those of the user, and not of this machine.

milamber@caen.engin.umich.edu (Daryl Scott Cantrell) (11/10/90)

In article <yorkw.658182081@pasture.ecn.purdue.edu> yorkw@pasture.ecn.purdue.edu (Willis F York) writes:
>I know RISC means Reduced INstruction Set Computer (Or Close)
[...]
>But what does this mean?

  RISC == Reduced Instruction Set Computing.  It is sort of a different way of
thinking about how computing should be done.  A "normal" computer like an Amiga
is a CISC computer (Complex ISC).  Its CPU, the 680x0, has instructions to do
all sorts of high-level things, like multiplying two signed numbers, allocating
a frame on a private stack, or packing BCDs.  Many of these complex instructions
take several "cycles" of the processor.  This is why an A3000 clocked at 25 MHz
does not actually perform 25 million instructions per second (MIPS).  A RISC
processor, on the other hand, seeks to do away with any instruction that can't
be done entirely in one cycle, unless it's absolutely necessary.  Complex
operations are implemented as routines which are built from the simple, FAST
instructions.  RISC processors generally take on the order of 1.2-1.4 cycles
per instruction, and many also try to perform more than one instruction at a
time.

----
Exit(TEACHERMODE);
Begin(OPINION);
----

  Of course, RISC will never really be that much faster than CISC, because CISC
chips can always bail out by pipelining their architecture for the "most-used"
instructions that RISC chips implement, while still having the higher-level
functions available as microcode.  Those microcoded functions would be much
slower than the optimized instructions, but still faster than the equivalent
RISC routines.  Motorola seems to be doing exactly this with the 68040; I've
seen 1.2-1.3 cyc/ins all over the place..

>yorkw@ecn.purdue.edu  Willis F York    


+---------------------------------------+----------------------------+
|   // Daryl S. Cantrell                |   These opinions are       |
| |\\\ milamber@caen.engin.umich.edu    |    shared by all of    //  |
| |//  HELP! HELP! I'm being REPRESSED! |        Humanity.     \X/   |
+---------------------------------------+----------------------------+

Chuck.Phillips@FtCollins.NCR.COM (Chuck.Phillips) (11/12/90)

>>>>> On 9 Nov 90 22:25:01 GMT, milamber@caen.engin.umich.edu (Daryl Scott Cantrell) said:
Daryl>   RISC == Reduced Instruction Set Computing.

Just a few months ago, I thought that also.  (I was then a RISC basher,
BTW.)  As it turns out, RISC is a misnomer.  The cornerstone of RISC is
ruthless _quantitative_ analysis; features (instructions, cache, etc.) are
only added as they prove themselves _quantitatively_ (cost/performance)
when executing _real programs_.

If you _really_ want to get a handle on what RISC is all about, read:

"Computer Architecture - A Quantitative Approach"
Hennessy & Patterson
ISBN: 1-55860-069-8
Morgan Kaufmann Publishers

Other than the examples where RISC always (surprise, surprise:-) beats CISC
and little acknowledgement that CISC _can_ be pipelined, the book appears to
present an impartial, thorough examination of various cost/performance
tradeoffs.  There is much food for thought here.  (NOTE: I'm only halfway
through the book so far, so add some grains of salt.)

Merely reducing the instruction set doesn't necessarily buy you much.  If
it did, we'd all be using 6502s with 50MHz clocks.  "So why is it called
'RISC'?" you ask.  Beats me.  Instruction set size is only one of many
variables, as is explained in H&P's book.

Daryl> Of course, RISC will never really be that much faster than CISC
Daryl> because CISC chips can always bail out.. By pipelining their
Daryl> architecture for the "most-used" instructions that RISC chips
Daryl> implement, and still having the higher-level functions available as
Daryl> microcode, which would be much slower than the optimized
Daryl> instructions but still faster than RISC implementations.

I'm not sure one way or the other on this.  The one example of this
approach with which I'm familiar is the MicroVAX, which is noticeably
slower than its extremely CISC predecessors (the 780 and 785), which are in
turn noticeably slower than several RISC machines, all at the same clock
speed.  It may turn out someday that the very fastest single processor
machines are super CISC, but their cost/performance is almost certain to be
much poorer than that of simple RISC.

Daryl> Motorola seems to be doing exactly this with the 68040, ...

Are they _really_ taking this approach?  I ask because I don't know
either.

Daryl> I've seen 1.2-1.3 cyc/ins all over the place..

Ditto for the 80486.  (No flames, please:-)  However, there are current
RISC processors that manage to average _less_ than 1 cycle per instruction.
(I've even heard of 1/2-1/3 Cycles Per Instruction peak performance.  How
_do_ they do that?)

The future of computer architecture looks to be very interesting.  The only
thing _I'd_ bet money on is an increase in multi-processor systems.

	Just my $0.02,
--
Chuck Phillips  MS440
NCR Microelectronics 			chuck.phillips%ftcollins.ncr.com
2001 Danfield Ct.
Ft. Collins, CO.  80525   		...uunet!ncrlnk!ncr-mpd!bach!chuckp

daveh@cbmvax.commodore.com (Dave Haynie) (11/13/90)

In article <yorkw.658182081@pasture.ecn.purdue.edu> yorkw@pasture.ecn.purdue.edu (Willis F York) writes:
>Well Here's a Novice Question.

>I know RISC means Reduced INstruction Set Computer (Or Close)

>But what does this mean?
>The CPU chip has Fewer Commands (oops Instructions) that it knows how 
>to run? What's the big deal about that? OR an it to those few REAL fast?

Lots of folks argue about what RISC is, mainly those with nothing better to
do.  RISC really isn't any one thing, it's kind of a bannerhead for a set of
tools you might term "the late 80s approach to microprocessor design".  This
toolkit takes into account many ideals, but the main point seems to be that
one should be processing at least one instruction every clock cycle.  To this
end, you have:

	- Simplified Instruction Set
	  This doesn't always mean that the instructions do less work than
	  those of a CISC machine.  For example, most RISC chips have three
	  operand arithmetic functions, versus the two operand equivalents
	  in chips like the 68030.  The basic idea is that the instruction set
	  should be very orthogonal, stick mainly to simple instructions that
	  can be executed in a single cycle, and not support zillions of
	  different addressing modes.  Once you have such an instruction set,
	  the computer design is simpler and the set can be implemented fully
	  hardwired, rather than via microcode as in the 68030.  
	- Load/Store Architecture
	  Another RISC tenet is that touching main memory is the only 
	  operation likely to take more than one clock cycle.  So you isolate
	  that operation -- only load and store instructions are capable of
	  touching memory, all others work between registers.  Along with this,
	  you'll find that RISC chips tend to have more registers, from 32
	  up to a hundred or more.
	- Pipelining
	  In order to keep roughly one instruction per clock running, RISC
	  designs tend to be heavily pipelined.  You have several stages of
	  execution for each instruction.  So, while a single instruction
	  may actually take 6 clock cycles to pass through the whole machine
	  pipeline, there should be one instruction in each of these 6 slots
	  at all times, therefore yielding an effective 1 clock/instruction.
	  Any code that takes more than this, such as a load or store, will
	  cause a pipeline stall, an unused slot or two in the pipeline.  The
	  long pipeline causes some strange coding practices.  On many RISC
	  chips, even accessing the same register in consecutive instructions
	  will cause a pipeline stall, since the register hasn't quite been
	  written in I0 by the time it needs to be read by I1.  So smart 
	  compilers are called for, which can manage register allocations.
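
The pipelining arithmetic in the last item can be sketched as a toy model in
Python.  Everything here is an assumption made for illustration: the function
name pipeline_cycles, the tuple format for instructions, a one-cycle stall for
every load or store, and a one-cycle stall whenever an instruction reads the
register written by the instruction just before it.  Real chips differ in all
of these details.

```python
def pipeline_cycles(instructions, depth=6, load_store_stall=1, dependency_stall=1):
    """Count cycles for an in-order pipeline.

    instructions: list of (op, dest, sources) tuples, e.g.
                  ("add", "r2", ("r1", "r3")).  dest may be None.
    With no stalls, N instructions take depth + (N - 1) cycles: `depth`
    cycles to fill the pipe, then one instruction completes per cycle.
    """
    if not instructions:
        return 0
    stalls = 0
    previous_dest = None
    for op, dest, sources in instructions:
        if op in ("load", "store"):
            # Assumed: every memory access costs extra pipeline slots.
            stalls += load_store_stall
        if previous_dest is not None and previous_dest in sources:
            # Reads a register the previous instruction has not written yet.
            stalls += dependency_stall
        previous_dest = dest
    return depth + (len(instructions) - 1) + stalls


# 1000 independent register-to-register adds: effectively 1 clock/instruction.
independent = [("add", "r1", ("r2", "r3"))] * 1000
print(pipeline_cycles(independent) / len(independent))

# A short sequence with a load, a store, and back-to-back register use.
program = [
    ("load", "r1", ("r10",)),
    ("add", "r2", ("r1", "r3")),     # reads r1 right after the load writes it
    ("add", "r4", ("r5", "r6")),
    ("store", None, ("r4", "r11")),  # reads r4 right after the add writes it
]
print(pipeline_cycles(program))
```

In this model a compiler that reorders the two adds, or picks different
registers, removes the dependency stalls, which is the kind of scheduling the
"smart compilers" remark refers to.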

There are a few more concepts, but these are the basic ones.  Most of the ideas
that are in today's RISC devices come, either directly or a little roundabout,
from the supercomputer work done by folks like Cray.  And many of the same
reasons are present, only at the chip level.  Optimizing the size of the CPU
design means many fewer gates.  So you can use a faster process technology,
if available, than the folks building CISCs.  Or you can add much more on-chip
cache in the same process technology.  Or you can make the thing yield like
crazy and drop the price relative to a CISC device.  

There really isn't anything in RISC you can't apply to CISC, for the right 
price.  The 68040 is a good example.  The most common 680x0 instructions in
that device are hard wired rather than microcoded.  It has a deep pipeline,
with even a few innovations over most RISCs (for example, address register
increments and decrements, as well as offsets, get resolved in their own
pipeline stage with their own ALU, so these addressing modes don't add time
to the instruction execution).  The 68040 also has a large on-chip cache,
which unlike the caches in most other chips, responds in a single cycle for
cache hits, making it nearly as fast as register access.  On the downside, all
this extra logic has made the 68040 take up 1.2 million devices in a 0.8
micron CMOS process.  You aren't going to see this move into GaAs or ECL any
time soon, whereas the MIPS folks already have an ECL version of their MIPS
architecture, the R6000.  And while 68040s will most likely be in the
several-hundred-dollar range for some time, you can get RISC chips at or near the same 
performance level for $100, maybe even a little less.


-- 
Dave Haynie Commodore-Amiga (Amiga 3000) "The Crew That Never Rests"
   {uunet|pyramid|rutgers}!cbmvax!daveh      PLINK: hazy     BIX: hazy
	Standing on the shoulders of giants leaves me cold	-REM

slfields@uokmax.ecn.uoknor.edu (Scott L Fields) (11/14/90)

In article <CHUCK.PHILLIPS.90Nov12124302@halley.FtCollins.NCR.COM> Chuck.Phillips@FtCollins.NCR.COM (Chuck.Phillips) writes:
>>>>>> On 9 Nov 90 22:25:01 GMT, milamber@caen.engin.umich.edu (Daryl Scott Cantrell) said:
>Daryl>   RISC == Reduced Instruction Set Computing.
>Daryl> I've seen 1.2-1.3 cyc/ins all over the place..
>
>Ditto for the 80486.  (No flames, please:-)  However, there are current
>RISC processors that manage to average _less_ than 1 cycle per instruction.
>(I've even heard of 1/2-1/3 Cycles Per Instruction peak performance.  How
>_do_ they do that?)

That is not an easy question to answer. If you pipeline the incoming
instruction stream, then the instructions are simply queued up. On the other
hand, if you have more than one instruction sequencer, you can start running
those instructions in parallel. The problem is that you have to keep an eye out
for instructions that would interfere with their neighbors' operands. For
example, if the first instruction in the queue is add r1 to r2, then you don't
want the next instruction modifying the values in r1 or r2 until the first
instruction completes. God, that is a simplified approach but it should get the
idea across. I would see problems if you ever got a uniform 1 cycle/instruction
CPU though. If every cycle you fetch a new instruction, how can you prefetch
the next instruction into the queue? I am not into RISC design so, at the
moment, such solutions escape me. I do know that the Intel 860 {maybe the 960}
CPU can manage about 3 instructions a cycle. You could achieve the same kinds
of numbers for CISC chips with this kind of architecture, but you would need to
design programs with this in mind. Imagine a 68050 averaging 4 instructions per
cycle at 50 MHz. Then again, think of the number of transistors to make the
BEAST!
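
The "keep an eye out" check can be sketched in Python as follows.  This is a
toy model, not how any particular chip does it: the names can_issue_together
and dual_issue_cycles are invented here, each instruction is reduced to a
(destination, sources) pair, and the machine is assumed to start at most two
adjacent instructions per cycle.

```python
def can_issue_together(first, second):
    """True if `second` may start in the same cycle as `first`.

    Each instruction is (dest, sources), e.g. "add r1 to r2" is
    ("r2", ("r1", "r2")).
    """
    first_dest, first_sources = first
    second_dest, second_sources = second
    reads_result = first_dest in second_sources     # second needs first's result
    same_dest = first_dest == second_dest           # both write one register
    clobbers_input = second_dest in first_sources   # second overwrites an input
    return not (reads_result or same_dest or clobbers_input)


def dual_issue_cycles(program):
    """Cycles needed when up to two adjacent instructions start per cycle."""
    cycles = 0
    i = 0
    while i < len(program):
        if i + 1 < len(program) and can_issue_together(program[i], program[i + 1]):
            i += 2   # the pair starts together
        else:
            i += 1   # the second instruction has to wait for the next cycle
        cycles += 1
    return cycles


program = [
    ("r2", ("r1", "r2")),   # add r1 to r2
    ("r1", ("r5",)),        # modifies r1, so it must wait for the add
    ("r7", ("r5", "r6")),   # independent of the instruction before it
    ("r8", ("r3", "r4")),
]
cycles = dual_issue_cycles(program)
print(cycles, cycles / len(program))
```

For this four-instruction sequence the model needs 3 cycles, i.e. 0.75 cycles
per instruction, and a better instruction order would get it down to 2.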

Any further discussion merited? Please, speak your piece.

dlt@locus.com (Dan Taylor) (11/15/90)

Chuck.Phillips@FtCollins.NCR.COM (Chuck.Phillips) writes:

>>>>>> On 9 Nov 90 22:25:01 GMT, milamber@caen.engin.umich.edu (Daryl Scott Cantrell) said:
>Daryl>   RISC == Reduced Instruction Set Computing.

>Just a few months ago, I thought that also.  (I was then a RISC basher,
>BTW.)  As it turns out, RISC is a misnomer.  The cornerstone of RISC is
>ruthless _quantitative_ analysis; features (instructions, cache, etc.) are
>only added as they prove themselves _quantitatively_ (cost/performance)
>when executing _real programs_.

>Daryl> Of course, RISC will never really be that much faster than CISC
>Daryl> because CISC chips can always bail out.. By pipelining their
>Daryl> architecture for the "most-used" instructions that RISC chips
>Daryl> implement, and still having the higher-level functions available as
>Daryl> microcode, which would be much slower than the optimized
>Daryl> instructions but still faster than RISC implementations.

>Daryl> Motorola seems to be doing exactly this with the 68040, ...

>Are they, _really_ taking this approach?  I ask because I don't know
>either.

>Daryl> I've seen 1.2-1.3 cyc/ins all over the place..

>Ditto for the 80486.  (No flames, please:-)  However, there are current
>RISC processors that manage to average _less_ than 1 cycle per instruction.
>(I've even heard of 1/2-1/3 Cycles Per Instruction peak performance.  How
>_do_ they do that?)


It's quite easy, actually.  If you know that the bus interface is
going to be busy for a cycle, use that cycle to perform a "future"
ALU operation.  The 68020, even, had peak operations of 0 clock
cycles.  I have experimentally verified this.  Not as a flame, but a
comment, since the '040 has 2 internal busses (Harvard Architecture),
and a separate ALU for operand address calculations, its PEAK operation
rate is higher than the '486.  The '486 does have a very smart pipeline,
though.  Ultimately, however, both processors still have to get the opcodes
and operands from memory/disk, and this is where real differences in overall
system performance are made.
-- 

* Dan Taylor    * The opinions expressed are my own, and in no way *
* dlt@locus.com * reflect those of Locus Computing Corporation.    *

dawill@hubcap.clemson.edu (david williams) (11/16/90)

In article <CHUCK.PHILLIPS.90Nov12124302@halley.FtCollins.NCR.COM>, Chuck.Phillips@FtCollins.NCR.COM (Chuck.Phillips) writes:
> >>>>> On 9 Nov 90 22:25:01 GMT, milamber@caen.engin.umich.edu (Daryl Scott Cantrell) said:
> Daryl>   RISC == Reduced Instruction Set Computing.
> 
> Just a few months ago, I thought that also.  (I was then a RISC basher,
> BTW.)  As it turns out, RISC is a misnomer.  The cornerstone of RISC is
> ruthless _quantitative_ analysis; features (instructions, cache, etc.) are
> only added as they prove themselves _quantitatively_ (cost/performance)
> when executing _real programs_.
> 
> Merely reducing the instruction set doesn't necessarily buy you much.  If
> it did, we'd all be using 6502s with 50MHz clocks.  "So why is it called
> 'RISC'?" you ask.  Beats me.  Instruction set size is only one of many
> variables, as is explained in H&P's book.

   Most RISC processors tend to have a rather large register set, to 
minimize the memory fetches required to carry out a given instruction
sequence (more registers == fewer fetches to get intermediate results).
The 6502 fails miserably at this ...  :-)

[stuff deleted about CISC being able to speed up with RISC techniques]
> 
> Daryl> Motorola seems to be doing exactly this with the 68040, ...
> Are they, _really_ taking this approach?  I ask because I don't know
> either.

   Yes - Motorola made big claims about their new techniques to do this
when they introduced the 68040.  Note that a company called EDGE was
making a RISC-like chip a few years back that ran 68000 code at 
an average rate of 1.4 cyc/ins...  Big ASIC chip, I
believe...

> Daryl> I've seen 1.2-1.3 cyc/ins all over the place..
> 
> Ditto for the 80486.  (No flames, please:-)  However, there are current
> RISC processors that manage to average _less_ than 1 cycle per instruction.
> (I've even heard of 1/2-1/3 Cycles Per Instruction peak performance.  How
> _do_ they do that?)

    The Intel 860 and 960 chips achieve this sort of performance by
having multiple computation units.  I think there are *two* integer
units, and a floating point unit.  Careful programming can have 3
instructions going at once, one in each unit (of course, the I units would
tend to complete instructions faster than the F unit can, so techniques
have to balance the load carefully).  Naturally, the IPUs and the FPU
are pipelined, so that their average cycles are low.  Pretty neat chip
all around...
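
A rough Python sketch of why the instruction mix matters with several units.
The unit counts (two integer, one floating point) are just the recollection
above taken as an assumption, not checked against Intel's documentation, and
the model ignores data dependencies and the longer latency of floating point
work.  The name cycles_with_units is made up for this example.

```python
def cycles_with_units(kinds, units=None):
    """Cycles to issue a stream of instructions across parallel units.

    kinds: sequence of "int" / "fp" tags in program order.
    units: how many units of each kind exist (assumed: 2 integer, 1 FP).
    Each cycle, instructions are issued in order until the next one
    needs a unit type that is already fully used in that cycle.
    """
    if units is None:
        units = {"int": 2, "fp": 1}
    for kind in kinds:
        if units.get(kind, 0) <= 0:
            raise ValueError(f"no unit available for kind {kind!r}")
    cycles = 0
    i = 0
    while i < len(kinds):
        free = dict(units)   # all units are free again at the start of a cycle
        while i < len(kinds) and free[kinds[i]] > 0:
            free[kinds[i]] -= 1
            i += 1
        cycles += 1
    return cycles


balanced = ["int", "int", "fp"] * 4   # keeps all three units busy
integer_only = ["int"] * 12
fp_only = ["fp"] * 12

for name, stream in (("balanced", balanced),
                     ("integer only", integer_only),
                     ("fp only", fp_only)):
    cycles = cycles_with_units(stream)
    print(f"{name}: {cycles} cycles, {cycles / len(stream):.2f} cycles/instruction")
```

In this model the balanced stream reaches 1/3 cycle per instruction, the
integer-only stream 1/2, and the floating-point-only stream is back to 1,
which lines up with the 1/2-1/3 peak figures mentioned in the quoted question.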




     Dave Williams
        dawill@hubcap.clemson.edu
           "Huh?  What?  Could you repeat the question?"

jma@beach.cis.ufl.edu (John 'Vlad' Adams) (11/17/90)

Of course, the joke 'round here is that a RISC processor is
nothing but a 6502 with a fast clock and lots of 32bit registers...
:)

--
John  M.  Adams   --**--   Professional Student on the eight-year plan!     ///
Internet:   jma@beach.cis.ufl.edu   -or-   vladimir@maple.circa.ufl.edu    ///
"We'll always be together, together in electric dreams" Moroder & Oakey \\V//
Sysop of The Beachside.   FIDOnet 1:3612/557.   904-492-2305  (Florida)  \X/

peterk@cbmger.UUCP (Peter Kittel GERMANY) (11/19/90)

In article <25475@uflorida.cis.ufl.EDU> jma@beach.cis.ufl.edu (John 'Vlad' Adams) writes:
>Of course, the joke 'round here is that a RISC processor is
>nothing but a 6502 with a fast clock and lots of 32bit registers...
>:)

I agree totally!

-- 
Best regards, Dr. Peter Kittel  // E-Mail to  \\  Only my personal opinions... 
Commodore Frankfurt, Germany  \X/ {uunet|pyramid|rutgers}!cbmvax!cbmger!peterk