[comp.sys.next] RISC vs. CISC -- SPECmarks

melling@cs.psu.edu (Michael D Mellinger) (04/04/91)

In article <27fa3350.6bc2@petunia.CalPoly.EDU> araftis@polyslo.CalPoly.EDU (Alex Raftis) writes:

   What makes you say this? Sure, RISC is nice, and it has its place, but
   look at what advantages RISC gives. It has few instructions, so they execute
   quickly. That's nice, but each instruction does very little, so you need
   five instructions to do the same things a CISC processor can do. A RISC
   processor has a lot of registers to work with for speed. Well, the 68040
   has sixteen registers, which I find more than plenty when programming.

Look at the SPECmarks for the chips.  RISC chips are outperforming
CISC chips.

   What are some of its disadvantages? Well, floating point work generally
   is more difficult. RISC chips are also nearly impossible to work with at the
   assembly level due to the amount of work the programmer has to do to make
   the advantages of RISC, like pipelining, work.

cc -O is all you need to know.

   On the other hand, look at the 68040. It executes instructions at around
   1.3 cycles per instruction. It gives you lots of registers. It's easy to
   work with at the assembly level, and it's easy to write compilers for. It
   has a large established base of programmers. Its only major problem is
   in clock speed. A 25 MHz 68040 can beat a SPARC at 25 MHz, but the SPARC's
   top speed is 40 MHz, which will easily beat the 040. Motorola claims to be
   working on a 50 MHz version of the 040, but I don't have any idea of when
   they claim this will be released.

SPARC isn't a very good RISC.  Look at what MIPS is doing.  They make
the best CPUs.  And that 1.3 cycles/instruction is optimal.  Do you
think that's what we're getting on the NeXT?

   From NeXT's perspective, they decided to go with the Motorola line. If
   they should decide to change at this point, consider what would have to
   happen. 1. They would need to rewrite their system software. 2. Many
   companies would be forced to make major changes to current software
   packages. 3. They would have to completely rework all that they've done
   with their hardware. Basically, they'd need to pump a lot of money into
   doing something that only has dubious advantages. With their current
   strategy they can work their way into the workstation market with their
   25 MHz cube, which requires low cost memory and support hardware, while
   they wait for faster versions of the processor to come onto the market.
   Once this occurs, they can begin to release faster versions of the Cube,
   while still selling their current models as the Workstation for the rest
   of us.

All most companies would have to do is type 'make' to recompile their
programs if NeXT switched to a RISC chip.

-Mike

araftis@polyslo.CalPoly.EDU (Alex Raftis) (04/06/91)

In article <?paG6lfg1@cs.psu.edu> melling@cs.psu.edu (Michael D Mellinger) writes:
>
>Look at the SPECmarks for the chips.  RISC chips are outperforming
>CISC chips.
>

I'm mainly trying to make the point that CISC is not dead. Most RISC chips
are outperforming CISC, and I'd say that it's mainly because of
their simpler hardware design. Yes, someone had an excellent idea when
they decided to design the first RISC chip. CISC, on the other hand,
is benefiting greatly from advances in hardware design. CISC chips are
beginning to approach RISC throughput. I guess the real decision maker
will be price and support.

>
>SPARC isn't a very good RISC.  Look at what MIPS is doing.  They make
>the best CPUs.  And that 1.3 cycles/instruction is optimal.  Do you
>think that's what we're getting on the NeXT?
>

I can't comment on this one, as I've never seen a really good comparison
of RISC to RISC processors, and I'm aware that RISC is generally
outperforming CISC. Also, I'd say 1.3 is not as "optimal" as you think.
Motorola was being optimal when they said the chip could do 20 Mips, and
NeXT was being minimal when they said 15. Most of what I've seen done
to benchmark the NeXT places it somewhere in between, showing that it
is doing 1.3 cycles per instr.

If you've done much work with pipeline theory, you'd see that
pipelining is very strong in programs that don't branch constantly. Assuming
that you never branch, a good pipeline should be able to perform at
1 cycle per instruction. If every statement is a branch, you fall way
off the mark, and would perform at your slowest. With normal code, you
only branch periodically, and with the 040 code, you can often predict
and follow the branch (i.e., a dbxx statement will usually branch). This
produces their performance of about 1.3.
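
As a rough sketch of the reasoning above (a model, not Motorola's figures):
effective cycles per instruction can be written as the ideal pipeline rate
plus the stall cycles added by branches that are not predicted correctly.
The function name and all four parameters are assumptions made up for
illustration.

```c
/* Effective cycles per instruction for a simple pipeline model.
 *   base_cpi    - CPI when the pipeline never stalls (1.0 in the ideal case)
 *   branch_frac - fraction of executed instructions that are branches
 *   miss_rate   - fraction of those branches that are predicted wrongly
 *   penalty     - cycles lost on each wrongly predicted branch
 * All inputs are illustrative assumptions, not measured 68040 values. */
double effective_cpi(double base_cpi, double branch_frac,
                     double miss_rate, double penalty)
{
    return base_cpi + branch_frac * miss_rate * penalty;
}
```

With, say, one instruction in five being a branch, half of those predicted
wrongly, and a 3-cycle penalty, effective_cpi(1.0, 0.2, 0.5, 3.0) comes out
at about 1.3; with no branches at all it is exactly 1.0. Those inputs were
picked to show the shape of the model, not taken from an 040.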

>
>All most companies would have to do is type 'make' to recompile their
>programs if NeXT switched to a RISC chip.
>

Not really. How much porting of software have you done? I've noticed
that when porting between architectures, yes, you only have to
recompile, but then you often have to go twiddle with code to make
it work with the new architecture. This is often true with C since it
allows you to work at such a low level.

There's also the problem of the operating system. Do you remember the
days of NeXTstep 0.8? We'd have to go through all those bugs again
as we waited for Mach on the new architecture to get bug free. We'd
also have to work with a compiler that hasn't been as well tested as 
680x0 based compilers.

Personally, I'd rather see NeXT support a multiprocessor environment
like Mach is supposed to be able to do. This would allow a user who
needs more performance to get it with a second board. Also, there's
no one stopping third party developers from making RISC based add-in
cards for the NeXT. These could be used by people needing lots of
number crunching or other specialty applications. I mean, that's
basically what the NeXTdimension does.

>-Mike


-- 
               -------------------------------------------------- 
                     Internet: alex@cosmos.ACS.CalPoly.EDU
                  or araftis@data.ACS.CalPoly.EDU for NeXTmail

melling@cs.psu.edu (Michael D Mellinger) (04/06/91)

In article <27fcdce4.3a0@petunia.CalPoly.EDU> araftis@polyslo.CalPoly.EDU (Alex Raftis) writes:

   I'm mainly trying to make the point that CISC is not dead. Most RISC chips
   are outperforming CISC, and I'd say that it's mainly because of
   their simpler hardware design. Yes, someone had an excellent idea when
   they decided to design the first RISC chip. CISC, on the other hand,
   is benefiting greatly from advances in hardware design. CISC chips are
   beginning to approach RISC throughput. I guess the real decision maker
   will be price and support.

I'm not a microprocessor expert, but I think that the current crop of
CISC chips use at least 4 times as many transistors per chip as the
latest RISC chips.  Motorola had so much extra room on the 88K that
they put a 64K cache on the chip.  Less than 10 years ago, 64K was a
lot of memory.  The RISC people are just getting started.

   I can't comment on this one, as I've never seen a really good comparison
   of RISC to RISC processors, and I'm aware that RISC is generally
   outperforming CISC. Also, I'd say 1.3 is not as "optimal" as you think.
   Motorola was being optimal when they said the chip could do 20 Mips, and
   NeXT was being minimal when they said 15. Most of what I've seen done
   to benchmark the NeXT places it somewhere in between, showing that it
   is doing 1.3 cycles per instr.

I think HP has benchmarks that rate the 68040 at around 12 SPECmarks.
I have the results that were posted to comp.arch, if you want them.

   If you've done much work with pipeline theory, you'd see that
   pipelining is very strong in programs that don't branch constantly. Assuming
   that you never branch, a good pipeline should be able to perform at
   1 cycle per instruction. If every statement is a branch, you fall way
   off the mark, and would perform at your slowest. With normal code, you
   only branch periodically, and with the 040 code, you can often predict
   and follow the branch (i.e., a dbxx statement will usually branch). This
   produces their performance of about 1.3.

And then there is delayed branching, which always executes the
instruction(s) after the branch.  I have had a little theory.  Anyway,
the NeXT generation of processors are going to execute 2 instructions
per cycle, thus giving you 50 mips at 25MHz or 100 mips at 50MHz.
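
The arithmetic behind those figures is just clock rate divided by cycles
per instruction. A minimal sketch, where mips_from_cpi is a hypothetical
helper name and the result is a peak rate that ignores stalls and cache
misses:

```c
/* Native MIPS from clock rate (in MHz) and average cycles per instruction.
 * Two instructions per cycle corresponds to a CPI of 0.5.
 * This is a peak figure; real code will stall and do worse. */
double mips_from_cpi(double clock_mhz, double cpi)
{
    return clock_mhz / cpi;
}
```

mips_from_cpi(25.0, 0.5) gives 50 and mips_from_cpi(50.0, 0.5) gives 100,
while the 1.3 cycles per instruction claimed for the 040 at 25MHz works
out to a little over 19.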

   Not really. How much porting of software have you done? I've noticed
   that when porting between architectures, yes, you only have to
   recompile, but then you often have to go twiddle with code to make
   it work with the new architecture. This is often true with C since it
   allows you to work at such a low level.

I think byte alignment is the biggest problem.  The RISC chips insist
that everything be aligned on a word (32-bit) boundary, whereas the
CISC chips don't make such a restriction.

struct foo {
	char a;
	int i;
} bus_error;

Try accessing bus_error.i on a SPARCstation and see what happens.
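
One caveat, as far as I understand C compilers: they normally pad a struct
like foo so that i lands on a properly aligned offset, so the member access
itself is usually safe. The trap tends to show up in code that casts an
arbitrary char pointer to int * and dereferences it. The sketch below is an
illustration under that assumption; offset_of_i and read_int_unaligned are
hypothetical helper names. It reports where the compiler placed i and reads
an int from a possibly odd address with memcpy, which avoids the misaligned
load.

```c
#include <stddef.h>
#include <string.h>

struct foo {
    char a;
    int  i;
};

/* Byte offset of i inside struct foo. The compiler normally inserts
 * padding after a, so this is a multiple of the alignment of int. */
size_t offset_of_i(void)
{
    return offsetof(struct foo, i);
}

/* Read an int stored at an arbitrary (possibly misaligned) address.
 * Dereferencing (int *)p directly is what can raise a bus error on
 * strict-alignment machines; copying the bytes does not. */
int read_int_unaligned(const unsigned char *p)
{
    int v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

Code ported from a 680x0 that builds pointers by hand would need the
memcpy form (or properly aligned buffers) wherever it currently casts.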

   There's also the problem of the operating system. Do you remember the
   days of NeXTstep 0.8? We'd have to go through all those bugs again
   as we waited for Mach on the new architecture to get bug free. We'd
   also have to work with a compiler that hasn't been as well tested as 
   680x0 based compilers.

GCC already runs on MIPS's architecture.  Can you say piece of cake?

   Personally, I'd rather see NeXT support a multiprocessor environment
   like Mach is supposed to be able to do. This would allow a user who
   needs more performance to get it with a second board. Also, there's
   no one stopping third party developers from making RISC based add-in
   cards for the NeXT. These could be used by people needing lots of
   number crunching or other specialty applications. I mean, that's
   basically what the NeXTdimension does.

I'd rather see 4 50MHz R4000's than 4 68040's.  Can you say Cray in a Cube?

   >-Mike

bennett@mp.cs.niu.edu (Scott Bennett) (04/06/91)

In article <27fcdce4.3a0@petunia.CalPoly.EDU> araftis@polyslo.CalPoly.EDU (Alex Raftis) writes:
>In article <?paG6lfg1@cs.psu.edu> melling@cs.psu.edu (Michael D Mellinger) writes:
>>
>>Look at the SPECmarks for the chips.  RISC chips are outperforming
>>CISC chips.
>>
>  [text deleted  --SJB]
>Motorola was being optimal when they said the chip could do 20 Mips, and
>NeXT was being minimal when they said 15. Most of what I've seen done
>to benchmark the NeXT places it somewhere in between, showing that it
>is doing 1.3 cycles per instr. 

     And, at that, those are 15 or 20 *CISC* MIPS, which get a lot
more work done than 15 or 20 RISC MIPS.  Further, you don't always
have to up the clock speed to up the processor speed, as the upgrade
from the 68030 to the 68040 exemplifies, so you don't have to end up
with a processor that you could fry an egg on or one that has to use
bulky, expensive SRAM because the DRAMs available either don't
run that fast or cost even more.
>
>  [more deleted  --SJB]
>
>There's also the problem of the operating system. Do you remember the
>days of NeXTstep 0.8? We'd have to go through all those bugs again
>as we waited for Mach on the new architecture to get bug free. We'd
>also have to work with a compiler that hasn't been as well tested as 
>680x0 based compilers.
>
>Personally, I'd rather see NeXT support a multiprocessor environment
>like Mach is supposed to be able to do. This would allow a user who
>needs more performance to get it with a second board. Also, there's
>no one stopping third party developers from making RISC based add-in
>cards for the NeXT. These could be used by people needing lots of
>number crunching or other specialty applications. I mean, that's
>basically what the NeXTdimension does.

     I agree.  Would anyone (besides me:-) buy a 68050 upgrade board
in a year or two if it were available?
>
>>-Mike
>
>
>-- 
>               -------------------------------------------------- 
>                     Internet: alex@cosmos.ACS.CalPoly.EDU
>                  or araftis@data.ACS.CalPoly.EDU for NeXTmail


                                  Scott Bennett, Comm. ASMELG, CFIAG
                                  Systems Programming
                                  Northern Illinois University
                                  DeKalb, Illinois 60115
**********************************************************************
* Internet:       bennett@cs.niu.edu                                 *
* BITNET:         A01SJB1@NIU                                        *
*--------------------------------------------------------------------*
*  "Well, I don't know, but I've been told, in the heat of the sun   *
*   a man died of cold..."  Oakland, 19 Feb. 1991, first time since  *
*  25 Sept. 1970!!!  Yippee!!!!  Wondering what's NeXT... :-)        *
**********************************************************************

rca@cs.brown.edu (Ronald C.F. Antony) (04/09/91)

In article <?paG6lfg1@cs.psu.edu> melling@cs.psu.edu (Michael D Mellinger) writes:
>All most companies would have to do is type 'make' to recompile their
>programs if NeXT switched to a RISC chip.

Yes, and then put out a new pricelist telling the user the typical
workstation-pricelist story: more performance = higher price. Hell, NO!
And then, all the upgrade fees to pay...

If RISC were software you would call it a hack, (or a C program with
tons of gotos for that matter). Of course a hack is faster (sometimes)
but it is not better. Use hacks where you absolutely have to i.e.
where you are hitting the limits of technology e.g. in supercomputers.
But leave it off our desks. Guess why NeXT went with Objective-C or
why other people go with Lisp. Not because it is the fastest, but
because it is clean. I don't care if a program is a few percent
slower as long as it is maintainable.

Say no to hacks, say no to RISC (at least in SPARC-like incarnations).

In addition, the 68040 has built-in support for multi-processing.
Which of the RISC chips does? (This is a real question). If there is
any, then that's the one NeXT is most likely to choose, since I would
guess multiprocessing is a more general solution to performance
problems than any RISC-hack can offer, and thus I would suspect NeXT
is putting a higher priority on that than on RISC.

Just my 0.02$

Ronald



------------------------------------------------------------------------------
"The reasonable man adapts himself to the world; the unreasonable one persists
in trying to adapt the world to himself. Therefore all progress depends on the
unreasonable man."   G.B. Shaw   |  rca@cs.brown.edu or antony@browncog.bitnet

sef@kithrup.COM (Sean Eric Fagan) (04/09/91)

In article <71367@brunix.UUCP> rca@cs.brown.edu (Ronald C.F. Antony) writes:
>If RISC were software you would call it a hack, (or a C program with
>tons of gotos for that matter). 

Say *what*?!

How in Gaea's name did you reach that conclusion?  Most RISC instruction
sets, the MIPS in particular, are extremely well thought out and researched,
just the *opposite* of what one would call a "hack."

>Guess why NeXT went with Objective-C or
>why other people go with Lisp. Not because it is the fastest, but
>because it is clean. 

Objective C has *never* struck me as "clean."  It always strikes me as an
obvious hack on top of C.  C++ is cleaner, although not perfect.

>Say no to hacks, say no to RISC (at least in SPARC-like incarnations).

Ah.  The SPARC (and most chips with register-windows) are, IMHO, ugly.  The
SPARC also doesn't have a multiply instruction (newer versions do, I know),
and this caused it to be slower than a 68030 for certain applications.

Not all RISCs are SPARCs.

>In addition, the 68040 has built-in support for multi-processing.
>Which of the RISC chips does? (This is a real question). 

88k, R3000, R6000, R4000 (when it comes out).  SPUR, I believe.  i860.
(Don't know about the i960, sorry.)  SPARC (although it might be the sparc-2
instruction set that had it).  29k, I think.  RIOS (IBM RS/6000).

>If there is
>any, then that's the one NeXT is most likely to choose, since I would
>guess multiprocessing is a more general solution to performance
>problems than any RISC-hack can offer, and thus I would suspect NeXT
>is putting a higher priority on that than on RISC.

88k, R3000 or R4000.  I doubt that NeXT will choose the SPARC, although they
might.  The 88k is from Motorola, as is the 68k line, so that might be an
incentive.  On the other hand, the MIPS chips are *fast*, and the R4000 is a
64-bit machine (both addresses and integer registers are 64-bits wide),
which provides for future requirements.

Go read comp.arch.

-- 
Sean Eric Fagan  | "I made the universe, but please don't blame me for it;
sef@kithrup.COM  |  I had a bellyache at the time."
-----------------+           -- The Turtle (Stephen King, _It_)
Any opinions expressed are my own, and generally unpopular with others.

araftis@polyslo.CalPoly.EDU (Alex Raftis) (04/10/91)

>I rather see 4 50MHz R4000's than 4 68040's.  Can you say Cray in a Cube?
>
>   >-Mike

I agree with most of what you're saying. I'd just rather see a 68040
controlling things, with three 50MHz R4000's doing all the grunt work, like
cranking out numbers. Upgrading things in this manner circumvents the
problem with rewriting the system software, but gives you most of the
benefits of RISC, and besides, your RISC chips under the above
configuration don't need to worry about things like drawing and windows,
something the 68040 handles very well (at least from what I've seen on the
NeXTs).

-- 
               -------------------------------------------------- 
                     Internet: alex@cosmos.ACS.CalPoly.EDU
                  or araftis@data.ACS.CalPoly.EDU for NeXTmail

melling@cs.psu.edu (Michael D Mellinger) (04/10/91)

In article <71367@brunix.UUCP> rca@cs.brown.edu (Ronald C.F. Antony) writes:

   If RISC were software you would call it a hack, (or a C program with
   tons of gotos for that matter). Of course a hack is faster (sometimes)
   but it is not better. Use hacks where you absolutely have to i.e.
   where you are hitting the limits of technology e.g. in supercomputers.
   But leave it off our desks. Guess why NeXT went with Objective-C or
   why other people go with Lisp. Not because it is the fastest, but
   because it is clean. I don't care if a program is a few percent
   slower as long as it is maintainable.

Why do you think NeXT is putting an i860 in their graphics board?
It's definitely RISC, and definitely high-performance.

   Say no to hacks, say no to RISC (at least in SPARC-like incarnations).

   In addition, the 68040 has built-in support for multi-processing.
   Which of the RISC chips does? (This is a real question). If there is
   any, then that's the one NeXT is most likely to choose, since I would
   guess multiprocessing is a more general solution to performance
   problems than any RISC-hack can offer, and thus I would suspect NeXT
   is putting a higher priority on that than on RISC.

The 88K has built-in support for multiprocessing too.  Anyway, 4 25MHz
68040s are going to about equal one 25MHz R6000 (or is it the R4000) at 50
mips.  That's 50 mips for the R6000 vs. 60 mips (at best, no flames this
time) for 4 68040s.  Which direction would be cheaper for NeXT to
take?

-Mike

melling@cs.psu.edu (Michael D Mellinger) (04/10/91)

In article <1991Apr09.083600.13051@kithrup.COM> sef@kithrup.COM (Sean Eric Fagan) writes:

   How in Gaea's name did you reach that conclusion?  Most RISC instruction
   sets, the MIPS in particular, are extremely well thought out and researched,
   just the *opposite* of what one would call a "hack."

It doesn't matter what the assembler code looks like anyway, all the
programmer has to do is type cc or make.

   >Guess why NeXT went with Objective-C or
   >why other people go with Lisp. Not because it is the fastest, but
   >because it is clean. 

   Objective C has *never* struck me as "clean."  It always strikes me as an
   obvious hack on top of C.  C++ is cleaner, although not perfect.

Objective C has far fewer extensions to the C language and is based
more on the Smalltalk paradigm.  I would say that it is definitely
cleaner.  C++ is thorny, and BS is still changing the language.  I
hope that Eiffel works out well on the NeXT.

   88k, R3000 or R4000.  I doubt that NeXT will choose the SPARC, although they
   might.  The 88k is from Motorola, as is the 68k line, so that might be an
   incentive.  On the other hand, the MIPS chips are *fast*, and the R4000 is a
   64-bit machine (both addresses and integer registers are 64-bits wide),
   which provides for future requirements.

   Go read comp.arch.

I hope someone at NeXT does.  Which chip might be moot though,
considering the politics involved.  I think that the 68040 will be
around for a couple of more years.  It still fits in nicely at the
low-end of the market.

-Mike

smb@data.com (Steven M. Boker) (04/10/91)

In article <8lbG1vdl1@cs.psu.edu> melling@cs.psu.edu (Michael D Mellinger) writes:
>
>The 88K has built in support for multiprocessing too. 

Did anyone notice the graphic on the front page of the New York Times
Business section that put NeXT in the 88000 camp with a little asterisk
that noted "expected"?  I wonder what their sources are.

Steve 

-- 
 #====#====#====#====#====#====#====#====#====#====#====#====#====#====#====#
 #  Steve Boker           #             "Two's bifurcation                  #
 #  smb@data.com          #             but three's chaotic"                #
 #====#====#====#====#====#====#====#====#====#====#====#====#====#====#====#

waltrip@capd.jhuapl.edu (04/11/91)

In article <8lbG1vdl1@cs.psu.edu>, melling@cs.psu.edu (Michael D Mellinger) 
writes:
> 
> In article <71367@brunix.UUCP> rca@cs.brown.edu (Ronald C.F. Antony) writes:
	[...material deleted...] 
> The 88K has built-in support for multiprocessing too.  Anyway, 4 25MHz
> 68040s are going to about equal one 25MHz R6000 (or is it the R4000) at 50
> mips.  That's 50 mips for the R6000 vs. 60 mips (at best, no flames this
> time) for 4 68040s.  Which direction would be cheaper for NeXT to
> take?
	The MIPS R4000 is the chip selected by the consortium of Microsoft,
	Compaq, DEC, etc., who are proposing to build a standard workstation
	of the future.  It's probably possible for NeXT to leverage off of this
	effort (perhaps join it) as the result could be a cheap-to-produce
	platform.  If Adobe could manage to join and guide them towards a
	Display PostScript display controller, NeXT might be a particular
	beneficiary.  Given a standard platform and a standard OS (OSF/1 or
	OSF/2), then everyone has a common base to apply their peculiar value-
	added vision.  With binary compatibility, the cost of applications
	should be reduced and it should be a more competitive environment with
	the result that we could expect a decrease in applications prices.

	I think this consortium will probably be effective in squeezing out
	the non-players and, for selfish reasons, I would prefer to see NeXT
	become a squeezer rather than a squeezee.  The NeXTstep application
	environment with multi-media support seems like it has a good shot at
	medium term viability and long term prosperity provided that the
	playing field is not radically altered against them.  This looks like one of
	those "you can't lick them so you'd better join them" situations.

c.f.waltrip

Internet:  <waltrip@capsrv.jhuapl.edu>

Opinions expressed are my own.

smb@data.com (Steven M. Boker) (04/11/91)

In article <1991Apr10.121142.1@capd.jhuapl.edu> waltrip@capd.jhuapl.edu writes:
>	The MIPS R4000 is the chip selected by the consortium of Microsoft,
>	Compaq, DEC, etc., who are proposing to build a standard workstation
>	of the future.  It's probably possible for NeXT to leverage off of this
>	effort (perhaps join it) as the result could be a cheap-to-produce
>	platform.  

This is the OS/2 Windows Microsoft workstation of the future.  Don't
think otherwise.  I'm only sorry for Compaq, who has seemed more and more
like DEC every day.  I am glad to see that Microsoft couldn't pull any
more of the players into their OS/2 shuck and jive.

Steve


-- 
 #====#====#====#====#====#====#====#====#====#====#====#====#====#====#====#
 #  Steve Boker           #             "Two's bifurcation                  #
 #  smb@data.com          #             but three's chaotic"                #
 #====#====#====#====#====#====#====#====#====#====#====#====#====#====#====#

doug@eris.berkeley.edu (Doug Merritt) (04/15/91)

In article <1991Apr10.155032.14786@data.com> smb@data.com (Steven M. Boker) writes:
>
>Did anyone notice the graphic on the front page of the New York Times
>Business section that put NeXT in the 88000 camp with a little asterisk
>that noted "expected"?  I wonder what their sources are.

The San Jose Mercury said rather offhandedly last week that NeXT will be
coming out with an 88000 system in the fall in the $8K to $10K range.
More precisely they said it would be based on a new member of the 88K
family that Motorola hasn't released yet; I've forgotten the model number.
This was in the context of relating a talk Steve Jobs gave where he
announced that 8000 systems were shipped in the first quarter, so they
sort of implied that Steve announced the 88K without directly saying he did
so. It may be just another rumor that they're treating as fact.

A little while ago (April 6), I saw this:

In article <27fcdce4.3a0@petunia.CalPoly.EDU> araftis@polyslo.CalPoly.EDU (Alex Raftis) writes:
>     And, at that, those are 15 or 20 *CISC* MIPS, which get a lot
>more work done than 15 or 20 RISC MIPS.  Further, you don't always
>have to up the clock speed to up the processor speed, as the upgrade
>from the 68030 to the 68040 exemplifies, so you don't have to end up
>with a processor that you could fry an egg on or one that has to use
>bulky, expensive SRAM because the DRAMs available either don't
>run that fast or cost even more.

Since no one else has rebutted this, I will. This is all just plain wrong.
The issue of "CISC MIPS" versus "RISC MIPS" is a particular instance of
the widely held truth that MIPS == "Meaningless Instrumentation to
Propel Sales", that is, that MIPS is always so vague that it is meaningless.
Nonetheless, in particular cases one can run other, more meaningful benchmarks
to find out something closer to the truth.
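
To make the point concrete, here is a minimal sketch with made-up numbers:
what the user actually waits for is run time, and that depends on how many
instructions the compiler had to emit as well as on the MIPS rating.
run_time_seconds is a hypothetical helper and the figures in the note after
it are invented for illustration, not benchmark results.

```c
/* Run time in seconds =
 *   instructions executed * cycles per instruction / clock rate in Hz.
 * Two machines with the same MIPS rating (clock / CPI) still differ in
 * run time if they need different instruction counts for the same job. */
double run_time_seconds(double instructions, double cpi, double clock_hz)
{
    return instructions * cpi / clock_hz;
}
```

Take two imaginary machines that both rate 20 MIPS (25MHz at 1.25 cycles
per instruction). If one needs 100 million instructions for a job and the
other needs 130 million, the first finishes in 5 seconds and the second in
6.5, and the MIPS number told you nothing about which was which.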

And the truth is that the 68040 is *not* faster than the Sparc/88K/MIPS
processors. Your supposedly more powerful "CISC MIPS" are a phantom.

As for "having to up the clock speed to up the processor speed", this
is also without substance, because the improvement seen in going from
the 68030 to the 68040 was basically one of making the most commonly
executed and RISC-like instructions in the 68K operate at RISC-like
speeds. You can't get this improvement twice. Oh, sure, everyone will be
going for superscalar, where two instructions get executed every clock cycle,
but the RISCs will do this at least as effectively as CISCs (doubtless better,
in fact).

In other words, the playing field is more level now. The 68040 has improved
enough that it will benefit from increased clock speed *almost* as much
as RISCs do. And it will benefit from superscalar almost as much as
RISCs will. It was never the case that the 68K actually had some sort
of advantage there; you've got it backwards.

Don't forget, CISCs came first, and RISCs were introduced for very good
reasons. Find out what the reasons were/are before taking shots in the dark.

And as for DRAM not being able to keep up with RISCs, that's even further
off the mark. DRAM speeds are as much an issue for CISC *performance* as
for RISCs; it is hardly an advantage to claim that because CISCs waste
cycles, that they therefore aren't limited by DRAM speeds. They certainly
are -- both CISCs and RISCs can either do useful work while waiting on
memory, or they cannot and must wait.

I have my own doubts about RISC as the ultimate in architecture, but RISCs
are clearly the fastest way to do things in the *near-term*. The other
issues I'm interested in will probably not be competitive for another
5-10 years; we'll likely have to hit a performance plateau with superscalar
RISCs before it makes sense to start building in other directions.
	Doug
-- 
--
Doug Merritt		doug@eris.berkeley.edu (ucbvax!eris!doug)
		or	uunet.uu.net!crossck!dougm

greg@sif.claremont.edu (Tigger) (04/18/91)

In article <1991Apr15.165540.14270@agate.berkeley.edu>, doug@eris.berkeley.edu (Doug Merritt) writes:
>
> [fairly well reasoned argument that RISC is better than CISC deleted]

There is one monkey wrench that no one seems to think about.  Gallium
Arsenide.  CISC chips used to make a great deal of sense when memory was
so much slower than CPUs.  Instead of wasting clock cycles sitting around
idle waiting for memory, the CISC chip would perform a complex instruction
that took multiple clock cycles.  Now that memory has done a fairly good
job of catching up, RISC makes a great deal of sense.  That's where GaAs
comes in.  It is much faster than silicon, and much more expensive.  The
price is going to stay high for some time.  Unless you're paying supercomputer
prices, that means you can't build a whole GaAs computer.  If anything, it's
just going to be the CPU.  Once again, your CPU will be so much faster than
your memory that CISC is going to be the more productive design.

|   Greg Orman                                 greg@pomona.claremont.edu   |
|                 If you believe then the time has come                    |
|                 For serious fun                                          |
|                                         - Fieger/Averre                  |

melling@cs.psu.edu (Michael D Mellinger) (04/18/91)

In article <1991Apr17.192605.1@sif.claremont.edu> greg@sif.claremont.edu (Tigger) writes:


   There is one monkey wrench that no one seems to think about.  Gallium
   Arsenide.  CISC chips used to make a great deal of sense when memory was
   so much slower than CPUs.  Instead of wasting clock cycles sitting around
   idle waiting for memory, the CISC chip would perform a complex instruction
   that took multiple clock cycles.  Now that memory has done a fairly good
   job of catching up, RISC makes a great deal of sense.  That's where GaAs
   comes in.  It is much faster than silicon, and much more expensive.  The
   price is going to stay high for some time.  Unless you're paying supercomputer
   prices, that means you can't build a whole GaAs computer.  If anything, it's
   just going to be the CPU.  Once again, your CPU will be so much faster than
   your memory that CISC is going to be the more productive design.


Is there some reason that memory can't be built out of GaAs?  Also,
RISC chips will be the first CPUs made out of GaAs because of the low
# of transistors that can currently be put on a chip made out of
GaAs, and RISC chips require fewer transistors than the monolithic 68040.

Anyway, NeXT needs a near term solution to the price/performance rat
race, and RISC is currently blowing the doors off of CISC.

-Mike
  

greg@sif.claremont.edu (Tigger) (04/19/91)

In article <racGk93s1@cs.psu.edu>, melling@cs.psu.edu (Michael D Mellinger) writes:
> 
> In article <1991Apr17.192605.1@sif.claremont.edu> greg@sif.claremont.edu (Tigger) writes:
> 
>    Unless you're paying supercomputer
>    prices, that means you can't build a whole GaAs computer.
> 
> Is there some reason that memory can't be built out of GaAs?

Cost.  I believe Seymour Cray has been working on the first commercially
available gallium arsenide computer.  As I'm sure you can imagine, the price
is many orders of magnitude higher than you or I could afford.

> Also,
> RISC chips will be the first CPUs made out of GaAs because of the low
> # of transistors that can currently be put on a chip made out of
> GaAs, and RISC chips require fewer transistors than the monolithic 68040.

I could be wrong, but don't some of the current RISC chips (such as IBM's
RIOS) begin to approach the complexity of the 68040?  Besides, you can
always split the chip up, the way Motorola has done with their 88000 chip
set.  There is something of a performance penalty, but it really isn't
necessary to have your floating point and memory management units on the
same chip, is it?

> Anyway, NeXT needs a near term solution to the price/performance rat
> race, and RISC is currently blowing the doors off of CISC.

I would tend to disagree.  I have a benchmark that I run periodically
when I have the chance to get my hands on a new machine with a C compiler.
The VAX 9000 and the NeXT '040 are two of the three fastest machines that
I've had a chance to test.  Both are CISC designs.  Both outperformed systems
based on the two most widespread RISC designs, MIPS and SPARC.  The only
RISC chip that was in the same league was RIOS, and then only when the
compiler's optimization was turned on to ensure superscalar operation.

I'm not trying to put RISC down.  Currently, the things that really make
RISC what it is are the best way to go.  Hell, the fact that some of those concepts
were used in the 68040 is probably the single most important factor in it
being such a good chip.  While it certainly isn't a RISC chip, it isn't
really a CISC design either.  It's a hybrid.

I guess you could look at it kinda like modern economics.  Neither pure
capitalism nor pure socialism exist, because a mix of some sort turns out
to be better.  Take the best of both worlds and make something better.  The
68040, I think, is a step in that direction.  It will be interesting ten or
fifteen years from now to see whether that's the way things have gone.

|   Greg Orman                                        Let's get lost       |
|   greg@pomona.claremont.edu                            - Fieger/Averre   |

benseb@nic.cerf.net (Booker Bense) (04/19/91)

In article <1991Apr18.180538.1@sif.claremont.edu> greg@sif.claremont.edu (Tigger) writes:
>In article <racGk93s1@cs.psu.edu>, melling@cs.psu.edu (Michael D Mellinger) writes:
>> 
>> In article <1991Apr17.192605.1@sif.claremont.edu> greg@sif.claremont.edu (Tigger) writes:
[ Stuff about Super Computers Deleted ]
>
>> Anyway, NeXT needs a near term solution to the price/performance rat
>> race, and RISC is currently blowing the doors off of CISC.
>
>I would tend to disagree.  I have a benchmark that I run periodically
>when I have the chance to get my hands on a new machine with a C compiler.
>The VAX 9000 and the NeXT '040 are two of the three fastest machines that
>I've had a chance to test.  Both are CISC designs.  Both outperformed systems
>based on the two most widespread RISC designs, MIPS and SPARC.  The only
>RISC chip that was in the same league was RIOS, and then only when the
>compiler's optimization was turned on to ensure superscalar
>operation.

- Getting performance out of the machine depends on a lot more than
MIPS or even MFLOPS. The whole system has to be balanced, i.e. if you have a
fast chip, you have to have a fast disk, fast memory, etc. The
only real test of a machine is to take code that you use and run it. 
I work with some of these ``RISC'' monsters ( DECstation 5000, Sparc )
and they run benchmarks great, but getting any real complex code
running is painful. Most benchmarks do all their work in loops small 
enough to reside in the instruction buffer. This is what makes those 
numbers so spiffy and performance degrade so fast when you start
having to load/unload instruction buffers. I haven't played with my
NeXT enough yet to get a feel for how well balanced it is. One thing I
am truly impressed with is the speed of its graphics. Complex
real time animation is possible, this is not true of the SparcStation
1 on my desk. Granted the Sparcstation has color and is suffering 
under the burden of attempting to animate in X11 %-)!. ( Before the flames 
start I like X11, but the support for animation is minimal ).
>
[ Rational argument about merging of RISC CISC designs deleted ]

- So in conclusion what you really need to measure is 

price/(performance doing what I do )

- BTW, the software that comes with the NeXT is worth the price of
the computer alone. When you consider the price / performance ratio
you should consider platforms with the same software configuration. 
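
As a rough sketch of that ratio in Python: the machine names, prices, and
run times below are made up purely for illustration, so substitute your own
measurements of your own code. Lower is better here, since the figure is
price multiplied by the time your job takes.

```python
# Hypothetical figures for illustration only; none of them come from a real
# price list or benchmark run.
def price_per_performance(price_dollars, runtime_seconds):
    """Price divided by performance, where performance = 1 / runtime of *your* job.

    Dividing by (1 / runtime) is the same as multiplying by runtime.
    """
    if runtime_seconds <= 0:
        raise ValueError("runtime must be positive")
    return price_dollars * runtime_seconds


machines = {
    "machine A": (5000.0, 120.0),   # (price, seconds to run my job), made up
    "machine B": (15000.0, 60.0),   # made up
}

# Rank machines from best (lowest) to worst on this measure.
ranked = sorted(machines, key=lambda name: price_per_performance(*machines[name]))
for name in ranked:
    price, runtime = machines[name]
    print(f"{name}: {price_per_performance(price, runtime):.0f} dollar-seconds per job")
```

With these invented numbers the slower, cheaper machine comes out ahead,
which is the point: the answer depends entirely on what you run.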

- Booker C. Bense                    
prefered: benseb@grumpy.sdsc.edu	"I think it's GOOD that everyone 
NeXT Mail: benseb@next.sdsc.edu 	   becomes food " - Hobbes

bostrov@prism.cs.orst.edu (Vareck Bostrom) (04/20/91)

In <339@nic.cerf.net> benseb@nic.cerf.net (Booker Bense) writes:

>In article <1991Apr18.180538.1@sif.claremont.edu> greg@sif.claremont.edu (Tigger) writes:
>>In article <racGk93s1@cs.psu.edu>, melling@cs.psu.edu (Michael D Mellinger) writes:
>>> 
>>> In article <1991Apr17.192605.1@sif.claremont.edu> greg@sif.claremont.edu (Tigger) writes:
[ Stuff about Super Computers Deleted ]
>>I would tend to disagree.  I have a benchmark that I run periodically
>>when I have the chance to get my hands on a new machine with a C compiler.
[CISC performance stuff deleted] 

>- Getting performance out of the machine depends on a lot more than
>MIPS or even MFLOPS. The whole system has to be balanced, i.e. if you have a
>fast chip, you have to have a fast disk, fast memory, etc. The
>only real test of a machine is to take code that you use and run it. 
[more stuff about balanced systems deleted]
>[ Rational argument about merging of RISC CISC designs deleted ]

>- So in conclusion what you really need to measure is 

>price/(performance doing what I do )

I will agree with the stuff about the SPARC's poor graphics performance.
Have you ever noticed that Sun-3's seem to do graphics and animation
better than Sun-4's? 

The NeXT seems fast enough for me now (the 030 wasn't) until I try to 
do something like ray-tracing, then when I compare it to SPARC-2's and
DEC 5000's, the 040 NeXT seems greatly underpowered. I don't worry about
it too much, as I really don't have a NEED to have the trace done at any
certain time. 

Too bad somebody doesn't come out with a iWARP coprocessing board for the
cube, with compiler and everything. Put a supercomputer on your desk...

As much of an improvement as the 040 is over the 030, the windowing system
on the NeXT still seems much slower than Xwindows on a SPARC 1 or Sun-3,
and my cube seems to swap a hell of a lot (I only have 8mb, but
the Suns I have worked with had only 8mb, and they did fine). I also notice
that building large programs takes longer on the NeXT than on SPARC and MIPS
based machines, though the resulting binaries appear to be about equal in 
speed (25 MHz SPARC vs 25 MHz 68040), as long as it wasn't a heavy FP 
type program. 

I certainly DON'T think the 68040 gets anywhere near 2.0 MFlops, let alone 3.3.
Perhaps 1.4 on a good day. A floating point accelerator is needed on the
NeXT for scientific applications. ??

- Vareck Bostrom
bostrov@prism.cs.orst.edu
- or -
bostrov@gnu.ai.mit.edu


>- Booker C. Bense                    
>prefered: benseb@grumpy.sdsc.edu	"I think it's GOOD that everyone 
>NeXT Mail: benseb@next.sdsc.edu 	   becomes food " - Hobbes

fjs@nntp-server.caltech.edu (Fernando J. Selman) (04/20/91)

Just my 2 cents contribution to this discussion.

When comparing the Next to Sparc1+ and 2 I found that for
applications that are not cache intensive the Next is equivalent to
the Sparc 1+ (and by extension 1/2 the speed of the Sparc 2). When
doing a cache intensive application (25000 body simulation) the
relative efficiency of the Sparc 1+ doubles with respect to the
Next. I conclude from this that at least this RISC chip is not
more efficient than the 68040, the difference being attributed
to the cache. The NextWorld benchmarks seem to have been done in
the regime where the cache is not important; this seems to be the
reason why the Sparc 2 looks there only twice as fast as the Next.
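
One way to picture why the cache regime matters is the usual
average-access-time model: cost per memory access = hit time + miss rate *
miss penalty. The sketch below is only that model in Python; the hit time,
miss rates, and miss penalty are invented placeholders, not measurements of
any Sparc or 68040 system.

```python
def average_access_cycles(hit_cycles, miss_rate, miss_penalty_cycles):
    """Average cycles per memory access under a simple one-level cache model."""
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be between 0 and 1")
    return hit_cycles + miss_rate * miss_penalty_cycles


# Placeholder numbers: the same processor running a job that fits in cache
# versus one whose working set does not.
fits_in_cache = average_access_cycles(hit_cycles=1, miss_rate=0.02, miss_penalty_cycles=20)
spills_cache = average_access_cycles(hit_cycles=1, miss_rate=0.30, miss_penalty_cycles=20)

print(f"fits in cache:    {fits_in_cache:.1f} cycles per access")
print(f"spills the cache: {spills_cache:.1f} cycles per access")
print(f"slowdown on memory accesses: {spills_cache / fits_in_cache:.1f}x")
```

A benchmark that stays inside the cache never sees the second case, so two
machines with very different caches can look much closer than they are.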

				Fernando

madler@nntp-server.caltech.edu (Mark Adler) (04/20/91)

Vareck Bostrom writes:
>> A floating point accelerator is needed on the
>> NeXT for scientific applications.

Sad, but true.  I heard that the 96002 DSP didn't make it on the
68040 NeXT's due to lack of availability.  If the next NeXT's have
96002's, then we'll have some serious computing power ...

Mark Adler
madler@pooh.caltech.edu

bennett@mp.cs.niu.edu (Scott Bennett) (04/22/91)

In article <1991Apr15.165540.14270@agate.berkeley.edu> doug@eris.berkeley.edu (Doug Merritt) writes:
>
>A little while ago (April 6), I saw this:
>
>In article <27fcdce4.3a0@petunia.CalPoly.EDU> araftis@polyslo.CalPoly.EDU (Alex Raftis) writes:

     Actually, no, he didn't.  I did.

>>     And, at that, those are 15 or 20 *CISC* MIPS, which get a lot
>>more work done than 15 or 20 RISC MIPS.  Further, you don't always
>>have to up the clock speed to up the processor speed, as the upgrade
>>from the 68030 to the 68040 exemplifies, so you don't have to end up
>>with a processor that you could fry an egg on or one that has to use
>>bulky, expensive SRAM for because the DRAMs available either don't
>>run that fast or cost even more.
>
>Since no one else has rebutted this, I will. This is all just plain wrong.
>The issue of "CISC MIPS" versus "RISC MIPS" is a particular instance of
>the widely held truth that MIPS == "Meaningless Instrumentation to
>Propel Sales", that is, that MIPS is always so vague that it is meaningless.
>Nonetheless, in particular cases one can run other, more meaningful benchmarks
>to find out something closer to the truth.

     I agree entirely.  The people who take it a step further by saying
that they convert "native MIPS" to "VAX 11/780 MIPS" are just making it
worse.
>
>And the truth is that the 68040 is *not* faster than the Sparc/88K/MIPS
>processors. Your supposedly more powerful "CISC MIPS" are a phantom.

     I don't recall saying anything w.r.t. the relative computing speeds
of the 68040 and the 88000 series.
>
>As for "having to up the clock speed to up the processor speed", this
>is also without substance, because the improvement seen in going from
>the 68030 to the 68040 was basically one of making the most commonly
>executed and RISC-like instructions in the 68K operate at RISC-like

     So they sped up those particular instructions.  This merely details
exactly what I said.

>speeds. You can't get this improvement twice. Oh, sure, everyone will be

     That's probably true, but doesn't necessarily rule out other
kinds of efficiency gains in the 68050.

>going for superscalar, where two instructions get executed every clock cycle,
>but the RISCs will do this at least as effectively as CISCs (doubtless better,
>in fact).

     Case in point.  By executing multiple complex instructions per cycle,
the CISC would appear to benefit at least as much as the RISC, not as you
state.
>
>In other words, the playing field is more level now. The 68040 has improved
>enough that it will benefit from increased clock speed *almost* as much
>as RISCs do. And it will benefit from superscalar almost as much as
>RISCs will. It was never the case that the 68K actually had some sort
>of advantage there; you've got it backwards.
>
>Don't forget, CISCs came first, and RISCs were introduced for very good
>reasons. Find out what the reasons were/are before you take shots in the dark.

     I am aware of the reasons.  I am also aware that these issues are
still controversial and unresolved.
>
>And as for DRAM not being able to keep up with RISCs, that's even further
>off the mark. DRAM speeds are as much an issue for CISC *performance* as
>for RISCs; it is hardly an advantage to claim that because CISCs waste
>cycles, that they therefore aren't limited by DRAM speeds. They certainly

     This is just silly.  First, to say that a CISC chip that can do
useful work without having to fetch and decode an instruction every
cycle somehow implies that the CISC chip is wasting cycles compared
to what a RISC chip would be doing is ludicrous.  Quite clearly, the
penalty in *this* case falls to the RISC chip, which *must* fetch an
instruction every cycle, thereby placing an extra workload on memory.
     Secondly, it is worth noting here that the extreme case of CISC,
the vector processing computer, spends a *much* larger portion of its
storage cycles doing work than either a RISC or a more typical sort
of CISC.  In a vector processor, one instruction fetch and decoding
is amortized over the length of the vector upon which the operation
is carried out.  Amdahl's vector processors had maximum vector lengths
of 512 and 1024 items, depending on the hardware model.  In other words,
it was possible for 1024 floating point adds, for example, to be done
eight at a time every 14 ns for the fetch/decode expense of *one*
instruction.  Neither are Crays noted for their slowness. ;-)
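
The amortization arithmetic in that example can be sketched in Python. The
vector length (1024), the eight results per step, and the 14 ns step time are
the figures quoted above; pipeline startup latency and memory bandwidth are
ignored, so treat the result as a lower bound under those assumptions rather
than a timing for any real machine.

```python
import math


def vector_op_time_ns(vector_length, lanes=8, step_ns=14.0):
    """Time to stream one vector operation, ignoring startup latency."""
    steps = math.ceil(vector_length / lanes)   # each step retires `lanes` results
    return steps * step_ns


def fetches_per_element(vector_length, instruction_fetches=1):
    """Instruction fetch/decodes charged to each element of the vector."""
    return instruction_fetches / vector_length


n = 1024
print(f"{n} adds take about {vector_op_time_ns(n):.0f} ns")
print(f"fetch/decodes per element: {fetches_per_element(n):.6f}")
```

For comparison, a scalar loop that issues one add instruction per element
has a fetch/decode cost of at least 1.0 per element before counting any loop
overhead.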

>are -- both CISCs and RISCs can either do useful work while waiting on
>memory, or they cannot and must wait.
>
>I have my own doubts about RISC as the ultimate in architecture, but they
>are clearly the fastest way to do things in the *near-term*. The other
>issues I'm interested in will probably not be competitive for another
>5-10 years; we'll likely have to hit a performance plateau with superscalar
>RISCs before it makes sense to start building in other directions.

     You might take a look at IBM's definition of RISC.  The RS6000's
are pretty fast and have an instruction set that doesn't look like a
traditional RISC instruction set.

>	Doug
>-- 
>--
>Doug Merritt		doug@eris.berkeley.edu (ucbvax!eris!doug)
>		or	uunet.uu.net!crossck!dougm


                                  Scott Bennett, Comm. ASMELG, CFIAG
                                  Systems Programming
                                  Northern Illinois University
                                  DeKalb, Illinois 60115
**********************************************************************
* Internet:       bennett@cs.niu.edu                                 *
* BITNET:         A01SJB1@NIU                                        *
*--------------------------------------------------------------------*
*  "Spent a little time on the mountain, Spent a little time on the  *
*   Hill, The things that went down you don't understand, But I      *
*   think in time you will."  Oakland, 19 Feb. 1991, first time      *
*  since 25 Sept. 1970!!!  Yippee!!!!  Wondering what's NeXT... :-)  *
**********************************************************************

preston@LL.MIT.EDU (Steven Preston) (04/22/91)

> I could be wrong, but don't some of the current RISC chips (such as
> IBM's RIOS) begin to approach the complexity of the 68040?

Yes, it is clear that RISC has become a misnomer; it (RISC) now means that
the average cycles per instruction is about 1 or less.

I read somewhere that Motorola claims that the '040 takes an average
of 1.3 clock cycles per instruction.  This probably qualifies it as a
RISC machine, by the above criterion.
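
For a sense of scale, cycles per instruction converts to a native
instruction rate as clock rate divided by CPI. A small Python sketch,
assuming the 25 MHz clock mentioned elsewhere in this thread and the 1.3 CPI
figure above (itself only a claim I read somewhere):

```python
def native_mips(clock_mhz, cycles_per_instruction):
    """Millions of native instructions per second for a given clock and CPI."""
    if cycles_per_instruction <= 0:
        raise ValueError("CPI must be positive")
    return clock_mhz / cycles_per_instruction


print(f"25 MHz at 1.3 CPI: {native_mips(25.0, 1.3):.1f} native MIPS")
print(f"25 MHz at 1.0 CPI: {native_mips(25.0, 1.0):.1f} native MIPS")
```

This says nothing about how much work each instruction does, so it cannot
be used to compare different instruction sets.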

--
Steve Preston  (preston@ll.mit.edu)

greg@sif.claremont.edu (Tigger) (04/23/91)

In article <9104220802.AA09560@LL.MIT.EDU>, preston@LL.MIT.EDU (Steven Preston) writes:
> 
> Yes, it is clear that RISC has become a misnomer; it (RISC) now means that
> the average cycles per instruction is about 1 or less.

RISC has been a misnomer since before it came out of the labs.  I don't know
about the early IBM prototypes, but I'm pretty sure that every single 
commercially produced RISC design has had a larger instruction set than the
6502 in my old Apple II.  Does that make my old Apple II an early RISC
workstation?  I somehow don't think so...

|   Greg Orman                                        Let's get lost       |
|   greg@pomona.claremont.edu                            - Fieger/Averre   |

ddj@zardoz.club.cc.cmu.edu (Doug DeJulio) (04/23/91)

In article <1991Apr22.165148.1@sif.claremont.edu> greg@sif.claremont.edu (Tigger) writes:
>RISC has been a misnomer since before it came out of the labs.  I don't know
>about the early IBM prototypes, but I'm pretty sure that every single 
>commercially produced RISC design has had a larger instruction set than the
>6502 in my old Apple II.  Does that make my old Apple II an early RISC
>workstation?  I somehow don't think so...

Naw.  It's Reduced instruction set, not Tiny instruction set (TISC?).
Way back then, small instruction sets were normal, so there wasn't
much to reduce.  Only after the popularity of chips with huge clunky
instruction sets (cf. vax "polysolve" instruction) and bizarre
architectures (cf. 8086) could "RISC" become an appropriate term...
-- 
DdJ

Disclaimer: I *like* the VAX, *and* the 8086.  I'd like 'em more if
            they weren't weird-endian.

fjs@nntp-server.caltech.edu (Fernando J. Selman) (04/23/91)

I just came across another acronym for some RISC processors:

CRISP: Complex Reduced Instruction Set Processor!

				- Fernando

scott@texnext.gac.edu (Scott Hess) (04/23/91)

In article <1991Apr18.180538.1@sif.claremont.edu> greg@sif.claremont.edu (Tigger) writes:
   > Also,
   > RISC chips will be the first CPUs made out of GaAs because of the low
   > number of transistors that can currently be put on a chip made out of
   > GaAs, and RISC chips require fewer transistors than the monolithic 68040.

   I could be wrong, but don't some of the current RISC chips (such as IBM's
   RIOS) begin to approach the complexity of the 68040?  Besides, you can
   always split the chip up, the way Motorola has done with their 88000 chip
   set.  There is something of a performance penalty, but it really isn't
   necessary to have your floating point and memory management units on the
   same chip, is it?

If you look, you'll see that all of the chips that have been created in the
past couple years seem to be RISC chips (this was brought to my attention
by a prof in one of my classes just this spring).  It seems to be true!
What it makes one wonder is whether the naming of chips like the RIOS
RISC is done because they are really RISC, or because RISC is what
people want.  Considering all I've seen on the RIOS (not much,
admittedly), I'd say it's more of a CISC that uses some of the things
that RISC people have found to be helpful to make fast chips.

   > Anyway, NeXT needs a near term solution to the price/performance rat
   > race, and RISC is currently blowing the doors off of CISC.

   I would tend to disagree.  I have a benchmark that I run periodically
   when I have the chance to get my hands on a new machine with a C compiler.
   The VAX 9000 and the NeXT '040 are two of the three fastest machines that
   I've had a chance to test.  Both are CISC designs.  Both outperformed systems
   based on the two most widespread RISC designs, MIPS and SPARC.  The only
   RISC chip that was in the same league was RIOS, and then only when the
   compiler's optimization was turned on to ensure superscalar operation.

???  The local Personal Iris with an R2000 is apparently the equal of
our NextStations.  Not that that's bad - after all the NextStation costs
about 1/5 the price.  Then again, the Iris is a couple years older, too.
The ones running with R4000 should be about 2-4 times the speed of an '040.

I guess I really don't care what happens.  There definitely is room for
both CISC and RISC.  I don't see any processors in the near future
breaking the strangle-hold on the market that 80x86 and 680x0 have -
those have gotta account for something like 98% of all CPUs shipped, if
not more.  But I guess it's not hard to see that each version of those
chips is requiring more and more work to get them faster.  Which isn't
bad - it's not like Motorola and Intel can't afford the work, right?

But either way, I think falling into the problems Apple has with changing
the Mac, and IBM/Microsoft have changing the PC is dangerous.  If we
could supply the interoperability that the PC supplies (a PC is a PC is
a PC) in a more general manner (across platforms), I think the wins could
be pretty decent.  NeXT is going to have to be darn flexible if they
want to win in more than the "Professional Workstation" market, and by
moving to a multiple-architecture approach, they could do it.

Later,
--
scott hess                      scott@gac.edu
Independent NeXT Developer	GAC Undergrad
<I still speak for nobody>
"Simply press Control-right-Shift while click-dragging the mouse . . ."
"I smoke the nose Lucifer . . . Banana, banana."

sef@kithrup.COM (Sean Eric Fagan) (04/25/91)

In article <1991Apr22.044553.16805@mp.cs.niu.edu> bennett@mp.cs.niu.edu (Scott Bennett) writes:
>     Case in point.  By executing multiple complex instructions per cycle,
>the CISC would appear to benefit at least as much as the RISC, not as you
>state.

Nope, not really. That's the problem:  with a "CISC" instruction set, it's
really very difficult to go superscalar, at least compared to some "RISC"
instruction sets.  Why?  Because not enough registers, too many memory
references in a single instruction, or other small, niggling details.

-- 
Sean Eric Fagan  | "I made the universe, but please don't blame me for it;
sef@kithrup.COM  |  I had a bellyache at the time."
-----------------+           -- The Turtle (Stephen King, _It_)
Any opinions expressed are my own, and generally unpopular with others.

wayner@CS.Cornell.EDU (Peter Wayner) (04/25/91)

sef@kithrup.COM (Sean Eric Fagan) writes:

>Nope, not really. That's the problem:  with a "CISC" instruction set, it's
>really very difficult to go superscalar, at least compared to some "RISC"
>instruction sets.   Why?  Because not enough registers, too many memory
>references in a single instruction, or other small, niggling details.

In another sense, going "superscalar" is much easier with CISC
machines.  I think the Intel 486 does a PUSH instruction in one cycle.
In RISC land, this is a decrement and a store. The CISC designer just
needs to use enough silicon to pipeline the important instructions.
There is no need for complex logic to handle all the possible cases of
two instructions coming down the pipe. The RISC designer needs to
worry about generality.

This is just an academic nit, though, because I generally agree with 
you.



>-- 
>Sean Eric Fagan  | "I made the universe, but please don't blame me for it;
>sef@kithrup.COM  |  I had a bellyache at the time."
>-----------------+           -- The Turtle (Stephen King, _It_)
>Any opinions expressed are my own, and generally unpopular with others.
-- 
Peter Wayner   Department of Computer Science Cornell Univ. Ithaca, NY 14850
EMail:wayner@cs.cornell.edu    Office: 607-255-9202 or 255-1008
Home: 116 Oak Ave, Ithaca, NY 14850  Phone: 607-277-6678

shwake@raysnec.UUCP (Ray Shwake) (04/25/91)

scott@texnext.gac.edu (Scott Hess) writes:

>???  The local Personal Iris with an R2000 is apparently the equal of
>our NextStations.  Not that that's bad - after all the NextStation costs
>about 1/5 the price.  Then again, the Iris is a couple years older, too.
>The ones running with R4000 should be about 2-4 times the speed of an '040.

	There's something disingenuous about comparing that which *will* be
	available to that which *is* available. The '40 is already available
	in shipping products - and has been for six months - while the 4000,
	according to MIPS, won't be available before late '91.

>I guess I really don't care what happens.  There definitely is room for
>both CISC and RISC.  I don't see any processors in the near future
>breaking the strangle-hold on the market that 80x86 and 680x0 have -

	CISC and RISC are more alike today than was the case years ago,
	as the designers of each learn the best techniques of their counter-
	parts. 

-----------  
uunet!media!ka3ovk!raysnec!shwake				shwake@rsxtech

bennett@mp.cs.niu.edu (Scott Bennett) (04/25/91)

In article <1991Apr24.170804.25670@kithrup.COM> sef@kithrup.COM (Sean Eric Fagan) writes:
>In article <1991Apr22.044553.16805@mp.cs.niu.edu> bennett@mp.cs.niu.edu (Scott Bennett) writes:
>>     Case in point.  By executing multiple complex instructions per cycle,
>>the CISC would appear to benefit at least as much as the RISC, not as you
>>state.
>
>Nope, not really. That's the problem:  with a "CISC" instruction set, it's
>really very difficult to go superscalar, at least compared to some "RISC"
>instruction sets.  Why?  Because not enough registers, too many memory
>references in a single instruction, or other small, niggling details.

     If you disallow pipelining in the CISC machine, then it is most
likely to be impossible to have so-called superscalar operation.  However,
most CISC machines now are not only pipelined, they are *multiply* pipe-
lined.  Since a superscalar RISC can only be that way by pipelining,
let's at least compare only pipelined architectures.  FWIW, the MC68040
supposedly averages about 1.3 clock cycles per instruction because of
the pipelining used.  That obviously doesn't reach "superscalar", but
it isn't terribly far off, either.
     In any case, what really matters is how much work gets done per
clock cycle, not how many instructions get done per cycle.  One example
is the case of moving blocks of data from one memory location to another.
A typical RISC must 1) initialize a loop (one or more instruction fetch/
decodes) and in the body of the loop must 2) load a word into a register
(one fetch/decode), 3) store from the register into the new location (one
fetch/decode), 4) increment both addresses (probably two fetches/decodes),
5) loop back to repeat until finished (at least one fetch/decode).  Some
CISCs have something like a "repeat" instruction that will execute 
another instruction (e.g. a storage-to-storage move) a given number of
times while incrementing addresses in that instruction, so the whole
operation may require as few as two fetches/decodes.  Other CISCs have
single instructions capable of doing block moves, so they only need one
fetch decode.  That means more of the cycles required get spent doing the
actual work that needs to be done than would be the case with a RISC.  A
CISC operating in such a way would be at the *opposite* end of the spectrum
from "superscalar", but would get its work done more quickly anyway.
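
The fetch/decode counts described above can be tallied in a few lines of
Python. The per-iteration breakdown (one load, one store, two address
increments, one branch) and the counts of two and one for the CISC cases are
taken straight from the description; the function names are just for
illustration. Note that this counts instruction fetch/decodes only, not
elapsed cycles.

```python
def risc_loop_fetches(words, setup=1, per_iteration=5):
    """Fetch/decodes for the load/store loop: setup + 5 per word moved.

    per_iteration = load (1) + store (1) + two address increments (2) + branch (1)
    """
    return setup + per_iteration * words


def repeat_prefix_fetches(words):
    """A 'repeat' instruction plus the storage-to-storage move it repeats."""
    return 2


def block_move_fetches(words):
    """A single block-move instruction."""
    return 1


for words in (1, 16, 256):
    print(f"{words:4d} words: RISC loop {risc_loop_fetches(words):5d}, "
          f"repeat+move {repeat_prefix_fetches(words)}, "
          f"block move {block_move_fetches(words)}")
```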
>
>-- 
>Sean Eric Fagan  | "I made the universe, but please don't blame me for it;
>sef@kithrup.COM  |  I had a bellyache at the time."
>-----------------+           -- The Turtle (Stephen King, _It_)
>Any opinions expressed are my own, and generally unpopular with others.


                                  Scott Bennett, Comm. ASMELG, CFIAG
                                  Systems Programming
                                  Northern Illinois University
                                  DeKalb, Illinois 60115
**********************************************************************
* Internet:       bennett@cs.niu.edu                                 *
* BITNET:         A01SJB1@NIU                                        *
*--------------------------------------------------------------------*
*  "Spent a little time on the mountain, Spent a little time on the  *
*   Hill, The things that went down you don't understand, But I      *
*   think in time you will."  Oakland, 19 Feb. 1991, first time      *
*  since 25 Sept. 1970!!!  Yippee!!!!  Wondering what's NeXT... :-)  *
**********************************************************************

skk@sparc.UUCP (Stuart Kreitman) (04/26/91)

>
>Say no to hacks, say no to RISC (at least in SPARC-like incarnations).
>
A religious opinion flying in the face of industry history?
I agree SPARC ain't pure or holy. I have thought a lot about the
instruction set while writing assertions and tests on object file 
conformance to the SPARC architecture manual.

However, SPARC and other ~RISC machines do perform well.
The people at Sun, Sparc International and 
elsewhere are building the first true multivendor shrink wrap UNIX 
architecture after Xenix/286.

>In addition, the 68040 has built-in support for multi-processing.
>Which of the RISC chips does? (This is a real question). If there is
	Er, you mean an atomic load-store?  Version 7, page 80

>any, then that's the one NeXT is most likely to choose, since I would
	You mean the people at NeXT, or the sign out front? 
	NeXT was cool when they had that expensive real estate near XEROX PARC,
	but they've since moved uptown.
>guess multiprocessing is a more general solution to performance
>problems than any RISC-hack can offer, and thus I would suspect NeXT

	A guess is right. I'll share the truly broad brush with you:
	multiprocessing sometimes works, but I've already slaved on
	2 multiprocessors that were failures (TOPS-10 and MegaFrame)

>is putting a higher priority on that than on RISC.
	What if it were only a marketing decision?
>
>Just my 0.02$
>
>Ronald
	Another dude who blames stuff on "them"

sef@kithrup.COM (Sean Eric Fagan) (04/26/91)

(Here we go again.  *sigh*  Are people really this ignorant?)

In article <1991Apr25.025800.4377@mp.cs.niu.edu> bennett@mp.cs.niu.edu (Scott Bennett) writes:
[in response to my assertion that making a superscalar CISC machine {e.g.,
68050} is much harder than with most of the current RISC machines.]
>     If you disallow pipelining in the CISC machine, then it is most
>likely to be impossible to have so-called superscalar operation.  

Who said I was disallowing pipelining?  Pipeline your bloody CISC to death
for all I care.  For the most part, it won't make that much difference:
most CISC chips, such as the 68k and iAPX*86 series, tend to do too many
memory references in each instruction to make superscalar feasible.  Or
don't you realize you can only access one memory location at a time?  (Well,
not completely true, but true enough.)

>However,
>most CISC machines now are not only pipelined, they are *multiply* pipe-
>lined.  

Oooh.  They have more than one stage of pipelining.  Like all of the current
RISC chips, which had them before the CISC chips.

>Since a superscalar RISC can only be that way by pipelining,
>let's at least compare only pipelined architectures.  FWIW, the MC68040
>supposedly averages about 1.3 clock cycles per instruction because of
>the pipelining used.  That obviously doesn't reach "superscalar", but
>it isn't terribly far off, either.

Bullshit.  It is *very* far off.  Note the word "supposedly" in your
statement.  Please look at John Mashey's figures; I think they indicate
slightly higher (1.4 or 1.5 CPI?) for the '40; on the other hand, the R3000
got what, 1.2 or 1.3.

>     In any case, what really matters is how much work gets done per
>clock cycle, not how many instructions get done per cycle.  

No, it doesn't.  What matters is *how quickly you can get your job done*.  I
don't care if you can do a POLY instruction in 3 cycles; if you still take 2
cycles to do an add, most current RISC chips will blow you away (unless your
application consists of POLY instructions).
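
Put as arithmetic, "how quickly you can get your job done" is instruction
count * cycles per instruction / clock rate. A Python sketch of that
relation follows. The CPI values are in the range quoted above, but the
instruction counts and the shared 25 MHz clock are invented placeholders,
not measurements of any chip.

```python
def execution_time_seconds(instruction_count, cycles_per_instruction, clock_mhz):
    """Run time = instructions * CPI / clock rate."""
    if clock_mhz <= 0:
        raise ValueError("clock rate must be positive")
    return instruction_count * cycles_per_instruction / (clock_mhz * 1e6)


# Placeholder workloads: the second machine needs 30% more instructions for
# the same job but retires each one in fewer cycles.
machine_a = execution_time_seconds(1_000_000, 1.5, 25.0)
machine_b = execution_time_seconds(1_300_000, 1.25, 25.0)

print(f"machine A: {machine_a * 1000:.1f} ms")
print(f"machine B: {machine_b * 1000:.1f} ms")
```

Neither the instruction count nor the CPI settles anything alone; only the
product, divided by the clock rate you can actually buy, does.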

|One example
|is the case of moving blocks of data from one memory location to another.
|A typical RISC must 1) initialize a loop (one or more instruction fetch/
|decodes) and in the body of the loop must 2) load a word into a register
|(one fetch/decode), 3) store from the register into the new location (one
|fetch/decode), 4) increment both addresses (probably two fetches/decodes),
|5) loop back to repeat until finished (at least one fetch/decode).  Some
|CISCs have something like a "repeat" instruction that will execute 
|another instruction (e.g. a storage-to-storage move) a given number of
|times while incrementing addresses in that instruction, so the whole
|operation may require as few as two fetches/decodes.  Other CISCs have
|single instructions capable of doing block moves, so they only need one
|fetch decode.  That means more of the cycles required get spent doing the
|actual work that needs to be done than would be the case with a RISC.  A
|CISC operating in such a way would be at the *opposite* end of the spectrum
|from "superscalar", but would get its work done more quickly anyway.

This was so precious I decided to keep all of it intact.

Note how, because "some" CISCs have a "repeat" instruction, which doesn't
necessarily buy you anything (talk to Henry Spencer), all CISCs are better.

Never mind the fact that for most RISCs and CISCs the code is almost
identical, with only optimizations for the specific processor.

Listen *very* carefully:  as of right now, the most popular chips that have a
repeat instruction are the '386 and '486.  On the '486, a MOVS instruction
with no prefix takes 7 clock cycles.  A "REP MOVSB" takes 5 cycles *if you are
moving 0 bytes*, 13 *if you are moving 1 byte*, and 12+3*(number of bytes)
otherwise.  The overhead of setting up the rep instruction is essentially the
same as for the explicit loop (unless you have other uses for the registers
MOVS and REP want, in which case you have to spill and reload them, which is
going to add even more time).
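
Those cycle counts are easy to tabulate.  The sketch below just encodes the
figures quoted above (it is not checked against an Intel manual), and the
function names are made up for illustration.  The unprefixed count covers only
the MOVS instructions themselves; whatever loop control you wrap around them
comes on top.

```python
def rep_movsb_cycles(n):
    """Cycles for REP MOVSB moving n bytes, using the figures quoted above."""
    if n == 0:
        return 5
    if n == 1:
        return 13
    return 12 + 3 * n

def unprefixed_movs_cycles(n, cycles_per_movs=7):
    """Cycles spent in n unprefixed MOVS instructions (loop overhead NOT counted)."""
    return n * cycles_per_movs

for n in (0, 1, 2, 4, 16, 64):
    print(n, rep_movsb_cycles(n), unprefixed_movs_cycles(n))
```

On those numbers the prefix settles at 3 cycles per byte on top of a fixed
12-cycle startup, so it is cheaper than repeating MOVS by hand but nowhere
near one fetch/decode's worth of work for the whole block.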

The '40 doesn't have a repeat instruction; its block-memory move loop looks
very much like the RISC version, except that the RISC versions can generally
take advantage of overlapping memory loads/stores.  I.e.

	lb	$temp, ($base + $inc)
	addu	$inc, 1, $inc
	sb	$temp, ($src + $inc)

can take a total of three cycles (well, four:  the sb needs to complete).  A
68k is likely to use something like

	mov.b	[a0+d1], [a1+d1]
	add.l	d1, $1

or somesuch (sorry I'm not completely up to date on my 68k assembly; it's
been a while).  Note that the mov instruction has two memory references;
this is BAD.  (Even if I'm wrong, and there's only one memory reference per
instruction, making it look like the RISC version, the '40 still doesn't
have overlapping loads, I believe.)

Go *learn* before you start stating that people who know a lot more than you
(not me; I mostly just nod my head and agree with people like mashey and
patterson) are complete fools for not doing things properly.

-- 
Sean Eric Fagan  | "I made the universe, but please don't blame me for it;
sef@kithrup.COM  |  I had a bellyache at the time."
-----------------+           -- The Turtle (Stephen King, _It_)
Any opinions expressed are my own, and generally unpopular with others.

sef@kithrup.COM (Sean Eric Fagan) (04/26/91)

In article <1991Apr24.181932.17810@cs.cornell.edu> wayner@CS.Cornell.EDU (Peter Wayner) writes:
>In another sense, going "superscalar" is much easier with CISC
>machines.  I think the Intel 486 does a PUSH instruction in one cycle.

It executes only one instruction each cycle.  How, pray tell, is that
superscalar?  Yes, the PUSH instruction can execute in one cycle (provided
you are pushing either a "general purpose" register or an immediate; most
code I've seen for the 80186 and later likes to push lots of memory
locations, in which case it takes 4 cycles, not counting memory latency).
However, I seem to recall that there are lots of "gotchas" in that 1 cycle.
As I do not remember them, I shall defer trying to discuss them.

>In RISC land, this is a decrement and a load. The CISC designer just
>needs to use enough silicon to pipeline the important instructions.

In RISC land, you do a store followed by a decrement.  This will execute in
two cycles on a MIPS R3000, I believe (since the decrement executes while
the store is waiting).

On the other hand, the equivalent POP takes 4 cycles on the '486; on the
R3000, the equivalent load/increment takes (tada) 2 cycles.  Oooh.

>There is no need for complex logic to handle all the possible cases of
>two instructions coming down the pipe. The RISC designer needs to
>worry about generality.

And the CISC designer needs to try to think which sequences of
"instructions" are going to be commonly executed, and make them work fast as
a single instruction (such as PUSH).

Tell me: would it be preferable to have a limited PUSH instruction execute
in one cycle (limited because it can only store into a hardwired location),
or to have a more general store/arithmetic sequence execute in two cycles?


sef@kithrup.COM (Sean Eric Fagan) (04/26/91)

In article <SCOTT.91Apr23101927@texnext.gac.edu> scott@texnext.gac.edu (Scott Hess) writes:
>If you look, you'll see that all of the chips that have been created in the
>past couple years seem to be RISC chips (this was brought to my attention
>by a prof in one of my classes just this spring).  It seems to be true!
>What it makes one wonder is whether the naming of chips like the RIOS
>RISC is done because they are really RISC, or because RISC is what
>people want.  Considering all I've seen on the RIOS (not much,
>admittedly), I'd say it's more of a CISC that uses some of the things
>that RISC people have found to be helpful to make fast chips.

Let me get this straight:  the chip's implementation is complex (although it
is still basically a load/store architecture, and looks a lot like any other
"RISC" chip), therefore it is a Complex Instruction Set Computer?

Wow.  I'm impressed.  What the hell are they teaching you people?


Garance_Drosehn@mts.rpi.edu (04/27/91)

Could the RISC vs CISC debate be moved into private messages, or some other
more appropriate newsgroup?  Other than hearing yourselves talk, what good does
debating the matter here do?  How is this really NeXT-related?

If we break up comp.sys.next because of the volume of articles, we should have
a separate group for these debates which don't really have a thing to do with
the NeXT machine per se...

edwardm@hpcuhe.cup.hp.com (Edward McClanahan) (04/27/91)

Steven Preston writes:

> > I could be wrong, but don't some of the current RISC chips (such as
> > IBM's RIOS) begin to approach the complexity of the 68040?

> Yes, it is clear that RISC has become a misnomer; it (RISC) now means that
> the average cycles per instruction is about 1 or less.

> I read somewhere that Motorola claims that the '040 takes an average
> of 1.3 clock cycles per instruction.  This probably qualifies it as a
> RISC machine, by the above criterion.

When the "system" is considered (as apart from just the CPU), RISC machines
generally have an "average cycles per instruction" greater than one.  This
is due to things like Cache and Virtual-to-Real Translation Misses.  Some
"systems" improve their "cpi" by increasing the sizes of their cache(s) and
TLBs (caches containing virtual-to-real translations).  Here at HP, our RISC
processor accesses the Code and Data caches with Virtual Address Tags (as
opposed to Physical Address Tags).  This allows the Cache and TLB accesses
to proceed in parallel instead of serially (i.e. accessing the TLB to get the
physical address to index into the Cache).

All other things being equal (cache, TLB, memory sizes and access times), just
increasing the CPU clock speed actually INCREASES the "cpi", because a miss
that takes the same time in nanoseconds now costs more clock cycles.  It
should be clear that this isn't necessarily bad...
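
A back-of-the-envelope sketch of that effect, in Python.  Every number here
(base CPI of 1.0, 0.02 misses per instruction, a 400 ns miss latency, the two
clock rates) is an assumption picked to make the arithmetic visible, not a
figure for any HP or other machine, and the function names are hypothetical.

```python
def effective_cpi(base_cpi, misses_per_instr, miss_latency_ns, clock_mhz):
    """Base CPI plus stall cycles from misses; miss latency is fixed in ns."""
    cycle_ns = 1000.0 / clock_mhz
    miss_penalty_cycles = miss_latency_ns / cycle_ns
    return base_cpi + misses_per_instr * miss_penalty_cycles

def ns_per_instruction(cpi, clock_mhz):
    """Average wall-clock time per instruction in nanoseconds."""
    return cpi * 1000.0 / clock_mhz

# Same memory system, two clock rates (all values assumed).
cpi_25 = effective_cpi(1.0, 0.02, 400.0, 25)
cpi_50 = effective_cpi(1.0, 0.02, 400.0, 50)

for mhz, cpi in ((25, cpi_25), (50, cpi_50)):
    print(f"{mhz} MHz: cpi {cpi:.2f}, {ns_per_instruction(cpi, mhz):.1f} ns/instr")
```

With these assumed values the CPI rises from 1.2 to 1.4 when the clock
doubles, yet the time per instruction drops from 48 ns to 28 ns, which is why
a worse "cpi" is not necessarily a worse machine.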

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

  Edward McClanahan
  Hewlett Packard Company     -or-     edwardm@cup.hp.com
  Mail Stop 42UN
  11000 Wolfe Road                     Phone: (480)447-5651
  Cupertino, CA  95014                 Fax:   (408)447-5039

melling@cs.psu.edu (Michael D Mellinger) (04/27/91)

In article <qdwg=g+@rpi.edu> Garance_Drosehn@mts.rpi.edu writes:

   Could the RISC vs CISC debate be moved into private messages, or some other
   more appropriate newsgroup?  Other than hearing yourselves talk, what good does
   debating the matter here do?  How is this really NeXT-related?

Why isn't it NeXT related?  NeXT is going to move to a RISC platform
in the future.  I know some people that won't even buy a computer
unless it contains a RISC chip.  RISC has become a marketing tool.

Some of us are just trying to understand the full impact of RISC so we
don't have to listen to all of the marketing hype like RISC is "good"
and CISC is "bad".  In other words, we want to understand why RISC,
and in particular the one that NeXT is probably going to use (the 88K),
is actually better than CISC.  We could move it to comp.arch, but this
has been a lightweight discussion so far.

In fact, how does the 88K compare to the 68040 in performance? 

   If we break up comp.sys.next because of the volume of articles, we should have
   a separate group for these debates which don't really have a thing to do with
   the NeXT machine per se...

The great thing about computers is that they can eliminate laborious
tasks by automating them.  Try putting the thread (the current RISC
topic) in your kill file.

-Mike

torrie@cs.stanford.edu (Evan Torrie) (04/27/91)

melling@cs.psu.edu (Michael D Mellinger) writes:
 
>Some of us are just trying to understand the full impact of RISC so we
>don't have to listen to all of the marketing hype like RISC is "good"
>and CISC is "bad".  In other words, we want to understand why RISC,
>and in particular the one that NeXT is probably going to use (the 88K),
>is actually better than CISC.  We could move it to comp.arch, but this
>has been a lightweight discussion so far.

>In fact, how does the 88K compare to the 68040 in performance? 

  Well, the Harris Night Hawk is a 25MHz 88000 based machine, and its
Spec results are as follows:

  GCC:  22.80    Spice: 13.44
  Esp:  20.98    Dod:   11.94
  Li:   19.83    Nasa7: 16.20
  Eqn:  16.94    Mat:   21.45
                 Fppp:  18.64
                 Tom:   14.55

  SpecInt:  20.02   SpecFP:  15.73
  Overall SpecMark:  17.32

  I think the previous figures given for the 040 were more like 11.x?
  Given that the 88K has been out for a bit over 2 years, and it's
due for a 3x performance upgrade later this year with the 88110, it's
not hard to see why NeXT would like to run with the 88110.
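
  For anyone wondering how the summary lines relate to the individual
results: SPEC summary figures are geometric means of the per-benchmark
ratios.  A small Python check against the Night Hawk numbers quoted above,
assuming the left column is the integer set and the right column the
floating-point set (the variable names are just for this sketch):

```python
import math

def geometric_mean(values):
    """n-th root of the product, computed via logs."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Per-benchmark ratios as posted above.
integer = {"GCC": 22.80, "Esp": 20.98, "Li": 19.83, "Eqn": 16.94}
floating = {"Spice": 13.44, "Dod": 11.94, "Nasa7": 16.20,
            "Mat": 21.45, "Fppp": 18.64, "Tom": 14.55}

spec_int = geometric_mean(integer.values())
spec_fp = geometric_mean(floating.values())
spec_mark = geometric_mean(list(integer.values()) + list(floating.values()))

print(f"SpecInt {spec_int:.2f}  SpecFP {spec_fp:.2f}  SpecMark {spec_mark:.2f}")
```

  The three results agree with the posted 20.02, 15.73 and 17.32 to within
rounding, so the table is at least internally consistent.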


-- 
------------------------------------------------------------------------------
Evan Torrie.  Stanford University, Class of 199?       torrie@cs.stanford.edu  
"And in the death, as the last few corpses lay rotting in the slimy
 thoroughfare, the shutters lifted in inches, high on Poacher's Hill..."

scott@texnext.gac.edu (Scott Hess) (04/27/91)

In article <1991Apr26.163303.8614@kithrup.COM> sef@kithrup.COM (Sean Eric Fagan) writes:
   In article <SCOTT.91Apr23101927@texnext.gac.edu> scott@texnext.gac.edu (Scott Hess) writes:
   >If you look, you'll see that all of the chips that have been created in the
   >past couple years seem to be RISC chips (this was brought to my attention
   >by a prof in one of my classes just this spring).  It seems to be true!
   >What it makes one wonder is whether the naming of chips like the RIOS
   >RISC is done because they are really RISC, or because RISC is what
   >people want.  Considering all I've seen on the RIOS (not much,
   >admittedly), I'd say it's more of a CISC that uses some of the things
   >that RISC people have found to be helpful to make fast chips.

   Let me get this straight:  the chip's implementation is complex (although it
   is still basically a load/store architecture, and looks a lot like any other
   "RISC" chip), therefore it is a Complex Instruction Set Computer?

   Wow.  I'm impressed.  What the hell are they teaching you people?

Wow.  I'm impressed.  Did you read any of my article at all?

What I said was that many chips nowadays are called RISC because
RISC is "in".  Sort of a fad, in other words.  To reword it yet
again (just in case you missed it) - the number of people calling
their new designs "RISC" has weakened the meaning of the term
tremendously.

MIPS, for instance, seems fairly close to the spirit of RISC.
Sparc, likewise.  RIOS, meanwhile, is stretching things a bit.  And
why else would a beast like the '040 be called 'RISC-like in many
ways'?

Later,
--
scott hess                      scott@gac.edu
Independent NeXT Developer	GAC Undergrad
<I still speak for nobody>
"Simply press Control-right-Shift while click-dragging the mouse . . ."
"I smoke the nose Lucifer . . . Banana, banana."

scott@mcs-server.gac.edu (Scott Hess) (04/30/91)

In article <1991Apr24.181932.17810@cs.cornell.edu> wayner@CS.Cornell.EDU (Peter Wayner) writes:
   sef@kithrup.COM (Sean Eric Fagan) writes:

   >Nope, not really. That's the problem:  with a "CISC" instruction set, it's
   >really very difficult to go superscalar, at least compared to some "RISC"
   >instruction sets.   Why?  Because not enough registers, too many memory
   >references in a single instruction, or other small, niggling details.

   In another sense, going "superscalar" is much easier with CISC
   machines.  I think the Intel 486 does a PUSH instruction in one cycle.
   In RISC land, this is a decrement and a load. The CISC designer just
   needs to use enough silicon to pipeline the important instructions.
   There is no need for complex logic to handle all the possible cases of
   two instructions coming down the pipe. The RISC designer needs to
   worry about generality.

Not!  The CISC instruction set is the one that requires the complex logic
to handle all the possible cases.  That's because CISC gives you much
more to work with.  A basic example of the problem is a sequence where
the first instruction does a post-increment on one of the registers,
and the very next one uses that register.  In most RISC machines, this
is not possible, as there is no post-increment instruction.

Of course, this is not to say that RISC machines don't have data hazards.
But, the RISC data hazards are necessarily a subset of CISC hazards,
and thus should be _easier_ to handle (I didn't say "easy", I said
easier!).
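
A toy model of the post-increment case, as a sketch only.  The instruction
records, register names and assembly strings below are invented for
illustration and do not describe any real decoder; the point is just that an
addressing mode with a side effect adds one more register to the set the
hazard logic has to watch.

```python
from collections import namedtuple

# reads/writes are sets of register names (toy model, not a real ISA).
Instr = namedtuple("Instr", "text reads writes")

def raw_hazard(first, second):
    """Registers written by `first` that `second` reads (read-after-write)."""
    return first.writes & second.reads

# Load/store style: the load writes only its destination register.
plain_load = Instr("load r1, (r2)", reads={"r2"}, writes={"r1"})
plain_next = Instr("add r3, r2, r4", reads={"r2", "r4"}, writes={"r3"})

# Post-increment style: the same load also writes back the address register.
postinc_load = Instr("move (a0)+, d0", reads={"a0"}, writes={"d0", "a0"})
postinc_next = Instr("add a0, d1", reads={"a0", "d1"}, writes={"d1"})

print("plain:   ", raw_hazard(plain_load, plain_next))
print("post-inc:", raw_hazard(postinc_load, postinc_next))
```

In the plain case the following instruction can use the address register
freely; in the post-increment case it has to wait for (or be forwarded) the
updated value, which is exactly the extra case the CISC pipeline must detect.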

Later,