earl@mips.COM (Earl Killian) (04/19/88)
Now that Motorola has announced the 88000, I believe all the commercial "RISC"s are out in the open (or am I missing something?). This list includes the MIPS R2000/R3000, the Fairchild Clipper, the IBM RT, the HP Precision, the AMD 29000, Sun SPARC, Intel 80960, and Motorola 88000 (speak up if I left anyone out!). I propose that comp.arch develop a standard form for describing "RISC" architectures and apply it to the above. (We could include military and research machines as well, if people so desire.) Below I propose such a form, which will, no doubt, require generalization. Once we agree on what it takes to characterize an architecture and its implementation fairly well, we can fill in the answers for all of the above (unless people think this is a worthless exercise?).

First, some definitions of my terminology are in order, because it's probably different from everyone else's. The latency of an operation is the time it takes for the entire operation to complete. The issue time is the time before you can start the next instruction, and the rate is the time until you can start another instruction of the same type. For example, a machine might require 3 cycles for a load instruction (1 to calculate the address, 2 to access the cache), allow a new load every 2 cycles, but allow a non-load to start immediately. I describe this load as 3/1/2. What is commonly called the load delay (as opposed to latency) is the time after the load before you can reference the result. This is the latency minus the issue time (3 - 1 = 2) in this case. Don't confuse latency with delay.

Some latency/issue/rate examples from the Cray-1S (from memory, so don't quote me):

	logicals:	 1/1/1
	shift:		 2/1/1
	integer add:	 3/1/1
	load:		11/2/2

An example of a multi-cycle latency, non-pipelined floating point unit might have:

	add:	2/1/2
	mul:	4/1/4

I hope that is clear enough. If not, I'll try to clarify. Here is my proposed form to characterize architectures and their implementations.
I'll post the MIPSco numbers once we agree on the data to collect.

> Peak native MIPS
What is the clock cycle time?
What is the peak native MIPS rate?

> Implementation technology
What are the parameters of the implementation technology?

> Instruction format
What instruction sizes are used?
What size are immediate operands?
What size are branch displacements?

> Integer Registers
How are the registers organized [simple, windowed]?
How many total integer registers?
Hardwired zero register?
For windowed machines:
  How many registers are addressed by an instruction?
  How many of these are not windowed?
  What window increments are supported?
  Window overflow and underflow are handled in [software, hardware]?

> Integer ALU
What is the logical latency/issue/rate?
What is the shift latency/issue/rate?
What is the add latency/issue/rate?
What is the compare latency/issue/rate?

> Branches
Which operand comparisons are implemented in the conditional branch instruction, and which require a separate instruction?
Where is the result of separate comparisons stored [registers, condition codes]?
Which forms of branch delay are present in the instruction set [execute N if no branch, execute N if branch, execute N always]?
What are the taken and not-taken cycle counts for each branch type?

> Loads/Stores
What addressing mode(s) do load instructions use?
What addressing mode(s) do store instructions use?
Which load/store sizes are supported [8, 16, 32, 64]?
What is the load latency/issue/rate?
What is the store latency/issue/rate?

> Integer Multiply/Divide
How is multiply implemented [software, multiply step, hardware]?
How many cycles to perform a 32x32->32 multiply?
How is divide implemented [software, divide step, hardware]?
How many cycles to perform a 32x32->32 divide?

> Floating Point
Are floating point registers separate from integer registers?
How many 32-bit floating point registers?
How many 64-bit floating point registers?
How many 80-bit floating point registers?
How is floating point implemented [software, coprocessor, on-chip]?
What are the floating point operation latency/issue/rates?
	32-bit	64-bit	80-bit
add
mul
div
Which floating point units can operate in parallel?
Can floating point operate in parallel with integer?
Are floating point exceptions precise?

> Memory management
Page size?
Translation cache [none, off-chip, on-chip]?
Translation cache size in entries?
Translation cache associativity [direct-mapped, 2-set, 4-set, full]?
Translation cache miss handled by [software, hardware]?

> Caches
Instruction cache [none, off-chip, on-chip]?
Data cache [none, off-chip, on-chip]?
Are I and D caches separate?
I-cache total size in bytes?
I-cache associativity [direct-mapped, 2-set, 4-set, fully associative]?
I-cache address block size in bytes (bytes per tag)?
I-cache transfer block size in bytes (bytes read on cache miss)?
I-cache index [virtual, physical]?
I-cache tag [virtual, physical]?
D-cache total size in bytes?
D-cache associativity [direct-mapped, 2-set, 4-set, fully associative]?
D-cache writes [write-through, write-back]?
D-cache address block size in bytes (bytes per tag)?
D-cache transfer block size in bytes (bytes read on cache miss)?
D-cache index [virtual, physical]?
D-cache tag [virtual, physical]?
--
UUCP: {ames,decwrl,prls,pyramid}!mips!earl
USPS: MIPS Computer Systems, 930 Arques Ave, Sunnyvale CA, 94086
ram%shukra@Sun.COM (Renu Raman, Taco Bell Microsystems) (04/19/88)
In article <2048@gumby.mips.COM> earl@mips.COM (Earl Killian) writes:
>Now that Motorola has announced the 88000, I believe all the
>commercial "RISC"s are out in the open (or am I missing something?).
>This list includes the MIPS R2000/R3000, the Fairchild Clipper, the
>IBM RT, the HP Precision, the AMD 29000, Sun SPARC, Intel 80960, and
>Motorola 88000 (speak up if I left anyone out!).
>
>UUCP: {ames,decwrl,prls,pyramid}!mips!earl
>USPS: MIPS Computer Systems, 930 Arques Ave, Sunnyvale CA, 94086

Here is a brief listing of RISCs from university, commercial & Govt projects. I am sure I have missed out many more. Can somebody email me or fill in the rest?

Universities:

	Berkeley:	RISC I/II, SOAR, SPUR
	Stanford:	MIPS-X
	Purdue:		(with RCA?)

Commercial:

	Acorn:		ARM
	AMD:		29000
	ATT:		CRISP
	Fairchild:	Clipper
	HP:		Spectrum
	IBM:		RT (ROMP) & its next generation
	Mips:		R[2/3]000
	Motorola:	88000 (an example of how no. got changed before release :-))
	Pyramid:	90X
	Ridge:		Ridge-32
	Sun:		Sparc
	Xerox:		?? (ECL)
	Intel:		80960

Commercial (announced and in-the-works):

	Apple:		<code-named aquarius - so is it next february :-)>
	Apollo:		<no. to-be-stamped :-)>
	DEC:		???

DoD & Darpa contracted:

	GE:		RPM-40
	RCA:		??
	TI:		??
	MD:		??
	Rockwell:	??

---------------------
Renukanthan Raman		ARPA: ram@sun.com
Sun Microsystems		UUCP: {ucbvax,seismo,hplabs}!sun!ram
M/S 18-41, 2500 Garcia Avenue, Mt. View, CA 94043
csg@pyramid.pyramid.com (Carl S. Gutekunst) (04/20/88)
In article <49983@sun.uucp> ram@sun.UUCP (Renu Raman) writes:
>Commercial
>
>	Pyramid: 90X

Also the Pyramid 9810, same architecture. It has been debated whether the Pyramid architecture is "really" RISC, since it has a rather bulky instruction set, microcode, an interlocked pipeline, and instructions you'll never find on a CPU designed by John Hennessy (like interruptable block move). On the other hand, it *did* borrow liberally from RISC I and MIPS-X, the most visible elements being the sliding register window for function calls and the notion of using smart compilers instead of smart hardware. The 90x and 9810 both use a Schottky TTL implementation, but that is strictly an implementation issue.

One other commercial RISC: Celerity 1200, et al. What is ironic is that Celerity's product literature has been very low-key about its RISC architecture, though the 1200 is more "RISC" than either Ridge or Pyramid, the two vendors who were making the most noise about RISC at the time the 1200 was announced. The machine's floating point performance scaled well with its integer performance, something only a few other processors have demonstrated (e.g. the MIPS R2000).

>	Apollo: <no. to-be-stamped :-)>

The *system* has been announced as the Apollo 10000. Sun's salespeople have been making derisive noises about it, since Apollo announced the box before two of its eight (six? ten?) gate arrays had seen silicon. Actually, announcing processors that have only been simulated has become commonplace. This is almost reasonable, given the quality of simulation tools available these days.

>	DEC: ???

DEC's big RISC engine goes by the code name "Titan." It is something around 10 MIPS, up to 10 tightly-coupled CPUs. It's supposed to be a big secret, so don't spread this around. :-)

A mildly interesting point is the success of commercial RISC products that have been in the marketplace for a while.
Ridge and Celerity are essentially in the past tense, although Ridge is still trying to make a go of it. Pyramid is thriving. The IBM PC/RT was a flop. The MIPS R2000 has done fairly well, although I've been surprised by the number of vendors jumping on top of SPARC when the R[23]000 has so much more going for it. (Save the flames, I've read all the debates on this.) The SPARC is too new to call, but it appears that it will be a smashing success.

<csg>
celerity@bucasb.bu.edu (Roger B.A. Klorese) (04/20/88)
In article <49983@sun.uucp> ram@sun.UUCP (Renu Raman) writes:
!Commercial
!
! Acorn: ARM
! AMD: 29000
! ATT: CRISP
Celerity: C1200 and C1230 Accel
! Fairchild: Clipper
! HP: Spectrum
! IBM: RT(ROMP) & its next generation
! Mips: R[2/3]000
! Motorola: 88000 (an example of how no. got changed before release:-))
! Pyramid: 90X
! Ridge: Ridge-32
! Sun: Sparc
! Xerox: ?? (ECL)
! Intel: 80960
!
!Commercial(Announced and in-the-works)
!
! Apple: <code-named aquarius - so is it next february:-)>
! Apollo: <no. to-be-stamped :-)>
Celerity: 6000
! DEC: ???
!
dre%ember@Sun.COM (David Emberson) (04/20/88)
Of course, the number of cycles to do this or that is a function of the implementation, not the architecture. And none of us will supply the really interesting data--on the chips we haven't announced yet! Earl, if the purpose of this exercise is to prove that the R3000 will outbench the 16 MHz Fujitsu SPARC, then on behalf of Sun Microsystems I concede (assuming your published data to be correct--I have never seen an R3000). How about adding to the list "total dollars being invested in new implementations?" And don't forget "number of engineers worldwide working on this architecture." Ah, this is going to be one fun war--and we all win! Dave Emberson (dre@sun.com)
csg@pyramid.pyramid.com (Carl S. Gutekunst) (04/20/88)
In article <20123@pyramid.pyramid.com> I wrote: >On the other hand, it [the Pyramid 90x] *did* borrow liberally from RISC I >and MIPS-X.... Foo. I should look where I type. I meant the original Stanford MIPS (what was it called?), not the MIPS-X. <csg>
celerity@bucasb.bu.edu (Roger B.A. Klorese) (04/20/88)
In article <20123@pyramid.pyramid.com> csg@pyramid.pyramid.com (Carl S. Gutekunst) writes:
>A mildly interesting point is the success of commercial RISC products that
>have been in the marketplace for a while. Ridge and Celerity are essentially
>in the past tense, although Ridge is still trying to make a go of it.

It's funny, Carl: every time you've posted on RISC before, I've made some piddling correction about the Celerity information. Now that I'm no longer a Celerity employee (despite the borrowed account), I'm gonna do it again! Last week, it was announced that Floating Point Systems is in the process of acquiring Celerity's assets and liabilities, and will continue the development of the Celerity 6000 with the remaining engineering staff, as well as picking up support of the installed base. So yes, Celerity the totally independent company is in the past tense, but Celerity the FPS subsidiary is not.

---
Roger B.A. Klorese			MIPS Computer Systems, Inc.
{ames,decwrl,prls,pyramid}!mips!rogerk	25 Burlington Mall Rd, Suite 300
rogerk@mips.COM				Burlington, MA 01803
* Your witticism here. *		+1 617 270-0613
ram%shukra@Sun.COM (Renu Raman, Taco Bell Microsystems) (04/20/88)
In article csg@pyramid.pyramid.com (Carl S. Gutekunst) writes:
>>	Pyramid: 90X
>
>Also Pyramid 9810, same architecture. It has been debated whether the Pyramid
>architecture is "really" RISC, since it has a rather bulky instruction set,

Exactly. I had included "RISC" and "claimed RISC". I guess one of the objectives of Earl's original note is to settle this thing called "what is RISC".

>>	Apollo: <no. to-be-stamped :-)>
>
>The *system* has been announced as the Apollo 10000.

Wrongo! That is the machine. Now that I think back, the processor goes by the name PRISM. The "no.-to-be-stamped" was a smiley remark to some previous discussion here about when & how machines/processors get their marketing ids.

>A mildly interesting point is the success of commercial RISC products that
>have been in the marketplace for a while.

Lower development cost, better simulation tools, and Unix and compatibility being non-issues are partly responsible for RISC successes. Query: are there any RISC(y) processors that are running an OS other than UNIX?

><csg>

Renu
csg@pyramid.pyramid.com (Carl S. Gutekunst) (04/20/88)
In article <577503463.14723@bucasb.bu.edu> rogerk@mips.com (Roger B.A. Klorese) writes: >It's funny, Carl: every time you've posted on RISC before, I've made some >piddling correction about the Celerity information. Deja vu? I don't think this correction was piddling, though, since I may have unintentionally scared some Celerity users into thinking that their machines are orphans, which is certainly not true. >Last week, it was announced that Floating Point Systems is in the process of >acquiring Celerity's assets and liabilities.... Yes, I knew that, and have guarded hopes that the 6000 will be a reality. I say guarded, since we really don't know what FPS will do. (If you've ever been in the middle of a takeover, you'll know that even what your old president is told is suspect, let alone the grunt engineers, let alone the media.) My point, though, was that out of the first five commercial RISC ventures, three flopped. Now don't misunderstand; I suspect that if Celerity had the kind of financial backing that Ridge and the PC/RT had (Ridge raised $20 Million after they had already failed once), it would have been successful. But the road to RISC has been a rocky one. Can someone from Ridge comment on their health? <csg>
butcher@G.GP.CS.CMU.EDU (Lawrence Butcher) (04/20/88)
When is a RISC not a RISC? Today I got copies of the 80960KB Programmer's and Hardware Designer's reference manuals.

	32 AND 64 bit instructions.
	Enthusiastic addressing modes.
	Multiple-cycle instructions.
	Confused call/return instructions.
	Decimal data type.
	Trig functions in microcode instead of manufacturer-sanctioned subroutines.
	No delayed branching. Zero-cycle branches anyway by making other
	instructions SO SLOW that the branch is finished before the previous
	instruction is done.
	Multiplexed address/data bus.
	No memory management. No support for page faults.
	Maximum instruction time 75878 clocks +- 40%. (probably typo :-)

Maybe the 8087 is a RISC? But really, Intel does not advertise this chip as a RISC. They have targeted the "embedded-processor" market. The KB chip doesn't suggest workstations to me. I had hoped that this chip would help AMD, Motorola, and MIPS revise the price of their chip sets downward. Maybe next one, Intel? :-)

Weitek has a family of processors called the XL-8000/XL-8032/XL-8064. I don't think that they are advertised as being RISC, but I think that they are. The 3-chip set contains no memory management, but can deal with page faults. The architecture has separate instruction address, 64 bit instruction data, data address, and 32 or 64 bit data busses. At most an integer instruction, a floating point multiply-accumulate, and a short conditional branch can be executed each clock. A complete cross-development system is available. The set comes in 8 MHz, 10 MHz, and 12 MHz versions. The 8 MHz part dhrystones around 6500, I think. It is MUCH faster at floating point than that number suggests.

Let me point out an article that might be interesting to readers. The Volume 16 Number 1 March 1988 issue of Computer Architecture News has an article by Wm. A. Wulf on "The WM Computer Architecture". Wulf has a background in compiler design and has a very good idea of what instruction sequences occur in real code.
He describes a RISC instruction set with 32-bit instructions which name 3 source registers and 2 ALU operations per instruction. He argues that the compiler can juggle ALU ops so that the second operation frequently does useful work. His machine transfers instructions to the integer ALU and floating point ALU thru FIFOs, condition codes from the ALUs to the IFU thru FIFOs, and data to and from memory thru FIFOs. This thing seems like a step in the direction of a RISC VLIW.

If things like page faults were figured out, and if interrupts could happen without causing registers to be overwritten before being used, and if delayed branching really wasn't important as claimed, and if the ALU ops were simple (only one ALU could multiply or divide), would this instruction set really be 2 or more times faster than today's RISCs at the same speed for roughly the SAME cost? Would it be as economical for a conventional RISC to fetch 2 instructions at the same time and execute them in parallel if there were no data dependency?
earl@mips.COM (Earl Killian) (04/20/88)
In article <50070@sun.uucp> dre%ember@Sun.COM (David Emberson) writes:
Of course, the number of cycles to do this or that is a function of the
implementation, not the architecture.
Yes, that's why I continuously referred to "the architecture and its
implementation" in my posting. I think the implementations are
actually more interesting than the instruction set architecture
underneath. Good implementation is more difficult, and there's a lot
to be learned from such study. The places where implementation and
instruction set design interact are especially interesting. I've
noted numerous times in this forum when other designers said their
instruction set chose method X because Y was too hard when we made the
opposite choice, and vice versa.
And none of us will supply the really interesting data--on the
chips we haven't announced yet!
Of course. But as soon as something new is announced, we'll have a
good way to communicate information, right? I'm certainly not
suggesting that we take a snapshot of April 88 and never update it.
(Nor am I letting the fact that MIPS' unannounced designs are oodles
better than the current ones from talking about our current ones :-)
Earl, if the purpose of this exercise is to prove that the R3000
will outbench the 16 MHz Fujitsu SPARC, then on behalf of Sun
Microsystems I concede (assuming your published data to be
correct--I have never seen an R3000).
That was definitely not my intent. My purpose was as an aid to help
me keep track of what's going on out in the wide world, because it's
getting tough with all the different machines and implementations.
The recent Moto/Intel announcements were the real spur. I started
making a list of the features for everything I knew about, and
realized there were a lot of blanks. I thought comp.arch would be
both helpful in filling in the blanks, and interested in the results.
How about adding to the list "total dollars being invested in new
implementations?" And don't forget "number of engineers worldwide
working on this architecture."
Ah, this is going to be one fun war--and we all win!
One remark in the spirit of your posting: you seem to be suggesting
that you prefer comp.arch not discuss the Sun/Fujitsu SPARC
implementation because it's uncompetitive with respect to the others,
and instead you would rather wait for a worthier SPARC entrant.
If so, fine. In the meantime would you care to comment on what data
is relevant for when you do have something to talk about?
--
UUCP: {ames,decwrl,prls,pyramid}!mips!earl
USPS: MIPS Computer Systems, 930 Arques Ave, Sunnyvale CA, 94086
paulr@granite.dec.com (Paul Richardson) (04/20/88)
I am kind of tired of listening to these arguments about what is and what is not a RISC machine. So that we can move on to more interesting architectural topics, I propose the following definition.

RISC:

1) A machine in which the instruction set is designed/chosen based on what makes the most sense to put into the hardware. For instance, you probably wouldn't want to piss away a lot of hardware on the equivalent of the *VAX* POLY instruction, yet on the other hand optimizing loads/stores might turn out to be a big performance win.

2) Although it is not necessarily a requirement, it has been a characteristic of RISC machines that they have a 'large' general purpose register file. Just how many defines large is up to the designer, but from the papers on register allocation I have seen (especially David Wall's) it seems that something between 32 and 100 is about all that current compiler technology knows how to deal with.

I have had the opportunity to work with one so-called 'RISC' machine (the DEC research box called the TITAN) and read about many others. The underlying similarity amongst them all seems to be that good engineering practice was used in determining the architecture/instruction set. Doing things like taking statistics on the frequency of instructions used in REAL programs and using that to determine what does and what does not go into the hardware makes sense to me. Making compilers smarter and having them do things like schedule instructions makes sense to me. Using registers instead of main memory during run time makes sense to me (this is not a RISC discovery).

OK OK OK, that's my two cents' worth. Now could we talk about things like the merits of delayed branching, or limits on pipeline depth, or nifty floating point algorithms, etc.? I know that there are plenty of bright people out there just dying to spill their grey matter on other topics.

/pgr
oconnor@sungoddess.steinmetz (Dennis M. O'Connor) (04/20/88)
An article by ram@sun.UUCP (Renu Raman) says:
] In article <2048@gumby.mips.COM> earl@mips.COM (Earl Killian) writes:
] Here is a brief listing of RISCs from university, commercial & Govt projects
] I am sure I have missed out many more. Can somebody email me or fill in
] the rest?
] DoD & Darpa Contracted
]
] GE: RPM-40
Bulk CMOS, Silicon exists.
] RCA: ??
RCA did TWO RISC designs for DARPA, one GaAs, one CMOS SOI :
One was called "GaAs Microprocessor"
The other was called "High-Speed CMOS Microprocessor"
Unfortunately, neither design was funded for production.
BTW, GE and RCA are one company now. That happened about
halfway through the RCA micro's design period. No silicon.
] TI: ??
TI and CDC ( Control Data ) teamed up to do, for DARPA :
"High-Speed GaAs Microprocessor", which is still in development I think.
Recently announced fab of a below-target-speed version, I think.
] MD: ??
McDonnell-Douglas Astronautics Company did the MD 484, in GaAs
Still in development, I think.
] Rockwell: ??
I don't know anything about the Rockwell effort. Is it DARPA ?
You left out Sperry. Sperry ( now part of Unisys ) did the
"High Speed CMOS Microprocessor", also for DARPA. Silicon exists.
] Renukanthan Raman ARPA:ram@sun.com
The GE RPM-40 used to be called "High Speed CMOS Microprocessor" too,
among other things. We decided it took too long to say, and was
eating space on our viewgraphs. So RPM-40 ( RISC Pipelined
Microprocessor, 40MIPS ) was born.
Check the Government Printing Office for reports on these efforts.
They are not classified, but are ITAR restricted, I think.
--
Dennis O'Connor oconnor%sungod@steinmetz.UUCP ARPA: OCONNORDM@ge-crd.arpa
( I wish I could be polite all the time, like Eugene Miya )
(-: The Few, The Proud, The Architects of the RPM40 40MIPS CMOS Micro :-)
baum@apple.UUCP (Allen J. Baum) (04/20/88)
In article <1468@pt.cs.cmu.edu> butcher@G.GP.CS.CMU.EDU (Lawrence Butcher) writes:
>When is a RISC not a RISC? Today I got copies of the 80960KB Programmer's
>and Hardware Designer's reference manuals.
>	32 AND 64 bit instructions.
>	Enthusiastic addressing modes.

Well, I might agree with you there. They are not as bad as, say, Clipper, but I'm not sure I'd use that as an argument why something was or wasn't RISC.

>	Multiple-cycle instructions.

You mean like HP Spectrum, or IBM RT/PC? Not much of an argument?

>	Confused call/return instructions.

I'm not sure that calling the variations on call/return 'confused' is terribly technical. They allow choices of their full 'call', with argument passing, etc., when they need it, and an optimised version for when you don't. They are covering all their bases.

>	Decimal data type.

Their decimal instructions will add or subtract a single decimal digit. This doesn't seem to be a horrendous amount of support. HP Spectrum has decimal support as well.

>	Trig functions in microcode instead of manufacturer-sanctioned subroutines.

I'm not sure if the microcode gives some advantages over using the built-in floating point add/sub/mul/div, but I seem to recall that they are both faster and more accurate. Intel has had Prof. Kahan (of IEEE standards fame) on their consulting list since well before the IEEE standard. If they believe that high performance, accurate trig routines are important for their embedded market, I'd say that this was probably a good choice. Besides, they can always trap on the opcodes if they don't want to implement them.

>	No delayed branching.

You mean like Ridge or CRISP or Clipper? Based on conversations I've had with Ridge and CRISP people, I'm now mostly satisfied that a software branch prediction bit can perform about as well as delayed branching, depending on the success rate of filling branch holes. If you can fill holes better than you can predict, then delayed branching is better.
Papers I've seen from Berkeley show an 80% correct branch prediction rate can be achieved. It's not clear that you can maintain 80% of branch holes being filled, especially if you also have to fill load holes.

>Zero-cycle branches anyway by making other instructions SO SLOW that the
>branch is finished before the previous instruction is done.

Like SPARC? This is implementation, not architecture. You can be sure that the next implementation won't be so (dare I say it?) wimpy.

>	Multiplexed address/data bus.

What does this have to do with RISC? You may as well complain about using the address bus twice a cycle, a la MIPS.

>	No memory management. No support for page faults.

The -MC models have all the memory management you would want. An embedded controller is not so dependent on an MMU, so they left it out of SOME versions of the chip.

>Maximum instruction time 75878 clocks +- 40%. (probably typo :-)

Maybe not a typo. It's for the Remainder Real instruction. The trig and log functions take 104-441 cycles otherwise, and are interruptible.

>Let me point out an article that might be interesting to readers. The
>Volume 16 Number 1 March 1988 issue of Computer Architecture News has an
>article by Wm. A. Wulf on "The WM Computer Architecture". Wulf has a
>background in compiler design and has a very good idea of what instruction
>sequences occur in real code. He describes a RISC instruction set with
>32-bit instructions which name 3 source registers and 2 ALU operations
>per instruction. He argues that the compiler can juggle ALU ops so that
>the second operation frequently does useful work.

This sounds a lot like the original Stanford MIPS. They gave it up as a bad idea. I'll look for the article, though.

--
{decwrl,hplabs,ihnp4}!nsc!apple!baum	(408) 973-3385
fotland@hpihoah.HP.COM (Dave Fotland) (04/20/88)
Many of the items on your form are implementation rather than architecture, so you will need a separate entry for each implementation. Maybe you could collect the architecture stuff at the front of the form so we wouldn't have to repeat it. For example, HP Precision architecture has several implementations:

	Model		CPU	FPU	Cache	Bus
	HP9000/825	1	1	1	1
	HP9000/835	1+	2	2	1
	HP9000/840	2	3	3	1
	HP9000/850	1	1 or 2	4	2
	HP9000/855	3	2	5	2

(And the equivalent HP3000 machines. The difference is that the HP9000 runs HP-UX and the HP3000 runs MPE.) There are 3 completely different CPUs (one with two versions), 3 different floating point coprocessors, 5 different caches (different size and/or organization), and two different busses (with 3 different memory systems).

-David Fotland
fotland@hpda.HP.COM
garyb@hpmwtla.HP.COM (Gary Bringhurst) (04/21/88)
Why has no one mentioned the Inmos Transputers? They are certainly Risc'ish. Gary L. Bringhurst Hewlett-Packard Company
allen@granite.dec.com (Allen Akin) (04/21/88)
In article <20123@pyramid.pyramid.com> csg@pyramid.pyramid.com (Carl S. Gutekunst) writes: > >DEC's big RISC engine goes by the code name "Titan." It is something around 10 >MIPS, up to 10 tightly-coupled CPUs. It's supposed to be a big secret, so don't >spead this around. :-) > ><csg> Just to clarify things for the masses: Titan is a research RISC machine designed and implemented several years ago by DEC's Western Research Lab in Palo Alto. It's been mentioned in a number of papers published by WRL (see Wall and Powell's paper in ASPLOS II, for example) so feel free to spread it around. :-) Allen
walter@garth.UUCP (Walter Bays) (04/21/88)
In article <49983@sun.uucp> ram@sun.UUCP (Renu Raman) writes: >Here is a brief listing of RISCs from university, commercial & Govt projects >I am sure I have missed out many more. Can somebody email me or fill in >the rest? > ... > Fairchild: Clipper Intergraph: Clipper C100, C300 Intergraph bought the Fairchild Advanced Processor Division which makes the Clipper. National Semiconductor owns the rest of Fairchild. -- ------------------------------------------------------------------------------ Any similarities between my opinions and those of the person who signs my paychecks is purely coincidental. E-Mail route: ...!pyramid!garth!walter USPS: Intergraph APD, 2400 Geng Road, Palo Alto, California 94303 Phone: (415) 852-2384 ------------------------------------------------------------------------------
phil@osiris.UUCP (Philip Kos) (04/21/88)
Earl (and everyone who has responded so far) - Great idea. I have a suggestion about the terminology, though. It's probably too late to do anything about this, but what the hell... In article <2048@gumby.mips.COM>, earl@mips.COM (Earl Killian) writes: > First some definitions of my terminology are in order.... the rate is > the time until you can start another instruction of the same > type. While I don't have a really big problem with this one, the common (to me anyway) definition of "rate" is not a unit of time, but a measure of something else reduced to standard units of time. For instance, "one SP floating add instruction can be started every cycle" is a "rate", but "one cycle" is not. What Earl suggests is actually the reciprocal of my understood meaning, and seems to me something like using "Hz" to indicate the period of a wave instead of its frequency. It doesn't really make any difference to the discussion as long as everyone understands what is meant by the term "rate". However, anyone walking in on the middle of a discussion where the meaning of "rate" is assumed as above by all the participants is likely to get REALLY confused if he assumes the "traditional" meaning. Suggestions for better terms (or roots, anyway): "delay", "lag", "hold", "period", etc. (you get the picture), probably qualified as "type-interlock delay" or something to differentiate it from the basic pipeline issue delay or whatever. (I think that this differentiation is probably what Earl was going for when he proposed the term "rate" anyway.) Of course, since I'm not contributing anything substantial to the discussion, you should all ignore me anyway... :-) Phil Kos ...!decvax!decuac!\ Information Systems ...!uunet!mimsy!aplcen!osiris!phil The Johns Hopkins Hospital ...!allegra!/ Baltimore, MD
dre%ember@Sun.COM (David Emberson) (04/21/88)
Earl, I would like to apologize for being cynical about your intent. On further reflection, I think there is some value to this cataloguing of RISCs--if only, as you say, to communicate information. We are all in search of the Holy Instruction Set which solves all our problems of bit efficiency, ease of implementation, etc., so such a list may actually inspire someone to insight on the problem of architecture comparisons. One thing which is missing from the list (which might fall under the category of "parameters of implementation technology") and which is the thing that does make the Fujitsu SPARC competitive (if you will forgive my flirting with delivery of a commercial message for the moment) is price. As such, I think the component compares favorably when price-performance rather than performance alone is considered. And no, I do not wish to limit anyone's discussion of SPARC or anything else. It would certainly be nice, though, if we could talk about "the latest stuff." Some of you MIPS guys are very dear friends of mine and it would be nice to compare notes--although it would not surprise me if you knew in detail what was going on here anyway! I had an interviewee the other day describe one of my most secret activities in great detail! It seems he had a previous interview at one of our "technology partners." Ah, the joys of the free enterprise system... I vaguely remember hearing someone about ten years ago give a talk on the subject of architecture comparison. Unfortunately I do not remember who it was, but they defined two measures of an architecture, R and S. The R measure was a metric of the number of register references and the S measure was a metric of the number of memory references (storage) in a given piece of code. Presumably a high R/S is desirable, although this is far from certain in the presence of write-back caches. In any case, these indicators are independent of implementation technology. 
It would be nice if we could develop some such set of metrics which would allow architectures to be compared for efficiency and ease of implementation. I haven't a clue as to how to measure something for ease of implementation--I'll leave that to some enterprising person. I am in full agreement with your statement that implementations are the most interesting. We can get real numbers from them and identify real areas for improvement. On another subject, does anyone know how the 88200 cache consistency scheme works? Dave Emberson (dre@sun.com)
root@mfci.UUCP (SuperUser) (04/21/88)
In article <1468@pt.cs.cmu.edu> butcher@G.GP.CS.CMU.EDU (Lawrence Butcher) writes: >When is a RISC not a RISC? Today I got copies of the 80960KB Programmer's >and Hardware Designer's reference manuals. 32 AND 64 bit instructions. >Enthusiastic addressing modes. Multiple-cycle instructions. Confused >call/return instructions. Decimal data type. Trig functions in microcode >instead of manufacturer-sanctioned subroutines. No delayed branching. >Zero-cycle branches anyway by making other instructions SO SLOW that the >branch is finished before the previous instruction is done. Multiplexed >address/data bus. No memory management. No support for page faults. >Maximum instruction time 75878 clocks +- 40%. (probably typo :-) I'll skip this question in the hopes of keeping whatever friends I still have at Intel...:-) >This thing seems like a step in the direction of a RISC VLIW. If things >like page faults were figured out, and if interrupts could happen without >causing registers to be overwritten before being used, and if delayed >branching really wasn't important as claimed, and if the ALU ops were >simple (only one ALU could multiply or divide), would this instruction >set really be 2 or more times faster than today's RISCs at the same speed >for roughly the SAME cost? Would it be as economical for a conventional >RISC to fetch 2 instructions at the same time and execute them in parallel >if there were no data dependency?? I had a fairly sarcastic reply all typed in, but I'll spare you... Multiflow's TRACE does all of the above, and I think I can make a much stronger claim for its being a RISC than some of the other machines listed in the other thread of discussion currently going on in this newsgroup: load/store, no microcode, simple instructions, delayed branches, and the ultimate in moving runtime functionality to compile-time (trace-scheduling!). 
We could have an interesting discussion on what it would take to realize a similar VLIW on a chip, though -- you need pretty high interconnectivity, and a very wide instruction cache to tell all the functional units what to do. Bob Colwell mfci!colwell@uunet.uucp Multiflow Computer 175 N. Main St. Branford, CT 06405 203-488-6090
root@mfci.UUCP (SuperUser) (04/21/88)
In article <50217@sun.uucp> dre%ember@Sun.COM (David Emberson) writes: > > >I vaguely remember hearing someone about ten years ago give a talk on the >subject of architecture comparison. Unfortunately I do not remember who it >was, but they defined two measures of an architecture, R and S. The R measure >was a metric of the number of register references and the S measure was a >metric of the number of memory references (storage) in a given piece of code. >Presumably a high R/S is desirable, although this is far from certain in the >presence of write-back caches. In any case, these indicators are independent >of implementation technology. It would be nice if we could develop some such >set of metrics which would allow architectures to be compared for efficiency >and ease of implementation. I haven't a clue as to how to measure something >for ease of implementation--I'll leave that to some enterprising person. I am >in full agreement with your statement that implementations are the most >interesting. We can get real numbers from them and identify real areas for >improvement. > Dave Emberson > (dre@sun.com) I bet you're remembering the Military Computer Family work of the mid-to-late '70s done at Carnegie-Mellon. R was the "canonical processor cycles" for a benchmark; S was the program size, and M was the memory bus traffic. In our "Computers, Complexity, and Controversy" paper in Computer magazine Sept. 1985 we applied this evaluation method to Berkeley's RISC-II, mostly as an intellectual exercise, but partly to show that the field had already outgrown this kind of approach to architectural evaluation. My feeling was that the fundamental problem was that MCF was extremely careful to separate implementation from architecture, and RISC is quite willing to mix the two freely (trading object code compatibility across products in a company's product line (Sun-4/Sun-3) for the added performance available when you can max-out a given set of implementation constraints). 
It's probably easier to gauge "difficulty-of-implementation" than "ease"; if, in a blindfold test, you gave me the VAX instruction set and RISC-I's, I'd have no problem picking the one I'd find easier to implement, and I'd have a list of reasons why. But of course, then you'd want to quantify how much easier, and that's a good question. Bob Colwell mfci!colwell@uunet.uucp Multiflow Computer 175 N. Main St. Branford, CT 06405 203-488-6090
paulr@granite.dec.com (Paul Richardson) (04/21/88)
In article <221@granite.dec.com> allen@decwrl.dec.com (Allen Akin) writes: >In article <20123@pyramid.pyramid.com> csg@pyramid.pyramid.com (Carl S. Gutekunst) writes: >> >>DEC's big RISC engine goes by the code name "Titan." It is something around 10 >>MIPS, up to 10 tightly-coupled CPUs. It's supposed to be a big secret, so don't >>spead this around. :-) >> >><csg> > >Just to clarify things for the masses: > >Titan is a research RISC machine designed and implemented several years >ago by DEC's Western Research Lab in Palo Alto. It's been mentioned in >a number of papers published by WRL (see Wall and Powell's paper in >ASPLOS II, for example) so feel free to spread it around. :-) > >Allen More Titan history: I was on a team of engineers trying to turn the research Titan into Titan the product. Obviously we never succeeded, mostly for political reasons, some valid, some not:
  Titan: 'RISC' machine designed to run at 40 ns; I believe they are running at 42.
  Scalar processor consisted of datapath and 64KB I and D caches (split); line size was 4 longwords.
  Separate coprocessor.
  4 banks of 64 32-bit general purpose registers.
  128 Mbytes of main store.
  Entire processor (icache, dcache, datapath, and floating point coprocessor) was constructed from 24-pin DIP components (100K ECL).
  Processor boards were approx 20" x 28", something like that.
  7-slot I/O bay supported disks (currently RA81s), enet, serial lines, and a fiber optic link.
  Machine was designed as a single-user workstation for members of WRL.
  Languages at the time included Modula-2, C, and Fortran (I think they have Lisp up now too).
A system (the above hardware running 4.3 BSD) performed, on an aggregate basis, at 10 times a 780. Fully functional protos were completed 2 years ago. I think it is still the fastest running uniprocessor in DEC. Approximately the same compiler technology as MIPS. The papers mentioned by Allen should clue you in.
walter@garth.UUCP (Walter Bays) (04/22/88)
In article <50110@sun.uucp> ram@sun.UUCP (Renu Raman) writes: > Query: Are there any RISC(y) processors that is running an OS other > than UNIX? Query 2: What new commercial processors have been introduced in the last five (or so) years that run an OS other than UNIX? Partial Answer 2: IBM PC, Apple Macintosh, Apollo Query 3: What new commercial processors have been introduced in the last five (or so) years that do not run UNIX?
liz@hpcupt1.HP.COM (Liz Peters) (04/23/88)
> Query: Are there any RISC(y) processors that is running an OS other > than UNIX? HP's commercial OS, MPE, runs on HP's Precision Architecture. This combination is offered in the HP3000 line of computers. Liz Peters hplabs!hpda!liz
cdshaw@alberta.UUCP (Chris Shaw) (04/26/88)
In article <219@granite.dec.com> paulr@granite.UUCP (Paul Richardson) writes: >RISC: > 1) A machine in which the instruction set is designed/chosen based > on what makes the most sense to put into the hardware.... > > 2) ..a characteristic of RISC machines that they have a 'large' > general purpose register file. ... >/pgr One of the 801 people (Blasgen) gave a talk here a while ago about 801, and the associated philosophy. Back then, what "Reduced" meant was "reduced instruction time". That is, the design goals were to have the simplest instructions (nop/add/logic...) take one clock. Clearly, more complicated stuff like multiply would take longer, but shortness of TIME was the main design goal. Now, an ethic of this kind will lead a designer down a restricted design path: Simple addressing, pipelines, caches, etc. If I recall right, the 801 was not a single-chip machine, so area restrictions did not apply (as much). Given that the Berkeley and Stanford people wanted a single-chip CPU, the silicon area restriction applies, so "Reduced" starts to mean "reduce the NUMBER of instructions (so we can fit something useful on chip)". I think that applying RISC to mean Reduced TIME is the only thing that makes 100% sense as a "commandment". Reducing NUMBER of instructions will probably come out in the wash. -- Chris Shaw cdshaw@alberta.UUCP (via watmath, ihnp4 or ubc-vision) University of Alberta CatchPhrase: Bogus as HELL !
mcp@ziebmef.UUCP (Marc Plumb) (04/29/88)
garyb@hpmwtla.HP.COM (Gary Bringhurst) writes: >Why has no one mentioned the Inmos Transputers? They are certainly Risc'ish. Sigh... Is a processor with message passing, time-slicing, and context-switching in microcode a RISC? I honestly don't know where the Transputer belongs, but 3 registers is a bit of a change from traditional RISC architectures. The Transputer is a RISC in the "Relegate Important Stuff to Compiler" sense - the amount of useful stuff that's been stripped from the instruction set on the grounds that it can be implemented in terms of existing instructions is astounding. For example, since the Transputer considers any non-zero value to be true, the magnitude comparison operations have been reduced to signed greater-than, subtraction (zero result means equal inputs) and "equal to constant", which can be used with a zero argument to implement logical not. Sufficient, certainly (it's Turing-equivalent), but pleasant to use?? Sorry to go on, but the RISCiness of Transputers is a fabrication of buzzword-happy marketroids, and I wouldn't want them to delude reasonably sane people. -- -Colin (ncrcan!ziebmef!mcp)
livesey@sun.uucp (Jon Livesey) (05/02/88)
In article <358@ziebmef.UUCP>, mcp@ziebmef.UUCP (Marc Plumb) writes: > > garyb@hpmwtla.HP.COM (Gary Bringhurst) writes: > > >Why has no one mentioned the Inmos Transputers? They are certainly Risc'ish. > > Sigh... > > [much deleted] > > Sufficient, certainly (it's turing-equivalent), but pleasant to use?? > > Sorry to go on, but the RISCiness of Transputers is a fabrication of > buzzword-happy marketroids, and I wouldn't want them to delude reasonably > sane people. You make some very good points about the transputer. Unfortunately you went a tiny bit overboard in the last two sentences. Pleasantness-of-use is not an implicit guarantee for RISC machines. Why should it be? The RISCiness of Transputers is not a fabrication of marketeers. The Transputer turns up in perfectly respectable academic surveys of RISC machines. One reference is Tabak D. "RISC Architecture", Research Studies Press, 1987. Tabak is Abrahams-Curiel Professor of Computer Engineering at Ben Gurion University, Israel, and has a cross appointment at George Mason University. Tabak is careful to explain why he includes the Transputer as a RISC machine: "Although the machine language has 111 instructions (approximately as in the Pyramid or Ridge), there is only a *single instruction format* [Tabak's emphasis] and a very simple one." {page 98} Tabak goes on to explain the Transputer instruction format and instruction set, emphasising that they "*eliminate the need* for *complicated addressing modes*" [Tabak's emphasis again]. He describes their "prefix", which allows any operand to be manipulated in the Operand Register before being used, and the "operate" code which allows an instruction to be applied to the operands already loaded into the three operand Evaluation Stack. 
Clearly, this does not make for simple or intuitive assembler language programming, but Tabak makes the comment: "It should be stressed that the regular user is not supposed to program in the machine language [he gives a short description of Occam, deleted here]" {page 100} In an introductory section, Tabak lists eight criteria for RISCness:

                                                     Transputer
                                                     ----------
  1. Few instructions (< 100 is best)                111
  2. Few addressing modes (1 or 2)                   one
  3. Few instruction formats                         one
  4. Single cycle execution                          true for 80% of inst.
  5. Memory access by load/store instruction only    yes
  6. Large register set                              none, but 4k on-chip
                                                     memory [there are six
                                                     utility regs, such as
                                                     PC, etc.]
  7. Hardwired control unit                          no, microcoded
  8. HLL support reflected in architecture           yes

Using Tabak's criteria, the Transputer violates one of eight, satisfies three at least loosely, and satisfies four more completely. Tabak comments that the violation of using microcode is also seen in some other systems, and may be forgiven by advancing technology. Jon.