eggert@sdcrdcf.UUCP (09/06/87)
In article <288@tropix.UUCP> mjl@tropix.UUCP (Mike Lutz) writes
... the B1700 was a pleasure to work with at the microcode level (and
anyone who has done serious microprogramming knows what an amazing
statement that is!) While not a "RISC" machine, the B1700 was
optimized for emulation, and the pieces just fit together well....
What irony! David Patterson, Mr. RISC, wrote his PhD thesis at UCLA in 1975 on
formal verification of microcode for the D-machine (as Lutz says, really the
Burroughs 1700). Partly because of the D-machine's pleasantness, Patterson was
surprisingly successful. But the aggravation of verifying microcode convinced
him that microcode causes more problems than it cures; he turned to the design
of machines that don't need microcode. So that unlikely couple, formal
verification and dynamically microcodable CISC, helped spawn RISC!
bpendlet@esunix.UUCP (Bob Pendleton) (09/08/87)
in article <4782@sdcrdcf.UUCP>, eggert@sdcrdcf.UUCP (Paul Eggert) says:
-
-In article <288@tropix.UUCP> mjl@tropix.UUCP (Mike Lutz) writes
-
- ... the B1700 was a pleasure to work with at the microcode level (and
- anyone who has done serious microprogramming knows what an amazing
- statement that is!) While not a "RISC" machine, the B1700 was
- optimized for emulation, and the pieces just fit together well....
-
-What irony! David Patterson, Mr. RISC, wrote his PhD thesis at UCLA in 1975 on
-formal verification of microcode for the D-machine (as Lutz says, really the
-Burroughs 1700). Partly because of the D-machine's pleasantness, Patterson was
-surprisingly successful. But the aggravation of verifying microcode convinced
-him that microcode causes more problems than it cures; he turned to the design
-of machines that don't need microcode. So that unlikely couple, formal
-verification and dynamically microcodable CISC, helped spawn RISC!
What a weird coincidence! I just happen to have copies of David Patterson's
papers, "Strum: Structured Microprogram Development System for Correct
Firmware," IEEE Transactions on Computers, Vol. C-25, No. 10, October 1976, and
"An Experiment In High Level Language Microprogramming and Verification,"
CACM, October 1981, Volume 24, Number 10, sitting on my desk. I was rereading
them late last week.
I worked on the B1700 at the University of Utah in the early 1970s and have
been telling people for several years that I thought RISC looked like
vertical microcode "done right." If what you say is true, then I was right.
Imagine that.
I worked on the B1700 at the University of Utah in the early 1970s and have
been telling people for several years that I thought RISC looked like
vertical microcode "done right." If what you say is true, then I was right.
Imagine that.
In five years I expect that RISC will be passe, that WISC ( wide instruction
set computers ) will be all the rage. WISC will be horizontal microcode
"done right." It will have all the advantages of RISC, but WISC machines will
run faster and cost less. We haven't abandoned microcode, we've just let it
out of the closet.
Bob Pendleton
--
Bob Pendleton @ Evans & Sutherland
UUCP Address: {decvax,ucbvax,ihnp4,allegra}!decwrl!esunix!bpendlet
Alternate: {ihnp4,seismo}!utah-cs!utah-gr!uplherc!esunix!bpendlet
I am solely responsible for what I say.
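A toy interpreter may make Pendleton's WISC/horizontal-microcode picture concrete: one wide word carries a slot per functional unit, and all slots issue in the same cycle. The encoding and two-unit mix below are invented purely for illustration (no real machine is modeled):

```python
# Toy model of a wide-instruction (WISC/VLIW-style) machine: each
# instruction word carries one operation per functional unit, and all
# slots issue in the same cycle.  Encoding and unit mix are invented.

NOP = None

def run(program, regs):
    """Execute a list of wide words; each word is (alu_slot, mem_slot).
    alu_slot = (dest, src1, src2); mem_slot = (dest, address)."""
    memory = {0: 10, 1: 20, 2: 30}
    cycles = 0
    for alu, mem in program:
        # Both slots read their operands at the start of the cycle,
        # mimicking lock-step horizontal microcode.
        alu_result = None if alu is NOP else (alu[0], regs[alu[1]] + regs[alu[2]])
        mem_result = None if mem is NOP else (mem[0], memory[mem[1]])
        if alu_result is not None:
            regs[alu_result[0]] = alu_result[1]
        if mem_result is not None:
            regs[mem_result[0]] = mem_result[1]
        cycles += 1
    return cycles

regs = {"r0": 0, "r1": 1, "r2": 2, "r3": 0}
# One wide word does an add AND a load in a single cycle.
program = [
    (("r3", "r1", "r2"), ("r0", 0)),   # r3 = r1+r2 ; r0 = mem[0]
    (("r3", "r3", "r0"), NOP),         # r3 = r3+r0
]
cycles = run(program, regs)
print(cycles, regs["r3"])  # three operations retire in two cycles
```

The point of the sketch is only that the compiler, not the hardware, decides which operations share a cycle — exactly the property horizontal microcode always had.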
mjl@tropix.UUCP (Mike Lutz) (09/08/87)
In article <4782@sdcrdcf.UUCP> eggert@SM.Unisys.com (Paul Eggert) writes:
>In article <288@tropix.UUCP> mjl@tropix.UUCP (Mike Lutz) writes
>
> ... the B1700 was a pleasure to work with at the microcode level (and
> anyone who has done serious microprogramming knows what an amazing
> statement that is!)
>
>What irony! David Patterson, Mr. RISC, wrote his PhD thesis at UCLA in 1975 on
>formal verification of microcode for the D-machine (as Lutz says, really the
>Burroughs 1700).

Just to clear up a misconception: the B1700 was *not* the D-machine.  The two
were designed and built by two different divisions in Burroughs, and, as far
as I can tell, had little influence on one another.  The D-machine had two
levels of emulation; the B1700 was a vertical microengine.  I pity the poor
soul who might have tried to nanocode a D-machine to make it into a B1700.

However, the comments on David Patterson were right on target.  What he
demonstrated in his thesis was startling to the microprogramming community.
Most folks were just starting to address the problems of high level languages
on horizontal machines, when Patterson showed a system that a) balanced the
vertical and horizontal resource utilization in the D-machine, b) was
verifiable, and c) in one case generated a smaller & faster emulator than one
coded by hand (the accepted norm at the time)!
sd@erc3ba.UUCP (S.Davidson) (09/11/87)
In article <475@esunix.UUCP>, bpendlet@esunix.UUCP (Bob Pendleton) writes:
>
> In five years I expect that RISC will be passe, that WISC ( wide instruction
> set computers ) will be all the rage. WISC will be horizontal microcode
> "done right." It will have all the advantages of RISC, but WISC machines will
> run faster and cost less. We haven't abandoned microcode, we've just let it
> out of the closet.
>
> Bob Pendleton
> --

It's happened already, though they are not all the rage yet.  They are called
Very Long Instruction Word machines, and one of the originators, Josh Fisher,
did his dissertation on global compaction of horizontal microcode.  Josh
moved to Yale after he graduated, and then moved to a company to build a VLIW
machine.  I don't know the current status of this machine, though.

At Yale, though, Josh got some very impressive speedups from unrolling loops
and basically running compaction on them, assuming a lot of available
resources.  I don't know of any results on real hardware, however.

By the way, I wouldn't say that RISCs are vertical microcode engines done
right.  They just include a lot of stuff not necessary in microcode, like
direct addressing and multiplies.  It has never been that hard to generate
compilers for vertical microcode.
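The unroll-then-compact scheme Davidson attributes to Fisher's Yale work can be sketched as a greedy list scheduler over renamed loop bodies. This is a toy local-compaction pass with an invented dependence format, not Fisher's trace scheduler:

```python
# Sketch of "unroll, then compact": unroll a loop body with renamed
# temporaries, then greedily pack operations whose inputs are ready
# into the same wide word.  Toy local compaction, invented op format.

def compact(ops, width):
    """ops: list of (dest, srcs) in program order.  An op may issue
    once all its sources were produced in an earlier word.  Returns
    the number of wide words needed."""
    ready_at = {}          # value name -> word index when available
    words = []             # each word holds up to `width` ops
    for dest, srcs in ops:
        earliest = max((ready_at.get(s, 0) for s in srcs), default=0)
        w = earliest
        while w < len(words) and len(words[w]) >= width:
            w += 1         # first word at/after `earliest` with a free slot
        while len(words) <= w:
            words.append([])
        words[w].append((dest, srcs))
        ready_at[dest] = w + 1
    return len(words)

# body of "a[i] = b[i] + c": load, add, store -- a serial chain
body = [("t", ["b"]), ("u", ["t", "c"]), ("a", ["u"])]

# unrolled 4x with renamed temporaries: the iterations are independent
unrolled = []
for i in range(4):
    unrolled += [(f"t{i}", [f"b{i}"]),
                 (f"u{i}", [f"t{i}", "c"]),
                 (f"a{i}", [f"u{i}"])]

serial = compact(body, width=1) * 4   # one op per word, 4 iterations
wide   = compact(unrolled, width=4)   # four slots per wide word
print(serial, wide)                   # unrolling exposes the parallelism
```

Twelve serial words collapse to three wide ones here because the renamed iterations have no cross-iteration dependences — the "impressive speedups" come from exactly this effect, limited in practice by branches and real dependences.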
carl@otto.COM (Carl Shapiro) (09/13/87)
>... Josh Fisher, did his dissertation on global compaction of horizontal
>microcode. Josh moved to Yale after he graduated, and then moved to a
>company to build a VLIW machine. I don't know the current status of
>this machine ...

He (and others, some who also worked on the project at Yale) has built the
machine in question.  It's called the TRACE computer, and is being produced
and sold by Multiflow Computer, Inc. in Branford, Connecticut.
bpendlet@esunix.UUCP (Bob Pendleton) (09/14/87)
in article <347@erc3ba.UUCP>, sd@erc3ba.UUCP (S.Davidson) says:
-
- In article <475@esunix.UUCP>, bpendlet@esunix.UUCP (Bob Pendleton) writes:
--
-- In five years I expect that RISC will be passe, that WISC ( wide instruction
-- set computers ) will be all the rage. WISC will be horizontal microcode
-- "done right." It will have all the advantages of RISC, but WISC machines will
-- run faster and cost less. We haven't abandoned microcode, we've just let it
-- out of the closet.
--
-- Bob Pendleton
--
-
- It's happened already, though they are not all the rage yet. They are

Read my words.  Did I say that such machines did not exist?  It is hard to
make predictions about things that don't exist, easy when things already
exist.  In fact, such machines have existed for at least 15 years.  Recent
developments in compiler technology have made them practical for use by
people with ordinary budgets.  Microcoding has been very expensive.

- called Very Long Instruction Word machines, and one of the originators,
- Josh Fisher, did his dissertation on global compaction of horizontal

A good reference is "Trace Scheduling: A Technique for Global Microcode
Compaction," Joseph A. Fisher, IEEE Transactions on Computers, vol. C-30,
no. 7, July 1981.

By the by, VLIW(TM) and Trace Scheduling(TM) are trademarks of Multiflow
Computer, Inc., so I chose to use WISC instead.

- microcode. Josh moved to Yale after he graduated, and then moved to a
- company to build a VLIW machine. I don't know the current status of this
- machine, though. At Yale, though, Josh got some very impressive speedups
- from unrolling loops and basically running compaction on them, assuming a
- lot of available resources.
- I don't know of any results on real hardware, however.

A good reference is "Bulldog: A Compiler for VLIW Architectures," John R.
Ellis, MIT Press, 1986, ISBN 262-05034-X.

The hardware is advertised regularly in Aviation Week & Space Technology.
Aviation trade publications have lots of computer-related info that some CS
types seem to be totally unaware of.

- By the way, I wouldn't say that RISCs are vertical microcode engines done
- right. They just include a lot of stuff not necessary in microcode, like
- direct addressing and multiplies. It has never been that hard to generate
- compilers for vertical microcode.

The world YOU live in doesn't need direct addressing and multiplies in
microcode.  The world I live in requires direct addressing and multiplies,
even floating multiplies, in microcode.  Outside of your own world, your
assumptions do not apply.

It has been very hard to write GOOD compilers for horizontal microcode.

Please read what I said, not what you wanted me to say.  The original article
contained references to two key papers in this field.  Using anecdotes and
rumors to "correct" me is as pointless as my flaming you in this reply.  At
least I've provided references to cover your anecdotes.

Bob P.
--
Bob Pendleton @ Evans & Sutherland
UUCP Address: {decvax,ucbvax,ihnp4,allegra}!decwrl!esunix!bpendlet
Alternate: {ihnp4,seismo}!utah-cs!utah-gr!uplherc!esunix!bpendlet
I am solely responsible for what I say.
turner@uicsrd.UUCP (09/16/87)
/* Written 9:59 am Sep 14, 1987 by fay@encore.UUCP in uicsrd:comp.arch */
>
> Clancy et al. - Proc. Summer 1987 Usenix Conf.) which describes some of
> Multiflow's hardware and software. Truly incredible stuff, if it's for
> real.
> ....
> Normally "conditional jumps occur every five to eight instructions",
> making parallelization very difficult. So simply take a trace of
> normal program execution and have the compiler assume it will USUALLY
> execute that trace.
> ....
> Then compile the new program as if it were not going to take the
> seldom-used branches and plunge ahead.
> ....
>
> My question to those parallel machine compiler writers out there: is anyone
> writing compilers for non VLIW machines using the same methods? Why can't,
> say, an Alliant-type (or Cedar-type, etc.) machine with hardware lock-step
> between computational elements get a trace execution, recompile assuming
> no branches, and when the 1000th instruction diverts from the "chosen
> path", just back up the CE's and undo the damage?
>                        ^^^^^^^^^^^^^^^^^^^
> peter fay
> fay@multimax.arpa
>
/* End of text from uicsrd:comp.arch */

The problem is one of dynamic vs. static allocation of operations to
functional units.  In a VLIW machine the compiler allocates operations to
functional units at COMPILE time.  The compiler knows which operations should
be undone when an unexpected (unpredicted?) branch occurs.  In any machine
that dynamically allocates iterations of a loop to CE's it is VERY difficult
to determine what operations must be undone, since an early iteration could
branch out of the loop after some number of other iterations have finished.

Notice that vector operations within a CE are statically allocated to the
sections of pipe, so vector operations could have conditional branches by
allowing 'back up'.  Unless the vector register length is very long, I have
doubts as to the effectiveness of this, however.
---------------------------------------------------------------------------
Steve Turner (on the Si prairie - UIUC CSRD)
UUCP:    {ihnp4,seismo,pur-ee,convex}!uiucdcs!uicsrd!turner
ARPANET: turner%uicsrd@a.cs.uiuc.edu
CSNET:   turner%uicsrd@uiuc.csnet          *-)) Mutants for
BITNET:  turner@uicsrd.csrd.uiuc.edu            Nuclear Power (-%
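The mechanism fay's quoted paragraphs describe — compile for the usual trace, pay extra only when execution leaves it — reduces to a simple cost gamble. All cycle counts in this sketch are invented for illustration:

```python
# Minimal sketch of the trace-scheduling bet: the compiler packs the
# profiled common path into wide instructions and emits slower
# compensation code for the rare off-trace branches.  Invented costs.

TRACE_CYCLES = 2         # hot path after compaction: 6 ops in 2 wide words
COMPENSATION_CYCLES = 5  # extra work to undo/redo state off the trace
SERIAL_CYCLES = 6        # unscheduled code: one operation per cycle

def cycles_trace_scheduled(branch_outcomes):
    """branch_outcomes: True = execution stayed on the predicted trace."""
    total = 0
    for on_trace in branch_outcomes:
        total += TRACE_CYCLES
        if not on_trace:
            total += COMPENSATION_CYCLES  # the gamble lost this time
    return total

def cycles_unscheduled(branch_outcomes):
    return SERIAL_CYCLES * len(branch_outcomes)

# "conditional jumps every five to eight instructions", mostly predictable:
outcomes = ([True] * 9 + [False]) * 100   # 90% of runs follow the trace

fast = cycles_trace_scheduled(outcomes)
slow = cycles_unscheduled(outcomes)
print(fast, slow)   # the gamble wins when the profile is right
```

With a 90% hit rate the compacted code wins easily; make the off-trace case common and the compensation cost eats the gain, which is why the static/dynamic allocation distinction in the reply above matters.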
eugene@pioneer.arpa (Eugene Miya N.) (09/16/87)
In article <478@esunix.UUCP> bpendlet@esunix.UUCP (Bob Pendleton) writes:
>-- In five years I expect that RISC will be passe, that WISC ( wide instruction
>-- set computers ) will be all the rage. WISC will be horizontal microcode
>- It's happened already, though they are not all the rage yet. They are
>A good reference is "Trace Scheduling: A Technique for Global Microcode
>Compaction," Joseph A. Fisher, IEEE Transactions on Computers, vol. C-30,
>no. 7, July 1981.
>
>By the by, VLIW(TM) and Trace Scheduling(TM) are trademarks of Multiflow
>Computer, Inc., so I chose to use WISC instead.

Added note: the latest copy of Computer has yet another CISC/RISC debate
(this time including Mike Flynn).  I don't think RISCs will disappear; they
will become passe as a design fad.  I expect more specialized RISCs, tuned
toward specific applications like signal processing but more general purpose
than systolic arrays.  You won't see debate in this group in favor of CISC
because too few software people read the group, and I don't mean programmers,
I mean the kinds of people pushing tagged architectures, etc.  The group
lacks balance.  You will know when CISC is dead when the 370 & the VAX
disappear (I have this bridge...).  There isn't enough experience with
ELI/VLIW/WISC, and too few people (like count on one hand) know how to code
these types of machines, to make me confident this is the next fad.  The
Trace (machine) that I saw at least didn't crash (running U*x).

From the Rock of Ages Home for Retired Hackers:

--eugene miya
  NASA Ames Research Center
  eugene@ames-aurora.ARPA
  "You trust the `reply' command with all those different mailers out there?"
  {hplabs,hao,ihnp4,decwrl,allegra,tektronix,menlo70}!ames!aurora!eugene
johnl@ima.ISC.COM (John R. Levine) (09/16/87)
In article <347@erc3ba.UUCP> sd@erc3ba.UUCP (S.Davidson) writes:
>[Horizontal microcode RISC machines have]
>happened already, though they are not all the rage yet. They are
>called Very Long Instruction Word machines, and one of the originators,
>Josh Fisher, did his dissertation on global compaction of horizontal
>microcode. Josh moved to Yale after he graduated, and then moved to a
>company to build a VLIW machine. ...

Josh's company, Multiflow Computer, is shipping their smallest minisuper, the
Trace 7/200.  It runs real fast, e.g. LINPACK 6.0 mflops compared to, say, an
IBM 3090-200's 6.8 mflops, which is not bad for a machine that costs $300K.
According to people I know there, it turned out to run faster than they
projected, and in some customer benchmarks outperformed a Cray X/MP.

The Trace 7 has a 256-bit instruction word; they're working on 512-bit and
1024-bit versions.  Unlike most other minisupers, there is no vector
processing hardware.  It executes one enormous instruction at a time, and
there is considerable compiler cleverness involved in getting as much useful
work as possible done in a single enormous instruction.  Not using vectors
means that existing cruddy Fortran code can be compiled effectively without
having to rework it to make it more easily parallelizable.

[Disclaimer: No connection to Multiflow except that I know a lot of the
people who work there.]
--
John R. Levine, Cambridge MA, +1 617 492 3869
{ ihnp4 | decvax | cbosgd | harvard | yale }!ima!johnl, Levine@YALE.something
The Iran-Contra affair: None of this would have happened if Ronald Reagan
were still alive.
bcase@apple.UUCP (09/17/87)
In article <2785@ames.arpa> eugene@pioneer.UUCP (Eugene Miya N.) writes:
>Added note: the latest copy of Computer has yet another CISC/RISC
>debate (this time including Mike Flynn).

Sigh.

>I don't think RISCs will
>disappear, they will become passe as a design fad. I expect more
>specialized RISCs, tuned toward specific applications like signal
>processing but more general purpose than systolic arrays. You won't see
>debate in this group in favor of CISC because too few software people
>read the group, and I don't mean programmers, I mean the kinds of people
>pushing tagged architectures, etc.

RISC will become passe as a design fad as soon as something comes along to
replace the compiler.  In other words, the "passeness" of RISC is nowhere in
sight.  I agree that specialized RISCs will appear, but that has nothing to
do with RISC being a fad; rather it has to do with the need for specialized
processors and the ease with which RISC processors can be specialized.
Arguments in favor of tagged architectures aren't necessarily arguments
against RISC.  See SOAR, SPUR, and SPARC.

>The group lacks balance. You will
>know when CISC is dead when the 370 & the VAX disappear (I have this
>bridge...). There isn't enough experience with ELI/VLIW/WISC and too
>few people (like count on one hand) know how to code these types of
>machines to make me confident this is the next fad.

You are right, there isn't enough experience, but that is true of every new
thing when it is new!  Compilers will (should) be the "people" coding these
machines.  Personally, I consider VLIW one of the very few truly new ideas to
come along (and in some sense, it isn't really new).

>The Trace (machine) that I saw at least didn't crash (running U*x).

See?  It's a great machine!  :-)
andy@rocky.UUCP (09/18/87)
In article <6266@apple.UUCP> bcase@apple.UUCP (Brian Case) writes:
>In article <2785@ames.arpa> eugene@pioneer.UUCP (Eugene Miya N.) writes:
>>Added note: the latest copy of Computer has yet another CISC/RISC
>>debate (this time including Mike Flynn).
>Sigh.
>RISC will become passe as a design fad as soon as something comes along
>to replace the compiler. In other words, the "passeness" of RISC
>is nowhere in sight.

Flynn used the same compiler/optimizer with different final code generators
to study a number of different architectures.  (The compiler and optimizer
were written under John Hennessy's direction a few years ago.  Yes, that
Hennessy.)  All of the architectures had the same ALU; they differed in
instruction format and register set architecture.  (They compared different
register window schemes with monolithic register sets of various sizes.)
Since all of the tests used the same compiler and optimizer, much of the
remaining differences were due to differences between the architectures.

One result was that more compact instruction formats were more effective at
reducing instruction traffic than expanding the instruction cache.  ``[The
360-like CISC] achieves the same memory performance as [the RISC
architecture], but uses an instruction cache of only half the [RISC] cache
size.''  Flynn et al. argue that this decoding hardware is smaller than the
I-cache necessary for equivalent RISC performance.  Remember, the critical
path in MIPS, MIPS-X, and the Berkeley RISC processors is not in the control
logic; I don't know about MIPS Co's product.

``From data traffic considerations, it seems that the [360-like CISC] with a
register set of about size 16 plus a small data cache is preferable to
multiple register sets for most area combinations.''  Maybe instruction
bandwidth isn't important, but data bandwidth seems to be.

As Flynn and company conclude, ``@i[Balanced optimization] is the key to
overall instruction set efficiency.''  Let's see some data from RISC folks.

-andy

ps - The article is in the September 87 issue of IEEE Computer.
--
Andy Freeman
UUCP:  {arpa gateways, decwrl, sun, hplabs, rutgers}!sushi.stanford.edu!andy
ARPA:  andy@sushi.stanford.edu
(415) 329-1718/723-3088 home/cubicle
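The cache-size claim Freeman quotes is, at bottom, a code-density argument. A toy calculation makes the arithmetic explicit; all numbers here are invented, and it deliberately ignores that the two encodings may also execute different instruction counts for the same work:

```python
# Toy arithmetic behind the quoted Flynn claim: if the denser encoding
# averages half the bytes per instruction, the same instruction
# working set fits in half the I-cache, and fetch bandwidth per
# executed instruction also halves.  All numbers are illustrative.

working_set = 4096             # instructions touched by the hot code
risc_isize, cisc_isize = 4, 2  # average bytes per instruction (invented)

risc_cache_bytes = working_set * risc_isize   # bytes to hold the set
cisc_cache_bytes = working_set * cisc_isize

n = 1_000_000                  # instructions executed from the cache
risc_fetch = n * risc_isize    # instruction-fetch traffic in bytes
cisc_fetch = n * cisc_isize

print(risc_cache_bytes // cisc_cache_bytes,   # cache-size ratio
      risc_fetch // cisc_fetch)               # fetch-bandwidth ratio
```

Both ratios are just the encoding-density ratio, which is why Case's rebuttal below attacks the compiler assumptions rather than the arithmetic.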
sd@erc3ba.UUCP (S.Davidson) (09/18/87)
> A good reference is "Trace Scheduling: A Technique for Global Microcode
> Compaction," Joseph A. Fisher, IEEE Transactions on Computers, vol. C-30,
> no. 7, July 1981.

Right after our paper on microcode compaction techniques, in which we had the
pleasure of killing off a research area by showing that the problem was
solved.  We didn't really invent any of the compaction techniques in that
paper, by the way, but implemented and compared the popular ones.  The
reference is "Some Experiments in Local Microcode Compaction for Horizontal
Machines," by S. Davidson, D. Landskov, B. D. Shriver, and P. W. Mallett,
reprinted in "Advances in Microprogramming," ed. by Mallach and Sondak,
Artech House, 1983 (second ed.).  A better article (but without the results)
is "Local Microcode Compaction Techniques," Computing Surveys, Sept. 1980,
with David Landskov as first author.

Unfortunately I haven't seen any complete solutions for the global compaction
problem.  Josh's ideas are still the best, I think, but the jury is still
out.

> By the by, VLIW(TM) and Trace Scheduling(TM) are trademarks of Multiflow
> Computer, Inc., so I chose to use WISC instead.

I wonder if Trace Scheduling as a trademark would stand up in court.  It was
used as a description of a particular algorithm in several papers before this
company existed.

Wide Instruction _Set_ Computer doesn't seem right.  A reduced instruction
set makes sense (the set is reduced, not the instructions), but what is a
wide instruction set?  How about wide instruction computer?  I'm not sure
that makes any more sense, though.  VLIW seems the best description; too bad
they grabbed the term.

> - microcode. Josh moved to Yale after he graduated, and then moved to a
> - company to build a VLIW machine. I don't know the current status of this
> - machine, though. At Yale, though, Josh got some very impressive speedups
> - from unrolling loops and basically running compaction on them, assuming
> - a lot of available resources.
> - I don't know of any results on real hardware, however.
>
> A good reference is "Bulldog: A Compiler for VLIW Architectures," John R.
> Ellis, MIT Press, 1986, ISBN 262-05034-X.

The reference to their simulated results is "Using an Oracle to Measure
Potential Parallelism in Single Instruction Stream Programs," by Alex Nicolau
and Josh, 14th Annual Microprogramming Conference, pp. 171 - 182.  Alex was
Josh's student; he is now a professor at Cornell (or was 2 years ago when he
wrote a paper for Micro 18).

> - By the way, I wouldn't say that RISCs are vertical microcode engines done
> - right. They just include a lot of stuff not necessary in microcode, like
> - direct addressing and multiplies. It has never been that hard to generate
> - compilers for vertical microcode.
>
> The world YOU live in doesn't need direct addressing and multiplies in
> microcode. The world I live in requires direct addressing and multiplies,
> even floating multiplies, in microcode. Outside of your own world, your
> assumptions do not apply.

You might be interested in reading "The Cultures of Microprogramming" by
Nick Tredennick, Micro 15, pp. 79 - 83.  Sheraga and Gieser have done some
very nice work on compilers for microcode with floating point and all that
stuff (one paper is in Micro 14; others have been in IEEE Trans. Comput. or
Software Eng., I forget which).  The real issue, however, is how RISC
machines are done "right" in comparison to vertical microcode engines,
considering the difference between a microcode engine and a computer.

> It has been very hard to write GOOD compilers for horizontal microcode.

I think I know that, having written one (a compiler, not necessarily a good
one).  It is very hard to write even bad compilers for horizontal microcode.
Your audience for the compiler makes a big difference too.  See my article
"Progress in High Level Microprogramming," in the July 1986 IEEE Software.
Not enough references there; Bruce made me take most of them out.  There is a
more extended article on high level microprogramming languages in the book
"Microprogramming Handbook," ed. by Stan Habib, out Real Soon Now.

> Please read what I said, not what you wanted me to say. The original article
> contained references to two key papers in this field. Using anecdotes and
> rumors to "correct" me is as pointless as my flaming you in this reply. At
> least I've provided references to cover your anecdotes.

Never meant to correct you, since there was nothing to correct.  I'm not sure
all the readers of this group are up on WICs or whatever.  By the way, what
is a reference for a 15 year old WIC?  I mean one like the Multiflow/Bulldog
machine, not a big heterogeneous horizontal word.

Scott Davidson
{ihnp4,allegra}!erc3ba!sd
bcase@apple.UUCP (09/18/87)
In article <600@rocky.STANFORD.EDU> andy@rocky.UUCP (Andy Freeman) writes:
>Flynn used the same compiler/optimizer with different final code generators
>to study a number of different architectures. (The compiler and optimizer
>were written under John Hennessy's direction a few years ago. Yes, that
>Hennessy.)

John Hennessy certainly knows up from down when it comes to compilers.  But,
in my humble opinion, many really important optimizations happen *after*
code generation; this is especially true for RISCs, I believe.  Looking at
the output of modern, commercial "optimizing" compilers, I am appalled at the
code quality for certain cases.  Just because a text book says that
optimization occurs before code generation doesn't mean that's the best way.

> All of the architectures had the same ALU; they differed in
>instruction format and register set architecture. (They compared different
>register window schemes with monolithic register sets of various sizes.)
**>Since all of the tests used the same compiler and optimizer, much of the**
**>remaining differences were due to differences between the architectures.**

This is the claim I don't believe, not even a little bit.

>Remember,
>the critical path in MIPS, MIPS-X, and the Berkeley RISC processors is
>not in the control logic; I don't know about MIPS Co's product.

I'm not so sure that I believe this statement.  It is true that, in most of
the cases listed, little *area* was spent, but, at least for the original
Stanford MIPS, the master pipeline controller was a real problem.

Remember, whether or not to "complexify" the instruction set definition is
driven (or should be) by what software (compiler, OS) wants/can deal with,
not *only* by what hardware can stand.  The fact that I can maintain cycle
time even if I "complexify" the instruction set does not mean it is the
right thing to do!  What if the compiler never emits those complex
instructions?

>``From data traffic considerations, it seems that the [360-like CISC]
>with a register set of about size 16 plus a small data cache is preferable
>to multiple register sets for most area combinations.''

Again, I question the compiler effort here.

>Maybe instruction bandwidth isn't important, but data bandwidth seems
>to be. As Flynn and company conclude, ``@i[Balanced optimization] is
>the key to overall instruction set efficiency.'' Let's see some data
>from RISC folks.

Bandwidth is not the only consideration: LATENCY is often more important
where loads/stores are concerned (at least in machines, like RISC II,
Am29000, and I suspect SPARC, that have a relatively low percentage of
loads/stores).  High instruction bandwidth is very important for RISC
machines; latency is also important, but there are techniques for dealing
with it so that it won't be so apparent at the chip boundary.  Techniques
like interleaving and using burst-mode memories (VDRAMs, SCDRAM, nibble-mode,
etc.) can deal with sequential bandwidth, but if it takes 2 milliseconds to
get the first word, who cares?  Latency, latency, latency.  Thus, arguments
against RISC founded on bandwidth requirements, directed at me at least,
will fall on deaf ears.

About the only real data that I can offer is that the percentage of
loads/stores for stack-cache machines (RISC II, SPARC, Am29000, etc.) is
often about half that observed in machines with only flat register files
(MIPS, etc.).
martin@felix.UUCP (Martin McKendry) (09/18/87)
In article <6266@apple.UUCP> bcase@apple.UUCP (Brian Case) writes:
>In article <2785@ames.arpa> eugene@pioneer.UUCP (Eugene Miya N.) writes:
>>Added note: the latest copy of Computer has yet another CISC/RISC
>>debate (this time including Mike Flynn).
>
>Sigh.

One of the vocal opponents of RISC (on whatever grounds) is/was Doug Jensen,
of CMU.  I have just heard that he is joining a startup called "Kendall
Square Research", who are building new hardware and software for atomic
laser blasters, or real-time control, or some such.  Now an interesting
thing to know would be what the hardware is to look like.  Anyone know?
--
Martin S. McKendry; FileNet Corp; {hplabs,trwrb}!felix!martin
Strictly my opinion; all of it
martin@felix.UUCP (Martin McKendry) (09/18/87)
Kendall Square Research is in Boston, 'behind MIT'.
--
Martin S. McKendry; FileNet Corp; {hplabs,trwrb}!felix!martin
Strictly my opinion; all of it
earl@mips.UUCP (Earl Killian) (09/19/87)
In article <6281@apple.UUCP>, bcase@apple.UUCP (Brian Case) writes:
> About the only real data that I can offer is that the percentage of
> loads/stores for stack-cache machines (RISC II, SPARC, Am29000, etc) is
> often about 1/2 that observed in machines with only flat register files
> (MIPS, etc.).

If the percentage for those machines is really half that of the MIPS-style
RISC machines, I suspect it is because either those machines have compilers
that generate unnecessary non-load/store instructions, or the architecture
requires extra non-load/store instructions to get the same work done (e.g.
using condition codes in RISC II and SPARC, and address arithmetic in the
Am29000).  We really need a unit of real work for integer programs, like the
flop for fp programs.  Then we could measure load/stores per workunit.

The data I have below suggest that for the MIPSco architecture/compiler the
savings varies widely, but never gets to 50%.  This data is basically the %
of load/stores that are due to register save/restore.  Register windows
would eliminate some, but not all, of these (load/stores for window
overflow/underflow should be factored in).  So these are an upper bound on
the savings.  (That is savings of load/stores, not of cycles.)

	espresso	 0.6%
	spice		 4.0%
	wolf		 5.6%
	yacc		10%
	diff		12%
	compress	12%
	uopt		18%
	nroff		28%
	ccom		38%

P.S. I'm actually a fan of register windows, even though the MIPSco
architecture doesn't have them.  However, I think some of the common wisdom
about register windows is wrong (e.g. how many load/stores they save) and
overstates their usefulness.  This is because the early work was done
without the benefit of optimizing compilers.  Too bad they didn't; with an
optimizing compiler they would have found only half as many physical
registers (i.e. silicon) are necessary to get the same performance.  The
SPARC folks, who do have a good compiler, discovered this too (but too late
to change the architecture to take advantage of it).
The worst thing about register windows is that they are sometimes used to justify multi-cycle load/store. For a program like spice, you save at most 4% and pay 32% (the % of remaining load/stores in spice) for every extra cycle added to load/store. Yuck.
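Killian's point that window overflow/underflow traffic must be factored in can be put into a toy model. The window count, frame size, and call-depth traces below are all invented for illustration:

```python
def traffic(depth_trace, n_windows, regs_per_frame=16):
    """Compare save/restore memory traffic (in registers moved) for a
    flat register file vs. register windows, over a trace of call
    depths.  Toy model: every frame saves regs_per_frame registers."""
    flat = 0          # flat file: save on every call, restore on every return
    win = 0           # windows: traffic only on overflow/underflow
    depth, spilled = 0, 0   # spilled = oldest frames currently in memory
    for d in depth_trace:
        while depth < d:                      # a call
            depth += 1
            flat += regs_per_frame
            if depth - spilled > n_windows:   # overflow: spill oldest frame
                win += regs_per_frame
                spilled += 1
        while depth > d:                      # a return
            depth -= 1
            flat += regs_per_frame
            if 0 < depth <= spilled:          # underflow: refill caller frame
                win += regs_per_frame
                spilled -= 1
    return flat, win

shallow = [1, 2, 3, 2, 3, 2, 1] * 100                # stays inside the windows
deep = list(range(1, 40)) + list(range(38, 0, -1))   # deep call chain

print(traffic(shallow, n_windows=8))   # windows pay nothing here
print(traffic(deep, n_windows=8))      # spill/fill traffic reappears
```

Oscillating calls that fit in the window stack generate no memory traffic at all, while one deep excursion brings much of it back — which is why save/restore percentages like the table above are an upper bound on what windows can save.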
aglew@ccvaxa.UUCP (09/20/87)
..> Talking about Flynn's article in Computer, CISC vs. RISC,
..> somebody quoted the conclusion that ``A 360-like CISC
..> with 16 registers and a moderate sized data cache''
..> may be the way to go. (paraphrased)

By the way, I am not so sure that you should call a 360-like instruction set
"CISC".  The IBM 360 is actually quite a simple machine: a limited number of
instruction formats, fairly regular register use (too few registers), a
limited number of addressing modes...  DON'T FLAME ME, PLEASE!!!!  I know
all too well the CISCy aspects of the 360: translate instructions, block
moves, and so on - but I just want to point out that, if you subtract a few
things, the 360 doesn't look too bad in a RISC light.  Of course, RISC began
when the IBM 801 group took up where the 360 left off, without marketing
pressure to force CISCy kluges...

Flynn is trading off register set size for memory+register -> register
operations, both in the context of a simple instruction set.  Note that he
is not trading off against all the complicated addressing modes that a VAX,
a true CISC, has.  It isn't written in Stone that a RISC has to be a
load-store architecture.  Most are, true, but only because critical
evaluation seems to fall on that side.  Flynn examines the alternative...

I've often thought that RISC might better be described as a "Reduced
Addressing Mode Machine", RAMM.  In fact, I wrote a paper in an undergrad
course on "RAMM/RISC/SEISM".

After finishing my undergrad course, being all fired up with RISC, I went to
work for a minicomputer manufacturer.  While waiting for a US visa, I had
them send up a processor manual.  The instruction set looked a lot like a
360 - base register, etc.  I wondered what I was getting into.  But then I
mapped out the instruction set and register usage patterns, and I said to
myself "Damn!  This machine could run damned fast!"  You see, while it
looked like an IBM 360, there weren't very many of the CISCy features that
had been forced upon Amdahl.  In fact, our CPUs do typically run one
instruction per cycle.  Sometimes more.

Andy "Krazy" Glew.  Gould CSD-Urbana.
USEnet:  ihnp4!uiucdcs!ccvaxa!aglew
ARPAnet: aglew@gswd-vms.arpa
1101 E. University, Urbana, IL 61801

I always felt that disclaimers were silly and affected, but there are people
who let themselves be affected by silly things, so: my opinions are my own,
and not the opinions of my employer, or any other organisation with which I
am affiliated.  I indicate my employer only so that other people may account
for any possible bias I may have towards my employer's products or systems.
ebg@clsib21.UUCP (Ed Gordon) (09/29/87)
Having never worked explicitly with the D-machine, but having worked with
D-machine graduates and the Mini-D machine, I cannot claim explicit
knowledge of its inner workings.  But if my memory does not fail me, the
D-machine had a highly parallel, rather complicated instruction set, each
instruction made up of sub-instructions to handle the registers, ALU, and
branch processing.  All of this was done in parallel for each instruction,
producing a fairly complex instruction, disregarding the simplistic nature
of the register set, which was understandable considering the nature of the
system.  The processor was essentially a microcontroller, with explicit
mechanisms for control of the bus.  I was also involved in the development
of a "RISC"-like, pre-"RISC"-era processor, but I don't know what became of
it.

If I understand the RISC architecture, it does not strive for parallelism,
but strives for a reduced instruction set, to simplify chip design, in order
to speed up execution times of instructions (one cycle per?).  The extent of
the effort was defined to me as an attempt to produce "assembler"
instructions, without the complications of (slow) HLL constructs, not
"firmware" instructions with explicit bus manipulation, as does the Mini-D.
Comparing RISCs with the Mini-D is like comparing "Apples" with "Oranges"
(or is that "Apples" with "Compaqs"?).  Do I misunderstand the concept?

--Ed Gordon
--Data System Associates

"I know it's only rock and roll, and the opinions expressed are my own, and
are not necessarily those of any of the major recording studios, or any
other semi-coherent organization."