gary@clipper.ingr.com (Gary Oblock) (05/08/91)
-- My Most Humble Apologies --

I now see that I need to clarify my call for information on 3-address
vs. 2-address architectures.

I work for the division of Intergraph that originally developed the
CLIPPER chip set (used in the Intergraph workstations). I'm a compiler
developer and my conversation was with a fellow compiler developer.

The CLIPPER chip set is a load/store architecture with a small RISCish
instruction set. One of the ways the CLIPPER architecture deviates from
the typical RISC architecture is that its arithmetic instructions have
only two REGISTER addresses versus the typical three. This is what I am
curious about. I am not particularly interested in the `classical'
three-address architectures such as the PDP-11. What I'm interested in
is a characteristic of a `current' architecture.

----------------------------------------------------------------
end grovel, begin grumble

Come ON guys! So far only two people have sent me any information. I've
also seen one posting on the subject. I could have sworn that John
Mashey would have something to say about the wisdom of this feature in
a RISC processor. Doesn't anyone like David Wall or John Hennessy
(there was nothing relevant in Hennessy & Patterson) read this group??

*-------------------------*
| Gary Oblock             |
| Intergraph Corp.        |
| Advanced Processor Div. |
| 2400 Geng Road          |
| Palo Alto, CA 94303     |
| (415)494-8800           |
| gary@clipper.ingr.com   |
*-------------------------*
preston@ariel.rice.edu (Preston Briggs) (05/08/91)
gary@clipper.ingr.com (Gary Oblock) writes:

>Come ON guys!
>
>So far only two people have sent me any information. I've also seen
>one posting on the subject. I could have sworn that John Mashey would
>have something to say about the wisdom of this feature in a RISC
>processor. Doesn't anyone like David Wall or John Hennessy (there was
>nothing relevant in Hennessy & Patterson) read this group??

Perhaps no one has done a study, or perhaps people who have are waiting
to publish it somewhere worthwhile, or perhaps they're busy. I've seen
several postings that talked about the trade-offs. Beyond talk, you
need to make some measurements.

It's not too hard to imagine a simple study (given the right tools).
If you've got a 3-address machine, measure a bunch of code. Then mark
the arithmetic instructions as 2-address instead of 3-address (pretty
dependent on the particular compiler), and hack the code that tries to
take advantage of 2-address instructions. (Chaitin talks about how to
modify his allocator to handle 2-address instructions.) Then measure
the same bunch of code (speed and size).

This won't help you with understanding the benefits of 2-address
instructions (all those extra register addressing bits), but it will
show the costs (extra move instructions).

Then publish the results.

Preston Briggs
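The rewriting step Briggs proposes can be illustrated with a toy model
(a sketch only: the tuple IR, register names, and helper are invented
for illustration and belong to no real compiler):

```python
# Toy model of the rewrite Briggs describes: turn 3-address arithmetic
# into 2-address form and count the extra move instructions.
# Instructions are (op, dst, src1, src2) tuples with string register
# names; this is an illustrative sketch, not any real compiler's IR.

def to_two_address(code):
    """Rewrite dst = src1 op src2 as { mov dst, src1 ; dst = dst op src2 }.

    The move is the cost of 2-address form; it is needed only when the
    destination differs from the first source.  (Caveat: this naive
    rewrite is wrong when dst == src2 and dst != src1; a real pass
    would swap commutative operands or use a scratch register.)
    """
    out = []
    for op, dst, src1, src2 in code:
        if dst != src1:
            out.append(("mov", dst, src1))   # the extra move instruction
        out.append((op, dst, src2))          # 2-address: dst = dst op src2
    return out

three_addr = [
    ("add", "r3", "r1", "r2"),   # r3 = r1 + r2  -> needs a move
    ("mul", "r1", "r1", "r4"),   # r1 = r1 * r4  -> already 2-address shaped
]
two_addr = to_two_address(three_addr)
extra_moves = len(two_addr) - len(three_addr)
```

Running something like this over real compiler output, and then
measuring speed and size as Briggs says, would quantify the
move-instruction cost; the allocator modifications Chaitin describes
are what keep the count honest.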
johnv@sequent.com (05/08/91)
In article <1991May7.195932.8770@rice.edu> preston@ariel.rice.edu (Preston Briggs) writes:
>gary@clipper.ingr.com (Gary Oblock) writes:
>
>>Come ON guys!
>>
>>So far only two people have sent me any information. I've also seen
>>one posting on the subject. I could have sworn that John Mashey would
>>have something to say about the wisdom of this feature in a RISC
>>processor. Doesn't anyone like David Wall or John Hennessy (there was
>>nothing relevant in Hennessy & Patterson) read this group??
>
>It's not too hard to imagine a simple study (given the right tools).
>If you've got a 3-address machine, measure a bunch of code.
>Then, mark the arithmetic instructions as 2-address
>instead of 3-address (pretty dependent on the particular compiler),
>and hack the necessary code that tries to take advantage of
>2-address instructions. (Chaitin talks about how to modify
>his allocator to handle 2-address instructions).
>Then measure the same bunch of code (speed and size).
>
>This won't help you with understanding the benefits of 2-address
>instructions (all those extra register addressing bits),
>but it will show the costs (extra move instructions).
>
>Then publish the results.
>
>Preston Briggs

Several years ago (longer than I'd like to think), my dissertation
analyzed how efficiently different instruction-set styles represent
programs. The analysis looked at how closely the instruction set could
represent the intent of the program without adding unnecessary work,
along the lines Briggs proposes above.

The results showed that pure 3-address machines (ones without
2-address abbreviated forms) tended to carry needless information:
most programs do not require three-address instructions. Pure
2-address machines were much better than pure 3-address ones --
occasionally a 2-address machine needs a 'useless' instruction just to
get the arithmetic done, but this is rare.

NTIS should have it. It was done at UCSD in 1986, under the title "The
Evaluation of Instruction Sets for Language-Oriented Instruction Set
Computers".

John Van Zandt
-------------------------------------------------------------------------
E-mail: johnv@sequent.com   V-mail: (503) 578-3136
P-mail: Sequent Computer Systems, 15450 SW Koll Pkwy, Beaverton, OR 97006
drh@duke.cs.duke.edu (D. Richard Hipp) (05/08/91)
In article <1991May7.222645.14128@sequent.com> johnv@sequent.com writes:
>[P]ure 3-address machines (ones without 2-address
>abbreviated forms) tended to have needless information. Most programs
>do not require three address instructions. Pure 2-address machines were
>much better than pure 3-address ones -- occasionally a 2-address machine will
>need to have a 'useless' instruction just to get the arithmetic done, but it
>is seldom.

I once proposed an architecture that featured "2.5" addresses, instead
of the usual 2 or 3.  Each instruction contains two complete register
numbers, call them A and B, as in a conventional 2-address machine,
plus one extra bit which indicates whether the third address (the
destination) should be the same as A or should be register zero (the
accumulator).  In other words, operations could take two forms:

    r[A] = r[A] op r[B]
or
    r[0] = r[A] op r[B]

The idea was that the lack of a third address saved on instruction
bandwidth, but the option of using r[0] as the destination eliminated
many unnecessary MOV instructions.

This was only an idea, and was never tested.  I would be interested to
hear whether John Van Zandt, or anyone else, has ever looked at this
addressing scheme and what they found.
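Hipp's "2.5-address" execution model is simple enough to sketch
directly (a hypothetical illustration: the register file and the
`dest_is_a` flag standing in for the extra instruction bit are
invented here, not taken from any real machine description):

```python
# Sketch of the "2.5-address" execution model: two full register
# fields A and B plus one destination-select bit.  r[0] plays the
# role of the accumulator.
import operator

def execute(regs, op, a, b, dest_is_a):
    """Compute r[A] op r[B]; the single extra bit chooses whether the
    result lands back in r[A] (dest_is_a=True) or in the accumulator
    r[0] (dest_is_a=False), sparing an explicit MOV in the latter case.
    """
    result = op(regs[a], regs[b])
    regs[a if dest_is_a else 0] = result
    return regs

regs = [0, 10, 3, 0]
execute(regs, operator.add, 1, 2, dest_is_a=True)    # r1 = r1 + r2
execute(regs, operator.mul, 1, 2, dest_is_a=False)   # r0 = r1 * r2
```

The second call is exactly the case the scheme targets: the product is
wanted only as a scratch value, so it goes to r[0] without destroying
r[1] and without a separate move.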
firth@sei.cmu.edu (Robert Firth) (05/08/91)
Sigh.  Let me point out first that one cannot decide which architecture
is "best" in isolation.  It is necessary to consider application
domain, programming language, language use styles, compiler
effectiveness, and a host of hardware issues.

However, here are some numbers; make of them what you will.  Some are
from my own analyses of the Vax-11 and PE-3200 architectures, back when
I wrote compilers for those machines; others are taken from Daniel
Klein's and my evaluation of the MIPS machine, published as SEI
Technical Report CMU/SEI-87-TR-25.

First question: if you have a machine with true 2-address and 3-address
modes uniformly, how many of each will you use?  An approximate answer
can be given from the VAX numbers, which tell me that, where the
alternatives exist, about 18% of operations take three addresses rather
than two.  If the 3-address modes did not exist, approximately 6% more
move instructions would be generated, for a code expansion of about 2%.

Second question: if you have a machine where the destination and one
operand must be registers, how much do you lose if they are required to
be the same register?  An approximate answer can be found by looking at
the PE-3200, which forces destination and left operand to be in the
same register, and counting the number of register shuffles or operand
reloads caused because the left operand has been destroyed.  The answer
is that, of all dyadic operations, approximately 14% destroy an operand
you don't want destroyed.  The extra instructions to preserve or reload
this operand (you reload if it's no slower, since that sometimes saves
a register) yield a code expansion of about 4%.

Final question: on a machine like the MIPS, where the address modes do
not span memory, you must sometimes construct a 'long' address by
adding a 32-bit offset to a base register.  In other words, you can't
say

    load  temp, #abcdefgh(base)

you must say

    add   temp, base, #abcd0000
    load  temp, #efgh(temp)

Clearly, this needs a three-operand add instruction.  If your machine
looks like the MIPS but has only a two-operand add, you need an extra
register shuffle to avoid destroying the value in base.  Our figures
showed that about 7% of loads, or about 2% of all instructions, fell
into this category, so 2% is the code expansion you are facing.

My conclusions:

(a) It is not worth implementing full 3-address mode a la Vax.

(b) For a two-address general-register machine, the cost of always
    destroying the left operand is reasonable, and I think the gain in
    architectural simplicity justifies it.

(c) For a load/store machine, it is better to have more registers and
    only two addresses than fewer registers and three addresses.
    However, that usually isn't the trade-off.

Finally: *make a choice and stick to it*.  Worse than either choice is
a machine with two or more forms of every operation, one that takes two
registers and one that takes three.  Keep it simple and orthogonal, and
make it run like a bat out of ASPLOS.

Hope that helps

Robert Firth
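The offset split behind the add/load pair Firth shows can be sketched
in Python (a hedged illustration: it assumes the common MIPS-style
convention that the load's 16-bit displacement field is sign-extended,
which is why the high half must sometimes absorb a carry):

```python
def split_offset(offset):
    """Split a 32-bit offset into (high, low) so that
    (high + low) mod 2**32 == offset, where low is a signed 16-bit
    displacement of the kind a load's offset field holds and high has
    its low 16 bits clear (suitable for one add to the base register).
    """
    low = offset & 0xFFFF
    if low >= 0x8000:        # the 16-bit field sign-extends negative,
        low -= 0x10000       # so carry one into the high half
    high = (offset - low) & 0xFFFFFFFF
    return high, low

# add  temp, base, #high   ; needs the three-operand add Firth mentions
# load temp, #low(temp)
high, low = split_offset(0x12348000)
print(hex(high), low)    # -> 0x12350000 -32768
```

With a three-operand add this costs one extra instruction per long
address; with only a two-operand add, an additional shuffle is needed
to keep `base` intact, which is the roughly 2% expansion Firth reports.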
mark@hubcap.clemson.edu (Mark Smotherman) (05/09/91)
Flynn's work with the Architect's Workbench is relevant.  While I don't
find this specific comparison in the Sept. 1987 IEEE Computer article
(Flynn, Mitchell and Mulder, "And now a case for more complex
instruction sets"), it would seem easy to do with awb.  You could
compare FIX32 with your own definition of a two-address instruction
format.  The main problem would be how to handle offsets and immediate
values.
--
Mark Smotherman, CS Dept., Clemson University, Clemson, SC 29634-1906
mark@cs.clemson.edu or mark@hubcap.clemson.edu
jmk@alice.att.com (Jim McKie) (05/10/91)
In response to <673707950@romeo.cs.duke.edu>:

>I once proposed an architecture that featured "2.5" addresses, instead
>of the usual 2 or 3.  Each instruction contains two complete register
>numbers, call them A and B, as in a conventional 2-address machine,
>plus one extra bit which indicated whether the third address (the
>destination) should be the same as A or should be register zero (the
>accumulator).
>...
>This was only an idea, and was never tested.  I would be interested to
>hear if John Van Zandt, or anyone else, has ever looked at this
>addressing scheme and what they found after looking at it for a while.

See 'Introduction to the CRISP Instruction Set Architecture', Spring
COMPCON 87 Proceedings, by Berenbaum, Ditzel and McLellan, for a
description of a 2.5-address machine:

    "An accumulator rather than a full third address was provided
    because our measurements showed that most of the use of a full
    third address destination field by the compiler was for a single
    scratch register."

Jim McKie    research!jmk  -or-  jmk@research.att.com    Bell Labs
henry@zoo.toronto.edu (Henry Spencer) (05/12/91)
In article <1991May7.175646.25449@clipper.ingr.com> gary@clipper.ingr.com (Gary Oblock) writes:
>This is what I am curious about. I am not particularly interested in
>the `classical' three address architectures such as the PDP-11...

Just as well, since the pdp11 was a two-address architecture!
--
And the bean-counter replied,     | Henry Spencer @ U of Toronto Zoology
"beans are more important".       | henry@zoo.toronto.edu  utzoo!henry