mash@mips.COM (John Mashey) (08/01/88)
In article <408@ma.diab.se> pf@ma.UUCP (Per Fogelström) writes:
>In article <941@srs.UUCP> srs!matt@cs.rochester.edu (Matt Goheen) writes:
>>I have always been led to believe that Sun's rating of the Sun 4/200
>>series as 10 MIPS to be "Vax" MIPS (this goes for the 7 MIPS 4/110....
>Recently ELECTRONICS had an article series on RISCs.  One of the
>articles compared different CPUs, introducing something called an
>"instruction quality factor".  In this comparison the VAX 780 was rated
>1.0 with its 0.47 MIPS performance.  The funny thing that puzzles me
>was that most RISC designs had an "IQF" of 0.8 - 0.9, except SPARC,
>which was rated 0.6.  Assuming SPARC is a native 10 MIPS CPU, this
>would yield a 6 normalized MIPS CPU.  I did the same comparison with my
>32532 design.  This 25MHz prototype executes about 6-8 real MIPS (yes,
>there is a pin on the chip that pulses every time a new instruction is
>started, where I connect the counter).  OK, the IQF for this CPU must
>be very close to 1.0, since the architecture resembles the VAX
>architecture.  In that case my 25MHz 32532 design is a 15 VAX MIPS
>processor.  Well then, MIPS: what the heck is it, really?  At least
>it's useless when comparing different processor architectures.

All of this illustrates how confused things get when you: start with
engineering reality, sort of look at it thru the marketing viewpoint,
and then stir in some absolutely bogus guesses.

1) I read the ELECTRONICS article.  I don't recall seeing any sensible
derivation for the IQF; in fact, there is no earthly reason for SPARC to
be that much less than the other RISCs.  Those numbers were guesses, as
far as I can tell.  There is nothing particularly wrong with SPARC
instructions [the performance hits come from other things, mostly.]

2) It might be reasonable to compute IQFs, but you need access to very
good architectural simulations.  It is very hard to measure them by
running benchmarks.
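[The arithmetic being juggled in the quoted post can be made explicit.
The sketch below is one reading of the ELECTRONICS "IQF" as a simple
multiplier on native MIPS; that reading is an assumption, not a
definition from the article, and it shows why the two claims in the
quote don't even use the same formula.]

```python
# Two different computations are mixed together in the quoted post.
# Reading 1 (assumed): "normalized MIPS" = native MIPS * IQF.
# Reading 2 (assumed): "VAX MIPS" = native MIPS * IQF / 0.47,
#   i.e. relative to a VAX-11/780 doing 0.47 native MIPS at IQF 1.0.

VAX_780_NATIVE_MIPS = 0.47  # figure quoted from the ELECTRONICS article

def normalized_mips(native_mips, iqf):
    """Reading 1: IQF as a plain multiplier on native MIPS."""
    return native_mips * iqf

def vax_mips(native_mips, iqf):
    """Reading 2: the same product, rescaled so the 11/780 rates 1.0."""
    return native_mips * iqf / VAX_780_NATIVE_MIPS

# SPARC claim from the quote: 10 native MIPS at IQF 0.6
print(normalized_mips(10.0, 0.6))   # 6.0 -- the "6 normalized MIPS"

# 32532 claim from the quote: ~7 real MIPS at IQF ~1.0
print(round(vax_mips(7.0, 1.0), 1))  # ~14.9 -- the "15 VAX MIPS"

# Under Reading 2, though, the SPARC figure would come out very
# differently than 6 -- which is exactly the kind of confusion
# point 1) below complains about.
print(round(vax_mips(10.0, 0.6), 1))
```

[Note that the two readings disagree by a factor of 1/0.47 ≈ 2.1, so any
"MIPS" number is meaningless without knowing which normalization was
used.]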
3) Native MIPS is a useful and interesting number for the people who
build computers and run the architectural simulations.  It's fairly
useful when comparing the same executable object code across
architecturally compatible machines.

4) It is sad but true that a particular game is played in this business:
	a) Compute and publish some kind of mips-number for your processor.
	b) Describe the number by saying it's:
		native mips
		integer sustained mips
		normalized average integer mips
	c) When these things get summarized, in marketing brochures or in
	   the press, the qualifications usually get dropped.  (To be fair
	   to the press, it is HARD to disentangle what's being told to
	   them.)
	d) When people read these things, they sort of expect that these
	   have either turned into VAX-relative mips, or that the numbers
	   somehow indicate actual relative performance on *something*.
	e) Many people claim to use VAX-mips:
		sometimes using MicroVAX II == 1 (FYI, uVAX != 11/780)
		without having the foggiest idea how DEC computes the
		numbers themselves, based on one or two benchmarks
	f) The result is that you should NEVER, EVER believe that
	   mips-ratings mean anything at all unless you can obtain
	   substantial backup data, including enough benchmarks in common
	   that you can compare and figure out what individual
	   mips-ratings really mean.  By "enough", just for CPU
	   benchmarks, one would like at least 10 each of integer and
	   floating-point benchmarks, and real programs, not toys.

5) Expect things to be real messy for a while.  1988 is the year of true
mipsflation and trade-press ooh-and-ahhing over each new entry in the
mips-race, and it is very hard to tell what is real, what is
almost-real, and what is near-fantasy or heavy pre-announcement,
especially when the press writes about everything in the present tense :-)

6) IS THERE ANY HOPE?  Well, not much, but there are a few hopeful signs
amidst the mess:
	a) Some people gather and publish benchmarks as a public service.
	   Notable examples include Richardson [Dhrystone], Dongarra
	   [LINPACK], and McMahon [Livermore Loops].
	b) Some of the trade press has gotten tired of just quoting
	   vendor-supplied mips-ratings, and is acquiring/developing
	   useful benchmarks to be run.  This doesn't mean the benchmarks
	   are necessarily GOOD ones, but sometimes they are reasonable,
	   and in any case these folks are at least trying, and they
	   deserve support and encouragement from the rest of us.
	   Particular examples (there are more, of course) include:

		Digital Review magazine has a large FORTRAN benchmark they
		use to compare systems.  It contains 33 sub-benchmarks,
		some of which have problems.  However, the folks at DR
		continue to analyze the results and improve the usefulness
		of the suite, and they've started weeding out problems by
		looking at the statistical effects of the various
		benchmarks.

		Byte magazine has done benchmarks for years.  Some of the
		benchmarks are not very applicable to higher-performance
		systems, but they're also trying hard, by generating more
		applications-oriented benchmarks.

		UNIX Review has a column provided by Workstation
		Laboratories, which runs quite a few benchmarks and
		compares machines.  The July '88 issue covers the Sun
		4/200, and a MIPS M/120 will appear soon.

		Other magazines are looking for help from vendors in
		selecting good applications benchmarks.

	c) Everybody [especially those who BUY computers, rather than
	   those of us who sell them :-)] can help by refusing to accept
	   meaningless numbers
		1) from vendors
		2) from the press
-- 
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	{ames,decwrl,prls,pyramid}!mips!mash  OR  mash@mips.com
DDD:  	408-991-0253 or 408-720-1700, x253
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086
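[Point f) above implies a concrete procedure: run the same set of real
programs on both machines and summarize the per-program speedups over a
VAX-11/780.  Here is a minimal sketch of that, using a geometric mean to
summarize the ratios; the runtimes are invented purely for illustration,
and the geometric-mean choice is my assumption, not something prescribed
in the post.]

```python
# Hedged sketch: derive a VAX-relative rating from benchmarks run in
# common on two machines, per point f) above.  All numbers are made up.
from math import prod

def vax_relative_rating(vax_times, machine_times):
    """Geometric mean of per-benchmark speedups over a VAX-11/780.

    vax_times, machine_times: runtimes in seconds for the same
    benchmarks on the 11/780 and on the machine under test.
    """
    ratios = [v / m for v, m in zip(vax_times, machine_times)]
    return prod(ratios) ** (1.0 / len(ratios))

# Hypothetical runtimes (seconds) for four shared benchmarks:
vax_780 = [100.0, 80.0, 120.0, 60.0]
new_cpu = [10.0, 10.0, 15.0, 5.0]
print(round(vax_relative_rating(vax_780, new_cpu), 2))
```

[A geometric mean keeps one outlier benchmark from dominating the
rating, which matters when, as above, individual speedups vary widely.]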
yuval@taux02.UUCP (Gideon Yuval) (08/25/88)
In article <2693@winchester.mips.COM> mash@winchester.UUCP (John Mashey) writes:
>2) It might be reasonable to compute IQFs, but you need access to very
>good architectural simulations.  It is very hard to measure them by
>running benchmarks.

Ori Danieli's M.Sc. thesis (Tel Aviv University, 7/88) has some
simulations on two architectures: one is almost the full NS32K
architecture (except that memory-to-memory operations are not there);
the other is a load/store RISC subset of the NS32K architecture.  He
finds a 12% average difference in the number of instructions needed to
run the same program.  In the 5 programs run, this difference was 7% for
DC, PTC (a Pascal-to-C converter), and SORT, 17% for GREP, and 23% for
SED.
-- 
Gideon Yuval, yuval@taux01.nsc.com, +972-2-690992 (home), -52-522255 (work)
Paper-mail: National Semiconductor, 6 Maskit St., Herzliyah, Israel
TWX: 33691, fax: +972-52-558322
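[The per-program figures quoted above are consistent with the 12%
average; a quick check, using only the numbers as reported in the post
(I have not seen the thesis itself):]

```python
# Per-program instruction-count differences between full NS32K and the
# load/store RISC subset, as quoted from the thesis in the post above.
diffs = {"dc": 7, "ptc": 7, "sort": 7, "grep": 17, "sed": 23}

avg = sum(diffs.values()) / len(diffs)
print(avg)   # 12.2 -- matching the quoted "12% average difference"
```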
mash@mips.COM (John Mashey) (08/26/88)
In article <99@taux02.UUCP> yuval@taux02.UUCP (Gideon Yuval) writes:
>In article <2693@winchester.mips.COM> mash@winchester.UUCP (John Mashey) writes:
>>2) It might be reasonable to compute IQFs, but you need access to very
>>good architectural simulations.  It is very hard to measure them by
>>running benchmarks.
>
>Ori Danieli's M.Sc. thesis (Tel Aviv University, 7/88) has some
>simulations on two architectures: one is almost the full NS32K
>architecture (except that memory-to-memory operations are not there);
>the other is a load/store RISC subset of the NS32K architecture.  He
>finds a 12% average difference in the number of instructions needed to
>run the same program.  In the 5 programs run, this difference was 7%
>for DC, PTC (a Pascal-to-C converter), and SORT, 17% for GREP, and 23%
>for SED.

Interesting numbers.  Please say some more about the methodology, if
possible, and what sort of compiler technology was being used.  I think
this is saying that deleting the non-RISC subset adds 12% to the number
of instructions.  Did the thesis say anything about total cycle counts
(i.e., including the memory system)?
-- 
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	{ames,decwrl,prls,pyramid}!mips!mash  OR  mash@mips.com
DDD:  	408-991-0253 or 408-720-1700, x253
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086
walter@garth.UUCP (Walter Bays) (08/26/88)
>(John Mashey) writes:
>>2) It might be reasonable to compute IQFs, but you need access to very
>>good architectural simulations.

In article <99@taux02.UUCP> yuval@taux02.UUCP (Gideon Yuval) writes:
>Ori Danieli's M.Sc. thesis (Tel Aviv University, 7/88) has some
>simulations on [NS32K versus a RISC subset of NS32K]

This seems very difficult to me, since you can only measure the
compiler + instruction set, not just the instruction set.  (Or, if you
resort to hand-coded assembler, you measure programmer + instruction
set.)  Even if you use the same compiler front-end, the differences in
code generators will influence the results.  A RISC code generator that
is a modified version of a CISC code generator probably won't do as
good a job of register allocation, register targeting, and handling of
intermediate values.  A CISC code generator that is a modified version
of a RISC code generator will probably use mostly a RISC subset of the
instructions.

This is not a criticism of Mr. Danieli's thesis, which I have not read.
It's now on my "read sometime" list.  If you tell me he went to great
lengths to address the compiler effects, it goes on my "read as soon as
I can find a copy!" list.
-- 
------------------------------------------------------------------------------
My opinions are my own.  Objects in mirror are closer than they appear.
E-Mail route: ...!pyramid!garth!walter			(415) 852-2384
USPS: Intergraph APD, 2400 Geng Road, Palo Alto, California 94303
------------------------------------------------------------------------------