mark@akbar.megatek.uucp (mark thompson) (06/28/89)
Lately, it seems that the integer performance of popularly available (especially RISC) computers seems to be outrunning the floating point performance. A little paper design for a SPARC system using the latest Cypress IU and FPC/FPU gets me a MIPS/MFLOPS ratio of about 10.

This seems a little out of whack... it seems that older scientific processors had ratios in the 3-4 range.

Looking at published info on MIPS, and some hand waving, gets me a ratio of about 5-6, better but still slow (an aside: what are the MIPS guys doing to get the speeds up higher than the SPARC guys? compilers?)

Why is the floating point lagging integer performance so much? What is being done to get this back in balance?

-mark
--
mark thompson
uunet!megatek!mark
<Opinions expressed herein are not necessarily those of Megatek Corp>
--
khb%chiba@Sun.COM (Keith Bierman - SPD Languages Marketing -- MTS) (06/28/89)
In article <596@megatek.UUCP> mark@megatek.UUCP () writes:
>This seems a little out of whack... it seems that older scientific
>processors had ratios in the 3-4 range.

Current SPARC implementations (chips and systems) from Sun were intended for "more general purpose use", hence the (relatively) narrow gap between integer performance on a Cray and on a 4/330. While floating point is fun (and is typically my reason for existing on a project), I spend most of my day doing compiles, editing, running schedtool, and other non-FP things. So using the 80-20 rule... the first machines should be the ones we need 80% of the time.

>
>Looking at published info on MIPS, and some hand waving gets me a ratio
>of about 5-6, better but still slow (an aside: what are the MIPS guys
>doing to get the speeds up higher than the SPARC guys? compilers?)

Compilers are often cited, but according to my weeks of staring at huge volumes of data, the compiler differences are minimal on large codes. The current Sun compilers are somewhat less clever about certain operations, but not enough to explain the difference in performance. What is interesting is that the benchmarks which SPARC does worst on are highly FP and memory intensive (say 30-50% loads and stores).

MIPSco built their own FPU and tightly coupled it to their IU. This resulted in early units which were superior to those produced by the SPARC implementation philosophy (let's buy whatever is lying around and glue it in -- in the first implementations that meant a Weitek 1164 and 1165 and a controller ... "leftovers" from the sun3/fpa project).

At yesterday's IEEE HOT CHIPS conference, we were treated to three papers about dedicated SPARC FPUs. In addition to the papers focused on FPUs, BIT is already sampling ECL SPARC chips. So the FPU integration/implementation variable is tilting towards SPARC (unless one assumes that MIPSco is smarter than Ross, Fuji., BIT, LSI, TI, Solb., Prisma and all the others).

As for loads and stores, current IMPLEMENTATIONS of SPARC use 2 and 3 cycle parts ... this is NOT part of the arch. but was a concession to low cost system design. High performance SPARC systems (i.e. those designed to use all the implementation tricks) are just now appearing (not anything we have announced :< but these "low performance" models are actually quite snappy ... a 4/330GX makes a VERY nice personal workstation).

>
>Why is the floating point lagging integer performance so much? What is
>being done to get this back in balance?

Well, ld/sto is key ... linpack is about as kind as it gets, and that is 1.5 memory references for every FLOP.

Second, chips designed from the ground up to be SPARC FPUs rather than random bits of sand are just now available (Weitek 3170 and 71, TMS390C602, LSI 64814 to name just 3).

Prisma delivered a system level paper about their 250 MIPS (native, say 100 VAX MIPS) 100 MFLOPS SPARC machine. Not sampling just yet, but probably 2nd qtr next year (my guess, from the presentation; no solid info; last press release said a working machine in Jan ... but I assume that they might want to test it before shipping it :>).

Keith H. Bierman      |*My thoughts are my own. Only my work belongs to Sun*
It's Not My Fault     |  Marketing Technical Specialist  ! kbierman@sun.com
I Voted for Bill &    |  Languages and Performance Tools.
Opus  (* strange as it may seem, I do more engineering now *)
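
A rough sketch of why ld/sto matters so much, under loudly stated assumptions: every memory reference costs a fixed number of cycles, nothing overlaps, and the floating point arithmetic itself is free. Only the 1.5 references per FLOP and the 2 and 3 cycle figures come from the post; the 25 MHz clock and the function name `memory_bound_mflops` are hypothetical, chosen for illustration.

```python
# Ceiling on LINPACK-style MFLOPS set by memory references alone.
# Assumptions (NOT from the post): no overlap between references,
# FP arithmetic costs nothing, and a hypothetical 25 MHz clock.

REFS_PER_FLOP = 1.5  # figure quoted in the post for linpack


def memory_bound_mflops(clock_mhz, cycles_per_ref, refs_per_flop=REFS_PER_FLOP):
    """Return the MFLOPS ceiling if memory references were the only cost."""
    if clock_mhz <= 0 or cycles_per_ref <= 0 or refs_per_flop <= 0:
        raise ValueError("all inputs must be positive")
    cycles_per_flop = refs_per_flop * cycles_per_ref
    return clock_mhz / cycles_per_flop


if __name__ == "__main__":
    clock_mhz = 25.0  # hypothetical clock rate, for illustration only
    for cycles in (1, 2, 3):
        bound = memory_bound_mflops(clock_mhz, cycles)
        print(f"{cycles}-cycle references: at most {bound:.2f} MFLOPS")
```

In this toy model, going from 3-cycle to 1-cycle references triples the ceiling. A real machine overlaps work and depends on cache behavior, so the numbers indicate direction only.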
henry@utzoo.uucp (Henry Spencer) (06/28/89)
In article <596@megatek.UUCP> mark@megatek.UUCP () writes:
>... (an aside: what are the MIPS guys
>doing to get the speeds up higher than the SPARC guys? compilers?)

No -- system designers who really care about floating-point performance.
--
NASA is to spaceflight as the  |     Henry Spencer at U of Toronto Zoology
US government is to freedom.   | uunet!attcan!utzoo!henry henry@zoo.toronto.edu
les@unicads.UUCP (Les Milash) (06/29/89)
In article <112807@sun.Eng.Sun.COM> khb@sun.UUCP (Keith Bierman - SPD Languages Marketing -- MTS) writes:
>Prisma [...] Not sampling just yet, but...

I'd sure like to get on the list for one of the early "samples".
roelof@idca.tds.PHILIPS.nl (R. Vuurboom) (06/29/89)
In article <596@megatek.UUCP> mark@megatek.UUCP () writes:
>performance. A little paper design for a SPARC system using the latest
>Cypress IU and FPC/FPU gets me a MIPS/MFLOPS ratio of about 10.
>
>This seems a little out of whack... it seems that older scientific
>processors had ratios in the 3-4 range.
>
>Looking at published info on MIPS, and some hand waving gets me a ratio
>of about 5-6, better but still slow (an aside: what are the MIPS guys
>doing to get the speeds up higher than the SPARC guys? compilers?)
>
>Why is the floating point lagging integer performance so much? What is
>being done to get this back in balance?
>

Question is: Is a 3-4 MIPS/MFLOPS balanced?

To avoid the eternal "it depends on the application", suppose we agree that, for example, the SPEC Benchmark suite is a representative model of our application.

Can anybody give some sort of (simplistic, I know) rules-of-thumb about MIPS/MFLOPS real estate ratios as a function of performance? Something like: increasing MFLOPS performance x% would mean y% more real estate needed, and the corresponding real estate reduction for the IU (and the rest) would probably mean z% less MIPS.

By varying the MIPS/MFLOPS ratio (given a fixed amount of silicon), a ratio best tuned to the Suite could be calculated using the agreed-upon weightings etc. Since we (one-sidedly) agreed that the suite was a representative model of our application world, this could be a quasi-objective determination of what is a "balanced" processor.

Flames anyone?

Disclaimer: Opinions are really just onions and pi.
--
Roelof Vuurboom  SSP/V3 Philips TDS Apeldoorn, The Netherlands   +31 55 432226
domain: roelof@idca.tds.philips.nl             uucp:  ...!mcvax!philapd!roelof
lamaster@ames.arc.nasa.gov (Hugh LaMaster) (06/30/89)
In article <140@ssp1.idca.tds.philips.nl> roelof@idca.tds.PHILIPS.nl (R. Vuurboom) writes:
>Question is: Is a 3-4 MIPS/MFLOPS balanced?

I personally like a balance of scalar MFLOPS = 1/3 of MIPS and vector MFLOPS = 3 * MIPS. The reasons are manifold, but I have found this to be a "cost effective" ratio on older ECL discrete/SSI/MSI vector mainframe systems.

More recently, folks in the RISC camp have been saying that this ratio is "obsolete", in the sense that the "extra" scalar MIPS are almost free relative to the cost of providing the extra MFLOPS, so a lower ratio is more appropriate. It appears to me that MIPSCO is doing about the best that can be done with the new R3xxx chips, so I am *not* complaining. But I still think that a vector instruction set provides a cheap way to get the most out of the existing floating point real estate, and can improve performance significantly, using the *same* floating point units, over a machine with only a scalar instruction set. Usually a speedup of about 5 is possible in this case, so I would now like to see vector MFLOPS = MIPS and scalar MFLOPS = 1/5 of MIPS.

For the purpose of comparing microprocessors, I am satisfied to define MIPS as the ratio of harmonic means of many integer benchmarks relative to their times on a VAX 11/780, and vector MFLOPS as the rate achieved on the standard LINPACK benchmark.

Thanks to Weitek, MIPSCO, Fairchild/Intergraph, et al. for raising the standard of floating point performance in the micro world.

  Hugh LaMaster, m/s 233-9,  UUCP ames!lamaster
  NASA Ames Research Center  ARPA lamaster@ames.arc.nasa.gov
  Moffett Field, CA 94035
  Phone:  (415)694-6117
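
A minimal sketch of the MIPS definition and balance targets described above. It takes one reading of the definition: the harmonic mean of per-benchmark speedups over a VAX 11/780, with the VAX counted as 1 MIPS by the usual convention. The function names and the timings are hypothetical, not from the post.

```python
from statistics import harmonic_mean


def vax_relative_mips(vax_seconds, machine_seconds):
    """Harmonic mean of per-benchmark speedups over a VAX 11/780.

    Treats the VAX 11/780 as 1 MIPS by convention, so a mean speedup
    of 12 is reported as 12 MIPS. This is one reading of the post.
    """
    if len(vax_seconds) != len(machine_seconds) or not vax_seconds:
        raise ValueError("need matching, non-empty timing lists")
    speedups = [v / m for v, m in zip(vax_seconds, machine_seconds)]
    return harmonic_mean(speedups)


def balance_targets(mips):
    """Targets from the post: vector MFLOPS = MIPS, scalar MFLOPS = MIPS / 5."""
    return {"scalar_mflops": mips / 5.0, "vector_mflops": mips}


if __name__ == "__main__":
    # Made-up timings in seconds, for illustration only.
    vax = [100.0, 200.0, 400.0]
    new = [10.0, 10.0, 40.0]
    mips = vax_relative_mips(vax, new)
    print(f"VAX-relative MIPS: {mips:.1f}")
    print(balance_targets(mips))
```

The harmonic mean is pulled toward the slowest benchmarks, so a single large speedup cannot inflate the rating the way it would with an arithmetic mean.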
prc@erbe.se (Robert Claeson) (06/30/89)
In article <596@megatek.UUCP> mark@megatek.UUCP () writes:
>Lately, it seems that the integer performance of popularly available
>(especially RISC) computers seems to be outrunning the floating point
>performance. A little paper design for a SPARC system using the latest
>Cypress IU and FPC/FPU gets me a MIPS/MFLOPS ratio of about 10.
>This seems a little out of whack... it seems that older scientific
>processors had ratios in the 3-4 range.

A 17 MIPS Motorola 88100 RISC CPU has a fp performance of about 12 MFLOPS. That gives (at least) me a MIPS/MFLOPS ratio for that chip of only ~1.4.
--
Robert Claeson      E-mail: rclaeson@erbe.se
ERBE DATA AB
rro@bizet.CS.ColoState.Edu (Rod Oldehoeft) (07/03/89)
In article <749@maxim.erbe.se> rclaeson@erbe.se (Robert Claeson) writes:
>
>A 17 MIPS Motorola 88100 RISC CPU has a fp performance of about 12 MFLOPS.
>That gives (at least) me a MIPS/MFLOPS ratio for that chip of only ~1.4.

I've usually heard MIPS/MFLOPS ratio discussed as the ratio between the number of instructions (nonFP/FP) actually executed when one runs an application program of interest. This is a function of both the architecture and compiler and is harder to measure than dividing peak numbers.

Rod Oldehoeft                    rro@handel.CS.ColoState.EDU
Computer Science Department      303/491-5792
Colorado State University        Fort Collins, CO 80523
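
A small sketch of the executed-instruction reading of the ratio, as opposed to dividing peak numbers. The instruction classes and counts are invented for illustration; a real profile would come from a tracing or simulation tool, which is not shown here.

```python
def executed_ratio(counts, fp_classes):
    """Return non-FP / FP instruction counts from a dynamic instruction profile.

    counts:     mapping of instruction class name -> number executed
    fp_classes: set of class names to treat as floating point
    """
    fp = sum(n for cls, n in counts.items() if cls in fp_classes)
    non_fp = sum(n for cls, n in counts.items() if cls not in fp_classes)
    if fp == 0:
        raise ValueError("profile contains no floating point instructions")
    return non_fp / fp


if __name__ == "__main__":
    # Hypothetical profile of one application run (not measured data).
    profile = {
        "load": 300,
        "store": 150,
        "int_alu": 350,
        "branch": 100,
        "fp_add": 60,
        "fp_mul": 40,
    }
    ratio = executed_ratio(profile, fp_classes={"fp_add", "fp_mul"})
    print(f"non-FP/FP executed: {ratio:.1f}")
```

Because the counts depend on what the compiler emits for a given program, the same chip can show quite different ratios across applications, which is the point made above.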
mash@mips.COM (John Mashey) (07/05/89)
In article <749@maxim.erbe.se> rclaeson@erbe.se (Robert Claeson) writes:
>In article <596@megatek.UUCP> mark@megatek.UUCP () writes:
>>Lately, it seems that the integer performance of popularly available
>>(especially RISC) computers seems to be outrunning the floating point
>>performance. A little paper design for a SPARC system using the latest
>>Cypress IU and FPC/FPU gets me a MIPS/MFLOPS ratio of about 10.
>>This seems a little out of whack... it seems that older scientific
>>processors had ratios in the 3-4 range.
>A 17 MIPS Motorola 88100 RISC CPU has a fp performance of about 12 MFLOPS.
>That gives (at least) me a MIPS/MFLOPS ratio for that chip of only ~1.4.

Whenever I've seen MIPS/MFLOPS ratios discussed, I don't ever remember MFLOPS being peak-MFLOPS, but rather LINPACK DP MFLOPS, usually FORTRAN, I think. Given the well-known fuzziness of mips-ratings, the MIPS side is a little harder to pin down.

If you take the currently-published numbers for a 20MHz 88K, they get 1.2 MFLOPS FORTRAN DP LINPACK, and 2.2 Coded. If you use 17 MIPS, you're probably using Dhrystone-mips, which are usually 20-25% higher (assuming no strcpy inlining) than what people see for VAX-relative-versus-good-compilers-on-real-programs-mips.

To get a bound, assume the LINPACK numbers shown, and 13-17 mips. This gives 13/2.2 = 6 and 17/1.2 = 14 as the limits. If I had to bet, as the compilers get better, I'd suspect a realistic number might be 14/2 = 7.

All this is highly subject to cache configurations, etc., and so one must be very careful not to over-interpret such numbers.
--
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	{ames,decwrl,prls,pyramid}!mips!mash  OR  mash@mips.com
DDD:  	408-991-0253 or 408-720-1700, x253
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086
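
The bound arithmetic above can be reproduced in a few lines. All figures are the ones quoted in the posts; the function and constant names are just labels chosen for this sketch.

```python
# Figures quoted in the thread for the 88K (20 MHz part for LINPACK).
PEAK_MIPS = 17.0
PEAK_MFLOPS = 12.0
LINPACK_FORTRAN_DP_MFLOPS = 1.2
LINPACK_CODED_DP_MFLOPS = 2.2
MIPS_LOW, MIPS_HIGH = 13.0, 17.0  # range assumed in the post


def mips_per_mflops(mips, mflops):
    """Return the MIPS/MFLOPS ratio for one pair of ratings."""
    if mflops <= 0:
        raise ValueError("MFLOPS must be positive")
    return mips / mflops


def ratio_bounds(mips_low, mips_high, mflops_low, mflops_high):
    """Lowest ratio pairs low MIPS with high MFLOPS; highest does the reverse."""
    return (
        mips_per_mflops(mips_low, mflops_high),
        mips_per_mflops(mips_high, mflops_low),
    )


if __name__ == "__main__":
    low, high = ratio_bounds(
        MIPS_LOW, MIPS_HIGH, LINPACK_FORTRAN_DP_MFLOPS, LINPACK_CODED_DP_MFLOPS
    )
    print(f"peak-number ratio:   {mips_per_mflops(PEAK_MIPS, PEAK_MFLOPS):.1f}")
    print(f"LINPACK-based range: {low:.1f} to {high:.1f}")
    print(f"guess at realistic:  {mips_per_mflops(14.0, 2.0):.1f}")
```

The unrounded limits are about 5.9 and 14.2, which the post rounds to 6 and 14. The peak-number ratio of about 1.4 falls well outside that range, which shows how much the choice of MFLOPS figure matters.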