mohan@uiatma.atmos.uiuc.edu (mohan ramamurthy) (11/08/90)
Given below are the results of a study comparing the performance of various computers (workstations as well as supercomputers) on six floating-point-intensive Fortran codes. These programs represent what you might call a hierarchy of "typical" geophysical fluid dynamics models. Identical codes were used on all machines, and no effort whatsoever was made to explicitly optimize the codes for any particular machine, other than using the highest available compiler optimization level. Some codes spend a large fraction of their time in scalar computations (e.g., the QG model spends over 95 percent of its time in scalar segments), whereas others, like the multi-level PE model and the Jacobian operator program, contain several kernels that vectorize very well. The latter two codes also contain numerous linked triad operations (e.g., D = A*B + C), which would explain the disproportionately high performance of those codes on computers with vector and superscalar architectures.

It bears repeating that this benchmark suite is dominated by floating-point-intensive codes; it is NOT intended to reflect other measures of CPU performance (I/O, integer arithmetic, etc.).

                    Relative Performance in MicroVAX Units
                    ======================================

                   Shallow  Spectral    QG     Multi-   Jacobian  One-level  Geom.
                    water     model    model   level    operator   channel    mean
                    model                     PE model              model
==============================================================================
Cray-YMP            936.0     114.5    120.4   1155.4    2196.5     354.1    475.8
Cray-2              784.0      79.7     92.5    790.5    1098.2     275.4    333.7
Cray-XMP            549.3      77.7     89.3    442.4     878.6     165.3    250.1
CYBER 205           126.8      38.9     42.8    272.6     399.4      65.2    107.0
Stardent GS2000     164.8      27.5      9.5*   128.0     366.1      43.5     66.7
Convex C220         109.9      19.8     20.8    105.9     244.1      45.9     61.4
IBM RS6000/320       48.5      30.8     38.0     75.8      81.4      42.7     49.6
Stardent GS1000     103.0       7.0      6.9     67.3     122.0      18.0     30.0
DECstation 5000      43.4      17.3     38.6     41.6      22.1      25.0     29.6
Apollo DN10000       53.2       9.8     41.4     29.5      23.2      33.0     28.0
Silicon Gr 4D/25     32.6      10.1*    29.3     25.7      22.9      15.8     21.1
DECstation 3100      28.4      10.0     25.5     24.7      22.1      16.2     20.9
HP 9000/835          16.7       7.9     16.3     16.6      18.3       9.7     13.6
Sparcstation 1       12.6       4.6     10.3     16.0      11.6       8.7      9.9
MicroVAX II           1.0       1.0      1.0      1.0       1.0       1.0      1.0
==============================================================================
 * Could not run at the highest optimization level.
   Cray and Convex timings were obtained using just one processor.

================================================================================
Prof. Mohan K. Ramamurthy                  | Internet: mohan@uiatma.atmos.uiuc.edu
Department of Atmospheric Sciences         | DECnet:   uiatmb::mohan
University of Illinois at Urbana-Champaign | ZAP (FAX): (217) 244-4393
PONY EXPRESS: 105 S. Gregory Avenue        | Telephone: (217) 333-8650
Urbana, IL 61801                           | ICBM: 40.08N 88.31W
                        "Read my MIPS: No new VAXes"
================================================================================
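For concreteness, a linked triad is a loop of the form sketched below. This is only an illustrative Fortran fragment, not code from the benchmark suite; the subroutine and array names are made up. The point is the one-multiply-plus-one-dependent-add pattern per element, which vector machines can chain and superscalar machines can pipeline, hence the disproportionately high ratios for the codes dominated by such kernels.

C     Illustrative linked triad kernel (NOT taken from the benchmark codes).
C     Each iteration performs one multiply and one dependent add; on a
C     vector machine the compiler turns the loop into chained vector
C     multiply and add instructions.
      SUBROUTINE TRIAD(N, A, B, C, D)
      INTEGER N, I
      REAL A(N), B(N), C(N), D(N)
      DO 10 I = 1, N
         D(I) = A(I)*B(I) + C(I)
   10 CONTINUE
      RETURN
      END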
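As for the units: each table entry appears to be a speedup, i.e. the MicroVAX II run time divided by the machine's run time on the same code, and the last column the geometric mean of the six speedups (the Cray-YMP row is consistent with this reading). A minimal Fortran sketch of that arithmetic, with made-up timings rather than the actual measurements:

C     Sketch of how the table's numbers appear to be formed: speedups
C     relative to the MicroVAX II, plus their geometric mean.  The
C     timings below are invented for illustration only.
      PROGRAM RELPF
      INTEGER NCODE, I
      PARAMETER (NCODE = 6)
      REAL TVAX(NCODE), TMACH(NCODE), RATIO(NCODE), GMEAN
      DATA TVAX  / 3600.0, 5400.0, 1800.0, 7200.0, 900.0, 2400.0 /
      DATA TMACH /    3.8,   47.2,   15.0,    6.2,   0.4,    6.8 /
      GMEAN = 1.0
      DO 10 I = 1, NCODE
         RATIO(I) = TVAX(I) / TMACH(I)
         GMEAN = GMEAN * RATIO(I)
   10 CONTINUE
      GMEAN = GMEAN ** (1.0 / REAL(NCODE))
      WRITE (*, *) 'Speedups:       ', RATIO
      WRITE (*, *) 'Geometric mean: ', GMEAN
      END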
mike@hpfcso.HP.COM (Mike McNelly) (11/09/90)
Could you please give some specifics on the machines you tested? For example, OS version, hardware configuration, and compiler version all make large differences in performance. Given the rapidly changing levels of performance on all vendors' machines, it's only fair to identify exactly what you're testing and publicizing.

Thanks,
Mike McNelly
mike@fc.hp.com
mohan@uiatma.atmos.uiuc.edu (mohan ramamurthy) (11/13/90)
In article <7370251@hpfcso.HP.COM> mike@hpfcso.HP.COM (Mike McNelly) writes:
>Could you please give some specifics on the machines you tested? For
>example, OS version, hardware configuration, and compiler version all
>make large differences in performance. Given the rapidly changing
>levels of performance on all vendors' machines, it's only fair to
>identify exactly what you're testing and publicizing.
>
>Thanks,
>Mike McNelly
>mike@fc.hp.com

Yes, at some point, when I find some spare time, I'd like to write up a more detailed article explaining the different system configurations, compiler versions, and the details of the computations, along with an interpretation of the results. But I don't believe any of that is going to change the "ball-park" floating-point performance figures I have posted, which are what most end-users wish to know. I can see why Mr. Mike McNelly is not pleased with the posting, for no vendor likes to see its machine near the bottom of a performance pile, but frankly, that is not my problem.
wunder@orac.HP.COM (Walter Underwood) (11/14/90)
> I can see why Mr. Mike McNelly is not pleased with the posting, for no
> vendor likes to see its machine near the bottom of a performance pile,
> but frankly, that is not my problem.

All of the machines at the bottom of the pile are two or more years old, so current processors should be at least twice as fast. The MicroVAX II, the Sparcstation 1, the DECstation 3100, and the 9000/835 are all old machines, and each vendor has a much faster machine on the market now. Also, all three of those vendors have been working hard on their compilers, so a mismatch of one or two compiler versions is significant.

It didn't look like Mike was complaining -- he just wanted to know more, so he could really compare the data. Running a set of real-world programs on a bunch of machines is a lot of work, and it would be nice to have enough information about the experiment to figure out what it means. Without that additional information, the reported numbers are mostly noise, with very little signal.

wunder

PS: Is the code available? Some people might want to re-run the benchmarks with compilers now in development.
mike@hpfcso.HP.COM (Mike McNelly) (11/15/90)
> In article <7370251@hpfcso.HP.COM> mike@hpfcso.HP.COM (Mike McNelly) writes:
> >Could you please give some specifics on the machines you tested? For
> >example, OS version, hardware configuration, and compiler version all
> >make large differences in performance. Given the rapidly changing
> >levels of performance on all vendors' machines, it's only fair to
> >identify exactly what you're testing and publicizing.
> >
> >Thanks,
> >Mike McNelly
> >mike@fc.hp.com
>
> Yes, at some point, when I find some spare time, I'd like to write
> up a more detailed article explaining the different system configurations,
> compiler versions, and the details of the computations, along with an
> interpretation of the results. But I don't believe any of that is going to
> change the "ball-park" floating-point performance figures I have posted,
> which are what most end-users wish to know. I can see why Mr. Mike
> McNelly is not pleased with the posting, for no vendor likes to see its
> machine near the bottom of a performance pile, but frankly, that is not
> my problem.

I'm not personally involved with any of the machines you've reported on, but I have seen many such lists over the years. Few, if any, of the authors find the spare time to provide either the source code of the benchmark in question or the machine configurations, which make substantial differences, your viewpoint to the contrary. Small things like floating-point accelerators can change performance 2-5x, for example. We once found in our lab that a minor algorithmic change in I/O policy made a substantial difference in one benchmark (but not in overall system performance). We're also in an era where machine performance is increasing by roughly 2x per year on most machines, so it would be nice to know whether a benchmark is comparing machines from different time frames.

A lot of people at all of the companies represented by benchmark lists work hard to improve the performance of their machines. It's disheartening to these folks when their efforts are ignored or misrepresented. It's also a fact that no package will make a bit of difference if it isn't installed.

You've apparently spent a substantial amount of time on your benchmark. Why not finish the job? Neglecting to post the details seems roughly analogous to posting a thesis without relevant attributions.

Mike McNelly
mike@fc.hp.com