rclark@bgphp1.UUCP (Roger N. Clark) (03/26/88)
I have benchmarked the HP9000 series 825 using number crunching
programs and find:

    The 825 is 5 to 7 times SLOWER than a single cpu 500!!!!!

    In a multitasking environment the 825 can be at least

                      15 TIMES SLOWER
                      ^^^^^^^^^^^^^^^
    than a 3 cpu 500!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

The details:

In February a note was posted to comp.sys.hp that the HP9000 series 500
was being discontinued. That caused quite a flurry of responses,
including several that said the new HP9000 series 825 is much faster.
I have heard several stories about how three 9000 s500's were replaced
with one 825 and everyone was happy. HP is saying the 825 is very fast.

Well, on February 19, I posted a rather strong note about the 500 being
discontinued. The 500 is no longer being supported, in that there will
be no more software releases (that is especially disturbing considering
that HP-UX 5.21 apparently has many problems). I need features that are
not on the 500 (network file system or at least TCP/IP, domain-based
mail). HP has said I need to upgrade to an 825 (or higher).

Before changing machines, I benchmark the candidate with programs
similar to what my group runs. Here at the USGS, we do analysis of
spectra of rocks and minerals and apply the results to imaging data
(remote sensing). I am on 3 NASA planetary spacecraft teams and the
methods will be applied to gigabytes of data in the 1990's. The
analyses include some very sophisticated (and number crunching
intensive) modeling programs. The programs are not huge (less than
2 MBytes on a 6.5 MByte system) and we do not have a paging problem.

Below are the results of a simple "weird box filter" program. This
program shows a typical response in our shop. It does both array
indexing and computation on elements in the arrays. The compiled
program is only about 350KBytes in size and it does not page to disk.

               A Multitasking, CPU intensive Benchmark

                         Real Time (seconds)
-----------------------------------------------------------------------
                                      Number of Tasks
System                     1     2     3     4     5     7    10    12
-----------------------------------------------------------------------
HP9000/500 3 CPUs        5.9   6.0   6.3   8.4  10.5  14.7  21.5  27.8
HP9000/825 HPUX1.2      29.1  58.1  87.2 116.3 145.6 205.0 291.5 350.1
-----------------------------------------------------------------------

                         CPU Time (seconds)
-----------------------------------------------------------------------
                                      Number of Tasks
System                     1     2     3     4     5     7    10    12
-----------------------------------------------------------------------
HP9000/500 3 CPUs        5.8  11.8  18.4  24.4  30.7  43.0  62.2  81.4
HP9000/825 HPUX1.2      29.0  57.9  87.0 116.0 144.7 202.7 288.5 346.5
-----------------------------------------------------------------------
NOTES:
HP9000/500: 6.5 MBytes main memory, 3 floating point CPUs, 65MByte
            system, 55MByte /tmp disk, 132MByte user disk, 571MByte
            data disk (Used by virtual memory), HP-UX 5.21.

HP9000 series 825 (HP Precision Architecture, RISC machine) 16 MBytes of
            main memory, single 404MByte disk drive. HP Demo, 3/23/88.
            HP-UX 1.2 (Also tried it on HP-UX 2.0 pre-release with
            slightly worse results).
-----------------------------------------------------------------------

I have several other benchmarks. On number crunching programs that do
not have array indexing (just do +, -, *, /, logs, sin, cos, sqrt,
powers) the results came out (normalized to s500):

single cpu program            825     500
---------------------------------------------------------------
in C                          7.6      1     (825 7.6 times SLOWER)
single precision Fortran      3.23     1     (825 3.23 times SLOWER)
double precision Fortran      6.7      1     (825 6.7 times SLOWER)

WHAT DOES ALL THIS MEAN? HP advertises the 825 as a 0.5 megaflop
machine. My results show it as about a 0.03 megaflop machine. The
benchmarks were done several times with different machine
configurations at the Neeley sales region (Hal Shearer, hpuecoa!hals).
HP has been very helpful but has not been able to figure out why
these results are so bad.

HP has a new 835 that is substantially faster. This benchmark has
been run at Fort Collins but I haven't gotten the results yet. I
have heard that they are faster than a 3 cpu 500 however.

A LESSON EVERYONE SHOULD KNOW: BENCHMARK YOUR APPLICATION BEFORE YOU
BUY A MACHINE.                           ^^^^^^^^^^^^^^^^

Is the 825 really that bad? Could there be a problem with the 825
I tested? The sieve benchmark came out 12 times faster than a
single cpu 500 and all my I/O benchmarks came out very fast. I
think the 825 has a real problem with number crunching.

I then looked at alternatives to the 825. I tried a 350 but I
currently have about 8 to 10 users on every day. We have 29 RS232
ports, 6 HP-IB cards (4 disks, 2 plotters, 1 9-track tape, 2 cartridge
tapes), 2 printers, 3 modems and 3 spectrometers connected to the 500.
(The benchmarks were also done on the 500 WHILE a program was locked in
memory gathering data from a spectrometer in real time!) The 350 does
not have enough slots to put all this stuff in it.

CONCLUSIONS:

The HP9000 series 500 is a DAMN GOOD machine. HP doesn't seem to know
how good it is! I guess because they failed to market it, those who
bought it now have to suffer.

HOW GOOD IS IT?

As I write this note, we have been up 129 days. We have never had an
operating system crash! In 4 years, we have only gone down for adding
new boards, occasional disk image backups, or power failures. We have
been up for as long as 6 months! We have 8 to 10 users on every day,
and gather data from 3 different spectrometers while users are doing
compute modeling and interactive analysis with graphics (on HP2623A and
HP2393A terminals). The machine is currently the central node in our
Branch uucp network and a nationwide uucp network of spectroscopy
groups. During power failures, we have never lost data except once:
our air conditioner caught fire and I pulled the plug!
(We only lost one small text file, and we had many active users on at
the time.) The process ID rolls over (32000 or so processes) every day
or two. We have had only two hardware problems: the main power supply
went out shortly after installation and an 8-channel mux went bad about
a year ago.

I HAVE NEVER SEEN SUCH A SOLID MACHINE!

Contrast the above to our VAXes and PEs: they have to reboot every few
days to a couple of weeks or so, and have hardware problems about every
month (of course they are getting old and are older technology).

************************************************************************
*                                                                      *
*   BRING BACK THE 500 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! *
*                                                                      *
************************************************************************

Below is the "weird box filter" benchmark. Try it yourself. I would
be interested in what you find.

Roger N. Clark
Research Scientist
U.S. Geological Survey, MS 964
Box 25046 Federal Center
Denver, CO 80225-0046
(303) 236-1332    FTS 776-1332
{known-world}!hplabs!hpfcla!hpfcse!hpuecoa!bgphp1!rclark

#---------------------------------- cut here ----------------------------------
# This is a shell archive.  Remove anything before this line,
# then unpack it by saving it in a file and typing "sh file".
#
# Wrapped by rclark at bgphp1 on Fri Mar 25 08:07:26 1988
#
# This archive contains:
#       makefile        speedtest.f     multi.sh        timeit
#
# Error checking via wc(1) will be performed.
# Error checking via sum(1) will be performed.

echo x - makefile
cat >makefile <<'@EOF'
CFLAGS=
FFLAGS=
LFLAGS=
RFLAGS= -6% -C
GET= get
GFLAGS=

a.out: speedtest.f
	f77 $(FFLAGS) speedtest.f
@EOF
set -- `sum <makefile`; if test $1 -ne 7217
then
    echo ERROR: makefile checksum is $1 should be 7217
fi
if test "`wc -lwc <makefile`" != ' 9 14 105'
then
    echo ERROR: wc results of makefile are `wc -lwc <makefile` should be 9 14 105
fi

chmod 644 makefile

echo x - speedtest.f
cat >speedtest.f <<'@EOF'
C  array addressing and number crunching
      implicit integer*4 (i-n)
      common array1(200,200), array2(200,200), z(9)

      limit = 200
      ktimes = 1

C  initialize arrays

      do 10 j= 1, 9
         z(j) = float(j)+2.0
10    continue

      x = 1.0
      do 30 j = 1, limit
         do 20 i = 1, limit
            x = x + 1.0
            array1(i,j) = x
20       continue
30    continue

      do 200 k = 1, ktimes

C  main computation loop: Weird Box Filter

      do 100 j = 2, limit-1
         do 50 i = 2, limit-1
            array2(i,j) =
     1         ( array1(i-1,j-1)*2.0*z(1)
     1          +array1(i  ,j-1)*2.0/z(2)
     1          +array1(i+1,j-1)*2.0*z(3)
     1          +array1(i-1,j  )*2.0/z(4)
     1          +array1(i  ,j  )*2.0*z(5)
     1          +array1(i+1,j  )*2.0/z(6)
     1          +array1(i-1,j+1)*2.0*z(7)
     1          +array1(i  ,j+1)*2.0/z(8)
     1          +array1(i+1,j+1)*2.0*z(9))
     1          /(9.0*(z(1)-z(2)+z(3)-
     1                 z(4)+z(5)-z(6)+z(7)-
     1                 z(8)+z(9)))
50       continue
100   continue

C  main computation loop complete

200   continue
      stop
      end
@EOF
set -- `sum <speedtest.f`; if test $1 -ne 11286
then
    echo ERROR: speedtest.f checksum is $1 should be 11286
fi
if test "`wc -lwc <speedtest.f`" != ' 52 130 1447'
then
    echo ERROR: wc results of speedtest.f are `wc -lwc <speedtest.f` should be 52 130 1447
fi

chmod 644 speedtest.f

echo x - multi.sh
cat >multi.sh <<'@EOF'
for i
do
    a.out &
done
wait
@EOF
set -- `sum <multi.sh`; if test $1 -ne 2160
then
    echo ERROR: multi.sh checksum is $1 should be 2160
fi
if test "`wc -lwc <multi.sh`" != ' 6 7 29'
then
    echo ERROR: wc results of multi.sh are `wc -lwc <multi.sh` should be 6 7 29
fi

chmod 744 multi.sh

echo x - timeit
cat >timeit <<'@EOF'
set -x
echo "********** weird box filter *********"
/bin/time /bin/sh multi.sh 1
/bin/time /bin/sh multi.sh 1
/bin/time /bin/sh multi.sh 1

/bin/time /bin/sh multi.sh 1 2
/bin/time /bin/sh multi.sh 1 2
/bin/time /bin/sh multi.sh 1 2

/bin/time /bin/sh multi.sh 1 2 3
/bin/time /bin/sh multi.sh 1 2 3
/bin/time /bin/sh multi.sh 1 2 3

/bin/time /bin/sh multi.sh 1 2 3 4
/bin/time /bin/sh multi.sh 1 2 3 4
/bin/time /bin/sh multi.sh 1 2 3 4

/bin/time /bin/sh multi.sh 1 2 3 4 5
/bin/time /bin/sh multi.sh 1 2 3 4 5
/bin/time /bin/sh multi.sh 1 2 3 4 5

/bin/time /bin/sh multi.sh 1 2 3 4 5 6 7
/bin/time /bin/sh multi.sh 1 2 3 4 5 6 7
/bin/time /bin/sh multi.sh 1 2 3 4 5 6 7

/bin/time /bin/sh multi.sh 1 2 3 4 5 6 7 8 9 10
/bin/time /bin/sh multi.sh 1 2 3 4 5 6 7 8 9 10
/bin/time /bin/sh multi.sh 1 2 3 4 5 6 7 8 9 10
echo "************ DONE weird box filter benchmark ************"
@EOF
set -- `sum <timeit`; if test $1 -ne 200
then
    echo ERROR: timeit checksum is $1 should be 200
fi
if test "`wc -lwc <timeit`" != ' 33 175 888'
then
    echo ERROR: wc results of timeit are `wc -lwc <timeit` should be 33 175 888
fi

chmod 755 timeit

exit 0

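The timeit script in the archive spells out each run by hand. The same schedule (three repeats at 1, 2, 3, 4, 5, 7 and 10 tasks) could also be generated by a loop. What follows is only a minimal sketch and is not part of the posted archive: the names TIMER, BENCH, REPEATS, task_args and run_all are hypothetical, and the "set -x" and banner echo lines of the original are left out.

```shell
#!/bin/sh
# Loop-driven variant of timeit (a sketch; all names here are hypothetical).
TIMER=${TIMER-/bin/time}            # timing command, as in the posted script
BENCH=${BENCH-"/bin/sh multi.sh"}   # multi.sh starts one a.out per argument
REPEATS=${REPEATS-3}                # timeit runs every task count three times

# Print "1 2 ... COUNT", the argument list multi.sh expects for COUNT tasks.
task_args() {
    count=$1
    i=1
    args=""
    while [ "$i" -le "$count" ]; do
        args="$args $i"
        i=$((i + 1))
    done
    echo $args
}

# Run the benchmark REPEATS times at each task count used by timeit.
run_all() {
    for n in 1 2 3 4 5 7 10; do
        r=0
        while [ "$r" -lt "$REPEATS" ]; do
            # Unquoted on purpose: TIMER and BENCH may hold several words.
            $TIMER $BENCH $(task_args "$n")
            r=$((r + 1))
        done
    done
}

# Only run when asked to, so the file can also be sourced for its functions.
if [ "${1:-}" = run ]; then
    run_all
fi
```

Saved under some name such as timeit2 (also hypothetical), "sh timeit2 run" would reproduce the 21 timed runs; setting REPEATS=1 gives a quicker pass.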
diamant@hpfclp.HP.COM (John Diamant) (03/29/88)
> I have benchmarked the HP9000 series 825 using number crunching
> programs and find:
>
>     In a multitasking environment the 825 can be at least
>
>                       15 TIMES SLOWER
>                       ^^^^^^^^^^^^^^^
>     than a 3 cpu 500!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

These are not the numbers I get. I'm not sure where all the discrepancy
came from, but I ran your own, unmodified program on our 825. Our
machine has more memory, and possibly a later version of the 2.0
prerelease.

However, I have to point out that your benchmark was run unoptimized,
which is not a good idea on a RISC-based machine. As you can see from
the numbers below, the optimization (the HP9000/825 HP-UX2.0opt rows)
makes quite a difference. Compile with "-O" to get optimization. It
will make much more of a difference on a RISC machine than a CISC
machine, so running unoptimized on both machines is not a fair
comparison (it's much less important on the 500).

I wasn't sure whether you counted both user and sys time in CPU time.
I did in my numbers. System time varied between .3 and 3 seconds.

> The details:
>
>                A Multitasking, CPU intensive Benchmark
>
>                               Real Time
> -----------------------------------------------------------------------
>                                       Number of Tasks
> System                     1     2     3     4     5     7    10    12
> -----------------------------------------------------------------------
> HP9000/500 3 CPUs        5.9   6.0   6.3   8.4  10.5  14.7  21.5  27.8
> HP9000/825 HPUX1.2      29.1  58.1  87.2 116.3 145.6 205.0 291.5 350.1
HP9000/825 HP-UX2.0pre     3.0   5.3   7.8  10.4  13.2  18.3  27.4  31.7
HP9000/825 HP-UX2.0opt     2.4   4.3   6.5   9.3   9.9  13.9  18.9  23.0
> -----------------------------------------------------------------------
>                               CPU Time
> -----------------------------------------------------------------------
>                                       Number of Tasks
> System                     1     2     3     4     5     7    10    12
> -----------------------------------------------------------------------
> HP9000/500 3 CPUs        5.8  11.8  18.4  24.4  30.7  43.0  62.2  81.4
> HP9000/825 HPUX1.2      29.0  57.9  87.0 116.0 144.7 202.7 288.5 346.5
HP9000/825 HP-UX2.0pre     2.7   5.2   7.7  10.3  12.9  18.2  26.2  31.1
HP9000/825 HP-UX2.0opt     2.0   3.9   5.8   7.8   9.5  13.3  18.7  19.6
> -----------------------------------------------------------------------
> NOTES:
> HP9000/500: 6.5 MBytes main memory, 3 floating point CPUs, 65MByte
>             system, 55MByte /tmp disk, 132MByte user disk, 571MByte
>             data disk (Used by virtual memory), HP-UX 5.21.
>
> HP9000 series 825 (HP Precision Architecture, RISC machine) 16 MBytes of
>             main memory, single 404MByte disk drive. HP Demo, 3/23/88.
>             HP-UX 1.2 (Also tried it on HP-UX 2.0 pre-release with slightly
>             worse results).

HP9000 series 825: 32 MBytes of main memory, single 404Mb disk drive.
            HP-UX 2.00 prerelease (probably more recent than yours).
            The opt entries were run through the optimizer; the other
            ones weren't.

> -----------------------------------------------------------------------
>
> WHAT DOES ALL THIS MEAN? HP advertises the 825 as a 0.5 megaflop
> machine. My results show it as about a 0.03 megaflop machine. The
> benchmarks were done several times with different machine
> configurations at the Neeley sales region (Hal Shearer, hpuecoa!hals).
> HP has been very helpful but has not been able to figure out why
> these results are so bad.

My numbers are coming out over 10 times better than yours, so the .5
megaflop seems about right. I don't know why you were getting so much
worse numbers, but I doubt the extra 16 MBytes was the difference, since
your program was so small (unless it dynamically allocated a whole bunch
of memory).

Floating point hardware in the 825 is essentially the same as in the
500, so the multi-CPU 500 could be somewhat better. Other series 800
machines have faster floating point hardware. In operations other than
floating point, the 825 is faster than the 500 (even multi-CPU).

> HP has a new 835 that is substantially faster. This benchmark has
> been run at Fort Collins but I haven't gotten the results yet. I
> have heard that they are faster than a 3 cpu 500 however.

The 835 uses faster floating point hardware, so this would be no
surprise.

> A LESSON EVERYONE SHOULD KNOW: BENCHMARK YOUR APPLICATION BEFORE YOU
> BUY A MACHINE.                           ^^^^^^^^^^^^^^^^

This is a good lesson in any case, though I think the machine you were
testing on may have been misconfigured.

> Is the 825 really that bad? Could there be a problem with the 825
> I tested? The sieve benchmark came out 12 times faster than a
> single cpu 500 and all my I/O benchmarks came out very fast. I
> think the 825 has a real problem with number crunching.

This is not consistent with other benchmarks, so I suspect it was
something with the particular 825 you tested on.

John Diamant
SDE                             UUCP: {hplabs,hpfcla}!hpfclp!diamant
Hewlett-Packard Co.             ARPA Internet: diamant%hpfclp@hplabs.HP.COM
Fort Collins, CO

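Diamant's two points (compile with -O, and "over 10 times better") can be separated with a little arithmetic on the single-task CPU times from his table. The sketch below is only an illustration: the ratio helper and the variable names are made up here, and the three times are the ones quoted in the post.

```shell
# ratio A B: print A/B to two decimals (awk does the floating-point math).
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# Single-task CPU times in seconds, copied from the tables in this thread.
clark_825=29.0   # Clark's 825 run, HP-UX 1.2, compiled without -O
pre_825=2.7      # Diamant's 825, HP-UX 2.0 prerelease, compiled without -O
opt_825=2.0      # Diamant's 825, same system, compiled with -O

echo "Clark's run vs. unoptimized rerun: $(ratio "$clark_825" "$pre_825")x"
echo "unoptimized vs. optimized rerun:   $(ratio "$pre_825" "$opt_825")x"
```

On these numbers the optimizer is worth roughly a third (2.7 s down to 2.0 s), while the gap between the two unoptimized runs is more than a factor of ten, which suggests the missing -O by itself does not explain the original result.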
shankar@hpclscu.HP.COM (Shankar Unni) (03/30/88)
>
>     In a multitasking environment the 825 can be at least
>
>                       15 TIMES SLOWER
>                       ^^^^^^^^^^^^^^^
>     than a 3 cpu 500!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
>
> NOTES:
> HP9000/500: 6.5 MBytes main memory, 3 floating point CPUs, 65MByte
>             system, 55MByte /tmp disk, 132MByte user disk, 571MByte
>             data disk (Used by virtual memory), HP-UX 5.21.
>
> HP9000 series 825 (HP Precision Architecture, RISC machine) 16 MBytes of
>             main memory, single 404MByte disk drive. HP Demo, 3/23/88.
>             HP-UX 1.2 (Also tried it on HP-UX 2.0 pre-release with slightly
>             worse results).
>

Errr..., Before posting such stuff, it might have been nice to consult
with the HP support org.

1. On your 500, your swap area and your user data area are on different
   disks, thus leading to less contention on the disk. On your 825
   system, you used only one disk.

2. Did you compile your benchmark with optimization (-O)? From the
   makefile you attached, obviously not!

3. The 571 meg. disk on your 550 (a 7937, n'est ce pas?) is faster than
   the 404 meg disk (a 7935) on the 825.

The first item is especially damaging, since your multitasking system is
obviously very swap-intensive.

Also, each system has one or more things that it does well. In the case
of the 825, what it does very well indeed is cpu-intensive stuff. The
cpu-to-memory bandwidth is good, too, but not on the same scale as the
raw cpu speed. The compilers on the s800, therefore, rely heavily on
the optimizer to take advantage of multiple registers and reduce
physical memory usage as much as possible. Therefore, to get the best
performance out of applications on the 825, THEY NEED TO BE OPTIMIZED!!!

>
> Is the 825 really that bad? Could there be a problem with the 825
> I tested. The sieve benchmark came out 12 times faster than a
                                         ^^^^^^^^^^^^^^^
> single cpu 500 and all my I/O benchmarks came out very fast. I
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> think the 825 has a real problem with number crunching.
>

See, there are no problems with integer arithmetic and I/O (which is
memory mapped). The problems you are having are with floating-point
performance (the famous MFLOPS number). This is being addressed: the
825 is rated at 0.7 MFLOPS (LINPACK?), and there is now a newer model
(the 835) out there with a much faster floating-point card (> 2 times).

> I HAVE NEVER SEEN SUCH A SOLID MACHINE!
>
> ************************************************************************
> *                                                                      *
> *   BRING BACK THE 500 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! *
> *                                                                      *
> ************************************************************************
>

Thanx for the compliment. The 825 is pretty solid, too. Our s800 HP-UX
(on a different CPU) has not had a crash in over 2 years. The only
problems we ever had were a few mysterious swap errors which were traced
down to an old, faulty disk with a bad head.

The 500 had the unfortunate problem of being squeezed from both above
(by the new s800's) and below (by the s300's - 680x0 based boxes). This
positioning problem coupled with the (relatively) soft order situation
was what led to its obsolescence. The continuing effort to maintain and
enhance HP-UX on several radically different architectures was getting a
little difficult, and the resource crunch was much too severe to justify
such an effort.

Maybe if there's enough demand, there would be talk of reviving it,
though I feel that there are other alternatives. It won't be long at
all before we have an alternative machine that has the price of a 550
with much better performance. In general, reducing the number of
disparate machines has the (overwhelming) benefit of offering customers
a much wider range of performance in the *same* architecture.
The hard part that a company faces in such a situation is choosing which
set of customers to hurt: the old loyal customers who are very happy
with what they have, or the prospective new customers who are looking
for a wide range of models to choose from and a relatively easy
migration path.

Besides - the s500 HP-UX was sort of an unusual item - its guts are
different from the two major flavors out there (sysV, BSD) while trying
valiantly to present the same interface. Maintaining it was no simple
matter of keeping the guts up to date across different models (which is
what is done for the s300 and the s800 models), but of completely
re-implementing each new feature.

The file system would have been most deeply affected by this - it was
radically different from anything in sysV or BSD. The s300/s800 version
of HP-UX has an essentially sysV file system with BSD extensions, and
keeping track of industry standard features like NFS was a simple matter
of taking Sun's code and making relatively minor modifications to fit it
into HP-UX. For the s500, it would have been an implement-from-scratch
affair.

Lots of thought went into the decision to scrap the s500. So bear with
us..

--scu

daryl@hpcllcm.HP.COM (Daryl Odnert) (03/30/88)
I would like to second Shankar's plea to make use of the -O
(optimization) option before making your final timings on the system.
Please try this and let us know what happens.

Thanks,
Daryl Odnert
Code Generation/Optimization Project
HP Computer Language Lab
{outside world}!hplabs!hpcllcm!daryl

jeffh@weycord.WEYCO.COM (03/30/88)
Ya know, I've run a few tests with the 825 and didn't see much
performance improvement. I'm an s300 user: a couple of users and a lot
of I/O and math. So what is "RISK" after the marketing hoopla settles?
I haven't seen any "real" performance increase.

The s800 reminds me of the 9817 (whatever that thing was with a big
black and white monitor) and the s500. Seemed like a good idea until it
got lost in the background noise. At least s300 and s500 HP-UX were
close to the same.

It might be a good idea to wait a few years to see if the s800 follows
the 9817's path...

Jeff Harrell
hpubvwa!weycord!jeffh

mash@mips.COM (John Mashey) (03/30/88)
In article <830004@bgphp1.UUCP> rclark@bgphp1.UUCP (Roger N. Clark) writes:
>I have benchmarked the HP9000 series 825 using number crunching
>programs and find:
>    The 825 is 5 to 7 times SLOWER than a single cpu 500!!!!!

>               A Multitasking, CPU intensive Benchmark
>
>                              Real Time
>-----------------------------------------------------------------------
>                                      Number of Tasks
>System                     1     2     3     4     5     7    10    12
>-----------------------------------------------------------------------
>HP9000/500 3 CPUs        5.9   6.0   6.3   8.4  10.5  14.7  21.5  27.8
>HP9000/825 HPUX1.2      29.1  58.1  87.2 116.3 145.6 205.0 291.5 350.1
MIPS M/1000                .7   1.0   1.3   1.8   2.4   3.1   4.6
HP9000/825 GUESS            2     3     4     6     7     9    14  ....

The 825's FPU must be broken or not there. As one calibration, the
FORTRAN SP Linpack MFLOPS for these are:

    .62     HP9000 Series 825S
    .098    HP9000 Series 500

As another, from other FP benchmarks we've seen, we'd guess an 825S to
have about 30% of the performance of one of our MIPS M/1000s, whose
numbers were added above. As can be seen, the 825 appears consistently
about a factor of 20 slower than you'd expect. Trying this on one of
our boxes with no FPU slows it down by about 40X [kernel emulation], so
the 825 may be doing such emulation also. I'd really be surprised if it
were anything other than that.

I think HP is pretty conservative and realistic on benchmarking: see,
for example, the "HP 9000 Series 800 Performance Brief", 5/87, a fine
document, well-written, with a broad coverage of useful benchmarks.
(This is presumably gettable from local HP offices (?))
--
-john mashey    DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP:   {ames,decwrl,prls,pyramid}!mips!mash  OR  mash@mips.com
DDD:    408-991-0253 or 408-720-1700, x253
USPS:   MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086

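Mashey's "factor of 20" can be rechecked by dividing Clark's measured 825 real times by the GUESS row. This is a rough sketch only: the shortfall helper is a made-up name, and pairing the seven guesses with the 1, 2, 3, 4, 5, 7 and 10 task columns is an assumption (the post leaves the last column empty, and timeit stops at 10 tasks).

```shell
# Clark's measured 825 real times and Mashey's guessed 825 times (seconds)
# for 1 2 3 4 5 7 10 tasks. The column pairing is an assumption.
measured="29.1 58.1 87.2 116.3 145.6 205.0 291.5"
guess="2 3 4 6 7 9 14"

# shortfall MEASURED GUESS: print measured/guess per column, rounded to
# whole numbers, all on one line.
shortfall() {
    echo "$1 | $2" | awk -F'|' '{
        n = split($1, m, " "); split($2, g, " ")
        out = ""
        for (i = 1; i <= n; i++)
            out = out (i > 1 ? " " : "") sprintf("%.0f", m[i] / g[i])
        print out
    }'
}

shortfall "$measured" "$guess"
# prints: 15 19 22 19 21 23 21
```

The ratios cluster around 20, in line with his estimate and within a factor of two of the 40X slowdown he reports for kernel emulation on an FPU-less MIPS box.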
decot@hpisod2.HP.COM (Dave Decot) (03/31/88)
Shankar Unni writes:

> ... The s300/s800 version of HP-UX has an essentially sysV file system
> with BSD extensions, and keeping track of industry standard features
> like NFS was a simple matter of taking Sun's code and making relatively
> minor modifications to fit it into HP-UX. For the s500, it would have
> been an implement-from-scratch affair.

A slight correction to forestall more misconceptions...

Both the Series 300 and Series 800 use a BSD-based (McKusick-style) file
system (indeed, most of the kernel is BSD with bug fixes), but the
interface above it is compatible with System V (although many BSD
features are also present).

Clearly, if we had used the System V file system code, NFS and other
networking products such as ARPA/BSD services would have been much
harder to port.

Dave Decot
hpda!decot

jarmo@tut.FI (Jarmo Sorvari) (03/31/88)
In article <830004@bgphp1.UUCP> rclark@bgphp1.UUCP (Roger N. Clark) writes:
>I have benchmarked the HP9000 series 825 using number crunching
>programs and find:
>    The 825 is 5 to 7 times SLOWER than a single cpu 500!!!!!
>    In a multitasking environment the 825 can be at least
>                      15 TIMES SLOWER
>                      ^^^^^^^^^^^^^^^
>    than a 3 cpu 500!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

>Below are the results of a simple "wierd box filter" program. This
>program shows a typical response in our shop. It does both array
>indexing and computation on elements in the arrays. The compiled
>program is only about 350KBytes in size and it does not page to
>disk.
>
>               A Multitasking, CPU intensive Benchmark
>
>                              Real Time
>-----------------------------------------------------------------------
>                                      Number of Tasks
>System                     1     2     3     4     5     7    10    12
>-----------------------------------------------------------------------
>HP9000/500 3 CPUs        5.9   6.0   6.3   8.4  10.5  14.7  21.5  27.8
>HP900/825 HPUX1.2       29.1  58.1  87.2 116.3 145.6 205.0 291.5 350.1
>-----------------------------------------------------------------------
>                              CPU Time
>-----------------------------------------------------------------------
>                                      Number of Tasks
>System                     1     2     3     4     5     7    10    12
>-----------------------------------------------------------------------
>HP9000/500 3 CPUs        5.8  11.8  18.4  24.4  30.7  43.0  62.2  81.4
>HP900/825 HPUX1.2       29.0  57.9  87.0 116.0 144.7 202.7 288.5 346.5
>-----------------------------------------------------------------------

I tried your benchmark also, and got results that look the way they
should, at least to my mind. The 9000/840 is the first RISC model they
produced (for HP-UX); it is implemented in TTL technology and rated at
4.5 MIPS, while the 825 is an NMOS machine rated at 7 MIPS (if my
memory serves me right).

               A Multitasking, CPU intensive Benchmark

                              Real Time
-----------------------------------------------------------------------
                                      Number of Tasks
System                     1     2     3     4     5     7    10    12
-----------------------------------------------------------------------
HP9000/500 3 CPUs        5.9   6.0   6.3   8.4  10.5  14.7  21.5  27.8
HP9000/825 HPUX1.2      29.1  58.1  87.2 116.3 145.6 205.0 291.5 350.1
HP9000/840 HPUX1.2-      3.0   7.8   9.9  11.6  14.3  20.2  28.9  37.4
HP9000/840 HPUX1.2+      2.1   3.9   5.8   7.8  10.9  16.2  26.0  32.3

"-": no optimization for the FORTRAN compilation, "+": full optimization
-----------------------------------------------------------------------
                              CPU Time
-----------------------------------------------------------------------
                                      Number of Tasks
System                     1     2     3     4     5     7    10    12
-----------------------------------------------------------------------
HP9000/500 3 CPUs        5.8  11.8  18.4  24.4  30.7  43.0  62.2  81.4
HP9000/825 HPUX1.2      29.0  57.9  87.0 116.0 144.7 202.7 288.5 346.5
HP9000/840 HPUX1.2-      2.8   5.8   8.6  11.3  14.5  20.7  29.0  35.3
HP9000/840 HPUX1.2+      2.0   3.9   5.9   8.0   9.8  13.6  19.6  24.6

"-": no optimization for the FORTRAN compilation, "+": full optimization
-----------------------------------------------------------------------
NOTES:
HP9000/500: 6.5 MBytes main memory, 3 floating point CPUs, 65MByte
            system, 55MByte /tmp disk, 132MByte user disk, 571MByte
            data disk (Used by virtual memory), HP-UX 5.21.

HP9000 series 825 (HP Precision Architecture, RISC machine) 16 MBytes of
            main memory, single 404MByte disk drive. HP Demo, 3/23/88.
            HP-UX 1.2 (Also tried it on HP-UX 2.0 pre-release with
            slightly worse results).

HP9000 series 840 (HP Precision Architecture, RISC machine, TTL
            technology). 8 Mb of main memory, single 570 Mb disk drive.
            HP-UX 1.2. Tests run with a very small load, five users
            logged in (using a Bridge terminal server, and ARPA/Berkeley
            running in the 840). HP gives the machine the nominal
            performance index 4.5 MIPS, as opposed to the 7 MIPS for
            the 825.
-----------------------------------------------------------------------

>Is the 825 really that bad? Could there be a problem with the 825
>I tested.

I suspect there was.
--
-----------------------------------------------------------------------------
! Jarmo Sorvari                              Control Engineering Laboratory !
! ...!mcvax!tut.fi!jarmo                   Tampere University of Technology !
--------------------------------------- BOX 527, 33101 Tampere, Finland -----

rclark@bgphp1.UUCP (Roger N. Clark) (04/01/88)
WELL, my posting has certainly generated a lot of response!!

THERE WAS A PROBLEM WITH THE HP9000/825 TESTED!!!

    HP 825 math IS NOT 15x slower than a 500!
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

    HP 825 math is about 1.2x FASTER than a 500 (for the box filter
                         ^^^^^^^^^^^^
    benchmark).

The 825 math is barely faster than a 500 in a multitasking environment.
In my opinion, the 825 does not represent a large enough increase to
justify trading in the 500 (which in my case would require about
$40,000). I am concerned (and so are my users) that any replacement
machine be as solid as the 500 (e.g. as of this writing we have been up
136 days and have never had a crash).

Can anyone tell me their experience with the HP9000 series 825 or 300
(or for that matter other machines in this price/speed class)? Are
there other machines that are as solid as the 500?

The problem on the 825 turned out to be that the floating point chip
was not working. (Shouldn't there have been a check of all parts of
the system at boot time and the problem reported? Would a 500 report
such a problem? Would a 300?)

When the problem first became apparent, Hal Shearer (HP) worked very
hard to try and find out why. We tried different configurations. We
did the test on an 840 with much better results (I didn't keep them
because I did not run my entire set and we were trying to figure out
why the 825 was so slow). We also tried the benchmarks on a pre-release
version of HP-UX 2.0 on the 825. After several weeks of not finding any
reason for the slow results, I decided to post a note to the net. I am
sorry that the note may have made the 825 and HP look unjustifiably bad
(but it was their machine). Hal is now trying to find out why the 825
didn't report something was not working correctly.

The numbers below show much better results. I have also included
numbers for an 840 (thanks to: Jarmo Sorvari, Finland) and an 835
(thanks to: Bob Montgomery <hpfcse!hpfcmr!bobm>). It looks like the
835 is a fast machine!

          A Multitasking, CPU intensive Benchmark (03/31/88)

                         Real Time (seconds)
-----------------------------------------------------------------------
                                      Number of Tasks
System                     1     2     3     4     5     7    10    12
-----------------------------------------------------------------------
HP9000/835 HP-UX:2.0     0.5   1.0   1.5   2.0   2.4   3.4   4.9   5.9
HP9000/825 HP-UX:1.2     1.9   3.8   5.7   7.6   9.5  13.3  19.1  22.8
HP9000/500 3 CPUs        5.9   6.0   6.3   8.4  10.5  14.7  21.5  27.8
HP9000/840 HP-UX:1.2     2.1   3.9   5.8   7.8  10.9  16.2  26.0  32.3
-----------------------------------------------------------------------

                  CPU Time (user + sys, seconds)
-----------------------------------------------------------------------
                                      Number of Tasks
System                     1     2     3     4     5     7    10    12
-----------------------------------------------------------------------
HP9000/835 HP-UX:2.0     0.4   1.0   1.4   1.9   2.4   3.4   4.9   5.9
HP9000/825 HP-UX:1.2     1.9   3.8   5.6   7.5   9.4  13.2  18.9  22.7
HP9000/500 3 CPUs        5.8  11.8  18.4  24.4  30.7  43.0  62.2  81.4
HP9000/840 HP-UX:1.2     2.0   3.9   5.9   8.0   9.8  13.6  19.6  24.6
-----------------------------------------------------------------------
NOTES:
HP9000 series 835 (HP Precision Architecture, RISC machine) HP-UX 2.0.
            HP Fort Collins machine, conducted by Bob Montgomery:
            hpuecoa!hpfcse!hpfcmr!bobm, Mar 31, 1988
            (the twelve task values were extrapolated from 10 tasks)

HP9000 series 825 (HP Precision Architecture, RISC machine) 16 MBytes of
            main memory, single 404MByte disk drive. HP Demo, 3/31/88.
            HP-UX 1.2.

HP9000/500: 6.5 MBytes main memory, 3 floating point CPUs, 65MByte
            system, 55MByte /tmp disk, 132MByte user disk, 571MByte
            data disk (Used by virtual memory), HP-UX 5.21.

HP9000 series 840 (HP Precision Architecture, RISC machine, TTL
            technology). 8 Mb of main memory, single 570 Mb disk drive.
            HP-UX 1.2. Tests run with a very small load, five users
            logged in (using a Bridge terminal server, and ARPA/Berkeley
            running in the 840). HP gives the machine the nominal
            performance index 4.5 MIPS, as opposed to the 7 MIPS for
            the 825.
            FROM: Jarmo Sorvari, Control Engineering Laboratory,
            Tampere University of Technology, BOX 527, 33101 Tampere,
            Finland (!mcvax!tut.fi!jarmo)

CONCLUSIONS:

The 500 is still a good machine. BRING BACK THE 500!!!!
Or: give the 500 owners a deal they can't refuse on 835's!

Thanks to Hal and Bob at HP. Everyone has been very courteous. HP has
some very good products. It is unfortunate that I ran into what is most
likely a very unusual problem that resulted in a lot of confusion.

Roger N. Clark
U.S. Geological Survey, MS 964
Box 25046 Federal Center
Denver, CO 80225-0046
{known-world}!hplabs!hpfcla!hpfcse!hpuecoa!bgphp1!rclark
(Any opinions expressed here are mine and not necessarily those of
the USGS)

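The "about 1.2x" figure can be rechecked column by column from the 03/31/88 real-time table. The sketch below is an illustration only: the table and versus_500 helpers are hypothetical names, and the numbers are copied from Clark's table.

```shell
# Real times in seconds from the 03/31/88 table, one machine per row;
# the columns are 1 2 3 4 5 7 10 12 tasks.
table() {
cat <<'EOF'
835 0.5 1.0 1.5 2.0 2.4 3.4 4.9 5.9
825 1.9 3.8 5.7 7.6 9.5 13.3 19.1 22.8
500 5.9 6.0 6.3 8.4 10.5 14.7 21.5 27.8
840 2.1 3.9 5.8 7.8 10.9 16.2 26.0 32.3
EOF
}

# For every machine except the 500, print the 500's time divided by that
# machine's time; values above 1 mean "faster than the 3-CPU 500".
versus_500() {
    table | awk '
        { name[NR] = $1; ncols = NF
          for (c = 2; c <= NF; c++) t[NR, c] = $c }
        $1 == "500" { ref = NR }
        END {
            for (r = 1; r <= NR; r++) {
                if (r == ref) continue
                line = name[r] ":"
                for (c = 2; c <= ncols; c++)
                    line = line sprintf(" %.2f", t[ref, c] / t[r, c])
                print line
            }
        }'
}

versus_500
```

For the 825 row this gives 3.11 at one task, 1.58 at two, and between 1.11 and 1.22 from three tasks up, which matches the claim that the repaired 825 only barely beats a fully loaded 3-CPU 500.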
rclark@bgphp1.UUCP (Roger N. Clark) (04/01/88)
I should qualify the results of the 500 versus 825 speeds. If you do not have a fully configured 500 (3 floating point cpus, 6+ megabytes of memory) and run multitasking, then the 825 might be a real benefit. After all it is about 3x faster than a single cpu 500. It just happens that in my case I have about 10 people sharing the cpus. How many scientists/engineers can afford $50k+ machines dedicated to one person, one task?
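For readers who want to check the "about 3x faster than a single cpu 500" figure against the 03/31/88 tables, here is a minimal sketch of the arithmetic. The numbers are copied from the tables in this thread; the function names (`ratios`, `per_task_cpu`) are made up for illustration and are not from any of the posters' programs.

```python
# Figures copied from the 03/31/88 tables in this thread.
# Helper names are illustrative only.
TASKS = [1, 2, 3, 4, 5, 7, 10, 12]

REAL_TIME = {
    "500 (3 CPUs)": [5.9, 6.0, 6.3, 8.4, 10.5, 14.7, 21.5, 27.8],
    "825": [1.9, 3.8, 5.7, 7.6, 9.5, 13.3, 19.1, 22.8],
}
CPU_TIME = {
    "500 (3 CPUs)": [5.8, 11.8, 18.4, 24.4, 30.7, 43.0, 62.2, 81.4],
    "825": [1.9, 3.8, 5.6, 7.5, 9.4, 13.2, 18.9, 22.7],
}


def ratios(table, slow, fast):
    """Return slow/fast for each task count; > 1 means `fast` wins."""
    return [round(s / f, 2) for s, f in zip(table[slow], table[fast])]


def per_task_cpu(table, system):
    """CPU time divided by the number of tasks, i.e. cost of one task."""
    return [round(t / n, 2) for t, n in zip(table[system], TASKS)]


if __name__ == "__main__":
    print("tasks            :", TASKS)
    print("real time 500/825:", ratios(REAL_TIME, "500 (3 CPUs)", "825"))
    print("CPU/task, 500    :", per_task_cpu(CPU_TIME, "500 (3 CPUs)"))
    print("CPU/task, 825    :", per_task_cpu(CPU_TIME, "825"))
```

With one task the 825 finishes in roughly a third of the time (5.9 vs 1.9), which is the single-CPU comparison. From three tasks upward all three CPUs of the 500 are busy and the real-time ratio drops to about 1.1 to 1.2, which is why a fully loaded 3 CPU 500 and an 825 come out close in the multitasking case.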
daryl@hpcllcm.HP.COM (Daryl Odnert) (04/04/88)
Word has been circulating here at HP that Roger Clark's S825 had
a bad floating-point coprocessor in it.  One of the features of the system,
however, is that if no floating-point coprocessor is present in the hardware,
the floating-point instructions are emulated in software.  This accounts for
the poor performance on this benchmark.  Apparently, the system believed that
no coprocessor was available.

Can you verify this for those of us who are following this on notes, Roger?

Thanks,
Daryl Odnert
HP Computer Language Lab
hplabs!hpcllcm!daryl
daryl@hpcllcm.HP.COM (Daryl Odnert) (04/04/88)
> HP 825 math is about 1.2x FASTER than a 500 (for the box filter benchmark).
Roger... did you optimize the application this time (using the -O option)
before doing the timings? Your posting on 3/31 did not say whether or not
optimization was selected.
Daryl Odnert
HP Computer Language Lab
hplabs!hpcllcm!daryl
campbelr@hpsel1.HP.COM (Bob Campbell) (04/05/88)
> . . . . . . . . . . . . . . .  I am concerned (and so are my users)
> that any replacement machine be as solid as the 500 (e.g. as of this
> writing we have been up 136 days and have never had a crash).
>
> Can anyone tell me their experience with the HP9000 series 825 or
> 300 (or for that matter other machines in this price/speed class)?
> Are there other machines that are as solid as the 500?
>
> Roger N. Clark
> U.S. Geological Survey, MS 964
> Box 25046 Federal Center
> Denver, CO 80225-0046
> {known-world}!hplabs!hpfcla!hpfcse!hpuecoa!bgphp1!rclark
> (Any opinions expressed here are mine and not necessarily those of
> the USGS)
> ----------

We recently celebrated the fact that 800 series computers had been shipping
for one year with no failures.  The testers of HP have little desire to
rest on past accomplishments and would always like to hear of problem areas.

I believe that in the area of powerfail recovery, the 800 series may
be the most reliable system yet.  Of course I am biased and the 300 series
folks might have a thing to say :-)  Hopefully the responses to the problem
left you with a feeling that we do try to work for our reputation.

Bob Campbell                Some times I wish that I could stop you from
campbelr@hpda.hp.com        talking, when I hear the silly things you say.
Hewlett Packard                         - Elvis Costello
HP-UX System Interface & Recovery Testing
rclark@bgphp1.UUCP (Roger N. Clark) (04/05/88)
> Word has been circulating here at HP that Roger Clark's S825 had
> a bad floating-point coprocessor in it.  ...
>
> Can you verify this for those of us who are following this on notes, Roger?
>
> Thanks,
> Daryl Odnert
> HP Computer Language Lab
> hplabs!hpcllcm!daryl

That is correct (except that it was HP's 825!).  I think my last posting
should have cleared up the confusion.  Again, sorry for the problems.

But for HP to answer: shouldn't the 825 have gone through some sort of
check at boot time and told us if something was wrong?

The 825 math seems to be about 3x faster than a single cpu 500 for
problems involving normal math (+ - / *) on arrays.

Roger N. Clark
rclark@bgphp1.UUCP (Roger N. Clark) (04/06/88)
> Roger... did you optimize the application this time (using the -O option)
> before doing the timings?  Your posting on 3/31 did not say whether or not
> optimization was selected.

I keep no flags turned on by default in the makefile because every machine
is different.  I take every advantage of the particular machine, so if it
has optimization, I use it.  If it has a floating point accelerator
(e.g. Sun 3s, HP 9000/350s) I use it too.

Roger
rclark@bgphp1.UUCP (Roger N. Clark) (04/06/88)
> We recently celebrated the fact that 800 series computers had been shipping
> for one year with no failures.
>
> Bob Campbell                Some times I wish that I could stop you from
> campbelr@hpda.hp.com        talking, when I hear the silly things you say.
> Hewlett Packard                         - Elvis Costello
> HP-UX System Interface & Recovery Testing

That is incredible!  But do I take it the HP Neeley Sales Region 825
that I did my benchmark on with the bad floating point is the FIRST
failure?  Or does that machine not count (because it wasn't shipped to
a customer)?

In any event, it is very impressive.  Could HP give some indication of
how many 825 years that is (at least to an order of magnitude or so)?
What is the MTBF for a typical cpu + memory + I/O boards?

Roger N. Clark
bgphp1!rclark
wunder@hpcea.CE.HP.COM (Walter Underwood) (04/09/88)
> But for HP to answer: shouldn't the 825 have gone through some sort of
> check at boot time and told us if something was wrong?
>
> Roger N. Clark

That question has already come up internally.  The system already
does the check -- note that it automatically decided to emulate the FP
stuff when it noticed that the FP unit was broken.  Obviously, the
system should log an error.

wunder
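The behaviour discussed above (check the FP unit at boot, fall back to emulation, and say so in a log) can be sketched roughly as follows. This is only an illustration of the idea and not how HP-UX or the 825 firmware does it: `select_fp_backend`, `BrokenCoprocessor`, and the `self_test` callable are all hypothetical names invented for the example.

```python
import logging

log = logging.getLogger("fpu")


class BrokenCoprocessor(Exception):
    """Raised by the (hypothetical) self-test when the unit is present but bad."""


def select_fp_backend(self_test):
    """Pick the floating-point path at boot time.

    `self_test` is a hypothetical callable: it returns True if a
    coprocessor is present and healthy, False if none is installed,
    and raises BrokenCoprocessor if one is installed but failing.
    """
    try:
        if self_test():
            return "hardware"
        # No unit installed: emulation is the expected configuration.
        log.info("no floating-point coprocessor found; using software emulation")
    except BrokenCoprocessor as exc:
        # Unit installed but failing: still fall back, but make it visible.
        log.error(
            "floating-point coprocessor failed self-test (%s); "
            "falling back to software emulation",
            exc,
        )
    return "emulation"
```

The point of the design is that the fallback itself stays automatic, but a broken unit is reported at error level while a missing unit is only an informational message, so the two cases can be told apart afterwards.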
irf@kuling.UUCP (Bo Thide) (04/20/88)
In article <8870004@hpsel1.HP.COM> campbelr@hpsel1.HP.COM (Bob Campbell) writes:
>> . . . . . . . . . . . . . . .  I am concerned (and so are my users)
>> that any replacement machine be as solid as the 500 (e.g. as of this
>> writing we have been up 136 days and have never had a crash).
>>
>> Can anyone tell me their experience with the HP9000 series 825 or
>> 300 (or for that matter other machines in this price/speed class)?
>> Are there other machines that are as solid as the 500?
>>
>
>I believe that in the area of powerfail recovery, the 800 series may
>be the most reliable system yet.  Of course I am biased and the 300 series
>folks might have a thing to say :-)  Hopefully the responses to the problem

I have had my 350 for 5 months by now and, wow, am I pleased with it.
Not a single problem so far.  The machine booted up in late November and
HP-UX hasn't crashed once.  (I also have the Pascal Workstation
and HP BASIC operating systems installed on the same disc as HP-UX, and
Pascal has crashed twice, probably since I have been playing around with
special ADC and FFT hardware which I'm trying to connect to the DIO bus.)

Right from the start I found the 350 to be a VERY FAST machine, and
after installing the HP FPA it really flies.  Below are some simple 350
benchmarks and comparisons with the 540.

-Bo

------------------------------------------------------------------------------
Benchmark results for HP9000/350 (8 MByte DRAM) running HP-UX 5.5.

Below are printouts from standard FORTRAN77 programs with straight ANSI
code (compiled with 'f77 -O') according to "HP 9000 Computers Series 200
and 500 Performance Guide" (HP 5953-9405 11/83).  The program contains 5
consecutive DO-loops.  The first loop, used for estimating loop overhead,
only assigns a constant to a dummy variable.  The other loops do the same
assignments plus additions, subtractions, multiplications, and divisions,
respectively.  All loops are run 1 000 000 times and are timed individually
by using the internal clock and are corrected for the loop overhead.  The
programs were run in a 16 user HP-UX Unix environment but with real-time
priority ('rtprio') = 0.  No assembler code or any other tricks were used.

--------------------------------------------------------------------bt-880216-
Without floating point accelerator:

Loop overhead is                         .48 seconds
Time for 1000000 REAL*4 adds is         3.95 seconds
Time for 1000000 REAL*4 subtracts is    3.95 seconds
Time for 1000000 REAL*4 multiplys is    4.35 seconds
Time for 1000000 REAL*4 divides is      4.72 seconds

Loop overhead is                         .50 seconds
Time for 1000000 REAL*8 adds is         3.92 seconds
Time for 1000000 REAL*8 subtracts is    3.93 seconds
Time for 1000000 REAL*8 multiplys is    4.70 seconds
Time for 1000000 REAL*8 divides is      6.25 seconds

With floating point accelerator:

Loop overhead is                         .47 seconds
Time for 1000000 REAL*4 adds is          .82 seconds
Time for 1000000 REAL*4 subtracts is     .80 seconds
Time for 1000000 REAL*4 multiplys is     .82 seconds
Time for 1000000 REAL*4 divides is      1.97 seconds

Loop overhead is                         .47 seconds
Time for 1000000 REAL*8 adds is          .83 seconds
Time for 1000000 REAL*8 subtracts is     .82 seconds
Time for 1000000 REAL*8 multiplys is     .87 seconds
Time for 1000000 REAL*8 divides is      3.33 seconds
------------------------------------------------------------------------
For comparison, here is the same test run on an HP9000/540 with
a FOCUS II CPU (including FPA):

Loop overhead is                        4.52 seconds
Time for 1000000 REAL*4 adds is         3.15 seconds
Time for 1000000 REAL*4 subtracts is    2.70 seconds
Time for 1000000 REAL*4 multiplys is    3.30 seconds
Time for 1000000 REAL*4 divides is      4.50 seconds

Loop overhead is                        4.52 seconds
Time for 1000000 REAL*8 adds is         3.90 seconds
Time for 1000000 REAL*8 subtracts is    3.60 seconds
Time for 1000000 REAL*8 multiplys is    4.18 seconds
Time for 1000000 REAL*8 divides is      5.23 seconds
------------------------------------------------------------------------
--
>>> Bo Thide', Swedish Institute of Space Physics, S-755 90 Uppsala, Sweden <<<
Phone (+46) 18-300020.  Telex: 76036 (IRFUPP S).  UUCP: ..enea!kuling!irfu!bt
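The loop-overhead correction described in the benchmark notes above (one assignment-only loop, then four loops that add one arithmetic operation each) can be sketched like this. It is a rough Python analogue and not the FORTRAN77 program from the HP Performance Guide; the function names are invented for the example. Python floats are double precision, so the closest comparison is the REAL*8 case, and the absolute times say nothing about the machines in this thread.

```python
import time


def time_loop(n, body):
    """Return elapsed seconds for n calls of body(x, y)."""
    x, y = 1.5, 2.5
    start = time.perf_counter()
    for _ in range(n):
        dummy = body(x, y)  # the same assignment happens in every loop
    return time.perf_counter() - start


def run_benchmark(n=1_000_000):
    """Time an assignment-only loop, then four arithmetic loops.

    Returns (overhead, corrected), where `corrected` maps each operation
    name to its loop time minus the overhead estimate.
    """
    # Loop 1: assignment only, used as the loop-overhead estimate.
    overhead = time_loop(n, lambda x, y: x)
    ops = {
        "adds": lambda x, y: x + y,
        "subtracts": lambda x, y: x - y,
        "multiplies": lambda x, y: x * y,
        "divides": lambda x, y: x / y,
    }
    corrected = {name: time_loop(n, f) - overhead for name, f in ops.items()}
    return overhead, corrected


def mflops(n, seconds):
    """Millions of floating-point operations per second."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return n / seconds / 1e6


if __name__ == "__main__":
    n = 1_000_000
    overhead, corrected = run_benchmark(n)
    print(f"Loop overhead is {overhead:.2f} seconds")
    for name, secs in corrected.items():
        print(f"Time for {n} float {name} is {secs:.2f} seconds")
```

Applied to the posted figures, `mflops(1_000_000, 0.82)` comes to roughly 1.22 for REAL*4 adds on the 350 with the FPA, against about 0.25 for the 3.95 seconds measured without it. On a noisy system a corrected time can come out slightly negative for very small `n`, which is one reason the original runs used a million iterations.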