eugene@eos.arc.nasa.gov (Eugene Miya) (12/19/90)
In article <115440001@hpcuhc.cup.hp.com> spuhler@hpcuhc.cup.hp.com (Tom Spuhler) writes:
>Concerned that your 'bc' benchmark results may be skewed by vendor
>optimization of the trivial case? Looking for a longer running version
>for your faster CPU's? Does management want a richer instruction mix to
>be tested?

Er, sorry, I must be dense, but where does the "richer instruction mix"(tm)
come in (sounds like coffee, thank god I drink tea)?  Seems like more of
the same.  Do you work perchance in a marketing department?  Longer running?
Longer is not necessarily better (no sex jokes please).  Seems this could
be optimized as well.  Fortunately (?) I didn't see the beginnings of the
2^n thread.

> It is better to have some data, no matter how limited, as long
> as you understand it, than no data at all.

Nope.  Beg to disagree.  It can be more damaging.  I think someone is suing
someone else over performance claims; it's getting nasty.
Note: in an earlier post, I cited the APL benchmark (Gaussian sum) where
the adds were all replaced by the simple (n+1)n/2 formula (with n = 256).
It's hard to understand the behavior of some benchmark results, even for
the programmer who wrote a given benchmark or compiler.

--e.n. miya, NASA Ames Research Center, eugene@eos.arc.nasa.gov
  {uunet,mailrus,most gateways}!ames!eugene
  AMERICA: CHANGE IT OR LOSE IT.
spuhler@hpcuhc.cup.hp.com (Tom Spuhler) (12/20/90)
# >for your faster CPU's? Does management want a richer instruction mix to
#
# Er, sorry, I must be dense, but where does the "richer instruction mix"(tm)
# come in (sounds like coffee, thank god I drink tea). Seems like more of

Come on, Eugene, you're tripping over the easy ones :-)  A richer
instruction mix means a more varied one, using a larger subset of the
machine instructions.  Not particularly interesting in itself, since the
important criterion is how the tested instruction mix matches your expected
workloads (for richer or poorer :-), but I get more warm fuzzies from tests
that exercise the 'richer' mixes than the 'poorer' ones, as real-life usage
tends to be on the richer side (for the kinds of computers I'm interested
in).  It was common terminology around here; I didn't invent it (now, as to
the concept of "creamier" code, I'll take some blame for that).

# the same. Do you work perchance in a marketing department? Longer running?
# Longer is not necessarily better (no sex jokes please). Seems this could

Sorry, no, to the marketing question.  Longer is better in that it tends to
minimize the lack of precision of the reporting mechanism (in this case
/bin/time), and the impact of startup effects (something of concern in the
'bc' benchmark) will be minimized.  When the run times drop below a couple
of seconds, I personally start to worry about the precision of /bin/time.
I like them to run at least 10 seconds.  Unfortunately, I didn't achieve
that goal with 2^9999/3^6308.  On some systems, I expect it can run in less
than a second, but I was limited by the 'bc' program and my interest in
simplicity.  Longer is not 'necessarily' better, but I find it usually is
for accuracy in results, although 'longer' may reduce the number of times
it's run, or its usefulness, which may be more important.

# Longer is not necessarily better (no sex jokes please). Seems this could
# be optimized as well. Fortunately (?) I didn't see the beginnings of the

Optimizable?  Oh sure.  This is always true.
Vendors could hard-code in the answer.  It's a question of ease,
likelihood, and dependence.  How hard is it to optimize for this case?
2^9999/3^6308 is harder to optimize for than 2^5000/2^5000, assuming
optimization for more than just the hard-coded case (easy to detect) and
somewhat consistent with the intent of 'bc'.  How likely is someone to do
something like that?  Depends on how hung up the world gets on a single
benchmark.  How likely is someone going to optimize for Dhrystone?  (Seems
to have happened.)  It's all a matter of context.

# > It is better to have some data, no matter how limited, as long
# > as you understand it, than no data at all.
#
# Nope. Beg to disagree. It can be more damaging. I think someone is suing
# someone else over performance claims, getting nasty.
# Note: in a first post, I cited the APL benchmark (Gaussian sum) where
# the adds were all replaced by the simple (n+1)n/2 formula (n was = 256).
#

We always have to live with imperfect information.  True, the results of a
benchmark running your application(s) on a variety of vendor machines with
a variety of configurations would be ideal, but it can be a little
expensive to achieve.  Something like the bc or nbc benchmarks may be not
very good, but they are cheap to run, and results from a good number of
machines are available.  Note that the results of both efforts may be no
more useful (or less useful) to someone else in determining the relative
performance of the tested boxes.  And guess which one costs less.  Using
bc, or better nbc, can help classify systems and direct other
investigative efforts.  The combination of bc and nbc results is
considerably more useful than either one alone.  Keep adding in more
benchmarks and you can develop a performance profile of a system.  Does
SPEC alone allow one to characterize the performance of a system?
Definitely not.  Does it help?  Sure.  How about TPC-A?  For any single
characterization, one can cite exceptions.
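[The hard-coding concern raised above can be sketched concretely.  The
function name and structure here are invented for illustration, not taken
from any vendor's 'bc': a "benchmark special" could pattern-match the
trivial x^n/x^n case and skip the work entirely, while a mixed-base input
like 2^9999/3^6308 still forces the full arbitrary-precision computation.]

```python
# Hypothetical sketch of a trivial-case shortcut (all names invented here).
def power_quotient(b1, e1, b2, e2):
    if b1 == b2 and e1 == e2:        # x^n / x^n == 1: shortcut, no bignum work
        return 1
    return b1 ** e1 // b2 ** e2      # honest arbitrary-precision division

print(power_quotient(2, 5000, 2, 5000))  # shortcut fires: 1, instantly
print(power_quotient(2, 9999, 3, 6308))  # full computation: 2
```

[Detecting the hard-coded case is easy; recognizing and cheating on the
mixed-base case would require genuinely smarter arithmetic, which is the
point of preferring 2^9999/3^6308.]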
Only the complete universe of information is universally useful.
Performance information is damaging only if it is misused (happens a lot).
["There is no enlightenment until there is total enlightenment."]

# It's hard to understand the behavior of some benchmark results, even by
# some of the programmer who wrote a given benchmark or compiler.

And it's even harder to come up with a single all-singing, all-dancing
benchmark which will allow anyone to evaluate the performance of a variety
of boxes running whatever applications they choose.

-Tom Spuhler, Spuhler@cup.hp.com
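[Spuhler's run-length point earlier in this post — that sub-second runs
are swamped by timer granularity and startup cost — can be illustrated
with a minimal sketch.  The helper below is an assumption of mine, not
anything from the thread: repeat the work and divide, so the clock's
resolution stops dominating the measurement.]

```python
# Minimal sketch: amortize timer granularity over many repetitions.
import time

def time_per_run(work, reps):
    t0 = time.perf_counter()
    for _ in range(reps):
        work()
    return (time.perf_counter() - t0) / reps   # seconds per single run

bench = lambda: 2 ** 9999 // 3 ** 6308         # the thread's bignum workload
print(time_per_run(bench, 100))
```

[The same idea is why he wants the bc benchmark itself to run at least 10
seconds under /bin/time.]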
spuhler@hpcuhc.cup.hp.com (Tom Spuhler) (12/22/90)
# >Most importantly, you didn't get the correct output. Any benchmark
# >which doesn't return the expected output is invalid (or at least VERY
# >suspect). Work on it until you get '2'.
#
# perhaps you should work on UN*X then. elementary considerations
# show that the numerator must end in "8", and the denominator must
# end in "1". How can the answer possibly be "2"?

Easy.  The program 'bc' assumes no (0) decimal places by default, and as
you can see below, '2' is quite reasonable (and correct) given the
calculation and the documented behavior of 'bc'.  I suppose your 'bc's may
vary.  When I asked for a more exact answer I got 2.079945....

# My LISP machine gives an answer of:
#
# 9975..(omit ~ 6300 digits)..4688 / 4795..(omit ~ 6300 digits)..2561
#
# i.e. no integral answer. it took 17.4 seconds by the way.

Looks correct; the check is an exercise left to the reader.  Note that
9975/4795 comes out close to 2.08.  Interesting that my workstation
(HP9000/360 diskless) was able to generate the (precise) answer in 17.66
seconds using Mathematica (but it does take longer using bc).  What
hardware is your LISP machine based on? :-)

I might add that this illustrates some of the value of 2^9999/3^6308, as
it requires significantly more work from the more intelligent packages.
Mathematica only took .04 seconds to solve 2^5000/2^5000.

-Tom
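[The scale behavior Spuhler describes can be checked with Python's big
integers, a stand-in for bc's arbitrary-precision arithmetic: bc truncates
the quotient to "scale" decimal places, and scale defaults to 0, so
2^9999/3^6308 prints as 2.  The last-digit observation from the quoted
post checks out as well.]

```python
# Sketch of bc's documented scale semantics, using Python big-int arithmetic.
n, d = 2 ** 9999, 3 ** 6308
print(n % 10, d % 10)          # last digits: 8 and 1, as the poster argued
print(n // d)                  # scale=0 (bc's default): truncated quotient, 2
scale = 6
print((n * 10 ** scale) // d)  # digits of the scale=6 answer: 2079945
```

[So both posters are right: the exact quotient is a non-integral rational,
and '2' is the correct truncated answer at bc's default scale.]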
tac@cs.brown.edu (Ted A. Camus) (12/23/90)
># My LISP machine gives an answer of:
>#
># 9975..(omit ~ 6300 digits)..4688 / 4795..(omit ~ 6300 digits)..2561
>#
># i.e. no integral answer. it took 17.4 seconds by the way.

>Note that 9975/4795 comes out close to 2.08. Interesting that my workstation
>(HP9000/360 diskless) was able to generate the (precise) answer in 17.66
>seconds using Mathematica (but does take longer using bc).

Why is this interesting?  Here's my reason why bc is not a good benchmark:

> (time (* 1.0 (/ (expt 2 9999) (expt 3 6308))))
Elapsed Real Time = 0.14 seconds
. . .
2.079945102751959

using Lucid CL on an SS1.  Given this, I find it hard to take bc seriously.

  - Ted

==========================================================
Ted Camus                            Box 1910 CS Dept
tac@cs.brown.edu                     Brown University
tac@browncs.BITNET                   Providence, RI 02912
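[Camus's Lisp one-liner has a near-direct analog in modern Python, offered
here only as a hedged comparison point: CPython's int/int true division
handles operands far beyond float range and returns a correctly rounded
float, so the quotient pops out in one expression, much as Lucid CL's
rational-to-float coercion did.]

```python
# Analog of (* 1.0 (/ (expt 2 9999) (expt 3 6308))) in Lucid CL:
# big-int true division yields a float directly, about 2.0799451.
q = 2 ** 9999 / 3 ** 6308
print(q)
```

[This is exactly his complaint in miniature: the "benchmark" measures the
quality of the arbitrary-precision package far more than the hardware.]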