kjmcdonell@water.UUCP (05/19/87)
Recent discussions and suggestions have included ...

	(a) acupuncture tests (dc bashing, getpid() thumping, ...)
	(b) the stone family (whet, dhry, dhamp, ...)
	(c) ``stone soup'' (some combination of the above)
	(d) monster recompilation (some/all of the Unix source)

If we are serious about measuring system performance, then the
following features and factors must be recognized:

	(a) a single figure of merit is not very useful
	(b) results must be statistically reliable
	(c) measurements must reflect computing activity that is
	    representative of the usage at *your* site
	(d) real-time delays cannot be ignored if the processing
	    includes a significant interactive component
	(e) *predicted* performance (e.g. saturation, response-time
	    degradation) with varying load is the most useful outcome
	    of any benchmarking

I have constructed a benchmarking suite that provides a testbed for
measuring system performance for varying numbers of emulated users.
Objective (c) above is met by specifying workload profiles as shell
scripts (for the shell of your choice, e.g. /bin/sh, /bin/csh, an SQL
interpreter, your favourite interactive program, ...); a rough,
hypothetical sketch of such a profile appears at the end of this
posting.

This software, known as the MUSBUS suite, has circulated widely
(judging by the e-mail I receive!) and is the subject of a talk to be
given at the forthcoming Usenix meeting in Phoenix.  But because the
results for one workload profile provide little useful information
about other processing environments, and because *no* globally
representative workload profile exists, I have discouraged widespread
publication and/or tabulation of comparative results.

If you are serious about equipment selection, and plan to use
benchmark results as input to the decision-making process, then you
should be willing to define a representative workload profile for
your environment and run the tests on the competing machines to
produce reliable and comparable results -- MUSBUS provides a testbed
environment that makes this possible with minimal effort.

No single test, and no battery of discrete tests, is going to have
long-term validity in predicting the performance of a system in a
particular processing environment -- we need portable performance
tools that can be configured to measure total system performance
(e.g. throughput or elapsed time) for specific workload profiles.
----
Ken J. McDonell                    kjmcdonell@er.waterloo.cdn
currently visiting the             kjmcdonell@water.uucp
University of Waterloo from
Dept. Computer Science             kenj@moncsbruce.oz
Monash University, Clayton AUSTRALIA 3168
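
P.S.  By way of illustration only -- the script below is a throw-away
sketch and is *not* part of the MUSBUS distribution; the file names
and commands are my own assumptions -- a workload profile for one
emulated "program developer" might look something like this:

	#!/bin/sh
	# Hypothetical workload profile for one emulated program
	# developer: a small edit/search/compile/run cycle.  Replace
	# these commands with whatever is representative of the work
	# actually done at your site.
	WORK=/tmp/profile.$$
	mkdir $WORK
	cd $WORK
	# create a small source file
	cat > hello.c << 'EOF'
	#include <stdio.h>
	main()
	{
		printf("hello, world\n");
	}
	EOF
	grep printf hello.c > /dev/null		# some text searching
	cc -o hello hello.c			# a compile
	./hello > hello.out			# run the result
	wc hello.out				# and a small utility
	cd /
	rm -r $WORK

The point is simply that the profile is an ordinary shell script, so
the testbed can run some number of concurrent copies of it to emulate
the desired user population and report throughput or elapsed time for
that workload.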