schreiber@schreiber.asd.sgi.com (Olivier Schreiber) (03/23/91)
Hi! I am looking for standard workload multiuser musbus results for
various machines.  Thanks in advance.
--
Olivier Schreiber   Technical Marketing   schreiber@sgi.com   (415) 335 7353
MS/7L580  Silicon Graphics Inc., 2011 North Shoreline Blvd.
Mountain View, CA 94039-7311
Better to be rich and healthy than poor and sick.
kenj@yarra.oz.au (Ken McDonell) (03/26/91)
schreiber@schreiber.asd.sgi.com (Olivier Schreiber) writes:
>Hi! I am looking for standard workload multiuser musbus results
>for various machines.

I think the time has come to repeat some things I first said 7 years ago
(about 3 years after I developed MUSBUS) ...

Of course I have many sets of results, but these have been obtained as a
consequence (a) of machines we [the CS Department at Monash University]
have, or (b) consulting to vendors or purchasers in tender procedures
[since I joined Pyramid full-time in 1988 I have not done much of this for
systems other than Pyramids! :-)], or (c) non-disclosure agreements (e.g.
products yet to be announced).

I receive many requests for copies of previous results; however, only those
in (a) could be made available, and even then I tend NOT to do this, for the
following reasons:

(0) They are quickly out of date and of little relevance.

(1) Different h/w configurations, versions of the same Unix port and C
compilers, and versions of the MUSBUS programs themselves vary with time to
such an extent that labelling one set of figures as from Brand X Model Y is
misleading to all concerned.  Getting the best MUSBUS result is not as
trivial as maximizing a *stone number or SPECmark.  There are questions of
price-performance -- results from a CPU-bound configuration are very
different to the results for the same CPU when configured so that the
benchmark runs disk I/O bound.  Full disclosure of the configuration and
environment requires about a screenful of text, and I know that if this
were included with the results, after a very short time the results would
remain but the other information would be "lost" in the interests of
brevity!

(2) MUSBUS is intended to be reconfigured in the critical multiuser
simulated workload test to reflect the work profile of a particular user
site.  Whenever different workloads are used the results cannot be
compared.  MUSBUS was intended as a multiuser benchmark framework -- at the
time I did not appreciate how difficult most people regarded the task of
workload definition, and with time most results have been created with the
default workload I supplied -- this is helping the CS Department at Monash
choose an appropriate system to meet their *1980* needs, but not much else.
This trap of a default workload plagues most multi-user benchmarks; AIM
Suite 3, SDE, and Gaede all have mechanisms for defining an
application-specific workload, but this is rarely done.  People tend to use
the default workloads without ever asking "does this profile look anything
like my intended system usage?".  The most interesting MUSBUS results are
the ones that have NOT used the default workload, but promulgation of these
results would only inject further confusion for those looking to compare
systems without checking the fine print.

(3) Deliberately the MUSBUS tests are in two distinct categories, raw speed
and multiuser.  The former are useful for diagnostic purposes only and give
little useful information for a potential purchaser.  The latter test gives
good predictions of system performance.  Not everyone appreciates this, and
some rather silly conclusions have been drawn as a result.  I regret ever
distributing the raw speed tests, and plan to drop them in a future MUSBUS
release.
(4) In MUSBUS there is (quite deliberately) no single figure of merit (this
is not the holy grail) -- rather we see the effects of increasing load (the
time-constraining of user input makes the system lightly loaded at low
levels of concurrency, unlike other multiuser benchmarks) as we move
towards some form of resource depletion.  The level of concurrency is a
free variable, and there is no reason to suspect that useful results will
be derived by picking the same concurrency levels for all systems.  Given a
*set* of results for each system, comparisons between systems are
difficult, and require selection of a computed metric appropriate to one's
needs, e.g. CPU time per user in the limit, user load at which elapsed time
increases by X%, user load at which CPU utilization exceeds Y%, aggregate
throughput in commands (processes) per unit time, ... (a sketch of one such
computation follows below, after this post).

This has been something of a philosophical diatribe, but I hope it goes
some way to explaining why I think people should not promulgate MUSBUS
results without paying special attention to the points I have raised.

Since KENBUS is basically the MUSBUS multiuser test (default workload, no
comms I/O -- but that is another story), people should bear in mind the
points I've raised when SPEC Release 2.0 results start to be circulated.
--
Ken McDonell                      E-mail: kenj@pyramid.com  kenj@yarra.oz.au
Performance Analysis Group        Phone:  +61 3 820 0711
Pyramid Technology Corporation    Disclaimer: I speak for me alone, of course.
Melbourne, Australia
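The following is a rough, hypothetical sketch of the kind of derived metric
mentioned in point (4) above: given a table of (concurrency, mean elapsed
time) results, find the user load at which elapsed time first exceeds the
lightly loaded baseline by more than some chosen percentage.  This is not
MUSBUS code and not a method from the post; the data points, the 20%
threshold, and all names are invented purely for illustration.

/*
 * Illustrative sketch only: derive one comparison metric of the kind
 * described in point (4) -- the user load at which mean elapsed time
 * exceeds the lightly loaded baseline by more than THRESHOLD_PCT.
 * The result table and threshold below are invented, not MUSBUS output.
 */
#include <stdio.h>

struct result {
    int    users;          /* simulated concurrent users            */
    double elapsed_secs;   /* mean elapsed time per workload script */
};

#define THRESHOLD_PCT 20.0  /* hypothetical "X%" degradation limit */

int main(void)
{
    /* hypothetical results, ordered by increasing concurrency */
    struct result r[] = {
        {  1, 610.0 },
        {  4, 618.0 },
        {  8, 640.0 },
        { 16, 700.0 },
        { 24, 780.0 },
        { 32, 960.0 },
    };
    int n = sizeof(r) / sizeof(r[0]);
    double baseline = r[0].elapsed_secs;   /* lightly loaded reference */
    int i;

    for (i = 1; i < n; i++) {
        double pct = 100.0 * (r[i].elapsed_secs - baseline) / baseline;
        printf("%3d users: elapsed %7.1f s (+%5.1f%%)\n",
               r[i].users, r[i].elapsed_secs, pct);
        if (pct > THRESHOLD_PCT) {
            printf("elapsed time first degrades by more than %.0f%% "
                   "somewhere between %d and %d users\n",
                   THRESHOLD_PCT, r[i-1].users, r[i].users);
            return 0;
        }
    }
    printf("elapsed time never degraded by more than %.0f%%\n",
           THRESHOLD_PCT);
    return 0;
}

The same loop could just as easily report a different metric -- say,
aggregate commands completed per unit time at each load level -- and, as
the post argues, which metric is appropriate depends entirely on the
purchaser's intended workload, not on any single figure of merit.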