pb@idca.tds.PHILIPS.nl (Peter Brouwer) (09/18/89)
In the discussion about which benchmarks to use, prompted by a machine comparison done by a UNIX magazine, the Neal Nelson benchmark was mentioned as one to use. I have two comments on this benchmark.

Some months back we used the Neal Nelson benchmarks to compare machines. When I looked at the sources I saw that the duration of the tests is measured with the system call time(), so the resolution is in seconds. This makes the fast tests on fast machines inaccurate: a test that runs in 10 seconds has an accuracy of only 10%. This has to be kept in mind when looking at machine comparisons.

My other remark is that all runs (for n copies of the test) execute in parallel. (For those not familiar with the Neal Nelson benchmark: a number of runs can be executed in parallel, and each run consists of a series of tests executed in a fixed sequence.) This means the code of each test will fit in the system cache, so the cache hit ratio will be high. That is not a realistic situation on a multi-user machine. It would be better to randomize the order of the tests for each copy of a run, or for a number of copies. The musbus benchmark does something like this: it creates a number of execution scripts from a master script, and each script executes the tests from the master script in a different order.
--
Peter Brouwer,                # Philips Telecommunications and Data Systems,
NET  : pb@idca.tds.philips.nl # Department SSP-P9000 Building V2,
UUCP : ....!mcvax!philapd!pb  # P.O.Box 245, 7300AE Apeldoorn, The Netherlands.
PHONE: ext [+31] [0]55 432523 # Never underestimate the power of human stupidity
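To make the first point concrete, here is a minimal sketch of the resolution problem, assuming run_test() is a dummy workload standing in for one benchmark test (it is not one of the actual Neal Nelson tests):

#include <stdio.h>
#include <sys/time.h>
#include <time.h>

/* Dummy workload standing in for one benchmark test. */
static void run_test(void)
{
    volatile long i, s = 0;
    for (i = 0; i < 50000000L; i++)
        s += i;
}

int main(void)
{
    /* Coarse timing, as in the Neal Nelson sources: time() only
     * ticks once per second, so a test that runs for about 10
     * seconds is +/- 1 second, i.e. roughly 10% uncertainty. */
    time_t t0 = time(NULL);
    run_test();
    time_t t1 = time(NULL);
    printf("time():         %ld s (resolution 1 s)\n", (long)(t1 - t0));

    /* Finer timing: gettimeofday() has microsecond resolution,
     * which makes short tests on fast machines meaningful again. */
    struct timeval tv0, tv1;
    gettimeofday(&tv0, NULL);
    run_test();
    gettimeofday(&tv1, NULL);
    double secs = (tv1.tv_sec - tv0.tv_sec)
                + (tv1.tv_usec - tv0.tv_usec) / 1e6;
    printf("gettimeofday(): %.6f s\n", secs);
    return 0;
}

On systems without gettimeofday(), times() gives clock-tick resolution (typically 1/60 or 1/100 of a second), which is still a large improvement over time().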
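And a sketch of the musbus-style fix for the second point: give each parallel copy of a run its own random ordering of the tests, so the copies do not all march through the same code in lock-step. The test names here are placeholders, not the actual tests from either benchmark:

#include <stdio.h>
#include <stdlib.h>

static const char *master[] = { "arith", "pipe", "syscall", "fileio", "memcpy" };
#define NTESTS (sizeof master / sizeof master[0])

int main(void)
{
    int copy, i, j;

    for (copy = 0; copy < 4; copy++) {          /* 4 parallel copies */
        const char *order[NTESTS];
        for (i = 0; i < (int)NTESTS; i++)
            order[i] = master[i];

        srand(copy + 1);                        /* distinct seed per copy */
        for (i = (int)NTESTS - 1; i > 0; i--) { /* Fisher-Yates shuffle */
            const char *tmp;
            j = rand() % (i + 1);
            tmp = order[i];
            order[i] = order[j];
            order[j] = tmp;
        }

        printf("copy %d:", copy);               /* this copy's test order */
        for (i = 0; i < (int)NTESTS; i++)
            printf(" %s", order[i]);
        printf("\n");
    }
    return 0;
}

With each copy running the tests in a different order, the copies compete for the cache the way unrelated user processes would, instead of all hitting the same hot code at the same time.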
jcm@mtunb.ATT.COM (was-John McMillan) (09/18/89)
I concur with everything Peter Brouwer stated.

At the time I examined Neal Nelson data, over a year ago, another feature stood out: there were no "knees" in the throughput curves. There was no point at which total throughput began to noticeably decrease because of inter-process competition -- throughput was quite linear out to the extremes of the load tests. AT&T throughput tests -- using common command sets, randomized within different task threads -- had no problem identifying the levels at which total throughput began degrading.

I hope that the growing breadth of UNIX usage will result in a considerable improvement in the quality of the metrics being taken and published. I'm not holding my breath, however.

john mcmillan -- att!mtunb!jcm -- speaking for himself, only
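For readers who have not seen one, here is a sketch of what a "knee" looks like and how to spot it. The throughput numbers are invented for illustration: total throughput (jobs per minute) measured at 1 through 8 concurrent task threads. A credible load test should expose the point where adding load stops buying throughput; a curve that stays linear to the extremes, as in the data described above, suggests the load generator is not really competing with itself.

#include <stdio.h>

int main(void)
{
    /* Invented figures: total throughput at 1..8 threads. */
    double total[] = { 10.0, 19.5, 28.0, 35.0, 38.5, 39.0, 38.0, 36.5 };
    int n = sizeof total / sizeof total[0];
    int load;

    for (load = 1; load < n; load++) {
        double gain = total[load] - total[load - 1];
        printf("load %d -> %d: total %5.1f, marginal gain %5.1f\n",
               load, load + 1, total[load], gain);
        /* The knee: the first level where adding another thread
         * gains (almost) nothing, or loses throughput outright. */
        if (gain < 0.05 * total[load - 1])
            printf("  knee: contention dominates beyond %d threads\n", load);
    }
    return 0;
}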