gnu@hoptoad.uucp (John Gilmore) (05/15/87)
In article <4294@nsc.nsc.com>, grenley@nsc.nsc.com (George Grenley) writes: > So, here's the deal. I invite Mot, Intel, and other interested parties > to work with me in defining some sort of realistic benchmark, which we'll > run (in public). I expect to have system level hardware late this year, > so if we get started now, we'll have very interesting Xmas presents... > May the best CPU win! I'd like to join in the hoopla, rahrah, etc that has followed this suggestion, and make a further one: Let's have the bake-off in the trade show at, say, next Winter Usenix. Probably the actual setup and running of the benchmarks can be done a day or two before the show, so the results can be printed for distribution, and to give the losers time to think up (and print up) good explanations before we descend on them :-). Let's also make the same setup of machines available for people to run their own benchmarks. It'd be easiest if they were all on a network, of course, though the benchmarks should in general be run from local disk to eliminate networking delays. Except for multi-CPU machines which want to show off their multitasking, only one person should be running on any machine at once. But you could load a tape of benchmarks onto a server machine somewhere, then as time became free on each machine of interest, rcp over (or cp via NFS) your data, compile, and run. It might even be possible to just have a bank of terminals where you could rlogin to each system in turn, using some simple scheme to avoid multiple people getting through at once, rather than schlepping around the trade show floor. Anyone could verify the benchmark results from the bake-off by rerunning them themselves. Each such machine should have its full configuration posted prominently, with the list price of the configuration, if for sale, or its ballpark price and expected availability if the machine is not announced or not shipping. 
If I go to Usenix and run some benchmarks, then go home and buy a system based on them, I want to be able to reproduce the configuration that won on my purchase order. Anybody who's willing to bring a machine to the trade show floor (and pay for the booth...) should be able to enter, e.g. there's no reason to restrict it to "interesting new micros". Of course, the benchmark machines should be available for benchmarking full time while the show is open, so the vendors should bring a second machine for demos unless they want to degrade their benchmark results. To encourage prototypes to appear, there should be no requirement that the stuff be for sale yet either. If they'll bring it, we'll benchmark it! Any other Usenix members interested in this? Think we can get the conference committee to go for it? -- Copyright 1987 John Gilmore; you may redistribute only if your recipients may. (This is an effort to bend Stargate to work with Usenet, not against it.) {sun,ptsfa,lll-crg,ihnp4,ucbvax}!hoptoad!gnu gnu@ingres.berkeley.edu
daveb@rtech.UUCP (05/15/87)
In article <2128@hoptoad.uucp> gnu@hoptoad.uucp (John Gilmore) writes:
>
> Let's have the bake-off in the trade show at, say, next Winter
> Usenix. Probably the actual setup and running of the benchmarks
> can be done a day or two before the show, so the results can be
> printed for distribution, and to give the losers time to think
> up (and print up) good explanations before we descend on them :-).
>
> Let's also make the same setup of machines available for people
> to run their own benchmarks...

At last winter's Uniforum, I went around to a number of booths trying to run the infamous

	/bin/time bc << !
	2^4096
	!

At a distressing number of places the sales creatures in the booth would say things like, "I don't believe we're interested in running any benchmarks today. Let me show you vi." Now there are some good reasons for this, but it sure sounded like there was something being hidden.

Problem 1 is getting some benchmarks run. Problem 2 is trying to get a straight answer on the price of the system. What you really want is the bang/buck of different benchmarks on different boxes. The results would be embarrassing to many people wearing suits, which is why it may be difficult to get a lot of cooperation.

-dB

PS: Given my druthers, I'd like to see:

	* the bc benchmark above
	* Dhrystone
	* Whetstones
	* A paging thrasher.
	* A system call overhead checker (looped getpid()s maybe).
	* A process thrasher.

I'd probably give up on disk speed and tty i/o.

--
{amdahl, cbosgd, mtxinu, ptsfa, sun}!rtech!daveb daveb@rtech.uucp
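[Editor's sketch, not part of the original post.] The "looped getpid()s" idea in the list above could look roughly like the following. This is a minimal sketch in Python rather than C, so the figure it reports includes interpreter and loop overhead and is only meaningful when the same script is compared across machines. The function name `syscall_overhead` and the iteration count are illustrative assumptions.

```python
import os
import time


def syscall_overhead(iterations=100_000):
    """Return the average wall-clock seconds per os.getpid() call.

    The figure includes Python loop and call overhead, so treat it as a
    relative measure between machines, not an absolute system call cost.
    """
    start = time.perf_counter()
    for _ in range(iterations):
        os.getpid()
    elapsed = time.perf_counter() - start
    return elapsed / iterations


if __name__ == "__main__":
    per_call = syscall_overhead()
    print(f"getpid: {per_call * 1e6:.3f} microseconds per call (loop overhead included)")
```

A C version would get closer to the raw kernel entry cost; the point here is only the shape of the test: one trivial system call, repeated many times, divided by the count.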
grenley@nsc.nsc.com (George Grenley) (05/15/87)
A while back I invited other chip manufacturers to join me in a CPU horse race. I'm happy to see this request is generating interest. However, I haven't heard much from Mot or Intel. Are you guys listening? I know you're out there...

In article <826@rtech.UUCP> daveb@rtech.UUCP (Dave Brower) writes:
>At last winter's Uniforum, I went around to a number of booths trying to
>run the infamous
>	/bin/time bc << !
>	2^4096
>	!
>At a distressing number of places the sales creatures in the booth would
>say things like, "I don't believe we're interested in running any
>benchmarks today. Let me show you vi." Now there are some good reasons
>for this, but it sure sounded like there was something being hidden.

No, they just don't want to risk a system crash, or other malicious use. Most show-people aren't Unix gurus, and are hesitant to let a hacker play with the system on the show floor.

>Problem 1 is getting some benchmarks run. Problem 2 is trying to get a
>straight answer on the price of the system. What you really want is the
>bang/buck of different benchmarks on different boxes. The results would
>be an embarrassing to many people wearing suits, which is why it may be
>difficulty to get a lot of cooperation.

>PS: Given my druthers, I'd like to see:
>
>	* the bc benchmark above
>	* Dhrystone
>	* Whetstones
>	* A paging thrasher.
>	* A system call overhead checker (looped getpid()s maybe).
>	* A process thrasher.

The problem you're experiencing in getting what you call straight answers lies in your methods. As a working engineer who knows how to wear a suit and tie (not common, I know, but some of us manage it) and who has logged many hours of booth duty, I sympathize with the booth person who is hesitant to allow you to run any program they're not familiar with. There are a lot of *ssholes at tradeshows who delight in trying (and succeeding, occasionally) in crashing systems.
Personally, I never let such a person near a machine, period, no matter how much he protests the "innocence" of his program.

If you want cooperation, I suggest you work with some of the others on this net (including myself) to define a reasonable benchmark before the show, run it under realistic conditions, and let the manufacturers publish the results jointly. I have seen more than one instance where one cooperative manufacturer ran a "real-world" benchmark, only to be pilloried later by having it compared to some bs benchmark put out by a less scrupulous competitor.

As to your suggested benchmark list: Dhrystone has come under fire as not being very reliable, due to compiler optimization problems. Likewise, most Unix machines aren't floating point oriented. Consider also that those of us who are chip manufacturers are primarily interested in CPU benchmarks, not system benchmarks. Unix is NOT the entire world, yet.

TO BETTER BENCHMARKS!

George
mkhaw@teknowledge-vaxc.ARPA (Michael Khaw) (05/16/87)
In article <826@rtech.UUCP> daveb@rtech.UUCP (Dave Brower) writes:
...
>At last winter's Uniforum, I went around to a number of booths trying to
>run the infamous
>
>	/bin/time bc << !
>	2^4096
>	!
...

What exactly does this exercise (i.e., how does "dc" do "unlimited precision arithmetic")? Why not run

	/bin/time dc << X
	2 4096 ^ p
	X

(possibly less overhead than the "bc" version, but 4 more characters to type)?

Mike Khaw
--
internet: mkhaw@teknowledge-vaxc.arpa
usenet: {hplabs|sun|ucbvax|decwrl|sri-unix}!mkhaw%teknowledge-vaxc.arpa
USnail: Teknowledge Inc, 1850 Embarcadero Rd, POB 10119, Palo Alto, CA 94303
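[Editor's sketch, not part of the original post.] As a rough answer to "what does this exercise": computing 2^4096 exactly needs arithmetic on numbers far wider than a machine word, so the work is mostly integer multiply/divide, carry propagation, and memory traffic over a growing digit array. The following is a minimal illustration of that style of arithmetic. It is not a description of how dc is actually implemented; the limb size and the function names are assumptions chosen for readability.

```python
BASE = 10_000  # each limb holds four decimal digits (an arbitrary choice)


def double(limbs):
    """Double a number stored as little-endian base-10000 limbs."""
    result = []
    carry = 0
    for limb in limbs:
        total = limb * 2 + carry
        result.append(total % BASE)   # low four digits stay in this limb
        carry = total // BASE         # overflow moves to the next limb
    if carry:
        result.append(carry)
    return result


def power_of_two(exponent):
    """Compute 2**exponent by repeated doubling, starting from 1."""
    limbs = [1]
    for _ in range(exponent):
        limbs = double(limbs)
    return limbs


def to_decimal(limbs):
    """Render the limbs as a decimal string, most significant limb first."""
    head = str(limbs[-1])
    tail = "".join(f"{limb:04d}" for limb in reversed(limbs[:-1]))
    return head + tail


if __name__ == "__main__":
    print(to_decimal(power_of_two(4096)))
```

Repeated doubling is the slowest reasonable way to get there (a real calculator would square and multiply), but it makes the carry loop that dominates the run time easy to see.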
larry@mips.UUCP (05/16/87)
In article <826@rtech.UUCP> daveb@rtech.UUCP (Dave Brower) writes: >In article <2128@hoptoad.uucp> gnu@hoptoad.uucp (John Gilmore) writes: >> >> Let's have the bake-off in the trade show at, say, next Winter >> Usenix. Probably the actual setup and running of the benchmarks >> can be done a day or two before the show, so the results can be >> printed for distribution, and to give the losers time to think >> up (and print up) good explanations before we descend on them :-). >> >> Let's also make the same setup of machines available for people >> to run their own benchmarks... > >At last winter's Uniforum, I went around to a number of booths trying to >run the infamous > > /bin/time bc << ! > 2^4096 > ! > >At a distressing number of places the sales creatures in the booth would >say things like, "I don't believe we're interested in running any >benchmarks today. Let me show you vi." Now there are some good reasons >for this, but it sure sounded like there was something being hidden. I think we should aim for the bake-off to be done through respective engineering staffs. I really like the sales folks but this is really a technical endeavor. Having the benchmarks at a show is a wonderful idea. It gives the engineering staffs a chance to explain, brag, boast or promise their results to lots of people. By having each machine start with a 'clean' benchmark tape we can remove all doubt about whether everyone used exactly the same sources and were run under the same conditions. >Problem 1 is getting some benchmarks run. Problem 2 is trying to get a >straight answer on the price of the system. What you really want is the >bang/buck of different benchmarks on different boxes. The results would >be an embarrassing to many people wearing suits, which is why it may be >difficulty to get a lot of cooperation. I think the benchmark should be made available well in advance and be made available to the 'world'. 
There is too much comparison of machines using different definitions of performance. This activity would perform a valuable service for the industry.

>PS: Given my druthers, I'd like to see:
>
>	* the bc benchmark above
>	* Dhrystone
>	* Whetstones
>	* A paging thrasher.
>	* A system call overhead checker (looped getpid()s maybe).
>	* A process thrasher.
>
>I'd probably give up on disk speed and tty i/o.

The benchmarks should strive to illustrate how real world programs run on the machines. Dhrystone, as maligned as it is, is useful only if it is one of a number of larger programs - we will need to carefully document the program with a range of optimizations. A page thrasher would be wonderful BUT it is highly dependent on I/O system, configurations, page size, MMU ... in fact so many things that I suspect it wouldn't be useful.

I encourage the readers of this group to search for real programs that range from modest to large size (maybe a couple of hundred Kbytes) that can be run without elaborate setup. They should be:

	Easily checked for correctness
	Not rely on system files (eg, grep of passwd)
	Not use any system commands, if you want to grep, then the code should be part of the benchmark.
	Be examples of integer, single and double precision float, character oriented, pointer oriented - in short a nice mix of different application areas.
	Run long enough to be meaningful - none of this 0.1u times that have more timing error than meaning.

My suggestions include:

	Common benchmarks	Dhrystone, Whetstone, Linpack, Stanford
	Real Programs		Doduc, Timberwolf, UCB Spice, YACC, C compiler (from Stallman)

We should agree ahead of time how the results are to be reported. I suggest that we list individual results under specific conditions and have some weighting method to give a simple result. Maybe, the organizing group could select a base machine and weight the values so that the base machine is one. The VAX 11/780 is often used for this - so why not use it.
It is very good that non-vendors get involved to make sure that fair representation is preserved. Maybe the Uniforum organizing committee can help identify the leaders. Or maybe one of you wants to take the lead. Perhaps it will be known as the X suite, where X is YOU.

LET'S DO IT...
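[Editor's sketch, not part of the original post.] The reporting scheme suggested in this post (individual results, plus one weighted figure scaled so that a base machine such as the VAX 11/780 scores one) could be computed along these lines. The post does not say which weighting method to use; the weighted geometric mean below is an assumption, as are the function names. The timings in the usage example are made-up placeholders, not measurements.

```python
from math import prod


def relative_performance(times, base_times):
    """Map benchmark name -> speed relative to the base machine (base = 1.0).

    Both arguments map benchmark name -> elapsed seconds; smaller is faster.
    """
    return {name: base_times[name] / times[name] for name in times}


def summary_score(ratios, weights=None):
    """Combine per-benchmark ratios into one weighted geometric mean."""
    if weights is None:
        weights = {name: 1.0 for name in ratios}
    total_weight = sum(weights[name] for name in ratios)
    return prod(ratios[name] ** (weights[name] / total_weight) for name in ratios)


if __name__ == "__main__":
    # Placeholder numbers for illustration only.
    base = {"dhrystone": 10.0, "whetstone": 20.0, "yacc": 30.0}
    candidate = {"dhrystone": 2.5, "whetstone": 10.0, "yacc": 10.0}
    ratios = relative_performance(candidate, base)
    for name, ratio in ratios.items():
        print(f"{name}: {ratio:.2f}x base")
    print(f"summary: {summary_score(ratios):.2f}x base")
```

A geometric mean has the property that the base machine scores exactly one against itself and that the ranking of two machines does not depend on which machine is chosen as the base; an arithmetic mean of ratios does not guarantee the latter.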
jack@mcvax.cwi.nl (Jack Jansen) (05/17/87)
In article <396@gumby.UUCP> larry@gumby.UUCP (Larry Weber) writes:
>... A page thrasher would be
>wonderful BUT it is highly dependent on I/O system, configurations, page
>size, MMU ... in fact so many things that I suspect it wouldn't be
>useful.

For me, it would be useful. If I'm looking for a machine to put 30 first-year students on, I'm not interested in CPU performance at all. I just want the system to run fast with 30 vi/cc/a.out users.

As an example: the 3B15 comes out of most benchmarks as a truly lousy machine. However, something these benchmarks never show is that the I/O system is very fast. The same more-or-less holds for VAXen.

My favorite benchmark:

	time ex /usr/dict/words <<FOO
	10000,10060d
	w /tmp/words
	FOO
	time diff /usr/dict/words /tmp/words >/dev/null

This gives you a reasonable idea of how fast vi starts (*very* important), and how fast the I/O system (plus disks, etc) is.
--
Jack Jansen, jack@cwi.nl (or jack@mcvax.uucp)
The shell is my oyster.
spaf@gatech.edu (Gene Spafford) (05/17/87)
Let me make an actual offer: if suitable benchmarks are available by February 1988, we can make them available for a "bench-off" (as per John Gilmore's article, et al) at the 1988 Annual ACM Computer Science Conference in Atlanta (February 23-25). I'm on the program committee for the CSC and would be happy to try to arrange something, either in conjunction with the exhibit or as one of the program events...but only if there is likely to be vendor participation. Contact me if you're interested, either by E-mail or at (404) 894-3807. If I don't get evidence of sufficient interest by about July 15, it won't happen. ...and just wait til you hear some of the other things we've got lined up for the program! Mark those dates on your calendar. -- Gene Spafford Software Engineering Research Center (SERC), Georgia Tech, Atlanta GA 30332 CSNet: Spaf @ GATech ARPA: Spaf@gatech.EDU uucp: ...!{akgua,decvax,hplabs,ihnp4,linus,seismo,ulysses}!gatech!spaf
newton@cit-vax.Caltech.Edu (Mike Newton) (05/18/87)
The list of benchmarks given seems to ignore several important cases. One of these -- a large graphics program -- was pointed out by Tim Kay.

Another thing that I would like to see benchmarked is a large AI program running through either a Lisp or Prolog compiler/interpreter. I personally use Prolog a lot -- and often take up running the naive reverse benchmark on a version of C-Prolog that I have ported to many machines. While licensing restrictions may prevent this at USENIX, the benchmark has proved useful. For example:

[1] The C-Prolog interpreter was FASTER than the HP prolog COMPILER on HP 9000/200 series machines!! (The compiler did have many nice features though...)

[2] On many machines there exists a rough correspondence between Dhrystone benchmarks and LIPS. However, with increasing pipelines, this may no longer be true. Prolog uses LOTS of pointers.

With micros capable of ``17 Million sustained MIPS'' it would probably be possible to write a Prolog compiler that got a fair fraction of 1 million LIPS (current systems on vaxen and suns get ~5000 LIPS interpreted). This claim of speed would be especially true on machines like the proposed AMD 29000 with 64 general use registers -- one of the main things that slowed down the Prolog compiler I wrote the code generator for was the lack of registers on the IBM.

Our compiler ran at about one million LIPS on an IBM 3090. It would be a lot more economically feasible for the average Prolog user to buy a microprocessor based system than a 3090 :-) !

Until that time a standard AI program could be:

[1] A mini-prolog interpreter running a parsing problem, or,

[2] xlisp 1.7? running a small program, or...??

- mike

newton@csvax.caltech.edu 818 356 6771 (afternoons, nights)
amdahl!cit-vax!newton Caltech 256-80, Pasadena CA 91125
ralph@ralmar.UUCP (Ralph Barker) (05/18/87)
Assuming success in obtaining manufacturer participation in the proposed "Bench-Off", what are the thoughts on using Neal Nelson's Business Benchmark(tm) for this purpose?

The series of 18 tests (and multiple copies of each test) which the Nelson benchmark runs appears to provide an excellent picture of not only the various aspects of system performance ("raw" computing power, I/O, disk speed, etc.), but also multi-user performance. The Business Benchmark(tm) does not cover the graphics or compiler performance (i.e. Prolog) issues raised earlier in this discussion, but it does seem to cover most of the other areas.

Although I have personally used Nelson's Business Benchmark in connection with a hardware review done for UNIX/World Magazine, and found the results to be highly useful, others with more extensive benchmarking experience may have other thoughts on the issue (or the validity of Nelson's approach).

(NOTE: Due to the number of iterations within each test, Nelson's benchmark typically takes several hours to run, thus a special subset might be more appropriate within the context of a Usenix conference.)
--
Ralph Barker, RALMAR Business Systems, 640 So Winchester Blvd, San Jose,CA 95128
uucp: ...{ucbvax,hplabs}!sun!idi---\!ralmar!ralph ...pyramid!amdahl!unixprt----/
Voice: (408) 248-8649
johnw@astroatc.UUCP (John F. Wardale) (05/18/87)
In article <396@gumby.UUCP> larry@gumby.UUCP (Larry Weber) writes:
>In article <826@rtech.UUCP> daveb@rtech.UUCP (Dave Brower) writes:
>>	* A paging thrasher.
> A page thrasher would be
>wonderful BUT it is highly dependent on I/O system, configurations, page
>size, MMU ... in fact so many things that I suspect it wouldn't be
>useful.

Just the opposite!! People use WHOLE systems, so a benchmark that loads the WHOLE system IS a very useful benchmark!!!

Before you flame, note: I agree that a loop like

	for ( i=start; i<end; i+=big ) {
		sum += array[i];
	}

would NOT be very useful, but 50-500 lines of code that includes computation and frequent, random accesses to an array of several megabytes would be GOOD.

John W

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Name: John F. Wardale
UUCP: ... {seismo | harvard | ihnp4} !uwvax!astroatc!johnw
arpa: astroatc!johnw@rsch.wisc.edu
snail: 5800 Cottage Gr. Rd. ;;; Madison WI 53716
audio: 608-221-9001 eXt 110

To err is human, to really foul up world news requires the net!
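[Editor's sketch, not part of the original post.] The kind of test described above, some computation mixed with frequent random accesses to an array of several megabytes, might be sketched as follows. It is far shorter than the 50-500 lines the post asks for and is written in Python, so interpreter overhead will dominate unless the array is large enough to exceed physical memory. The function name `thrash`, the default sizes, and the checksum formula are all illustrative assumptions.

```python
import random
import time
from array import array


def thrash(size_bytes=8 * 1024 * 1024, accesses=200_000, seed=1):
    """Mix arithmetic with random reads and writes over a large array.

    Returns (checksum, elapsed_seconds). The checksum lets two machines
    confirm that they did the same work.
    """
    n = size_bytes // 8              # 'q' items are 8 bytes each
    data = array("q", range(n))
    rng = random.Random(seed)        # fixed seed so runs are comparable
    checksum = 0
    start = time.perf_counter()
    for _ in range(accesses):
        i = rng.randrange(n)         # random index, so no regular stride
        checksum = (checksum * 31 + data[i]) % 1_000_003
        data[i] = checksum           # write back, so pages become dirty
    elapsed = time.perf_counter() - start
    return checksum, elapsed


if __name__ == "__main__":
    checksum, elapsed = thrash()
    print(f"checksum {checksum}, {elapsed:.3f} s")
```

Unlike the fixed-stride loop quoted in the post, the access pattern here depends on both the random sequence and earlier results, which makes it harder for a prefetcher or an optimizer to short-circuit.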
howard@cpocd2.UUCP (Howard A. Landman) (05/20/87)
Organization: Intel Corp. ASIC Services Organization, Chandler AZ

>In article <826@rtech.UUCP> daveb@rtech.UUCP (Dave Brower) writes:
>>At last winter's Uniforum, I went around to a number of booths trying to
>>run the infamous
>>	/bin/time bc << !
>>	2^4096
>>	!
>>At a distressing number of places the sales creatures in the booth would
>>say things like, "I don't believe we're interested in running any
>>benchmarks today. Let me show you vi." Now there are some good reasons
>>for this, but it sure sounded like there was something being hidden.

In article <4329@nsc.nsc.com> grenley@nsc.UUCP (George Grenley) writes:
>No, they just don't want to risk a system crash, or other malicious use.
>Most show-people aren't Unix gurus, and are hesitant to let a hacker play
>with the system on the show floor.
>
>There are a lot of *ssholes at tradeshows who delight in trying (and
>succeeding, occasionally) in crashing systems. Personally, I never let
>such a person near a machine, period, no matter how much he protests
>the "innocence" of his program.

I'm glad SOMEONE has a way to tell a customer from an *sshole. ;-)

At Electro a few years back, I was allowed to type on a teensy UNIX box in someone's booth. I decided to see if there was anyone else logged on (it was claimed to be a multi-user system, and there were other terminals scattered about) and try "write" or "talk". There was only one user logged in, however: "root". This kind of idiocy is probably why many "malicious" crashes occur. Far too many sales people leave themselves logged in as root so they don't ever run into permission problems. I was almost tempted to do "rm -r /".

DEC uVAX-IIs come from the factory with an account "field" that has no password and has root privileges.
I found several machines at DAC last year which still had that account active, with no password, and random users were logging in on them. All it would have taken is "cat /etc/passwd", and ... Also at last year's DAC, it was lots of fun hanging around the booth of a Very Very Large Computer Company and watching their so-called RISC workstation crash. For example, their csh crashed when fed "set i = 1; @ i++; echo $i", which should simply echo "2". (It's the @ i++ that died. To be fair, the latest release of their OS fixes this bug.) And they left one machine dead with a panic message on its screen for over 10 minutes before one of the sales people noticed me peering at it; his solution was to stand between me and the screen! No *ssholes were required, just bugs! A computer needs to be *RELIABLE*. You find out how reliable by, among other methods, stress testing the system, trying to exercise *ALL* the features, not just the ones in the canned demo. If I can crash a system in five minutes doing things that are normal, legal, and *NECESSARY* for everyday function, then I know it can't possibly be reliable. Does this make me malicious? I am reminded of the account in Richard Feynman's biography of his exploits as an amateur safecracker. Once, he told a military officer that his security procedures were lax, because it was possible to figure out the combination of a safe by playing with it while the door was open. He then recommended that people be told to keep their safes closed except when necessary. The officer's solution to the problem was to tell everyone in the facility to change their lock combination each time Feynman had been seen in their area! -- Howard A. Landman ...!intelca!mipos3!cpocd2!howard howard%cpocd2%sc.intel.com@RELAY.CS.NET "You just ask them?"
davidsen@steinmetz.steinmetz.UUCP (William E. Davidsen Jr) (05/20/87)
In article <401@ralmar.UUCP> ralph@ralmar.UUCP (Ralph Barker) writes:
|
|Assuming succes in obtaining manufacturer participation in the proposed
|"Bench-Off", what are the thoughts of using Neal Nelson's Business
|Benchmark(tm) for this purpose?

As the developer of a benchmark suite of my own I would love to cast bricks at the Nelson suite. In truth it's a pretty good set of benchmarks, and has been run on hundreds of configurations. I agree that it would be a suitable measure of machines.

After doing benchmarks for about 15 years now, I will assure everyone that the hard part is not in getting reproducible results, but in (a) deciding how these relate to the problem you want to solve, and (b) getting people to believe that there is no "one number" which can be used to characterize performance.

If pressed I use the reciprocal of the total real time to run the suite. It's as good as any other voodoo number...
--
bill davidsen sixhub \ ARPA: wedu@ge-crd.arpa ihnp4!seismo!rochester!steinmetz -> crdos1!davidsen chinet / "Stupidity, like virtue, is its own reward"
mike@hcr.UUCP (Mike Tilson) (05/21/87)
There has been a lot of discussion about running some good benchmarks at Usenix. Many people have made suggestions of what should be run, with no consensus emerging, except a general feeling that benchmarks at the vendor exhibit would be a good thing, and that a grand comparison involving all vendors should be conducted. What people are really trying to do is to generate a single "performance" number, and to circumvent the "less-than-trustworthy" guys in suits who prevent ready access to machines at a show. However, I think a good benchmark requires a lot of thought, and it can only be run in reference to an intended application. That is why some commercial benchmark products take hours to run -- they gather lots of data that a customer can interpret depending on the intended application. (Maybe you want I/O performance, or MIPs, or FLOPs, or paging rate, etc.) This kind of careful analysis can't be done properly on the floor of a trade show. I think a "Usenix Benchmark Contest" would tend to perpetuate the misuse of benchmarks, and for that reason I'd suggest it isn't a good idea. Benchmarks are readily available for use by customers who are serious about comparing performance. A trade show benchmark war would encourage increased marketing hype with very little hard technical information. I'd like to see less of this at Usenix, rather than more. /Michael Tilson, HCR Corporation, {utzoo,ihnp4,...}!hcr!mike
ed@plx.UUCP (Ed Chaban) (05/21/87)
In article <4329@nsc.nsc.com>, grenley@nsc.nsc.com (George Grenley) writes: > A while back I invited other chip manufacturers to join me in a CPU > horse race. I'm happy to see this request is generating interest. > However, I haven't heard much from Mot or Intel. Are you guys > listening? Iknow you're out there... > > >At a distressing number of places the sales creatures in the booth would > >say things like, "I don't believe we're interested in running any > >benchmarks today. Let me show you vi." Now there are some good reasons > >for this, but it sure sounded like there was something being hidden. It seems that Neal Nelson's Benchmark has been getting a *LOT* of ink lately. Most of my observations about RISC and CISC systems come from my work with this particular chunk of code. Those who have seen it will attest to the almost Fortranlike abuse of GOTOs, but I can't imagine why this should give a CISC machine an advantage. -ed-
csg@pyramid.UUCP (05/21/87)
I've always used a vendor's willingness to let me hack on their trade-show machines as an indicator of how solid that machine was. Some vendors (notably Symmetric, and recently DEC) actually grab people off the floor and encourage them to come play with their machine. I also recall bringing up Interleaf on an Apollo DOMAIN/IX node at a show, and subsequently found myself demoing the product since I knew more about Interleaf than the sales critters did.... (Of course, I explained to the observers that Interleaf was also available on Sun and MicroVAX workstations.... :-))

I also know of at least one person who checks out all the machines and tries to break security -- an impromptu version of Gould's Secure UNIX challenge. The variations from vendor to vendor on the show floor are astonishing.

>If I can crash a system in five minutes doing things that are normal, legal,
>and *NECESSARY* for everyday function, then I know it can't possibly be
>reliable.

At the above Apollo demo, I innocently managed to crash the node. That did not lower my opinion of the system, though; I would rather they encouraged me to hack on a system that they openly admitted was a pre-release product, instead of hiding behind a cloak of secrecy and insisting they had a finished product. (And the bugs I found were fixed before FCS.)

>I am reminded of the account in Richard Feynman's biography of his exploits
>as an amateur safecracker....

This is not at all different from trade-show booths that "blackball" certain people, knowing they are competitors or crackers. Yes, they really do this.

<csg>
mash@mips.UUCP (John Mashey) (05/22/87)
In article <7387@boring.mcvax.cwi.nl> jack@boring.UUCP (Jack Jansen) writes: >In article <396@gumby.UUCP> larry@gumby.UUCP (Larry Weber) writes: >>... A page thrasher would be >>wonderful BUT it is highly dependent on .... >For me, it would be useful. If I'm looking for a machine to put >30 first-year students on, I'm not interested in CPU performance at >all. I just want the system to run fast with 30 vi/cc/a.out users.... >My favorite benchmark: >time ex /usr/dict/words <<FOO >10000,10060d >w /tmp/words >FOO >time diff /usr/dict/words /tmp/words >/dev/null As larry says, real page-thrashers are highly dependent on a lot of attributes. That doesn't mean they're bad tests, merely that they're extremely hard to do in a controlled way. In particular, you often see radically different results according to buffer cache sizes, for example. Also, on this test, CPU (user + sys) is about 70-85% of real time, i.e., I think you would care about CPU performance, since it's as important as the disks in the timing. What sort of numbers were you getting on the 3B15s? -- -john mashey DISCLAIMER: <generic disclaimer, I speak for me only, etc> UUCP: {decvax,ucbvax,ihnp4}!decwrl!mips!mash, DDD: 408-720-1700, x253 USPS: MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086
roger@celtics.UUCP (05/26/87)
In article <691@cpocd2.UUCP> howard@cpocd2.UUCP (Howard A. Landman) writes: >Far too many sales people leave themselves logged in as root so they don't >ever run into permission problems. I was almost tempted to do "rm -r /". > Why is this unreasonable? It's THEIR demo... if they don't bolt down a laser printer they're exhibiting and turn their backs, do you have the right to steal it because it's "unprotected"? People running a booth at a trade show are often (a) technically out of their league, and (b) there to perform sales-oriented activities, which is their skill. We often cannot afford to have heavy tech types in booths; in fact, it's often counterproductive. (I think of the technical marketing person who stood in our booth a few years ago, and when asked: "Do you have NFS?" "Do you have LISP?" "Do you have MACSYMA?" "Do you have a version of TeX?" "Do you run GNU Emacs?"... responded, "NO! These are our products, just look at the list." Made a lot of friends, she did... and, by the way, all the requested stuff was either about to be released or being worked on at customer sites...) I can understand the temptation to exercise known bugs. But there's no reason to interfere with people's livelihood when your test is either destructive or time-wasting. If you want to test these things, either make arrangements to do them at a local office or during slow booth-time, or check with the booth staff and let them know the possible consequences of your acts. The public does need to be protected from genuinely bad products, but the sort of "I'm gonna trash you - you deserve it because you haven't fixed an obscure bug or you left your system wide open to me" games often played by hackers who are in an exhibition hall to exhibit themselves and not to see and evaluate the products legitimately are just indefensible. Those hackers generally show themselves off, all right, in the most appropriate light. 
>And they left one machine dead >with a panic message on its screen for over 10 minutes before one of the >sales people noticed me peering at it; his solution was to stand between me >and the screen! No *ssholes were required, just bugs! > Odds are the salesperson COULDN'T reboot the system. Given a choice between my reps knowing how to boot my system and knowing how to prospect, I'll take the latter any day. You're such a big shot as to take pleasure in bringing their demo system down, bring it up again... if I owned a grocery store and you knocked down a display, I'd expect you to at least offer to pick it up. >A computer needs to be *RELIABLE*. You find out how reliable by, among other >methods, stress testing the system, trying to exercise *ALL* the features, >not just the ones in the canned demo. If I can crash a system in five minutes >doing things that are normal, legal, and *NECESSARY* for everyday function, >then I know it can't possibly be reliable. Does this make me malicious? > If you're doing it in a public exhibition, yes. The point of security is to protect systems and data THAT IS REASONABLY AT RISK. At a show, the risk is not reasonable; it's imposed by crybabies who have nothing better to do. Systems at a trade show are physically secure, in that their owners control physical access. If you are granted access, you're a guest, and should behave like one. By all means, exercise the systems (within the time and resource limits given you by the vendor), but if you feel the urge to destroy, go out and punch a Bo-Bo doll. -- ///==\\ (No disclaimer - nobody's listening anyway.) /// Roger B.A. Klorese, CELERITY (Northeast Area) \\\ 40 Speen St., Framingham, MA 01701 +1 617 872-1552 \\\==// celtics!roger@seismo.CSS.GOV - seismo!celtics!roger
bzs@bu-cs.UUCP (05/28/87)
Although I have nothing *against* a benchmark suite I still claim that this is becoming less of an issue when compared against the richness of the environment. Going full throttle for the flames, I wouldn't trade my (mere :-) 2MIP Sun3/160 on (beside) my desk for a 10 MIP, vanilla System V dumb terminal, no network, no job-control environment, you'd have to pry the Sun out of my dead hands (tho I'd take a 1 MIP SYSV over a 100MIP VMS system, it's all relative.) I suspect I'm not alone in my opinion (not the particular systems, but the idea that the quality of the software is beginning to outweigh mere speed improvement.)

Think of it this way, I have an IBM3090/200 with two vector processors (that's around 40MIPs and Cray-1S on floating point) at my disposal, trust me, it's faster 'n hell, it's astoundingly fast, but the software environment is so primitive I rarely log into it (and I certainly know my way around an IBM system, it's not some fundamental problem.) Oh, some number crunchers use it, and good for them, but boy is that crowd getting relatively small (there are plenty of number crunchers around here who would rather wait for their SUN3 or similar box as far as I can tell.)

How we gonna measure that? I honestly think beyond some lower bounds software is getting very important (and besides, they go hand in hand to a great extent, you don't see too many window-oriented systems on .5 MIP boxes, then again, the Mac comes close and I'd be happy to argue its virtues for getting one thru the day, we have those also, blows the doors off the 3090 on people-performance for many daily tasks.)

And what about things like upgradeability (like my Encore that I can jack up to around 40 (parallel) MIPs by just adding CPU boards)? I know one major vendor whose only idea of an "upgrade" is you throw the 'old' $400K box away and buy a new $800K one...swell. Or a coherent plan to spread the MIPs into the user's offices?
I still think there's a certain air of unreality to this whole "my iron is bigger than your iron" thing. Oh, it's important, it's just no longer a sufficient claim, certainly not enough to sell me on a box.

You can say "well, then assuming the two software environments are equal..." But they rarely are, often they're disastrously different between two boxes. I agree it's a much harder measure, but is that what we're after? The cheap shots?

	-Barry Shein, Boston University
ps@celerity.UUCP (Pat Shanahan) (05/28/87)
In article <2128@hoptoad.uucp> gnu@hoptoad.uucp (John Gilmore) writes:
>In article <4294@nsc.nsc.com>, grenley@nsc.nsc.com (George Grenley) writes:
>> So, here's the deal. I invite Mot, Intel, and other interested parties
>> to work with me in defining some sort of realistic benchmark, which we'll
>> run (in public). I expect to have system level hardware late this year,
>> so if we get started now, we'll have very interesting Xmas presents...
>> May the best CPU win!

Defining a realistic benchmark set seems a good objective. I do think that it is important to realize that performance is a multi-dimensional quantity. The only situation in which it is realistic to say "machine X is twice as fast as machine Y" is if X and Y have identical architectures. Any benchmark that produces a single number is likely to be misleading, unless it is understood to be very specialized.

>...
>I'd like to join in the hoopla, rahrah, etc that has followed this
>suggestion, and make a further one:
>
>   Let's have the bake-off in the trade show at, say, next Winter
>   Usenix. Probably the actual setup and running of the benchmarks
>   can be done a day or two before the show, so the results can be
>   printed for distribution, and to give the losers time to think
>   up (and print up) good explanations before we descend on them :-).
>...
>--
>Copyright 1987 John Gilmore; you may redistribute only if your recipients may.
>(This is an effort to bend Stargate to work with Usenet, not against it.)
>{sun,ptsfa,lll-crg,ihnp4,ucbvax}!hoptoad!gnu gnu@ingres.berkeley.edu

I doubt whether I could get anything unusual run on a system that is going to a trade show during the last few days before the show. The system would normally be being loaded with known demo programs and crated up for transport. It is also a very busy time for the people involved in the show.
I think it would be better to define the benchmark in advance and encourage manufacturers to run it on a variety of configurations, rather than just on what happens to be at one show.

I suggest using existing benchmarks where possible. For example, measure Fortran loop performance with Livermore loops, rather than writing a new benchmark.

--
	ps
	(Pat Shanahan)
	uucp : {decvax!ucbvax || ihnp4 || philabs}!sdcsvax!celerity!ps
	arpa : sdcsvax!celerity!ps@nosc
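The point above that a single-number benchmark is likely to mislead can be sketched with a small example. Everything in it is an assumption made for illustration: the machine names X and Y, the two benchmark names, the run times, and the two workload weightings are invented, not measurements of any real system.

```python
# Hypothetical run times in seconds (lower is better); these are invented
# numbers for illustration, not measurements of any real machine.
TIMES = {
    "X": {"fortran_loops": 10.0, "compile": 40.0},
    "Y": {"fortran_loops": 20.0, "compile": 20.0},
}


def speed_ratios(times, a, b):
    """Per-benchmark speedup of machine a over machine b (>1 means a is faster)."""
    return {name: times[b][name] / times[a][name] for name in times[a]}


def weighted_time(times, machine, weights):
    """Collapse one machine's results into a single number: a weighted sum of run times."""
    return sum(weights[name] * t for name, t in times[machine].items())


def winner(times, weights):
    """Machine with the lowest weighted time under the given workload mix."""
    return min(times, key=lambda m: weighted_time(times, m, weights))


# Two hypothetical workload mixes (weights sum to 1).
number_cruncher = {"fortran_loops": 0.9, "compile": 0.1}
developer = {"fortran_loops": 0.1, "compile": 0.9}

ratios = speed_ratios(TIMES, "X", "Y")
print(ratios)
print(winner(TIMES, number_cruncher), winner(TIMES, developer))
```

With these made-up numbers X is twice as fast as Y on one benchmark and half as fast on the other, so the "winner" depends entirely on the weights chosen. Reporting the per-benchmark results keeps that information; a single score hides it.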
rcopm@yabbie.oz (Paul Menon) (05/30/87)
> In-reply-to: mike@hcr.UUCP's message of 20 May 87 22:59:51 GMT
>
> Although I have nothing *against* a benchmark suite I still claim that
> this is becoming less of an issue when compared against the richness
> of the environment. Going full throttle for the flames, I wouldn't
> trade my (mere :-) 2MIP Sun3/160 on (beside) my desk for a 10 MIP,
> vanilla System V dumb terminal, no network, no job-control
> environment, you'd have to pry the Sun out of my dead hands (tho I'd
> take a 1 MIP SYSV over a 100MIP VMS system, it's all relative.)

I totally agree! It all boils down to how productive these machines allow you to be, AND STILL LET YOU SMILE AT THE END OF THE DAY! That's my definition of a friendly user interface. After all, it is *you*, the programmer or enduser, that is the biggest factor affecting progress. It doesn't matter how hefty a machine you have in front of you, if it don't make you smile, it don't make you work. If you don't work - it don't work. Simple.

It seems to be the trend with computers that the more powerful they are, the more moronic they get. What good is brawn without brains? Sure they run heaps faster, but who develops the programs for them? Surely we can't count on all-but-extinct languages (FORTRAN, COBOL) to hang around forever running on state of the art hardware? Then again, I guess it depends on mentality, eh?

Apologies if people deem my response to be in the wrong newsgroup, but some of us would be in the dark if it wasn't for bignoses catching the light.

Paul Menon.
Dept of Communication & Electronic Engineering,
Royal Melbourne Institute of Technology,
124 Latrobe St, Melbourne, 3000, Australia
ACSnet: rcopm@yabbie
UUCP: ...!seismo!munnari!yabbie.rmit.oz!rcopm
CSNET: rcopm@yabbie.rmit.oz
ARPA: rcopm%yabbie.rmit.oz@seismo
BITNET: rcopm%yabbie.rmit.oz@CSNET-RELAY
PHONE: +61 3 660 2619.