davet@oakhill.UUCP (David Trissel) (02/11/90)
In article <1850@cbnewsi.ATT.COM> ca@cbnewsi.ATT.COM (christopher.arnone) writes:
>
>Initial information about the 040 indicates that it will be faster than
>the SPARC at 25 MHz. Of course, this remains to be seen.

The fastest Dhrystone 2.1 I have seen reported for SPARC (obtained from the
report file delivered with the 4/89 Usenet distribution of the benchmark) is
23,148 KDhrys. The MC68040 runs that benchmark at twice the speed.

-- Dave Trissel - Motorola Semiconductor, Austin Texas
jkrueger@dgis.dtic.dla.mil (Jon) (02/12/90)
davet@oakhill.UUCP (David Trissel) writes:
>The fastest Dhrystone 2.1 I have seen reported for SPARC (obtained from the
>report file delivered with the 4/89 Usenet distribution of the benchmark) is
>23,148 KDhrys. The MC68040 runs that benchmark at twice the speed.

Inside Moto or elsewhere? What compiler?

-- Jon
--
Jonathan Krueger jkrueger@dtic.dla.mil uunet!dgis!jkrueger
The Philip Morris Companies, Inc: without question the strongest and best
argument for an anti-flag-waving amendment.
wbeebe@rtmvax.UUCP (Bill Beebe) (02/12/90)
In article <2938@oakhill.UUCP> davet@oakhill.UUCP (David Trissel) writes:
>
>The fastest Dhrystone 2.1 I have seen reported for SPARC (obtained from the
>report file delivered with the 4/89 Usenet distribution of the benchmark) is
>23,148 KDhrys. The MC68040 runs that benchmark at twice the speed.
>
> -- Dave Trissel - Motorola Semiconductor, Austin Texas

Protestations aside, nothing will lend truth to the Moto numbers until some
independent non-Moto numbers come back from *real* working silicon and
systems. To re-quote a very tired old paraphrase, "there are lies, damn
lies, and then there are vendor benchmarks". I am intrigued by Heurikon's
(please forgive the spelling) 25 MHz 68040 VME card, for which they claim
only 14 or so VAX mips. Questions: what is the correspondence between VAX
mips and 040 mips (is it 1:1?); what board-level architecture did Heurikon
use on their board?

Something else that's interesting. In the February 7th Microprocessor
Report, page 4, under new SPEC numbers, a Moto system with a 33 MHz 88K came
up with a 17.8 SPECmark. Congratulations. However, the article goes on to
note that the 88K SPECmark was only 1% over the SPARC's 17.6 SPECmark (as
well as the MIPS). I would be most interested to see SPECmarks for the
Heurikon board (or any other system) running the 040 at 25 MHz or even 33
MHz.

The old Chinese curse has indeed come true. We do indeed live in interesting
times.
mash@mips.COM (John Mashey) (02/12/90)
In article <3085@rtmvax.UUCP> wbeebe@rtmvax.UUCP (Bill Beebe) writes:
>In article <2938@oakhill.UUCP> davet@oakhill.UUCP (David Trissel) writes:
>>
>>The fastest Dhrystone 2.1 I have seen reported for SPARC (obtained from the
>>report file delivered with the 4/89 Usenet distribution of the benchmark) is
>>23,148 KDhrys. The MC68040 runs that benchmark at twice the speed.
>>
>> -- Dave Trissel - Motorola Semiconductor, Austin Texas
>
>Protestations aside, nothing will lend truth to the Moto numbers until some
>independent non-Moto numbers come back from *real* working silicon and
>systems. To re-quote a very tired old paraphrase, "there are lies, damn
>lies, and then there are vendor benchmarks". I am intrigued by Heurikon's
>(please forgive the spelling) 25 MHz 68040 VME card, for which they claim
>only 14 or so VAX mips. Questions: what is the correspondence between VAX
>mips and 040 mips (is it 1:1?); what board-level architecture did Heurikon
>use on their board?
>
>Something else that's interesting. In the February 7th Microprocessor
>Report, page 4, under new SPEC numbers, a Moto system with a 33 MHz 88K came
>up with a 17.8 SPECmark. Congratulations. However, the article goes on to
>note that the 88K SPECmark was only 1% over the SPARC's 17.6 SPECmark (as
>well as the MIPS). I would be most interested to see SPECmarks for the
>Heurikon board (or any other system) running the 040 at 25 MHz or even 33
>MHz.

A bunch of people at various companies are busily stuffing SPEC numbers into
spreadsheets, plus published mips-ratings, and analyzing. I'm also trying
to calibrate i486 and 68040 numbers into this scheme.

NOTE: regarding Dhrystone:
 a) A bunch of Motorola people have been working hard along with the
    rest of the SPECers to get better benchmarks, and have started getting
    good compiler gains by analyzing real programs.
    Talking about Dhrystone is a step backwards...
 b) It has been discussed many times in this newsgroup why one has to
    be careful with Dhrystone ratings. Also, I quote from the author's
    directions:
"In any case, for serious performance evaluation, users are advised to
ask for code listings and to check them carefully."
    EVERYBODY knows that inlining strcpy&strcmp can boost the number
    strongly without giving anything like that boost on real programs.
    SO POST THE CODE WHERE THE CRUCIAL STRCPY/STRCMP calls are made;
    otherwise, the number is simply meaningless, because anybody can
    boost the performance substantially on Dhrystone by an optimization
    that has relatively little effect on real programs.

--
-john mashey DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: {ames,decwrl,prls,pyramid}!mips!mash OR mash@mips.com
DDD: 408-991-0253 or 408-720-1700, x253
USPS: MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086
lindsay@MATHOM.GANDALF.CS.CMU.EDU (Donald Lindsay) (02/12/90)
In article <35825@mips.mips.COM> mash@mips.COM (John Mashey) writes:
>
>NOTE: regarding Dhrystone:
> a) A bunch of Motorola people have been working hard along with the
> rest of the SPECers to get better benchmarks, and have started getting
> good compiler gains by analyzing real programs.
> Talking about Dhrystone is a step backwards...

Tuning compilers for it is also a step backwards.

In a previous life, I worked on highly optimizing compilers. Adjusting
our product to make one benchmark run better would make others run
worse. I am convinced that compilers tuned for Dhrystone are in fact
badly tuned.

We will be doing ourselves a favor by demanding good SPECmarks instead
of good Dhrystones.

--
Don D.C.Lindsay Carnegie Mellon Computer Science
davet@oakhill.UUCP (David Trissel) (02/12/90)
In article <35825@mips.mips.COM> mash@mips.COM (John Mashey) writes:

>>> is 23,148 KDhrys. The MC68040 runs that benchmark at twice the speed.

>A bunch of people at various companies are busily stuffing SPEC numbers into
>spreadsheets, plus published mips-ratings, and analyzing. I'm also trying
>to calibrate i486 and 68040 numbers into this scheme.

What does this have to do with Dhrystone?

> a) A bunch of Motorola people have been working hard along with the
> rest of the SPECers to get better benchmarks, and have started getting
> good compiler gains by analyzing real programs.
> Talking about Dhrystone is a step backwards...

As has been discovered, Dhrystones are an excellent way to find out how fast
string operations go and to what extent C compilers incorporate them in-line.
And the 1.1 version, since it is essentially a big NOP, can go a long way
towards indicating how good a compiler is at removing dead code.

The Dhrystone benchmarks have known weaknesses. The SPEC benchmarks have their
own. Many people are interested in Dhrystone so it gets talked about. If you
don't care for discussions on Dhrystone then simply ignore them.

> b) It has been discussed many times in this newsgroup why one has to
> be careful with Dhrystone ratings. Also, I quote from the author's
> directions:
>"In any case, for serious performance evaluation, users are advised to
>ask for code listings and to check them carefully."

This is true for ALL benchmarks. Do you think it only applies to Dhrystone?
Do you think it does not apply to the SPEC benchmarks? I know I wouldn't
be choosing a computer architecture without looking at the code the compiler
produces.

> EVERYBODY knows that inlining strcpy&strcmp can boost the number
> strongly without giving anything like that boost on real programs.
> SO POST THE CODE WHERE THE CRUCIAL STRCPY/STRCMP calls are made;
> otherwise, the number is simply meaningless, because anybody can
> boost the performance substantially on Dhrystone by an optimization
> that has relatively little effect on real programs.

I fail to understand your tone here. By your own admission in a posting
you did to this newsgroup on March 15, 1989:

   "Now, according to the letter or the law of Herr Doktor Weicker's
   Dhrystone 2.1 writeup, it's OK to in-line strcpy and strcmp."

and this is what the MC68040 compiler does. So just what is the problem?
Here is one of the string copies (they all look similar) directly from the
benchmark's .s file:

        lea.l   (12,%sp),%a5
        mov.l   %a5,%a1
        mov.l   &L%93,%a0
        mov.l   (%a0)+,(%a1)+
        mov.l   (%a0)+,(%a1)+
        mov.l   (%a0)+,(%a1)+
        mov.l   (%a0)+,(%a1)+
        mov.l   (%a0)+,(%a1)+
        mov.l   (%a0)+,(%a1)+
        mov.l   (%a0)+,(%a1)+
        mov.w   (%a0)+,(%a1)+
        mov.b   (%a0)+,(%a1)+

Now let's see you post the code that your MIPS compiler produces. Then tell
us what you find to be relevant about the two postings.

-- Dave Trissel - Motorola, Austin
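Editorial note: the move sequence in the listing above can be checked with simple arithmetic. Seven long moves, one word move and one byte move copy 7*4 + 2 + 1 = 31 bytes, which is consistent with a 30-character string constant plus its terminating NUL. The Python sketch below reproduces that split with a greedy decomposition. The name `plan_moves` is hypothetical, and this is only an illustration of why the sequence has that shape, not a description of how the Motorola compiler actually chooses its moves.

```python
def plan_moves(nbytes, sizes=(4, 2, 1)):
    """Greedily split a fixed-length copy into moves of the given byte sizes.

    Returns a list of (move_size_in_bytes, count) pairs, largest size first.
    """
    plan = []
    remaining = nbytes
    for size in sizes:
        count, remaining = divmod(remaining, size)
        if count:
            plan.append((size, count))
    return plan


# The listing shows 7 long (4-byte), 1 word (2-byte) and 1 byte move.
listing = [(4, 7), (2, 1), (1, 1)]
total = sum(size * count for size, count in listing)

print(total)           # 31
print(plan_moves(31))  # [(4, 7), (2, 1), (1, 1)]
```

The point of the sketch is only that an unrolled copy like this is possible when the length is a compile-time constant; it says nothing about copies whose length or alignment is unknown.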
mash@mips.COM (John Mashey) (02/13/90)
In article <2943@oakhill.UUCP> davet@oakhill.UUCP (David Trissel) writes:
>In article <35825@mips.mips.COM> mash@mips.COM (John Mashey) writes:
>
>>>> is 23,148 KDhrys. The MC68040 runs that benchmark at twice the speed.
>
>>A bunch of people at various companies are busily stuffing SPEC numbers into
>>spreadsheets, plus published mips-ratings, and analyzing. I'm also trying
>>to calibrate i486 and 68040 numbers into this scheme.
>
>What does this have to do with Dhrystone?

Sorry, among other things, when you start looking at such data, you see that:
 a) Dhrystone correlates with integer performance on real benchmarks
    within machine lines, with the same compilers, at least somewhat.
 b) It has some correlation among machine lines.
 c) If one machine uses the inline, and one doesn't, the difference
    in performance badly mispredicts the performance on realistic programs.

>The Dhrystone benchmarks have known weaknesses. The SPEC benchmarks have their
>own. Many people are interested in Dhrystone so it gets talked about. If you
>don't care for discussions on Dhrystone then simply ignore them.

Impossible: it causes too much confusion, and I have to keep explaining it to
financial analysts, and I'm tired of that. The SPEC benchmarks have their
own weaknesses of course, but they're hardly in Dhrystone's class.

>>"In any case, for serious performance evaluation, users are advised to
>>ask for code listings and to check them carefully."

>This is true for ALL benchmarks. Do you think it only applies to Dhrystone?
>Do you think it does not apply to the SPEC benchmarks? I know I wouldn't
>be choosing a computer architecture without looking at the code the compiler
>produces.

Of course, but in Dhrystone's case, if all you have is the numbers for
two machines, you know very little about their relative performance
without looking at the code; it is especially irksome that it contains
an optimization that improves its performance greatly but simply
does not improve realistic programs significantly. (That doesn't mean that
selective inlining of strings is bad; in fact, if Dhrystone contained
a REPRESENTATIVE set of string operations, I wouldn't object so much,
but it doesn't.)
>
>> EVERYBODY knows that inlining strcpy&strcmp can boost the number
>> strongly without giving anything like that boost on real programs.
>> SO POST THE CODE WHERE THE CRUCIAL STRCPY/STRCMP calls are made;
>> otherwise, the number is simply meaningless, because anybody can
>> boost the performance substantially on Dhrystone by an optimization
>> that has relatively little effect on real programs.
>
>I fail to understand your tone here. By your own admission in a posting
>you did to this newsgroup on March 15, 1989:
>
> "Now, according to the letter or the law of Herr Doktor Weicker's
> Dhrystone 2.1 writeup, it's OK to in-line strcpy and strcmp."

Yes, but subject to the comment above, which most people will not do, i.e.,
hardly anyone shows the code for this. The SPEC benchmarks were chosen to
allow any optimization you like, but have the effect that there are very
few optimizations you can do that won't help lots of real programs.
>
>and this is what the MC68040 compiler does. So just what is the problem?
>Here is one of the string copies (they all look similar) directly from the
>benchmark's .s file:
>
>        lea.l   (12,%sp),%a5
>        mov.l   %a5,%a1
>        mov.l   &L%93,%a0
>        mov.l   (%a0)+,(%a1)+
>        mov.l   (%a0)+,(%a1)+
>        mov.l   (%a0)+,(%a1)+
>        mov.l   (%a0)+,(%a1)+
>        mov.l   (%a0)+,(%a1)+
>        mov.l   (%a0)+,(%a1)+
>        mov.l   (%a0)+,(%a1)+
>        mov.w   (%a0)+,(%a1)+
>        mov.b   (%a0)+,(%a1)+

Good! You at least did it correctly in the general case, unlike the i860,
which pads this to 32 bytes so it can do 2 quad-word loads & stores...
>
>Now let's see you post the code that your MIPS compiler produces. Then tell
>us what you find to be relevant about the two postings.

Dhrystone usually overpredicts VAX-relative performance; on most machines,
if I know that this inlining is being done, I can estimate that it
overpredicts it another 20-30%. That's what's relevant.
The numbers we use all come from:

        jal     strcpy

and I've seen the SPARC code as the equivalent; meaning, I think such things
don't overpredict as much (they still overpredict, and this has been
well-documented for years in published materials.)

And the reason (we don't inline str*) is:
 a) When you inline code it gets bigger.
 b) You might want to inline it only in those places it's called a lot.
 c) But there's a complicated set of rules for when it's really a good
    idea in general. Among other things, MOST strcpy's aren't of constants,
    they're of pointers to things whose alignment can't be predicted,
    or at least the target is some arbitrary pointer, and then this
    optimization doesn't work very well. The only one I've seen that
    looked like it would really pay off is inlining strcpy's of small
    constants (1-2 bytes), or ones where you happen to know the alignment,
    and then up to a few words.
 d) Remember, we actually do full-bore inlining in the general case....
    but are forbidden by the rules from using it.... and we don't.

Here's the bottom line: either Dhrystone is a good predictor of integer
performance on real programs, or it isn't. If it is (and it once almost
used to be), then it's a Good Thing, because it's simple and easy to use.
If it doesn't correlate well with performance on real programs, then it's
become obsolete.

Rather than replowing ground that has been plowed for years, let's try
something else, as a bottom line, and get something concrete:

QUIZ: It is claimed that a 25MHz 68040 is 2X faster than a 25MHz SPARC on
Dhrystone; for concreteness, consider a 68040 with at least 64K external
cache.
 a) Will it be 2X faster on the Geometric Mean of the 4 SPEC C
    benchmarks? (Using the same compiler as Dhrystone.)
 b) Will it be more than 2X?
 c) Will it be less than 2X?
 d) Will it be a lot less than 2X, in fact, maybe closer to 1X?

I'd encourage anyone who posts to post an Analysis to back up their
opinion, with some data; I'm working on a Guesstimate for about a
week from now.

If someone prefers other realistic benchmarks, that would be a good
exercise as well.

In any case, thanx to Mr. Trissel for properly qualifying the Dhrystone
number; this actually helps a lot.

--
-john mashey DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: {ames,decwrl,prls,pyramid}!mips!mash OR mash@mips.com
DDD: 408-991-0253 or 408-720-1700, x253
USPS: MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086
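Editorial note: the quiz above rests on two small pieces of arithmetic, the geometric mean of per-benchmark speed ratios and the 20-30% by which inlining is estimated to inflate a Dhrystone ratio. The Python sketch below shows both. Everything specific in it is an assumption for illustration: the function names are hypothetical, the four ratios are invented placeholders rather than measurements, and the overprediction is treated as a simple multiplicative factor, which the post does not spell out.

```python
import math


def geometric_mean(ratios):
    """Geometric mean: the n-th root of the product, computed via logarithms."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


def discount_inlining(dhrystone_ratio, overprediction):
    """Divide out an ASSUMED multiplicative overprediction (0.20 means 20%)."""
    return dhrystone_ratio / (1.0 + overprediction)


# PLACEHOLDER speed ratios for four C benchmarks; invented, not measured.
placeholder_ratios = [1.0, 1.5, 0.8, 1.2]
print(round(geometric_mean(placeholder_ratios), 3))  # 1.095

# A claimed 2X Dhrystone advantage with 20% and 30% divided out.
for over in (0.20, 0.30):
    print(over, round(discount_inlining(2.0, over), 2))  # 1.67, then 1.54
```

The geometric mean is used here because the quiz asks for it; with ratios that straddle 1.0, as in the placeholder list, it can land well below the largest single ratio.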
aglew@oberon.csg.uiuc.edu (Andy Glew) (02/13/90)
>Something else that's interesting. In the February 7th Microprocessor
>Report, page 4, under new SPEC numbers, a Moto system with a 33 Mhz 88K came
>up with a 17.8 SPECmark. Congratulations. However, the article goes on to
>note that the 88K SPECmark was only 1% over the SPARC's 17.6 SPECmark (as
>well as the MIPS). I would be most interested to see SPECmarks for the
>Heurikon board (or any other system) running the 040 at 25 Mhz or even 33
>Mhz.

I was waiting for some poster to notice this. I felt honour-bound to
type in the first SPECmark report, even though I knew that the 88K
results for early, untuned, compilers and OS were being compared with
more mature products. Since then I have left Motorola and no longer
receive the SPEC reports. Would somebody care to type in the latest
report (somebody from MIPS, perhaps? :-)

Prediction: MIPS and the 88K will keep swapping places (the same way
the 80x86 and 68k families do). They are basically very similar chips,
with minor differences that are important to specific applications
(not true for 80x86 vs. 68K!). I know that there are some blockbusters
in the 88K camp coming, but I'm sure that the same goes for MIPS.

The real differences in ranking will come from systems level issues:
how good a cache you have, how good your memory bus interface is, how
good your compilers are.

--
Andy Glew, aglew@uiuc.edu
alan@oz.nm.paradyne.com (Alan Lovejoy) (02/13/90)
In article <AGLEW.90Feb12170628@oberon.csg.uiuc.edu> aglew@oberon.csg.uiuc.edu (Andy Glew) writes:
>Would somebody care to type
>in the latest [SPECmark] report (somebody from MIPS, perhaps? :-)

Well, I'm not from MIPS, but I typed it in and posted it this morning.
You're welcome :-).

>Prediction: MIPS and the 88K will keep swapping places (the same way
>the 80x86 and 68k families do). They are basically very similar chips,
>with minor differences that are important to specific applications

I hope this is true. That way, it won't matter so much which architecture
"wins" (except to MIPS and Moto), which was most definitely NOT the case
in the 68k vs. x86 conflict. The 88k and the Rx000 both are CPUs that I
can use without holding my nose.

____"Congress shall have the power to prohibit speech offensive to Congress"____
Alan Lovejoy; alan@pdn; 813-530-2211; AT&T Paradyne: 8550 Ulmerton, Largo, FL.
Disclaimer: I do not speak for AT&T Paradyne. They do not speak for me.
Mottos: << Many are cold, but few are frozen. >> << Frigido, ergo sum. >>
phil@aimt.UU.NET (Phil Gustafson) (02/21/90)
In article <43279@ames.arc.nasa.gov>, lamaster@ames.arc.nasa.gov (Hugh LaMaster) writes:
> Reminds me of about 10 years ago, when I wrote some programs to test
> branch speeds. I had to add some bogus assignments and outputs which were
> never executed, but might have been, in order to get the CDC and Cray
> compilers of the time to create the loop. And that was *ten years ago*.
> If someone ever puts AI into a compiler, we might as well give up on
> benchmarking :-)
>
> Hugh LaMaster, m/s 233-9, UUCP ames!lamaster

Yes. I was responsible for helping make a [nameless] commercial benchmark
optimizer-resistant. Each test required a bogus assignment and output.

One very simple but effective technique involved reading _all_ the constants
used by the program from an external file to make sure the compilers didn't
do the arithmetic once at compilation time and never do it again.

The constant values were listed in comments in the source code -- I figured
that if the compiler could optimize using the comments it deserved to win :-} .

The same trick might well help with the famous Dhrystone strcmp problem.

[Aside-- I keep hearing apocryphal stories about compilers that looked for
such strings as "dongarra" in source code and acted accordingly. I'd
appreciate mail from anyone who knows of a real compiler or preprocessor
that did this.]

--
Opinions outside attributed quotations are mine alone. Satirical material may
not be labeled as such. I don't work at this site anymore -- they just let me
read their news.
--
Phil Gustafson, Graphics/UN*X Consultant
{uunet,ames!coherent}!aimt!phil phil@aimt.uu.net
1550 Martin Ave, San Jose, Ca 95126
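Editorial note: the external-constants technique described above can be sketched in a few lines. This is a minimal Python illustration under stated assumptions, not the commercial benchmark's code: the "name value" file format and the names `load_constants` and `kernel` are all hypothetical. Because the values only arrive at run time, nothing can precompute the arithmetic ahead of time, and printing the result stands in for the "bogus output" that keeps the computation observably live.

```python
import os
import tempfile


def load_constants(path):
    """Read one 'name value' pair per line; values are unknown until run time."""
    constants = {}
    with open(path) as handle:
        for line in handle:
            name, value = line.split()
            constants[name] = float(value)
    return constants


def kernel(constants, iterations):
    """Toy benchmark loop whose arithmetic depends on the run-time constants."""
    total = 0.0
    for _ in range(iterations):
        total += constants["scale"] * constants["offset"]
    return total


# Write a hypothetical constants file, then run the kernel against it.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("scale 3.0\noffset 0.5\n")
    constants_path = tmp.name

result = kernel(load_constants(constants_path), 1000)
os.remove(constants_path)

# Using the result in output keeps the loop from being treated as dead code.
print(result)  # 1500.0
```

The same idea would apply to string operands: strings read from a file have lengths and contents the compiler cannot know, so a fixed-length unrolled copy is not available for them.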