murphy@eric.mpr.ca (Gail Murphy) (03/28/90)
Recently, Kim Rochat posted an article about the performance of Eiffel
vs. C++ vs. Smalltalk for the Mosaic program.  Included was source code
for the benchmark.  Being curious, I tried this benchmark on:

+ a VAXstation 3100 running Ultrix 3.0 with Eiffel V2.2 (Level A)

+ a VAXstation 3100 running Ultrix 3.0 and g++ (version 1.36.2, based
  on GCC 1.36)

The Eiffel-generated C-package was also moved to the following platforms:

+ an Apollo 3500 running SR10.2

+ a Sun-3/60 running SunOS 4.0.1

On both of these platforms the package was compiled with the respective
cc compiler.  The results were:

Host      Language  Compiler   Execution Time (secs)   Executable Size
                                User   System   Wall    Text   Data   BSS
----      --------  --------   -----   ------   ----    ----   ----   ---
VAX 3100  Eiffel    gcc        154.2    176.4   5:55   64512   6144  5028
VAX 3100  C++       g++         49.1      5.7   1:00   29696   3072  1868
Ap 3500   Eiffel    Apollo cc  137.1      4.5   2:27   68060  11204  2552
Sun-3/50  Eiffel    Sun cc     163.9    273.7   7:21   90112  16384     0

These results are significantly different from the results posted
previously, where the authors found:

> While bearing in mind that mosaic is a small program exercising a
> limited subset of the languages, two major conclusions can be drawn.
> First, any perception that C++ has superior performance to Eiffel may
> be invalid.  Second, if you're using Eiffel, a different C compiler may
> result in significantly increased performance.

The executions were conducted as described in the previous article
(i.e. Eiffel C-package generation was used with no assertions, the
program was run twice in succession, etc.).  Similar results were found
using the Ultrix cc compiler.  The following differences from the
previous benchmark (besides the hosts) exist:

+ the platforms have 8 Mb of memory, compared to the 12 Mb hosts in the
  previous benchmark

+ different versions of g++ and Eiffel were used.

Running prof on the Eiffel C-package output gives (the first 10 lines):

%time  cumsecs    #call  ms/call  name
 36.3   210.06  1637611     0.13  _sigblock
 12.1   279.99                    mcount
 10.8   342.59   997002     0.06  __0a8005_row_contrastcolorcanappearat
  5.8   375.97                    _c1_item
  5.4   407.15  1637611     0.02  _setjmp
  5.3   437.85     1000    30.69  __0a8000_row_create
  3.6   458.61                    _c1_put
  2.9   475.46   633603     0.03  __0a9001_random_dont_care
  2.7   490.96   633603     0.02  _random
  2.4   504.76                    _c_put

A profile of the Eiffel C-package on the Apollo, however, did not reveal
any time spent in _sigblock.  A profile of the g++ code was not possible,
as the -p and -pg options are not yet supported.

The call to sigblock (an operating system routine) seems to come from the
SETJMP2 macro.  The SETJMP2 and OUTJMP2 calls in the
row_contrastcolorcanappearat routine were commented out (I think this
disables proper exception handling).  This resulted in the following:

Host      Language  Compiler    Execution Time (secs)
                                 User   System   Wall
----      --------  --------    -----   ------   ----
VAX 3100  Eiffel    gcc         109.8     66.5   2:58
Ap 3500   Eiffel    Apollo cc   111.1      4.4   1:58
Sun-3/50  Eiffel    Sun cc      109.2    106.6   3:36

There is a notable cost, then, for performing the SETJMP2 and OUTJMP2
macros.  Is this a change from Eiffel Version 2.1?  Why else are the
results of the benchmarks remarkably different?  Is there a more
cost-efficient means of performing these actions?  Has anyone else tried
these benchmarks on the same or other platforms?  What results have you
obtained?
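For reference, here is a minimal C sketch of the kind of work a
setjmp-based exception frame costs on every routine call.  This is only
my guess at what the SETJMP2/OUTJMP2 macros amount to, not the actual
Eiffel runtime; the names (exc_frame, ENTER_FRAME, LEAVE_FRAME) are
invented for the illustration.  On BSD-derived systems the library
setjmp also saves the signal mask, which would explain _sigblock being
called exactly as often as _setjmp in the profile above.

#include <stdio.h>
#include <setjmp.h>

/* Illustration only: NOT the actual ISE Eiffel runtime, just a guess
 * at the sort of work a setjmp-based exception frame costs on every
 * routine call.  All names here are invented for the example.
 */

struct exc_frame {
    jmp_buf env;              /* saved stack/register context */
    struct exc_frame *prev;   /* link to the caller's frame    */
};

static struct exc_frame *exc_stack = NULL;

/* Entering a routine: push a frame and save the context.  This is the
 * point a rescue clause (or the default handler that prints the
 * exception trace) would longjmp back to.  With a BSD-style setjmp the
 * signal mask is saved as well, hence the _sigblock time.
 */
#define ENTER_FRAME(f)                                  \
    do {                                                \
        (f)->prev = exc_stack;                          \
        exc_stack = (f);                                \
        if (setjmp((f)->env) != 0) {                    \
            /* an exception would land back here */     \
        }                                               \
    } while (0)

/* Leaving the routine normally: pop the frame again. */
#define LEAVE_FRAME(f)  (exc_stack = (f)->prev)

/* A routine called ~1,000,000 times, as row_contrastcolorcanappearat
 * is in mosaic, pays the setjmp (and sigblock) cost on every call.
 */
static int can_appear_at(int row, int col)
{
    struct exc_frame f;
    int result;

    ENTER_FRAME(&f);
    result = (row + col) % 2;     /* stand-in for the real body */
    LEAVE_FRAME(&f);
    return result;
}

int main(void)
{
    long i, hits = 0;

    for (i = 0; i < 1000000; i++)
        hits += can_appear_at((int)(i % 32), (int)(i % 17));
    printf("%ld\n", hits);
    return 0;
}

If the generated code looks anything like this, every one of the
~1,000,000 calls to row_contrastcolorcanappearat pays that overhead
whether or not an exception ever occurs.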
Gail Murphy                      | murphy@joplin.mpr.ca
Microtel Pacific Research        | joplin.mpr.ca!murphy@uunet.uu.net
8999 Nelson Way, Burnaby, BC     | murphy%joplin.mpr.ca@relay.ubc.ca
Canada, V5A 4B5, (604) 293-5462  | ...!ubc-vision!joplin.mpr.ca!murphy
nosmo@eiffel.UUCP (Vince Kraemer) (03/29/90)
>In article <21110@kiwi.mpr.ca>, Gail Murphy (murphy@eric.mpr.ca) writes:
>[Some intro comments and some really embarrassing statistics deleted]
>
>These results are significantly different from the results posted
>previously, where the authors found:
>
>[Quote from Kim Rochat's article deleted]
>
>The executions were conducted as described in the previous article
>(i.e. Eiffel C-package generation was used with no assertions, the
>program was run twice in succession, etc.).  Similar results were found
>using the Ultrix cc compiler.  The following differences from the
>previous benchmark (besides the hosts) exist:
>
>+ the platforms have 8 Mb of memory, compared to the 12 Mb hosts in the
>  previous benchmark
>
>+ different versions of g++ and Eiffel were used.
>
>Running prof on the Eiffel C-package output gives (the first 10 lines):
>
>%time  cumsecs    #call  ms/call  name
> 36.3   210.06  1637611     0.13  _sigblock
> 12.1   279.99                    mcount
> 10.8   342.59   997002     0.06  __0a8005_row_contrastcolorcanappearat
>  5.8   375.97                    _c1_item
>  5.4   407.15  1637611     0.02  _setjmp
>  5.3   437.85     1000    30.69  __0a8000_row_create
>  3.6   458.61                    _c1_put
>  2.9   475.46   633603     0.03  __0a9001_random_dont_care
>  2.7   490.96   633603     0.02  _random
>  2.4   504.76                    _c_put

From the evidence that I see here, I think that assertion checking was
not turned completely off.  The way to do this, which is a bit obscure
in the documentation, is to set up the SDF as follows:

NO_ASSERTION_CHECK (Y): ALL
PRECONDITIONS (N): ALL
ALL_ASSERTIONS (N): ALL
C_PACKAGE (Y): {some dir name}

This is the only explanation for the presence of the _sigblock and
_setjmp calls.

>A profile of the Eiffel C-package on the Apollo, however, did not reveal
>any time spent in _sigblock.  A profile of the g++ code was not possible,
>as the -p and -pg options are not yet supported.
>
>The call to sigblock (an operating system routine) seems to come from the
>SETJMP2 macro.  The SETJMP2 and OUTJMP2 calls in the
>row_contrastcolorcanappearat routine were commented out (I think this
>disables proper exception handling).

This disables the "default" exception handling behavior: printing out an
exception stack.  Rescues will still be taken care of, if present, even
with all assertions off.

(It should be noted here that the use of a rescue inside a loop will also
produce many calls to setjmp.  Therefore, "don't use rescues inside inner
loops" is a good rule to remember when doing Eiffel programming "for
speed".)

>This resulted in the following:
>
>Host      Language  Compiler    Execution Time (secs)
>                                 User   System   Wall
>----      --------  --------    -----   ------   ----
>VAX 3100  Eiffel    gcc         109.8     66.5   2:58
>Ap 3500   Eiffel    Apollo cc   111.1      4.4   1:58
>Sun-3/50  Eiffel    Sun cc      109.2    106.6   3:36
>
>There is a notable cost, then, for performing the SETJMP2 and OUTJMP2
>macros.  Is this a change from Eiffel Version 2.1?  Why else are the
>results of the benchmarks remarkably different?  Is there a more
>cost-efficient means of performing these actions?

The above I'll take in order:

1. There has been very little change in the implementation of assertion
   handling since Eiffel 2.1.

2. I'm not too sure why these benchmark results are so different.  One
   possible reason: was the SDF OPTIMIZE line set to true for all
   classes?  What was the performance on the VAX 3100 using the Ultrix C
   compiler?  I know that the gcc optimizer is a real winner: if the gcc
   and Ultrix cc times are comparable, it would indicate that C compiler
   optimization was not being done.  I have a gut feeling that the time
   difference between the VAX 3100/gcc combo and the
   Sun-3/50/Sun cc combo in this second set of trials should be larger,
   if gcc is really optimizing.

   Another cost due to preconditions being left on is the use of a
   routine to access row members, instead of more direct access through
   a macro expansion.  This is still being done here (see the note above
   about turning off all assertions), adding about 2,000,000 to the
   number of function calls.  (A rough C sketch of the two access styles
   follows at the end of this posting.)

3. There are ways of implementing exception handling with a better
   cost/benefit ratio.  The problem is that they aren't very portable,
   and our goal was portability for the systems produced.

Also worth noting: we see assertions as a development tool.  They are to
be stripped out (via recompilation) at delivery time.

I hope this has helped shed some light on the subject and not cloud it.
I too would be interested in finding out about the results of this
benchmark on other platforms -- especially 386 boxes using the
implementations of C++ available for them.

Vince Kraemer
ISE Jack-of-all-Trades
business related reply-to:        eiffel@eiffel.com
personal correspondence reply-to: nosmo@eiffel.com
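The sketch mentioned above: with assertions fully off, an item access
can be expanded inline as a macro, while with preconditions left on
every access goes through a routine that checks the index first.  The
names here (struct row, ROW_ITEM, row_item) are made up for the example
and are not our actual generated code.

#include <stdio.h>
#include <stdlib.h>

/* Made-up sketch, not actual generated code, of the two access styles. */

struct row {
    int  count;
    int *items;
};

/* Direct access: what can be emitted when preconditions are off. */
#define ROW_ITEM(r, i)  ((r)->items[(i) - 1])

/* Checked access: one real function call per element touched. */
static int row_item(const struct row *r, int i)
{
    if (i < 1 || i > r->count) {          /* the precondition */
        fprintf(stderr, "precondition violated: index %d\n", i);
        exit(1);
    }
    return r->items[i - 1];
}

int main(void)
{
    int  storage[16];
    long n, sum_macro = 0, sum_call = 0;
    struct row r;
    int  i;

    r.count = 16;
    r.items = storage;
    for (i = 0; i < 16; i++)
        storage[i] = i;

    /* Roughly the 2,000,000 extra calls mentioned above. */
    for (n = 0; n < 2000000; n++) {
        sum_macro += ROW_ITEM(&r, (int)(n % 16) + 1);   /* inlined   */
        sum_call  += row_item(&r, (int)(n % 16) + 1);   /* real call */
    }
    printf("%ld %ld\n", sum_macro, sum_call);
    return 0;
}

With a decent C optimizer the macro form disappears into an indexed
load, while the routine form remains a call plus a comparison -- which
is roughly where the extra two million calls and their time come from.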
jimad@microsoft.UUCP (Jim ADCOCK) (04/10/90)
In article <KIM.90Apr2202123@helios.enea.se> kim@helios.enea.se (Kim Waldén) writes:
>
>The only reason to turn off all assertion checking when running benchmarks
>against C++ should be to get more comparable figures.
>
>As we all know, there is no such thing as a thoroughly tested software
>system, and the resulting safety is well worth the extra cpu cycles.

I believe there is a contradiction in these two statements.  Timing
comparisons should be representative of the final executable code as
delivered to customers.  If Eiffel code is to be delivered with
assertions in the code, and C++ code is not [typically] delivered with
assertions in the code, then the relative timings between the languages
should reflect this fact.

If the cost of the assertions in Eiffel is worthwhile, then you should
be able to argue successfully for these costs, even if they make Eiffel
code bigger and slower.  Speed and size are only two [but important]
measures of a language.  Typically one pays for the features of a
language.  Timings and code sizes let users decide whether they consider
the costs of the features good value for their purposes.

Alternately, show sizes and timings for code both with and without
assertion checking.  Then users can decide if they agree with your
assessment that runtime checking is worth the cost.

Don't "cook" comparisons to make them unrepresentative of how people
actually use languages.  Comparisons should be representative.
sakkinen@tukki.jyu.fi (Markku Sakkinen) (04/10/90)
In article <54007@microsoft.UUCP> jimad@microsoft.UUCP (Jim ADCOCK) writes:
>In article <KIM.90Apr2202123@helios.enea.se> kim@helios.enea.se (Kim Waldén) writes:
>>
>>The only reason to turn off all assertion checking when running benchmarks
>>against C++ should be to get more comparable figures.
>>
>>As we all know, there is no such thing as a thoroughly tested software
>>system, and the resulting safety is well worth the extra cpu cycles.
>
>I believe there is a contradiction in these two statements.  Timing
>comparisons should be representative of the final executable code as
>delivered to customers.  If Eiffel code is to be delivered with
>assertions in the code, and C++ code is not [typically] delivered with
>assertions in the code, then the relative timings between the languages
>should reflect this fact.

I disagree, i.e. agree with the original poster.  See below.

> ...
>Alternately, show sizes and timings for code both with and without
>assertion checking.  Then users can decide if they agree with your
>assessment that runtime checking is worth the cost.

Yes, _this_ is a sensible suggestion.

>Don't "cook" comparisons to make them unrepresentative of how people
>actually use languages.  Comparisons should be representative.

No, comparisons should above all be as comparable as possible, i.e. no
apples and oranges.  Of course, the less similar the languages, the
harder it is to design a meaningful and fair comparison.  Eiffel vs. C++
is a lot easier than Prolog vs. RPG.

Markku Sakkinen
Department of Computer Science
University of Jyvaskyla (a's with umlauts)
Seminaarinkatu 15
SF-40100 Jyvaskyla (umlauts again)
Finland
SAKKINEN@FINJYU.bitnet (alternative network address)
jimad@microsoft.UUCP (Jim ADCOCK) (04/13/90)
In article <4083@tukki.jyu.fi> sakkinen@jytko.jyu.fi (Markku Sakkinen) writes:
>No, comparisons should above all be as comparable as possible, i.e. no
>apples and oranges.  Of course, the less similar the languages, the
>harder it is to design a meaningful and fair comparison.  Eiffel vs. C++
>is a lot easier than Prolog vs. RPG.

I don't consider it apples and oranges to compare two languages as they
are actually used.  To compare "apples to apples", is one to gin up a
hashed dispatcher in C++ to slow it down enough to compare to Smalltalk
dispatching?  That makes C++ more like Smalltalk -- does it make the
comparison more fair?  Should one add bounds checking on C++ arrays to
make it more comparable to some Pascal compilers?  When comparing Pascal
code to C code, does one write the C code in a Pascal-like coding style?

If you keep playing these kinds of games to make the languages look more
and more similar, you end up with two identical sets of features -- only
the syntax is different.  Then you are not comparing the languages nor
the compilers, but only the two back-end code generators.  Which seems
pretty silly -- especially if you're comparing two languages/compilers
that both use the same C compiler as the back-end code generator!
Surprise: given two half-reasonable front ends implementing exactly the
same set of features, you get exactly the same code out of the C
back-end compiler.

The choices made in a language -- safety vs. speed vs. flexibility, etc.
-- are reflected in the size and speed of the resulting code.  If one
believes one's language has made the right choices, then one should be
willing to live by the results.  I think users of Smalltalk [rightly or
wrongly] would be willing to claim their language is worth the speed hit.
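To be concrete about the difference, here is a toy C sketch -- invented
names, not taken from any real compiler or runtime -- of a vtable-style
call through a slot fixed at compile time versus a lookup that hashes a
selector name at run time, the latter being closer to what a
Smalltalk-style dispatcher has to do on every send.

#include <stdio.h>
#include <string.h>

/* Toy example with invented names -- not from any real compiler. */

typedef int (*method_t)(int);

static int square(int x) { return x * x; }
static int twice(int x)  { return 2 * x; }

/* vtable-style dispatch: the slot index is known at compile time,
 * so a virtual call is one load plus one indirect call. */
static method_t vtable[] = { square, twice };
#define CALL_SLOT(vt, slot, arg)  ((vt)[slot])(arg)

/* Hashed dispatch: the method is found by selector name at run time. */
struct entry { const char *selector; method_t method; };

#define TABLE_SIZE 8
static struct entry table[TABLE_SIZE] = {
    { "square", square },
    { "twice",  twice  },
};

static method_t lookup(const char *selector)
{
    unsigned h = 0;
    const char *p;
    int i;

    for (p = selector; *p; p++)           /* hash the selector name */
        h = h * 31 + (unsigned char)*p;
    for (i = 0; i < TABLE_SIZE; i++) {    /* probe the whole table  */
        struct entry *e = &table[(h + i) % TABLE_SIZE];
        if (e->selector && strcmp(e->selector, selector) == 0)
            return e->method;
    }
    return NULL;                          /* "message not understood" */
}

int main(void)
{
    printf("%d\n", CALL_SLOT(vtable, 0, 7));   /* 49: one indirect call    */
    printf("%d\n", lookup("twice")(7));        /* 14: hash, probe, compare */
    return 0;
}

The vtable form costs an indirect call; the hashed form costs a hash, a
probe and a string compare on every send.  Forcing the former to pay the
latter's price tells you nothing useful about either language.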
bruce@menkar.gsfc.nasa.gov (Bruce Mount) (04/13/90)
In article <54060@microsoft.UUCP> jimad@microsoft.UUCP (Jim ADCOCK) writes:
>[Stuff deleted]
>I don't consider it apples and oranges to compare two languages as they
>are actually used.  To compare "apples to apples", is one to gin up a
>hashed dispatcher in C++ to slow it down enough to compare to Smalltalk
>dispatching?

Yes it does, if hashed dispatching is part of your application.  The
point of a benchmark is to compare different platforms (languages or
hardware), not different implementations.  It is up to the reader to
determine whether the benchmark applies to their application.

The Mosaic benchmark compares a few specific features -- this was freely
admitted by the author -- not the whole range of OO activity.  I agree
with the poster who stated that a more comprehensive benchmark (testing
dynamic binding, etc.) would be invaluable; however, this does not
diminish the value of the Mosaic benchmark.

Rather than griping about the Mosaic benchmark, why don't we all think
of a more comprehensive one (or several)?

--Bruce

=================================================
|  Bruce Mount          "Brevity is best"       |
|  bruce@atria.gsfc.nasa.gov                    |
=================================================
bruce@menkar.gsfc.nasa.gov (Bruce Mount) (04/13/90)
Please forgive the multiple spelling mistakes in my last posting.  I was
typing *WAY* too fast.

=================================================
|  Bruce Mount          "Brevity is best"       |
|  bruce@atria.gsfc.nasa.gov                    |
=================================================