[comp.arch] SUN procedure inlining - Dhrystone numbers

schmitz@fas.ri.cmu.edu (Donald Schmitz) (04/05/89)

I wrote:
>I could easily write a strcmp and strcpy .il file and compile the source 
>(Dhrystone) with them.

For the fun of it, I did this.  I actually hand-coded strcpy() and strcmp()
in assembly myself, as I couldn't find SUN's sources for them; I'm assuming
SUN did at least as good a job as I did in a half hour's effort.  As someone
else mentioned, the compiler does a better job of inlining than I
remembered: it eliminates most of the argument-passing overhead as well as
the subroutine startup overhead.  Also, I used only the scratch registers
a0, a1, d0 and d1, eliminating the need to save and restore registers.  If
there is any interest, I can post the inline sources.  Following are the
numbers I got on a SUN3/60, /160, and /260.  The benchmark is Dhrystone
version 2.0, with sources taken from the net a while back (I guess 2.1 is
out, but the numbers should still mean something in a relative sense).  I
used 100000 iterations, and the results listed are the best of 3 runs.  All
runs were done with the -O switch; the results include runs with and
without register declarations.
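
Before the numbers, a rough picture of what the two inlined routines
compute, as a minimal C sketch.  This is not the assembly used for the runs
(that is not shown here), and the names sketch_strcpy and sketch_strcmp are
made up for illustration.  Each loop keeps only a couple of pointers and a
byte or two live at a time, which is presumably why an assembly version can
stay within the scratch registers a0, a1, d0 and d1.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the inlined routines; NOT the assembly
 * that produced the numbers in this post. */

/* Copy src, including its terminating NUL, into dst; return dst. */
char *sketch_strcpy(char *dst, const char *src)
{
    char *d = dst;      /* running destination pointer */

    assert(dst != NULL && src != NULL);
    while ((*d++ = *src++) != '\0')
        ;               /* loop ends once the NUL has been stored */
    return dst;
}

/* Return <0, 0 or >0 as s1 sorts before, equal to, or after s2. */
int sketch_strcmp(const char *s1, const char *s2)
{
    assert(s1 != NULL && s2 != NULL);
    while (*s1 != '\0' && *s1 == *s2) {
        s1++;
        s2++;
    }
    /* Compare the first differing bytes as unsigned char. */
    return (int)(unsigned char)*s1 - (int)(unsigned char)*s2;
}
```

Neither routine calls anything else, so once the body is expanded at the
call site there is no frame to set up and nothing to save or restore.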

CC options                  3/60        3/160       3/260

without regs, no inline     3338        2947        5882
without regs, inline on     3462        3061        6090
with regs, no inline        3722        3264        6472
with regs, inline on        3934        3391        6696

Bottom line: there is a speedup of roughly 4% (about 3.5% to 5.7%) in all
cases using the inline mode.  This probably only matters if you are a
real-time hacker (me), or write advertising copy, but it is a nice feature
to have.  All that is missing (anyone at SUN listening?) is a compiler
switch to generate .il files directly from .c files.
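
The percentage can be checked from the table with a few lines of C.  This
is just a sketch of the arithmetic; the function name speedup_percent is
made up for illustration.

```c
#include <assert.h>

/* Percentage by which the inlined score exceeds the non-inlined one.
 * Both arguments are Dhrystone scores taken from the table above. */
double speedup_percent(double baseline, double inlined)
{
    assert(baseline > 0.0);
    return 100.0 * (inlined - baseline) / baseline;
}
```

For example, speedup_percent(3338.0, 3462.0) covers the 3/60 without
register declarations.  Over the whole table this works out to about 3.7%,
3.9% and 3.5% without register declarations and about 5.7%, 3.9% and 3.5%
with them (3/60, 3/160, 3/260 respectively).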

Don Schmitz
--