[comp.windows.x] A collection of Xstones for a B/W Sun 3/60

prl@iis.UUCP (Peter Lamb) (06/02/89)

Shortly after Claus Gittinger published his Xbench program and posted
a collection of xstones ratings, he was criticised for not providing
enough information about the compiler/optimisation flags for the servers
he benchmarked. In order to fill in the gap a little, here are some timings
done on a Sun 3/60M, diskless, 8 MB memory, server and client on the same
machine.

The server was compiled with SunOS4.0 cc -O, except for cfb and mfb,
which were compiled with the options shown with each set of results.
There are benchmarks for both the vanilla MIT X11R3 (patchlevel 9)
server and the same server with the Purdue2 B/W speedups applied.

They are sorted on increasing xstones.

The timings are the best of 3 runs for each benchmarked operation (the
Xbench default), with a timegoal of 10 sec.
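
As a rough illustration of what "best of 3 with a timegoal" means, here
is a minimal Python sketch. It assumes that the timegoal is the minimum
time each run is kept going for; the names rate_once and best_of are
made up for this example and are not taken from Xbench itself.

```python
import time


def rate_once(op, timegoal, clock=time.perf_counter):
    """Call op() repeatedly until `timegoal` seconds have passed.

    Returns the achieved rate in operations per second.
    """
    start = clock()
    count = 0
    while True:
        op()
        count += 1
        elapsed = clock() - start
        if elapsed >= timegoal:
            return count / elapsed


def best_of(op, runs=3, timegoal=10.0, clock=time.perf_counter):
    """Repeat the timed run `runs` times and keep the fastest rate."""
    return max(rate_once(op, timegoal, clock) for _ in range(runs))
```

Taking the best run rather than the mean is a common way to discount
interference from other activity on the machine, which matters when the
server and client share one CPU as they do here.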

SunOS4.0 cc -O; Vanilla MIT
TOTAL     17849 lineStones
TOTAL     14780 fillStones
TOTAL     15850 blitStones
TOTAL     27915 arcStones
TOTAL     14917 textStones
TOTAL     16013 complexStones
TOTAL     15971 xStones

SunOS4.0 cc -O; Purdue2 speedups
TOTAL     18841 lineStones
TOTAL     13950 fillStones
TOTAL     22691 blitStones
TOTAL     27983 arcStones
TOTAL     15391 textStones
TOTAL     16143 complexStones
TOTAL     17037 xStones

SunOS4.0 cc -O4; Purdue2 speedups
TOTAL     21984 lineStones
TOTAL     16575 fillStones
TOTAL     19704 blitStones
TOTAL     28059 arcStones
TOTAL     21727 textStones
TOTAL     17124 complexStones
TOTAL     19872 xStones

GNU gcc -O -fpcc-struct-return -fstrength-reduce; Vanilla MIT
TOTAL     21483 lineStones
TOTAL     19146 fillStones
TOTAL     23191 blitStones
TOTAL     28111 arcStones
TOTAL     19457 textStones
TOTAL     17464 complexStones
TOTAL     20349 xStones

GNU gcc -O -fpcc-struct-return -fstrength-reduce; Purdue2 speedups
TOTAL     23036 lineStones
TOTAL     17692 fillStones
TOTAL     25457 blitStones
TOTAL     28039 arcStones
TOTAL     20106 textStones
TOTAL     17588 complexStones
TOTAL     20785 xStones

GNU gcc -O -fpcc-struct-return -fstrength-reduce; Purdue2 speedups, no asm()'s
TOTAL     23022 lineStones
TOTAL     17720 fillStones
TOTAL     24938 blitStones
TOTAL     28085 arcStones
TOTAL     20863 textStones
TOTAL     17588 complexStones
TOTAL     20955 xStones

Interestingly, there is almost no difference between the Purdue2 speedups
using the assembly language hacks (using the 68020 bfins and bfext instructions
for inserting bit fields) and the performance without them.
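
For readers who have not met these instructions: the 68020 bit-field
instructions (the "bfext" above presumably stands for the extract
instructions BFEXTU/BFEXTS) read or write a run of bits at a given
offset, with offset 0 being the most significant bit. The sketch below
models the unsigned extract and the insert on a single 32-bit word with
ordinary shifts and masks, which is roughly what the version without
asm()'s has to do. The function names are hypothetical, and fields that
cross a word boundary are deliberately not handled.

```python
WORD_BITS = 32
WORD_MASK = (1 << WORD_BITS) - 1


def _field(offset, width):
    """Return (mask, shift) for a field; offset 0 is the MSB, as on the 68020."""
    if width < 1 or offset < 0 or offset + width > WORD_BITS:
        raise ValueError("field must lie inside one 32-bit word")
    shift = WORD_BITS - offset - width
    return ((1 << width) - 1) << shift, shift


def bf_extract(word, offset, width):
    """Unsigned extract: the field's bits, right-justified."""
    mask, shift = _field(offset, width)
    return (word & mask) >> shift


def bf_insert(word, offset, width, value):
    """Replace the field with the low `width` bits of value."""
    mask, shift = _field(offset, width)
    return (word & ~mask & WORD_MASK) | ((value << shift) & mask)
```

For example, bf_insert(0, 4, 8, 0xAB) places 0xAB in bits 27..20 of an
otherwise empty word, and bf_extract on the result with the same offset
and width returns 0xAB again.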

Not surprisingly, the biggest gain from Purdue2 is if you have to use
`cc -O' (about 7% in xStones, against about 2% with gcc); here the fact
that the Purdue2 speedups make more sensible use of register variables
than the sample server gives you a big advantage.
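
The relative gain can be checked directly from the xStones totals. The
following Python sketch uses only figures from the tables in this
posting; "gcc -O" is shorthand for the full gcc flag set, and the
Purdue2 figure for gcc is the one with the asm()'s enabled.

```python
# xStones totals copied from the result tables in this posting.
XSTONES = {
    ("cc -O", "Vanilla MIT"): 15971,
    ("cc -O", "Purdue2"): 17037,
    ("gcc -O", "Vanilla MIT"): 20349,
    ("gcc -O", "Purdue2"): 20785,
}


def purdue2_gain_percent(compiler):
    """Percentage xStones improvement of Purdue2 over the vanilla server."""
    vanilla = XSTONES[(compiler, "Vanilla MIT")]
    purdue2 = XSTONES[(compiler, "Purdue2")]
    return 100.0 * (purdue2 - vanilla) / vanilla


for compiler in ("cc -O", "gcc -O"):
    gain = purdue2_gain_percent(compiler)
    print(f"{compiler:7s} Purdue2 gain: {gain:.1f}%")
```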

One surprise is that Purdue2 seems to *slow down* fill: looking
at the detailed results, Purdue2 is about the same or slightly better
in most of the area fill tests, except plain fill and, at the larger
sizes, invert.

size                           10      100      400

Purdue2
filled rectangles         7892.49  1642.91   252.12   rectangles/sec
tiled rectangles          5482.12   734.05   103.33   rectangles/sec
stippled rectangles       2050.97   231.11    53.82   rectangles/sec
invert rectangles         5936.20  1403.98   192.54   rectangles/sec

Vanilla MIT
filled rectangles         8170.75  2031.36   276.66   rectangles/sec
tiled rectangles          5503.75   736.69   103.43   rectangles/sec
stippled rectangles       2019.77   231.11    53.86   rectangles/sec
filled polygon                      143.53            fills/sec
invert rectangles         5870.02  1661.68   207.27   rectangles/sec
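
To make the comparison easier to read, the two sets of detailed results
can be reduced to Purdue2/vanilla ratios, where a value below 1.0 means
Purdue2 is slower. This is a small Python sketch using only the numbers
above; the filled polygon line is left out because there is no Purdue2
figure to compare it with.

```python
SIZES = (10, 100, 400)

# Rates (rectangles/sec) copied from the detailed results above.
PURDUE2 = {
    "filled rectangles": (7892.49, 1642.91, 252.12),
    "tiled rectangles": (5482.12, 734.05, 103.33),
    "stippled rectangles": (2050.97, 231.11, 53.82),
    "invert rectangles": (5936.20, 1403.98, 192.54),
}
VANILLA = {
    "filled rectangles": (8170.75, 2031.36, 276.66),
    "tiled rectangles": (5503.75, 736.69, 103.43),
    "stippled rectangles": (2019.77, 231.11, 53.86),
    "invert rectangles": (5870.02, 1661.68, 207.27),
}


def ratios(test):
    """Purdue2 rate divided by vanilla rate, one value per size."""
    return [p / v for p, v in zip(PURDUE2[test], VANILLA[test])]


print(f"{'test':20s}" + "".join(f"{size:>8d}" for size in SIZES))
for test in PURDUE2:
    print(f"{test:20s}" + "".join(f"{r:8.2f}" for r in ratios(test)))
```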

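I suspect that this is due to the use of Duff's device causing
Icache misses: the 68020's on-chip instruction cache is only 256 bytes,
so an unrolled loop body can easily be too big for it. The result will
probably be different on machines with different cache sizes and/or
replacement strategies.
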
-- 
Peter Lamb
uucp:  uunet!mcvax!ethz!prl	eunet: prl@ethz.uucp	Tel:   +411 256 5241
Integrated Systems Laboratory
ETH-Zentrum, 8092 Zurich