[comp.sys.amiga.tech] "nsieve" performance on Amy

kim@amdahl.uts.amdahl.com (Kim DeVaughn) (08/22/88)
[ A free line is worth what you paid for it! ]

Whilst taking a break from the ever present "Computer Wars" in comp.sys.amiga,
I happened across the attached article in comp.arch (where "Computer Wars"
are at least sprinkled with lots of tech-speak ...)

As may be, this looked interesting enough to repost here, and just maybe
provoke some discussion of compiler efficiency, whether a (scaled) Turbo-
Amiga is really up to taking on a Sun-3/280, or a VAX 8600, etc.  :-)

Al doesn't say whether the 020's cache was enabled or disabled, nor does he state
that the Turbo-Amiga was using 32-bit memory (though it clearly must have been).
I did email Al to ask though, and also to request a copy of his "nsieve" code.
Be interesting to see what Lattice vs. Manx does on my 010 machine.  I'll
post the results if there is interest, when & if I get the code.  And I'll be
sure to send the code off to the moderators at purdue, too ...

One observation.  For the most part, the results for all the machine/compiler
configs tested show run time roughly doubling (i.e. performance dropping by
~50%) for each doubling of the array size.  Guess that indicates that mostly
what is being tested is the operand EAG (effective address generation) of the
processor, and how well the compiler utilizes it.  Anyone want to run this
on a "segmented" architecture processor?

And now ... back to your favorite "bashing" topic ...

/kim


vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv

From: aburto@marlin.NOSC.MIL (Alfred A. Aburto)
Newsgroups: comp.arch
Subject: nsieve benchmark
Date: 18 Aug 88 19:08:59 GMT
Organization: Naval Ocean Systems Center, San Diego


In the June BYTE benchmark article 'Problems and Pitfalls' I showed some
Sieve benchmark results for various array sizes.  The sieve program I used
wasn't Gilbreath's original program but a version modified for register
long variables and large arrays (via 'malloc()').  I also put in a loop
counter within the outer ('iter') loop to hopefully prevent optimizing
compilers from deleting this loop altogether.  I called this version
'nsieve'.  I would be happy to post nsieve.c if anyone is interested.  It
includes timing routines for UNIX (I tested it on the SUN and VAX), Amiga,
and the IBM PC systems using TURBO C.  My mail address:
aburto@marlin.nosc.mil.UUCP  .
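
A minimal sketch of what such a modified sieve might look like, assuming the
classic BYTE sieve loop with 'register long' variables, a 'malloc()'ed flags
array, and a counter bumped inside the outer 'iter' loop.  This is NOT Al's
nsieve.c (which was not posted here); the names sieve_pass and nsieve_run are
hypothetical, and whether "array size" means the flags array length or the
loop bound is an assumption.

```c
#include <assert.h>
#include <stdlib.h>

/* One sieve pass in the style of the BYTE sieve: flags[i] stands for the
 * odd number 2*i + 3.  Returns the number of primes found, or -1 if the
 * flags array could not be allocated. */
long sieve_pass(long size)
{
    register long i, k, prime, count = 0;
    char *flags;

    assert(size >= 0);
    flags = malloc((size_t)size + 1);   /* large arrays come from the heap */
    if (flags == NULL)
        return -1;

    for (i = 0; i <= size; i++)
        flags[i] = 1;

    for (i = 0; i <= size; i++) {
        if (flags[i]) {
            prime = i + i + 3;
            /* strike out the odd multiples of this prime */
            for (k = i + prime; k <= size; k += prime)
                flags[k] = 0;
            count++;
        }
    }
    free(flags);
    return count;
}

/* Outer 'iter' loop.  The counter written through loops_done is the kind of
 * side effect that makes it harder for an optimizer to drop the loop. */
long nsieve_run(long size, int iterations, long *loops_done)
{
    long count = 0;
    int iter;

    *loops_done = 0;
    for (iter = 0; iter < iterations; iter++) {
        count = sieve_pass(size);
        if (count < 0)
            return -1;
        (*loops_done)++;
    }
    return count;
}
```

Calling nsieve_run(8191, 10, &n) would correspond to the first row of the
table under these assumptions; timing code is left out of the sketch.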


 These are some NSIEVE Results for various array sizes (10 Iterations):

 Array Size   --------------------- BenchTime (sec) -----------------------
  (Bytes)     SUN 3/280      VAX 8600      Turbo-Amiga         Amiga
          (68020 @ 25 MHz)            (68020 @ 14.32 MHz) (68000 @ 7.16 MHz)
      8191       0.250         0.250       0.461 (0.264)        2.280
     10000       0.317         0.333       0.578 (0.331)        2.820
     20000       0.667         0.800       1.195 (0.684)        5.700
     40000       1.333         1.817       2.383 (1.365)       11.560
     80000       2.967         3.700       4.820 (2.761)       23.340
    160000       7.933         8.133       9.758 (5.589)       47.200
    320000      17.533        18.100      ------    ^          ------
                                                    |
                                                    |
            Scaled to 25 MHz SUN clock speed -------+

 Average Run Time (sec) relative to 10 Iterations and the 8191 array size:
      8191       0.316         0.354       0.484 (0.277)        2.350
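
 The parenthesized Turbo-Amiga figures are consistent with a plain clock-ratio
 scaling of the measured times (measured time * 14.32 / 25).  A small sketch of
 that arithmetic follows; the function name scale_to_clock is hypothetical,
 and the assumption that run time scales inversely with clock speed is the
 table's, not a general truth.

```c
#include <assert.h>

/* Estimate the run time at target_mhz from a time measured at measured_mhz,
 * assuming run time is inversely proportional to clock speed. */
double scale_to_clock(double seconds, double measured_mhz, double target_mhz)
{
    assert(measured_mhz > 0.0 && target_mhz > 0.0);
    return seconds * measured_mhz / target_mhz;
}
```

 For example, scale_to_clock(0.461, 14.32, 25.0) gives about 0.264, matching
 the first row of the table.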



 (1) SUN 3/280, 25 MHz 68020 CPU.  SUN UNIX 4.2 Release 3.4.
     compiled using 'cc -O'.

 (2) VAX 8600,  ?? MHz ????? CPU.  UNIX 4.3 BSD. compiled using 'cc -O'.

 (3) Turbo-Amiga, 14.32 MHz 68020 CPU. Compiled using Manx Aztec C V3.4B
     and 'cc +2 +L +ff' (no 'optimize' option available).

 (4) Amiga, 7.16 MHz 68000 CPU. Compiled using Manx Aztec C V3.4B and
     'cc +L +ff' (no 'optimize' option available).

 The VAX 8600 results are different from those that appeared in the
 June BYTE article because at that time I thought I could not define
 'register' variables at all, but I was in error.  'register long'
 variables seem to work fine on our VAX 8600 UNIX 4.3 C compiler.  It is
 the 'register short' types that (apparently) cause problems.

 The VAX and SUN run times show a slight non-linear trend as the array
 size increases.  This is due I suspect to an increasing number of 'page
 faults' with increasing array size.  I suspect this because the Turbo-Amiga
 has no memory management while the UNIX systems do and the Turbo-Amiga run
 times are very linear (almost) with respect to increasing array size.
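
 One way to put a number on "linear" is to compare seconds per array byte at
 a large size against seconds per byte at a small size; a value of 1.0 means
 perfectly linear growth.  The sketch below does only that arithmetic (the
 name per_byte_growth is hypothetical), and it says nothing about whether
 page faults are actually the cause.

```c
#include <assert.h>

/* Ratio of time-per-byte at the large array size to time-per-byte at the
 * small array size.  1.0 = run time grows exactly in proportion to size. */
double per_byte_growth(double t_small, long n_small,
                       double t_large, long n_large)
{
    assert(t_small > 0.0 && n_small > 0 && n_large > 0);
    return (t_large / (double)n_large) / (t_small / (double)n_small);
}
```

 Using the 10000 and 160000 rows of the table, this gives roughly 1.56 for the
 SUN 3/280 and 1.53 for the VAX 8600, against roughly 1.06 for the Turbo-Amiga
 and 1.05 for the stock Amiga.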

 It is very interesting to me that the Turbo-Amiga results (for small arrays)
 scaled to 25 MHz are in very close agreement to the SUN 3/280 results.
 This indicates to me that the Turbo-Amiga non-optimizing Manx Aztec C
 compiler was generating just as efficient nsieve code as the optimizing
 SUN 3/280 C compiler.  This also indicates that nsieve is probably not
 very susceptible to optimization.  I know, these results could mean other
 things, but I'm assuming the SUN 3/280 does indeed do some optimizations
 that Manx Aztec C doesn't do (Aztec C doesn't claim to do any optimizations
 at all).

 NSIEVE is, I think, a fairly reasonable test program.  The ratio of NSIEVE
 performance between the SUN 3/280 and the Turbo-Amiga for example is not
 too much different than the ratio of Dhrystone performance.  The Turbo-Amiga
 with Manx Aztec C does about 3000 Dhrystones/sec and on a SUN 3/280 I would
 expect this compiler to give me 3000 * 0.461 / 0.267 = 5180 Dhrystones/sec
 or so based upon the NSIEVE results.

 Al Aburto
 aburto@marlin.nosc.mil.UUCP



-- 
UUCP:  kim@amdahl.amdahl.com
  or:  {sun,decwrl,hplabs,pyramid,uunet,oliveb,ames}!amdahl!kim
DDD:   408-746-8462
USPS:  Amdahl Corp.  M/S 249,  1250 E. Arques Av,  Sunnyvale, CA 94086
BIX:   kdevaughn     GEnie:   K.DEVAUGHN     CIS:   76535,25