[comp.sys.amiga] Floating-Point Benchmark Results

aburto@marlin.NOSC.MIL (Alfred A. Aburto) (01/20/88)

------------

	A while back I posted some Savage Benchmark comparison results 
	here.  The Savage Results were useful but limited in that only
	(some of) the transcendental and trigonometric functions were
	tested.  I needed a floating-point add, sub, mul, and divide
	test (a simple one).  I tried the C FLOAT (and other programs)
	but they had problems with some of the optimizing compilers (e.g.,
	zero run times).  Anyway, I finally settled on a reasonable test
	program that seems to hold its own against most optimizing compilers
	(that I have tested).  The program calculates PI using the series   
	expansion of 4 * atan(1).  Pretty simple, but it gives reasonably
	accurate comparisons of system performance with double precision
	floating-point operations.  The main output is expressed in 
	thousands of double precision floating-point operations per second
	(KFLOPS).  KFLOPS is based on the average time it takes to do the 
	double precision +,-,*, and / operations.  This is a bit unrealistic
	because the '/' operation usually dominates since it takes the most
	time.  So the program also outputs a maximum KFLOPS based on the '+'
	operation.  With the 68881 and the Weitek 1167 this results in a 
	measure of the time to do an FADD.D register-to-register (about
	600 nanoseconds for the Weitek and 2.6 microseconds for the 68881 at 20 MHz).

	If there is any interest I'll post the results I have accumulated
	so far.  Most of the results are for the Amiga and Turbo-Amiga but
	there are some Sun and VAX and PC results as well.  The table of
	results is about 55 lines long.  I will also post the Fortran and
	C FLOPS programs if others would like to check them out.

	Thanks,
	Al Aburto

	aburto@marlin.nosc.mil.UUCP
	nosc!marlin!aburto       
        aburto@NOSC.MIL

aburto@marlin.NOSC.MIL (Alfred A. Aburto) (01/21/88)

----------

	The following are some double precision floating-point test results
--- mostly for the Amiga and Turbo-Amiga.  The KFLOPS results (thousands of
floating-point operations per second) are based on the average time it takes
to do a set of '+', '-', '*', and '/' operations.  As such, the results are
somewhat biased by the divide operation which normally takes the most time to
execute.  In any event, it was the relative comparisons in performance that I
wanted to examine, and I thought you might find the results interesting
as well.
	The W1167 FPU is the Weitek 1167 floating-point processor chip set
which can do a floating-point add in 600 nanoseconds (1600 KFLOPS Max).  The
divide operation for the Weitek 1167 seems to take about 3 microseconds at
20 MHz (333 KFLOPS Max).  I don't have documentation on the Weitek so I can't
confirm these results but it is a very fast (from my viewpoint) floating-point
processor. 
	If a compiler had an optimize flag ('-O') then I ran the FLOPS program
with and without the flag set.  The 'R' option with the Manx Aztec C is my
notation to indicate I ran with 'register double' variables.  Designating some
variables to 'live' in 68881 registers makes quite a bit of difference in the
results as shown by the Manx Aztec C results.  A huge difference really.
The use of 'register double' variables causes some problems though, because
there is no 'memory address' for these variables.  Absoft Fortran 77 really
gets confused here: it forces 68881 register variables to have an address
(for a subroutine call) by moving (pushing) the 68881 register variables onto
the stack before a subroutine call.  The variables are retrieved from the stack
after the subroutine call and put back into the original 68881 registers.  All
the while, of course, the data within the 68881 registers is totally valid and
unchanged.  A very inefficient and unnecessary procedure.
	The Lattice C V4.0 results are really a *vast* improvement over
earlier compiler versions (good ol' V3.03).  
	It looks to me as though the 68000/68881 combo just doesn't do justice
to the 68881's inherent processing capability (see the StarBD II results from
Langeveld (BIX)).  The 68000/68881 provides a significant improvement over
the 68000 with software floating-point but its performance is well below
what can be achieved with the 68020/68881 at the same clock speeds with 16-bit
memory.  The 68000/68882 pair may improve things quite a bit because the
68882 is 2 times faster with FMOVE instructions than the 68881.  Also further
improvements can be achieved with compilers that can reduce or eliminate the
library or subroutine call overhead delays.  Lattice C V4.0 has this 
capability but I haven't seen any results and I don't have a StarBoard II to
test.
	Also it appears that the 68000/68881 just can't measure up to the
more tightly coupled 80286/80287 systems like the Zenith Z-248 or other PC-AT
type systems (with respect to the floating-point results here).  To step ahead
of these systems (from the floating-point viewpoint) the 68020/68881 or 
68020/68882 or 68030/68882 is needed (apparently). 

#  System          Language                 CPU/FPU      CPU/FPU MHz KFLOPS
1  Sun 3/280       Sun F77 V3.4 (f77-O)     68020/W1167  25.0/20.0    652.5
2  Compaq DeskPro  High C 386 V?.?          80386/W1167  16.0/16.0    602.4 
3  VAX 8600        4.3 BSD UNIX (f77-O)                               464.9
4  VAX 8600        4.3 BSD UNIX (f77  )                               436.8
5  Sun 3/280       Sun F77 V3.4 (f77  )     68020/W1167  25.0/20.0    330.0
6  DSI-785         SVS C V2.6               68020/68881  30.0/30.0    306.8
7  Sun 3/280       Sun F77 V3.4 (f77-O)     68020/68881  25.0/20.0    238.1
8  Compaq DeskPro  High C 386 V?.?          80386/80387  16.0/16.0    212.8
9  Sun 3/160       Sun F77 V3.4 (f77-O)     68020/68881  16.7/16.7    199.6
10 Turbo-Amiga     Aztec C V3.4B (m8.lib,R) 68020/68881  14.3/14.3    185.7
11 Turbo-Amiga     Aztec C V3.4B (m8.lib,R) 68020/68881   7.2/14.3    185.3
12 Turbo-Amiga     Absoft F77 V2.2C         68020/68881  14.3/14.3    135.5
13 Turbo-Amiga     Absoft F77 V2.2C         68020/68881   7.2/14.3    135.5
14 Sun 3/280       Sun F77 V3.4 (f77  )     68020/68881  25.0/20.0    109.4
15 Sun 3/160       Sun F77 V3.4 (f77  )     68020/68881  16.7/16.7     93.1
16 Turbo-Amiga     Aztec C V3.4B (m8.lib  ) 68020/68881  14.3/14.3     71.2
17 Turbo-Amiga     Aztec C V3.4B (m8.lib  ) 68020/68881   7.2/14.3     57.1
18 PC's Limited286 Ryan-MacFarland F77      80286/80287  12.0/12.0     48.9
19 Zenith Z-248    Ryan-MacFarland F77      80286/80287   8.0/ 8.0     27.0
20 Sun 3/280       Sun F77 V3.4 (f77  )     68020/-----  25.0/----     25.2
21 Sun 3/280       Sun F77 V3.4 (f77-O)     68020/-----  25.0/----     24.9
22 Turbo-Amiga     Aztec C V3.4B (ma.lib,R) 68020/68881  14.3/14.3     22.0
23 Turbo-Amiga     Aztec C V3.4B (ma.lib  ) 68020/68881  14.3/14.3     21.0
24 Tandy 4000      QuickBASIC V4.0          80386/80287  16.0/ 8.0     18.1
25 Turbo-Amiga     Lattice C V4.0( m.lib,R) 68020/-----  14.3/----     
26 Turbo-Amiga     Lattice C V4.0( m.lib  ) 68020/-----  14.3/----     15.3
27 Sun 3/160       Sun F77 V3.4 (f77  )     68020/-----  16.7/----     14.2
28 Sun 3/160       Sun F77 V3.4 (f77-O)     68020/-----  16.7/----     13.9
29 Turbo-Amiga     Absoft F77 V2.2C         68020/-----  14.3/----     12.5
30 Turbo-Amiga     Aztec C V3.4B (ma.lib,R) 68020/68881   7.2/14.3     12.7
31 Amiga/StarBD II Aztec C V3.4B (mi.lib  ) 68000/68881   7.2/12.5     11.7
32 Turbo-Amiga     Aztec C V3.4B (ma.lib  ) 68020/68881   7.2/14.3     11.9
33 Amiga/StarBD II Aztec C V3.4B (ma.lib  ) 68000/68881   7.2/12.5     10.5
34 Turbo-Amiga     Aztec C V3.4B (mx.lib,R) 68020/-----  14.3/----      8.4
35 Turbo-Amiga     Aztec C V3.4B (mx.lib  ) 68020/-----  14.3/----      8.3
36 Turbo-Amiga     Absoft F77 V2.2C         68020/-----   7.2/----      6.0
37 Turbo-Amiga     Lattice C V4.0( m.lib,R) 68020/-----   7.2/----      
38 Turbo-Amiga     Lattice C V4.0( m.lib  ) 68020/-----   7.2/----      5.7
39 Amiga           Lattice C V4.0( m.lib  ) 68000/-----   7.2/----      5.2
40 Turbo-Amiga     Lattice C V3.03          68020/-----  14.3/----      4.7
41 Turbo-Amiga     Aztec C V3.4B (mx.lib,R) 68020/-----   7.2/----      4.4
42 Turbo-Amiga     Aztec C V3.4B (mx.lib  ) 68020/-----   7.2/----      4.3
43 Amiga           Aztec C V3.4B (mx.lib,R) 68000/-----   7.2/----      4.1
44 Amiga           Aztec C V3.4B (mx.lib  ) 68000/-----   7.2/----      4.0
45 Amiga           Absoft F77 V2.2C         68000/-----   7.2/----      3.2
46 Turbo-Amiga     Lattice C V3.03          68020/-----   7.2/----      2.5
47 Amiga           Aztec C V3.4B (mx.lib,R) 68000/-----   7.2/----      2.3
48 Amiga           Aztec C V3.4B (mx.lib  ) 68000/-----   7.2/----      2.3
49 Tandy 3000      QuickBASIC V4.0          80386/-----  16.0/----      2.2
50 Turbo-Amiga     AmigaBASIC V1.2          68020/68881  14.3/----      1.9
51 Turbo-Amiga     AmigaBASIC V1.2          68020/-----  14.3/----      1.5
52 Amiga           AmigaBASIC V1.2          68000/-----   7.2/----      1.4
53 Amiga           Lattice C V3.03          68000/-----   7.2/----      1.1