[comp.sys.next] floating point performance

mahesh@ndmath.UUCP (Mahesh) (06/13/89)

 A member of the faculty ran some tests on the NeXT, covering
 floating point performance, scalar performance, and general
 "real world" problems.  I have attached the results below.
 But first:

DISCLAIMER:  These results were obtained by running two
programs, loops7 and dqtest.  Source for these can be obtained
from me if necessary.  However, the results do not indicate
any belief held by my organization regarding the relative
performance of different systems, i.e., they are strictly
private.

	Now that that is out of the way, as you can see, the cube is
not all that bad regarding f.p. performance.  However, its
limitations seem to be those inherent to the 68882, i.e.,
great transcendental and exponential performance, but
mediocre addition, subtraction, and multiplication.

	The results were obtained on software release 0.8.  I do not
expect any dramatic improvements in 0.9, since most of the
coding was done at a pretty basic level.


loops7 ----------------------------------------------------

Timing trials of basic arithmetic and math functions on various Unix-based
machines.  Units are millions of operations per second.  All calculations are
scalar as opposed to vector (as is possible on the Convex).
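
A trial of this kind can be sketched in C as below.  This is not the loops7
source, which is not reproduced in this post.  The names mops and
time_double_add, the use of clock(), and the volatile accumulator (there to
keep an optimizing compiler from deleting the loop) are all assumptions made
for illustration.

```c
#include <time.h>

/* Convert an operation count and elapsed seconds to millions of ops/sec. */
double mops(long ops, double seconds)
{
    return (double)ops / seconds / 1.0e6;
}

/* Hypothetical trial: time n scalar double precision additions.
   Returns elapsed CPU seconds as measured by clock(). */
double time_double_add(long n)
{
    volatile double sum = 0.0;  /* volatile keeps the loop from being removed */
    clock_t start = clock();
    for (long i = 0; i < n; i++)
        sum = sum + 1.5;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

The rate for the trial would then be mops(n, time_double_add(n)).  Note that
the measured time in this sketch includes the loop overhead, and that clock()
may be too coarse for short runs, so n has to be large enough to give a
nonzero time.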

The "group" times are, for each of the three groups, the total number of
operations divided by the total time for all members of the group.
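
As a sketch of that definition (the function name group_rate and the argument
layout are assumptions, not taken from loops7), the group figure is total
operations over total time, which is not the same as averaging the individual
rates:

```c
#include <stddef.h>

/* Group rate in millions of ops/sec: total operations over total time.
   ops[i] and secs[i] are the operation count and elapsed time of trial i. */
double group_rate(const long *ops, const double *secs, size_t n)
{
    double total_ops = 0.0, total_secs = 0.0;
    for (size_t i = 0; i < n; i++) {
        total_ops += (double)ops[i];
        total_secs += secs[i];
    }
    return total_ops / total_secs / 1.0e6;
}
```

For example, two hypothetical trials of one million operations each, taking
1 second and 4 seconds, have individual rates of 1.0 and 0.25 but a group
rate of 2/5 = 0.4, so the slow members of a group dominate its figure.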

The fastest for each trial is marked with an asterisk (the Dec3100 in all but
three cases).

operation           MacII A/UX    NeXT    Dec3100 Sun3-280 Sun3-280    Convex
                                                    no fpa   fpa         C-120

double precision
  decrement             0.0704   0.4211   16.5517*   0.2453    1.0643  1.3370
  addition              0.0686   0.7754   15.4839*   0.4163    2.5668  1.3408
  multiplication        0.0628   0.4913    5.3333*   0.3288    2.2018  2.3762
  division              0.0543   0.2970    1.0084*   0.2083    0.2709  0.2614

  group                 0.0659   0.4968    5.7992*   0.3031    1.2009  1.1321

double precision
  exp()                 0.0140   0.0552    0.1579*   0.0095    0.0253  0.0578
  sin()                 0.0072   0.0732    0.1611*   0.0059    0.0176  0.0727
  atan()                0.0132   0.0665    0.1290*   0.0056    0.0195  0.0565
  sqrt()                0.0163   0.1356*   0.1116    0.0034    0.0040  0.0406

  group                 0.0115   0.0738    0.1367*   0.0053    0.0101  0.0545

integer
  increment             7.7922  12.3077    15.7895* 12.1212   12.1827  9.5238
  addition              7.6238  12.2449*    8.1301  10.7143   11.9522  9.4192
  multiplication        0.3070   0.5128     1.3475   5.8065*   2.7000  1.8832
  mod 17                0.1523   0.2384     0.4619*  0.2512    0.2541  0.3138
  one loop              1.4585   2.4107     4.0374*  2.3789    2.4215  2.3504

  group                 0.9830   1.6029    3.0783*   3.0088    2.8030  2.7368






dqtest ----------------------------------------------------


Timing trials of dqtest on various machines (times rounded to the nearest
second).  dqtest is a program (in C and Fortran) that solves the 3-body
problem with two of the bodies stationary.  This involves 4 simultaneous
differential equations and is representative of many applications in science.
The four equations are solved using a standard routine, which is included in
the code.  It has not been written with vectorization in mind.  The maximum
optimization possible was used for each run.

cpu             time        compiler    processor       clock  options
                (secs)                                    MHz

Sun2-120        1325        Fortran     MC68010           ?     -O3
Vax-750         425         Fortran     Digital           ?     none
Sun3-280        170             C       MC68020          25     -O3, -f68881
Mac A/UX        106             C       MC68020        15.7     -O3, -f68881
MicroVaxII      92          Fortran     Digital           ?     coprocessor
Sun3-280        75              C       MC68020          25     -O3, -ffpa
Sun3-280        66          Fortran     MC68020          25     -O3, -ffpa
NeXT            46              C       MC68030          25     -O, -f68882
Vax-3200        31          Fortran     Digital           ?     coprocessor
Convex-120      15          Fortran     Convex            ?     -O1 (scalar)
Convex-120      14          Fortran     Convex            ?     -O2 (vector)
Dec-3100        10              C       MIPS              ?     default





My thanks to Dr. Ken Grant for doing all the above.  

Mahesh Subramanya
Office of University Computing
University of Notre Dame
Notre Dame,  IN  46556

mahesh@darwin.cc.nd.edu


#include <.signature>

Subject: Re: Floating point performance
Newsgroups: comp.sys.next
Distribution: usa

david@sun.com (I refuse to say) (06/14/89)

In article <1440@ndmath.UUCP> mahesh@ndmath.UUCP (Mahesh) writes:

>operation           MacII A/UX    NeXT    Dec3100 Sun3-280 Sun3-280    Convex
>                                                    no fpa   fpa         C-120
>
>double precision
>  decrement             0.0704   0.4211   16.5517*   0.2453    1.0643  1.3370
>  addition              0.0686   0.7754   15.4839*   0.4163    2.5668  1.3408

>integer
>  increment             7.7922  12.3077    15.7895* 12.1212   12.1827  9.5238
>  addition              7.6238  12.2449*    8.1301  10.7143   11.9522  9.4192

The discrepancies between decrement/increment and addition make me very
suspicious of this benchmark.
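
To put numbers on that, here is a small C sketch that computes the ratio of
the addition rate to the decrement rate for the double precision rows quoted
above.  The array and function names are made up for illustration; only the
figures come from the table.

```c
#include <stdio.h>

#define NMACH 6

static const char *const MACHINE[NMACH] = {
    "MacII A/UX", "NeXT", "Dec3100",
    "Sun3-280 no fpa", "Sun3-280 fpa", "Convex C-120"
};

/* Double precision rates from the quoted table, in millions of ops/sec. */
static const double DECREMENT[NMACH] =
    { 0.0704, 0.4211, 16.5517, 0.2453, 1.0643, 1.3370 };
static const double ADDITION[NMACH] =
    { 0.0686, 0.7754, 15.4839, 0.4163, 2.5668, 1.3408 };

/* Ratio of addition rate to decrement rate for machine m. */
double add_over_dec(int m)
{
    return ADDITION[m] / DECREMENT[m];
}

/* Print one ratio per machine. */
void print_ratios(void)
{
    for (int m = 0; m < NMACH; m++)
        printf("%-16s %.2f\n", MACHINE[m], add_over_dec(m));
}
```

If a decrement is just an addition of a constant, one would expect ratios
near 1.  On the quoted figures the ratio is close to 1 on the MacII, Dec3100,
and Convex, but roughly 1.7 to 2.4 on the NeXT and both Sun3-280
configurations.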

-- 
David DiGiacomo, Sun Microsystems, Mt. View, CA  sun!david david@sun.com

rick@hanauma (Richard Ottolini) (07/31/89)

The 68040 NeXT will be in the 12-15 MIPS, 2-4 MFLOP performance range,
comparable to a DEC 3100 or SparcStation, whenever it ships.
Give more weight to your software investment, i.e., whether NeXTStep and
Display PostScript will help you write better software more easily.