[comp.benchmarks] Linpack on SPARCstation 2 vs. SPARCstation 1+ vs. Sun 4/490

tn@leadsv.UUCP (Tristan Nefzger) (12/05/90)

Jack Dongarra's* single and double precision linpack programs
(linpacks and linpackd**) were compiled at several optimization levels
and run on the SPARCstation 2 (Calvin), the SPARCstation 1+ and the Sun
4/490.  Both SPARCstations were running SunOS 4.1.1 Beta, while the
4/490 was running SunOS 4.1.  The compilers used were Sun's FORTRAN
1.3.1 and the Sun C compiler bundled with SunOS 4.1.1 Beta.  The
"-Bstatic" option was used in all cases to ensure static (link-time)
linking with libraries.  The optimization levels used were
(0) no optimization; (1) level 1 (-O1); (2) level 2 (-O2); and (3)
level 3 (-O3).  The C compiler was used only to compile a module
encapsulating the SunOS times() system call for measurement of (user)
time.  This C module and its FORTRAN interface, second.f, which
defines the second() function in the linpack programs, were supplied
by Paul Hansen, and are included below.  Results are based on the last
of three consecutive executions of each appropriately compiled linpack
program, the output of which is also included below.  Oddly enough,
the (arithmetic) average of the single and double precision Mflops
for the SPARCstation 2 at level 3 optimization is 4.2, which is
exactly what Sun claims it to be in SunFLASH Vol 23 #6, "SPARCstation 2
Family," November 1990 (distributed via email from Sun).


Mflops averages
---------------

optimization    SPARCstation 2      SPARCstation 1+      Sun 4/490
   level        single  double      single  double      single  double
                   average             average             average

     0          1.7     1.3         1.0     0.7         1.4     1.1
                    1.5                 .85                 1.2

     1          1.7     1.3         1.0     0.7         1.4     1.1
                    1.5                 .85                 1.2

     2          3.7     2.3         2.1     1.1         2.9     2.0
                    3.0                 1.6                 2.4
        
     3          5.1     3.3         2.7     1.6         4.9     3.1
                    4.2                 2.1                 4.0

(All averages are arithmetic means.)
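
For anyone checking the arithmetic, the per-machine averages are the
plain mean of the single and double precision figures.  A minimal
sketch follows; the function name mflops_average is made up for
illustration and is not part of the linpack sources.

```c
#include <assert.h>

/* Arithmetic mean of the single- and double-precision Mflops figures.
   (Hypothetical helper, not part of linpacks/linpackd.) */
double mflops_average(double single_mflops, double double_mflops)
{
    assert(single_mflops >= 0.0 && double_mflops >= 0.0);
    return (single_mflops + double_mflops) / 2.0;
}
```

Calling mflops_average(5.1, 3.3) reproduces the 4.2 quoted for the
SPARCstation 2 at level 3.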


* Jack J. Dongarra
  Computer Science Department
  University of Tennessee
  Knoxville, Tennessee 37996-1300
  Fax: 615-974-8296
  Internet: dongarra@cs.utk.edu

**Available from netlibd@surfer.epm.ornl.gov via email with the respective
  subject lines:

  (1) send linpacks from benchmark
  (2) send linpackd from benchmark

===============================================================================
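/* mclock.c 
 * 
 * module written by P. Hansen
 * encapsulates SunOS times() call
 *
 */

#include <sys/types.h>
#include <sys/times.h>

/* returns the user CPU time of the calling process, in clock ticks */
long mclock_()
  {
  struct tms buf;              /* times() fills in a struct tms */
  times(&buf);
  return((long) buf.tms_utime);
  }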
===============================================================================
c     second.f
c     module written by P.Hansen
c     calls mclock()

      real function second(t)
c
c     this routine will gather the user time for a process.
c     it has resolution of 1/60 of a second
c     and uses the unix c program times.
c     see the unix manual for details
c     reports time in seconds.
c
      itime = mclock(i)
      second = float(itime)/60.
c
c     this statement is here to bump the time by a bit
c     in case the interval was too small.
c
      second = second + second*1.0e-6
      return
      end
===============================================================================
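On a POSIX system the same measurement can probably be done without
hard-coding the 1/60 second tick that second.f assumes.  The sketch
below is not part of the original benchmark setup: the name
user_seconds is hypothetical, and it assumes sysconf(_SC_CLK_TCK) is
available to report the tick rate.

```c
#include <assert.h>
#include <sys/times.h>
#include <unistd.h>

/* User CPU time consumed so far by this process, in seconds.
   (Hypothetical helper; second.f divides by a fixed 60 instead.) */
double user_seconds(void)
{
    struct tms buf;
    long ticks_per_sec = sysconf(_SC_CLK_TCK);  /* clock ticks per second */

    assert(ticks_per_sec > 0);
    times(&buf);                                /* fills in tms_utime etc. */
    return (double) buf.tms_utime / (double) ticks_per_sec;
}
```

Taking the difference of two calls around the code of interest gives
the elapsed user time, at the resolution of one clock tick.
===============================================================================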
linpacks results for SPARCstation 2/SunOS 4.1.1 Beta/Sun FORTRAN 1.3.1:

no optimization:

    norm. resid      resid           machep         x(1)          x(n)
 1.59498179E+00  3.80277634E-05  1.19209290E-07  9.99986172E-01  9.99992490E-01


    times are reported for matrices of order   100
      sgefa      sgesl      total     mflops       unit      ratio
 times for array with leading dimension of 201
  3.833E-01  1.667E-02  4.000E-01  1.717E+00  1.165E+00  7.143E+00
  4.000E-01  0.000E+00  4.000E-01  1.717E+00  1.165E+00  7.143E+00
  3.833E-01  1.667E-02  4.000E-01  1.717E+00  1.165E+00  7.143E+00
  3.917E-01  1.167E-02  4.033E-01  1.702E+00  1.175E+00  7.202E+00

 times for array with leading dimension of 200
  3.833E-01  1.667E-02  4.000E-01  1.717E+00  1.165E+00  7.143E+00
  4.000E-01  1.667E-02  4.167E-01  1.648E+00  1.214E+00  7.440E+00
  3.833E-01  1.667E-02  4.000E-01  1.717E+00  1.165E+00  7.143E+00
  3.967E-01  1.000E-02  4.067E-01  1.689E+00  1.184E+00  7.262E+00

level 1 optimization:

    norm. resid      resid           machep         x(1)          x(n)
 1.59498179E+00  3.80277634E-05  1.19209290E-07  9.99986172E-01  9.99992490E-01


    times are reported for matrices of order   100
      sgefa      sgesl      total     mflops       unit      ratio
 times for array with leading dimension of 201
  4.000E-01  1.667E-02  4.167E-01  1.648E+00  1.214E+00  7.440E+00
  3.833E-01  1.667E-02  4.000E-01  1.717E+00  1.165E+00  7.143E+00
  3.833E-01  0.000E+00  3.833E-01  1.791E+00  1.117E+00  6.845E+00
  3.800E-01  1.000E-02  3.900E-01  1.761E+00  1.136E+00  6.964E+00

 times for array with leading dimension of 200
  3.833E-01  1.667E-02  4.000E-01  1.717E+00  1.165E+00  7.143E+00
  3.833E-01  1.667E-02  4.000E-01  1.717E+00  1.165E+00  7.143E+00
  3.667E-01  3.333E-02  4.000E-01  1.717E+00  1.165E+00  7.143E+00
  3.867E-01  1.167E-02  3.983E-01  1.724E+00  1.160E+00  7.113E+00

level 2 optimization:

    norm. resid      resid           machep         x(1)          x(n)
 1.59498179E+00  3.80277634E-05  1.19209290E-07  9.99986172E-01  9.99992490E-01


    times are reported for matrices of order   100
      sgefa      sgesl      total     mflops       unit      ratio
 times for array with leading dimension of 201
  2.000E-01  0.000E+00  2.000E-01  3.433E+00  5.825E-01  3.571E+00
  1.833E-01  0.000E+00  1.833E-01  3.745E+00  5.340E-01  3.274E+00
  1.833E-01  0.000E+00  1.833E-01  3.745E+00  5.340E-01  3.274E+00
  1.850E-01  5.000E-03  1.900E-01  3.614E+00  5.534E-01  3.393E+00

 times for array with leading dimension of 200
  1.833E-01  0.000E+00  1.833E-01  3.745E+00  5.340E-01  3.274E+00
  1.833E-01  0.000E+00  1.833E-01  3.745E+00  5.340E-01  3.274E+00
  1.667E-01  0.000E+00  1.667E-01  4.120E+00  4.854E-01  2.976E+00
  1.800E-01  5.000E-03  1.850E-01  3.712E+00  5.388E-01  3.304E+00

level 3 optimization:

    norm. resid      resid           machep         x(1)          x(n)
 1.59498179E+00  3.80277634E-05  1.19209290E-07  9.99986172E-01  9.99992490E-01


    times are reported for matrices of order   100
      sgefa      sgesl      total     mflops       unit      ratio
 times for array with leading dimension of 201
  1.333E-01  0.000E+00  1.333E-01  5.150E+00  3.883E-01  2.381E+00
  1.333E-01  0.000E+00  1.333E-01  5.150E+00  3.883E-01  2.381E+00
  1.333E-01  0.000E+00  1.333E-01  5.150E+00  3.883E-01  2.381E+00
  1.267E-01  5.000E-03  1.317E-01  5.215E+00  3.835E-01  2.351E+00

 times for array with leading dimension of 200
  1.333E-01  0.000E+00  1.333E-01  5.150E+00  3.883E-01  2.381E+00
  1.333E-01  0.000E+00  1.333E-01  5.150E+00  3.883E-01  2.381E+00
  1.333E-01  0.000E+00  1.333E-01  5.150E+00  3.883E-01  2.381E+00
  1.300E-01  5.000E-03  1.350E-01  5.086E+00  3.932E-01  2.411E+00
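
The mflops, unit and ratio columns can all be derived from the total
column.  The sketch below uses the usual Linpack operation count of
2/3 n^3 + 2 n^2 for a system of order n; the function names are made
up for illustration, and the 0.056 second reference time is an
assumption that happens to reproduce the printed ratios.

```c
#include <assert.h>

/* Floating-point operations for an order-n system:
   2/3 n^3 for the factorization plus 2 n^2 for the solve. */
double linpack_ops(int n)
{
    double dn = (double) n;
    assert(n > 0);
    return 2.0 * dn * dn * dn / 3.0 + 2.0 * dn * dn;
}

/* "mflops" column, from the "total" column in seconds. */
double linpack_mflops(int n, double total_seconds)
{
    assert(total_seconds > 0.0);
    return linpack_ops(n) / (1.0e6 * total_seconds);
}

/* "unit" column. */
double linpack_unit(double mflops)
{
    assert(mflops > 0.0);
    return 2.0 / mflops;
}

/* "ratio" column: total time relative to an assumed 0.056 s reference. */
double linpack_ratio(double total_seconds)
{
    return total_seconds / 0.056;
}
```

For the first unoptimized row, linpack_mflops(100, 0.4) comes out near
1.717 and linpack_ratio(0.4) near 7.143, matching the printed values.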
===============================================================================
linpacks results for SPARCstation 1+/SunOS 4.1.1 Beta/Sun FORTRAN 1.3.1:

no optimization:

    norm. resid      resid           machep         x(1)          x(n)
 1.59498179E+00  3.80277634E-05  1.19209290E-07  9.99986172E-01  9.99992490E-01


    times are reported for matrices of order   100
      sgefa      sgesl      total     mflops       unit      ratio
 times for array with leading dimension of 201
  6.500E-01  1.667E-02  6.667E-01  1.030E+00  1.942E+00  1.190E+01
  6.500E-01  3.333E-02  6.833E-01  1.005E+00  1.990E+00  1.220E+01
  6.667E-01  1.667E-02  6.833E-01  1.005E+00  1.990E+00  1.220E+01
  6.617E-01  2.000E-02  6.817E-01  1.007E+00  1.985E+00  1.217E+01

 times for array with leading dimension of 200
  6.667E-01  1.667E-02  6.833E-01  1.005E+00  1.990E+00  1.220E+01
  6.500E-01  1.667E-02  6.667E-01  1.030E+00  1.942E+00  1.190E+01
  6.500E-01  3.333E-02  6.833E-01  1.005E+00  1.990E+00  1.220E+01
  6.600E-01  2.000E-02  6.800E-01  1.010E+00  1.981E+00  1.214E+01


level 1 optimization:

    norm. resid      resid           machep         x(1)          x(n)
 1.59498179E+00  3.80277634E-05  1.19209290E-07  9.99986172E-01  9.99992490E-01


    times are reported for matrices of order   100
      sgefa      sgesl      total     mflops       unit      ratio
 times for array with leading dimension of 201
  6.333E-01  1.667E-02  6.500E-01  1.056E+00  1.893E+00  1.161E+01
  6.333E-01  1.667E-02  6.500E-01  1.056E+00  1.893E+00  1.161E+01
  6.500E-01  1.667E-02  6.667E-01  1.030E+00  1.942E+00  1.190E+01
  6.350E-01  2.000E-02  6.550E-01  1.048E+00  1.908E+00  1.170E+01

 times for array with leading dimension of 200
  6.333E-01  1.667E-02  6.500E-01  1.056E+00  1.893E+00  1.161E+01
  6.500E-01  1.667E-02  6.667E-01  1.030E+00  1.942E+00  1.190E+01
  6.500E-01  1.667E-02  6.667E-01  1.030E+00  1.942E+00  1.190E+01
  6.433E-01  2.000E-02  6.633E-01  1.035E+00  1.932E+00  1.185E+01


level 2 optimization:

    norm. resid      resid           machep         x(1)          x(n)
 1.59498179E+00  3.80277634E-05  1.19209290E-07  9.99986172E-01  9.99992490E-01


    times are reported for matrices of order   100
      sgefa      sgesl      total     mflops       unit      ratio
 times for array with leading dimension of 201
  3.000E-01  1.667E-02  3.167E-01  2.168E+00  9.223E-01  5.655E+00
  3.167E-01  0.000E+00  3.167E-01  2.168E+00  9.223E-01  5.655E+00
  3.167E-01  0.000E+00  3.167E-01  2.168E+00  9.223E-01  5.655E+00
  3.200E-01  1.000E-02  3.300E-01  2.081E+00  9.612E-01  5.893E+00

 times for array with leading dimension of 200
  3.333E-01  0.000E+00  3.333E-01  2.060E+00  9.709E-01  5.952E+00
  3.167E-01  1.667E-02  3.333E-01  2.060E+00  9.709E-01  5.952E+00
  3.167E-01  1.667E-02  3.333E-01  2.060E+00  9.709E-01  5.952E+00
  3.183E-01  1.000E-02  3.283E-01  2.091E+00  9.563E-01  5.863E+00

level 3 optimization:

    norm. resid      resid           machep         x(1)          x(n)
 1.59498179E+00  3.80277634E-05  1.19209290E-07  9.99986172E-01  9.99992490E-01


    times are reported for matrices of order   100
      sgefa      sgesl      total     mflops       unit      ratio
 times for array with leading dimension of 201
  2.333E-01  1.667E-02  2.500E-01  2.747E+00  7.282E-01  4.464E+00
  2.333E-01  1.667E-02  2.500E-01  2.747E+00  7.282E-01  4.464E+00
  2.500E-01  0.000E+00  2.500E-01  2.747E+00  7.282E-01  4.464E+00
  2.383E-01  6.667E-03  2.450E-01  2.803E+00  7.136E-01  4.375E+00

 times for array with leading dimension of 200
  2.333E-01  1.667E-02  2.500E-01  2.747E+00  7.282E-01  4.464E+00
  2.500E-01  0.000E+00  2.500E-01  2.747E+00  7.282E-01  4.464E+00
  2.500E-01  0.000E+00  2.500E-01  2.747E+00  7.282E-01  4.464E+00
  2.450E-01  6.667E-03  2.517E-01  2.728E+00  7.330E-01  4.494E+00
===============================================================================
linpacks results for Sun 4-490/SunOS 4.1/Sun FORTRAN 1.3.1:

no optimization:

    norm. resid      resid           machep         x(1)          x(n)
 1.59498179E+00  3.80277634E-05  1.19209290E-07  9.99986172E-01  9.99992490E-01


    times are reported for matrices of order   100
      sgefa      sgesl      total     mflops       unit      ratio
 times for array with leading dimension of 201
  5.000E-01  0.000E+00  5.000E-01  1.373E+00  1.456E+00  8.929E+00
  4.833E-01  3.333E-02  5.167E-01  1.329E+00  1.505E+00  9.226E+00
  4.833E-01  3.333E-02  5.167E-01  1.329E+00  1.505E+00  9.226E+00
  4.950E-01  1.500E-02  5.100E-01  1.346E+00  1.485E+00  9.107E+00

 times for array with leading dimension of 200
  5.000E-01  0.000E+00  5.000E-01  1.373E+00  1.456E+00  8.929E+00
  5.000E-01  0.000E+00  5.000E-01  1.373E+00  1.456E+00  8.929E+00
  4.833E-01  1.667E-02  5.000E-01  1.373E+00  1.456E+00  8.929E+00
  4.933E-01  1.500E-02  5.083E-01  1.351E+00  1.481E+00  9.077E+00

level 1 optimization:

    norm. resid      resid           machep         x(1)          x(n)
 1.59498179E+00  3.80277634E-05  1.19209290E-07  9.99986172E-01  9.99992490E-01


    times are reported for matrices of order   100
      sgefa      sgesl      total     mflops       unit      ratio
 times for array with leading dimension of 201
  4.833E-01  1.667E-02  5.000E-01  1.373E+00  1.456E+00  8.929E+00
  4.833E-01  0.000E+00  4.833E-01  1.421E+00  1.408E+00  8.631E+00
  4.833E-01  1.667E-02  5.000E-01  1.373E+00  1.456E+00  8.929E+00
  4.817E-01  1.333E-02  4.950E-01  1.387E+00  1.442E+00  8.839E+00

 times for array with leading dimension of 200
  4.667E-01  0.000E+00  4.667E-01  1.471E+00  1.359E+00  8.333E+00
  4.833E-01  1.667E-02  5.000E-01  1.373E+00  1.456E+00  8.929E+00
  4.833E-01  1.667E-02  5.000E-01  1.373E+00  1.456E+00  8.929E+00
  4.800E-01  1.333E-02  4.933E-01  1.392E+00  1.437E+00  8.810E+00

level 2 optimization:

    norm. resid      resid           machep         x(1)          x(n)
 1.59498179E+00  3.80277634E-05  1.19209290E-07  9.99986172E-01  9.99992490E-01


    times are reported for matrices of order   100
      sgefa      sgesl      total     mflops       unit      ratio
 times for array with leading dimension of 201
  2.333E-01  0.000E+00  2.333E-01  2.943E+00  6.796E-01  4.167E+00
  2.167E-01  1.667E-02  2.333E-01  2.943E+00  6.796E-01  4.167E+00
  2.167E-01  1.667E-02  2.333E-01  2.943E+00  6.796E-01  4.167E+00
  2.233E-01  8.333E-03  2.317E-01  2.964E+00  6.748E-01  4.137E+00

 times for array with leading dimension of 200
  2.167E-01  1.667E-02  2.333E-01  2.943E+00  6.796E-01  4.167E+00
  2.167E-01  1.667E-02  2.333E-01  2.943E+00  6.796E-01  4.167E+00
  2.333E-01  0.000E+00  2.333E-01  2.943E+00  6.796E-01  4.167E+00
  2.233E-01  6.667E-03  2.300E-01  2.986E+00  6.699E-01  4.107E+00

level 3 optimization:

    norm. resid      resid           machep         x(1)          x(n)
 1.59498179E+00  3.80277634E-05  1.19209290E-07  9.99986172E-01  9.99992490E-01


    times are reported for matrices of order   100
      sgefa      sgesl      total     mflops       unit      ratio
 times for array with leading dimension of 201
  1.500E-01  0.000E+00  1.500E-01  4.578E+00  4.369E-01  2.679E+00
  1.333E-01  0.000E+00  1.333E-01  5.150E+00  3.883E-01  2.381E+00
  1.500E-01  0.000E+00  1.500E-01  4.578E+00  4.369E-01  2.679E+00
  1.367E-01  5.000E-03  1.417E-01  4.847E+00  4.126E-01  2.530E+00

 times for array with leading dimension of 200
  1.333E-01  0.000E+00  1.333E-01  5.150E+00  3.883E-01  2.381E+00
  1.500E-01  0.000E+00  1.500E-01  4.578E+00  4.369E-01  2.679E+00
  1.500E-01  0.000E+00  1.500E-01  4.578E+00  4.369E-01  2.679E+00
  1.417E-01  3.333E-03  1.450E-01  4.736E+00  4.223E-01  2.589E+00
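
The level 3 rows above alternate between 5.150 and 4.578 Mflops.  With
the 1/60 second resolution stated in second.f, those two figures
correspond to totals of 8 and 9 clock ticks, so a single tick moves
the result by about 11%.  A small sketch of that arithmetic, with a
hypothetical function name and the order-100 operation count written
out as a constant:

```c
#include <assert.h>

#define TICKS_PER_SECOND 60          /* resolution stated in second.f */
#define OPS_ORDER_100    686666.67   /* 2/3*100^3 + 2*100^2, rounded */

/* Mflops that would be reported for an order-100 run whose total
   time was measured as `ticks` clock ticks.  (Hypothetical helper.) */
double mflops_for_ticks(int ticks)
{
    assert(ticks > 0);
    return OPS_ORDER_100 / (1.0e6 * ticks / TICKS_PER_SECOND);
}
```

At this matrix size the fastest runs last fewer than ten ticks, which
is worth keeping in mind when comparing figures that differ only
slightly.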
===============================================================================
linpackd results for SPARCstation 2/SunOS 4.1.1 Beta/Sun FORTRAN 1.3.1:

no optimization:

    norm. resid      resid           machep         x(1)          x(n)
 1.67005097E+00  7.41628980E-14  2.22044605E-16  1.00000000E+00  1.00000000E+00


    times are reported for matrices of order   100
      dgefa      dgesl      total     mflops       unit      ratio
 times for array with leading dimension of 201
  5.167E-01  1.667E-02  5.333E-01  1.287E+00  1.553E+00  9.524E+00
  5.167E-01  1.667E-02  5.333E-01  1.287E+00  1.553E+00  9.524E+00
  5.167E-01  1.667E-02  5.333E-01  1.287E+00  1.553E+00  9.524E+00
  5.083E-01  1.500E-02  5.233E-01  1.312E+00  1.524E+00  9.345E+00

 times for array with leading dimension of 200
  5.333E-01  0.000E+00  5.333E-01  1.287E+00  1.553E+00  9.524E+00
  5.833E-01  1.667E-02  6.000E-01  1.144E+00  1.748E+00  1.071E+01
  5.667E-01  0.000E+00  5.667E-01  1.212E+00  1.650E+00  1.012E+01
  5.333E-01  1.500E-02  5.483E-01  1.252E+00  1.597E+00  9.792E+00

level 1 optimization:

    times are reported for matrices of order   100
      dgefa      dgesl      total     mflops       unit      ratio
 times for array with leading dimension of 201
  5.167E-01  0.000E+00  5.167E-01  1.329E+00  1.505E+00  9.226E+00
  5.000E-01  1.667E-02  5.167E-01  1.329E+00  1.505E+00  9.226E+00
  5.000E-01  0.000E+00  5.000E-01  1.373E+00  1.456E+00  8.929E+00
  5.000E-01  1.667E-02  5.167E-01  1.329E+00  1.505E+00  9.226E+00

 times for array with leading dimension of 200
  5.167E-01  1.667E-02  5.333E-01  1.288E+00  1.553E+00  9.524E+00
  5.333E-01  0.000E+00  5.333E-01  1.287E+00  1.553E+00  9.524E+00
  5.500E-01  1.667E-02  5.667E-01  1.212E+00  1.650E+00  1.012E+01
  5.417E-01  1.500E-02  5.567E-01  1.234E+00  1.621E+00  9.940E+00

level 2 optimization:

    norm. resid      resid           machep         x(1)          x(n)
 1.67005097E+00  7.41628980E-14  2.22044605E-16  1.00000000E+00  1.00000000E+00


    times are reported for matrices of order   100
      dgefa      dgesl      total     mflops       unit      ratio
 times for array with leading dimension of 201
  2.667E-01  0.000E+00  2.667E-01  2.575E+00  7.767E-01  4.762E+00
  2.833E-01  1.667E-02  3.000E-01  2.289E+00  8.738E-01  5.357E+00
  2.500E-01  1.667E-02  2.667E-01  2.575E+00  7.767E-01  4.762E+00
  2.633E-01  1.000E-02  2.733E-01  2.512E+00  7.961E-01  4.881E+00

 times for array with leading dimension of 200
  2.833E-01  1.667E-02  3.000E-01  2.289E+00  8.738E-01  5.357E+00
  2.833E-01  1.667E-02  3.000E-01  2.289E+00  8.738E-01  5.357E+00
  2.833E-01  1.667E-02  3.000E-01  2.289E+00  8.738E-01  5.357E+00
  2.983E-01  1.000E-02  3.083E-01  2.227E+00  8.981E-01  5.506E+00

level 3 optimization:

    norm. resid      resid           machep         x(1)          x(n)
 1.67005097E+00  7.41628980E-14  2.22044605E-16  1.00000000E+00  1.00000000E+00


    times are reported for matrices of order   100
      dgefa      dgesl      total     mflops       unit      ratio
 times for array with leading dimension of 201
  2.000E-01  0.000E+00  2.000E-01  3.433E+00  5.825E-01  3.571E+00
  1.833E-01  1.667E-02  2.000E-01  3.433E+00  5.825E-01  3.571E+00
  2.000E-01  0.000E+00  2.000E-01  3.433E+00  5.825E-01  3.571E+00
  1.983E-01  6.667E-03  2.050E-01  3.350E+00  5.971E-01  3.661E+00

 times for array with leading dimension of 200
  2.167E-01  0.000E+00  2.167E-01  3.169E+00  6.311E-01  3.869E+00
  2.167E-01  0.000E+00  2.167E-01  3.169E+00  6.311E-01  3.869E+00
  2.167E-01  0.000E+00  2.167E-01  3.169E+00  6.311E-01  3.869E+00
  2.217E-01  6.667E-03  2.283E-01  3.007E+00  6.650E-01  4.077E+00
===============================================================================
linpackd results for SPARCstation 1+/SunOS 4.1.1 Beta/Sun FORTRAN 1.3.1:

no optimization:

    norm. resid      resid           machep         x(1)          x(n)
 1.67005097E+00  7.41628980E-14  2.22044605E-16  1.00000000E+00  1.00000000E+00


    times are reported for matrices of order   100
      dgefa      dgesl      total     mflops       unit      ratio
 times for array with leading dimension of 201
  9.333E-01  1.667E-02  9.500E-01  7.228E-01  2.767E+00  1.696E+01
  9.333E-01  1.667E-02  9.500E-01  7.228E-01  2.767E+00  1.696E+01
  9.333E-01  1.667E-02  9.500E-01  7.228E-01  2.767E+00  1.696E+01
  9.350E-01  2.833E-02  9.633E-01  7.128E-01  2.806E+00  1.720E+01

 times for array with leading dimension of 200
  9.667E-01  3.333E-02  1.000E+00  6.867E-01  2.913E+00  1.786E+01
  9.500E-01  3.333E-02  9.833E-01  6.983E-01  2.864E+00  1.756E+01
  9.833E-01  1.667E-02  1.000E+00  6.867E-01  2.913E+00  1.786E+01
  9.650E-01  3.000E-02  9.950E-01  6.901E-01  2.898E+00  1.777E+01

level 1 optimization:

    norm. resid      resid           machep         x(1)          x(n)
 1.67005097E+00  7.41628980E-14  2.22044605E-16  1.00000000E+00  1.00000000E+00


    times are reported for matrices of order   100
      dgefa      dgesl      total     mflops       unit      ratio
 times for array with leading dimension of 201
  9.167E-01  1.667E-02  9.333E-01  7.357E-01  2.718E+00  1.667E+01
  9.333E-01  1.667E-02  9.500E-01  7.228E-01  2.767E+00  1.696E+01
  9.333E-01  3.333E-02  9.667E-01  7.103E-01  2.816E+00  1.726E+01
  9.183E-01  2.833E-02  9.467E-01  7.254E-01  2.757E+00  1.690E+01

 times for array with leading dimension of 200
  9.500E-01  3.333E-02  9.833E-01  6.983E-01  2.864E+00  1.756E+01
  9.500E-01  3.333E-02  9.833E-01  6.983E-01  2.864E+00  1.756E+01
  9.333E-01  3.333E-02  9.667E-01  7.103E-01  2.816E+00  1.726E+01
  9.517E-01  3.000E-02  9.817E-01  6.995E-01  2.859E+00  1.753E+01

level 2 optimization:

    norm. resid      resid           machep         x(1)          x(n)
 1.67005097E+00  7.41628980E-14  2.22044605E-16  1.00000000E+00  1.00000000E+00


    times are reported for matrices of order   100
      dgefa      dgesl      total     mflops       unit      ratio
 times for array with leading dimension of 201
  5.667E-01  3.333E-02  6.000E-01  1.144E+00  1.748E+00  1.071E+01
  5.667E-01  1.667E-02  5.833E-01  1.177E+00  1.699E+00  1.042E+01
  5.667E-01  1.667E-02  5.833E-01  1.177E+00  1.699E+00  1.042E+01
  5.700E-01  1.833E-02  5.883E-01  1.167E+00  1.714E+00  1.051E+01

 times for array with leading dimension of 200
  5.833E-01  1.667E-02  6.000E-01  1.144E+00  1.748E+00  1.071E+01
  5.833E-01  1.667E-02  6.000E-01  1.144E+00  1.748E+00  1.071E+01
  6.000E-01  1.667E-02  6.167E-01  1.114E+00  1.796E+00  1.101E+01
  5.917E-01  2.000E-02  6.117E-01  1.123E+00  1.782E+00  1.092E+01

level 3 optimization:

    norm. resid      resid           machep         x(1)          x(n)
 1.67005097E+00  7.41628980E-14  2.22044605E-16  1.00000000E+00  1.00000000E+00


    times are reported for matrices of order   100
      dgefa      dgesl      total     mflops       unit      ratio
 times for array with leading dimension of 201
  4.000E-01  0.000E+00  4.000E-01  1.717E+00  1.165E+00  7.143E+00
  3.833E-01  3.333E-02  4.167E-01  1.648E+00  1.214E+00  7.440E+00
  4.000E-01  0.000E+00  4.000E-01  1.717E+00  1.165E+00  7.143E+00
  3.933E-01  1.333E-02  4.067E-01  1.689E+00  1.184E+00  7.262E+00

 times for array with leading dimension of 200
  4.167E-01  1.667E-02  4.333E-01  1.585E+00  1.262E+00  7.738E+00
  4.333E-01  1.667E-02  4.500E-01  1.526E+00  1.311E+00  8.036E+00
  4.167E-01  1.667E-02  4.333E-01  1.585E+00  1.262E+00  7.738E+00
  4.267E-01  1.333E-02  4.400E-01  1.561E+00  1.282E+00  7.857E+00
===============================================================================
linpackd results for Sun 4-490/SunOS 4.1/Sun FORTRAN 1.3.1:

no optimization:

    norm. resid      resid           machep         x(1)          x(n)
 1.67005097E+00  7.41628980E-14  2.22044605E-16  1.00000000E+00  1.00000000E+00


    times are reported for matrices of order   100
      dgefa      dgesl      total     mflops       unit      ratio
 times for array with leading dimension of 201
  6.167E-01  1.667E-02  6.333E-01  1.084E+00  1.845E+00  1.131E+01
  6.000E-01  1.667E-02  6.167E-01  1.114E+00  1.796E+00  1.101E+01
  6.167E-01  1.667E-02  6.333E-01  1.084E+00  1.845E+00  1.131E+01
  6.117E-01  1.667E-02  6.283E-01  1.093E+00  1.830E+00  1.122E+01

 times for array with leading dimension of 200
  6.167E-01  1.667E-02  6.333E-01  1.084E+00  1.845E+00  1.131E+01
  6.000E-01  1.667E-02  6.167E-01  1.114E+00  1.796E+00  1.101E+01
  6.167E-01  1.667E-02  6.333E-01  1.084E+00  1.845E+00  1.131E+01
  6.200E-01  1.833E-02  6.383E-01  1.076E+00  1.859E+00  1.140E+01


level 1 optimization:

    norm. resid      resid           machep         x(1)          x(n)
 1.67005097E+00  7.41628980E-14  2.22044605E-16  1.00000000E+00  1.00000000E+00

    times are reported for matrices of order   100
      dgefa      dgesl      total     mflops       unit      ratio
 times for array with leading dimension of 201
  6.000E-01  1.667E-02  6.167E-01  1.114E+00  1.796E+00  1.101E+01
  6.000E-01  0.000E+00  6.000E-01  1.144E+00  1.748E+00  1.071E+01
  6.000E-01  1.667E-02  6.167E-01  1.114E+00  1.796E+00  1.101E+01
  6.000E-01  1.833E-02  6.183E-01  1.111E+00  1.801E+00  1.104E+01

 times for array with leading dimension of 200
  6.167E-01  1.667E-02  6.333E-01  1.084E+00  1.845E+00  1.131E+01
  6.167E-01  0.000E+00  6.167E-01  1.114E+00  1.796E+00  1.101E+01
  6.000E-01  1.667E-02  6.167E-01  1.114E+00  1.796E+00  1.101E+01
  6.050E-01  1.833E-02  6.233E-01  1.102E+00  1.816E+00  1.113E+01

level 2 optimization:

    norm. resid      resid           machep         x(1)          x(n)
 1.67005097E+00  7.41628980E-14  2.22044605E-16  1.00000000E+00  1.00000000E+00


    times are reported for matrices of order   100
      dgefa      dgesl      total     mflops       unit      ratio
 times for array with leading dimension of 201
  3.333E-01  1.667E-02  3.500E-01  1.962E+00  1.019E+00  6.250E+00
  3.167E-01  1.667E-02  3.333E-01  2.060E+00  9.709E-01  5.952E+00
  3.333E-01  0.000E+00  3.333E-01  2.060E+00  9.709E-01  5.952E+00
  3.300E-01  1.167E-02  3.417E-01  2.010E+00  9.951E-01  6.101E+00

 times for array with leading dimension of 200
  3.500E-01  0.000E+00  3.500E-01  1.962E+00  1.019E+00  6.250E+00
  3.333E-01  1.667E-02  3.500E-01  1.962E+00  1.019E+00  6.250E+00
  3.333E-01  1.667E-02  3.500E-01  1.962E+00  1.019E+00  6.250E+00
  3.400E-01  1.000E-02  3.500E-01  1.962E+00  1.019E+00  6.250E+00

level 3 optimization:

    norm. resid      resid           machep         x(1)          x(n)
 1.67005097E+00  7.41628980E-14  2.22044605E-16  1.00000000E+00  1.00000000E+00


    times are reported for matrices of order   100
      dgefa      dgesl      total     mflops       unit      ratio
 times for array with leading dimension of 201
  2.167E-01  0.000E+00  2.167E-01  3.169E+00  6.311E-01  3.869E+00
  2.167E-01  0.000E+00  2.167E-01  3.169E+00  6.311E-01  3.869E+00
  2.167E-01  0.000E+00  2.167E-01  3.169E+00  6.311E-01  3.869E+00
  2.150E-01  6.667E-03  2.217E-01  3.098E+00  6.456E-01  3.958E+00

 times for array with leading dimension of 200
  2.167E-01  0.000E+00  2.167E-01  3.169E+00  6.311E-01  3.869E+00
  2.167E-01  1.667E-02  2.333E-01  2.943E+00  6.796E-01  4.167E+00
  2.167E-01  0.000E+00  2.167E-01  3.169E+00  6.311E-01  3.869E+00
  2.150E-01  6.667E-03  2.217E-01  3.098E+00  6.456E-01  3.958E+00

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Tristan Nefzger
Tel.: 408-743-0220
Email: lad-shrike!leadsv!tn

"If you are who you think you are, then who is doing the thinking?"

   -Saying of you, or me...Well, whomever--it's still a saying.

bbc@rice.edu (Benjamin Chase) (12/06/90)

>The optimization levels used were
>(0) no optimization; (1) level 1 (-O1); (2) level 2 (-O2); and (3)
>level 3 (-O3).

>Mflops averages
>---------------
>optimization    SPARCstation 2      SPARCstation 1+      Sun 4/490
>   level        single  double      single  double      single  double
>                   average             average             average
>
>     0          1.7     1.3         1.0     0.7         1.4     1.1
>                    1.5                 .85                 1.2
>
>     1          1.7     1.3         1.0     0.7         1.4     1.1
>                    1.5                 .85                 1.2
>
>     2          3.7     2.3         2.1     1.1         2.9     2.0
>                    3.0                 1.6                 2.4
>
>     3          5.1     3.3         2.7     1.6         4.9     3.1
>                    4.2                 2.1                 4.0
>
>(All averages are arithmetic means.)

What I found interesting here was the small difference between
optimization level 0 and level 1.  My Sun f77 manual page says that
the difference between no optimization and -O1 is peephole
optimization.  What sort of peephole optimization are we doing?  Just
filling those delay slots?

Looking at some code generated from a small C program on my
SPARCstation 1, I see that no-ops are generated for all the delay
slots.  On a RISC, there's not much more to do at the peephole level,
if your code generator has half a brain.

Looking further, it seems that "as -O1" doesn't fill the delay slots
of branches either.  Very odd.  What sort of peephole optimization is
this?  The "as" manual page says that -O[n] "enables peephole
optimization corresponding to optimization level n (1 if n not
specified) of the Sun high-level language compilers".  There are
different levels of peephole optimization?  Different sizes of
peepholes, perhaps?

Perhaps Sun only does full-blown filling of delay slots, through a
large-scale (rather than peephole) analysis of the generated code?
Admittedly, this elephant gun approach is necessary to fill those
hard-to-fill slots (i.e. when you're turning on the "annul" bit of
the branch, inhibiting execution of the instruction in the delay slot
when the branch is not taken).  And if you've got the elephant gun
approach working, why let a popgun (i.e. peephole optimizer) look for
the easy marks?
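
To make the "easy marks" concrete, here is a toy sketch of the popgun
case: moving the instruction just ahead of a branch into that branch's
nop-filled delay slot.  It is not Sun's optimizer or anything derived
from it.  The instruction model, its field names and both function
names are hypothetical, and the only dependence it checks is on the
condition codes.

```c
#include <assert.h>

/* Toy instruction model; every field here is hypothetical. */
enum kind { OP_ALU, OP_BRANCH, OP_NOP };

struct insn {
    enum kind kind;
    int sets_cc;    /* writes the condition codes */
    int uses_cc;    /* branch tests the condition codes */
    int is_target;  /* some branch jumps to this instruction */
    int id;         /* tag so a caller can follow the reordering */
};

/* May code[i-1] move into the nop-filled delay slot of the branch at code[i]? */
static int can_fill(const struct insn *code, int n, int i)
{
    if (i < 1 || i + 1 >= n) return 0;
    if (code[i].kind != OP_BRANCH || code[i + 1].kind != OP_NOP) return 0;
    if (code[i - 1].kind != OP_ALU) return 0;
    if (code[i - 1].sets_cc && code[i].uses_cc) return 0;   /* branch needs its result */
    if (code[i - 1].is_target || code[i].is_target || code[i + 1].is_target)
        return 0;                                           /* leave labelled code alone */
    if (i >= 2 && code[i - 2].kind == OP_BRANCH) return 0;  /* candidate is itself a delay slot */
    return 1;
}

/* Fills what it can, in place, and returns the new instruction count. */
int fill_delay_slots(struct insn *code, int n)
{
    int i, j;
    struct insn moved;

    assert(code != 0 && n >= 0);
    for (i = 1; i + 1 < n; i++) {
        if (!can_fill(code, n, i)) continue;
        moved = code[i - 1];
        code[i - 1] = code[i];      /* branch moves up one place */
        code[i] = moved;            /* candidate now sits in the delay slot */
        for (j = i + 1; j + 1 < n; j++)
            code[j] = code[j + 1];  /* drop the nop */
        n--;
    }
    return n;
}
```

A compare immediately ahead of a conditional branch is rejected here,
which is the kind of slot that needs either a wider search or the
annul bit.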

Looks like I need to teach my cute SPARC disassembler to use symbolic
labels for branch targets, so I can get a meaningful diff between
disassembled versions of each flavor of code, to actually see what the
peephole optimizer is or isn't doing.

Any followup to this post probably needs to go somewhere
other than comp.benchmarks, though I don't know which other group to
pick.  I seem to have wandered into the land of instruction
scheduling and SPARC assembly language...
--
	Ben Chase <bbc@rice.edu>, Rice University, Houston, Texas