tn@leadsv.UUCP (Tristan Nefzger) (12/05/90)
Jack Dongarra's* single and double precision linpack programs
(linpacks and linpackd**) were compiled at several optimization levels
and run on the SPARCstation 2 (Calvin), SPARCstation 1+ and the Sun
4/490. Both the SPARCstations were running SunOS 4.1.1 Beta while the
4/490 was running SunOS 4.1. The compilers used were Sun's FORTRAN
1.3.1 and the Sun C compiler bundled with SunOS 4.1.1 Beta. The
"-Bstatic" option was used in all cases to ensure compile-time
(static) linking with libraries. The optimization levels used were
(0) no optimization; (1) level 1 (-O1); (2) level 2 (-O2); and (3)
level 3 (-O3). The C compiler was used only to compile a module
encapsulating the SunOS times() system call for measurement of (user)
time. This C module and its FORTRAN interface, second.f, which
defines the second() function in the linpack programs, were supplied
by Paul Hansen, and are included below. Results are based on the last
of three consecutive executions of each appropriately compiled linpack
program, the output of which is also included below. Oddly enough,
the (arithmetic) average of the single and double precision Mflops
for the SPARCstation 2 at level 3 optimization is 4.2, which is
exactly what Sun claims it to be in SunFLASH Vol 23 #6, "SPARCstation 2
Family," November 1990 (distributed via email from Sun).

                        Mflops averages
                        ---------------

optimization   SPARCstation 2     SPARCstation 1       Sun4/490
   level       single  double     single  double     single  double
                  average            average            average

     0          1.7     1.3        1.0     0.7        1.4     1.1
                    1.5                .85                1.2

     1          1.7     1.3        1.0     0.7        1.4     1.1
                    1.5                .85                1.2

     2          3.7     2.3        2.1     1.1        2.9     2.0
                    3.0                1.6                2.4

     3          5.1     3.3        2.7     1.6        4.9     3.1
                    4.2                2.1                4.0

(All averages are arithmetic means.)
* Jack J. Dongarra
Computer Science Department
University of Tennessee
Knoxville, Tennessee 37996-1300
Fax: 615-974-8296
Internet: dongarra@cs.utk.edu
**Available from netlibd@surfer.epm.ornl.gov via email with the respective
subject lines:
(1) send linpacks from benchmark
(2) send linpackd from benchmark
===============================================================================
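/* mclock.c
 *
 * module written by P. Hansen
 * encapsulates SunOS times() call
 *
 */
#include <sys/types.h>
#include <sys/times.h>

/* Returns the user CPU time of the calling process in clock ticks.
 * The trailing underscore matches the external name f77 uses for
 * the FORTRAN reference mclock(). */
long mclock_()
{
    struct tms buf;

    times(&buf);
    return (long) buf.tms_utime;   /* first field, formerly buf[0] */
}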
===============================================================================
c second.f
c module written by P.Hansen
c calls mclock()
      real function second(t)
c
c this routine will gather the user time for a process.
c it has resolution of 1/60 of a second
c and uses the unix system call times.
c see the unix manual for details
c reports time in seconds.
c
      itime = mclock(i)
      second = float(itime)/60.
c
c this statement is here to bump the time by a bit
c in case the interval was too small.
c
      second = second + second*1.0e-6
      return
      end
===============================================================================
linpacks results for SPARCstation 2/SunOS 4.1.1 Beta/Sun FORTRAN 1.3.1:
no optimization:
norm. resid resid machep x(1) x(n)
1.59498179E+00 3.80277634E-05 1.19209290E-07 9.99986172E-01 9.99992490E-01
times are reported for matrices of order 100
sgefa sgesl total mflops unit ratio
times for array with leading dimension of 201
3.833E-01 1.667E-02 4.000E-01 1.717E+00 1.165E+00 7.143E+00
4.000E-01 0.000E+00 4.000E-01 1.717E+00 1.165E+00 7.143E+00
3.833E-01 1.667E-02 4.000E-01 1.717E+00 1.165E+00 7.143E+00
3.917E-01 1.167E-02 4.033E-01 1.702E+00 1.175E+00 7.202E+00
times for array with leading dimension of 200
3.833E-01 1.667E-02 4.000E-01 1.717E+00 1.165E+00 7.143E+00
4.000E-01 1.667E-02 4.167E-01 1.648E+00 1.214E+00 7.440E+00
3.833E-01 1.667E-02 4.000E-01 1.717E+00 1.165E+00 7.143E+00
3.967E-01 1.000E-02 4.067E-01 1.689E+00 1.184E+00 7.262E+00
level 1 optimization:
norm. resid resid machep x(1) x(n)
1.59498179E+00 3.80277634E-05 1.19209290E-07 9.99986172E-01 9.99992490E-01
times are reported for matrices of order 100
sgefa sgesl total mflops unit ratio
times for array with leading dimension of 201
4.000E-01 1.667E-02 4.167E-01 1.648E+00 1.214E+00 7.440E+00
3.833E-01 1.667E-02 4.000E-01 1.717E+00 1.165E+00 7.143E+00
3.833E-01 0.000E+00 3.833E-01 1.791E+00 1.117E+00 6.845E+00
3.800E-01 1.000E-02 3.900E-01 1.761E+00 1.136E+00 6.964E+00
times for array with leading dimension of 200
3.833E-01 1.667E-02 4.000E-01 1.717E+00 1.165E+00 7.143E+00
3.833E-01 1.667E-02 4.000E-01 1.717E+00 1.165E+00 7.143E+00
3.667E-01 3.333E-02 4.000E-01 1.717E+00 1.165E+00 7.143E+00
3.867E-01 1.167E-02 3.983E-01 1.724E+00 1.160E+00 7.113E+00
level 2 optimization:
norm. resid resid machep x(1) x(n)
1.59498179E+00 3.80277634E-05 1.19209290E-07 9.99986172E-01 9.99992490E-01
times are reported for matrices of order 100
sgefa sgesl total mflops unit ratio
times for array with leading dimension of 201
2.000E-01 0.000E+00 2.000E-01 3.433E+00 5.825E-01 3.571E+00
1.833E-01 0.000E+00 1.833E-01 3.745E+00 5.340E-01 3.274E+00
1.833E-01 0.000E+00 1.833E-01 3.745E+00 5.340E-01 3.274E+00
1.850E-01 5.000E-03 1.900E-01 3.614E+00 5.534E-01 3.393E+00
times for array with leading dimension of 200
1.833E-01 0.000E+00 1.833E-01 3.745E+00 5.340E-01 3.274E+00
1.833E-01 0.000E+00 1.833E-01 3.745E+00 5.340E-01 3.274E+00
1.667E-01 0.000E+00 1.667E-01 4.120E+00 4.854E-01 2.976E+00
1.800E-01 5.000E-03 1.850E-01 3.712E+00 5.388E-01 3.304E+00
level 3 optimization:
norm. resid resid machep x(1) x(n)
1.59498179E+00 3.80277634E-05 1.19209290E-07 9.99986172E-01 9.99992490E-01
times are reported for matrices of order 100
sgefa sgesl total mflops unit ratio
times for array with leading dimension of 201
1.333E-01 0.000E+00 1.333E-01 5.150E+00 3.883E-01 2.381E+00
1.333E-01 0.000E+00 1.333E-01 5.150E+00 3.883E-01 2.381E+00
1.333E-01 0.000E+00 1.333E-01 5.150E+00 3.883E-01 2.381E+00
1.267E-01 5.000E-03 1.317E-01 5.215E+00 3.835E-01 2.351E+00
times for array with leading dimension of 200
1.333E-01 0.000E+00 1.333E-01 5.150E+00 3.883E-01 2.381E+00
1.333E-01 0.000E+00 1.333E-01 5.150E+00 3.883E-01 2.381E+00
1.333E-01 0.000E+00 1.333E-01 5.150E+00 3.883E-01 2.381E+00
1.300E-01 5.000E-03 1.350E-01 5.086E+00 3.932E-01 2.411E+00
===============================================================================
linpacks results for SPARCstation 1+/SunOS 4.1.1 Beta/Sun FORTRAN 1.3.1:
no optimization:
norm. resid resid machep x(1) x(n)
1.59498179E+00 3.80277634E-05 1.19209290E-07 9.99986172E-01 9.99992490E-01
times are reported for matrices of order 100
sgefa sgesl total mflops unit ratio
times for array with leading dimension of 201
6.500E-01 1.667E-02 6.667E-01 1.030E+00 1.942E+00 1.190E+01
6.500E-01 3.333E-02 6.833E-01 1.005E+00 1.990E+00 1.220E+01
6.667E-01 1.667E-02 6.833E-01 1.005E+00 1.990E+00 1.220E+01
6.617E-01 2.000E-02 6.817E-01 1.007E+00 1.985E+00 1.217E+01
times for array with leading dimension of 200
6.667E-01 1.667E-02 6.833E-01 1.005E+00 1.990E+00 1.220E+01
6.500E-01 1.667E-02 6.667E-01 1.030E+00 1.942E+00 1.190E+01
6.500E-01 3.333E-02 6.833E-01 1.005E+00 1.990E+00 1.220E+01
6.600E-01 2.000E-02 6.800E-01 1.010E+00 1.981E+00 1.214E+01
level 1 optimization:
norm. resid resid machep x(1) x(n)
1.59498179E+00 3.80277634E-05 1.19209290E-07 9.99986172E-01 9.99992490E-01
times are reported for matrices of order 100
sgefa sgesl total mflops unit ratio
times for array with leading dimension of 201
6.333E-01 1.667E-02 6.500E-01 1.056E+00 1.893E+00 1.161E+01
6.333E-01 1.667E-02 6.500E-01 1.056E+00 1.893E+00 1.161E+01
6.500E-01 1.667E-02 6.667E-01 1.030E+00 1.942E+00 1.190E+01
6.350E-01 2.000E-02 6.550E-01 1.048E+00 1.908E+00 1.170E+01
times for array with leading dimension of 200
6.333E-01 1.667E-02 6.500E-01 1.056E+00 1.893E+00 1.161E+01
6.500E-01 1.667E-02 6.667E-01 1.030E+00 1.942E+00 1.190E+01
6.500E-01 1.667E-02 6.667E-01 1.030E+00 1.942E+00 1.190E+01
6.433E-01 2.000E-02 6.633E-01 1.035E+00 1.932E+00 1.185E+01
level 2 optimization:
norm. resid resid machep x(1) x(n)
1.59498179E+00 3.80277634E-05 1.19209290E-07 9.99986172E-01 9.99992490E-01
times are reported for matrices of order 100
sgefa sgesl total mflops unit ratio
times for array with leading dimension of 201
3.000E-01 1.667E-02 3.167E-01 2.168E+00 9.223E-01 5.655E+00
3.167E-01 0.000E+00 3.167E-01 2.168E+00 9.223E-01 5.655E+00
3.167E-01 0.000E+00 3.167E-01 2.168E+00 9.223E-01 5.655E+00
3.200E-01 1.000E-02 3.300E-01 2.081E+00 9.612E-01 5.893E+00
times for array with leading dimension of 200
3.333E-01 0.000E+00 3.333E-01 2.060E+00 9.709E-01 5.952E+00
3.167E-01 1.667E-02 3.333E-01 2.060E+00 9.709E-01 5.952E+00
3.167E-01 1.667E-02 3.333E-01 2.060E+00 9.709E-01 5.952E+00
3.183E-01 1.000E-02 3.283E-01 2.091E+00 9.563E-01 5.863E+00
level 3 optimization:
norm. resid resid machep x(1) x(n)
1.59498179E+00 3.80277634E-05 1.19209290E-07 9.99986172E-01 9.99992490E-01
times are reported for matrices of order 100
sgefa sgesl total mflops unit ratio
times for array with leading dimension of 201
2.333E-01 1.667E-02 2.500E-01 2.747E+00 7.282E-01 4.464E+00
2.333E-01 1.667E-02 2.500E-01 2.747E+00 7.282E-01 4.464E+00
2.500E-01 0.000E+00 2.500E-01 2.747E+00 7.282E-01 4.464E+00
2.383E-01 6.667E-03 2.450E-01 2.803E+00 7.136E-01 4.375E+00
times for array with leading dimension of 200
2.333E-01 1.667E-02 2.500E-01 2.747E+00 7.282E-01 4.464E+00
2.500E-01 0.000E+00 2.500E-01 2.747E+00 7.282E-01 4.464E+00
2.500E-01 0.000E+00 2.500E-01 2.747E+00 7.282E-01 4.464E+00
2.450E-01 6.667E-03 2.517E-01 2.728E+00 7.330E-01 4.494E+00
===============================================================================
linpacks results for Sun 4-490/SunOS 4.1/Sun FORTRAN 1.3.1:
no optimization:
norm. resid resid machep x(1) x(n)
1.59498179E+00 3.80277634E-05 1.19209290E-07 9.99986172E-01 9.99992490E-01
times are reported for matrices of order 100
sgefa sgesl total mflops unit ratio
times for array with leading dimension of 201
5.000E-01 0.000E+00 5.000E-01 1.373E+00 1.456E+00 8.929E+00
4.833E-01 3.333E-02 5.167E-01 1.329E+00 1.505E+00 9.226E+00
4.833E-01 3.333E-02 5.167E-01 1.329E+00 1.505E+00 9.226E+00
4.950E-01 1.500E-02 5.100E-01 1.346E+00 1.485E+00 9.107E+00
times for array with leading dimension of 200
5.000E-01 0.000E+00 5.000E-01 1.373E+00 1.456E+00 8.929E+00
5.000E-01 0.000E+00 5.000E-01 1.373E+00 1.456E+00 8.929E+00
4.833E-01 1.667E-02 5.000E-01 1.373E+00 1.456E+00 8.929E+00
4.933E-01 1.500E-02 5.083E-01 1.351E+00 1.481E+00 9.077E+00
level 1 optimization:
norm. resid resid machep x(1) x(n)
1.59498179E+00 3.80277634E-05 1.19209290E-07 9.99986172E-01 9.99992490E-01
times are reported for matrices of order 100
sgefa sgesl total mflops unit ratio
times for array with leading dimension of 201
4.833E-01 1.667E-02 5.000E-01 1.373E+00 1.456E+00 8.929E+00
4.833E-01 0.000E+00 4.833E-01 1.421E+00 1.408E+00 8.631E+00
4.833E-01 1.667E-02 5.000E-01 1.373E+00 1.456E+00 8.929E+00
4.817E-01 1.333E-02 4.950E-01 1.387E+00 1.442E+00 8.839E+00
times for array with leading dimension of 200
4.667E-01 0.000E+00 4.667E-01 1.471E+00 1.359E+00 8.333E+00
4.833E-01 1.667E-02 5.000E-01 1.373E+00 1.456E+00 8.929E+00
4.833E-01 1.667E-02 5.000E-01 1.373E+00 1.456E+00 8.929E+00
4.800E-01 1.333E-02 4.933E-01 1.392E+00 1.437E+00 8.810E+00
level 2 optimization:
norm. resid resid machep x(1) x(n)
1.59498179E+00 3.80277634E-05 1.19209290E-07 9.99986172E-01 9.99992490E-01
times are reported for matrices of order 100
sgefa sgesl total mflops unit ratio
times for array with leading dimension of 201
2.333E-01 0.000E+00 2.333E-01 2.943E+00 6.796E-01 4.167E+00
2.167E-01 1.667E-02 2.333E-01 2.943E+00 6.796E-01 4.167E+00
2.167E-01 1.667E-02 2.333E-01 2.943E+00 6.796E-01 4.167E+00
2.233E-01 8.333E-03 2.317E-01 2.964E+00 6.748E-01 4.137E+00
times for array with leading dimension of 200
2.167E-01 1.667E-02 2.333E-01 2.943E+00 6.796E-01 4.167E+00
2.167E-01 1.667E-02 2.333E-01 2.943E+00 6.796E-01 4.167E+00
2.333E-01 0.000E+00 2.333E-01 2.943E+00 6.796E-01 4.167E+00
2.233E-01 6.667E-03 2.300E-01 2.986E+00 6.699E-01 4.107E+00
level 3 optimization:
norm. resid resid machep x(1) x(n)
1.59498179E+00 3.80277634E-05 1.19209290E-07 9.99986172E-01 9.99992490E-01
times are reported for matrices of order 100
sgefa sgesl total mflops unit ratio
times for array with leading dimension of 201
1.500E-01 0.000E+00 1.500E-01 4.578E+00 4.369E-01 2.679E+00
1.333E-01 0.000E+00 1.333E-01 5.150E+00 3.883E-01 2.381E+00
1.500E-01 0.000E+00 1.500E-01 4.578E+00 4.369E-01 2.679E+00
1.367E-01 5.000E-03 1.417E-01 4.847E+00 4.126E-01 2.530E+00
times for array with leading dimension of 200
1.333E-01 0.000E+00 1.333E-01 5.150E+00 3.883E-01 2.381E+00
1.500E-01 0.000E+00 1.500E-01 4.578E+00 4.369E-01 2.679E+00
1.500E-01 0.000E+00 1.500E-01 4.578E+00 4.369E-01 2.679E+00
1.417E-01 3.333E-03 1.450E-01 4.736E+00 4.223E-01 2.589E+00
===============================================================================
linpackd results for SPARCstation 2/SunOS 4.1.1 Beta/Sun FORTRAN 1.3.1:
no optimization:
norm. resid resid machep x(1) x(n)
1.67005097E+00 7.41628980E-14 2.22044605E-16 1.00000000E+00 1.00000000E+00
times are reported for matrices of order 100
dgefa dgesl total mflops unit ratio
times for array with leading dimension of 201
5.167E-01 1.667E-02 5.333E-01 1.287E+00 1.553E+00 9.524E+00
5.167E-01 1.667E-02 5.333E-01 1.287E+00 1.553E+00 9.524E+00
5.167E-01 1.667E-02 5.333E-01 1.287E+00 1.553E+00 9.524E+00
5.083E-01 1.500E-02 5.233E-01 1.312E+00 1.524E+00 9.345E+00
times for array with leading dimension of 200
5.333E-01 0.000E+00 5.333E-01 1.287E+00 1.553E+00 9.524E+00
5.833E-01 1.667E-02 6.000E-01 1.144E+00 1.748E+00 1.071E+01
5.667E-01 0.000E+00 5.667E-01 1.212E+00 1.650E+00 1.012E+01
5.333E-01 1.500E-02 5.483E-01 1.252E+00 1.597E+00 9.792E+00
level 1 optimization:
times are reported for matrices of order 100
dgefa dgesl total mflops unit ratio
times for array with leading dimension of 201
5.167E-01 0.000E+00 5.167E-01 1.329E+00 1.505E+00 9.226E+00
5.000E-01 1.667E-02 5.167E-01 1.329E+00 1.505E+00 9.226E+00
5.000E-01 0.000E+00 5.000E-01 1.373E+00 1.456E+00 8.929E+00
5.000E-01 1.667E-02 5.167E-01 1.329E+00 1.505E+00 9.226E+00
times for array with leading dimension of 200
5.167E-01 1.667E-02 5.333E-01 1.288E+00 1.553E+00 9.524E+00
5.333E-01 0.000E+00 5.333E-01 1.287E+00 1.553E+00 9.524E+00
5.500E-01 1.667E-02 5.667E-01 1.212E+00 1.650E+00 1.012E+01
5.417E-01 1.500E-02 5.567E-01 1.234E+00 1.621E+00 9.940E+00
level 2 optimization:
norm. resid resid machep x(1) x(n)
1.67005097E+00 7.41628980E-14 2.22044605E-16 1.00000000E+00 1.00000000E+00
times are reported for matrices of order 100
dgefa dgesl total mflops unit ratio
times for array with leading dimension of 201
2.667E-01 0.000E+00 2.667E-01 2.575E+00 7.767E-01 4.762E+00
2.833E-01 1.667E-02 3.000E-01 2.289E+00 8.738E-01 5.357E+00
2.500E-01 1.667E-02 2.667E-01 2.575E+00 7.767E-01 4.762E+00
2.633E-01 1.000E-02 2.733E-01 2.512E+00 7.961E-01 4.881E+00
times for array with leading dimension of 200
2.833E-01 1.667E-02 3.000E-01 2.289E+00 8.738E-01 5.357E+00
2.833E-01 1.667E-02 3.000E-01 2.289E+00 8.738E-01 5.357E+00
2.833E-01 1.667E-02 3.000E-01 2.289E+00 8.738E-01 5.357E+00
2.983E-01 1.000E-02 3.083E-01 2.227E+00 8.981E-01 5.506E+00
level 3 optimization:
norm. resid resid machep x(1) x(n)
1.67005097E+00 7.41628980E-14 2.22044605E-16 1.00000000E+00 1.00000000E+00
times are reported for matrices of order 100
dgefa dgesl total mflops unit ratio
times for array with leading dimension of 201
2.000E-01 0.000E+00 2.000E-01 3.433E+00 5.825E-01 3.571E+00
1.833E-01 1.667E-02 2.000E-01 3.433E+00 5.825E-01 3.571E+00
2.000E-01 0.000E+00 2.000E-01 3.433E+00 5.825E-01 3.571E+00
1.983E-01 6.667E-03 2.050E-01 3.350E+00 5.971E-01 3.661E+00
times for array with leading dimension of 200
2.167E-01 0.000E+00 2.167E-01 3.169E+00 6.311E-01 3.869E+00
2.167E-01 0.000E+00 2.167E-01 3.169E+00 6.311E-01 3.869E+00
2.167E-01 0.000E+00 2.167E-01 3.169E+00 6.311E-01 3.869E+00
2.217E-01 6.667E-03 2.283E-01 3.007E+00 6.650E-01 4.077E+00
===============================================================================
linpackd results for SPARCstation 1+/SunOS 4.1.1 Beta/Sun FORTRAN 1.3.1:
no optimization:
norm. resid resid machep x(1) x(n)
1.67005097E+00 7.41628980E-14 2.22044605E-16 1.00000000E+00 1.00000000E+00
times are reported for matrices of order 100
dgefa dgesl total mflops unit ratio
times for array with leading dimension of 201
9.333E-01 1.667E-02 9.500E-01 7.228E-01 2.767E+00 1.696E+01
9.333E-01 1.667E-02 9.500E-01 7.228E-01 2.767E+00 1.696E+01
9.333E-01 1.667E-02 9.500E-01 7.228E-01 2.767E+00 1.696E+01
9.350E-01 2.833E-02 9.633E-01 7.128E-01 2.806E+00 1.720E+01
times for array with leading dimension of 200
9.667E-01 3.333E-02 1.000E+00 6.867E-01 2.913E+00 1.786E+01
9.500E-01 3.333E-02 9.833E-01 6.983E-01 2.864E+00 1.756E+01
9.833E-01 1.667E-02 1.000E+00 6.867E-01 2.913E+00 1.786E+01
9.650E-01 3.000E-02 9.950E-01 6.901E-01 2.898E+00 1.777E+01
level 1 optimization:
norm. resid resid machep x(1) x(n)
1.67005097E+00 7.41628980E-14 2.22044605E-16 1.00000000E+00 1.00000000E+00
times are reported for matrices of order 100
dgefa dgesl total mflops unit ratio
times for array with leading dimension of 201
9.167E-01 1.667E-02 9.333E-01 7.357E-01 2.718E+00 1.667E+01
9.333E-01 1.667E-02 9.500E-01 7.228E-01 2.767E+00 1.696E+01
9.333E-01 3.333E-02 9.667E-01 7.103E-01 2.816E+00 1.726E+01
9.183E-01 2.833E-02 9.467E-01 7.254E-01 2.757E+00 1.690E+01
times for array with leading dimension of 200
9.500E-01 3.333E-02 9.833E-01 6.983E-01 2.864E+00 1.756E+01
9.500E-01 3.333E-02 9.833E-01 6.983E-01 2.864E+00 1.756E+01
9.333E-01 3.333E-02 9.667E-01 7.103E-01 2.816E+00 1.726E+01
9.517E-01 3.000E-02 9.817E-01 6.995E-01 2.859E+00 1.753E+01
level 2 optimization:
norm. resid resid machep x(1) x(n)
1.67005097E+00 7.41628980E-14 2.22044605E-16 1.00000000E+00 1.00000000E+00
times are reported for matrices of order 100
dgefa dgesl total mflops unit ratio
times for array with leading dimension of 201
5.667E-01 3.333E-02 6.000E-01 1.144E+00 1.748E+00 1.071E+01
5.667E-01 1.667E-02 5.833E-01 1.177E+00 1.699E+00 1.042E+01
5.667E-01 1.667E-02 5.833E-01 1.177E+00 1.699E+00 1.042E+01
5.700E-01 1.833E-02 5.883E-01 1.167E+00 1.714E+00 1.051E+01
times for array with leading dimension of 200
5.833E-01 1.667E-02 6.000E-01 1.144E+00 1.748E+00 1.071E+01
5.833E-01 1.667E-02 6.000E-01 1.144E+00 1.748E+00 1.071E+01
6.000E-01 1.667E-02 6.167E-01 1.114E+00 1.796E+00 1.101E+01
5.917E-01 2.000E-02 6.117E-01 1.123E+00 1.782E+00 1.092E+01
level 3 optimization:
norm. resid resid machep x(1) x(n)
1.67005097E+00 7.41628980E-14 2.22044605E-16 1.00000000E+00 1.00000000E+00
times are reported for matrices of order 100
dgefa dgesl total mflops unit ratio
times for array with leading dimension of 201
4.000E-01 0.000E+00 4.000E-01 1.717E+00 1.165E+00 7.143E+00
3.833E-01 3.333E-02 4.167E-01 1.648E+00 1.214E+00 7.440E+00
4.000E-01 0.000E+00 4.000E-01 1.717E+00 1.165E+00 7.143E+00
3.933E-01 1.333E-02 4.067E-01 1.689E+00 1.184E+00 7.262E+00
times for array with leading dimension of 200
4.167E-01 1.667E-02 4.333E-01 1.585E+00 1.262E+00 7.738E+00
4.333E-01 1.667E-02 4.500E-01 1.526E+00 1.311E+00 8.036E+00
4.167E-01 1.667E-02 4.333E-01 1.585E+00 1.262E+00 7.738E+00
4.267E-01 1.333E-02 4.400E-01 1.561E+00 1.282E+00 7.857E+00
===============================================================================
linpackd results for Sun 4-490/SunOS 4.1/Sun FORTRAN 1.3.1:
no optimization:
norm. resid resid machep x(1) x(n)
1.67005097E+00 7.41628980E-14 2.22044605E-16 1.00000000E+00 1.00000000E+00
times are reported for matrices of order 100
dgefa dgesl total mflops unit ratio
times for array with leading dimension of 201
6.167E-01 1.667E-02 6.333E-01 1.084E+00 1.845E+00 1.131E+01
6.000E-01 1.667E-02 6.167E-01 1.114E+00 1.796E+00 1.101E+01
6.167E-01 1.667E-02 6.333E-01 1.084E+00 1.845E+00 1.131E+01
6.117E-01 1.667E-02 6.283E-01 1.093E+00 1.830E+00 1.122E+01
times for array with leading dimension of 200
6.167E-01 1.667E-02 6.333E-01 1.084E+00 1.845E+00 1.131E+01
6.000E-01 1.667E-02 6.167E-01 1.114E+00 1.796E+00 1.101E+01
6.167E-01 1.667E-02 6.333E-01 1.084E+00 1.845E+00 1.131E+01
6.200E-01 1.833E-02 6.383E-01 1.076E+00 1.859E+00 1.140E+01
level 1 optimization:
norm. resid resid machep x(1) x(n)
1.67005097E+00 7.41628980E-14 2.22044605E-16 1.00000000E+00 1.00000000E+00
times are reported for matrices of order 100
dgefa dgesl total mflops unit ratio
times for array with leading dimension of 201
6.000E-01 1.667E-02 6.167E-01 1.114E+00 1.796E+00 1.101E+01
6.000E-01 0.000E+00 6.000E-01 1.144E+00 1.748E+00 1.071E+01
6.000E-01 1.667E-02 6.167E-01 1.114E+00 1.796E+00 1.101E+01
6.000E-01 1.833E-02 6.183E-01 1.111E+00 1.801E+00 1.104E+01
times for array with leading dimension of 200
6.167E-01 1.667E-02 6.333E-01 1.084E+00 1.845E+00 1.131E+01
6.167E-01 0.000E+00 6.167E-01 1.114E+00 1.796E+00 1.101E+01
6.000E-01 1.667E-02 6.167E-01 1.114E+00 1.796E+00 1.101E+01
6.050E-01 1.833E-02 6.233E-01 1.102E+00 1.816E+00 1.113E+01
level 2 optimization:
norm. resid resid machep x(1) x(n)
1.67005097E+00 7.41628980E-14 2.22044605E-16 1.00000000E+00 1.00000000E+00
times are reported for matrices of order 100
dgefa dgesl total mflops unit ratio
times for array with leading dimension of 201
3.333E-01 1.667E-02 3.500E-01 1.962E+00 1.019E+00 6.250E+00
3.167E-01 1.667E-02 3.333E-01 2.060E+00 9.709E-01 5.952E+00
3.333E-01 0.000E+00 3.333E-01 2.060E+00 9.709E-01 5.952E+00
3.300E-01 1.167E-02 3.417E-01 2.010E+00 9.951E-01 6.101E+00
times for array with leading dimension of 200
3.500E-01 0.000E+00 3.500E-01 1.962E+00 1.019E+00 6.250E+00
3.333E-01 1.667E-02 3.500E-01 1.962E+00 1.019E+00 6.250E+00
3.333E-01 1.667E-02 3.500E-01 1.962E+00 1.019E+00 6.250E+00
3.400E-01 1.000E-02 3.500E-01 1.962E+00 1.019E+00 6.250E+00
level 3 optimization:
norm. resid resid machep x(1) x(n)
1.67005097E+00 7.41628980E-14 2.22044605E-16 1.00000000E+00 1.00000000E+00
times are reported for matrices of order 100
dgefa dgesl total mflops unit ratio
times for array with leading dimension of 201
2.167E-01 0.000E+00 2.167E-01 3.169E+00 6.311E-01 3.869E+00
2.167E-01 0.000E+00 2.167E-01 3.169E+00 6.311E-01 3.869E+00
2.167E-01 0.000E+00 2.167E-01 3.169E+00 6.311E-01 3.869E+00
2.150E-01 6.667E-03 2.217E-01 3.098E+00 6.456E-01 3.958E+00
times for array with leading dimension of 200
2.167E-01 0.000E+00 2.167E-01 3.169E+00 6.311E-01 3.869E+00
2.167E-01 1.667E-02 2.333E-01 2.943E+00 6.796E-01 4.167E+00
2.167E-01 0.000E+00 2.167E-01 3.169E+00 6.311E-01 3.869E+00
2.150E-01 6.667E-03 2.217E-01 3.098E+00 6.456E-01 3.958E+00
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Tristan Nefzger
Tel.: 408-743-0220
Email: lad-shrike!leadsv!tn
"If you are who you think you are, then who is doing the thinking?"
-Saying of you, or me...Well, whomever--it's still a saying.
bbc@rice.edu (Benjamin Chase) (12/06/90)
>The optimization levels used were
>(0) no optimization; (1) level 1 (-O1); (2) level 2 (-O2); and (3)
>level 3 (-O3).

>                        Mflops averages
>                        ---------------
>optimization   SPARCstation 2     SPARCstation 1       Sun4/490
>   level       single  double     single  double     single  double
>                  average            average            average
>
>     0          1.7     1.3        1.0     0.7        1.4     1.1
>                    1.5                .85                1.2
>
>     1          1.7     1.3        1.0     0.7        1.4     1.1
>                    1.5                .85                1.2
>
>     2          3.7     2.3        2.1     1.1        2.9     2.0
>                    3.0                1.6                2.4
>
>     3          5.1     3.3        2.7     1.6        4.9     3.1
>                    4.2                2.1                4.0
>
>(All averages are arithmetic means.)

What I found interesting here was the small difference between
optimization level 0 and level 1. My Sun f77 manual page says that the
difference between no optimization and -O1 is peephole optimization.
What sort of peephole optimization are we doing? Just filling those
delay slots? Looking at the code generated from a small C program on my
SPARCstation 1, I see that no-ops are emitted in all the delay slots.
On a RISC, there's not much more to do at the peephole level, if your
code generator has half a brain.

Looking further, it seems that "as -O1" doesn't fill the delay slots of
branches either. Very odd. What sort of peephole optimization is this?
The "as" manual page says that -O[n] "enables peephole optimization
corresponding to optimization level n (1 if n not specified) of the Sun
high-level language compilers". There are different levels of peephole
optimization? Different sizes of peepholes, perhaps?

Perhaps Sun only does full-blown filling of delay slots, through a
large-scale (rather than peephole) analysis of the generated code?
Admittedly, this elephant gun approach is necessary to fill those
hard-to-fill slots (i.e., when you'd be turning on the "annul" bit of
the branch, inhibiting execution of the instruction in the delay slot
when the branch is not taken). And if you've got the elephant gun
approach working, why let a popgun (i.e., a peephole optimizer) look
for the easy marks?

Looks like I need to teach my cute SPARC disassembler to use symbolic
labels for branch targets, so I can get a meaningful diff between
disassembled versions of each flavor of code, to actually see what the
peephole optimizer is or isn't doing.

I suspect any followup to this post probably needs to go somewhere
other than comp.benchmarks, though I don't know which other group to
pick. I seem to have wandered into the land of instruction scheduling
and SPARC assembly language...
--
Ben Chase <bbc@rice.edu>, Rice University, Houston, Texas