[comp.research.japan] Kahaner Report: SX3 Benchmarks from Swiss team tests.

rick@cs.arizona.edu (Rick Schlichting) (03/25/91)

  [Dr. David Kahaner is a numerical analyst visiting Japan for two years
   under the auspices of the Office of Naval Research-Asia (ONR/Asia).  
   The following is the professional opinion of David Kahaner and in no 
   way has the blessing of the US Government or any agency of it.  All 
   information is dated and of limited lifetime.  This disclaimer should 
   be noted on ANY attribution.]

  [Copies of previous reports written by Kahaner can be obtained from
   host cs.arizona.edu using anonymous FTP.]

To: Distribution
From: David Kahaner ONR Asia [kahaner@xroads.cc.u-tokyo.ac.jp]
Re: SX3 Benchmarks from Swiss team tests.
24 March 1991

ABSTRACT. Benchmarks from Swiss team tests on NEC SX-3/12 & /14, made 
          August and December 1990.

Several times over the past year, 
          Dr. Armin Friedli
          Interdisciplinary Project for Supercomputing (IPS)
          ETH-Zentrum
          Fliederstrasse
          CH-8092, Zurich
          Switzerland
          Tel: +41 1 256 3440, Fax: +41 1 252 0185
          Email: FRIEDLI@IPS.ETHZ.CH
has been in Japan as the leader of a team to evaluate supercomputers for 
possible use in Switzerland as a national supercomputer. (I've known Dr.  
Friedli for many years since I spent a sabbatical at ETH.) Now a decision 
has been made to acquire an SX-3 to address supercomputing needs in Swiss 
research universities and institutes. A two processor SX-3/22 is to be 
running by 10/91, with plans to upgrade this to a four processor model 
afterwards. The SX-3 that is being installed will have 1 GByte of main 
memory, 4 GBytes of extended memory, 70 GBytes of disk storage, including 
20 GBytes of high speed disks. A 1 TByte cartridge system and tapes will 
be available for archiving.  The computer center will be located in the 
small city of Manno, near Lugano, but operated by ETH Zurich. Access will 
be provided by the Swiss national university research network, SWITCH.  

As part of the decision making process, a number of real application 
programs were collected from supercomputer users at Swiss universities 
in early 1990. They were to serve as a basis for extended tests to 
measure single program and throughput performance. Friedli has provided 
me with summaries of their benchmark tests, which are further summarized 
below. (A more complete description will appear in IPS Windows, No 2,
1991, which can be obtained by contacting Friedli.) The tests were based
on unaltered versions of the programs. No changes were permitted except
those necessary to run the program on one processor; compiler options
were permitted, but compiler directives were not permitted except to
replace those directives already contained in the program.  Similarly,
program library routines were not permitted except to replace those
already contained in the program. (Some of the programs had been running
on Cray computers and hence had some directives included or had been
optimized in a greater or lesser way.)

In the table below, the NEC SX-3 is compared with a Cray Y-MP/8128, with
eight processors, 128 MWords of main memory and a clock time of 6.0 
nanoseconds. UNICOS(5.1) and CFT77(4.0) were used. The Cray tests took 
place at Cray Research in August 1990. The SX-3 tests took place in 
December 1990, on an SX-3/22, with two processors and two pipe sets each, 
1 GByte main memory, 4 GBytes extended memory, and a clock time of 2.9 
nanoseconds. A model /24 was also tested. In both cases only one processor 
was used.  A pre-release version of SUPER-UX(R1.1) and a pre-release 
version of FORTRAN77/SX(R1.1) were used.  
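
As a rough aid to reading the table, the two quoted clock times can be 
turned into clock rates and a clock ratio. The Python sketch below does 
only that arithmetic. The names are illustrative and not from the report, 
and a clock ratio by itself says nothing about delivered performance.

```python
# Clock cycle times quoted in the report, in nanoseconds.
YMP_CYCLE_NS = 6.0
SX3_CYCLE_NS = 2.9


def clock_rate_mhz(cycle_ns):
    """Convert a clock cycle time in nanoseconds to a clock rate in MHz."""
    return 1000.0 / cycle_ns


# How many SX-3 cycles fit into one Y-MP cycle.
clock_ratio = YMP_CYCLE_NS / SX3_CYCLE_NS

print(f"Y-MP clock rate: {clock_rate_mhz(YMP_CYCLE_NS):.1f} MHz")
print(f"SX-3 clock rate: {clock_rate_mhz(SX3_CYCLE_NS):.1f} MHz")
print(f"Clock ratio (Y-MP cycle / SX-3 cycle): {clock_ratio:.2f}")
```

The clock ratio comes out at about 2.07, so table entries well above that 
value reflect more than the faster clock alone.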

The results below were measured on a single processor.  The speedups 
given in the table are ratios of the CPU times measured on the two 
systems (Y-MP time divided by SX-3 time, so values above 1 favor the 
SX-3).  Friedli noted that the SX-3 is at the beginning of its life 
cycle, that further improvements can be expected, and that hand tuning 
of these programs may significantly improve performance.  A brief 
description of the programs follows the table.  
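
The ratio computation can be sketched in a few lines of Python. The 
direction of the ratio (Y-MP CPU time over SX-3 CPU time) is an 
assumption read off the column heading "speedup vs YMP/1"; the function 
name and the timings are hypothetical and are NOT measurements from the 
Swiss tests.

```python
def speedup(ymp_cpu_seconds, sx3_cpu_seconds):
    """Return Y-MP CPU time divided by SX-3 CPU time.

    A value above 1 means the SX-3 run used less CPU time.
    The direction of the ratio is an assumption (see lead-in).
    """
    if ymp_cpu_seconds <= 0 or sx3_cpu_seconds <= 0:
        raise ValueError("CPU times must be positive")
    return ymp_cpu_seconds / sx3_cpu_seconds


# Hypothetical timings for illustration only, not from the report.
example = speedup(230.0, 100.0)
print(f"speedup = {example:.1f}")
```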

For reference, several Linpack results are also given: 100-S refers to 
100 equations in scalar mode, 100-V refers to 100 equations in vector 
mode, and 1000 refers to 1000 equations, best effort test. The same 
program may appear several times, such as FANTOM-1 and FANTOM-2, 
referring to different cases, i.e., different input data. 

Program Name          SX-3/12        SX-3/14
                       speedup        speedup
                      vs YMP/1       vs YMP/1

CHEASE-3                1.3             1.3
FANTOM-2                1.5             1.5
BBINTER-1               2.0             2.0
LINPACK100-S            2.1             2.1
SECOND/S-4              1.8             2.1
PWSCF-1                 2.0             2.2
ZETA-1                  2.3             2.3
CYL3D-2                 2.3             2.3
CHEASE-2                2.3             2.5
LINPACK100-V            2.5             2.5
FANTOM-1                2.6             2.6
MCRG16-1                2.3             3.3
CYL3D-3                 3.3             3.4
T1XY-1                  2.6             3.5
GLOBE-1                 3.5             4.3
MCRG32-2                3.0             4.6
CORES-1                 3.8             5.5
CONV3D/C-4              3.7             5.6
CONV3D/R-4              2.6             5.6
CONV3D/C-3              4.3             7.1
TERPSICHORE-1           5.3             8.5
LINPACK1000             6.7           >10.0
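
One way to summarize the table is to take a geometric mean of each 
column and to look at the per-program gain from the /12 to the /14 
configuration. The Python sketch below does this with the figures 
copied from the table. The choice of geometric mean and the helper 
names are mine, not Friedli's, and the last Linpack row is left out 
because its /14 entry is only a lower bound.

```python
import math

# (program, SX-3/12 speedup, SX-3/14 speedup), copied from the table.
# The final Linpack row is omitted: its /14 entry is given as ">10.0".
ROWS = [
    ("CHEASE-3", 1.3, 1.3), ("FANTOM-2", 1.5, 1.5),
    ("BBINTER-1", 2.0, 2.0), ("LINPACK100-S", 2.1, 2.1),
    ("SECOND/S-4", 1.8, 2.1), ("PWSCF-1", 2.0, 2.2),
    ("ZETA-1", 2.3, 2.3), ("CYL3D-2", 2.3, 2.3),
    ("CHEASE-2", 2.3, 2.5), ("LINPACK100-V", 2.5, 2.5),
    ("FANTOM-1", 2.6, 2.6), ("MCRG16-1", 2.3, 3.3),
    ("CYL3D-3", 3.3, 3.4), ("T1XY-1", 2.6, 3.5),
    ("GLOBE-1", 3.5, 4.3), ("MCRG32-2", 3.0, 4.6),
    ("CORES-1", 3.8, 5.5), ("CONV3D/C-4", 3.7, 5.6),
    ("CONV3D/R-4", 2.6, 5.6), ("CONV3D/C-3", 4.3, 7.1),
    ("TERPSICHORE-1", 5.3, 8.5),
]


def geometric_mean(values):
    """Geometric mean of positive numbers."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


def pipe_gain(row):
    """Ratio of the /14 speedup to the /12 speedup for one program."""
    _, s12, s14 = row
    return s14 / s12


gm12 = geometric_mean(r[1] for r in ROWS)
gm14 = geometric_mean(r[2] for r in ROWS)
unchanged = [r[0] for r in ROWS if r[1] == r[2]]
best = max(ROWS, key=pipe_gain)

print(f"geometric mean speedup, SX-3/12: {gm12:.2f}")
print(f"geometric mean speedup, SX-3/14: {gm14:.2f}")
print(f"programs with no gain from /12 to /14: {len(unchanged)}")
print(f"largest /12 to /14 gain: {best[0]} ({pipe_gain(best):.2f}x)")
```

Programs whose two columns agree gained nothing from the additional 
pipe sets in these runs; the code only counts them and does not attempt 
to explain why.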


BBINTER: High Energy Physics. Tracking of particles in a high energy 
colliding beam storage ring.

CHEASE: Plasma Physics. Cubic Hermite Element Axisymmetric Static 
Equilibrium. Grad-Shafranov equation in axisymmetric geometry, produces a 
mapping for stability code ERATO.

CIRC: Computational Fluid Dynamics. Computational Investigation on 
Rotational Couette flow. Unsteady turbulent flow through plane or curved 
channels.

CONV3D: Image Analysis (CT and MR). 3D convolution using a separable 
symmetric kernel for medical data. Image enhancement, smoothing, edge 
detection.

CORES: Elementary Particle Physics (Hadrons). Inversion of Fermion 
matrices in lattice gauge theory by the conjugate residual method.

CYL3D: Computational Fluid Dynamics. Finite volume solution of unsteady 
incompressible Navier-Stokes equations.

FANTOM: Molecular Biology. Energy refinement of polypeptides and 
proteins.

GLOBE: Meteorology. Numerical model for the simulation of an 
incompressible fluid on a rotating sphere.

MCRG32: Elementary Particle Physics. Monte Carlo Renormalization Group 
calculation for 4D SU(2) lattice gauge theory on 32^4 lattices.

PWSCF: Solid State Physics. Plane waves pseudopotential local density 
self consistent calculation of the band structure and total energy for a 
solid.

SECOND: Integrated Device Simulation. 3D semiconductor device simulation 
by the finite element method.

TERPSICHORE: Plasma Physics. 3D ideal magnetohydrodynamics stability 
program.

T1XY: Molecular Spectroscopy. Flexible models for intramolecular motion, 
a versatile treatment and its applications to Glyoxal.

ZETA: Integer arithmetic. Computing with arbitrary precision.

-----------------END OF REPORT------------------------------------------