[comp.benchmarks] A short io benchmark...

pack@acd.ucar.edu (Dan Packman) (12/08/90)

The following very short benchmark program attempts to measure
raw fortran unformatted io speed.  I can duplicate the program n times
for n tmp files or can run n read programs on a single tmp file.
The tmp file is about 10 megabytes in size.  The 130000 dimension
is representative of some of our applications.
I simply run "simultaneously" via

   % time rd & ; time rd1 & ; time rd2 & ; time rd3 & ; ...

 Running on an IBM RS6000/320 I obtain (in elapsed seconds on an
otherwise unloaded machine):

    n        |  1   |   2  |   3  |     4   |  5   |  6  |  7  |
__________________________________________________________________
one tmpfle   | 16.5 |  31  |  47  |  64     | 80   | 101 | 119 |
__________________________________________________________________
 n  tmpfles  | 16.5 |  36.5|  56  |  86-116 |
__________________________________________________________________

The first test shows remarkable linearity through 7 simultaneous processes.
The second test, with a separate tmp file per process, shows a falloff (and
unequal times across the processes) at 4 simultaneous processes.  Any comments?
[I would guess that "server" class machines might show slightly
better times for a single read, but would show much better results
for many simultaneous processes.]
=====
       program wr
c      write ten unformatted records of 130000 double precision
c      words (about 10 megabytes) to /tmp/tmpfle1
       double precision a(130000)
       open(1,file='/tmp/tmpfle1',form='unformatted',status='new')
       do 10 i=1,10
c        tag the last word of each record with the record number
         a(130000)=dble(i)
         write(1)(a(ii),ii=1,130000)
 10    continue
       end

       program rd
c      read the ten records back and check the tag word in each
       double precision a(130000)
       open(1,file='/tmp/tmpfle1',form='unformatted',status='old')
       do 10 i=1,10
         read(1)(a(ii),ii=1,130000)
c        a logical if takes no "then"; unit 0 is stderr here
         if (a(130000).ne.dble(i)) write(0,100) i,a(130000)
 100     format(' i=',i2,' bad value =',1pd12.5)
 10    continue
       end
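
As an aside on what the 10 megabytes actually contain: most Unix f77 runtimes frame each unformatted sequential record with a 4-byte byte count before and after the data (an assumption about the I/O library, not something the programs above depend on).  A scaled-down Python sketch of the same write/verify pair, with that framing:

```python
import struct, tempfile, os

NWORDS = 1300      # scaled down from 130000
NREC = 10

def write_unformatted(path):
    """Mimic 'program wr': the last word tags the record number."""
    with open(path, "wb") as f:
        for i in range(1, NREC + 1):
            data = struct.pack("<%dd" % NWORDS,
                               *([0.0] * (NWORDS - 1) + [float(i)]))
            marker = struct.pack("<i", len(data))
            f.write(marker + data + marker)   # f77-style record framing

def read_and_check(path):
    """Mimic 'program rd': read each record, verify the tag word."""
    bad = 0
    with open(path, "rb") as f:
        for i in range(1, NREC + 1):
            (nbytes,) = struct.unpack("<i", f.read(4))
            words = struct.unpack("<%dd" % (nbytes // 8), f.read(nbytes))
            f.read(4)                          # trailing length marker
            if words[-1] != float(i):
                bad += 1
    return bad

path = os.path.join(tempfile.mkdtemp(), "tmpfle1")
write_unformatted(path)
print("bad records:", read_and_check(path))    # → bad records: 0
```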
Dan Packman     NCAR                         INTERNET: pack@ncar.UCAR.EDU
(303) 497-1427  P.O. Box 3000                   CSNET: pack@ncar.CSNET
                Boulder, CO  80307       DECNET  SPAN: 9.367::PACK

mark@mips.COM (Mark G. Johnson) (12/11/90)

In article <9456@ncar.ucar.edu>, pack@acd.ucar.edu (Dan Packman)
gives a short benchmark program that exercises I/O.  He presents
measured data for n simultaneous processes reading the same file,
and also data for k processes simultaneously reading different
files.  Dan writes:
    >The first test shows remarkable linearity through 7
    >simultaneous processes.

I repeated Dan's test of simultaneous processes reading the same
file (using Dan's pgm to create that specific file) and found modestly
better-than-linear behavior (a slope a bit less than one).

Measurements on the two workstations
    >in elapsed seconds on an otherwise unloaded machine:

                             # of simultaneous processes reading same file
 Hardware                     1      2      3      4      5      6      7
----------------------------------------------------------------------------
IBM RS6000/320       [dan]   16.5   31     47     64     80    101    119
MIPS RS3230 Magnum   [mgj]   11.3   21.1   31.0   40.9   50.7   60.8   71.0

The (slight) difference between the two machines lies in the incremental
cost per process.  For the IBM, each process costs about 16.5 seconds.
For the Magnum, each process costs around 9.9 seconds, *except* the first
one, which costs 11.3.  Also, the IBM begins to depart from linearity
at n>5 processes, when the cost per process rises again.  At least through
7 processes the Magnum shows no such rise.
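
The per-process costs quoted above can be recomputed from the table (a quick arithmetic sketch using the measured elapsed times):

```python
# elapsed seconds for 1..7 simultaneous readers, from the table above
ibm    = [16.5, 31, 47, 64, 80, 101, 119]
magnum = [11.3, 21.1, 31.0, 40.9, 50.7, 60.8, 71.0]

def incremental(times):
    """Cost added by each successive simultaneous process."""
    return [round(b - a, 1) for a, b in zip(times, times[1:])]

def avg_incremental(times):
    """Average cost per additional process over the whole run."""
    return (times[-1] - times[0]) / (len(times) - 1)

print(incremental(ibm))       # roughly 15-17 s each, rising past n=5
print(incremental(magnum))    # steady, all close to 9.9 s
print(round(avg_incremental(magnum), 2))
```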
-- 
 -- Mark Johnson	
 	MIPS Computer Systems, 930 E. Arques M/S 2-02, Sunnyvale, CA 94086
	(408) 524-8308    mark@mips.com  {or ...!decwrl!mips!mark}