[comp.sys.atari.st] benchmark battles - round 1.

COBB@BRANDEIS.BITNET.UUCP (04/13/87)
Date:     Mon, 13 Apr 87 00:36 EDT
From:     <COBB@BRANDEIS.BITNET> (wes cobb [ cobb@brandeis.bitnet ])
Subject:  benchmark battles - round 1.
To:       info-atari16@score.stanford.edu
X-Original-To:  atari16, COBB

dear benchmarkers,

first of all, here is yet another savage benchmark result.  actually the only
reason i am posting this is that it disagrees by more than 10% with a recent
posting which appeared in volume #157 ....

##############################################################################

                            Savage Benchmark
                            ----------------
                                                   float   int
                                                 size mant size
 computer   cpu-MHz  fpu-MHz  OS  compiler       bits bits bits accuracy time
----------- -------- -------- --- -------------- ---- ---- ---- -------- -----
atari/st(1) 68000-8    none   tos absoft fortran  32   24   32  3.92e2   20.67
atari/st(1) 68000-8    none   tos absoft fortran  64   53   32  1.76e-7  67.41

Notes:
       1. atari 520 st with 1 Meg memory upgrade.

##############################################################################


i totally agree with moshe braner's remarks about the savage benchmark -
the silly thing is ONLY really a test for the trig library supplied with
a compiler - it is NOT a reasonable benchmark program to test realistic
floating point performance.  it also certainly makes the 68881 look much
much better than it really is - i have spoken to Absoft ( they have done
several compiler versions for different machines which support 68881s )
they tell me that typically one sees about a 5-10x improvement in floating
*,-,+,/ operations with a 68881 and up to a 50x improvement in sin,exp,log
and atan. don't expect to do several Mflops on your ST with a 68881
board...( you may get at most a few hundred Kflops )


a much better test of floating point performance is the whetstone
benchmark.  whetstone was based on a study of real applications programs -
the authors studied how often sin, atan, log, exp, *, -, +, /, array
indexing, subroutine calls, and integer arithmetic show up in typical
scientific and engineering oriented programs.  i think i have both the
`double` and `single` precision versions of the program running around
here somewhere -- if anyone is interested i guess i could post c and
fortran source to whetstone ... i suppose one can make an argument for
just doing the double precision test ( otherwise virtually no c people
get to take part in testing ) even though one rarely uses double
precision in real fp applications.


i've gotten rather frustrated by the plethora of benchmark results flying
around the nets lately ( yes i played my part in it too! ) - and i would
like to make a couple of suggestions and/or pleas to all of you benchers
out there.


1. what do int and long mean to your compiler?
   ------------------------------------------
      if you are going to run a 'standard' benchmark program on your
      favorite compiler there is at least one utterly obvious - and
      usually overlooked - rule to follow:  you must be sure that
      you are using the same size integer and floating point numbers
      as everyone else is.   now obviously if the benchmark program
      was sloppily written - as most unfortunately were - you aren't
      going to easily be able to do this ( example: in the Sieve,
      is the `int` type used in the loops supposed to be 16 bits
      or 32 bits?  running the code AS IS will kill your results
      in Lattice C just because Lattice uses a 32 bit int size, and
      will INCORRECTLY lead you to assume that Lattice is much
      slower than it is ).  since you CAN'T usually know what was
      intended, it is best to explicitly STATE what your int size
      is.
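
      one low-effort way to state the sizes is to let the compiler
      report them.  the sketch below assumes a compiler with ansi-style
      prototypes and <limits.h>; on an older compiler one would assume
      8 bit bytes and use sizeof(...) * 8 instead.  the function names
      are made up for this example.

```c
#include <limits.h>
#include <stdio.h>

/* number of bits in each integer type on this compiler */
int bits_in_short(void) { return (int)(sizeof(short) * CHAR_BIT); }
int bits_in_int(void)   { return (int)(sizeof(int)   * CHAR_BIT); }
int bits_in_long(void)  { return (int)(sizeof(long)  * CHAR_BIT); }

/* print one line suitable for pasting into a benchmark report */
void report_int_sizes(void)
{
    printf("short = %d bits, int = %d bits, long = %d bits\n",
           bits_in_short(), bits_in_int(), bits_in_long());
}
```

      calling report_int_sizes() at the top of a benchmark run puts the
      sizes in the output next to the times, so nobody has to guess.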


2. what do float and double mean to your compiler?
   ----------------------------------------------
      the same problem holds for floating point numbers in an even
      more extreme fashion:  with floating point numbers not only do
      you need to know whether you have 4,6,8, or 16 byte floating
      point numbers, it is also crucial to know HOW those bytes
      are distributed as mantissas, exponents, and sign bits.  it
      just doesn't make any sense to compare Whetstone results for
      Absoft F77 in single precision ( real*4 with a 24 bit mantissa )
      to Lattice C in double precision ( real*8 with a 53 bit
      mantissa ) or to GFA Basic in middle precision ( real*6 with a
      32 bit mantissa ).

      c and f77 programs for testing the mantissa size of single and
      double precision numbers are appended to the end of this letter.
      it should be easy to adapt one of these to any other language
      you might want to use.
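
      to see how different these formats really are, it helps to turn
      mantissa bits into decimal digits: digits = bits * log10(2).  the
      helper below is a hypothetical illustration of that arithmetic
      only ( the constant is log10(2) written out so that no math
      library is needed ).

```c
/* log10(2), written as a constant so no math library is required */
#define LOG10_OF_2 0.30102999566

/* approximate number of decimal digits carried by a binary mantissa */
double decimal_digits(int mantissa_bits)
{
    return mantissa_bits * LOG10_OF_2;
}
```

      for the three formats named above this gives roughly 7, 10 and 16
      decimal digits ( 24, 32 and 53 bits ), which is why timings taken
      in one format say little about the others.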


3. always use checksums.
   ---------------------
      if you are going to write or create your own benchmark program
      ALWAYS provide some sort of checksum as a means of checking the
      accuracy of your answers.  there are 2 reasons for this - first
      of all some compiler optimizers are clever enough to simply skip
      code which is never going to be used for anything outside of a
      loop.  second, it is all well and good that your compiler has
      smeared the world at the BRUTUS benchmark - but if the answer
      you ended up with is utter nonsense then what good will it do
      you?  case in point:  megamax-c has an _apparently_ functional
      -- albeit slow -- log(x) function which works for x > .5 but
      gives wildly inaccurate answers for x approaching 0....why?
      the stupid thing apparently uses the WRONG SERIES EXPANSION
      for x < .5 !!!

      ( moral:  fast but WRONG is not interesting - supply a checksum )
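
      as an illustration, here is a sketch of the familiar sieve loop
      with its prime count used as the checksum.  it follows the
      commonly circulated form of the sieve with an array size of 8190;
      the function name sieve_count is made up for this example.
      because the count is returned and can be printed, an optimizer
      cannot throw the loop away, and a wrong count shows up at once.

```c
#define SIZE 8190

static char flags[SIZE + 1];

/* one pass of the sieve; the return value doubles as the checksum */
int sieve_count(void)
{
    int i, k, prime, count = 0;

    for (i = 0; i <= SIZE; i++)
        flags[i] = 1;

    for (i = 0; i <= SIZE; i++) {
        if (flags[i]) {
            prime = i + i + 3;              /* flags[i] stands for 2i+3 */
            for (k = i + prime; k <= SIZE; k += prime)
                flags[k] = 0;               /* strike odd multiples     */
            count++;
        }
    }
    return count;                           /* 1899 for SIZE = 8190     */
}
```

      a report would then quote both the time and the count; any value
      other than 1899 means the run should not be trusted.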


4. timer routines
   --------------
      a lot of people have been using the xbios gettime() routine for
      reporting benchmark times.  this is okay IF AND ONLY IF the execution
      time for the program was so great that +/- 2 seconds ( the accuracy
      of the gettime routine ) doesn't significantly affect the results -
      i would argue that this would require execution times of at least
      several hundred seconds to give reasonable accuracy.  in any event it
      is silly to quote something as short as 16 seconds as a benchmark time
      using gettime() - ( it could be 14, it could be 18, it could be just
      about anything in between )

      c and fortran source code for timer routines accurate to +/- .005
      second are in the appendix to this letter.
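
      the arithmetic behind that argument is simple enough to write
      down.  the two helpers below are hypothetical names for this
      example: one gives the worst-case fractional error of a timing,
      the other the shortest run that keeps the error under a chosen
      limit.

```c
/* worst-case fractional error of a time measured with a given resolution */
double relative_timing_error(double resolution, double elapsed)
{
    return resolution / elapsed;
}

/* shortest run ( in seconds ) that keeps the error at or below max_error */
double min_run_time(double resolution, double max_error)
{
    return resolution / max_error;
}
```

      with +/- 2 seconds on a 16 second run the error is 2/16 = 12.5%,
      and getting that down to 1% takes a run of about 200 seconds.
      with a +/- .005 second timer the same 16 second run is good to
      about 0.03%.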


5. system software configuration
   -----------------------------
      it MATTERS what desk accessories and \auto folder programs you have
      installed on your system.  in particular things like screensavers,
      control panels, foreign operating systems, etc can EASILY make a
      10-15% difference in performance - since it isn't practical to keep
      vast lists of qualifications explaining exactly what was resident
      on benchers' systems during the tests - DON'T RUN BENCHMARKS IF YOU
      HAVE DESK ACCESSORIES OR \AUTO\ PROGRAMS loaded.  unload them.
      THEN run the benchmarks.   if you are using MINIX, or OS9, or MTC
      then SAY SO - AND BE SURE TO USE ELAPSED CPU-TIME *not* REAL-TIME
      in your time reporting.
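
      on a multitasking system, processor time rather than wall-clock
      time is what should be measured.  as a sketch, and assuming the c
      library on the system in question supplies the standard clock()
      routine ( an assumption - it is not something the systems named
      above are known here to provide ), it could look like this.  the
      name cpu_seconds is made up for the example.

```c
#include <time.h>

/* processor time used by this program so far, in seconds */
double cpu_seconds(void)
{
    return (double)clock() / CLOCKS_PER_SEC;
}
```

      take one reading before the timed section and one after, and
      report the difference.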


6.  system hardware configuration
    -----------------------------
      it MAY matter whether or not you have a 520st, or a 520st + 1meg
      upgrade, or a 1040ST!! - for example if your upgrade memory uses
      significantly faster or slower RAM than original RAM the system
      still has, then depending on what your ramdisk setup is, you may
      find that sometimes your program may be executing in fast ram,
      and sometimes ( with a different ramdisk size ) it may be executing
      in slow ram.  this could make a 5-10% difference in benchmark
      performance too.  it CERTAINLY matters if you have popped a
      68010 into your machine.  also - if you have a 68881 board on your
      system you should say what speed IT is running at since unless you
      have a 68020 based system you are likely running in an asynchronous
      mode with a different clock speed from the main processor.

      ( moral:  when reporting a benchmark result, if you
                have modified the hardware then by all means say so! )




wes cobb ( cobb@brandeis.bitnet )
department of physics
brandeis university
waltham, mass 02254


appendix.( source code mentioned in the body of the letter. )
--------

/*
 *  mntss.c  - tests to see how many bits are in the mantissae of
 *             floats and doubles.
 */

#include <stdio.h>

main()
{
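   long   i,j;
   float  x,xsum;   /* xsum rounds the sum back to float precision  */
   double y,ysum;   /* ysum rounds the sum back to double precision */

   i = 0;
   x = 1.;
   do{
      ++i;
      x /= 2.;
      xsum = 1. + x;   /* c evaluates 1.+x in double, so store it first */
   }while( xsum != 1. );

   printf("\n floats have %ld bit mantissae",i);

   j = 0;
   y = 1.;
   do{
      ++j;
      y /= 2.;
      ysum = 1. + y;
   }while( ysum != 1. );

   printf("\n doubles have %ld bit mantissae\n",j);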

}
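
on a compiler that ships an ansi-style <float.h>, the loop result can be
cross-checked against FLT_MANT_DIG and DBL_MANT_DIG.  the sketch below is a
variant of the program above, not a replacement for it: the function names
are hypothetical, it assumes a binary ( radix 2 ) floating point format, and
it uses volatile to force each sum to be rounded to the type being tested.

```c
#include <float.h>

/* count halvings of x until 1 + x rounds to 1 in float precision */
int measured_float_mantissa(void)
{
    volatile float sum;      /* volatile: force a real store to float */
    float x = 1.0f;
    int bits = 0;

    do {
        ++bits;
        x /= 2.0f;
        sum = 1.0f + x;
    } while (sum != 1.0f);
    return bits;
}

/* the same test in double precision */
int measured_double_mantissa(void)
{
    volatile double sum;
    double y = 1.0;
    int bits = 0;

    do {
        ++bits;
        y /= 2.0;
        sum = 1.0 + y;
    } while (sum != 1.0);
    return bits;
}
```

if the measured values disagree with FLT_MANT_DIG and DBL_MANT_DIG, the
compiler is probably carrying extra precision somewhere, and that is
worth stating in the benchmark report.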

*
*  here is fortran code for the same thing...
*  stdout - is a system dependent number.
*           absoft f77  has stdout = 9
*           vax fortran has stdout = 6
*

      program mntss
      integer*4 i,j,stdout
      parameter ( stdout = 9 )
      real*4    x
      real*8    y

         i = 0
         x = 1.
         dowhile( (1.+x) .ne. 1. )
           i = i + 1
           x = x / 2.
         enddo
         write(stdout,*)' floats have ',i,' bit mantissae '

         j = 0
         y = 1.
         dowhile( (1.+y) .ne. 1. )
           j = j + 1
           y = y / 2.
         enddo
         write(stdout,*)' doubles have ',j,' bit mantissae '

      end


/*
 *  secnds.c - a timer routine for c
 *             ( tested with Megamax, Lattice )
 *
 *  usage:
 *              main()
 *              {
 *                   double dt,secnds();
 *                   ...
 *                   ...
 *                   dt = secnds(0.);
 *                      ...
 *                      ... whatever is to be timed goes here
 *                      ...
 *                   dt = secnds(dt);
 *                   ...
 *                   printf("\n elapsed time = %7.2f seconds",dt);
 *              }
 */

#include <osbind.h>
#define SECONDS_PER_TICK .005

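double secnds(offset)
double offset;
{
        long   peek_timer();
        double temp;   /* must be double - a long would drop the fraction */

        /* xbios 38 ( Supexec ) runs peek_timer in supervisor mode */
        temp = SECONDS_PER_TICK * (double)xbios( 38, &peek_timer ) - offset;
        return(temp);
}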

long peek_timer()
{
        long temp2;
        temp2 = *(long *)0x4BA;
        return(temp2);
}

*
*  fortran timer routine for
*  the atari-st - absoft fortran
*
*  usage:        program test
*                real*8 systimer,dt
*                ...
*                dt = systimer(0.d0)
*                ...
*                ...what you want to time..
*                ...
*                dt = systimer(dt)
*                ...
*                write(9,'('' elapsed time = '',f7.2,'' seconds '')')dt
*                end
*
        real*8 function systimer(offset)
        implicit none
        include lib\gemdos.inc
        integer*4 atari,dummy,systix,oldstack
        real*8 mspt,offset
        parameter ( mspt = 5.0d-3 )               ! seconds per tick ( 200 Hz )
        oldstack = atari( Super, 0 )              ! save stack
        systix = long(z'4BA')                     ! change mode and read
        dummy = atari( Super, oldstack )          ! timer, and restore stack
        systimer = -offset + mspt * dble(systix)  ! convert ticks to seconds
        return
        end