[comp.sys.atari.st] benchmark battles: part 2.

COBB@BRANDEIS.BITNET (04/23/87)
Date:     Wed, 22 Apr 87 23:06 EDT
From:     <COBB@BRANDEIS.BITNET> (wes cobb [ cobb@brandeis.bitnet ])
Subject:  benchmark battles: part 2.
To:       info-atari16@score.stanford.edu
X-Original-To:  atari16, COBB

dear benchfolk,

  someone ( sorry -- i've lost the reference ) made a remark which in
essence claimed that one should use BASIC if one wished to do fp work
on the ST!!!!!

well i strongly disagree...let's digress for a moment and look at
the code he posted:  supposedly the basic program the writer had
tested used...

   x = tan(atan(exp(log(sqr(x*x)))))+1.

...as what he was looping over.   now in every basic *I* have ever seen
the arc tangent function is called ATN _not_ atan....  i ran the code
he listed under atari basic - there were no error messages - and the
program literally blew fortran and c away.   of course since atan
was undefined the program was really just calculating tan(0.) 2500
times so it SHOULD have been pretty fast!  when i substituted atn
for the arc tangent function i got about 26 seconds for the single
precision execution time.   not bad - but a good 50% slower than
fortran.

  also in ATARI basic there is ( as far as i can tell ) no way
at all to even _PERFORM_ a double precision savage benchmark - i.e.
there don't seem to BE double precision forms of the math library
functions.   obviously doing the loops in single precision
and then typecasting the results into double precision is NOT legal
-- so where on earth are these double precision atari basic results
coming from?

  if one wants to use benchmark results to choose what language to work
in one must be careful to choose reasonable benchmark programs, and
one has to compare apples with apples and oranges with oranges -- it is
silly to compare 48 bit floating point number benchmark results with 64
bit benchmark results etc....  at any rate it's NUTS to base one's choice
of programming language on ( of all tests ) Savage because:

   THE SAVAGE BENCHMARK IS *N*O*T* A GOOD TEST OF FP PERFORMANCE.

the savage tests ONLY math library functions ( tan, atan, exp, log, sqr ).
period. it isn't even a very good test of THEM.  period.  since the
overwhelming preponderance of all fp calls, operations, and usage do NOT
involve math library calls it just doesn't make sense to base anything
important on it.

if you MUST base everything on a single test then at least make it
the whetstone benchmark since that at least is a reasonable model of a
real applications program. ( atari basic by the way is about 100x slower
than fortran on the whetstone )

i've included another benchmark program which you may want to try...it
tests the speed of floating *,-,/,+ in c,fortran,ratfor,and basic. even
THIS is surely a more realistic test of floating point behavior than the
savage benchmark is:  fp *,-,/,+ occur overwhelmingly more often than
math lib calls in real applications code.

wes



############################################################################

                            THE FLOPS BENCHMARK
                               results as of
                               22 april 1987

machine     language    os    rating   cpu-time  mant  size  % error  notes
----------  ----------  --- ---------  --------  ----  ----  -------  -------
atari/st    absoft f77  tos    11,453    153.70   24    32   8.4e-2   1,2
atari/st    absoft f77  tos     6,223    282.80   53    64   1.6e-10  1,2
atari/st    megamax c   tos     3,659    480.95   24    32   2.7e-3   1,3
atari/st    megamax c   tos     1,352   1301.34   53    64   3.4e-10  1,4
atari/st    lattice c   tos     1,227   1433.99   54    64   3.5e-12  1,5
atari/st    basic       tos       607   2899.19   24   (32)  1.3e-3   1,6
----------  ----------  --- ---------  --------  ----  ----  -------  -------

      1.  68000 at 8 MHz.  1 meg ram.  no fpa.  no desk accs.  med rez.
      2.  absoft fortran v2.2 ( dynamically linked )
      3.  megamax c v1.00 ( fmath.o )
      4.  megamax c v1.00 ( double.o )
      5.  lattice c v3.02 ( doesn't support single precision )
      6.  atari basic ( i ASSUME 32 bits here... manual doesn't say )
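
the mant column is what the smant/dmant routines in the listings report:
the number of halvings of a value until adding it to 1 no longer changes
the result.  a rough sketch of the same idea for a present-day c compiler
follows.  the function names are made up for illustration, and the
volatile stores are an assumption about how to stop a compiler from
carrying extra precision in registers; this is not the posted code.

```c
/* count how many times eps can be halved before 1 + eps rounds back to 1 */
static long float_mantissa_bits(void)
{
    volatile float sum;          /* volatile: force the sum to be rounded to float */
    float eps = 1.0f;
    long bits = 0;

    for (;;) {
        sum = eps + 1.0f;
        if (sum == 1.0f)
            break;
        bits++;
        eps = eps / 2.0f;
    }
    return bits;
}

/* same measurement for double */
static long double_mantissa_bits(void)
{
    volatile double sum;         /* volatile: force the sum to be rounded to double */
    double eps = 1.0;
    long bits = 0;

    for (;;) {
        sum = eps + 1.0;
        if (sum == 1.0)
            break;
        bits++;
        eps = eps / 2.0;
    }
    return bits;
}
```

on ieee 754 hardware one would expect 24 and 53 from these, which is what
the f77 and megamax rows of the table show.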

  the flops benchmark - as its name suggests - purports to estimate how
many floating operations per second a machine/compiler combination can
perform.   this program is primarily intended to be run on microcomputers
so no effort has been made to allow for machines more than about 10x
faster than a typical supermini.

   in the context of this program a floating operation is merely one
of the 4 basic operations { *,/,-,+ } weighted so as to reflect what
would seem to be their frequency of appearance in scientific applications
code:  35% for *, 26% for -, 22% for +, and 17% for /.   these numbers
were derived from a study of IMSL source code.
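
as a quick check of how closely the 11 statements of the benchmark loop
follow those weights, the sketch below tallies the operators in the order
they appear in the listings.  the names kernel_ops, op_count and
op_percent are invented for this illustration.

```c
#include <string.h>

/* one operator per statement of the inner loop, in listing order:
   v (divide), w (times), x (plus), y (minus), a (times), b (times),
   c (minus), d (divide), f (minus), g (times), p (plus)            */
static const char kernel_ops[] = "/*+-**-/-*+";

/* number of times operator `op` occurs in the loop body */
static int op_count(char op)
{
    int n = 0;
    const char *s;

    for (s = kernel_ops; *s != '\0'; s++)
        if (*s == op)
            n++;
    return n;
}

/* share of operator `op` in the loop body, in percent */
static double op_percent(char op)
{
    return 100.0 * op_count(op) / (double)strlen(kernel_ops);
}
```

the loop body has 4 multiplies, 3 subtracts, 2 adds and 2 divides, i.e.
roughly 36%, 27%, 18% and 18%.  that is close to, but not exactly, the
35/26/22/17 split quoted above, which is about as near as 11 statements
can get.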

  the program does use several nested loops and typecasts but -- never
fear -- the overhead these require is subtracted out of the total
since we are only really interested here in flops.  note that the
overhead is typically about 50% of the flops cputime total.
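
the arithmetic behind the rating column can be sketched as below.  this
is only a restatement of what the listings compute ( 11 operations per
pass, 20*20*20*20 passes, loop overhead subtracted ); the names
flops_rating, OPS_PER_PASS and PASSES are made up here.

```c
#define OPS_PER_PASS 11L                       /* statements in the inner loop body */
#define PASSES       (20L * 20L * 20L * 20L)   /* OUTERMOST*OUTER*INNER*INNERMOST   */

/*
 * elapsed  : time for the full loop nest, in seconds
 * overhead : time for the same nest with only the loop counters
 *            and type conversions, in seconds
 * returns the estimated floating operations per second
 */
static double flops_rating(double elapsed, double overhead)
{
    double cpu = elapsed - overhead;   /* time spent on the 11 operations alone */

    return (double)(OPS_PER_PASS * PASSES) / cpu;
}
```

for example a cpu time of 282.80 seconds gives 1760000 / 282.80, which
rounds to the 6,223 shown in the table.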

  the program makes calls to system dependent timing routines.  for the
atari st ( well at least for a monotasking atari st running TOS ) these
routines are provided in the code.   if you are running this on another
system you will have to improvise.
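
one way to improvise, assuming a compiler that supplies the standard
<time.h>, is to build secnds() on clock().  this is a sketch only: clock()
reports processor time rather than the 200 Hz system tick the st version
reads, and its resolution varies from system to system.

```c
#include <time.h>

/* processor time used so far, in seconds, minus `offset` */
static double secnds(double offset)
{
    return (double)clock() / (double)CLOCKS_PER_SEC - offset;
}
```

it is used the same way as in the listings: t = secnds(0.) before the
loop, then t = secnds(t) afterwards to get the elapsed seconds.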

please send results to:

electronic mail:  cobb@brandeis.bitnet
                  ci$ [ 72155, 1422 ]

snail mail:       wes cobb
                  dept. of physics
                  brandeis university
                  waltham, mass 02254
                  usa

############################################################################

/*
 *     c version of flops
 *        tested with
 *     lattice && megamax
 */

#include <stdio.h>
#include <osbind.h>

#define OUTERMOST 20
#define INNERMOST 20
#define INNER     20
#define OUTER     20

#define ANSWER    2.061914565513972e7

main()
{
  long sbits,dbits,smant(),dmant();
  title();
  sbits = smant();
  dbits = dmant();

  if( sbits < 23 || sbits > 25 ){
    printf("\n this single precision is NOT ieee standard...");
  }
  if( dbits < 52 || dbits > 54 ){
    printf("\n this double precision is NOT ieee standard...");
  }
  if( sbits == dbits ){
    printf("\n ...there is evidently no difference between `float`");
    printf("\n    and `double` in this c implementation.");
    printf("\n    ( only going to run d-flops ... )");
    dflops(dbits);
    exit(0);
  }
  sflops(sbits);
  dflops(dbits);
  printf("\n  ");
}

title(){
  printf("\n \033E");          /* ESC E: vt52 clear screen */
  printf("\n               flops benchmark v1.0");
  printf("\n                  22 april 1987");
  printf("\n  ");
}


long smant()
{
  float s;
  long i;
  s = 1.;
  i = 0;
  while( (s+1.) != 1. ){
    i++;
    s = s / 2.;
  }
  return(i);
}

long dmant()
{
  long i;
  double d;
  d = 1.;
  i = 0;
  while( (d+1.) != 1. ){
    i++;
    d = d / 2.;
  }
  return(i);
}

sflops(sbits)
long sbits;
{

  float    a,b,c,d,e,f,g,h,p,q,r,s,t,u,v,w,x,y,z,rating,perror,error;
  double   dt,secnds(),dtover;
  long     i,j,k,l,m;

  dtover = secnds(0.);
  for( i = 1; i <= OUTERMOST ;i++){
        q = (float)(i);
        for( j = 1; j<= OUTER ;j++){
                r = (float)(j);
                for( k = 1;k <= INNER ;k++){
                        s = (float)(k);
                        for( m = 1; m <= INNERMOST ;m++){
                                t = (float)(m);
                        }
                }
        }
  }
  dtover = secnds(dtover);

  p = 1.;
  dt = secnds(0.);
  for( i = 1; i <= OUTERMOST ;i++){
        q = (float)(i);
        for( j = 1;j <= OUTER ;j++){
                r = (float)(j);
                for( k = 1;k <= INNER ;k++){
                        s = (float)(k);
                        for( m = 1;m <= INNERMOST ;m++){
                                t = (float)(m);
                                v = 1./t;
                                w = s * v;
                                x = r + w;
                                y = q - x;
                                a = y * v;
                                b = a * t;
                                c = b - w;
                                d = c / q;
                                f = d - r;
                                g = f * s;
                                p = p + g;
                        }
                }
        }
  }
  dt = secnds(dt) - dtover;
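  error  = ( p < 0. ? -p : p ) - ANSWER;   /* |p|; standard abs() is int-only */
  perror = 100. * error / ANSWER;
  if( perror < 0. ) perror = -perror;
  /* cast first: 160000 would overflow a 16 bit int */
  rating = 11. * ( (float)OUTERMOST*OUTER*INNER*INNERMOST ) / dt;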
  printf("\n  total execution time = %7.2f ",dt+2.*dtover);
  printf("\n  overhead             = %7.2f ",2.*dtover);
  printf("\n  <*> s-flops rating   = %ld ",(long)(rating+.5));
  printf("\n  <*> s-flops cpu time = %7.2f ",dt);
  printf("\n  <*> float mantissa   = %ld ",sbits);
  printf("\n  <*> percent error    = %12.5e ",perror);
}


dflops(dbits)
long dbits;
{

  double    a,b,c,d,e,f,g,h,p,q,r,s,t,u,v,w,x,y,z,rating,perror;
  double    error;
  double    dt,secnds(),dtover;
  long      i,j,k,l,m;

  dtover = secnds(0.);
  for( i = 1; i <= OUTERMOST ;i++){
        q = (double)(i);
        for( j = 1;j <= OUTER ;j++){
                r = (double)(j);
                for( k = 1;k <= INNER ;k++){
                        s = (double)(k);
                        for( m = 1;m <= INNERMOST;m++ ){
                                t = (double)(m);
                        }
                }
        }
  }
  dtover = secnds(dtover);

  p = 1.;
  dt = secnds(0.);
  for( i = 1; i <= OUTERMOST; i++ ){
        q = (double)(i);
        for( j = 1;j <= OUTER; j++){
                r = (double)(j);
                for( k = 1;k <= INNER;k++ ){
                        s = (double)(k);
                        for(  m = 1; m<= INNERMOST;m++ ){
                                t = (double)(m);
                                v = 1./t;
                                w = s * v;
                                x = r + w;
                                y = q - x;
                                a = y * v;
                                b = a * t;
                                c = b - w;
                                d = c / q;
                                f = d - r;
                                g = f * s;
                                p = p + g;
                        }
                }
        }
  }
  dt = secnds(dt)-dtover;
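  error  = ( p < 0. ? -p : p ) - ANSWER;   /* |p|; standard abs() is int-only */
  perror = 100. * error / ANSWER;
  if( perror < 0. ) perror = -perror;
  /* cast first: 160000 would overflow a 16 bit int */
  rating = 11. * ( (double)OUTERMOST*OUTER*INNER*INNERMOST ) / (double)(dt);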
  printf("\n  total execution time = %7.2f ",dt+2.*dtover);
  printf("\n  overhead             = %7.2f ",2.*dtover);
  printf("\n  <*> d-flops rating   = %ld ",(long)(rating+.5));
  printf("\n  <*> d-flops cpu time = %7.2f ",dt);
  printf("\n  <*> double mantissa  = %ld ",dbits);
  printf("\n  <*> percent error    = %12.5e ",perror);
}

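/*
 *  seconds on the system clock minus `offset`.  the 200 Hz system tick
 *  counter lives at 0x4ba in protected memory, so peek() is run in
 *  supervisor mode via xbios 38 ( Supexec ).  one tick = .005 seconds.
 */
double secnds(offset)
double offset;
{
        long peek();
        double temp;
        temp = .005 * (double)xbios( 38, &peek ) - offset ;
        return(temp);
}
long peek()
{
        long temp2;
        temp2 = *(long *)0x4ba;
        return(temp2);
}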


############################################################################

/*
 *                   absoft fortran 77
 *             version of the flops benchmark
 *       this is what the ratfor source ends up as...
 */

        program flops
        implicit none

        integer*4 sbits,dbits

        call title
        call smant(sbits)
        call dmant(dbits)

        if( sbits .LT. 23 .OR. sbits .GT. 25 )then
        write(9,*)' note: this single precision is NOT ieee standard'
        endif

        if( dbits .LT. 52 .OR. dbits .GT. 54 )then
        write(9,*)' note: this double precision is NOT ieee standard'
        endif

        if( sbits .EQ. dbits )then

        write(9,*)' evidently single and double are the same in ...'
        write(9,*)' ...this language.  we will only run d-flops'
        write(9,*)' '
        write(9,*)' *** now running the d-flops test ***'
        call dflops(dbits)

        else

        write(9,*)' '
        write(9,*)' *** now running the s-flops test ***'
        call sflops(sbits)
        write(9,*)' '
        write(9,*)' *** now running the d-flops test ***'
        call dflops(dbits)

        endif

        write(9,*)' '

        end

        subroutine title
        write(9,*)char(27)//'E'
        write(9,*)'              flops benchmark v1.0'
        write(9,*)'                 22 april 1987'
        write(9,*)' '
        return
        end


        subroutine smant(i)
        implicit none
        integer*4 i
        real*4 s
        s = 1.
        i = 0
        while( (s+1.) .NE. 1. )
          i = i + 1
          s = s / 2.
        repeat
        return
        end

        subroutine dmant(i)
        implicit none
        integer*4 i
        real*8 d
        d = 1.
        i = 0
        while( (d+1.) .NE. 1. )
          i = i + 1
          d = d / 2.
        repeat
        return
        end


        subroutine sflops(sbits)
        implicit none
        integer*4 sbits
        real*4    a,b,c,d,e,f,g,h,p,q,r,s,t,u,v,w,x,y,z,rating,perror
        real*4    dt,secnds,dtover,error
        integer*4 i,j,k,l,m
        dtover = secnds(0.)

        do( i = 1, 20 )
          q = float(i)
          do( j = 1, 20 )
             r = float(j)
             do( k = 1, 20 )
               s = float(k)
               do( m = 1, 20 )
                 t = float(m)
               enddo
             enddo
          enddo
        enddo
        dtover = secnds(dtover)
        p = 1.0
        dt = secnds(0.)
        do( i = 1, 20 )
          q = float(i)
          do( j = 1, 20 )
            r = float(j)
            do( k = 1, 20 )
              s = float(k)
              do( m = 1, 20 )
                t = float(m)
                v = 1./t
                w = s * v
                x = r + w
                y = q - x
                a = y * v
                b = a * t
                c = b - w
                d = c / q
                f = d - r
                g = f * s
                p = p + g
                enddo
              enddo
           enddo
        enddo
        dt = secnds(dt) - dtover
        error  = abs(p) - 2.061914565513972d7
        perror = abs(100. * error / 2.061914565513972d7    )
        rating = 11. * float(20*20*20*20) / dt
        write(9,*)' total execution time  = ',dt+2.*dtover
        write(9,*)' overhead              = ',2.*dtover
        write(9,*)' <*> s-flops rating    = ',nint(rating)
        write(9,*)' <*> s-flops cpu time  = ',dt
        write(9,*)' <*> real*4 mantissa   = ',sbits
        write(9,*)' <*> real*4 length     = ',32
        write(9,*)' <*> percent error     = ',perror
        return
        end

        subroutine dflops(dbits)
        implicit none
        integer*4 dbits
        real*8    a,b,c,d,e,f,g,h,p,q,r,s,t,u,v,w,x,y,z,rating,perror
        real*8    error
        real*4    dt,secnds,dtover
        integer*4 i,j,k,l,m
        dtover = secnds(0.)
        do( i = 1, 20 )
          q = dble(i)
          do( j = 1, 20 )
            r = dble(j)
            do( k = 1, 20 )
              s = dble(k)
              do( m = 1, 20 )
                t = dble(m)
              enddo
            enddo
          enddo
        enddo
        dtover = secnds(dtover)
        p = 1.d0
        dt = secnds(0.)
        do( i = 1, 20 )
          q = dble(i)
          do( j = 1, 20 )
            r = dble(j)
            do( k = 1, 20 )
              s = dble(k)
              do( m = 1, 20 )
                t = dble(m)
                v = 1./t
                w = s * v
                x = r + w
                y = q - x
                a = y * v
                b = a * t
                c = b - w
                d = c / q
                f = d - r
                g = f * s
                p = p + g
                enddo
              enddo
           enddo
        enddo
        dt = secnds(dt)-dtover
        error  = abs(p) - 2.061914565513972d7
        perror = abs(100. * error / 2.061914565513972d7    )
        rating = 11. * dble(20*20*20*20)/dble(dt)
        write(9,*)' total execution time  = ',dt+2.*dtover
        write(9,*)' overhead              = ',2.*dtover
        write(9,*)' <*> d-flops rating    = ',nint(rating)
        write(9,*)' <*> d-flops cpu time  = ',dt
        write(9,*)' <*> real*8 mantissa   = ',dbits
        write(9,*)' <*> real*8 length     = ',64
        write(9,*)' <*> percent error     = ',perror
        return
        end

        real*4 function secnds(offset)
        integer*4 Super
        parameter (Super     = z'00000902')
        real*4 offset
        integer*4 atari,dummy,stack,systimer,oldstack
        real*4 mspt
        parameter ( mspt = 5.0e-3 )
        oldstack = atari( Super, 0 )
        systimer = long(z'4BA')
        dummy = atari( Super, oldstack )
        secnds = -offset + mspt * float(systimer)
        return
        end


############################################################################

/*
 *    ratfor version of flops benchmark.
 */

#include <lib\fortran.h>  /* this just defines stdout,stdin and the very */
                          /* few quirky things which vary between vaxf77 */
                          /* and absoft...                               */
#define OUTERMOST 20
#define INNERMOST 20
#define INNER     20
#define OUTER     20

#define ANSWER    2.061914565513972d7   /* the right answer to 16 figs */

program flops
implicit none

  integer*4 sbits,dbits

  call title
  call smant(sbits)
  call dmant(dbits)

  if( sbits < 23 || sbits > 25 )then
    write(stdout,*)' note: this single precision is NOT ieee standard'
  endif

  if( dbits < 52 || dbits > 54 )then
    write(stdout,*)' note: this double precision is NOT ieee standard'
  endif

  if( sbits == dbits )then

    write(stdout,*)' evidently single and double are the same in ...'
    write(stdout,*)' ...this language.  we will only run d-flops'
    write(stdout,*)' '
    write(stdout,*)' *** now running the d-flops test ***'
    call dflops(dbits)

  else

    write(stdout,*)' '
    write(stdout,*)' *** now running the s-flops test ***'
    call sflops(sbits)
    write(stdout,*)' '
    write(stdout,*)' *** now running the d-flops test ***'
    call dflops(dbits)

  endif

  write(stdout,*)' ';
end

/*
 *   this just clears the screen and then draws the title message.
 */

subroutine title
  write(stdout,*)char(27)//'E'
  write(stdout,*)'              flops benchmark v1.0'
  write(stdout,*)'                 22 april 1987'
  write(stdout,*)' '
return
end

/*
 *   this measures how many significant bits are carried along in the
 *   mantissa of single precision floating point numbers...does this by
 *   repeated division by 2 until stasis is reached.
 */
subroutine smant(i)
implicit none;

  integer*4 i;
  real*4 s;

  s = 1.;
  i = 0;

  while( (s+1.) != 1. )
    i = i + 1;
    s = s / 2.;
  repeat

return
end

/*
 *   exactly like smant except for double precision numbers...
 */

subroutine dmant(i)
implicit none;

  integer*4 i;
  real*8 d;

  d = 1.;
  i = 0;

  while( (d+1.) != 1. )
    i = i + 1;
    d = d / 2.;
  repeat

return
end

/*
 *  the single precision benchmark.
 *
 */

subroutine sflops(sbits)
implicit none;
integer*4 sbits;

  real*4    a,b,c,d,e,f,g,h,p,q,r,s,t,u,v,w,x,y,z,rating,perror;
  real*4    dt,secnds,dtover,error;
  integer*4 i,j,k,l,m;

/*
 *  first figure out what the loop and type conversion overhead is...
 */

  dtover = secnds(0.);

  do( i = 1, OUTERMOST )
        q = float(i);
        do( j = 1, OUTER )
                r = float(j);
                do( k = 1, INNER )
                        s = float(k);
                        do( m = 1, INNERMOST )
                                t = float(m);
                        enddo
                enddo
        enddo
  enddo

  dtover = secnds(dtover);

/*
 *   now do the actual benchmark loop..
 */

  p = 1.0;
  dt = secnds(0.);

  do( i = 1, OUTERMOST )
        q = float(i);
        do( j = 1, OUTER )
                r = float(j);
                do( k = 1, INNER )
                        s = float(k);
                        do( m = 1, INNERMOST )
                                t = float(m);
                                v = 1./t;
                                w = s * v;
                                x = r + w;
                                y = q - x;
                                a = y * v;
                                b = a * t;
                                c = b - w;
                                d = c / q;
                                f = d - r;
                                g = f * s;
                                p = p + g;
                        enddo
                enddo
        enddo
  enddo

  dt = secnds(dt) - dtover;

  error  = abs(p) - ANSWER;
  perror = abs(100. * error / ANSWER);
  rating = 11. * float(OUTERMOST*OUTER*INNER*INNERMOST) / dt;
  write(stdout,*)' total execution time  = ',dt+2.*dtover;
  write(stdout,*)' overhead              = ',2.*dtover;
  write(stdout,*)' <*> s-flops rating    = ',nint(rating);
  write(stdout,*)' <*> s-flops cpu time  = ',dt;
  write(stdout,*)' <*> real*4 mantissa   = ',sbits;
  write(stdout,*)' <*> real*4 length     = ',32;
  write(stdout,*)' <*> percent error     = ',perror;

end

/*
 *   exactly like the single precision test except in double precision
 */

subroutine dflops(dbits)
implicit none;
integer*4 dbits;

  real*8    a,b,c,d,e,f,g,h,p,q,r,s,t,u,v,w,x,y,z,rating,perror;
  real*8    error;
  real*4    dt,secnds,dtover;
  integer*4 i,j,k,l,m;

  dtover = secnds(0.);
  do( i = 1, OUTERMOST )
        q = dble(i);
        do( j = 1, OUTER )
                r = dble(j);
                do( k = 1, INNER )
                        s = dble(k);
                        do( m = 1, INNERMOST )
                                t = dble(m);
                        enddo
                enddo
        enddo
  enddo
  dtover = secnds(dtover);

  p = 1.d0;
  dt = secnds(0.);
  do( i = 1, OUTERMOST )
        q = dble(i);
        do( j = 1, OUTER )
                r = dble(j);
                do( k = 1, INNER )
                        s = dble(k);
                        do( m = 1, INNERMOST )
                                t = dble(m);
                                v = 1./t;
                                w = s * v;
                                x = r + w;
                                y = q - x;
                                a = y * v;
                                b = a * t;
                                c = b - w;
                                d = c / q;
                                f = d - r;
                                g = f * s;
                                p = p + g;
                        enddo
                enddo
        enddo
  enddo
  dt = secnds(dt)-dtover;

  error  = abs(p) - ANSWER;
  perror = abs(100. * error / ANSWER);
  rating = 11. * dble(OUTERMOST*OUTER*INNER*INNERMOST)/dble(dt);
  write(stdout,*)' total execution time  = ',dt+2.*dtover;
  write(stdout,*)' overhead              = ',2.*dtover;
  write(stdout,*)' <*> d-flops rating    = ',nint(rating);
  write(stdout,*)' <*> d-flops cpu time  = ',dt;
  write(stdout,*)' <*> real*8 mantissa   = ',dbits;
  write(stdout,*)' <*> real*8 length     = ',64;
  write(stdout,*)' <*> percent error     = ',perror;

end

real*4 function secnds(offset)
#include <lib\gemdos.inc>
real*4 offset;

  integer*4 atari,dummy,stack,systimer,oldstack;
  real*4 mspt;
  parameter ( mspt = 5.0e-3 );               /* seconds per tick ( 5 ms ) */

  oldstack = atari( Super, 0 );              /* save stack               */
  systimer = long(z'4BA');                   /* change mode and read     */
  dummy = atari( Super, oldstack );          /* timer, and restore stack */
  secnds = -offset + mspt * float(systimer); /* convert ticks to seconds */

return
end

########################################################################

/*
 *                            atari basic
 *                   version of the flops benchmark
 *                       single precision only
 *              ( double looked too slow to wait for )
 */

1000  rem:---------------------------------------------
1010  rem:
1020  rem:   flops benchmark, atari basic version.
1030  rem:
1040  rem:---------------------------------------------
9920  defsng a-h,p-z
10100 defint i-n
10130 def seg = 0
10135 loc# = 1210
10140 def fntimer(z) = .005 * peek( loc# ) - z
10145 ANSWER = 2.061914565513972e7
10150 rem:----------------------------------:
10170 rem:   mantissa of single precision   :
10190 rem:----------------------------------:
10200 s = 1.
10210 i# = 0
10220 while (s+1.) <> 1.
10230    i# = i# + 1#
10240    s = s / 2.
10250 wend
13000 rem:-------------------------------------------------
13010 rem:
13020 rem:   single precision test...
13030 rem:
13040 rem:-------------------------------------------------
13100 dtover = fntimer(0.)
13200 for i# = 1# to 20
13300     q = float(i#)
13400     for j# = 1 to 20
13500         r = float(j#)
13600         for k# = 1 to 20
13700             s = float(k#)
13800             for m# = 1 to 20
13900                 t = float(m#)
14000             next m#
14010         next k#
14020      next j#
14030 next i#
14040 dtover = fntimer(dtover)
15000 p = 1.
15100 dt = fntimer(0.)
15200 for i# = 1 to 20
15300     q = float(i#)
15400     for j# = 1 to 20
15500         r = float(j#)
15600         for k# = 1 to 20
15700             s = float(k#)
15800             for m# = 1 to 20
15900                   t = float(m#)
15990                   v = 1./t
15991                   w = s * v
15992                   x = r + w
15993                   y = q - x
15994                   a = y * v
15995                   b = a * t
15996                   c = b - w
15997                   d = c / q
15998                   f = d - r
15999                   g = f * s
16000                   p = p + g
16001             next m#
16010         next k#
16020      next j#
16030 next i#
16040 dt = fntimer(dt) - dtover
16050 erratum = abs(p) - ANSWER
16060 perror = abs(100.*erratum/ANSWER)
16070 rating = 11. * float(20.*20.*20.*20.)/dt
16073 print " <*> s-flops rating   = ";int(rating+.5)
16080 print " <*> s-flops cputime  = ";dt
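16090 rem: i# was reused as a loop counter above, so recount the mantissa
16091 s = 1.
16092 i# = 0
16093 while (s+1.) <> 1.
16094    i# = i# + 1#
16095    s = s / 2.
16096 wend
16097 print " <*> single mantissa  = ";i#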
16100 print " <*> percent error     = ";perror
16300 end

############################################################################