[comp.benchmarks] 68040 and Floats, is this true?

irf@kuling.UUCP (Bo Thide') (06/14/91)

In order to see how well the 68040 performs in general, and on sprintf()
in particular, I ran the "C Cost" benchmark on the HP9000/425t (68040/25
MHz), HP9000/400t (68030/50 MHz) and the Sun SparcStation 1.  The
results are presented below.  As is seen, the HP-UX 7.05 floating-point
sprintf() (the sprintf(s,sf62,123.45) row) on the 68040 is a factor of
two *slower* than on the 68030 but a factor of two *faster* than in
SunOS 4.1 on the Sun SparcStation (lower numbers = faster).

The "C Cost" program is taken from an article titled "An Elementary C
Cost Model" written by Jon Bentley, Brian Kernighan, and Chris Van Wyk
contained within the Volume 9 Number 2 issue of "Unix Review", February
1991.
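
For readers without the article, here is a minimal sketch of how a
"Mics/N" figure (microseconds per iteration) can be measured.  It is not
the Bentley/Kernighan/Van Wyk program; the function names mics_per_iter
and time_int_add are hypothetical, and the sketch assumes that clock()
has enough resolution for the chosen n.

```c
#include <assert.h>
#include <time.h>

/* Convert an elapsed clock() interval into microseconds per iteration. */
double mics_per_iter(clock_t start, clock_t end, long n)
{
    double seconds;

    assert(n > 0);
    seconds = (double)(end - start) / CLOCKS_PER_SEC;
    return seconds * 1e6 / (double)n;
}

/* Time n executions of "i1 = i2 + i3".  The variables are volatile so
   that the compiler must perform every load and store (an assumption
   of this sketch, not something the original benchmark is known to do). */
double time_int_add(long n)
{
    volatile int i1 = 0, i2 = 3, i3 = 4;
    long i;
    clock_t start;

    assert(n > 0);
    start = clock();
    for (i = 0; i < n; i++)
        i1 = i2 + i3;
    return mics_per_iter(start, clock(), n);
}
```

A call such as time_int_add(1000000L) returns the loop overhead plus the
cost of the operation; subtracting the result for an empty loop gives a
number comparable in spirit to those in the table.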

RESULTS:

-------------------------------------------------------------
Operation                         Mics/N  Mics/N  Mics/N
                                  HP425t  HP400t  Sun Sparc-
                                  (68040) (68030) Station 1
Null Loop (n=1000000)           
 {}                                 0.00    0.43    0.18
Int Operations (n=1000000)              
 i1++                               0.16    0.18    0.34
 i1 = i2                            0.16    0.19    0.35
 i1 = i2 + i3                       0.24    0.35    0.30
 i1 = i2 - i3                       0.24    0.35    0.30
 i1 = i2 * i3                       0.36    1.21    0.30
 i1 = i2 / i3                       2.02    2.11    0.31
 i1 = i2 % i3                       2.02    2.12    0.30
Float Operations (n=1000000)           
 f1 = f2                            0.24    0.19    0.42
 f1 = f2 + f3                       0.40    2.68    0.43
 f1 = f2 - f3                       0.40    2.68    0.42
 f1 = f2 * f3                       0.48    3.29    0.43
 f1 = f2 / f3                       1.78    3.70    0.42
Numeric Conversions (n=1000000)         
 i1 = f1                            1.83    4.92    0.49
 f1 = i1                            0.49    1.92    0.79
Integer Vector Operations (n=1000000)
 v[i] = i                           0.41    0.39    0.38
 v[v[i]] = i                        0.59    0.71    0.62
 v[v[v[i]]] = i                     0.73    0.83    0.82
Control Structures (n=1000000)          
 if (i == 5) i1++                   0.28    0.27    0.12
 if (i != 5) i1++                   0.38    0.39    0.67
 while (i < 0) i1++                 0.32    0.18    0.12
 i1 = sum1(i2)                      0.20    0.82    0.60
 i1 = sum2(i2, i3)                  0.28    1.11    0.67
 i1 = sum3(i2, i3, i4)              0.32    1.42    0.84
Input/Output (n=10000)          
 fputs(s,fp)                       10.00   15.57   15.42
 fgets(s,9,fp)                     11.20   20.37   11.42
 fprintf(fp,sdn,i)                 28.40   48.37   65.82
 fscanf(fp,sd,&i1)                 47.60   80.77   89.42
Malloc (n=20000)                
 free(malloc(8))                    7.60   19.57   28.82
 push(i)                            6.20   15.77   14.02
 i1 = pop()                         0.60    1.97    2.22 
String Functions (n=100000)             
 strcpy(s,s0123456789)              2.08    3.97    5.06
 i1 = strcmp(s,s)                   3.52    4.93    6.14
 i1 = strcmp(s,sa123456789)         1.16    1.49    3.42
String/Number Conversions (n=10000)
 i1 = atoi(s12345)                  5.60    8.37    7.02
 sscanf(s12345,sd,&i1)             48.40   81.57   97.02
 sprintf(s,sd,i)                   23.20   40.77   63.02
 f1 = atof(s123_45)                81.20   56.37  558.62
 sscanf(s123_45,sf,&f1)           148.80  146.37  478.62
 sprintf(s,sf62,123.45)           250.40  127.57  519.02
Math Functions (n=20000)                
 i1 = rand()                        1.60    2.17    6.22
 f1 = log(f2)                      33.20   25.37   13.02
 f1 = exp(f2)                      26.40   19.77   16.42
 f1 = sin(f2)                      24.20   20.37   19.82
 f1 = sqrt(f2)                      4.60   12.57   26.82

----------------------------------------------------------------------

I've cross-posted to comp.benchmarks for possible comments.

Bo

---

   ^   Bo Thide'--------------------------------------------------------------
  |I|       Swedish Institute of Space Physics, S-755 91 Uppsala, Sweden
  |R|  Phone: (+46) 18-303671.  Telex: 76036 (IRFUPP S).  Fax: (+46) 18-403100 
 /|F|\        INTERNET: bt@irfu.se       UUCP: ...!uunet!sunic!irfu!bt
 ~~U~~ -----------------------------------------------------------------sm5dfw

tim@proton.amd.com (Tim Olson) (06/15/91)

In article <2080@kuling.UUCP> bt@irfu.se (Bo Thide') writes:
| In order to see how well the 68040 performs in general, and on sprintf()
| in particular, I ran the "C Cost" benchmark on the HP9000/425t (68040/25
| MHz), HP9000/400t (68030/50 MHz) and the Sun SparcStation 1.  The
| results are presented below.  As is seen, the HP-UX 7.05 floating-point
| sprintf() (the sprintf(s,sf62,123.45) row) on the 68040 is a factor of
| two *slower* than on the 68030 but a factor of two *faster* than in
| SunOS 4.1 on the Sun SparcStation (lower numbers = faster).

I don't trust any of these numbers, as they appear highly suspect for
even the simple operations:

| RESULTS:
| 
| -------------------------------------------------------------
| Operation                         Mics/N  Mics/N  Mics/N
|                                   HP425t  HP400t  Sun Sparc-
|                                   (68040) (68030) Station 1
| Null Loop (n=1000000)           
|  {}                                 0.00    0.43    0.18
				      ^^^^
				      It appears that the HP compiler
				      removed the null loop through
				      dead code elimination.	
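
If dead code elimination is indeed what produced the 0.00, one way a
timing harness might guard against it is to make the loop counter
volatile, so that every increment and comparison must actually be
performed.  This is only a sketch under that assumption; the function
name null_loop_kept is hypothetical, and nothing here is known about
how the posted benchmark or the HP compiler actually behaved.

```c
#include <assert.h>

/* An empty loop that an optimizer may not remove: the counter is
   volatile, so each read and write of it is an observable access.
   Returns the final counter value, which equals n for n >= 0. */
long null_loop_kept(long n)
{
    volatile long i;

    assert(n >= 0);
    for (i = 0; i < n; i++) {
        /* intentionally empty */
    }
    return i;
}
```

Timing this loop gives a nonzero baseline, at the cost of forcing the
counter into memory, which by itself changes what is being measured.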

| Int Operations (n=1000000)              
|  i1++                               0.16    0.18    0.34
|  i1 = i2                            0.16    0.19    0.35
|  i1 = i2 + i3                       0.24    0.35    0.30
|  i1 = i2 - i3                       0.24    0.35    0.30
|  i1 = i2 * i3                       0.36    1.21    0.30
|  i1 = i2 / i3                       2.02    2.11    0.31
|  i1 = i2 % i3                       2.02    2.12    0.30

Are these register or memory operations?  (By the times listed, they
would appear to be memory-to-memory.)  Note that the SparcStation
times for multiply and divide are the same as those for the simple
operations, even though it has no hardware MUL or DIV.  In the 68040
column, why does an assignment take the same amount of time as
incrementing a variable?  That could only happen if both were
register-to-register operations, but then why would each take
160ns (four clock cycles) @ 25MHz?  Again, these numbers are highly suspect.
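
The "160ns @ 25MHz" figure can be turned into clock cycles with simple
arithmetic: at 1 MHz there is one cycle per microsecond, so cycles =
Mics/N times the clock rate in MHz.  The helper below (cycles_per_op is
a hypothetical name) is a small sketch of that conversion, using the
clock rates given in the original post.

```c
#include <assert.h>

/* Clock cycles consumed by one operation, given its cost in
   microseconds (the table's Mics/N) and the clock rate in MHz.
   One MHz is one cycle per microsecond, so the product is cycles. */
double cycles_per_op(double mics, double mhz)
{
    assert(mics >= 0.0 && mhz > 0.0);
    return mics * mhz;
}
```

For the 68040 column, cycles_per_op(0.16, 25.0) gives about 4 cycles for
i1++ and i1 = i2, and cycles_per_op(2.02, 25.0) gives about 50 cycles
for the integer divide.  For the 68030 at 50 MHz, 0.18 mics works out
to about 9 cycles.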

| Control Structures (n=1000000)          
|  if (i == 5) i1++                   0.28    0.27    0.12 <-- ??
|  if (i != 5) i1++                   0.38    0.39    0.67 <-- ??
|  while (i < 0) i1++                 0.32    0.18    0.12
|  i1 = sum1(i2)                      0.20    0.82    0.60
|  i1 = sum2(i2, i3)                  0.28    1.11    0.67
|  i1 = sum3(i2, i3, i4)              0.32    1.42    0.84

How can these two vary by such a large amount?  They should take equal times.

--
	-- Tim Olson
	Advanced Micro Devices
	(tim@amd.com)