[comp.sys.transputer] transputer vs. 680x0 benchmarks

koontz%capvax.decnet@CAPSRV.JHUAPL.EDU ("CAPVAX::KOONTZ") (07/18/90)

Hi Len,

     I was going to send this to you directly but I thought other netters may 
find the numbers of interest.

     We've been trying to compare 68020s et al. with transputers for some time 
now.  All of our benchmarks to date have been floating-point intensive (scalar, 
not vector).  In the olden days, we timed an instruction mix typical of an 
algorithm of interest on both a 68020/68881 and a T800.  The 68020/68881 was 
coded in assembly and the T800 in occam.  The 68020/881 was running at 16MHz and 
the T800 at 20MHz.  We tried several different mixes that included and excluded 
transcendental functions.  The 68881 has instructions for sin, cos, etc. while 
the T800 had to make a function call and do polynomial interpolation.  With 
cache enabled on the 68020 and everything in on-chip RAM on the T800, we got the 
following:

                            68020/881 (16MHz)    T800 (20MHz)
     Mix 1 (with trans)         100                 21.34
     Mix 2 (without trans)       53                  5.82

     Lately, we've run another benchmark on 68Ks vs. T8s which is a Kalman
filter application.  This benchmark is pretty heavy on floating-point and does 
Kalman tracking of X radar targets (with tracks synthesized before the benchmark 
begins).  The benchmark was written from common specs in both C and Ada and was 
mainly used to compare the relative strengths of various C compilers to various 
Ada compilers for 68020s.  We recently took both sources and ran them through 
Logical Systems C (89.1 Beta3) and Alsys Ada (v4.4).  Here's what we got for
the best runs under LSC and Alsys and for the best compilers found for the 68020 
(Microtek C and XD-Ada):

                           68020/881 (20MHz)     T800 (20MHz)
     Kalman (C)               21.0                  2.68
     Kalman (Ada)             16.0                  6.41

     The benchmarks use their own math functions, so no hardware transcendental 
instructions are used on either computer.  We ran the benchmarks again in 
C allowing the program to use the math libraries supplied with the compiler.  
For the 68020, this allowed sin, cos, sqrt, and arctan to generate 68881 
instructions; for the T800, sqrt instructions were generated.  The results for 
this pass were:

     Kalman (C) w/trans        8.38                 2.11

As a brief aside, we had no trouble compiling the benchmarks under C or Ada.
Both sources were expertly crafted (no sloppy use of the language); I'm sure 
this helped ease the porting.


Discussion:

     The Kalman results (in C) line up well with the instruction mixes we ran 
almost 2 years earlier.  If no transcendentals are used, expect roughly 8 times 
better performance from the T800 than from the 68020/68881.  If transcendentals 
are used, expect roughly 4 times better.
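
     The ratios behind those rules of thumb can be checked directly from the 
tables.  What follows is a minimal sketch in C, not part of the original 
benchmark code; the struct, the function names, and the linear clock-scaling 
step are assumptions made for illustration.  The post does not state the units 
of the table entries, so the code treats them only as times where lower is 
better.

```c
#include <assert.h>
#include <stdio.h>

/* One row of the posted tables. Units are not stated in the post. */
struct result {
    const char *name;
    double t_68k;     /* 68020/68881 time */
    double t_t800;    /* T800 time */
    double mhz_68k;   /* 68020 clock for that run */
    double mhz_t800;  /* T800 clock for that run */
};

static const struct result results[] = {
    { "Mix 1 (with trans)",    100.0,  21.34, 16.0, 20.0 },
    { "Mix 2 (without trans)",  53.0,   5.82, 16.0, 20.0 },
    { "Kalman (C)",             21.0,   2.68, 20.0, 20.0 },
    { "Kalman (Ada)",           16.0,   6.41, 20.0, 20.0 },
    { "Kalman (C) w/trans",      8.38,  2.11, 20.0, 20.0 },
};

/* Raw ratio of the two posted times: how many times faster the T800 ran. */
double speedup(const struct result *r)
{
    assert(r->t_t800 > 0.0);
    return r->t_68k / r->t_t800;
}

/*
 * ASSUMPTION (not from the post): 68020 time scales inversely with clock,
 * so a 16MHz result is rescaled to the T800's clock before comparing.
 * Real hardware rarely scales this cleanly (memory wait-states, etc.).
 */
double speedup_clock_scaled(const struct result *r)
{
    double t_68k_scaled = r->t_68k * r->mhz_68k / r->mhz_t800;
    assert(r->t_t800 > 0.0);
    return t_68k_scaled / r->t_t800;
}

/* Print both ratios for every row of the table. */
void print_speedups(void)
{
    size_t n = sizeof results / sizeof results[0];
    for (size_t i = 0; i < n; i++) {
        printf("%-24s raw %5.2f  clock-scaled %5.2f\n",
               results[i].name,
               speedup(&results[i]),
               speedup_clock_scaled(&results[i]));
    }
}
```

     Calling print_speedups() from a main() prints one line per row.  The raw 
Kalman (C) ratios come out near 7.8 without and 4.0 with transcendentals; the 
clock-scaled column differs only for the two instruction mixes, where the 68020 
ran at 16MHz.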

     The difference between C and Ada times is interesting to observe.  Ada 
compilers have been maturing at a fast clip for 68020s; their optimizers can now 
beat C!  However, the Alsys compiler is the 1st Ada for transputers while LSC 
has been maturing nicely for years.  Still, Alsys sure beats the pants off the 
1st Ada compiler for 68020s we tried years back (I think the execution time was 
somewhere out in turtle-land at 200+ seconds!).

     In general, benchmarks are nasty things.  You have to get your application 
(or something approaching your application) running on multiple machines. Then,
make sure the hardware is similar (similar clocks, memory wait-states, use of 
cache or on-chip RAM, etc.).  Then, try to set every optimization switch to the 
position that gives the fastest performance.  We generally send 
several people knowledgeable about their hardware and compiler off with some 
code and say "Make it run as fast as you can."  In the end, you hope that your 
comparisons are useful and not apples-to-oranges.
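
     The "make it run as fast as you can" step usually ends in a timing loop.  
A minimal best-of-N sketch in standard C follows; it is not the harness used 
for the numbers above.  best_of_n_seconds and dummy_workload are hypothetical 
names, and clock() reports processor time at whatever resolution the C library 
provides, so a real run would want a workload long enough to swamp that 
resolution.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in type for the code under test. */
typedef void (*workload_fn)(void);

/*
 * Run the workload `runs` times and return the shortest processor time
 * observed, in seconds. Taking the minimum discards runs disturbed by
 * other activity on the machine.
 */
double best_of_n_seconds(workload_fn work, int runs)
{
    double best = -1.0;

    assert(work != NULL && runs > 0);
    for (int i = 0; i < runs; i++) {
        clock_t start = clock();
        work();
        clock_t stop = clock();
        double elapsed = (double)(stop - start) / CLOCKS_PER_SEC;
        if (best < 0.0 || elapsed < best)
            best = elapsed;
    }
    return best;
}

/* Result is stored here so the compiler cannot optimize the loop away. */
static volatile double sink;

/* Dummy scalar floating-point workload: a partial harmonic sum. */
void dummy_workload(void)
{
    double acc = 0.0;
    for (int i = 1; i <= 1000000; i++)
        acc += 1.0 / (double)i;
    sink = acc;
}
```

     A call such as best_of_n_seconds(dummy_workload, 5) returns the fastest of 
five runs.  The same function would be built with each compiler and switch 
setting under test, with the real benchmark substituted for dummy_workload.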

     We're going to run the Kalman benchmark on newer chips as they become 
available.  We would like to run it on a 68030/68882, 68040, T801, and T805 at 
faster clock rates (33-40MHz on the 68K class, 25-30MHz on the T8s).  We have some 
68030/882 boards (33MHz Tadpole Multibus-II things) but no 68040s yet.  We don't 
have any T801 TRAMs yet but I hope to briefly borrow one for a day from a 
vendor.  Same for the T805 if a TRAM is available. LSC 89.1 will compile for any 
T-type; I'm not sure if Alsys Ada will run on a T801 or T805.  I think Microtek 
C supports 68030/882's now but not the 68040 yet; I don't think XD-Ada runs on a 
68030 or 040.

     We're also going to port an ESM signal processing algorithm that we have 
in C to Ada and run the tests on 680x0 and T80x hardware.

     Keep you posted.


Ken Koontz
Senior Engineer
The Johns Hopkins University
  Applied Physics Laboratory
Laurel, MD USA
email: koontz@capsrv.jhuapl.edu