[net.arch]

falcone@erlang.DEC (Joe Falcone, HLO2-3/N03, dtn 225-6059) (11/03/84)


68020 Performance Revisited Again

I am delighted to get responses from both Xerox PARC and Motorola.
As was pointed out, I am concentrating on the use of the 68020 in
networked virtual-memory workstations, so this influences my analysis.

Peter Deutsch's argument about the data cache is well taken, except that
the 180ns figure he uses is wrong.  The address-valid to data-valid
window is 120ns.  Allowing for parts tolerance in mass production,
you would have to design your memory management unit, virtual address
translation, and data cache for a ~100ns window.  Since all of this
is done OFF chip, the design would have to be very aggressive, and the
high-speed logic and memory involved would drive up the price.

Doug MacGregor pointed out that faster 68020 processors will follow, but these
will only exacerbate the situation.  The 24MHz part will have an 80ns window
for memory access, which will require an extremely aggressive design for
the memory management unit, virtual address translation, and data cache.
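
The arithmetic behind these windows can be sketched as follows.  This is
only an illustration: it assumes the address-valid to data-valid window
shrinks in proportion to the clock period, which reproduces the 80ns figure
from the 120ns one.  The function names are hypothetical; the 120ns, ~100ns,
and 80ns figures are the ones quoted above.

```python
# Sketch only: assumes the address-valid to data-valid window scales
# inversely with clock frequency. The 120ns window at 16MHz is from the
# post; the scaling rule and all names are illustrative assumptions.

BASE_CLOCK_MHZ = 16.0
BASE_WINDOW_NS = 120.0


def access_window_ns(clock_mhz):
    """Scale the 16MHz window to another clock rate."""
    return BASE_WINDOW_NS * BASE_CLOCK_MHZ / clock_mhz


def implied_margin(window_ns, design_ns):
    """Fraction of the window given up when designing for design_ns."""
    return 1.0 - design_ns / window_ns


for mhz in (16.0, 24.0):
    print("%2.0fMHz: %.0fns window" % (mhz, access_window_ns(mhz)))

# Designing for ~100ns inside a 120ns window gives up about a sixth of it.
print("margin at 16MHz: %.0f%%" % (100 * implied_margin(120.0, 100.0)))
```

Applying the same fraction to the 80ns window would leave well under 80ns
for MMU, translation, and cache, though the actual tolerance budget for a
24MHz design is not given here.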

The difference between MacGregor's performance table and mine comes down
to perspective: we used different bases and methods to calculate our
respective tables.

There were three points where I used conservative figures to emphasize
the ballpark nature of the comparison, as I was more interested in seeing
what league these devices were in with respect to the VAX line.

           MIPS (speedup relative to 8MHz 68000)
------------------------------------------------------------
      68K MEMORY   100ns          200ns           300ns
CPU   ------------------------------------------------------
8MHz  68000        0.6 (1x)       0.6 (1x)        0.6 (1x)
16MHz 68020*       2.1 (3.5x)     1.5 (2.5x)      1.3 (2.2x)
16MHz 68020**      2.7 (4.5x)     2.3 (3.7x)      2.0 (3.3x)
------------------------------------------------------------
*  I-cache disabled
** 100% I-cache hit ratio

1. I derived my table by dividing each figure above by 3 for the upper
   bound and 5 for the lower bound, since I had decided on 3 to 5
   as scale factors between the 68000 and the 11/780.  These scale
   factors are extremely generous to the 68000, judging by the performance
   of 8MHz 68000 and 11/780 systems running benchmarks.

2. Again, I must apologize for generosity.  I used 0.7 MIPS instead
   of 0.6 MIPS for the 8MHz 68000, which is why I have 0.14-0.23
   "VAX MIPS" (0.7 divided by 5 and by 3).  I did this because 0.6 MIPS
   seemed on the low side of my personal observations of 68K systems.
   This scaling gives MIPS figures normalized to one 780 MIPS.

3. Because there is some dispute over 780 performance even within Digital,
   I felt obliged to loosen the 780 figures to reflect the ranges reported.
   So I set the 780 at 0.7 to 1 MIPS in the comparison, downrating it by
   30% to achieve the low figure.
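
The three points above amount to a small calculation, sketched here.  The
numbers are the ones given in the points; the function and variable names
are illustrative assumptions, not anything used in the original analysis.

```python
# Sketch of the scaling described in points 1-3. Figures are from the
# post; function and variable names are illustrative assumptions.

SCALE_LOW, SCALE_HIGH = 3.0, 5.0   # 68000-to-11/780 scale factors


def vax_mips_range(native_mips):
    """Return (lower, upper) "VAX MIPS" bounds for a native MIPS figure.

    Dividing by 5 gives the lower bound, dividing by 3 the upper bound.
    """
    return native_mips / SCALE_HIGH, native_mips / SCALE_LOW


# Point 2: the 8MHz 68000 taken at 0.7 MIPS rather than 0.6 MIPS.
low, high = vax_mips_range(0.7)
print('8MHz 68000: %.2f-%.2f "VAX MIPS"' % (low, high))

# Point 3: the 780's low figure is its 1 MIPS rating downrated by 30%.
vax_780 = (1.0 * (1 - 0.30), 1.0)
print("VAX-11/780: %.1f-%.1f MIPS" % vax_780)
```

The first line printed gives the 0.14-0.23 range quoted in point 2, and the
second gives the 0.7 to 1 MIPS range of point 3.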

MacGregor generated his figures by starting with the VAX figures, deriving
the 68000 numbers by dividing by the scale factors, and then using the
68000 numbers to derive the 68020 figures with the multipliers above.

So MacGregor and I computed our figures from different bases.
This is a common practice in performance analysis, and it prevents sane
comparisons of one company's benchmarks with another's.

The truth of the matter is probably somewhere in the range between my
figures and MacGregor's, so I offer the following conciliatory table.

                              "VAX MIPS"
---------------------------------------------------------
     68K MEMORY  100ns          200ns           300ns
CPU  ----------------------------------------------------
8MHz    68000    0.14-0.25      0.14-0.25       0.14-0.25
16MHz   68020*   0.42-0.88      0.30-0.63       0.24-0.55
16MHz   68020**  0.56-1.13      0.46-0.93       0.40-0.83
---------------------------------------------------------
VAX-11/780       -->            0.7-1.0         <--
VAX-11/785       -->            1.0-1.5         <--
VAX 8600         -->            3.4-4.2         <--
---------------------------------------------------------
*  I-cache disabled
** 100% I-cache hit ratio
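
One way to read the table is to ask which 68020 configurations have a range
that overlaps a given VAX's range.  The sketch below does that.  The ranges
are copied from the table; the overlap test and the helper names are
assumptions chosen for illustration, not a criterion stated in the table.

```python
# Sketch: find 68020 configurations whose "VAX MIPS" range overlaps a
# VAX's range. Ranges are copied from the table; the overlap criterion
# and all names are illustrative assumptions.

M68020 = {
    ("I-cache disabled", 100): (0.42, 0.88),
    ("I-cache disabled", 200): (0.30, 0.63),
    ("I-cache disabled", 300): (0.24, 0.55),
    ("100% I-cache hits", 100): (0.56, 1.13),
    ("100% I-cache hits", 200): (0.46, 0.93),
    ("100% I-cache hits", 300): (0.40, 0.83),
}

VAX = {
    "VAX-11/780": (0.7, 1.0),
    "VAX-11/785": (1.0, 1.5),
    "VAX 8600": (3.4, 4.2),
}


def overlaps(a, b):
    """True if closed ranges a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def in_league(vax_name):
    """68020 (cache state, memory ns) configurations overlapping a VAX."""
    target = VAX[vax_name]
    return sorted(k for k, r in M68020.items() if overlaps(r, target))


for name in VAX:
    print(name, in_league(name))
```

By this test only the upper ends of the faster configurations reach the
11/780 range, and nothing in the table approaches the 8600.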

My initial motivation was to put an end to the practice of comparing
microprocessor chips to 7-year-old fully-configured, virtual-memory,
multi-user computer systems.  I firmly believe that the data in the
table above can be used as ballpark figures for systems built around
the 68020, but one must remain cautious of the hooks.  It is one thing
to talk about 80-100ns virtual memory management and cache; it is quite
another thing to build it.

Joe Falcone
Eastern Research Laboratory		decwrl!
Digital Equipment Corporation		decvax!deccra!jrf
Hudson, Massachusetts			tardis!