[comp.sys.mac.hardware] 02/Re: 68040 for 68030 substitution

Charlie.Mingo.Of.30039/4218@f421.n109.z1.FidoNet.Org (Charlie Mingo Of 30039/4218) (06/13/91)

Figure 2, DayStar recommends that its graphics and DTP customers not buy
an optional 68882 FPU on its accelerators.

3.1 Integer Performance 
For integer performance, the 68040 has a high degree of instruction
parallelism -- it is capable of executing in one clock cycle an
instruction that may take 3-4 cycles to execute on a 68030.  The 68040
has two 4,096-byte caches, one for instructions and one for data, and
both are four-way set associative.  Contrast this with the 68030, which
has only 256-byte direct-mapped caches (less efficient).  Therefore, the
68040 will exhibit a much higher "hit" rate, allowing zero wait state
performance up to 40 MHz.  In fact, the 68040 caches are so efficient
that there will be no need to add an external cache, as is required
with the faster 68030s.
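
To illustrate why associativity and size affect the hit rate, here is a
minimal Python sketch of a cache simulator.  It is not a model of either
chip: the 16-byte line size, the LRU replacement policy, and the address
trace (a loop alternating between two buffers 4,096 bytes apart) are all
assumptions chosen to make the effect visible.

```python
# Sketch: hit rate of a small direct-mapped cache vs. a larger four-way
# set-associative cache on the same address trace.
# LINE_SIZE, LRU replacement, and the trace are assumptions for
# illustration only; they are not taken from the post or from Motorola.

LINE_SIZE = 16  # bytes per cache line (assumed)


def hit_rate(addresses, cache_bytes, ways):
    """Simulate an LRU cache and return the fraction of accesses that hit."""
    num_sets = cache_bytes // (LINE_SIZE * ways)
    sets = [[] for _ in range(num_sets)]  # each set lists tags, LRU first
    hits = 0
    for addr in addresses:
        line = addr // LINE_SIZE
        index = line % num_sets
        tag = line // num_sets
        tags = sets[index]
        if tag in tags:
            hits += 1
            tags.remove(tag)   # re-appended below as most recently used
        elif len(tags) == ways:
            tags.pop(0)        # set is full: evict least recently used
        tags.append(tag)
    return hits / len(addresses)


# Hypothetical trace: 100 passes over two 64-byte buffers, 4,096 bytes
# apart, reading one longword from each buffer in turn.
trace = [base + offset
         for _ in range(100)
         for offset in range(0, 64, 4)
         for base in (0x0000, 0x1000)]

small_direct = hit_rate(trace, cache_bytes=256, ways=1)
large_4way = hit_rate(trace, cache_bytes=4096, ways=4)

print(small_direct)  # 0.0: the two buffers keep evicting each other
print(large_4way)    # 0.9975: only the 8 first-touch misses remain
```

The trace is deliberately a worst case for a direct-mapped cache, so the
numbers say nothing about real application hit rates.  They only show
the mechanism: with four ways per set, lines that map to the same set
can coexist instead of evicting one another.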

Predicted integer performance for the 68040 (based on Motorola data) is
shown in Table 3 against a zero wait state 68030.  All figures are
relative to a 40 MHz 68030 (to represent a Mac IIfx).  Expected gains
for the 25 MHz 68040 are only on the order of 30% (a factor of 1.3)
when compared to the 40 MHz 68030.

Table 3 
Performance Relative to a 40 MHz 68030       Ref: Motorola
                                             
Clock          68030          68040          68040 Volume Ship
--------------------------------------------------------------
16 MHz         0.4            n/a            n/a
25 MHz         0.6            1.3            Q2-91
33 MHz         0.8            1.7            Q1-92
40 MHz         1.0            2.1            Late 92
50 MHz         1.3            n/a            n/a

Gains of 30% will not satisfy power users.  They really demand gains of
100-200%, and these will not be available for several years, at least
for the Mac IIfx.  Gains on the 16 MHz Mac IIs should be a little over
three times greater when the 40 MHz 68040 is introduced in late 1992, so
an appreciable upgrade market will exist for users who want better than
IIfx-class performance.  But in the meantime, will the initial 68040
compatibility problems make it a worse choice than a IIfx upgrade or a
50 MHz accelerator?
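
The percentage gains quoted above follow directly from the Table 3
ratios.  A small Python sketch of the arithmetic, with the table values
copied as-is (the function names are hypothetical helpers, not anything
from DayStar or Motorola):

```python
# Relative integer performance from Table 3 (40 MHz 68030 = 1.0).
# Keys are (processor, clock in MHz); values are copied from the table.
TABLE_3 = {
    ("68030", 16): 0.4,
    ("68030", 25): 0.6,
    ("68030", 33): 0.8,
    ("68030", 40): 1.0,
    ("68030", 50): 1.3,
    ("68040", 25): 1.3,
    ("68040", 33): 1.7,
    ("68040", 40): 2.1,
}


def speedup(new, old):
    """Ratio of the two relative-performance figures."""
    return TABLE_3[new] / TABLE_3[old]


def percent_gain(new, old):
    """Speedup expressed as a percentage gain over the old part."""
    return (speedup(new, old) - 1.0) * 100.0


# 25 MHz 68040 vs. the 40 MHz 68030 used to represent the Mac IIfx
gain_25 = round(percent_gain(("68040", 25), ("68030", 40)))   # 30
# 40 MHz 68040 vs. the same baseline
gain_40 = round(percent_gain(("68040", 40), ("68030", 40)))   # 110
```

The same helper can be applied to any pair of rows, keeping in mind
that the table lists predicted figures, not measurements.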

3.2 FPU performance 
The real power of the 68040 lies in its FPU performance.  By combining
the CPU and FPU on the same piece of silicon, FPU performance has been
boosted threefold.  But to achieve this integration, Motorola accepted
a major sacrifice in instruction set commonality.  Applications not
written to directly address the 68040 FPU will either have to be
rewritten, or will have to operate through about 256K of code that
translates the 68882 calls into 68040 calls.  The overhead required for
this translation process will drastically reduce 68040 FPU performance
gains.

If an FPU intensive function is rewritten to directly use the 68040 FPU
instruction set, then performance gains can be substantial. Table 4
contains estimates for the impact of the 68040 FPU on the two FPU
intensive functions shown in Table 2.

Table 4 
Estimated Possible 25 MHz 68040 FPU Performance Gain       Ref: DayStar  
                                                          
Platform                Mac IIci    Accel IIci  Accel IIci  Accel IIci
Processor               68030       68030       68040       FPU
Clock                   25 MHz      50 MHz      25 MHz      % Gain
FPU                     Yes         Yes         Yes         
-----------------------------------------------------------------------
RenderMan   Render      98.0        56.0        17.8        215%
Excel       Recalculate 10.4        5.6         3.5         60%
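
The "% Gain" column can be reproduced from the other columns.  The
sketch below assumes the Table 4 entries are elapsed times (lower is
better) and that the gain compares the 25 MHz 68040 accelerator against
the 50 MHz 68030 accelerator; that reading is an inference, but it does
match the published percentages.

```python
def fpu_gain_percent(baseline, upgraded):
    """Percent gain when a task drops from `baseline` to `upgraded`.

    Both arguments are assumed to be elapsed times in the same units.
    """
    return (baseline / upgraded - 1.0) * 100.0


# Values copied from Table 4: 50 MHz 68030 column vs. 25 MHz 68040 column
renderman_gain = round(fpu_gain_percent(56.0, 17.8))  # 215
excel_gain = round(fpu_gain_percent(5.6, 3.5))        # 60
```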

In summary, the Mac community will not see immediate gains from the
68040.  For integer work, a 25 MHz 68040 is not much faster than a Mac
IIfx, and 68040 FPU performance will be of little benefit to the
typical Mac user.  However, in several years the 40 MHz 68040 will be
double the speed of the Mac IIfx, and will offer even greater gains for
CAD and scientific functions that directly use the 68040's FPU.

3.3 Today's Accelerator Performance 
The limited gains of the 25 MHz 68040 are verified by benchmarks run at
the January 1991 San Francisco Macworld.  There, prototype accelerators
were being shown by two different companies.  In Table 5, benchmark
performance is shown against current state-of-the-art machines.  These
tests show that gains in integer performance are below Motorola
estimates.  FPU performance is no better than that of a regular Mac.
These prototypes were operating in a very restricted environment (they
were only running benchmarks).  Applications were not being shown.  In
contrast, once the 68030 was stable, up and running, there were few Mac
OS or application problems to overcome.  In all fairness, these were
just early engineering prototypes, and the vendors had not yet
"tweaked" performance to the maximum, as is common in the development
process.

Table 5 
25 MHz Prototype Accelerator Performance    Ref: DayStar Measurements  
                                                 
OEM                 Apple     Apple     DayStar   TokaMac   IIR 
Platform            Mac IIci  Mac IIfx  Mac IIci  Mac LC    Mac II/IIx
CPU                 68030     68030     68030     68040     68040
FPU                 Yes       Yes       Yes       Yes       Yes
Speed               25 MHz    40 MHz    50 MHz    25 MHz    25 MHz
-----------------------------------------------------------------------
Float     Integer   0.18      0.15      0.10      0.20      0.10
Trig      FPU       0.57      0.36      0.32      3.18      1.20
Butterfly FPU       2.33      2.17      1.57      4.18      2.40
Ripples   FPU       17.10     12.87

 * Origin: The Clone: Macintosh Things - 301-946-8677 (1:109/421)