Charlie.Mingo.Of.30039/4218@f421.n109.z1.FidoNet.Org (Charlie Mingo Of 30039/4218) (06/13/91)
Figure 2, DayStar recommends that its graphics and DTP customers not buy
an optional 68882 FPU on its accelerators.
3.1 Integer Performance
For integer performance, the 68040 has a high degree of instruction
parallelism -- it can execute in one clock cycle an instruction that
may take 3-4 cycles on a 68030. The 68040 has two 4,096-byte caches,
one for instructions and one for data, and both are four-way set
associative. Contrast this with the 68030, which has only 256-byte
direct-mapped caches (less efficient). Therefore, the 68040 will
exhibit a much higher "hit" rate, allowing zero-wait-state performance
up to 40 MHz. In fact, the 68040 caches are so efficient that there
will be no need to add an external cache, as is required with the
faster 68030s.
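
To illustrate why a larger, set-associative cache raises the hit rate,
here is a minimal simulation sketch. The 16-byte line size, the LRU
replacement policy, and the access pattern (a loop that sweeps a 1 KB
working set ten times) are assumptions chosen for illustration; they
are not taken from Motorola data, and real hit rates depend on the
workload.

```python
from collections import OrderedDict

def hit_rate(addresses, cache_bytes, ways, line_bytes=16):
    """Simulate a set-associative cache and return the fraction of hits.

    ways=1 models a direct-mapped cache. LRU replacement is a
    simplifying assumption, not a claim about the real chips.
    """
    num_sets = cache_bytes // (line_bytes * ways)
    sets = [OrderedDict() for _ in range(num_sets)]
    hits = 0
    for addr in addresses:
        line = addr // line_bytes      # which memory line is touched
        index = line % num_sets        # which set that line maps to
        tag = line // num_sets         # identifies the line within the set
        entries = sets[index]
        if tag in entries:
            hits += 1
            entries.move_to_end(tag)   # mark as most recently used
        else:
            if len(entries) >= ways:
                entries.popitem(last=False)  # evict least recently used
            entries[tag] = True
    return hits / len(addresses)

# Hypothetical workload: ten sweeps over a 1 KB buffer, 4 bytes at a time.
trace = [addr for _ in range(10) for addr in range(0, 1024, 4)]

small_direct = hit_rate(trace, cache_bytes=256, ways=1)    # 68030-style
large_4way = hit_rate(trace, cache_bytes=4096, ways=4)     # 68040-style
print(f"256-byte direct mapped: {small_direct:.3f}")
print(f"4,096-byte four-way:    {large_4way:.3f}")
```

With this particular trace the working set does not fit in the small
cache, so every sweep reloads each line, while the larger cache misses
only on the first sweep.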
Predicted integer performance for the 68040 (based on Motorola data) is
shown in Table 3 against a zero-wait-state 68030. Performance is
expressed relative to a 40 MHz 68030 (to represent a Mac IIfx). Expected
gains for the 25 MHz 68040 are only on the order of 30% (a factor of
1.3) when compared to the 40 MHz 68030.
Table 3
Performance Relative to a 40 MHz 68030                 Ref: Motorola

Clock       68030       68040       68040 Volume Ship
--------------------------------------------------------------
16 MHz       0.4         n/a         n/a
25 MHz       0.6         1.3         Q2-91
33 MHz       0.8         1.7         Q1-92
40 MHz       1.0         2.1         Late 92
50 MHz       1.3         n/a         n/a
Gains of 30% will not satisfy power users. They really demand gains of
100-200%, and these will not be available for several years, at least
for the Mac IIfx. Gains on the 16 MHz Mac IIs should be a little over
three times greater when the 40 MHz 68040 is introduced in late 1992, so
an appreciable upgrade market will exist for users who want better than
IIfx-class performance. But in the meantime, will the initial 68040
compatibility problems make it more trouble than a IIfx upgrade or a 50
MHz accelerator?
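
The arithmetic behind these comparisons can be sketched from the
Table 3 figures. The dictionary layout and the names speedup and
percent_gain are illustrative choices, not anything from Motorola or
DayStar; the numbers themselves are copied from the table.

```python
# Relative integer performance from Table 3 (40 MHz 68030 = 1.0).
TABLE3 = {
    ("68030", 16): 0.4,
    ("68030", 25): 0.6,
    ("68030", 33): 0.8,
    ("68030", 40): 1.0,
    ("68030", 50): 1.3,
    ("68040", 25): 1.3,
    ("68040", 33): 1.7,
    ("68040", 40): 2.1,
}

def speedup(new, old):
    """How many times faster `new` is than `old`, per Table 3."""
    return TABLE3[new] / TABLE3[old]

def percent_gain(new, old):
    """The same comparison expressed as a percentage gain."""
    return (speedup(new, old) - 1.0) * 100.0

IIFX = ("68030", 40)        # the text uses this row to represent a Mac IIfx
MAC_II_16 = ("68030", 16)   # the 16 MHz row, standing in for 16 MHz Mac IIs

for mhz in (25, 33, 40):
    chip = ("68040", mhz)
    print(f"{mhz} MHz 68040: {percent_gain(chip, IIFX):.0f}% over a IIfx, "
          f"{speedup(chip, MAC_II_16):.2f}x a 16 MHz machine")
```

On these figures the 25 MHz part works out to about 3.25 times the
16 MHz row and the 40 MHz part to about 5.25 times, so the table alone
does not settle which comparison "a little over three times" refers to.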
3.2 FPU Performance
The real power of the 68040 lies in its FPU performance. By combining
the CPU and FPU on the same piece of silicon, Motorola has boosted FPU
performance threefold. But to achieve this integration, Motorola
accepted a major sacrifice in instruction set commonality. Applications
not written to directly address the 68040 FPU will either have to be
rewritten or will have to run through about 256K of code that
translates the 68882 calls into 68040 calls. The overhead of this
translation will drastically reduce 68040 FPU performance gains.
If an FPU-intensive function is rewritten to use the 68040 FPU
instruction set directly, then performance gains can be substantial.
Table 4 contains estimates for the impact of the 68040 FPU on the two
FPU-intensive functions shown in Table 2.
Table 4
Estimated Possible 25 MHz 68040 FPU Performance Gain     Ref: DayStar

Platform            Mac IIci    Accel IIci   Accel IIci   Accel IIci
Processor           68030       68030        68040        FPU
Clock               25 MHz      50 MHz       25 MHz       % Gain
FPU                 Yes         Yes          Yes
-----------------------------------------------------------------------
RenderMan Render    98.0        56.0         17.8         215%
Excel Recalculate   10.4         5.6          3.5          60%
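
As a check on how the "% Gain" column relates to the other three, the
sketch below recomputes it. It assumes the table entries are elapsed
times (lower is better), which the table does not state, and the
function name percent_gain is an illustrative choice. Under that
assumption the published gains match a comparison against the 50 MHz
68030 accelerator rather than against the stock IIci.

```python
# Values copied from Table 4; assumed to be elapsed times (lower is better).
TABLE4 = {
    "RenderMan Render":  {"68030_25": 98.0, "68030_50": 56.0, "68040_25": 17.8},
    "Excel Recalculate": {"68030_25": 10.4, "68030_50": 5.6,  "68040_25": 3.5},
}

def percent_gain(old_time, new_time):
    """Percentage speed gain when a task drops from old_time to new_time."""
    return (old_time / new_time - 1.0) * 100.0

for task, t in TABLE4.items():
    vs_accel = percent_gain(t["68030_50"], t["68040_25"])
    vs_stock = percent_gain(t["68030_25"], t["68040_25"])
    print(f"{task}: {vs_accel:.0f}% vs 50 MHz 68030, "
          f"{vs_stock:.0f}% vs 25 MHz 68030")
```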
In summary, the Mac community will not see immediate performance gains
from the 68040. For integer work, a 25 MHz 68040 is not much faster
than a Mac IIfx, and 68040 FPU performance will be of little benefit to
the typical Mac user. However, in several years the 40 MHz 68040 will
be double the speed of the Mac IIfx and offer even greater gains for
CAD and scientific functions that directly use the 68040's FPU.
3.3 Today's Accelerator Performance
The limited gains of the 25 MHz 68040 are confirmed by benchmarks run at
the January 1991 Macworld in San Francisco, where two different
companies were showing prototype accelerators. In Table 5, benchmark
performance is shown against current state-of-the-art machines. These
tests show that gains in integer performance are below Motorola
estimates, and FPU performance is no better than that of a regular Mac.
These prototypes were operating in a very restricted environment (they
were only running benchmarks); applications were not being shown. In
contrast, once the 68030 was stable, up and running, there were few Mac
OS or application problems to overcome. In all fairness, these were just
early engineering prototypes, and their performance had not yet been
"tweaked" to the maximum, as is common in the development process.
Table 5
25 MHz Prototype Accelerator Performance     Ref: DayStar Measurements

OEM                  Apple      Apple      DayStar    TokaMac    IIR
Platform             Mac IIci   Mac IIfx   Mac IIci   Mac LC     Mac II/IIx
CPU                  68030      68030      68030      68040      68040
FPU                  Yes        Yes        Yes        Yes        Yes
Speed                25 MHz     40 MHz     50 MHz     25 MHz     25 MHz
-----------------------------------------------------------------------
Float      Integer   0.18       0.15       0.10       0.20       0.10
Trig       FPU       0.57       0.36       0.32       3.18       1.20
Butterfly  FPU       2.33       2.17       1.57       4.18       2.40
Ripples    FPU       17.10      12.87
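
The comparison with Motorola's estimate can be worked through from the
Table 5 figures. This is a sketch under two assumptions: that the
benchmark values are elapsed times (lower is better), which the table
does not state, and that the 25 MHz Mac IIci is the fair baseline for a
25 MHz 68040. The Ripples row is left out because it is incomplete, and
the short system labels are illustrative.

```python
# Values copied from Table 5; assumed to be elapsed times (lower is better).
SYSTEMS = ["IIci", "IIfx", "DayStar", "TokaMac", "IIR"]
TIMES = {
    "Float":     [0.18, 0.15, 0.10, 0.20, 0.10],
    "Trig":      [0.57, 0.36, 0.32, 3.18, 1.20],
    "Butterfly": [2.33, 2.17, 1.57, 4.18, 2.40],
}

def speed_ratio(test, system, baseline="IIci"):
    """Speed of `system` relative to `baseline` (above 1.0 means faster)."""
    row = TIMES[test]
    return row[SYSTEMS.index(baseline)] / row[SYSTEMS.index(system)]

# Motorola's Table 3 estimate for a 25 MHz 68040 over a 25 MHz 68030.
MOTOROLA_ESTIMATE = 1.3 / 0.6

print(f"Motorola integer estimate: {MOTOROLA_ESTIMATE:.2f}x")
for test in TIMES:
    for proto in ("TokaMac", "IIR"):
        print(f"{test:10s} {proto:8s} {speed_ratio(test, proto):.2f}x a IIci")
```

On these assumptions, both prototypes fall short of the estimated
integer ratio on the Float test, and both run the three FPU tests more
slowly than the stock IIci.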
* Origin: The Clone: Macintosh Things - 301-946-8677 (1:109/421)