[comp.sys.m68k] 68030 MMU overhead query

levy@nsc.nsc.com (Jonathan Levy) (01/14/89)

I am cross-posting this from the 68k group to the 32k group, as it
may be of interest there.

In article <216@unet.UUCP> jimmc@unet.PacBell.COM (Jim McCrae) writes:
>
>	Also, has anyone noticed that the instruction cache on the
>	68030 can actually slow down execution speed? We saw this
>	during evaluation and as yet have no explanation.

Yes, this can happen. Small caches with low hit ratios can increase
the bus load, because every miss triggers a burst-mode line fill. The
situation becomes worse if the memory inserts wait states. The
increased bus load can then interfere with other bus activity that the
processor needs to perform (such as operand reads or writes).
You might try disabling burst mode to verify this.
(The above is theoretical, and is based on analysis we did when
selecting the cache size for the NS32532. We chose a 512-byte
instruction cache because smaller caches carried this danger of
actually degrading performance.)
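
A rough way to see the effect described above is to compare the bus
time spent on instruction fetches with the cache off against the bus
time spent on line fills with it on. The sketch below is a minimal
model in Python, not measured data: the function names and the default
timings (a 2-clock first access, 1-clock burst beats, a 4-longword
line, wait states added to every transfer) are assumptions chosen for
illustration.

```python
def line_fill_clocks(wait_states=0, first_access=2, burst_beat=1,
                     line_longwords=4):
    """Bus clocks for one burst line fill (assumed timing model)."""
    first = first_access + wait_states
    rest = (line_longwords - 1) * (burst_beat + wait_states)
    return first + rest


def single_fetch_clocks(wait_states=0, first_access=2):
    """Bus clocks for one uncached longword fetch (assumed timing model)."""
    return first_access + wait_states


def break_even_miss_ratio(wait_states=0):
    """Miss ratio above which burst fills use more bus time than
    fetching every instruction longword with the cache disabled."""
    return single_fetch_clocks(wait_states) / line_fill_clocks(wait_states)


if __name__ == "__main__":
    for ws in (0, 1, 2):
        print(ws, "wait states: break-even miss ratio",
              round(break_even_miss_ratio(ws), 3))
```

Under these assumptions the break-even miss ratio falls as wait states
are added, which is consistent with the remark that wait states make
the situation worse.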

Jonathan

schene@prisma (01/16/89)

/* Written  3:59 pm  Jan 12, 1989 by jimmc@unet in comp.sys.m68k */
/* ---------- "68030 MMU overhead query" ---------- */

>	Does anyone out there have experience with the MMU who could
>	pass on some information on the cost in cycles for an ATC
>	miss? Is the overhead measurable or is it hidden in with
>	the prefetch cycles? I'm hoping it's not a factor.

At a previous employer, I was in charge of porting the OS (a UNIX derivative)
to the '030.  I don't remember any surprises in the table-walk cycles.
Just how many cycles a table walk takes depends on the layout of
your page tables, but we didn't notice any references that weren't
expected.  Keep in mind that all page table references by the MMU are
done with read-modify-write cycles, so you won't get any cache hits on
them and hence no prefetch.  The overhead is definitely measurable and
easy to calculate.
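
The arithmetic can be sketched as follows. This is a minimal Python
sketch under assumed numbers: the function names, the clocks per
descriptor access, and the example miss ratio are hypothetical
placeholders, to be replaced with values from your own table layout
and bus timing.

```python
def table_walk_clocks(levels, clocks_per_descriptor=4, wait_states=0):
    """Clocks for one ATC miss: one descriptor access per table level.

    clocks_per_descriptor is an assumed figure, not a data-sheet value.
    """
    if levels < 1:
        raise ValueError("a table walk needs at least one level")
    return levels * (clocks_per_descriptor + wait_states)


def mmu_overhead_per_access(atc_miss_ratio, levels, **timing):
    """Average extra clocks per memory access caused by ATC misses."""
    return atc_miss_ratio * table_walk_clocks(levels, **timing)


if __name__ == "__main__":
    # Hypothetical example: three-level tables, 1% ATC miss ratio.
    print(table_walk_clocks(3))
    print(mmu_overhead_per_access(0.01, 3))
```

The point of the model is only that the cost scales with the number of
table levels and with the ATC miss ratio; both depend on your own
page table layout and workload.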

>	Also, has anyone noticed that the instruction cache on the
>	68030 can actually slow down execution speed? We saw this
>	during evaluation and as yet have no explanation.

Yes, this is a real phenomenon.  It happens because, during a burst
fill, the '030 suspends execution until the entire cache line is
fetched, rather than resuming when the first longword arrives (I think
the cache line size on an '030 is 16 bytes).  On a no-wait-state
machine, enabling bursts resulted in an overall performance degradation
of around 5%, as measured by a variety of benchmarks.  Consequently,
that machine now ships with burst mode disabled.  We also had a
one-wait-state machine, which showed a small performance gain with
cache bursting, so it ships with bursts enabled.  The moral is that it
depends on your implementation.
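
Why the result can go either way is easier to see with a toy model of
the stall per cache line. The Python sketch below is illustrative
only: the function name and all timings are assumptions, including
the idea that burst beats stay at one clock when the first access is
slower, which is only one possible memory design. With bursts off,
each longword that is actually used is assumed to cost one single
access.

```python
def miss_stall_clocks(used_longwords, burst, first_access=2,
                      burst_beat=1, line_longwords=4):
    """Clocks the CPU stalls to obtain the used longwords of one line.

    burst=True:  one fill of the whole line, CPU waits for all of it.
    burst=False: one single access per longword that is actually used.
    All timings are assumed values, not taken from a data sheet.
    """
    if not 1 <= used_longwords <= line_longwords:
        raise ValueError("used_longwords must be within the line")
    if burst:
        return first_access + (line_longwords - 1) * burst_beat
    return used_longwords * first_access


if __name__ == "__main__":
    # Zero wait states, only two longwords of the line used.
    print(miss_stall_clocks(2, burst=True), miss_stall_clocks(2, burst=False))
    # One wait state on the first access, three longwords used.
    print(miss_stall_clocks(3, burst=True, first_access=3),
          miss_stall_clocks(3, burst=False, first_access=3))
```

With these assumed numbers, a zero-wait machine that uses only two
longwords of each line stalls longer with bursts (5 clocks against 4),
while a one-wait machine that uses three stalls less (6 against 9).
That matches the direction of the results above, though it says
nothing about their measured size.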