[comp.realtime] Predicting Execution Times for Processors with Caches

dave@hp-lsd.COS.HP.COM (David C. Mueller) (04/07/90)

>So, the question for readers of this newsgroup is : how do you take 
>unpredictable instruction timings into account in designing your hard 
>real-time applications?
>
	One possibility is to prototype pieces of code and take
performance measurements as the pieces are put together.  As you've
stated, changes in the flow of code (preemption) do play havoc
with the contents of the caches.  If my understanding is correct,
changing the execution flow is roughly the same as starting with an
empty cache (i.e. the point where the code picks up execution is not
in the cache).  If this is true, then a worst-case execution time for
that stream of code can be modeled by running the code in the target
system with the caches flushed at the beginning of the run.

	Other subtle things may have an effect on the execution timing.
For example, changing the target system's memory access time has both
first-order and second-order effects on the overall execution time.
Obviously, a slower memory access time means a slower overall
execution time.  However, on processors such as the 68020/030, a
slower memory access time allows the execution unit portion of the
processor to overlap ALU-type operations with the ongoing bus cycle.
The 68020/030 can even perform bus cycles in a different order
(program fetches vs. operand cycles) depending on access times.  The
bottom line here... simply scaling the instruction cycle count by the
clock rate is not 100% accurate.

	Just as a recommendation... look into some emulation/analysis
tools.  They offer a quick way to make performance measurements
		IN YOUR TARGET SYSTEM,
and oftentimes have built-in statistical features that allow you to
get worst-case response times in "real life" examples.

	As always...contact your local HP representative :-)

Dave Mueller

cantrell@Alliant.COM (Paul Cantrell) (04/11/90)

In article <15630004@hp-lsd.COS.HP.COM> dave@hp-lsd.COS.HP.COM (David C. Mueller) writes:
>stated, changes in the flow of code (preempting) does play havoc
>with the contents of the caches.  If my understanding is correct,
>changing the execution flow is roughly the same as starting with an
>empty cache (i.e. where the code picks up execution is not in the
>cache).  If this is true, then a worst case execution time for that
>stream of code can be modeled by running the code in the target
>system with the caches flushed at the beginning of the run.

Depending on the design of the cache in question, assuming that a
random-state cache is the same as an empty cache is not safe, nor will
the two result in identical execution times.

One possibility which comes to mind is that a non-writethru cache which
has modified (dirty) data from some other (preempted) thread in it has to
write that modified data back to memory as new data is brought in. A
flushed cache would thus run faster since it has no dirty data to write
back to memory.

I can also believe that some prefetch algorithms would behave wildly
differently when the access pattern differs between old cached data and
a flushed cache.

In general, I'm of the belief that caches and realtime don't mix, but that
we are stuck with them because they are so cost effective with the current
generation of systems.

					Paul Cantrell

dave@hp-lsd.COS.HP.COM (David C. Mueller) (04/12/90)

>
>Even if you disable the 68020's caches, you still must worry that your
>runtime data may be misaligned in memory.  The 68020 will happily
>compensate for this, slowing down your program considerably.  In my
>opinion, you should stop thinking about the 68020 for real-time
>systems, and start thinking about highly deterministic RISC processors
>such as the R2000, or perhaps the MC88000.

	This problem can be resolved by assembler/linker tools.  The
assembler I'm familiar with has an option whereby any symbol can be
forced to a 32-bit boundary.

	I like to think of the 68020 as providing me with the option of
code space optimization and memory size independence by aligning 16-
and 8-bit values on byte boundaries.

dave@hp-lsd.COS.HP.COM (David C. Mueller) (04/12/90)

>
>Depending on the design of the cache in question, assuming that a random
>state cache is the same as an empty cache is not a safe assumption, nor
>will it result in identical execution times.
>
>One possibility which comes to mind is that a non-writethru cache which
>has modified (dirty) data from some other (preempted) thread in it has to
>write that modified data back to memory as new data is brought in. A
>flushed cache would thus run faster since it has no dirty data to write
>back to memory.

	Good example.  It is my understanding, however, that the
68020/030 does not support non-writethru data caches (the 68030's
on-chip data cache is write-through only, and the 68020 has no data
cache at all).