[comp.arch] Register usage and Policy vs. Mechanism

mcg@mipon2.intel.com (06/11/89)
In article <3427@bd.sei.cmu.edu> firth@sei.cmu.edu (Robert Firth) writes:
>In article <1RcY6x#64Zq3Y=news@anise.acc.com> lars@salt.acc.com (Lars J Poulsen) writes:
>
>>From a humble applications programmer, who occasionally has written a bit
>>of kernel code: The biggest pain with an architecture that exposes too
>>large a register file is saving and restoring on context switches.
>
>There's another pitfall here.  Some machines use the same register
>window mechanism to handle interrupts, automatically shifting to
>a new set.  If the interrupt hits you at just the wrong call depth,
>you get an automatic register spill.

Quite the contrary.  It is easy to reserve a register set solely for interrupts.
This makes actual interrupt latency fast and predictable, because the
interrupt routine never needs to save any context-dependent registers.  This
is how the 960 architecture gets interrupts that are both fast and of
predictable length.
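
To make the point concrete, here is a toy model of a register-window file,
with and without a set held back for interrupts.  It is only a sketch: the
cycle costs, the class name, and the set counts are assumptions chosen for
illustration, not 960 figures, and nested interrupts are ignored.

```python
from dataclasses import dataclass

SWITCH_COST = 1   # assumed cycles to switch to a free register set
SPILL_COST = 16   # assumed cycles to spill one register set to memory


@dataclass
class WindowFile:
    """Toy register-window file; all sizes and costs are illustrative."""
    total_sets: int
    reserved: int = 0   # sets held back for interrupt handlers
    depth: int = 1      # sets occupied by the call chain (running code holds one)

    @property
    def call_sets(self):
        # Sets available to ordinary procedure calls.
        return self.total_sets - self.reserved

    def call(self):
        """Enter a procedure; return the cycles spent getting a set."""
        if self.depth == self.call_sets:
            # All call sets are in use: the oldest is spilled and reused.
            return SWITCH_COST + SPILL_COST
        self.depth += 1
        return SWITCH_COST

    def interrupt_entry_cost(self):
        """Cycles needed to obtain a register set for an interrupt handler."""
        if self.reserved > 0:
            return SWITCH_COST                 # the dedicated set is always free
        if self.depth == self.call_sets:
            return SWITCH_COST + SPILL_COST    # hit at the "wrong" call depth
        return SWITCH_COST


def entry_costs(total_sets, reserved, max_calls):
    """Distinct interrupt-entry costs seen after 0..max_calls nested calls."""
    costs = set()
    for calls in range(max_calls + 1):
        wf = WindowFile(total_sets, reserved)
        for _ in range(calls):
            wf.call()
        costs.add(wf.interrupt_entry_cost())
    return costs
```

In this model, a file with no reserved set shows two different entry costs
depending on call depth, while a file with one reserved set shows a single
cost at every depth.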

> So about one-sixth or one-fourth
>of the time, it takes maybe three times as long overall to field the
>interrupt and maybe thirty times as long to reach the first instruction
>of the handler.  Since hard real time people care about predictability
>next only to performance, this gives loads of grief.

Note that the register architecture (of any machine, not just the 960) has no
effect on the indeterminacy of the time to reach the first instruction of an
interrupt handler.  That is strictly an instruction cache issue.  If your
interrupt handler is in a cache (on-chip or off-chip), its access latency
will be reduced.  This is not affected by the time (and possible
indeterminacy in that time) required to save processor state during an
interrupt.

As you have said, the goal of many "hard-real-time" applications is strong
determinism in the latency of certain events.  Caches (instruction, data,
and register or stack) present some problems in this arena, but they also hold
the seeds for solving those problems.  If one has a cache, then one can expose
some mechanism for controlling parts of that cache to come to the aid of the
application.  Examples are the use of an additional register set to provide
determinism (as well as speed) in interrupt register save overhead;
specification of certain instruction or data regions as non-cacheable, so
fetches for (e.g.) interrupt routines are of deterministic length;
specification of other regions as cache-locked, etc.
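
The effect of the last two controls can be sketched with a toy instruction
cache.  Everything here is an assumption made for illustration: the class
name, the fully associative LRU organization, the hit and miss costs, and the
choice to treat locking as a preload so that a locked line always hits.

```python
from collections import OrderedDict

HIT_CYCLES = 1     # assumed cost of a cache hit
MISS_CYCLES = 10   # assumed cost of a fetch from memory


class ToyICache:
    """Fully associative LRU cache with locked and non-cacheable addresses."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()   # unlocked lines, least recently used first
        self.locked = set()          # lines pinned in the cache
        self.uncached = set()        # addresses that always go to memory

    def lock(self, addr):
        """Pin addr in the cache (modelled as preloaded, so it always hits)."""
        if addr in self.locked:
            return
        if len(self.locked) >= self.capacity:
            raise ValueError("no lines left to lock")
        self.lines.pop(addr, None)
        self.locked.add(addr)
        # Locked lines take space away from the replaceable part of the cache.
        while len(self.lines) > self.capacity - len(self.locked):
            self.lines.popitem(last=False)

    def mark_uncacheable(self, addr):
        self.uncached.add(addr)

    def fetch(self, addr):
        """Return the cycles needed to fetch addr."""
        if addr in self.uncached:
            return MISS_CYCLES
        if addr in self.locked:
            return HIT_CYCLES
        if addr in self.lines:
            self.lines.move_to_end(addr)
            return HIT_CYCLES
        room = self.capacity - len(self.locked)
        if room > 0:
            if len(self.lines) >= room:
                self.lines.popitem(last=False)   # evict the LRU unlocked line
            self.lines[addr] = None
        return MISS_CYCLES


def handler_latencies(cache, handler, background, rounds=3):
    """Fetch latency of the handler after each burst of background code."""
    seen = []
    for _ in range(rounds):
        for addr in background:
            cache.fetch(addr)
        seen.append(cache.fetch(handler))
    return seen
```

With a four-line cache and three background addresses, the unmanaged handler
is slow on its first fetch and fast afterwards, the locked handler is always
fast, and the non-cacheable handler is always slow.  The last two are both
deterministic; only the locked one is also fast.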

The question that must be asked is how amenable an architecture is to the
addition of such mechanisms.  An architecture that specifies policy without
specifying mechanism is the best here, since it allows designers to change
the mechanism to suit the application without affecting the policy, and
hence code compatibility and tools.  Comp.arch readers, as well as many
silicon designers, tend to concentrate on mechanism, because it is more
concrete and easier to talk about (and quantify).  Policy is harder to talk
about.  The hard part about policy is to allow for mechanisms that you haven't
built yet, and may not even know how to build.
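
A software analogy, offered only as a sketch: the policy is an interface that
programs are written against, and each mechanism is one implementation of it.
The interface, both classes, and the cycle counts below are hypothetical and
are not a description of any 960 implementation.

```python
from abc import ABC, abstractmethod


class LocalRegisters(ABC):
    """Policy: a call gets fresh locals; a return restores the caller's."""

    @abstractmethod
    def call(self): ...

    @abstractmethod
    def ret(self): ...

    @abstractmethod
    def read(self, i): ...

    @abstractmethod
    def write(self, i, v): ...


class StackMechanism(LocalRegisters):
    """Mechanism 1: save the whole frame to memory on every call."""

    def __init__(self, nregs=4):
        self.nregs = nregs
        self.regs = [0] * nregs
        self.memory = []
        self.cycles = 0

    def call(self):
        self.memory.append(self.regs)
        self.regs = [0] * self.nregs
        self.cycles += self.nregs        # assumed: one store per register

    def ret(self):
        self.regs = self.memory.pop()
        self.cycles += self.nregs        # assumed: one load per register

    def read(self, i):
        return self.regs[i]

    def write(self, i, v):
        self.regs[i] = v


class WindowMechanism(LocalRegisters):
    """Mechanism 2: keep up to `sets` frames on chip, spill only on overflow."""

    def __init__(self, nregs=4, sets=4):
        self.nregs, self.sets = nregs, sets
        self.frames = [[0] * nregs]      # all live frames, innermost last
        self.on_chip = 1                 # innermost frames resident on chip
        self.cycles = 0

    def call(self):
        self.frames.append([0] * self.nregs)
        if self.on_chip == self.sets:
            self.cycles += self.nregs    # spill the oldest resident frame
        else:
            self.on_chip += 1
        self.cycles += 1

    def ret(self):
        self.frames.pop()
        self.on_chip -= 1
        if self.on_chip == 0:
            self.cycles += self.nregs    # refill the caller's frame
            self.on_chip = 1
        self.cycles += 1

    def read(self, i):
        return self.frames[-1][i]

    def write(self, i, v):
        self.frames[-1][i] = v


def nested_sum(machine, depth):
    """Program written against the policy only: sums depth + ... + 1."""
    machine.write(0, depth)
    if depth == 0:
        return 0
    machine.call()
    inner = nested_sum(machine, depth - 1)
    machine.ret()
    return machine.read(0) + inner
```

The same program gives the same answer on both mechanisms and differs only in
the cycle count, which is the property that lets a mechanism be replaced
without touching code or tools.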

We designed the 960 architecture as a set of modular policies, not a set of
interacting mechanisms.  This has allowed us to define (so far) 5 distinct
implementations with these modular policies.  The FPU has a set of policies,
one existing implementation (actually two: the 960 FPU is the basis for the 387
FPU), and one next-generation implementation; the MMU has two concentric sets
of policy, each with an implementation; the MMU and process control policies
allow transparent multi-processor mechanisms, one of which has been
implemented; and the CPU has one policy, and two dramatically different
implementations: one very inexpensive one and one that will be the first
superscalar single-chip processor.  Further proliferations are planned.  The
960 architecture was designed to support full out-of-order instruction
dispatch, even though we didn't know how to do that (on a single chip) at the
time.

So I guess this leaves me back on my standard soapbox: architecture vs.
implementation.  Policy vs. Mechanism is just another name for it; perhaps
it will catch more attention.

S. McGeady
Intel Corp.