[net.micro.68k] Microstate on different chip revisions in a multiprocessor

gnu@l5.uucp (John Gilmore) (09/11/85)

In article <3233@nsc.UUCP>, freund@nsc.UUCP (Bob Freund) asks:
>                                        Does Motorola guarantee
> micro-state compatibility across revision levels?  Does it guarantee
> compatibility of micro-state across implementations?
Motorola CPUs record a microcode revision level as part of the microstate.
This is the first thing checked when the state is reloaded.  If the revision
level is wrong, a "stack format trap" occurs (leaving the stacked state
alone).  If by "across implementations" you mean from e.g. the 68010 to the
68020, the microstate is not compatible.  However, every stack frame from
ANY trap or interrupt has a frame format field, and the two chips use
different frame formats for saved microstate, so reloading the wrong one
produces a trap as above.
Motorola has a patent pending on this approach, "Data Processor Version
Validation", ser # 447,600.

If a process stops with a page fault, it must be restarted on a CPU
with the same microcode version.  If a "wrong" processor tries to run
it, the hardware will say "no".  If it stops with any other kind of
trap, there is no problem.
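
To make the frame format check concrete, here is a minimal C sketch of how
a kernel might inspect the format word before moving a stopped process.
The helper names are made up for illustration.  The layout assumed here
(format in the top four bits, vector offset in the low twelve, format 0
being the short frame with no saved microstate) should be checked against
the Motorola user's manual for the part in question.  The sketch is
deliberately conservative: anything other than format 0 is treated as tied
to the CPU that produced it.

```c
#include <stdint.h>

/* Every 68010/68020 exception frame holds a format/vector word after the
 * stacked SR and PC: bits 15-12 are the frame format, bits 11-0 the
 * vector offset. */
static unsigned frame_format(uint16_t format_word)
{
    return (format_word >> 12) & 0xFu;
}

static unsigned vector_offset(uint16_t format_word)
{
    return format_word & 0x0FFFu;
}

/* Hypothetical policy helper: returns 1 if a process stopped with this
 * frame may be resumed on any CPU, 0 if it should go back to a CPU with
 * the same microcode.  Conservative assumption: only the short format-0
 * frame is taken to carry no microstate. */
static int frame_is_portable(uint16_t format_word)
{
    return frame_format(format_word) == 0x0u;
}
```

A bus fault frame (for example a format word of 0x8008 on the 68010) is
reported as not portable, while an ordinary trap frame with format 0 is.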

>						      What effect
> does this have on the types of multiprocessor systems that can be designed
> based on the part?  What is the effect on distributed systems that
> allow task migration across network.  What about paging across network?

Paging across the net is unaffected -- that's just I/O.  (Sun and
Apollo have been doing it for years.)  Migrating a process across a net
(or across a bus) has to follow the above constraints; the easiest way
is to not migrate it when it has a page fault pending.  The system
could also have each CPU determine its microcode rev and remember which
CPUs could restart each other's page faults -- or give an error at boot
time and make Field Service keep all the CPUs at the same level.
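
The bookkeeping described above could look roughly like the following C
sketch.  Everything in it is hypothetical: the cpu_info record, the
function names, and the idea that the revision is available as a small
integer (how a CPU reads its own revision is not shown).  It only
illustrates the two policies: refuse mixed revisions at boot, or allow them
and steer faulted processes back to a compatible CPU.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-CPU record filled in at boot.  The revision is treated
 * as an opaque identifier; only equality matters. */
struct cpu_info {
    uint16_t ucode_rev;
};

/* Can a page fault frame produced on CPU `from` be restarted on `to`? */
static int can_restart(const struct cpu_info *cpus, size_t from, size_t to)
{
    return cpus[from].ucode_rev == cpus[to].ucode_rev;
}

/* Boot-time check for the "make Field Service fix it" policy. */
static int all_same_rev(const struct cpu_info *cpus, size_t ncpus)
{
    size_t i;

    for (i = 1; i < ncpus; i++)
        if (!can_restart(cpus, 0, i))
            return 0;
    return 1;
}

/* Choose where to resume a process last run on `origin`.  The scheduler's
 * preferred CPU is used unless a page fault is pending and the preferred
 * CPU cannot restart it, in which case the process goes back to origin. */
static int pick_cpu(const struct cpu_info *cpus, size_t origin,
                    size_t preferred, int fault_pending)
{
    if (!fault_pending || can_restart(cpus, origin, preferred))
        return (int)preferred;
    return (int)origin;
}
```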

One trick is to turn on the Trace bit in the stacked page fault frame
when resuming the process on the original CPU.  It will complete the
instruction (unless another page fault occurs) and then take a trace
trap, which occurs between instructions and with clean state.  In Sun
Unix this is done when a signal is pending on a page-faulted process,
because it's not safe to alter the process's PC in a page fault
stack frame.
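
A minimal C sketch of the trace-bit trick follows.  It is not the Sun
code; the saved_frame structure, the kernel_trace flag, and the function
name are invented for illustration, and the structure is a kernel-side
copy rather than the packed hardware layout.  The one hardware fact relied
on is that the trace bit is bit 15 of the stacked status register.

```c
#include <stdint.h>

#define SR_TRACE 0x8000u   /* T bit of the 68010 SR (T1 on the 68020) */

/* Hypothetical kernel-side view of a stopped process's exception frame. */
struct saved_frame {
    uint16_t sr;            /* stacked status register */
    uint32_t pc;            /* stacked program counter */
    uint16_t format_word;   /* frame format in bits 15-12 */
    int      kernel_trace;  /* set when the kernel, not the user, asked for
                               the trace, so the trace trap handler knows to
                               clear the bit again */
};

/* Called when a signal is pending.  Returns 0 if the frame is the short
 * format-0 kind, meaning the caller may redirect the PC right away.
 * Otherwise turns on tracing so the faulted instruction finishes on this
 * CPU and the signal can be delivered from the trace trap, and returns 1. */
static int defer_signal_with_trace(struct saved_frame *f)
{
    unsigned format = (f->format_word >> 12) & 0xFu;

    if (format == 0x0u)
        return 0;

    if (!(f->sr & SR_TRACE)) {
        f->sr = (uint16_t)(f->sr | SR_TRACE);
        f->kernel_trace = 1;
    }
    return 1;
}
```

If the process was already being traced by the user, the bit is left as it
was and kernel_trace stays clear, so the trace trap is still reported to
whoever asked for it.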

This is a typical hardware/software complexity/performance tradeoff.
This one leans toward complexity in software and performance in hardware.