[comp.realtime] Real-Time UNIX: guaranteed context switch times vs. interrupts

jordan@winvmd.iinus1.ibm.com (02/18/91)

Novice question: I've read items in the press which state that LynxOS
guarantees a maximum user process context switch time. Steve Daukas of
Harris implies on a previous posting that CX/RT can do the same. Can
anyone explain (without giving away trade secrets) how a UNIX-based
real-time O/S can guarantee context switch time in the presence of
interrupts? Surely interrupts take priority over all user processes? And
a stream of interrupts could potentially delay a context switch
indefinitely.

Also, do these systems allow users to add new interrupt
handlers? If so, are there disclaimers regarding the context switch
time, given that the path length of a user-added interrupt handler is
unknown?

I've got no axe to grind, just curious to understand the truth...

+-----All Views Expressed Are My Own And Are Not Necessarily Shared By------+
+------------------------------My Employer----------------------------------+
Rob Jordan, IBM UK Laboratories, Hursley Park, Winchester, UK.
            jordan@winvmd.iinus1.ibm.com

steved@hrshcx.csd.harris.com (Steve Daukas) (02/19/91)

In article <9102181527.AA02102@ucbvax.Berkeley.EDU>, jordan@winvmd.iinus1.ibm.com writes:
> Novice question: I've read items in the press which state that LynxOS
> guarantees a maximum user process context switch time. Steve Daukas of
> Harris implies on a previous posting that CX/RT can do the same. Can
> anyone explain (without giving away trade secrets) how a UNIX-based
> real-time O/S can guarantee context switch time in the presence of
> interrupts? Surely interrupts take priority over all user processes? And
> a stream of interrupts could potentially delay a context switch
> indefinitely.

Before I start, let me say that Harris guarantees context switch times
for both CX/UX and CX/RT on Harris' hardware.  I'd bet that LynxOS'
guaranteed times differ depending on what hardware it's running on...
(just an observation - I've never seen LynxOS before, only heard of it from
our customers who used to use it).  Further, the number we guarantee is
based on over 30 man-years of effort rewriting the guts of the kernel
as described below...  Keep in mind it's SVID, SVVS, and POSIX 1003.x
compliant (as well as 88open BCS for the 88K systems).
 
> Also, do these systems allow users to add new interrupt
> handlers? If so, are there disclaimers regarding the context switch
> time, given that the path length of a user-added interrupt handler is
> unknown?

I think you first must do away with the notion of Unix per se.  Unix,
the AT&T or BSD standard kernels, are _not_ real-time and cannot guarantee
anything.  What you need to do in order to guarantee deterministic behavior
is: 1) allow for preemptive processing; 2) multithread the kernel; and
3) do away with the "normal" Unix scheduler (fairness) and replace it with
something that doesn't screw with priorities.

If the kernel is preemptive, then a higher priority process can get in
and do what it needs without a deadlock or deadly embrace.

If the kernel is multithreaded, then more than one process can be in the
kernel at the same time, even in the same service.  This can be accomplished
with spin-locks, major and minor number protection, etc.  One can even
add one's own device driver and, provided it's multithreaded, not violate
the determinism (e.g., spl8 doesn't do what you would expect in this kind
of kernel).

If the kernel's context switcher doesn't try to be fair based on time
quanta et al., then you don't have to worry about a process
executing at time X one iteration and at time X+delta the next iteration.
Static priorities alone or in conjunction with non-static priorities can
achieve a deterministic system.
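
To make the static-priority point concrete, here is a minimal sketch of a
strict fixed-priority scheduler simulated one tick at a time.  It is not
Harris' or anyone else's scheduler; the task names, periods, costs, and the
"larger number wins" priority convention are all assumptions chosen for
illustration.

```python
def simulate(tasks, ticks):
    """Run a strict static-priority scheduler for `ticks` time units.

    tasks maps a name to (priority, period, cost); a larger priority
    number wins (an assumption of this sketch).
    """
    remaining = {name: 0 for name in tasks}
    timeline = []
    for now in range(ticks):
        # Release a new job of each task at the start of its period.
        for name, (_prio, period, cost) in tasks.items():
            if now % period == 0:
                remaining[name] = cost
        ready = [name for name in tasks if remaining[name] > 0]
        if ready:
            # No quantum, no aging: the highest static priority always runs.
            current = max(ready, key=lambda name: tasks[name][0])
            remaining[current] -= 1
            timeline.append(current)
        else:
            timeline.append(None)  # idle tick
    return timeline


# Hypothetical workload: a high-priority control loop and a low-priority logger.
TASKS = {"control": (10, 5, 2), "logger": (1, 10, 4)}
TIMELINE = simulate(TASKS, 10)
```

In this run "control" occupies ticks 0-1 and 5-6, i.e. the same offset in
every period, no matter how much work the logger has queued.  A
fairness-based scheduler would not give that property.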

Interrupt handlers are just another process and can be handled according
to the same model (I'm leaving a lot out here).

In a multiprocessor environment, the interrupts must be vectorable to a
given CPU where the handler can then process them.  This requires not only
1 - 3 above, but a symmetric kernel as well.
The symmetric kernel requires no master-slave relationship, and therefore
an interrupt handler will use the system service(s) it needs regardless
of what other processes are in the kernel from other CPUs.  Further, even
if the ISR is in the same service as another process, both can execute on
their respective CPUs _in_parallel_ until they reach a spin-lock.  There the
higher priority process will preempt the lower...  This self-same structure
works on a single CPU as well (spin-locks protect such things as kernel data
structures...).
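
As a rough illustration of a spin-lock guarding a shared data structure,
the sketch below has several threads update one counter under a
busy-waiting lock.  It is a user-space analogy only, not kernel code:
Python exposes no test-and-set instruction, so a non-blocking try-acquire
on threading.Lock stands in for it, and the SpinLock class and
count_with_spinlock helper are hypothetical names.

```python
import threading
import time


class SpinLock:
    """Busy-waiting lock; a stand-in for a kernel spin-lock."""

    def __init__(self):
        # Assumption: a non-blocking try-acquire plays the role of an
        # atomic test-and-set on the lock word.
        self._flag = threading.Lock()

    def acquire(self):
        # Spin until the atomic try-acquire succeeds.
        while not self._flag.acquire(blocking=False):
            time.sleep(0)  # yield so the current holder can make progress

    def release(self):
        self._flag.release()


def count_with_spinlock(n_threads, n_iters):
    """Have several threads update one shared counter under the lock."""
    lock = SpinLock()
    shared = {"count": 0}

    def worker():
        for _ in range(n_iters):
            lock.acquire()
            try:
                shared["count"] += 1  # critical section: shared data
            finally:
                lock.release()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared["count"]
```

Calling count_with_spinlock(4, 1000) should return 4000, since every
increment happens inside the critical section.  The point of keeping the
locked region this small is that waiters only ever spin briefly.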

Now, if the interrupts are posted every 100 usec, for example, you need the
ability to respond to and process each.  The kernel must allow for the
response and processing in the most efficient way.  The hardware must provide
the horsepower.  The interrupt hardware must be fast (e.g., less than 4 or
5 usec) so that the CPU can start the context switch and processing.  The kernel
needs to provide the features listed above so that the interrupt can be handled
and processed as fast as possible.  Obviously, the faster the stream of
interrupts, the more horsepower is required to process them.  Let's face it, an
interrupt every 100 usec or so will overwhelm a PC or workstation unless it has
that singular purpose - even then it will be close.  Of course a poorly
designed OS will not make matters any better.
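
The budget arithmetic behind that paragraph can be sketched as follows.
Only the 100 usec period and the roughly 5 usec hardware latency come from
the text above; the context switch and handler costs are made-up numbers,
and interrupt_budget is a hypothetical helper.

```python
def interrupt_budget(period_us, hw_latency_us, switch_us, handler_us):
    """Return (busy_fraction, slack_us) for one interrupt period.

    busy_fraction is the share of each period spent servicing the
    interrupt; slack_us is what is left for everything else.
    """
    busy_us = hw_latency_us + switch_us + handler_us
    return busy_us / period_us, period_us - busy_us


# 100 usec period and 5 usec hardware latency as in the text;
# 20 usec switch and 40 usec handler are assumed values.
FRACTION, SLACK_US = interrupt_budget(100, 5, 20, 40)
```

With these assumed costs the interrupt stream alone consumes 65 of every
100 usec, leaving 35 usec of slack.  A negative slack would mean the
machine cannot keep up with the stream at all.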

When you combine the type of kernel described above with fast hardware, you
can achieve a deterministic system.  Even with the kernel by itself, you can
improve the performance of the given hardware considerably and provide
determinism at some other level (i.e., everyone's definition of "real-time"
is dependent upon their application - this ranges from msec to "hard real-time"
in the usec range).  Harris' definition of real-time is usec, to the extent that
we provide a 250 nanosecond system clock that provides the user with
1 usec granularity for use with frequency based schedulers, event based
schedulers, et al.

Hope this helps a bit.  I have not included a lot of detail, but rather
given concepts and hints as to how maximal numbers can be guaranteed.
If you have further questions, let me know.

Steve
--

Stephen C. Daukas           |           sdaukas@csd.harris.com  
Harris Corporation          |        uunet!travis!misg!sdaukas  
Computer Systems Division   |   (617) 221-1834, (617) 221-1830

"Old MacDonald had an agricultural real estate tax abatement."

edwardm@hpcuhd.HP.COM (Edward McClanahan) (02/23/91)

Rob Jordan asks:

> Novice question: I've read items in the press which state that LynxOS
> guarantees a maximum user process context switch time. Steve Daukas of
> Harris implies on a previous posting that CX/RT can do the same. Can
> anyone explain (without giving away trade secrets) how a UNIX-based
> real-time O/S can guarantee context switch time in the presence of
> interrupts? Surely interrupts take priority over all user processes? And
> a stream of interrupts could potentially delay a context switch
> indefinitely.

> Also, do these systems allow users to add new interrupt
> handlers? If so, are there disclaimers regarding the context switch
> time, given that the path length of a user-added interrupt handler is
> unknown?

> I've got no axe to grind, just curious to understand the truth...

Stephen C. Daukas responds:

> ...

> I think you first must do away with the notion of Unix per se.  Unix,
> the AT&T or BSD standard kernels, are _not_ real-time and cannot guarantee
> anything.  What you need to do in order to guarantee deterministic behavior
> is: 1) allow for preemptive processing; 2) multithread the kernel; and
> 3) do away with the "normal" Unix scheduler (fairness) and replace it with
> something that doesn't screw with priorities.

> ...

The rest of Steve's posting deals mostly with "performance" issues.  Returning
to the question, it is important to understand that no "real-time" system can
guarantee that a given task will be completed IN THE PRESENCE OF HIGHER
PRIORITY TASKS.  In any particular system, only the highest priority task
(such as responding to the highest priority interrupt) is guaranteed to
complete in a specified time.  Repeated execution of a higher priority task
can indefinitely hold off execution of a lower priority task.  The maximum
context switch times claimed by Harris, Lynx, and all others are IN THE ABSENCE
OF HIGHER PRIORITY TASKS (such as responding to an interrupt).

Where there is some question is what a particular "real-time" OS does with
tasks of the same priority.  Some systems impose a round-robin technique by
periodically interrupting a task to check if other tasks at the same priority
are ready to run.  Other systems rely on tasks giving up the CPU intentionally.

The above refers only to what the "real-time" OS can guarantee without the
cooperation of the application code and its "real-world" interface.  Suppose
we have an extremely simple system with only two priority levels 1 and 2,
priority 1 being "higher" just to confuse things.  Just to pull numbers out
of the air, suppose:

  Interrupt at pri 1 - External hardware interrupts the CPU
                     - "real-time" OS guarantees 100 ticks to respond

  Interrupt at pri 2 - Round-robin h/w or s/w context switch interrupt
                     - "real-time" OS guarantees 300 ticks to context switch

Now, suppose the application s/w and h/w could guarantee:

  Driver handling pri 1 interrupt will take only 100 ticks to complete

  H/W will only issue pri 1 interrupts at 1000 tick intervals

This last condition (frequency of higher priority interrupts) allows the
combination of OS and application to guarantee a 500 tick context switch
time: 300 ticks for the switch itself, plus 100 ticks to respond to and
100 ticks to handle a pri 1 interrupt, since at most one pri 1 interrupt
can arrive in that window and hold off completion of the context switch.
Real world systems can often characterize interrupts sufficiently to allow
absolute guarantees on such things as context switch times.  The
"real-time" OS cannot by itself.
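
The bound in this example can be checked with a small fixed-point
calculation, sketched below.  It assumes each pri 1 interrupt costs 200
ticks in total (100 to respond plus 100 for the driver) and that one
arrives at the worst possible moment; worst_case_switch_ticks and its
parameter names are hypothetical, and the iteration is the usual
response-time recurrence rather than anything a particular vendor
publishes.

```python
def worst_case_switch_ticks(switch_ticks, irq_cost_ticks, irq_period_ticks):
    """Worst-case ticks to finish a context switch under periodic interrupts.

    Returns None when the interrupts alone can saturate the CPU, in which
    case no finite bound exists.
    """
    if irq_cost_ticks >= irq_period_ticks:
        return None  # interrupts can hold off the switch indefinitely
    total = switch_ticks
    while True:
        # Interrupts that can arrive within `total` ticks (ceiling division).
        arrivals = (total + irq_period_ticks - 1) // irq_period_ticks
        updated = switch_ticks + arrivals * irq_cost_ticks
        if updated == total:
            return total  # fixed point reached: the bound is stable
        total = updated


# Numbers from the example: 300 tick switch, 100 + 100 ticks per
# pri 1 interrupt, one interrupt every 1000 ticks.
BOUND = worst_case_switch_ticks(300, 200, 1000)
```

With the figures above the result is 500 ticks.  Shortening the interrupt
interval to 250 ticks pushes the bound to 1500, and at 200 ticks the
function returns None: the handler then uses the whole CPU, which is the
"held off indefinitely" case described earlier.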

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

  Edward McClanahan
  Hewlett Packard Company     -or-     edwardm@cup.hp.com
  Mail Stop 42UN
  11000 Wolfe Road                     Phone: (480)447-5651
  Cupertino, CA  95014                 Fax:   (408)447-5039