jessea@dynasys.UUCP (Jesse W. Asher) (11/03/90)
I was wondering if any realtime operating systems also incorporated
fault-tolerant considerations. If so, how are they implemented software-wise?
How do hardware based fault-tolerance systems interact with software based
realtime? It seems to me that they go hand in hand in many cases and that
one would want to implement both in a system.

As a side note, are there any people from Modcomp reading this newsgroup?
---*---*---*---*---*---*---*---*---*---*---*---*---*---*---*---*---*---*---*---
Jesse W. Asher Phone: (901)382-1609
6196-1 Macon Rd., Suite 200, Memphis, TN 38134
UUCP: {fedeva,chromc,rutgers}!dynasys!jessea
-> GIVE: Support the helpless victims of computer error.
dwells@fits.cx.nrao.edu (Don Wells) (11/05/90)
In article <720@dynasys.UUCP> jessea@dynasys.UUCP (Jesse W. Asher) writes:
From: jessea@dynasys.UUCP (Jesse W. Asher)
Newsgroups: comp.realtime
Date: 2 Nov 90 16:35:30 GMT
I was wondering if any realtime operating systems also incorporated
fault-tolerant considerations. If so, how are they implemented software-wise?
How do hardware based fault-tolerance systems interact with software based
realtime? It seems to me that they go hand in hand in many cases and that
one would want to implement both in a system.
...
I consider a hardware-based watchdog timer with automatic reboot to be
a form of fault-tolerance. In many cases a software-based watchdog
timer is almost as effective for this purpose.
--
Donald C. Wells, Assoc. Scientist | dwells@nrao.edu
Nat. Radio Astronomy Observatory | 6654::DWELLS
Edgemont Road | +1-804-296-0277 38:02.2N
Charlottesville, VA 22903-2475 USA | +1-804-296-0278(Fax) 78:31.1W
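A software watchdog of the kind Don Wells mentions might look like the
following minimal Python sketch. The class name SoftwareWatchdog, its
kick() and stop() methods, and the on_timeout callback are hypothetical
names chosen for illustration. In a real system the timeout action would
more likely be a restart or reboot than a Python callback, and a watchdog
running inside the monitored process cannot catch a hang of the whole
machine, which is where the hardware timer comes in.

```python
import threading


class SoftwareWatchdog:
    """Calls on_timeout() if kick() is not called again within `timeout` seconds."""

    def __init__(self, timeout, on_timeout):
        self.timeout = timeout          # seconds allowed between kicks
        self.on_timeout = on_timeout    # recovery action (hypothetical callback)
        self._timer = None
        self._lock = threading.Lock()

    def kick(self):
        """Restart the countdown; the monitored task calls this periodically."""
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()
            self._timer = threading.Timer(self.timeout, self.on_timeout)
            self._timer.daemon = True   # do not keep the process alive
            self._timer.start()

    def stop(self):
        """Disarm the watchdog, for example during an orderly shutdown."""
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()
                self._timer = None
```

The monitored task would call kick() once per cycle of its main loop; if
the loop stalls for longer than the timeout, the recovery action runs.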
valentin@cbmvax.commodore.com (Valentin Pepelea) (11/05/90)
In article <720@dynasys.UUCP> jessea@dynasys.UUCP (Jesse W. Asher) writes:
>
> I was wondering if any realtime operating systems also incorporated
> fault-tolerant considerations. If so, how are they implemented
> software-wise?

Realtime operating systems typically provide task exception handlers:
task-specific functions that are called when an error occurs during a
particular task's time slice. The task's handler then decides on the
appropriate action to take and corrects the fault in the same context (at
the same priority) in which the fault occurred. Of course, some exceptions,
such as those generated by power faults, might require the same corrective
action no matter which task's context they occur in.

> How do hardware based fault-tolerance systems interact with software based
> realtime? It seems to me that they go hand in hand in many cases and that
> one would want to implement both in a system.

The typical hardware-based fault-tolerant system is one that uses several
units and decides upon an action depending on what the majority of the
units vote to do. Democracy at work.

The typical combined hardware/software system is one where special
circuitry detects a hardware fault and initiates a software recovery
routine. Although even in the case above some software might be necessary
to control the selection of a faulty unit, the salient point here is that
software is the backbone of the fault-tolerant system, and software will be
used to recover from or circumvent the fault.

I'll let somebody more experienced give us some examples and juicy
anecdotes.

Valentin
--
The Goddess of democracy? "The tyrants   Name:    Valentin Pepelea
may destroy a statue, but they cannot    Phone:   (215) 431-9327
kill a god."                             UseNet:  cbmvax!valentin@uunet.uu.net
- Ancient Chinese Proverb                Claimer: I not Commodore spokesman be
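The majority voting that Valentin describes for redundant units can be
sketched in a few lines of Python. This is only an illustration under
stated assumptions: the function name majority_vote is hypothetical, the
unit outputs are assumed to be hashable values that can be compared for
exact equality, and a real voter would usually sit in hardware or in a
small, carefully verified piece of software.

```python
from collections import Counter


def majority_vote(outputs):
    """Return (agreed_value, indices_of_dissenting_units).

    Raises RuntimeError when no value is produced by a strict majority
    of the units, since then no single answer can be trusted.
    """
    if not outputs:
        raise ValueError("no unit outputs to vote on")
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 <= len(outputs):
        raise RuntimeError("no majority among redundant units")
    # Units that disagree with the majority are candidates for being
    # marked faulty and taken out of service by the recovery software.
    dissenting = [i for i, out in enumerate(outputs) if out != value]
    return value, dissenting
```

With three units, majority_vote([42, 42, 17]) yields the agreed value 42
and reports unit 2 as the dissenter, which is the point where software
would step in to select and isolate the faulty unit.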
alex@vmars.tuwien.ac.at (Alexander Vrchoticky) (11/06/90)
jessea@dynasys.UUCP (Jesse W. Asher) writes:
>I was wondering if any realtime operating systems also incorporated
>fault-tolerant considerations. If so, how are they implemented software-wise?
>How do hardware based fault-tolerance systems interact with software based
>realtime? It seems to me that they go hand in hand in many cases and that
>one would want to implement both in a system.

What is software-based real-time ? :-)
--
Alexander Vrchoticky, Tech Univ Vienna, CS, Dept for Real-Time Systems
Voice: +43/222/58801-8168 Fax: +43/222/569149
Internet: alex@vmars.tuwien.ac.at Path: vmars!alex@relay.eu.net
steved@hrshcx.csd.harris.com (Steve Daukas) (11/06/90)
In article <720@dynasys.UUCP>, jessea@dynasys.UUCP (Jesse W. Asher) writes:
> I was wondering if any realtime operating systems also incorporated
> fault-tolerant considerations. If so, how are they implemented software-wise?
> How do hardware based fault-tolerance systems interact with software based
> realtime? It seems to me that they go hand in hand in many cases and that
> one would want to implement both in a system.

What do *you* mean by fault tolerance? I've seen requirements where a
company would have to pay fines on the order of $20,000 a minute for
down-time. In this case, you need a duplicate or triplicate system with
shared disks, etc., so that when something fails in one system, you switch
to the other. One might instead define fault tolerance as the ability to be
self-correcting, in the sense of exception handling...

In any case, the operating system usually doesn't have direct control over
the fault tolerance. It's usually a matter of using the proper hooks in the
OS to provide whatever capabilities make sense for the given application.

I guess the next question is: what do *you* mean by real-time? Are we
talking milliseconds or microseconds for response times?

Steve
--
Stephen C. Daukas         | sdaukas@csd.harris.com
Harris Corporation        | uunet!hcx1!misg!sdaukas
Computer Systems Division | (617) 221-1834, (617) 221-1830
"Old MacDonald had an agricultural real estate tax abatement."
varvel@cs.utexas.edu (Donald A. Varvel) (11/06/90)
One approach to getting realtime and fault-tolerance, within certain
assumptions, is self-stabilization. The usual assumption for the
investigation of self-stabilizing programs is that the program will not be
degraded, but that any or all data may be. This assumes that ROM can be
made not to degrade, whereas RAM cannot. Environments like the space
telescope come to mind.

A self-stabilizing system has a subset of states, usually defined as those
reachable from a predefined starting state, which are acceptable. The
program in any unacceptable state must eventually reach an acceptable
state.

To make any claim to being "real-time", a self-stabilizing program must
reach an acceptable state within defined time bounds. I have a hard time
visualizing such a system being able to guarantee "hard" constraints, but
there may be some definition of "real time" that would be satisfied.
--
Don Varvel (varvel@cs.utexas.edu)
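As a concrete illustration of the idea Don Varvel describes, here is a
small Python simulation of Dijkstra's K-state token ring, a classic
self-stabilizing algorithm. It is not taken from the post; it is swapped in
as a well-known example. The function names, the random scheduler, and the
max_steps limit are assumptions made for the sketch, and max_steps is only
a stand-in for a "defined time bound", not a proven worst case.

```python
import random


def privileged(x):
    """Indices of machines that currently hold a privilege (a token)."""
    return [i for i in range(len(x))
            if (x[i] == x[-1] if i == 0 else x[i] != x[i - 1])]


def step(x, k, i):
    """Let privileged machine i make its move (states are 0..k-1)."""
    if i == 0:
        x[0] = (x[0] + 1) % k      # machine 0 advances its own counter
    else:
        x[i] = x[i - 1]            # others copy their left neighbour


def stabilize(x, k, rng, max_steps):
    """Run from an arbitrary (possibly corrupted) state until exactly one
    machine is privileged; return the number of moves that took.

    Assumes k >= len(x). Raises RuntimeError if the assumed step bound
    max_steps is exceeded.
    """
    steps = 0
    while len(privileged(x)) != 1:
        if steps == max_steps:
            raise RuntimeError("did not stabilize within the step bound")
        step(x, k, rng.choice(privileged(x)))
        steps += 1
    return steps
```

Here the acceptable states are those with exactly one privileged machine.
Starting from any corrupted state vector, for example
stabilize([3, 0, 4, 4, 1], 5, random.Random(0), 1000), the ring returns to
an acceptable state, and the step count is the quantity a real-time claim
would have to bound.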
srp@modcomp.uucp (Steve Pietrowicz) (11/06/90)
In <720@dynasys.UUCP> jessea@dynasys.UUCP (Jesse W. Asher) writes:
] As a side note, are there any people from Modcomp reading this newsgroup?
You bet!
--
--------------
SR Pietrowicz UUCP: ...!uunet!modcomp!srp CIS: 73047,2313