strom@andreadoria.watson.ibm.com (Rob Strom) (09/13/90)
In article <45212@cornell.UUCP> ken@gvax.cs.cornell.edu (Ken Birman) writes:
>In article <1990Aug30.164153.25008@swbatl.sbc.com> cbs@swbatl.sbc.com (Brad Slaten 529-7636) writes:
>>
>> Our current application is using an object-oriented implementation
>>under the ISIS lightweight tasking library and Unix.  We have provided an
>>interface to the Inter-process Communication (IPC) mechanism such that
>>the application does not know when an object with which it is
>>communicating requires IPC.  The problem which occurs is that in
>>making the IPC transparent, the lightweight task context switch is
>>also transparent.  The problem results from the fact that certain
>>objects are shared among multiple tasks.  When a context switch occurs
>>from one task to another, any or all shared objects may be modified by
>>a different task without the initial task having knowledge of this.
>
>Because of the wide posting this received and the mention of ISIS, I
>want to comment on how our group sees this issue.
>
> [summary of problems of transaction systems, and of ISIS's approach]
...
>
>I should probably point to other non-transactional work that uses this
>approach ("rollback if in trouble").  The best known is Jefferson's virtual
>time system ("Time Warp OS") and the Strom/Yemini system reported in
>TOCS a few years back.  Neither approach really took off except in certain
>specialized languages and applications, notably simulation.  Willy Zwaenepoel
>(Rice) has recently done a version of this in his work on "Sender based
>message logging", which is cheap, but doesn't support rollback and hence
>isn't an appropriate tool for solving this particular problem (he is more
>interested in a cheap transparent fault-tolerance mechanism, and anyhow,
>he assumes a deterministic, non-threaded execution).

Since Ken has referred to our work, I'd like to clarify our approach and
the status of our experimental prototypes.

Our group (Distributed Systems Software Technology) at IBM Research is
studying techniques for simplifying the implementation of distributed
systems of the future.  We use a two-pronged approach: (1) a simpler
high-level language model which hides implementation complexity from the
programmer, and (2) program transformations and other transparent
techniques for producing an efficient implementation of our high-level
model.

(1) Our high-level model is based upon processes communicating by making
calls and sends over ports.  Our processes correspond to modules of
conventional programming languages and to objects of object-oriented
languages.  Each process has a sequential thread of control and private
data.  Ports are first-class data objects which can be passed in
messages; this enables a capability-based approach to access control and
dynamic configuration.  Our high-level model hides from the programmer
the details of data representation, communication, failure, and recovery.

We have developed a secure high-level programming language, Hermes, which
exploits this model.  In contrast to approaches based on shared abstract
data types or shared objects, Hermes processes have no internal
concurrency, although such concurrency can be introduced by program
transformations.  Multiple calls to a single port are queued until the
process dequeues them.  Therefore, Brad Slaten's problem would not exist
in Hermes.
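
To make the queuing discipline concrete, here is a schematic analogue in C
(this is not Hermes syntax, and every name in it is invented for
illustration): a process owns a port and dequeues one call at a time, so no
other task can ever modify its private data in the middle of an operation.

#include <stdio.h>

typedef struct { int opcode; int arg; } message;   /* 0 = add, 1 = read */

typedef struct {                  /* a port: a FIFO queue of messages   */
    message q[16];
    int head, tail;
} port;

static void    port_send(port *p, message m) { p->q[p->tail++ % 16] = m; }
static int     port_empty(port *p)           { return p->head == p->tail; }
static message port_receive(port *p)         { return p->q[p->head++ % 16]; }

typedef struct {                  /* a "process": one port plus private data */
    port in;
    long count;                   /* private; no other code touches it       */
} counter_process;

static void counter_step(counter_process *self)
{
    message m = port_receive(&self->in);      /* dequeue exactly one call */
    if (m.opcode == 0)
        self->count += m.arg;                 /* "send": update and go on */
    else
        printf("count = %ld\n", self->count); /* "call": report the value */
    /* The next queued call is not examined until this one has finished,
     * so no other request ever sees the private data mid-update.         */
}

int main(void)
{
    static counter_process c;                 /* zero-initialized         */

    /* Several clients enqueue calls; the port serializes them.           */
    port_send(&c.in, (message){0, 5});
    port_send(&c.in, (message){0, 7});
    port_send(&c.in, (message){1, 0});

    while (!port_empty(&c.in))
        counter_step(&c);                     /* prints "count = 12"      */
    return 0;
}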
By exploiting compile-time checking, Hermes can safely execute large
numbers of ``processes'' in a single physical address space, obtaining
process-switching costs essentially comparable to the procedure-calling
costs of conventional languages.  (In the implementation of NIL, a
predecessor of Hermes, the overhead of a cross-process call was 9
instructions on an IBM 370.)

(2) We believe that the high-level approach described above is an ideal
model for the programmer, provided that it can be implemented
efficiently.  The second aspect of our research addresses transparent
program optimizations which allow the implementer to improve the
efficiency of a distributed implementation without having to rewrite the
high-level code.  We have explored a number of such techniques based upon
the principle of *optimism* --- that is, guessing that some good thing
has happened or will happen, in order to gain speed when we guess
correctly, at the expense of having to roll back computations when we
guess wrong.  We have done some exploratory work on optimistic techniques
for process replication and for parallelizing logically sequential
processes.  Our most mature work in this direction is on Optimistic
Recovery.  Optimistic Recovery (OR) recovers a recent consistent state of
a distributed system after a failure of one or more system components.

>Recently, I understand
>that Rob Strom's group has done something similar under Mach, but again
>the focus is no longer on arbitrary rollback but rather is on roll-forward
>for fault-tolerance with determinism assumptions.  Multithreaded OO systems
>are usually not deterministic because of the thread scheduling issue.
>
>To summarize, there seems to be no simple way out of this problem.
>
>...
>
> Ken Birman
> ken@cs.cornell.edu or isis@cs.cornell.edu
> 607-255-9199

The OR model is very general: any system describable as a collection of
``recovery units'' which have local state and which communicate by
exchanging messages can be recovered using this technique.  Each recovery
unit is independently made recoverable by logging input messages.  The
optimism arises from the fact that we do not synchronize logging with
computation or communication; a recovery unit is allowed to receive
messages from a sender recovery unit whose current state is not yet
recoverable.  The optimism results in lower overhead and faster commit
times during normal (failure-free) execution; in exchange, recovery after
a failure may be slower.

The original Strom/Yemini OR algorithm has been refined.  Suggested
improvements have been published by Johnson and Zwaenepoel of Rice; by
Shaula Yemini, David Bacon, and myself at IBM; and by others.

Our rollback technique is completely general, and can be exploited for
other kinds of optimistic algorithms besides OR.  It is true that each
recovery unit is required to be ``piecewise deterministic''.  However,
this property holds in many realistic environments (e.g. coroutine-based
C-threads) and can be made to hold in ``disciplined'' multithreaded
environments in which threads access shared storage under a
concurrent-reader exclusive-writer discipline enforced by lock
primitives.

A Hermes implementation for the RT, SUN, and RISC/6000 is available free
of charge to interested researchers; a manual and tutorial have also been
written and will be published at the end of the year.  A prototype of a
transparently recoverable Mach based on OR is being built.
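
To give a flavor of input-message logging, the fragment below is a
deliberately simplified C illustration, not our algorithm or library code:
it shows a single recovery unit whose logging is decoupled from its
computation, and it omits the tracking of dependencies between recovery
units and the coordinated rollback of states that depend on lost messages.

#include <stdio.h>
#include <string.h>

typedef struct { int opcode; int arg; } message;   /* 0 = add, 1 = multiply */

typedef struct {
    long    state;                 /* the recovery unit's local state         */
    message log[64];               /* volatile log of input messages          */
    int     logged;                /* messages appended to the volatile log   */
    int     stable;                /* prefix already forced to stable storage */
} recovery_unit;

/* Deterministic step: the same input sequence always reproduces the same
 * state -- this is the "piecewise determinism" requirement.               */
static void deliver(recovery_unit *ru, message m)
{
    ru->log[ru->logged++] = m;                    /* append to volatile log  */
    if (m.opcode == 0) ru->state += m.arg;        /* ... and compute at once */
    else               ru->state *= m.arg;
    /* Note: we did NOT wait for the log to reach stable storage.           */
}

/* Called lazily and asynchronously; commits a prefix of the log.
 * (A stand-in for a real stable-storage write.)                            */
static void flush_log(recovery_unit *ru)
{
    ru->stable = ru->logged;
}

/* After a crash, replay the stably logged prefix.  Messages delivered but
 * not yet logged are lost; recovery units that already observed their
 * effects would have to be rolled back (not shown).                        */
static void recover(recovery_unit *ru)
{
    message replay[64];
    int n = ru->stable;            /* in reality, read back from stable store */
    memcpy(replay, ru->log, sizeof(message) * (size_t)n);

    ru->state  = 0;
    ru->logged = 0;
    for (int i = 0; i < n; i++)
        deliver(ru, replay[i]);    /* determinism => same state as before    */
}

int main(void)
{
    static recovery_unit ru;

    deliver(&ru, (message){0, 5});
    deliver(&ru, (message){0, 2});
    flush_log(&ru);                /* the state reached so far is recoverable */
    deliver(&ru, (message){1, 3}); /* processed optimistically, never logged  */

    recover(&ru);                  /* "crash": the last message is lost       */
    printf("recovered state = %ld\n", ru.state);           /* prints 7        */
    return 0;
}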
Under the Mach OR prototype, distributed Mach applications can be made
recoverable by recompiling and relinking them with our OR library.  We
hope to make this system available sometime in the future; currently a
``demo'' prototype supporting a subset of this functionality is working.

Interested persons are welcome to contact me by phone or email, or by
directing followups to comp.lang.misc (Hermes) or comp.os.mach (Mach OR
prototype).
--
Rob Strom, strom@ibm.com, (914) 784-7641
IBM Research, 30 Saw Mill River Road, P.O. Box 704, Yorktown Heights, NY 10598