aleta@cs.cornell.edu (Aleta Ricciardi) (02/09/91)
Using Process Groups to Implement
Failure Detection in Asynchronous Environments
Aleta Ricciardi
Ken Birman
Cornell University
Department of Computer Science
Ithaca, NY 14853 USA
TR 91-1188
Research supported in part by DARPA/NASA Ames Grant NAG 2-593, and in part
by grants from IBM and Siemens Corp.
ABSTRACT:
Agreement on the membership of a group of processes in a distributed system
is a basic problem that arises in a wide range of applications. Such
groups occur when a set of processes co-operate to perform some task, share
memory, monitor one another, subdivide a computation, and so forth. In
this paper we discuss the Group Membership Problem as it relates to failure
detection in asynchronous, distributed systems. We present a rigorous,
formal specification for group membership under this interpretation. We
then present a solution for this problem that improves upon previous work.
Keywords: Asynchronous computation; Fault detection; Fault tolerance;
Distributed consensus; Membership list management.
===========
The tech report is available by anonymous ftp from cu-arpa.cs.cornell.edu