mwm@VIOLET.BERKELEY.EDU (Mike Meyer, Take a giant step oustide your min) (12/24/86)
The following is the abstract for a recent paper on replicated file systems. It will be presented at the Sixth Symposium on Reliability in Distributed Software and Database Systems, on March 17, 1987, in Williamsburg, Virginia. A preliminary version is available now in technical report form. If you are interested in obtaining a copy, address inquiries to:

    Technical Report Librarian
    Department of Electrical Engineering & Computer Science, C-014
    University of California, San Diego
    La Jolla, CA 92093

The cost is $2.00.

DL

Darrell Long
Department of Electrical Engineering and Computer Science, C-014
University of California, San Diego
La Jolla, California 92093

ARPA: Darrell@Beowulf.UCSD.EDU
UUCP: sdcsvax!beowulf!darrell

-----------------------------------------------------------------

On Improving the Availability of Replicated Files

Darrell D. E. Long
Jehan-Francois Paris

Computer Systems Research Group
Department of Electrical Engineering and Computer Science
University of California, San Diego
La Jolla, California 92093

Technical Report CS-089

ABSTRACT

To improve the availability and reliability of files, the data are often replicated at several sites. A scheme must then be chosen to maintain the consistency of the file contents in the presence of site failures. The most commonly used scheme is voting. Voting is popular because it is simple and robust: voting schemes do not depend on any sophisticated message passing scheme and are unaffected by network partitions. When network partitions cannot occur, better availabilities and reliabilities can be achieved with the available copy scheme. This scheme is somewhat more complex than voting, as the recovery algorithm invoked after a failure of all sites has to know which site failed last. We present in this paper a new method aimed at finding this site.
It consists of recording those sites which received the most recent update; this information can then be used to determine which site holds the most recent version of the file upon site recovery. Our approach does not require any monitoring of site failures and so has a much lower overhead than other methods. We also derive, under standard Markovian assumptions, closed-form expressions for the availability of replicated files managed by voting, available copy, and a naive scheme that does not keep track of the last copy to fail.
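
The idea of recording which sites received the most recent update can be illustrated with a small sketch. This is not the algorithm from the technical report, only one plausible reading of the abstract. The names Replica, write, and most_recent_site are hypothetical, and the sketch assumes that every update reaches all sites available at the time, that each recipient stores the list of recipients alongside the data, and that no network partitions occur.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class Replica:
    """State kept at one site (hypothetical layout, not taken from the report)."""
    version: int                # number of the last update this site applied
    recipients: FrozenSet[str]  # sites recorded as having received that update


def write(replicas: Dict[str, Replica], available: Iterable[str]) -> None:
    """Apply one update at every available site and record who received it."""
    sites = frozenset(available)
    new_version = max(replicas[s].version for s in sites) + 1
    stamp = Replica(new_version, sites)
    for s in sites:
        replicas[s] = stamp


def most_recent_site(recovered: Dict[str, Replica]) -> Optional[str]:
    """After a failure of all sites, pick a site known to hold the newest copy.

    Take the highest version among the recovered sites. If every site that
    received that update has recovered too, no newer update can be hiding on
    a site that is still down, so the copy is safe to use. Otherwise return
    None, meaning recovery must wait for more sites to come back.
    """
    if not recovered:
        return None
    best = max(sorted(recovered), key=lambda s: recovered[s].version)
    if recovered[best].recipients.issubset(recovered):
        return best
    return None


if __name__ == "__main__":
    # Three sites; C fails first, then B, then A (the last site to fail).
    state = {s: Replica(0, frozenset("ABC")) for s in "ABC"}
    write(state, "ABC")
    write(state, "AB")
    write(state, "A")
    print(most_recent_site({s: state[s] for s in "BC"}))  # must wait for A
    print(most_recent_site({"A": state["A"]}))            # A alone suffices
```

In this sketch nothing has to watch for failures: the only bookkeeping is the recipient list written with each update, which is in the spirit of the low overhead claimed in the abstract. How the report handles sites that rejoin between updates is not stated in the abstract and is not modeled here.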