[mod.os] [darrell%beowulf@sdcsvax.ucsd.edu

mwm@VIOLET.BERKELEY.EDU (Mike Meyer, Take a giant step oustide your min) (12/24/86)
The following is the abstract for a recent  paper  on  replicated
file systems.  It will be presented at the Sixth Symposium on Re-
liability in Distributed Software and Database Systems, on  March
17,  1987  in  Williamsburg,  Virginia.  A preliminary version is
available now in technical report form.

If you are interested in obtaining a copy, address inquiries to:

Technical Report Librarian
Department of Electrical Engineering & Computer Science, C-014
University of California, San Diego
La Jolla, CA  92093

The cost is $2.00.

DL
Darrell Long
Department of Electrical Engineering and Computer Science, C-014
University of California, San Diego
La Jolla, California  92093

ARPA: Darrell@Beowulf.UCSD.EDU
UUCP: sdcsvax!beowulf!darrell

------------------------------------------------------------------

     On Improving the Availability of Replicated Files


                    Darrell D. E. Long
                   Jehan-Francois Paris

              Computer Systems Research Group
 Department of Electrical Engineering and Computer Science
            University of California, San Diego
                La Jolla, California  92093


		   Technical Report CS-089

                          ABSTRACT


     To improve the  availability  and  reliability  of
     files,  the  data are often replicated at several
     sites. A scheme must then be  chosen  to  maintain
     the  consistency of the file contents in the pres-
     ence of site  failures.  The  most  commonly  used
     scheme is voting.  Voting is popular because it is
     simple and robust: voting schemes do not depend on
     any  sophisticated  message passing scheme and are
     unaffected by network partitions.

          When network partitions cannot occur,  better
     availabilities  and  reliabilities can be achieved
     with the available copy scheme.   This  scheme  is
     somewhat  more complex than voting as the recovery
     algorithm invoked after a failure of all sites has
     to  know  which  site  failed last.  We present in
     this paper a new  method  aimed  at  finding  this
     site.   It consists of recording those sites which
     received the most recent update; this  information
     can then be used to determine which site holds the
     most  recent  version  of  the  file   upon   site
     recovery.  Our approach does not require any moni-
     toring of site failures and so has  a  much  lower
     overhead than other methods.
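
     The recording idea above can be illustrated with  a  small
     sketch.   It  is  only  one plausible reading of the scheme,
     not the algorithm from the report: the function latest_copy,
     the  record layout (version, recipients) and the site names
     are all hypothetical.  Each site is assumed to keep the ver-
     sion  number  of the last update it saw and the set of sites
     that received that update.  A recovering site  follows  these
     sets  transitively  and  waits  until every site it reaches
     has recovered.

```python
def latest_copy(records, recovered, start):
    """Return the site holding the newest copy, or None if we must wait.

    records   -- hypothetical layout: site -> (version, recipients), where
                 recipients is the set of sites that received the last
                 update this site saw
    recovered -- set of sites that are back up
    start     -- the recovered site running the check
    """
    closure = set()
    frontier = [start]
    while frontier:
        site = frontier.pop()
        if site in closure:
            continue
        if site not in recovered:
            # A site that may hold a newer copy is still down: wait for it.
            return None
        closure.add(site)
        _, recipients = records[site]
        frontier.extend(recipients)
    # Every site that could hold a newer copy is up; pick the highest version.
    return max(closure, key=lambda s: records[s][0])


# Example: C failed first, then B, then A (after one more update at A alone).
records = {
    "A": (3, {"A"}),
    "B": (2, {"A", "B"}),
    "C": (1, {"A", "B", "C"}),
}
```

     In this example site A can resume service  by  itself,  since
     its  record  names  no  other  site, whereas B or C recovering
     first would have to wait for A.  No failure  monitoring  is
     involved; only the recipient sets written at update time.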

          We  also  derive,  under  standard  Markovian
     assumptions, closed-form expressions for the avail-
     ability of replicated files  managed  by  voting,
     available  copy  and  a naive scheme that does not
     keep track of the last copy to fail.
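
     The closed-form expressions themselves are not given in this
     abstract.   As  a rough illustration of the kind of quantity
     involved, the sketch below computes the  availability  of  a
     file  under  simple majority voting.  It assumes independent
     sites with exponential failure rate and repair  rate  (so  a
     single  site is up with probability repair / (failure + re-
     pair)), equal vote weights, and  a  strict  majority  quorum.
     These  assumptions  and  the function names are ours and may
     differ from the model used in the report.

```python
from math import comb


def site_availability(failure_rate, repair_rate):
    """Steady-state probability that one site is up (two-state Markov model)."""
    return repair_rate / (failure_rate + repair_rate)


def majority_voting_availability(n, p):
    """Probability that a strict majority of n independent copies is up.

    n -- number of copies, each holding one vote (assumption)
    p -- availability of a single site
    """
    quorum = n // 2 + 1
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k)
               for k in range(quorum, n + 1))


p = site_availability(failure_rate=1.0, repair_rate=9.0)
three_copies = majority_voting_availability(3, p)
```

     Under these assumptions, three copies on sites that are each
     up 90% of the time give a voting availability of about 0.972.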