[mod.ai] Seminar - Knowledge and Action in the Presence of Faults

PATASHNIK@SU-SUSHI.ARPA (Oren Patashnik) (03/06/86)

AFLB, 13-Mar-86  :  Yoram Moses (MIT)
12:30 pm in MJ352 (Bldg. 460)

	Knowledge, Common Knowledge, and Simultaneous Actions 
		     in the Presence of Faults

We show that any protocol that guarantees to perform a particular
action simultaneously at all sites of a distributed system must
guarantee that the sites attain common knowledge of particular facts
when such an action is performed. We analyze what facts become common
knowledge at various points in the execution of protocols in a simple
model of a system in which processors are liable to crash.  We obtain
a new protocol for Simultaneous Byzantine Agreement that is optimal in
all of its runs. That is, rather than achieving the worst case
behavior, every run of the protocol halts at the earliest possible
time, given the pattern in which failures occur. This may happen as
early as after two rounds. We characterize precisely what failure
patterns require the protocol to run for k rounds, 1<k<t+2,
generalizing and simplifying the lower bound proof for Byzantine
agreement.  We also show a non-trivial simultaneous action for which
popular belief would suggest that t+1 rounds would be required in the
worst case, and use our analysis to design a protocol for it that
always halts in two rounds. This work sheds considerable light on many
heretofore mysterious aspects of the Byzantine Agreement problem. It
is one of the first examples of how reasoning about knowledge can be
used to obtain improved solutions to problems in distributed computing.

This is joint work with Cynthia Dwork of IBM Almaden.