ken@gvax.cs.cornell.edu (Ken Birman) (03/15/91)
> From: jyl@Eng.Sun.COM (Jacob Levy)
> Message-Id: <9103131904.AA01088@noam.Eng.Sun.COM>
> To: ken@cs.cornell.edu
> Subject: Interesting Paper
> Status: R
>
> Barbara Liskov's paper about Lazy Replication in the 1990 PODC
> proceedings seems to contain protocols which are very similar
> to ISIS. They compare their approach to ISIS and seem to claim
> that theirs is more efficient for both the normal case (no
> failures) and for failure detection, and also that it is more
> robust under network partition. Can you post a comparison to
> comp.sys.isis?

A comparison between our work and the lazy replication protocols appears in the most recent TR on our version ("Lightweight Causal and Atomic Group Multicast"). Basically, their paper compared the Lazy Replication scheme with the old ISIS protocols. Our new protocols are at least as efficient as Lazy Replication in all of these respects, and actually include some performance enhancements that the LR scheme didn't have (at the time they wrote the PODC paper, at any rate). For example, our new paper includes faster group view management protocols and a VT compression scheme that appear to be big wins.

As it happens, the measured latency of our OLD protocols was comparable to the LR figures in the PODC paper (our cbcast protocol, used asynchronously, runs at the same speed as the protocol they evaluate in section 3.6.1 of their paper). Our recent paper reports the performance of the NEW cbcast protocol, which runs about five times faster than either our old protocols or the LR scheme.

Regarding the robustness issue, LR uses a stable storage/logging scheme, while ISIS only uses logging if you explicitly enable it. When you do use logging in ISIS (or long-haul spooling), you can get the same level of availability out of our system as you can from theirs. We prefer not to log in the normal case because we want to keep the overhead of the average cbcast as low as possible -- this is part of the reason our stuff is faster.
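For readers unfamiliar with the vector-timestamp (VT) machinery mentioned above: both cbcast and the LR scheme rest on the standard VT causal delivery rule. The sketch below is only an illustration of that rule, not the ISIS (or LR) implementation; all names in it are invented.

```python
# Sketch of the standard vector-timestamp causal delivery test.
# A message is deliverable when it is the next message expected from
# its sender AND every message it causally depends on (from other
# senders) has already been delivered locally.

def can_deliver(msg_vt, sender, local_vt):
    """msg_vt: VT stamped on the message; sender: index of the sending
    process; local_vt: count of messages delivered so far per sender."""
    if msg_vt[sender] != local_vt[sender] + 1:
        return False  # not the next message from this sender
    return all(msg_vt[k] <= local_vt[k]
               for k in range(len(msg_vt)) if k != sender)

# Two-process example; we have delivered nothing yet.
local = [0, 0]
first = [1, 0]   # first message from process 0, no dependencies
early = [1, 1]   # message from process 1 that causally follows `first`

print(can_deliver(early, 1, local))  # False: must wait for process 0
print(can_deliver(first, 0, local))  # True
```

Delaying `early` until `first` arrives is exactly what enforces causal order without any extra round trips, which is why both schemes can ship data asynchronously.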
In more general terms, it is worth noting that the basic LR protocol and the ISIS "lightweight causal cbcast" protocol are really very similar in the single-group case. The main differences, unless you focus on detail, are that we support large numbers of overlapped groups, and that our handling of client/server structures is somewhat more general than theirs. We do this because we need both features in ISIS; one implication is that our scheme scales well, even to very large networks. Their scheme could be extended the same way, of course -- perhaps even using our methods. I can provide more details on the fine points if people request them.

Ken Birman
----------------------------------
Kenneth P. Birman                            E-mail: ken@cs.cornell.edu
4105 Upson Hall, Dept. of Computer Science   TEL:    607 255-9199 (office)
liuba@proton.lcs.mit.edu (Liuba Shrira) (03/26/91)
Ken Birman compares the performance of ISIS with that of the Lazy Replication scheme developed by Ladin, Liskov and Shrira at MIT. I would like to make a couple of points about Ken's note.

First, the Lazy Replication paper that appeared in PODC 90 does not contain any performance data. The data appear in MIT technical report MIT/LCS/TR-484, authored by Ladin, Liskov, Shrira, and Ghemawat. People interested in Lazy Replication and its performance can send me a note and I will send a hard copy of the paper. Alternatively, a postscript version of the paper can be obtained via anonymous ftp from mintaka.lcs.mit.edu (18.26.0.36) as pub/pm/lazyrep.PS.

The second point concerns the performance data themselves. Our paper describes an experiment designed to evaluate the cost of Lazy Replication: it compares the response time and request processing capacity of a prototype replicated server, implemented using our replication scheme, with those of a comparable nonreplicated server. The point of our data is the comparison between the two implementations rather than how fast the replicated implementation ran. We used the Argus system as the implementation platform for the experiments, so our system does not perform as well as it could. For example, all client interactions with the server occur within atomic transactions even though transactions are not needed in our method. We were pleasantly surprised, therefore, to hear that our response times compare so favorably with those of the highly engineered ISIS system!

Liuba Shrira
liuba@lcs.mit.edu
ken@CS.Cornell.EDU (Ken Birman) (03/27/91)
In article <1991Mar26.033004.25436@mintaka.lcs.mit.edu> liuba@proton.lcs.mit.edu (Liuba Shrira) writes:
>... First, the Lazy Replication paper that appeared in PODC 90 does not
>contain any performance data....

Actually, my comments were based on Sec. 3.6.1 of the PODC paper, which did give some numbers.

>... the highly engineered ISIS system!

Thanks... but it is funny to read this even as I type in a revision to our TOCS paper explaining why ISIS multicast is so much slower than Amoeba or X-kernel multicasts! Both systems are losing heavily by running over UNIX -- my guess is that Argus is only a minor problem.

The situation we see in ISIS -- I wonder if you run into this too -- is that UNIX tends to throw away a lot of UDP messages if you present it with lots of data at once. For example, we have been looking at how ISIS performance varies as you stream data to larger and larger process groups. What we find is that the dominant effect occurs when UNIX suddenly decides that there isn't enough mbuf memory left and starts to discard both outgoing messages, from the sender, and incoming ones (in this kind of test, mostly acknowledgement packets). Unfortunately, UNIX usually doesn't indicate that it is doing this, and it never indicates when it discards an incoming packet, except to keep a count in some internal, undocumented kernel data structure.

Packet loss is a performance disaster, and to make matters worse, there seems to be no way to detect it as it happens. You can tell after the fact that it is happening -- by running netstat -m, for example -- but at runtime there is no UNIX system interface to indicate that the kernel is being overfilled with data. So, you give it the data, it thanks you and discards it, and then you sit and wait to see if it went anywhere. Needless to say, this causes a lot of multi-second delays, waiting for acks that never arrive, retransmitting packets, and so on.
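Since the kernel drops datagrams silently, the only recourse at the application layer is to number packets, infer loss from what never gets acknowledged, and bound the amount of unacknowledged data in flight so the socket buffers are not overrun in the first place. The toy sketch below illustrates that pattern only; it is not the ISIS flow-control code, and every name in it is invented.

```python
# Toy sender-side flow control over an unreliable datagram transport:
# a fixed window bounds unacked packets, and anything still unacked
# below the receiver's high-water mark is presumed lost.

class FlowControlledSender:
    def __init__(self, window):
        self.window = window    # max packets allowed in flight
        self.next_seq = 0
        self.unacked = set()

    def try_send(self):
        """Return the next sequence number to transmit, or None if the
        window is full and the caller must wait for acknowledgements."""
        if len(self.unacked) >= self.window:
            return None
        seq = self.next_seq
        self.next_seq += 1
        self.unacked.add(seq)
        return seq

    def on_ack(self, seq):
        self.unacked.discard(seq)

    def presumed_lost(self, acked_up_to):
        """Packets still unacked below the receiver's high-water mark
        were probably dropped and should be retransmitted."""
        return sorted(s for s in self.unacked if s < acked_up_to)

sender = FlowControlledSender(window=3)
sent = [sender.try_send() for _ in range(4)]
print(sent)                     # [0, 1, 2, None]: window full
sender.on_ack(1)                # ack for packet 1 arrives
print(sender.try_send())        # 3: window has room again
print(sender.presumed_lost(3))  # [0, 2]: likely dropped, retransmit
```

The window keeps the sender from blasting the kernel with more data than the mbuf pool can absorb, which is exactly the failure mode described above; the cost is that a lost ack stalls the sender until a retransmission timer fires.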
Of course, there are flow control heuristics that help, a little, but my belief is that when we measure ISIS or LR or ARGUS over UNIX, we are so far from what the hardware can do, and running over such a complex and quirky black box, that the situation is really pretty hopeless. For ISIS, the plan is to rebuild our system over the X-kernel inside Mach, or inside Chorus (a comparable environment), as close to the hardware as possible. This should give us direct access to information like memory usage in the kernel and to hardware multicast, and will make a huge difference to our system performance.

Is anyone aware of successful multicast mechanisms layered over UNIX? I would be very interested to hear about other experiences...

-- Ken
--
Kenneth P. Birman                            E-mail: ken@cs.cornell.edu
4105 Upson Hall, Dept. of Computer Science   TEL:    607 255-9199 (office)
Cornell University Ithaca, NY 14853 (USA)    FAX:    607 255-4428