[comp.sys.isis] looking for real-life examples of isis-groups

dalia@SHUM.HUJI.AC.IL (Dalia Malki) (05/12/91)

I would appreciate replies from ISIS users out there concerning the
group services given by ISIS:

I would like to assemble some real life examples where inter-group
dependence arises.  It appears that the ISIS algorithms are considerably
complicated by the possible existence of multiple, overlapping groups.
This complication arises only if there is causality dependence among
them. Intuitively, it seems like the right concept. To be more
specific, Ken Birman commented in previous correspondence:

> Yes, we do believe that causality is important when multiple groups
> overlap.  The problem is that asynchronous updates create the
> obligation to enforce causality when there may be multiple communication
> routes out of and back into a group.  In systems with an "object oriented"
> structure or systems where a server is made of programs that subscribe
> to other servers, one often sees large numbers of groups overlapped in a
> way that can raise this issue.

If you have such a need in your application, could you please send me a
description of it, including the general structure and where
inter-group dependency occurs. Please emphasize why you need multiple groups
instead of, for instance, one big multicast group for all related parties.

Please mail the replies directly to me, and I promise to make a
summary of the interesting ones and post it in the newsgroup for the
public benefit.

Thanks in advance,

- Dalia Malki
------------------------------------------------------------
E-mail: dalia@humus.huji.ac.il

Dalia Malki
Computer Science dept.,
The Hebrew University of Jerusalem,
Givat Ram, Jerusalem 91904
Israel

ken@CS.Cornell.EDU (Ken Birman) (05/13/91)

In article <dalia.674044526@shum> dalia@SHUM.HUJI.AC.IL (Dalia Malki) writes:
>This complication arises only if there is causality dependence among them...

Two clarifications:

First, I want to point people to the more complete discussion of this
issue in the paper that I recently wrote with Robert Cooper and Barry
Gleeson (Programming with process groups: Group and multicast semantics).

Also, readers may want to keep in mind that if they use the ISIS tools
in applications that contain multiple groups, they "automatically" need
the causal property.  This is because the tools are quite asynchronous
(i.e. they let processes get quite far out of sync with one another),
especially replicated data update using cbcast or abcast with no replies,
and the coordinator-cohort tool.  In these cases, if you have multiple
groups, the ISIS tools themselves would be buggy if we didn't preserve
causality over group boundaries.  For example, two replicated writes might
be seen in inconsistent orders by members of a group or a coordinator-cohort
application might simply hang.
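
To make the "inconsistent orders" case concrete, here is a minimal sketch
of the textbook vector-timestamp test for causal delivery inside one group.
It is an illustration only, not the ISIS implementation: the names
can_deliver and deliver, and the processes a, b and c, are hypothetical.
Write w1 is multicast by a; b delivers it and then multicasts w2, so w2
causally follows w1.  Member c happens to receive w2 first.

```python
def can_deliver(local_vt, msg_vt, sender):
    """Textbook causal-delivery test (an assumption, not ISIS code).

    A message is deliverable when it is the next one expected from its
    sender and the receiver has already delivered everything else the
    sender had delivered when it sent the message.
    """
    for proc, count in msg_vt.items():
        have = local_vt.get(proc, 0)
        if proc == sender:
            if count != have + 1:
                return False
        elif count > have:
            return False
    return True


def deliver(local_vt, msg_vt, sender):
    """Return the receiver's vector timestamp after delivering the message."""
    updated = dict(local_vt)
    updated[sender] = msg_vt[sender]
    return updated


# Timestamps carried by the two replicated writes.
w1 = {"a": 1, "b": 0}   # sent by a
w2 = {"a": 1, "b": 1}   # sent by b after delivering w1

c_vt = {"a": 0, "b": 0}                # c has delivered nothing yet
early = can_deliver(c_vt, w2, "b")     # w2 arrived first: must be delayed
c_vt = deliver(c_vt, w1, "a")          # w1 arrives and is delivered
late = can_deliver(c_vt, w2, "b")      # now w2 may be delivered
print(early, late)
```

Without the test, c would apply w2 before w1 while b applied them in the
opposite order, which is the inconsistency described above.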

The reason I wanted to make this point is that, in ISIS, the causal
guarantee can be used explicitly, but also enters in an implicit way,
in that it lets you do things that are very asynchronous without being
aware that you have done so -- i.e. without worrying about race conditions,
etc.  

Thus, if you use these sorts of tools and multiple groups, you may want
to respond to Dalia by describing your application, even if you are unclear
on whether the causal issue arises in your case.
-- 
Kenneth P. Birman                              E-mail:  ken@cs.cornell.edu
4105 Upson Hall, Dept. of Computer Science     TEL:     607 255-9199 (office)
Cornell University Ithaca, NY 14853 (USA)      FAX:     607 255-4428

ken@CS.Cornell.EDU (Ken Birman) (05/13/91)

In article <1991May12.174040.24185@cs.cornell.edu> ken@CS.Cornell.EDU (Ken Birman) writes:
>For example, two replicated writes might be seen in inconsistent orders
>by members of a group or a coordinator-cohort application might simply hang.

I got email from two people asking me to "clarify" this point.

The problem is that in ISIS, several parts of the system use asynchronous
cbcast to update replicated data.  In fact, our manual recommends that you
do this, too.  Say that group G={a,b} and that process 'a' uses this
approach.  Now, some process p does an operation on G and 'a' sends back
an answer.  p is not in G.  p now does another operation on G and
'b' receives it.  It is easy to imagine that 'a' sent back information
that reflected the update, but if CBCAST ordering is not preserved "outside"
group boundaries, 'b' may not have seen that update -- in fact, 'b' may not
have seen any of the events in this whole sequence.
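
The sequence above can be replayed in a small simulation.  This is a sketch
under stated assumptions, not how ISIS enforces causality: every message
simply piggybacks the ids of the messages in its sender's causal past, and
a receiver in "causal" mode delays a message until every earlier message
addressed to it has been delivered.  The Message and Process classes, the
message ids (r1, u, rep, r2) and the payload strings are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Message:
    mid: str            # message id
    dests: frozenset    # names of destination processes
    payload: str
    deps: frozenset     # ids of messages in the sender's causal past


class Process:
    def __init__(self, name, causal):
        self.name = name
        self.causal = causal    # enforce causal delivery or not
        self.history = set()    # ids in this process's causal past
        self.seen = set()       # ids actually delivered here
        self.delivered = []     # payloads, in delivery order
        self.pending = []       # messages waiting for their past

    def send(self, mid, dests, payload, registry):
        msg = Message(mid, frozenset(dests), payload, frozenset(self.history))
        registry[mid] = msg
        self.history.add(mid)
        return msg

    def _ready(self, msg, registry):
        if not self.causal:
            return True
        # Only earlier messages addressed to this process can block delivery.
        return all(d in self.seen
                   for d in msg.deps if self.name in registry[d].dests)

    def receive(self, msg, registry):
        self.pending.append(msg)
        progress = True
        while progress:
            progress = False
            for m in list(self.pending):
                if self._ready(m, registry):
                    self.pending.remove(m)
                    self.seen.add(m.mid)
                    self.history |= m.deps | {m.mid}
                    self.delivered.append(m.payload)
                    progress = True


def run(causal):
    registry = {}
    a, b, p = (Process(n, causal) for n in "abp")
    r1 = p.send("r1", {"a"}, "op 1", registry)       # p operates on G via a
    a.receive(r1, registry)
    u = a.send("u", {"b"}, "update", registry)       # asynchronous update to b
    rep = a.send("rep", {"p"}, "answer", registry)   # answer reflects update
    p.receive(rep, registry)
    r2 = p.send("r2", {"b"}, "op 2", registry)       # p operates on G via b
    b.receive(r2, registry)                          # r2 overtakes the update
    b.receive(u, registry)
    return b.delivered


print(run(causal=False))   # b handles op 2 before the update
print(run(causal=True))    # b delays op 2 until the update is delivered
```

In the non-causal run 'b' acts on p's second operation without having seen
the update that 'a' already reported to p; in the causal run the request is
held back until the update arrives.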

In general, the reason that ISIS worries about causality "between multiple
groups" is not that problems develop in group G' because of something that
happened in group G, but rather that if group G is very asynchronous and
one interacts with several of its members, they may not have seen 
critical "past" events, if causality is not preserved.  Thus, we need to
preserve causality "over group boundaries" to avoid a problem that arises
entirely within a single group -- the group got itself into trouble, using
asynchronous cbcast, and the group will see the problem, in the form of
a race condition.

In my mind, the key question is whether asynchronous CBCAST is such a big
win over a synchronous protocol, like ABCAST.  Obviously, I personally
believe the answer is that it is, and I expect our new system to prove
this.  Even in the current version of ISIS, asynchronous CBCAST to small
numbers of destinations is much faster than ABCAST.

So, this gets back to our basic claim: if you use asynchronous CBCAST (or
ABCAST, for that matter), problems can arise unless causality is maintained
at all times -- inside or outside of the group where you took this action.
-- 
Kenneth P. Birman                              E-mail:  ken@cs.cornell.edu
4105 Upson Hall, Dept. of Computer Science     TEL:     607 255-9199 (office)
Cornell University Ithaca, NY 14853 (USA)      FAX:     607 255-4428