[comp.sys.isis] Tool for getting single answer back from broadcast

jyl@noam.Eng.Sun.COM (Jacob Levy) (01/30/91)

Sometimes you do a broadcast and want to get exactly one answer
back. The standard example is retrieval of replicated data. In
this case it is OK to wait only for one answer since the other
ones are redundant. ISIS currently still broadcasts to all
members of the group.

My question is: would it be possible to have a tool which, given
a PG address, would try getting the answer from each process in
turn, only attempting to broadcast to the next one after the
current one hasn't answered?

The delay caused by the use of such a tool in the general case
where no failures occur should be the same as that of
broadcasting to the entire group, since we'd be waiting for the
first answer anyway. There are probably a lot of tricky issues
wrt maintaining causality, e.g. those processes that did not
receive the broadcast must still behave as if they are in the
causal path of the broadcast.
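
The delay argument can be made concrete with a toy model. This is only a sketch under stated assumptions: each member has a fixed round-trip time, a failed member is detected by a fixed timeout, and the names broadcast_delay, sequential_delay and timeout_ms are hypothetical, not part of ISIS.

```c
#include <assert.h>

/* Toy model: rtt_ms[i] is member i's round-trip time in milliseconds;
 * a negative value marks member i as failed (it never answers). */

/* Broadcast to everyone and keep the first answer:
 * the delay is that of the fastest live member, or -1 if none is live. */
static int broadcast_delay(const int *rtt_ms, int n)
{
    int best = -1;
    assert(n > 0);
    for (int i = 0; i < n; i++)
        if (rtt_ms[i] >= 0 && (best < 0 || rtt_ms[i] < best))
            best = rtt_ms[i];
    return best;
}

/* Ask members one at a time, in order, paying timeout_ms for each
 * failed member before moving on.  Returns -1 if nobody answers. */
static int sequential_delay(const int *rtt_ms, int n, int timeout_ms)
{
    int total = 0;
    assert(n > 0);
    for (int i = 0; i < n; i++) {
        if (rtt_ms[i] >= 0)
            return total + rtt_ms[i];   /* first live member answers */
        total += timeout_ms;            /* waited in vain for a dead one */
    }
    return -1;
}
```

With no failures and equal round-trip times the two delays coincide in this model; each failed member tried before the first live one adds one timeout to the sequential case.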

--JYL

ken@gvax.cs.cornell.edu (Ken Birman) (01/30/91)

In article <6934@exodus.Eng.Sun.COM> jyl@noam.Eng.Sun.COM (Jacob Levy) writes:
>Sometimes you do a broadcast and want to get exactly one answer
>back. The standard example is retrieval of replicated data. In
>this case it is OK to wait only for one answer since the other
>ones are redundant. ISIS currently still broadcasts to all
>members of the group.
>
>My question is: would it be possible to have a tool which, given
>a PG address, would try getting the answer from each process in
>turn, only attempting to broadcast to the next one after the
>current one hasn't answered?
>
>The delay caused by the use of such a tool in the general case
>where no failures occur should be the same as that of
>broadcasting to the entire group, since we'd be waiting for the
>first answer anyway. There are probably a lot of tricky issues
>wrt maintaining causality, e.g. those processes that did not
>receive the broadcast must still behave as if they are in the
>causal path of the broadcast.
>
>--JYL

This is a good suggestion, and we have been planning to add an ISIS
interface for this purpose.  It would be presented as a "repeated RPC".
ISIS already has something along these lines -- in the remote exec tool.
We just need to break it out and clean up its interface.

Since V3.0 is frozen (the IDS product announcement is already in the
mail), this will go into the new ISIS design.  However, such a tool
is really easy to code.  It would look something like this:

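	/* Retry until a reply arrives, the tries run out, or there is no view. */
	while(ntries-- && (gv = pg_getview(gaddr)))
	{
	        /* Pick one member of the current view, spread by caller id. */
	        who = &gv->gv_members[my_process_id%gv->gv_nmemb];
	        /* Send to that member only, asking for one reply. */
	        switch(cbcast(who, entry, "...", 1, "..."))
	        {
	          case -1:        /* error */
	            return(-1);
	          case 1:         /* got the reply */
	            return(1);
	          /* otherwise: no reply; loop and re-read the view */
	        }
	}
	return(-1);             /* out of tries, or no view */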
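
For experimenting without ISIS, the same retry loop can be simulated in plain C. This is only a sketch: struct view, alive, send_to_one, drop_member and repeated_rpc are hypothetical stand-ins, not ISIS calls, and the view change after a failure is simulated by deleting the silent member from a local array.

```c
#include <assert.h>

#define MAXMEMB 8

/* Hypothetical stand-in for a group view: the ids of the live members. */
struct view {
    int nmemb;
    int members[MAXMEMB];
};

/* Hypothetical failure table, indexed by member id: 1 = up, 0 = crashed. */
static int alive[MAXMEMB] = {1, 1, 1, 1, 1, 1, 1, 1};

/* Stub for a one-destination send that asks for one reply.
 * Returns the number of replies collected: 1 if the target is up, else 0. */
static int send_to_one(int member)
{
    assert(member >= 0 && member < MAXMEMB);
    return alive[member] ? 1 : 0;
}

/* Remove the member at position idx, as a view change would after a failure. */
static void drop_member(struct view *v, int idx)
{
    for (int i = idx; i < v->nmemb - 1; i++)
        v->members[i] = v->members[i + 1];
    v->nmemb--;
}

/* Try one member at a time until someone answers.
 * Returns the id of the member that replied, or -1 if nobody did. */
static int repeated_rpc(struct view *v, int my_id, int ntries)
{
    assert(v->nmemb <= MAXMEMB);
    while (ntries-- > 0 && v->nmemb > 0) {
        int idx = my_id % v->nmemb;   /* same choice rule as the loop above */
        int who = v->members[idx];
        if (send_to_one(who) == 1)
            return who;
        drop_member(v, idx);          /* simulate the next view */
    }
    return -1;
}
```

With members {0, 1, 2, 3}, member 2 down, caller id 6 and three tries, the first attempt picks index 6 % 4 = 2 and gets no reply; after the simulated view change the second attempt picks index 6 % 3 = 0 and member 0 answers.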

More generally, your point is right on the mark.  I generally recommend
that you not multicast to large groups (always use the smallest group
that you can) and that you not multicast messages that don't change the
state of the group, i.e. that don't cause an update to occur.  A second
recommendation is that if you do issue a multicast to initiate an update,
asking for one reply, don't use the coordinator-cohort tool unless the
computation is a long one (i.e. a few seconds).  For a short computation,
you are better off having all members send replies even though only one is
needed.  The extra replies won't cost anything because ISIS has to send
transport-level acknowledgements anyhow (!).  They will just get thrown
away silently on arrival.  In comparison, a coordinator-cohort scheme would
involve about twice as many packets on the wire, because the caller does a
multicast and the coordinator replies with a multicast.
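
The "about twice as many packets" estimate can be checked with a back-of-the-envelope count. The model below is an assumption made for illustration, not a description of the ISIS transport: a multicast to n destinations costs n data packets plus n acknowledgements, a reply rides for free on an acknowledgement, and the coordinator's multicast goes to the caller and the n - 1 cohorts. All function names are hypothetical.

```c
#include <assert.h>

/* Packets for one multicast to n destinations in this toy model:
 * n point-to-point data packets plus n transport-level acks. */
static int multicast_packets(int n)
{
    assert(n > 0);
    return n + n;
}

/* Every member replies.  Each reply is assumed to ride on the ack
 * that had to be sent anyway, so it adds no packets. */
static int all_reply_packets(int n)
{
    return multicast_packets(n);
}

/* Coordinator-cohort: the caller's multicast to the n members, then a
 * second multicast from the coordinator carrying the reply to the
 * caller and the n - 1 cohorts (again n destinations). */
static int coord_cohort_packets(int n)
{
    return multicast_packets(n) + multicast_packets(n);
}
```

For a group of 5 this model gives 10 packets when all members reply and 20 for coordinator-cohort, i.e. the factor of two mentioned above.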

We probably need to write a paper on "rules of thumb for group programming".
One more topic on the list of things to do...

Ken