ken@gvax.cs.cornell.edu (Ken Birman) (09/28/90)
In some situations, output in the "protos" log may show a task waiting
for the wrong number of replies to a non-bypass broadcast. For example,
I have some output from a test program that does an abcast to a group
with 2 members, asking for 1 reply. Yet a pr_dump output during the run
showed this:

    TASK[c3490]: cl_abcast(aba9c), wants 2 of 2 replies, got 0+0 null, msgid=8630
        Dests: (3/0:2230.8)(1/0:13235.8); stat <WW>

Actually, it turns out that the task really is waiting for 1 reply. The
code in pr_mcast.c that jots this down for use in pr_dump is wrong, but
the collection of replies is fine. Thus, the code is working right but
the dump output is incorrect in a confusing way. This will only occur in
the "protos" log file.

You can correct this by moving line 129 of protos/pr_mcast.c down to
after line 131, so that the count is recorded after it has been clamped
to the number of replies wanted:

    /* ctp->task_nwant = nsent;    Old location */
    if(nsent > nwanted)
        nsent = nwanted;
    ctp->task_nwant = nsent;       /* Moved down */

(Personally, I wouldn't bother.)

A related comment applies when you do a completely asynchronous
broadcast (0 replies) to a group to which the sender does not belong. In
this case, ISIS will change the broadcast to a 1-reply broadcast and
generate the reply itself. This is not a bug; it relates to what we call
an "iterated broadcast" and arises because non-members of a group may
have the wrong idea of who belongs to the group, in which case they have
to "iterate" the broadcast. In such cases, we need to have the sending
task hang around, inside protos, until we know the message got through
and was delivered.

Thus, you could do an abcast or cbcast specifying 0 replies and still
see a protos log file showing this task waiting for 1 reply. This is
correct behavior.

Ken