ken@gvax.cs.cornell.edu (Ken Birman) (09/28/90)
For those who might be interested... the deal is basically this. Grid has this problem because, during a join, ISIS needs to flush certain information through the bypass communication channels between the two processes involved. It doesn't "mark" messages based on their sources, so these messages (which must be delivered before the join can be done) are internally indistinguishable from the ones that grid is generating by doing cbcasts to itself.

If we could distinguish them, the join algorithm would block these new cbcasts while letting through the flush and terminate messages, along with the messages that these "flush through the channels". Not being able to distinguish these "old" messages from new ones, the current implementation just waits until there are "no messages or tasks to run", which never happens while grid keeps sending to itself.

We could and probably will improve this mechanism, but I see no urgency in this specific case. After all, it is hard to think of a good reason that any real application should go into an infinite loop, sending messages to itself with such vigor!

Ken
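
The behavior described above can be illustrated with a toy simulation. This is only a sketch and is not ISIS code: the function `simulate_join`, the "old"/"new" tags, and the single-queue model of the bypass channel are all assumptions made for illustration. It models a process that answers every delivered cbcast with another cbcast to itself, and compares waiting for an empty queue (no marking) against holding back messages marked as new.

```python
from collections import deque

def simulate_join(mark_messages, max_steps=1000):
    """Toy model (hypothetical, not the ISIS implementation).

    Returns the number of delivery steps until the channel is quiet,
    or None if it is still busy after max_steps.
    """
    # In flight when the join starts: one self-addressed cbcast and the flush.
    channel = deque([("cbcast", "old"), ("flush", "old")])
    held = deque()  # new cbcasts blocked until the join is done
    steps = 0
    while channel and steps < max_steps:
        kind, tag = channel.popleft()
        steps += 1
        if mark_messages and tag == "new":
            # With marking, new traffic can be told apart and held back.
            held.append((kind, tag))
            continue
        if kind == "cbcast":
            # The application reacts to each delivery by sending to itself again.
            channel.append(("cbcast", "new"))
    return steps if not channel else None

print(simulate_join(mark_messages=False))  # channel never drains
print(simulate_join(mark_messages=True))   # drains once old messages are delivered
```

Without marking, the tag is ignored and the queue never empties, which corresponds to waiting forever for "no messages or tasks to run". With marking, the old cbcast and the flush are delivered, the new cbcast is set aside, and the join could proceed.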