[comp.sys.isis] Isis logging tool in V2.1

ken@cs.cornell.edu (Ken Birman) (04/26/91)

> From stasel@cs.unc.edu Thu Apr 25 17:16:42 1991
> Subject: ISIS LOG TOOL
> To: isis-bugs@cs.cornell.edu


> I would appreciate your help in answering a few questions.

> BACKGROUND:  I am running isis version 2.1 using the log manager
> with automatic logging and manual flushing (I have also tried automatic
> flushing).  The problem is that when all processes of a group die I am
> able to restart any process even if the process has an out-of-date log.
> On page 252 of the manual it seems to state that this will not happen, 
> that pg_join will die with an IE_MUSTJOIN error.

> In our class someone thought the meaning of the book was that this would
> happen (an IE_MUSTJOIN error) only if the log were really out of date.

> QUESTIONS:

> 1.  What are the semantics of the restart from total failure in version 2.1?  
>     (Intended and actual).

In V2.1 and V3.0, the tool requires that the lmgr, rmgr and news (!) subsystems
be included in the isis.rc file.  

V2.1 further requires that news "survive" the failure of your application.
If news itself goes down, the problem you have observed will often occur
after restart.

> 2.  What are the semantics of the restart from total failure in version 3.0?

In V3.0 the system handles this problem correctly even if a total failure
occurs and all of ISIS gets shut down and restarted.  But we still require
that the news facility be run -- in fact, we need the new, V3.0 news
facility -- and recovery gets stuck if none of the copies of news that
are up was also up when the group was last active.

The basic idea is that news maintains "last view" information for the
groups that use this type of logging.  Unless we can track down the last
view, the first process to recover will always be allowed to restart.
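
As a rough illustration of that rule, here is a small Python model.  It is
a sketch only and not ISIS code: LastView and may_restart_group are
hypothetical names, and the membership test assumes that only processes
belonging to the last view can hold an up-to-date log.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class LastView:
    """Hypothetical record of a group's final membership before total failure."""
    group: str
    members: FrozenSet[str]


def may_restart_group(process: str, last_view: Optional[LastView]) -> bool:
    """Return True if `process` may restart the group from its own log.

    `last_view` is None when no running copy of news remembers the view.
    """
    if last_view is None:
        # Nothing to check the log against, so the first process to
        # recover is allowed to restart (the behavior described above).
        return True
    # Assumption: only members of the last view can have a current log.
    # Anyone else would have to join rather than restart the group.
    return process in last_view.members
```

With a known last view of {"p1"}, may_restart_group("p2", view) is False,
whereas with last_view=None any process is let through, which matches the
symptom reported in the question.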

> 3.  If I am reading the book wrong and the semantics are that IE_MUSTJOIN 
>     will happen only if the log is significantly out-of-date, can this be
>     worked around so that it must be completely up-to-date?

Well, this is probably due to having had the news system shut down under
V2.1, and it would go away under the V3.0 solution.

> 4.  If it is a programming error of mine, are there any obvious mistakes I
>     could have made that you know of?

Looks like you were right and the manual was missing an important warning.

> Judith A. Stasel
> CS Grad - Chapel Hill
-- 
Kenneth P. Birman                              E-mail:  ken@cs.cornell.edu
4105 Upson Hall, Dept. of Computer Science     TEL:     607 255-9199 (office)
Cornell University Ithaca, NY 14853 (USA)      FAX:     607 255-4428