[mod.computers.vax] A series of command procedures for cluster synchronizing

rlb@mece5.UUCP (06/19/86)

 
To keep the clocks on our VAXcluster synchronized we use a combination of
mechanisms.  Over an observation period of 2-3 weeks we determined which
node had the most stable and reliable CPU clock.  That node became the
reference node.
 
Each node runs a "midnight" command procedure that does nothing until
midnight or later.  To do the waiting it calls a "wait_until" procedure.
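
As a rough illustration of the waiting step, here is a small sketch in
Python rather than DCL.  It is not the actual "wait_until" procedure; the
function name seconds_to_wait is hypothetical, and the sketch only shows
the arithmetic such a procedure needs: wait for the time remaining, or not
at all if the target time has already passed.

```python
from datetime import datetime

def seconds_to_wait(now, target):
    """Seconds to sleep before `target`; zero if `target` has passed.

    Hypothetical helper for illustration, not the real "wait_until".
    """
    remaining = (target - now).total_seconds()
    # "Midnight or later": once the target has passed, do not wait at all.
    return max(0.0, remaining)
```

A caller would sleep for the returned number of seconds and then carry on
with the rest of the "midnight" work.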
 
At or after midnight, the "midnight" procedure invokes a "timeset"
procedure, which does the synchronizing.  "timeset" first checks whether
it is running on the reference node.  If so, it simply exits.  Otherwise
it invokes a procedure over DECnet to retrieve the current time on the
reference node.  Immediately upon receiving the reference time, it calls
f$time() to get its own current time, then computes a delta time using
TIME.EXE from Fermilab (available on DECUS tapes).  I have had to make
some corrections, which I may post later.
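
The delta computation can be sketched as follows.  This is Python, not
TIME.EXE, and the names parse_vms_time, clock_delta and format_delta are
made up for the example.  It assumes time strings in the
"dd-mmm-yyyy hh:mm:ss.cc" layout that f$time() returns, and an English
locale for the month abbreviations.  The signed output format is only an
illustration, not what TIME.EXE prints.

```python
from datetime import datetime, timedelta

# Layout of a VMS absolute time string, e.g. "19-JUN-1986 00:00:05.20".
VMS_FORMAT = "%d-%b-%Y %H:%M:%S.%f"

def parse_vms_time(text):
    """Parse a VMS-style absolute time (a leading blank in the day is ok)."""
    return datetime.strptime(text.strip(), VMS_FORMAT)

def clock_delta(reference_text, local_text):
    """Reference time minus local time; positive if the local clock is slow."""
    return parse_vms_time(reference_text) - parse_vms_time(local_text)

def format_delta(delta):
    """Render a delta as sign, days, then hh:mm:ss.cc (illustrative format)."""
    sign = "-" if delta < timedelta(0) else "+"
    # Work in whole hundredths of a second, the resolution of the strings.
    centis = abs(delta) // timedelta(milliseconds=10)
    seconds, cc = divmod(centis, 100)
    minutes, ss = divmod(seconds, 60)
    hours, mm = divmod(minutes, 60)
    days, hh = divmod(hours, 24)
    return f"{sign}{days} {hh:02d}:{mm:02d}:{ss:02d}.{cc:02d}"
```

Note that the delta includes whatever time the DECnet request took, which
is why the local time has to be read immediately after the reference time
arrives.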
 
Once the delta time has been computed, the procedure adjusts the local
clock to match the reference node.
 
If a suitable process were running on the reference node, it would be
possible to replace this whole scheme with a similar one that uses the
distributed lock manager (via SYS$ENQ) to pass quadword time values from
the reference node to the slave nodes.
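
For readers unfamiliar with the quadword time format: VMS keeps absolute
time as a 64-bit count of 100-nanosecond intervals since 17-NOV-1858
00:00.  The sketch below, in Python, only shows that encoding.  It does
not touch the lock manager, and the names to_quadword and from_quadword
are hypothetical.

```python
import struct
from datetime import datetime, timedelta

# Base date of VMS absolute time: 17-NOV-1858 00:00:00.
VMS_EPOCH = datetime(1858, 11, 17)

def to_quadword(moment):
    """Encode a datetime as 8 little-endian bytes of 100 ns ticks."""
    microseconds = (moment - VMS_EPOCH) // timedelta(microseconds=1)
    return struct.pack("<Q", microseconds * 10)

def from_quadword(raw):
    """Decode 8 little-endian bytes of 100 ns ticks back to a datetime."""
    (ticks,) = struct.unpack("<Q", raw)
    return VMS_EPOCH + timedelta(microseconds=ticks // 10)
```

A value in this form is eight bytes long, so it is small enough to hand
from one node to another as a single unit.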
 
I will post the procedures mentioned here in subsequent messages.
If you wish to call me, I can be reached at (919)549-3627.
 
Bob Boyd, Mgr CAE Computer Facility Operation
POB 13049, MS 7T2-04
GE Semiconductor
RTP, NC 27709-3049