rlb@rtpark.UUCP ("Bob Boyd,8*274-3627") (07/29/86)
In order to keep our VAXcluster synchronized we use a combination of mechanisms. Over an observation period of 2-3 weeks we determined which node has the most stable/reliable cpu clock. This becomes the reference node.

Each node runs a "midnight" command procedure which makes sure that it waits until midnight or later to execute. For this we use a "wait_until" procedure. Then at or after midnight, the procedure invokes a "timeset" procedure which does the synchronizing. This procedure looks to see if it is on the reference node or another one. If it is on the reference node it simply exits. If it is on another node it invokes a procedure over DECnet to retrieve the current time on the reference node. Then immediately upon receiving the reference time, it uses f$time() to get its current time. Then it computes a delta time using TIME.EXE from FERMILAB (available on DECUS tapes). There are some corrections that I have had to make, so I may post those later. Then once the delta time has been computed it goes about altering the clock to match the reference node.

If an appropriate process were running on the reference cluster node it would be possible to replace this whole scheme with a similar one based on SYS$ENQ to use the distributed lock manager to pass quadword time values from the reference to the slave nodes.

The procedures I mentioned here will be posted in subsequent messages. If you wish to call me, I can be reached at (919)549-3627.

Bob Boyd, Mgr CAE Computer Facility Operation
POB 13049, MS 2P-04
GE Semiconductor
RTP, NC 27709-3049
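
For readers who want to see the delta-time step in isolation, here is a rough sketch of the arithmetic in Python. It is not the TIME.EXE code and not the "timeset" procedure itself; the function names (parse_vms_time, clock_delta, format_delta) are made up for illustration, and it assumes both times arrive in the dd-MMM-yyyy hh:mm:ss.cc form that f$time() returns.

```python
from datetime import datetime, timedelta

# Month abbreviations as they appear in a VMS absolute time string.
MONTHS = {"JAN": 1, "FEB": 2, "MAR": 3, "APR": 4, "MAY": 5, "JUN": 6,
          "JUL": 7, "AUG": 8, "SEP": 9, "OCT": 10, "NOV": 11, "DEC": 12}


def parse_vms_time(text):
    """Parse an absolute time such as '30-JUL-1986 00:00:05.12' (assumed format)."""
    date_part, time_part = text.split()
    day, mon, year = date_part.split("-")
    hms, _, hundredths = time_part.partition(".")
    hour, minute, second = (int(x) for x in hms.split(":"))
    # Two-digit hundredths of a second are converted to microseconds.
    micros = int(hundredths or 0) * 10_000
    return datetime(int(year), MONTHS[mon.upper()], int(day),
                    hour, minute, second, micros)


def clock_delta(reference, local):
    """Return (sign, magnitude): how far the local clock must move to match."""
    diff = parse_vms_time(reference) - parse_vms_time(local)
    sign = "+" if diff >= timedelta(0) else "-"
    return sign, abs(diff)


def format_delta(delta):
    """Format a non-negative timedelta as 'd hh:mm:ss.cc'."""
    total_cs = (delta.days * 86400 + delta.seconds) * 100 \
        + delta.microseconds // 10_000
    days, rem = divmod(total_cs, 8_640_000)
    hours, rem = divmod(rem, 360_000)
    minutes, rem = divmod(rem, 6_000)
    seconds, cs = divmod(rem, 100)
    return f"{days} {hours:02d}:{minutes:02d}:{seconds:02d}.{cs:02d}"


if __name__ == "__main__":
    # Reference node is 2.38 seconds ahead of the local node.
    sign, delta = clock_delta("30-JUL-1986 00:00:07.50",
                              "30-JUL-1986 00:00:05.12")
    print(sign, format_delta(delta))  # prints: + 0 00:00:02.38
```

A "+" sign means the local clock is behind the reference and has to be moved forward; "-" means it is ahead. The sketch ignores the DECnet round-trip delay between reading the reference time and reading the local time.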