[comp.sys.apollo] ncs-clocks skew warning

bonnetf@apo.esiee.fr (bonnet-franck) (03/19/90)

hello,
I am getting a warning about the clocks of the two glbd daemons on my system.
The message looks like this:

drm_admin: lrep -clocks
	dds://tpd1            	1990/03/19.11:48	*** clock skew warning ***
	dds://sig2            	1990/03/19.11:41	*** clock skew warning ***

What EXACTLY does it mean? I tried the merge_all command, which completed successfully
but did not correct the problem.
One daemon runs on 10.2 and the other on 10.1. Previously, both were running on 10.1
machines.
It doesn't seem to affect the system, but it worries me a bit ...
---------------------------------------
bonnetf@apo.esiee.fr                   
Frank Bonnet                          
E.S.I.E.E
BP99 93162 Noisy le Grand cedex.FRANCE.
Fax : 33 1 45.92.66.99
---------------------------------------

dente@els.uucp (Colin Dente) (03/20/90)

In article <9003191219.AA12175@apo.esiee.fr> bonnetf@apo.esiee.fr (bonnet-franck) writes:
>hello,
>I am getting a warning about the clocks of the two glbd daemons on my system.
>The message looks like this:
>
>drm_admin: lrep -clocks
>	dds://tpd1            	1990/03/19.11:48	*** clock skew warning
>	dds://sig2            	1990/03/19.11:41	*** clock skew warning
>
>What EXACTLY does it mean?

What it means is that the clocks on the two systems are set to
different times.  This will not usually cause problems, but it could
result in data being lost: when glbds synchronize their databases,
the most recent entry from either database is used, and, with the
clocks skewed, the most up-to-date entry might not carry the most
recent time stamp.
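
To illustrate how skewed clocks can make a stale entry win, here is a minimal Python sketch of a "newest time stamp wins" merge. It is not the actual glbd replication code; the `merge` function, the tuple layout, and the minute-based times are all assumptions made up for this example.

```python
def merge(db_a, db_b):
    """Merge two replica databases, keeping the entry with the newer stamp per key.

    Each database maps a key to a (time_stamp, value) tuple. This layout is
    hypothetical and only serves the illustration.
    """
    merged = dict(db_a)
    for key, (stamp, value) in db_b.items():
        if key not in merged or stamp > merged[key][0]:
            merged[key] = (stamp, value)
    return merged


# Assumption: replica A's clock runs 7 minutes fast; times are in minutes.
SKEW_A = 7

# A registers an entry at real time 100, but stamps it with its fast clock.
db_a = {"server": (100 + SKEW_A, "old-address")}

# B records a newer registration at real time 103, stamped correctly.
db_b = {"server": (103, "new-address")}

result = merge(db_a, db_b)
print(result["server"])  # the older registration wins because of the skew
```

With both clocks in agreement (SKEW_A = 0), the same merge would keep the entry from B, which is the one actually registered last.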

The way round this is to reset the system clocks using the stand-alone
utility CALENDAR, i.e. EX CALENDAR from the MD.

Colin

 Colin Dente                      | JANET: dente@uk.ac.man.ee.els
 Dept. of Electrical Engineering  | ARPA:  dente@els.ee.man.ac.uk 
 University of Manchester         | UUCP:  ...!mcvax!ukc!man.ee.els!dente 
 England                          | These might work now, but then again...

krowitz%richter@UMIX.CC.UMICH.EDU (David Krowitz) (03/20/90)

The global location brokers use the time and date when updating each other
as NCS servers start up or shut down. If the clocks on the machines
running GLBD are more than 5 or 10 minutes out of sync, the location
brokers cannot update each other, because events registered on one
machine can appear to occur in the *future* on the other machine.
Under SR10.2 (and possibly SR10.1) you can use the Unix "date" command
to reset the machine's time and date to the correct value without having to
shut down the machine to run EX CALENDAR. There is also a new daemon called
"timed" which is supposed to keep machines' clocks from drifting
with respect to one another. I haven't tried it out yet, so I can't say
whether it works or not.
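
As a rough check of how far apart the two clocks in the original post are, the following Python sketch parses the two time stamps from the "lrep -clocks" listing and compares their difference with a threshold. It assumes the two stamps are each replica's idea of the current time, read at about the same moment. The 5-minute threshold is taken from the "5 or 10 minutes" estimate above and is not a documented glbd limit; the function names are made up for this example.

```python
from datetime import datetime, timedelta

# Format of the stamps as they appear in the listing, e.g. 1990/03/19.11:48
STAMP_FORMAT = "%Y/%m/%d.%H:%M"

# Assumed threshold, based on the rough estimate in this thread.
THRESHOLD = timedelta(minutes=5)


def parse_stamp(text):
    """Turn a stamp such as '1990/03/19.11:48' into a datetime."""
    return datetime.strptime(text, STAMP_FORMAT)


def clock_skew(stamp_a, stamp_b):
    """Return the absolute difference between two stamps as a timedelta."""
    return abs(parse_stamp(stamp_a) - parse_stamp(stamp_b))


skew = clock_skew("1990/03/19.11:48", "1990/03/19.11:41")
print("skew:", skew)
print("exceeds threshold:", skew > THRESHOLD)
```

For the two replicas shown in the original post, this gives a skew of 7 minutes, which falls inside the range where trouble was said to start.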


 -- David Krowitz

krowitz@richter.mit.edu   (18.83.0.109)
krowitz%richter.mit.edu@eddie.mit.edu
krowitz%richter.mit.edu@mitvma.bitnet
(in order of decreasing preference)

sommerfeld@apollo.hp.com (Bill Sommerfeld) (03/20/90)

In article <9003191219.AA12175@apo.esiee.fr> bonnetf@apo.esiee.fr (bonnet-franck) writes:

   What EXACTLY does it mean?

It means exactly what it says :-).

The replication algorithm used by the GLB depends on the clocks on the
nodes running the GLB being loosely synchronized; the "clock skew
warning" means that the clocks are drifting too far apart.

The GLB won't correct this for you automatically; instead, you have to
synchronize the clocks in some other way.

Doing it manually:

- using a wristwatch and /bin/date whenever you see the "clock skew
warning".

Doing it automatically (more work to set up; less work over time):

- If both glbd's are on the same physical network and both have IP
(some say "TCP") enabled, set the nodes up to run /etc/timed all the
time.  Timed should keep your clocks synchronized to within a second
or so.  It works by "averaging" all your clocks together, so it works
best if there are a relatively large number of clocks participating.
I believe timed is new as of SR10.2; the binary included in SR10.2
might work on 10.1, but I haven't tried it.

- If you have a flair for overkill, and you're connected to the
Internet or have access to a proper time source (typically a radio
clock), you can bring up NTP.  This will tend to keep your node clocks
synchronized to within about 10-20 milliseconds of the correct time.

There are two different implementations of NTP for UNIX which are
generally available, "ntpd", done at the University of Maryland, and
"xntpd", by Dennis Ferguson of the University of Toronto.  I've
managed to get both working on SR10.2, and if enough people are
interested in patches, I could be convinced to make them available,
although porting them was really very trivial.  "xntpd" is a bigger,
more featureful implementation; "ntpd" is kind of minimalist.
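
The "averaging" idea mentioned for timed above can be sketched in a few lines of Python. This is only an illustration of averaging a set of clock readings and handing each node a correction; it is not the actual timed protocol, it ignores network delay and the rejection of wildly wrong clocks, and the function name, the third node, and the readings are all made up.

```python
def averaging_round(readings):
    """Compute one correction per node so that all clocks meet at the mean.

    readings maps a node name to its clock reading in seconds.
    Returns a dict mapping each node name to the correction to apply.
    """
    mean = sum(readings.values()) / len(readings)
    return {node: mean - value for node, value in readings.items()}


# Hypothetical readings in seconds, relative to the slowest clock.
# tpd1 is 7 minutes ahead of sig2; node3 is an invented third machine.
readings = {"tpd1": 420.0, "sig2": 0.0, "node3": 30.0}

corrections = averaging_round(readings)
adjusted = {node: readings[node] + corrections[node] for node in readings}

for node in readings:
    print(node, "correction:", corrections[node], "adjusted:", adjusted[node])
```

Note that averaging makes the clocks agree with each other, not with the correct time: one badly wrong clock pulls the mean with it, which is why more participating clocks help.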

				- Bill Sommerfeld, amateur clock watcher
				sommerfeld@apollo.com
				sommerfeld@apollo.hp.com