andy@jhunix.HCF.JHU.EDU (Andy S Poling) (04/12/91)
I'm running the ntpd code from louie.udel.edu on two Ultrix boxes, and a SysV port of the same code on a box running SysV rel 3.1.5, and I'm seeing the same thing happen on all three...

Whenever they choose a server to which to synchronize, they then seem to lose all of the data about that server, causing an "NTP peer lost" message and a constant swapping of the available, sane servers. This just doesn't seem quite right to me, since it causes ntpd to query its favorite two or three servers every 64 seconds most of the time, rather than sliding to a 1024 second interval (which seems to me like the thing to do).

I admit that I haven't studied the NTP spec in detail... Am I missing something? Being stupid? Is there any reason why I shouldn't modify the code to prevent this behavior?

Thanx,
-Andy
--
Andy Poling                   Internet: andy@gollum.hcf.jhu.edu
UNIX Systems Programmer       Bitnet: ANDY@JHUNIX
Homewood Academic Computing   Voice: (301)338-8096
Johns Hopkins University      UUCP: uunet!mimsy!aplcen!jhunix!andy
louie@SAYSHELL.UMD.EDU ("Louis A. Mamakos") (04/13/91)
It sounds like what is happening is that when ntpd initially selects a host, it has to reset the local clock rather than slew it because it was too far off. Whenever the local clock is reset, all of the offset/delay samples in the filters are flushed, and we start all over again. It then reselects a peer, but the selected clock is too far off again.

Could it be that your network paths are exceptionally "noisy" or that your computers are keeping really crummy time?

louie
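
The step-versus-slew behavior described above can be sketched as follows. This is a minimal illustration, not the ntpd source: the names `Peer`, `apply_offset`, `STEP_THRESHOLD_MS`, and `FILTER_SIZE` are hypothetical, and the 128 ms threshold and 8-sample filter are assumed values chosen only for the example.

```python
from collections import deque

STEP_THRESHOLD_MS = 128.0  # assumed cutoff; not taken from the ntpd source
FILTER_SIZE = 8            # assumed number of samples kept per peer


class Peer:
    """One configured server and its recent offset samples."""

    def __init__(self, name):
        self.name = name
        self.samples = deque(maxlen=FILTER_SIZE)

    def add_sample(self, offset_ms):
        self.samples.append(offset_ms)


def apply_offset(peers, offset_ms):
    """Decide how to correct the local clock for the given offset.

    A small offset is slewed and the filters are left alone. A large
    offset is stepped, and every peer's filter is flushed, so peer
    selection has to start over from empty filters.
    """
    if abs(offset_ms) <= STEP_THRESHOLD_MS:
        return "slew"
    for peer in peers:
        peer.samples.clear()
    return "step"
```

If each newly selected server differs from the previous one by more than the threshold, every selection ends in a step and a flush, which would match the repeated "NTP peer lost" cycle reported in the first post.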
andy@jhunix.HCF.JHU.EDU (Andy S Poling) (04/17/91)
In article <9104122212.AA02914@sayshell.umd.edu> louie@SAYSHELL.UMD.EDU ("Louis A. Mamakos") writes:
>It sounds like what is happening is that when ntpd initially selects a host,
>it has to reset the local clock rather than slew it because it was too
>far off. Whenever the local clock is reset, all of the offset/delay samples
>in the filters are flushed, and we start all over again. It then reselects
>a peer, but the selected clock is too far off again. Could it be that
>your network paths are exceptionally "noisy" or that your computers are
>keeping really crummy time?

That is exactly what seems to be happening (at least the stepping part). Ntpd selects a host, steps the clock, and clears the filter. Then, ten minutes later, the same thing happens again because it has selected a different server which is too far from the first server.

Here is an example of the ntpdc results on one of the servers which illustrates the problem nicely:

Address          Reference    Strat  Poll  Reach   Delay  Offset   Disp
=======================================================================
+128.220.1.2     130.43.2.2       2   512    377    25.0   -75.0   33.0
*18.72.0.3       WWV              1    64    337   103.0    35.0   16.0
.132.249.16.1    WWVB             1   256    375   175.0   -10.0   11.0
+128.102.16.10   WWV              1   512    377   152.0    14.0    5.0
+130.126.174.40  WWVB             1   256    377   204.0     6.0   22.0

The offset figures stay pretty consistent, but the behavior (server hopping) persists. Are these primary servers really keeping disparate time, or is something else wrong? I don't think these machines have terrible clocks, and we have generally excellent network connectivity.

-Andy
--
Andy Poling                   Internet: andy@gollum.hcf.jhu.edu
UNIX Systems Programmer       Bitnet: ANDY@JHUNIX
Homewood Academic Computing   Voice: (301)338-8096
Johns Hopkins University      UUCP: uunet!mimsy!aplcen!jhunix!andy
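
One way to put a number on "too far from the first server" is to compare the Offset column across peers. The sketch below uses the five offsets from the ntpdc listing above; the function name `widest_disagreement` is hypothetical. Each offset is measured against the local clock, so the difference between two servers' figures is a rough estimate of how far apart those two servers appear from this host.

```python
# Offsets in milliseconds, copied from the ntpdc listing in the post.
offsets_ms = {
    "128.220.1.2": -75.0,
    "18.72.0.3": 35.0,
    "132.249.16.1": -10.0,
    "128.102.16.10": 14.0,
    "130.126.174.40": 6.0,
}


def widest_disagreement(offsets):
    """Return (lowest peer, highest peer, spread in ms) for a dict of offsets."""
    lowest = min(offsets, key=offsets.get)
    highest = max(offsets, key=offsets.get)
    return lowest, highest, offsets[highest] - offsets[lowest]


low, high, spread = widest_disagreement(offsets_ms)
print(f"{low} and {high} differ by {spread:.1f} ms")
```

For this listing the widest gap is 110 ms, between 128.220.1.2 and the currently selected 18.72.0.3. Whether that is enough to force a step depends on the threshold the daemon uses, which the thread does not state.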