mills@huey.udel.edu.UUCP (11/24/86)
Folks, I thought you might like an update on how clocks are ticking in the swamps. After some rummaging around today I was surprised to learn that not only had the GOES radio clock on FORD1.ARPA completely departed its interface, but the WWVB clock on UMD1.ARPA had departed its antenna, or something like that. Only the WWVB clock on DCN1.ARPA, along with the scruffy WWV clocks on GW.UMICH.EDU and UDEL2.UDEL.EDU (relocated from DCN6.ARPA), continued to tick. However, since all the swamps involved use Network Time Protocol (NTP) peers as backup, the hosts involved remained synchronized (to DCN1.ARPA) and the clockwatchers scattered throughout the Internet scarcely knew anything was abnormal.

The control of time warps as synchronization switched between the local clocks and NTP-derived time was not without mishap, however, and revealed some bugs. Benign torpedoes sent by Rich Wales at UCLA exposed one bug that caused NTP targets to vaporize and then recondense, although most of the time this did not destabilize system synchronization. Some very subtle transients in the recursive median filters used by the fuzzball NTP peers to deglitch neighbor offsets proved very hard to catch, but catched they got. Several changes were made to the fuzzware to improve accuracy and reduce vulnerability to glitches.

Here at U Delaware we are synchronizing clocks to DCN1.ARPA via ARPAnet paths and can report satisfying results. With an eight-stage recursive median filter, one-minute poll interval, 256-ms aperture and filter constants as reported in previous RFCs, we can reliably deliver local time to within 10-20 ms or so of the DCN1.ARPA WWVB reference clock, which has previously been calibrated to within a few milliseconds of NBS radio time. It turns out that NTP is a useful diagnostic of network health as well, since wide delay dispersions and offset glitches are sensitive indicators of path switching and congestion.
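For readers who want a feel for the deglitching idea, here is a minimal Python sketch of an eight-stage median-style filter over clock offsets. It is only one plausible reading of "recursive median filter" (repeatedly throw away the sample farthest from the median until one survives); the actual fuzzball algorithm, its aperture handling and its filter constants are in the RFCs and are not reproduced here. The names `trimmed_median`, `OffsetFilter` and `STAGES`, and the sample offsets, are made up for illustration.

```python
from collections import deque
from statistics import median

STAGES = 8  # window length; matches the eight-stage filter mentioned above


def trimmed_median(samples, keep=1):
    """Discard the sample farthest from the median, one at a time,
    until `keep` samples remain; return the mean of the survivors.
    (Assumed reading of a recursive median filter, not the fuzzball code.)"""
    work = sorted(samples)
    while len(work) > keep:
        m = median(work)
        # The list is sorted, so the farthest sample sits at one of the ends.
        if abs(work[0] - m) > abs(work[-1] - m):
            work.pop(0)
        else:
            work.pop()
    return sum(work) / len(work)


class OffsetFilter:
    """Keeps the most recent offsets (in ms) and returns a deglitched estimate."""

    def __init__(self, stages=STAGES):
        self.window = deque(maxlen=stages)

    def update(self, offset_ms):
        self.window.append(offset_ms)
        return trimmed_median(self.window)


if __name__ == "__main__":
    f = OffsetFilter()
    # Hypothetical offsets in ms; the 900 stands in for a path-switching glitch.
    for sample in [12, 15, 11, 900, 13, 14, 12, 13]:
        print(sample, "->", f.update(sample))
```

The point of the sketch is only that a single wild offset never reaches the output, since it is always the first sample to be discarded.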
Milo Medin at NASA/AMES, Rich Wales at UCLA and Mike Petry at U Maryland have NTP non-fuzzball peers running and, hopefully, can report how well things work via other paths and using other systems.

Finding and fixing time warps during the shakedown of NTP in distributed-peer mode (see RFC-958) has been surprisingly hard, since the system amounts to a set of mutually coupled, nonlinear, phase-locked oscillators. As many know, the theory of linear phase-locked oscillators is well trampled in Electrical Engineering, as are models of mutual trust/distrust in Computer Science. The present problems seem to lie more in the area of nonlinear statistics, for which the technology of nonlinear filtering (e.g. order statistics, median filters), clustering algorithms (e.g. RFC-956) and multivariate estimation are proving excellent tools. These tools, incidentally, are excellent for the study of large, ill-disciplined Internets in general. Which suggests, of course, further instrumentation of NTP peers as a network monitoring mechanism.

Dave