Mills@UDEL.EDU (01/24/88)
Folks,
There is good news for clockwatchers, timewarpers and chron daemons. All radio clocks known to me are now finally repaired and ticking to standard time. Also, venerable WWVB ticker dcn1.arpa (aka pogo), which some of you may fondly remember from years past, has resumed this life. There is even an NTP secondary time server on the European MILNET which is keeping pretty good time. There are a half-dozen or so gents who have scrounged up an old PDP11-compatible system and volunteered additional GOES or WWV servers as well.
It may be time (!) to bring some order to the clockwatching business. A treaty is suggested in this note.
As in the telephone systems, NTP uses a model of stratified clocks and servers. A primary server is one directly synchronized to a radio clock, so it can keep accurate time even in the absence of NTP itself. However, from long experience with such things, the radio clocks sometimes feature scrambled time due to radio propagation conditions or broken logic. Thus, the primary servers ordinarily run NTP with a couple of their buddies as a sanity check. Should the radio clock itself become suspect, time synchronization shifts to the NTP peer group. Secondary servers are synchronized with NTP using special filtering and deglitching mechanisms ordinarily accurate to a few tens of milliseconds, even across intransigent gateways.
NTP can be used in either a remote-procedure-call (asymmetric) mode, in which a client sends a single message to the server and receives the time in reply, or a distributed (symmetric) mode, in which the protocol runs continuously and time is continuously compared between the peers and corrected as required. The asymmetric mode is designed for casual use, such as setting the time and date when a PC comes up, while the symmetric mode is designed for more accurate and precise applications, such as transaction timestamping and time redistribution. This note is concerned only with use of the symmetric mode.
Not everybody can chime with a primary server, since this would eventually lead to severe congestion and degraded service. Therefore, a system of hierarchical time servers is suggested. Assume that each of the 400-odd networks now active has a secondary time server synchronized to one or more primary servers and providing time service to other hosts in its community. In order to provide the highest robustness, the secondary server should chime with more than one primary server, perhaps three, so we are talking about 1200 peer paths. The existing LSI-11 fuzzball gateways can support at least 60 peer paths each, so some twenty primary servers would be needed. At the moment NTP chimes one packet each direction per peer per minute, so the aggregate time traffic works out to about one packet each direction per fuzzball per second. It is planned to introduce NTP protocol modifications that would reduce this rate by a factor of ten.
The twenty-odd primary servers should be located at strategic spots designed to minimize the impact of the NTP traffic itself, yet provide low delay dispersion for their customers. The existing and planned NSFNET Backbone sites would seem ideal candidates and, indeed, time-synchronized fuzzballs are already installed at seven of these sites. Without admitting agenda on how the time-synchronization capability came to pass or on the likely disposition of the fuzzballs once the new NSFNET Backbone is deployed, I suggest a nucleus of timetellers is already in place. Additional timetellers are now ticking on ARPANET and local nets gatewayed to ARPANET, MILNET and elsewhere. At the U Delaware campus and at several other campuses known to me, one or more relatively inexpensive WWV clocks are installed as backups should connectivity to a primary server be lost for one reason or another.
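The sizing estimate earlier in this note (400-odd networks, three primaries per secondary, 60 peer paths per fuzzball, one packet per peer per minute) can be checked with a short calculation. The following is only a sketch of that arithmetic; the function and parameter names are made up for illustration and the inputs are the round numbers quoted in the note, not measurements.

```python
import math


def primary_servers_needed(networks, primaries_per_secondary, peers_per_primary):
    """Return (peer_paths, primaries) assuming one secondary server per network."""
    peer_paths = networks * primaries_per_secondary
    # Round up: a partly loaded primary server is still a whole server.
    primaries = math.ceil(peer_paths / peers_per_primary)
    return peer_paths, primaries


def packets_per_second(peers, packets_per_peer_per_minute=1.0):
    """Packets per second in one direction at a server with the given peer count."""
    return peers * packets_per_peer_per_minute / 60.0


if __name__ == "__main__":
    paths, primaries = primary_servers_needed(400, 3, 60)
    print("peer paths:", paths)
    print("primary servers:", primaries)
    print("packets/s per fuzzball now:", packets_per_second(60))
    # The planned protocol change would cut the rate by a factor of ten.
    print("packets/s per fuzzball planned:", packets_per_second(60, 0.1))
```

With the figures from the note this gives 1200 peer paths, twenty primary servers and one packet per second in each direction at a fully loaded fuzzball.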
The WWV clocks are distinctly inferior in accuracy and reliability with respect to the WWVB and GOES clocks now used at the primary servers; however, as some may remember, there have been occasions over the last several years when all primary servers in the experimental system were down and the entire NTP-speaking Internet was synchronized to a dinky WWV clock on my desk at home. I suspect several institutions that cherish accurate time will install GOES, WWVB or even GPS clocks and join the NTP chorus as well, so there may be in fact no need for an overt program to buy and install additional clocks.
I saved specific recommendations for last. I suggest an appropriate first step is that those sites with good connectivity to an NSFNET regional system chime NTP with the NSFNET Backbone fuzzball serving that regional system. Other sites may wish to choose one or another fuzzball listed below. However, it is most important to understand that time service is provided by each of these gizmos on a secondary basis only, is still in an experimental phase and may be limited or curtailed should it interfere with the primary functions of the machine.
Speaking for myself and, I suspect, the other operators listed, we would expect users to set up their own time-redistribution network, perhaps using the 4.3BSD ntpd daemon specifically designed for this purpose, and to avoid ganging up on the servers with many hosts from the same net. We would also expect users planning long-term use of the servers to express their intent to the operators and comply with requests to reaffiliate with other servers should that become necessary. Finally, we are looking for volunteers to install additional primary servers and join the chorus as well.
Name               Address        Clock Operator and notes
------------------------------------------------------------------------------
Primary servers
umd1.umd.edu       128.8.10.1     WWVB  Mike Petry (petry@trantor.umd.edu)
                                        U Maryland (gatewayed to NSFNET Backbone,
                                        ARPANET PSNs 17 and 20 and MILNET PSN 57)
wwvb.isi.edu       128.9.2.129    WWVB  Steve Casner (casner@isi.edu)
                                        ISI (gatewayed to ARPANET PSN 22)
ncar.nsf.net       128.116.64.3   WWVB  Scott Brim (swb@devvax.tn.cornell.edu)
                                        NCAR (NSFNET Backbone gateway)
dcn1.arpa          128.4.0.1      WWVB  Dave Mills (mills@udel.edu)
                   10.2.0.96            U Delaware (directly connected to ARPANET PSN 96)
ford1.arpa         128.5.0.1      GOES  Fred Ball (ball@ford-vax.arpa)
                                        Ford Research (gatewayed via 9600-bps line
                                        to ARPANET PSN 111)
Secondary servers (please do NOT chime with these except by permission)
macom1.arpa        192.5.8.1      NTP   Woody Woodburn (woody@macom2.arpa)
                   10.0.0.111           Linkabit, Vienna, VA
swamprat.arpa      192.5.8.2      NTP   Woody Woodburn (woody@macom2.arpa)
                                        Linkabit, Vienna, VA
patch.arpa         26.6.0.2       NTP   Dave Park (dpark@dca-eur.arpa)
                                        USECOM Stuttgart, FRG
gw.umich.edu       35.1.1.1       NTP   Hans-Werner Braun (hwb@mcr.umich.edu)
                                        U Michigan (WWV backup)
xyzzy.umich.edu    35.1.1.3       NTP   Hans-Werner Braun (hwb@mcr.umich.edu)
                                        U Michigan
libra.rice.edu     128.42.1.64    NTP   Paul Milazzo (milazzo@rice.edu)
                   10.4.0.62            Rice U
dcn6.arpa          128.4.0.6      NTP   Dave Mills (mills@udel.edu)
                                        Newark, DE (WWV backup)
sdsc.nsf.net       192.12.207.1   NTP   Scott Brim (swb@devvax.tn.cornell.edu)
                                        San Diego Supercomputing Center
uiuc.nsf.net       128.174.5.14   NTP   Scott Brim (swb@devvax.tn.cornell.edu)
                                        National Center for Supercomputing Applications
psc.nsf.net        128.182.1.2    NTP   Scott Brim (swb@devvax.tn.cornell.edu)
                                        Pittsburgh Supercomputing Center
cornell.nsf.net    128.84.238.50  NTP   Scott Brim (swb@devvax.tn.cornell.edu)
                                        Cornell U (NYSERNET)
jvnc.nsf.net       128.121.50.20  NTP   Scott Brim (swb@devvax.tn.cornell.edu)
                                        John von Neumann Center (JVNCNET)
nsfnet-gw.umd.edu  128.8.10.6     NTP   Mike Petry (petry@trantor.umd.edu)
                                        U Maryland (SURANET)
Corrections or additions to this list would be appreciated.
Dave
jqj@hogg.cc.uoregon.EDU (01/25/88)
As a matter of principle, I don't think it is appropriate to design a time synchronization system for long-term use as strictly hierarchical. That makes it too susceptible to Byzantine failures on the part of nodes high in the hierarchy, which cause problems for very large subtrees. It would make the time synchronization system particularly inappropriate in a military/tactical environment, for example. Although one may not like the specific algorithms, I prefer the Cornell (Schneider, Toueg, etc.) approach, which attempts to achieve consensus among a set of peers at any level.
For practical purposes, it is probably acceptable to model the system as a hierarchy of SETS of time servers, each set having 5 to 10 members. Presumably, algorithms can be chosen to ensure that the probability of Byzantine failure of the whole SET is acceptably low. However, this implies that we should design a system in which the core/primary timeservers expect to be queried not by a large number of mutually independent secondary servers, but by a large number of members of sets of secondaries. For example, we might have a set of secondaries on a given regional network all of whom attempt to achieve consensus among themselves but who also all query the primaries as a time reference. Note that it implies also that any given secondary must plan to query several primaries (to detect Byzantine failures in the primaries). Correspondingly, it implies more network traffic unless we are careful in the placement of servers.
I think this suggests at least a 3-level rather than 2-level hierarchy of time servers, where level 3 is generally individual networks or small groups of such networks, and level 2 is large (well-connected) subsets of the whole Internet. Comments?
P.S. I would also like to see more thought given to how we should cope with situations in which the radio timebases are inaccurate or inconsistent.
Mills@UDEL.EDU (01/25/88)
jqj,
While I can understand your concern about Byzantine agreement, there may never be enough primary servers that everybody can play with very many of them, so I think we are stuck with a hierarchy, even if we quibble on the number of strata (to borrow the telephonic term). I am concerned about broken radio clocks, hosts, networks and leap-seconds, as witness the experiments I reported in RFCs 956-958 which were repeated recently with interesting results.
I have no problem in organizing subsets of clocks which might run Byzantine algorithms in order to determine the truthtellers and falsehood speakers; however, my main concern is the accuracy and stability of the basic time service itself. I have taken a statistical approach which attempts to maximize the quality of the data using signal filtering and smoothing techniques. These techniques detect and discard outliers due to broken clocks and are based on the assumption that at least half the clocks are tracking the same random variable and the rest are uniformly distributed over the observation interval (see RFC-956). These algorithms have been extensively simulated and tested in prototype implementations now running and have proved extremely resilient to noisy networks, broken clocks and jittery gateways but not, I admit, to broken programmers.
Dave
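The outlier-discarding idea described in the reply above can be illustrated with a small sketch. It is not a transcription of the RFC-956 algorithms or of the fuzzball code; it only assumes, as the reply does, that at least half of the clocks agree. The procedure shown (repeatedly drop the sample farthest from the mean of the survivors until about half remain) and all names and sample values are illustrative assumptions.

```python
from statistics import mean


def discard_outliers(offsets_ms):
    """Drop the sample farthest from the survivors' mean until half remain.

    offsets_ms: measured clock offsets, one per peer, in milliseconds.
    Assumes at least half of the peers are tracking the same time.
    """
    samples = list(offsets_ms)
    if not samples:
        raise ValueError("need at least one offset sample")
    keep = max(1, (len(samples) + 1) // 2)  # keep at least half, rounded up
    while len(samples) > keep:
        centre = mean(samples)
        worst = max(samples, key=lambda x: abs(x - centre))
        samples.remove(worst)
    return samples


def estimate_offset(offsets_ms):
    """Average the offsets that survive outlier rejection."""
    return mean(discard_outliers(offsets_ms))


if __name__ == "__main__":
    # Made-up data: five peers agree near 10 ms, three are badly broken.
    observed = [12, 10, 11, 950, -400, 14, 9, 700]
    print(sorted(discard_outliers(observed)))
    print(estimate_offset(observed))
```

A scheme like this degrades once more than half of the samples are wrong, which is why the majority assumption matters.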
mcc@ETN-WLV.EATON.COM (Merton Campbell Crockett) (01/25/88)
This discussion concerning time synchronization is interesting but I must be missing the point somewhere. For the dissemination of message traffic I fail to see why it is important to be synchronized within 1 or 2 milliseconds. If time is important, such as for real-time ranging data, it would seem that one should purchase a WWV receiver and interface, which ain't all that expensive.
The biggest problem will be clock "jitter" introduced by programmers like myself who used the JFFO (DECSystem 10) instruction to decode the 48-bit BCD milliseconds since start of year for timestamping real-time data. The cause of the "jitter" was a side effect of the JFFO instruction which should have been found immediately except everyone was impressed by an actual use for the JFFO instruction. I digressed.
The solution is that sites and subnets with "real-time" requirements should have their own WWV receivers with only "nominal" pinging of the net to verify the WWV receiver has not failed. The rest can use their time of day clocks since time isn't that important.
Merton Campbell Crockett
Mills@UDEL.EDU (01/25/88)
Merton,
I don't know where you are as the ionosphere flies, but in my corner of this world I sure would not trust my WWV receiver to stamp my archives. I would trust my WWVB receiver rather more, except just now even that signal has dropped below the floor and the clock is almost a second off. I did in fact suggest exactly the model you mention, using a more-or-less reliable local radio clock and verifying sanity with nominal pinging of its friends. Now, about the protocol to accomplish those pings...
I started in this hokey business mostly to support accurate performance measurement and analysis of the Internet; however, as time ticked on it became an end in itself and lots of fun as well. Maybe you don't care about time to the millisecond, but then you probably don't do any computerized trading, wide-area monitoring and so forth. For example, with the system now in place I have been able to compare timestamped error reports collected from several packet switches and merge them into a scenario accurate to within the flight time of packets between switches.
Last night I killed first one radio clock and then another until all WWVB primary radio clocks were disabled. I verified all the primary and secondary time servers switched their hierarchical allegiance until the ultimate backup WWV clocks at U Michigan and U Delaware kicked in. So far as I could tell, all time servers maintained synchronization to within a few tens of milliseconds throughout the experiment. However, in this case all clocks were observably sane. The next experiment is to insanitize a couple of them and verify the rest toss them out of the club.
Dave
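The report-merging use mentioned above (combining timestamped error reports from several packet switches into one scenario) might look roughly like the sketch below once the clocks agree. The record layout, switch names and messages are invented for illustration; the only real library call is Python's heapq.merge, which assumes each input list is already sorted by timestamp.

```python
import heapq


def merge_reports(*streams):
    """Merge per-switch report lists into a single timeline.

    Each stream is a list of (timestamp_ms, switch, message) tuples that is
    already in timestamp order, as a log written by one switch would be.
    """
    return list(heapq.merge(*streams, key=lambda report: report[0]))


if __name__ == "__main__":
    # Hypothetical reports from two switches; names and events are made up.
    switch_a = [(100, "switch-a", "link down"), (250, "switch-a", "link up")]
    switch_b = [(120, "switch-b", "retransmit"), (240, "switch-b", "route change")]
    for timestamp, switch, message in merge_reports(switch_a, switch_b):
        print(timestamp, switch, message)
```

The merged ordering is only as trustworthy as the clocks behind the timestamps, which is the point of keeping the switches synchronized to within a few tens of milliseconds.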