[comp.protocols.tcp-ip] Time synchronization and distribution plan

Mills@UDEL.EDU (01/24/88)

Folks,

There is good news for clockwatchers, timewarpers and chron daemons. All radio
clocks known to me are now finally repaired and ticking to standard time.
Also, venerable WWVB ticker dcn1.arpa (aka pogo), which some of you may fondly
remember from years past, has resumed this life. There is even an NTP secondary
time server on the European MILNET which is keeping pretty good time. There
are a half-dozen or so gents who have scrounged up an old PDP11-compatible
system and volunteered additional GOES or WWV servers as well. It may be time
(!) to bring some order to the clockwatching business. A treaty is suggested
in this note.

As in the telephone system, NTP uses a model of stratified clocks and servers.
A primary server is one directly synchronized to a radio clock, so it can keep
accurate time even in the absence of NTP itself. However, from long experience
with such things, the radio clocks sometimes feature scrambled time due to radio
propagation conditions or broken logic. Thus, the primary servers ordinarily
run NTP with a couple of their buddies as a sanity check. Should the radio
clock itself become suspect, time synchronization shifts to the NTP peer
group. Secondary servers are synchronized with NTP using special filtering and
deglitching mechanisms ordinarily accurate to a few tens of milliseconds, even
across intransigent gateways.
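
The fallback described above can be sketched in a few lines. This is a
hypothetical illustration, not the actual fuzzball code; the agreement
threshold and the trivial median are my own assumptions:

```python
# Hypothetical sketch of a primary server's sanity check: trust the
# radio clock while it agrees with the NTP peer group, otherwise shift
# synchronization to the peers. All offsets are in seconds.

def select_source(radio_offset, peer_offsets, max_disagreement=0.128):
    """Return ('radio', offset) or ('peers', offset)."""
    peers = sorted(peer_offsets)
    median = peers[len(peers) // 2]
    # Radio clock stays authoritative only while it tracks the peers.
    if abs(radio_offset - median) <= max_disagreement:
        return ('radio', radio_offset)
    # Radio clock suspect: fall back to the NTP peer group.
    return ('peers', median)
```

The point is simply that the radio clock is authoritative only so long
as it passes the sanity check against the peer group.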

NTP can be used in either a remote-procedure-call (asymmetric) mode, in which
a client sends a single message to the server and receives the time in reply,
or a distributed (symmetric) mode, in which the protocol runs continuously
and time is compared between the peers and corrected as required. The
asymmetric mode is designed for casual use, such as setting the time and date
when a PC comes up, while the symmetric mode is designed for more
accurate and precise applications, such as transaction timestamping and time
redistribution. This note is concerned only with use of the symmetric mode.
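
For the curious, the essential calculation behind either mode works from
four timestamps of a request/reply exchange; this sketch follows the
RFC-958-style offset and delay formulas (the variable names are mine):

```python
def offset_and_delay(t1, t2, t3, t4):
    """t1 = client send, t2 = server receive, t3 = server send,
    t4 = client receive; each in seconds of its machine's local clock."""
    offset = ((t2 - t1) + (t3 - t4)) / 2.0   # estimated clock offset
    delay = (t4 - t1) - (t3 - t2)            # round-trip network delay
    return offset, delay
```

The offset estimate is exact when the outbound and return network delays
are equal, and in error by at most half the delay asymmetry otherwise.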

Not everybody can chime with a primary server, since this would eventually
lead to severe congestion and degraded service. Therefore, a system of
hierarchical time servers is suggested. Assume that each of the 400-odd
networks now active has a secondary time server synchronized to one or more
primary servers and providing time service to other hosts in its community. In
order to provide the highest robustness, the secondary server should chime
with more than one primary server, perhaps three, so we are talking about 1200
peer paths. The existing LSI-11 fuzzball gateways can support at least 60 peer
paths each, so some twenty primary servers would be needed. At the moment NTP
chimes one packet each direction per peer per minute, so the aggregate time
traffic works out to about one packet each direction per fuzzball per second.
It is planned to introduce NTP protocol modifications that would reduce this
rate by a factor of ten.
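
The arithmetic above is easy to check:

```python
# Back-of-the-envelope check of the figures in the text.
networks = 400                    # active networks, one secondary each
primaries_per_secondary = 3       # each secondary chimes with ~3 primaries
peer_paths = networks * primaries_per_secondary

paths_per_fuzzball = 60           # LSI-11 fuzzball capacity
fuzzballs_needed = peer_paths // paths_per_fuzzball

packets_per_path_per_minute = 1   # each direction, per peer, per minute
packets_per_fuzzball_per_second = (
    paths_per_fuzzball * packets_per_path_per_minute / 60)
```

which gives 1200 peer paths, twenty primary servers, and one packet each
direction per fuzzball per second.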

The twenty-odd primary servers should be located at strategic spots designed
to minimize the impact of the NTP traffic itself, yet provide low delay
dispersion for their customers. The existing and planned NSFNET Backbone sites
would seem ideal candidates and, indeed, time-synchronized fuzzballs are
already installed at seven of these sites. Without admitting an agenda on how the
time-synchronization capability came to pass or on the likely disposition of
the fuzzballs once the new NSFNET Backbone is deployed, I suggest a nucleus of
timetellers is already in place. Additional timetellers are now ticking on
ARPANET and local nets gatewayed to ARPANET, MILNET and elsewhere.

At the U Delaware campus and at several other campuses known to me, one or
more relatively inexpensive WWV clocks are installed as backups should
connectivity to a primary server be lost for one reason or another. The WWV
clocks are distinctly inferior in accuracy and reliability with respect to the
WWVB and GOES clocks now used at the primary servers; however, as some may
remember, there have been occasions over the last several years when all
primary servers in the experimental system were down and the entire
NTP-speaking Internet was synchronized to a dinky WWV clock on my desk at
home. I suspect several institutions that cherish accurate time will install
GOES, WWVB or even GPS clocks and join the NTP chorus as well, so there may be
in fact no need for an overt program to buy and install additional clocks.

I saved specific recommendations for last. I suggest an appropriate first step
is that those sites with good connectivity to an NSFNET regional system chime
NTP with the NSFNET Backbone fuzzball serving that regional system. Other
sites may wish to choose one or another fuzzball listed below. However, it is
most important to understand that time service is provided by each of these
gizmos on a secondary basis only, is still in an experimental phase and may be
limited or curtailed should it interfere with the primary functions of the
machine.

Speaking for myself and I suspect the other operators listed, we would expect
users to set up their own time-redistribution network, perhaps using the 4.3
bsd ntpd daemon specifically designed for this purpose, and to avoid ganging
up on the servers with many hosts from the same net. We would also expect
users planning long-term use of the servers to express their intent to the
operators and comply with requests to reaffiliate with other servers should
that become necessary. Finally, we are looking for volunteers to install
additional primary servers and join the chorus as well.

Name		Address		Clock	Operator and notes
--------------------------------------------------------------------------
Primary servers
umd1.umd.edu	128.8.10.1	WWVB	Mike Petry (petry@trantor.umd.edu)
					U Maryland (gatewayed to NSFNET
					Backbone, ARPANET PSNs 17 and 20 and
					MILNET PSN 57).
wwvb.isi.edu	128.9.2.129	WWVB	Steve Casner (casner@isi.edu)
					ISI (gatewayed to ARPANET PSN 22)
ncar.nsf.net	128.116.64.3	WWVB	Scott Brim (swb@devvax.tn.cornell.edu)
					NCAR (NSFNET Backbone gateway)
dcn1.arpa	128.4.0.1	WWVB	Dave Mills (mills@udel.edu)
		10.2.0.96		U Delaware (directly connected to
					ARPANET PSN 96)
ford1.arpa	128.5.0.1	GOES	Fred Ball (ball@ford-vax.arpa)
					Ford Research (gatewayed via 9600-bps
					line to ARPANET PSN 111).
Secondary servers (please do NOT chime with these except by permission)
macom1.arpa	192.5.8.1	NTP	Woody Woodburn (woody@macom2.arpa)
		10.0.0.111		Linkabit, Vienna, VA
swamprat.arpa	192.5.8.2	NTP	Woody Woodburn (woody@macom2.arpa)
					Linkabit, Vienna, VA
patch.arpa	26.6.0.2	NTP	Dave Park (dpark@dca-eur.arpa)
					USECOM Stuttgart, FRG
gw.umich.edu	35.1.1.1	NTP	Hans-Werner Braun (hwb@mcr.umich.edu)
					U Michigan (WWV backup)
xyzzy.umich.edu	35.1.1.3	NTP	Hans-Werner Braun (hwb@mcr.umich.edu)
					U Michigan
libra.rice.edu	128.42.1.64	NTP	Paul Milazzo (milazzo@rice.edu)
		10.4.0.62		Rice U
dcn6.arpa	128.4.0.6	NTP	Dave Mills (mills@udel.edu)
					Newark, DE (WWV backup)
sdsc.nsf.net	192.12.207.1	NTP	Scott Brim (swb@devvax.tn.cornell.edu)
					San Diego Supercomputing Center
uiuc.nsf.net	128.174.5.14	NTP	Scott Brim (swb@devvax.tn.cornell.edu)
					National Center for Supercomputing
					Applications
psc.nsf.net	128.182.1.2	NTP	Scott Brim (swb@devvax.tn.cornell.edu)
					Pittsburgh Supercomputing Center
cornell.nsf.net	128.84.238.50	NTP	Scott Brim (swb@devvax.tn.cornell.edu)
					Cornell U (NYSERNET)
jvnc.nsf.net	128.121.50.20	NTP	Scott Brim (swb@devvax.tn.cornell.edu)
					John von Neumann Center (JVNCNET)
nsfnet-gw.umd.edu 128.8.10.6	NTP	Mike Petry (petry@trantor.umd.edu)
					U Maryland (SURANET)

Corrections or additions to this list would be appreciated.

Dave

jqj@hogg.cc.uoregon.EDU (01/25/88)

As a matter of principle, I don't think it is appropriate to design a
time synchronization system for long-term use as strictly hierarchical.
That makes it too susceptible to Byzantine failures on the part of
nodes high in the hierarchy, which cause problems for very large
subtrees.  It would make the time synchronization system particularly
inappropriate in a military/tactical environment, for example.
Although one may not like the specific algorithms, I prefer the Cornell
(Schneider, Toueg, etc.) approach, which attempts to achieve consensus
among a set of peers at any level.

For practical purposes, it is probably acceptable to model the system
as a hierarchy of SETS of time servers, each set having 5 to 10
members.  Presumably, algorithms can be chosen to ensure that the
probability of Byzantine failure of the whole SET is acceptably low.
However, this implies that we should design a system in which the
core/primary timeservers expect to be queried not by a large number of
mutually independent secondary servers, but by a large number of
members of sets of secondaries.  For example, we might have a set of
secondaries on a given regional network all of whom attempt to achieve
consensus among themselves but who also all query the primaries as a
time reference.  Note that it implies also that any given secondary
must plan to query several primaries (to detect Byzantine failures in
the primaries).  Correspondingly, it implies more network traffic
unless we are careful in the placement of servers.
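
One illustrative way to make a SET of servers robust is the classic
fault-tolerant averaging idea, offered only as a sketch and not as the
Cornell algorithm itself:

```python
def fault_tolerant_average(readings, f):
    """Tolerate up to f Byzantine members out of n >= 3f + 1 by
    discarding the f highest and f lowest readings and averaging
    the survivors."""
    n = len(readings)
    assert n >= 3 * f + 1, "need n >= 3f + 1 members to tolerate f faults"
    trimmed = sorted(readings)[f:n - f]
    return sum(trimmed) / len(trimmed)
```

With 5 to 10 members per set, such a scheme can mask one or two
arbitrarily faulty members of the set.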

I think this suggests at least a 3-level rather than 2-level hierarchy
of time servers, where level 3 is generally individual networks or small
groups of such networks, and level 2 is large (well-connected) subsets of
the whole Internet.

Comments?

P.S. I would also like to see more thought given to how we should cope
with situations in which the radio timebases are inaccurate or inconsistent.

Mills@UDEL.EDU (01/25/88)

jqj,

While I can understand your concern about Byzantine agreement, there may
never be enough primary servers that everybody can play with very many of
them, so I think we are stuck with a hierarchy, even if we quibble on the
number of strata (to borrow the telephonic term). I am concerned about
broken radio clocks, hosts, networks and leap-seconds, as witness the
experiments I reported in RFCs 956-958 which were repeated recently with
interesting results. I have no problem in organizing subsets of clocks
which might run Byzantine algorithms in order to determine the truthtellers
and falsehood speakers; however, my main concern is the accuracy and
stability of the basic time service itself. I have taken a statistical
approach which attempts to maximize the quality of the data using signal
filtering and smoothing techniques which detect and discard outliers
due to broken clocks and are based on the assumption that at least half
the clocks are tracking the same random variable and the rest are uniformly
distributed over the observation interval (see RFC-956). These algorithms
have been extensively simulated and tested in prototype implementations
now running and have proved extremely resilient to noisy networks, broken
clocks and jittery gateways but not, I admit, to broken programmers.
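
A toy rendering of the discard idea follows; the actual RFC-956
algorithms are considerably more elaborate, and the tolerance here is an
arbitrary assumption:

```python
def discard_outliers(samples, tolerance=0.1):
    """Assume a majority of clocks track the same value; repeatedly
    drop the sample farthest from the current median until the
    survivors agree to within the tolerance (seconds)."""
    survivors = sorted(samples)
    while (max(survivors) - min(survivors) > tolerance
           and len(survivors) > len(samples) // 2):
        median = survivors[len(survivors) // 2]
        # Drop whichever extreme is farther from the median.
        if median - survivors[0] > survivors[-1] - median:
            survivors.pop(0)
        else:
            survivors.pop()
    return survivors
```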

Dave

mcc@ETN-WLV.EATON.COM (Merton Campbell Crockett) (01/25/88)

This discussion concerning time synchronization is interesting but I must be
missing the point somewhere.  For the dissemination of message traffic,
I fail to see why it is important to be synchronized to within 1 or 2
milliseconds.

If time is important, such as for real-time ranging data, it would seem that
one should purchase a WWV receiver and interface, which ain't all that
expensive.  The biggest problem will be clock "jitter" introduced by programmers
like myself who used the JFFO (DECSystem 10) instruction to decode the
48-bit BCD milliseconds since the start of the year for timestamping real-time data.
The cause of the "jitter" was a side effect of the JFFO instruction which
should have been found immediately except everyone was impressed by an
actual use for the JFFO instruction.

I digressed.  The solution is that sites and subnets with "real-time" requirements
should have their own WWV receivers with only "nominal" pinging of the net to
verify the WWV receiver has not failed.  The rest can use their time of day
clocks since time isn't that important.

Merton Campbell Crockett

Mills@UDEL.EDU (01/25/88)

Merton,

I don't know where you are as the ionosphere flies, but in my corner of this
world I sure would not trust my WWV receiver to stamp my archives. I would
trust my WWVB receiver rather more, except just now even that signal has
dropped below the floor and the clock is almost a second off. I did in
fact suggest exactly the model you mention, using a more-or-less reliable
local radio clock and verifying sanity with nominal pinging of its friends.
Now, about the protocol to accomplish those pings...

I started in this hokey business mostly to support accurate performance
measurement and analysis of the Internet; however, as time ticked on it
became an end in itself and lots of fun as well. Maybe you don't care about
time to the millisecond, but then you probably don't do any computerized
trading, wide-area monitoring and so forth. For example, with the system
now in place I have been able to compare timestamped error reports collected
from several packet switches and merge them into a scenario accurate to
within the flight time of packets between switches.

Last night I killed first one radio clock and then another until all WWVB
primary radio clocks were disabled. I verified all the primary and secondary
time servers switched their hierarchical allegiance until the ultimate backup
WWV clocks at U Michigan and U Delaware kicked in. So far as I could tell,
all time servers maintained synchronization to within a few tens of
milliseconds throughout the experiment. However, in this case all clocks
were observably sane. The next experiment is to insanitize a couple of them
and verify the rest toss them out of the club.

Dave