aws@druhi.ATT.COM (SteereA) (02/24/88)
Hi, I am looking for articles, references, implementations, etc. for solving the problem of keeping N machines within a specified time of one another. I appreciate any and all pointers. Thanks, Andy Steere (303) 538-4128 ihnp4!druhi!aws
mohamed@hscfvax.harvard.edu (Mohamed_el_Lozy) (02/25/88)
In article <2710@druhi.ATT.COM> aws@druhi.ATT.COM (SteereA) writes:
> I am looking for articles, references, implementations,
>etc. for solving the problem of keeping N machines within
>a specified time of one another. I appreciate any and all pointers.

I am directing followup to comp.protocols.tcp-ip, where it would be more appropriate.

I assume that the machines are networked. A good start would be in three RFCs written by Dave Mills:

956 Algorithms for synchronizing network clocks
957 Experiments in network clock synchronization
958 Network time protocol (NTP)

They all came out in 1985, and have reasonable references for that time.

A UNIX (BSD only, as I recall) implementation of NTP has been written at trantor.umd.edu, which also maintains a mailing list. To get on it send mail to ntp-request@trantor.umd.edu. There is also often quite a bit of network time discussion in comp.protocols.tcp-ip, especially when interesting things like leap seconds turn up.

BSD4.3 implementations have a timed program, discussed at some length in the documentation (not available to me right now).

I would very much appreciate some post-1985 references.
pase@ogcvax.UUCP (Douglas M. Pase) (02/28/88)
In article <druhi.2710> aws@druhi.ATT.COM (SteereA) writes:
>Hi,
> I am looking for articles, references, implementations,
>etc. for solving the problem of keeping N machines within
>a specified time of one another. I appreciate any and all
>pointers.
>

This article may be of some use to you

%A Leslie Lamport
%T Time, Clocks, and the Ordering of Events in a Distributed System
%J Communications of the ACM
%V 21
%N 7
%P 558-565
%D July 1978
%K lam78

--
Doug Pase -- ...ucbvax!tektronix!ogcvax!pase or pase@cse.ogc.edu (CSNet)
firth@sei.cmu.edu (Robert Firth) (02/29/88)
In article <druhi.2710> aws@druhi.ATT.COM (SteereA) writes:
] I am looking for articles, references, implementations,
]etc. for solving the problem of keeping N machines within
]a specified time of one another. I appreciate any and all
]pointers.

In article <1571@ogcvax.UUCP> pase@ogcvax.UUCP (Douglas M. Pase) writes:
]This article may be of some use to you
]
]%A Leslie Lamport
]%T Time, Clocks, and the Ordering of Events in a Distributed System
]%J Communications of the ACM
]%V 21
]%N 7
]%P 558-565
]%D July 1978
]%K lam78

I'd second that. This is an excellent article on the subject. However, whereas the first half is generally useful, the second part - where Lamport talks about real clocks rather than 'logical' clocks - applies only in the special circumstance that all the processors are in the same inertial frame of reference (as the author indicates in a footnote).

In general, you CANNOT keep two clocks synchronized within an arbitrary delta, for reasons explained by Einstein these many years ago.
delp@udel.EDU (Gary Delp) (03/01/88)
The Lamport Article is good. Dave Mills has been doing it for the Internet for a good while. You might look at: RFC-956/-7/-8 and references therein. See also latest issue CCR and several byzantia scattered over JACM and dissertations. To millisecond granularity over a widely spread network with considerable jitter, the problem is a solved problem. -- Gary Delp 123 Evans Hall, EE, U of D, Newark, DE 19716; (302)-451-6653 or 2405 UUCP: ...!{{allegra,ihnp4}!berkeley,harvard}!delp@udel.edu CSNET: delp%udel.edu@relay.cs.net ARPA: delp@udel.edu
devine@cookie.dec.com (Bob Devine) (03/01/88)
Here are two more papers on this topic:

T.K. Srikanth and Sam Toueg; "Optimal Clock Synchronization"; Journal of the ACM; July 1987

Neil Rickert; "Non-Byzantine Clock Synchronization - A Programming Experiment"; ACM Operating Systems Review; Jan 1988

Bob
rajaei@ttds.UUCP (Hassan Rajaei) (03/05/88)
In article <1571@ogcvax.UUCP> pase@ogcvax.UUCP (Douglas M. Pase) writes:
>In article <druhi.2710> aws@druhi.ATT.COM (SteereA) writes:
>>Hi,
>> I am looking for articles, references, implementations,
>>etc. for solving the problem of keeping N machines within
>>a specified time of one another. I appreciate any and all
>>pointers.
>>
>
>This article may be of some use to you
>
>%A Leslie Lamport
>%T Time, Clocks, and the Ordering of Events in a Distributed System
>%J Communications of the ACM
>%V 21
>%N 7
>%P 558-565
>%D July 1978
>%K lam78

The Virtual Time theory and its implementation, the Time Warp mechanism, may help as well. There is a good article on the subject:

Virtual Time, David Jefferson, ACM Trans. on Prog. Lang. & Syst., Vol. 7, No. 3, July 1985, pp 404-425

There is a Time Warp Operating System (TWOS) on the Caltech Mark III Hypercube which implements the TW mechanism. You may find the article in:

ACM Operating Systems Review, Vol. 21, No. 5, pp 77-93, "Distributed Simulation and the Time Warp Operating System", D. Jefferson et al.

Hassan Rajaei rajaei@ttds.tds.kth.se
tkevans@fallst.UUCP (Tim Evans) (03/05/88)
In article <1571@ogcvax.UUCP>, pase@ogcvax.UUCP (Douglas M. Pase) writes:
> In article <druhi.2710> aws@druhi.ATT.COM (SteereA) writes:
> Hi,
> I am looking for articles, references, implementations,
> etc. for solving the problem of keeping N machines within
> a specified time of one another. I appreciate any and all
> pointers.
>
>

In an environment where Fusion (tm) network software runs, you can hack a way of coordinating date/time among a group of machines. (This works in Sys V, at least.)

Select one machine as a master, and manually keep its date/time correct. Set up a cron job on the rest (which must run with root authority) that executes the following command using Fusion's (tm) 'rx':

    date `rx machine_name date '+%m%d%H%M%y'`

I don't know for certain, but other network software presumably has the ability to remotely execute a command on another machine in the network.
edw@IUS1.CS.CMU.EDU (Eddie Wyatt) (03/09/88)
> > I am looking for articles, references, implementations,
> > etc. for solving the problem of keeping N machines within
> > a specified time of one another. I appreciate any and all
> > pointers.

Here's a procedure that I designed and coded. If anyone has any questions, complaints or criticism, mail me.

/**************************************************************************
 *                                                                        *
 *                             sync_clocks                                *
 *                                                                        *
 **************************************************************************

   Purpose : This function helps synchronize the times between the module
             and lmb. Lmb time is considered the correct time. It also
             determines whether the time unit is seconds or milliseconds.

             The method used is as follows:

             t1 = clock value on lmb side when it sends its reply
             t2 = clock value on module side when it sends its request
             m  = time disparity between lmb clock and module clock
                  (this is the value of interest)
             N  = distribution representing the network transmission time
             M  = distribution representing the time to send a message
                  to a machine and get a response (round trip)

             Assumption - processing time is negligible.

             t1 = t2 + m + N
             E[M] = E[N + N] = E[N] + E[N] = 2*E[N]
             E[m + N] = E[m] + E[N] = m + E[M]/2
             m = E[m + N] - E[M]/2

   Programmer : Eddie Wyatt
   Date       : December 1986 (Feb 1987)
   Input      : None
   Output     : None

   Locals : i            - loop increment
            send_time    - the module time that a message is sent
            receive_time - the module time that a message is received
                           (receive_time - send_time ~= M)
            lmb_time     - the time on the lmb side
                           (lmb_time - send_time ~= m + N)

   Globals : time_units - time is in seconds or milliseconds.
             time_disp  - is modified to be equal to the time disparity
                          between the lmb clock and the module clock
             port       - not modified

 ************************************************************************/

LIB_EXPORT void sync_clocks(port)
int port;
{
    register int i, send_time, receive_time, lmb_time;

    time_disp = 0;
    time_units = (TIME_UNITS) Nreceiveint(port);

    for (i = 0; i < NUMOFCLOCKSAMPLES; i++)
    {
        send_time = (time_units == INSECONDS) ? (int) time(NULLPTR(long))
                                              : get_time_in_msec();
        Nsendint(1,port);
        lmb_time = Nreceiveint(port);
        receive_time = (time_units == INSECONDS) ? (int) time(NULLPTR(long))
                                                 : get_time_in_msec();

        /* Same value as 2*lmb_time - receive_time - send_time, but the
           differences are taken first so 2*lmb_time cannot overflow. */
        time_disp += (lmb_time - send_time) + (lmb_time - receive_time);
    }

    time_disp /= (2*NUMOFCLOCKSAMPLES);
}

--
Eddie Wyatt e-mail: edw@ius1.cs.cmu.edu
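The averaging step in sync_clocks can be tried without the network calls. What follows is a minimal sketch, assuming the three timestamps of each round trip have already been collected. The struct and function names are hypothetical and not part of the posted code, and the estimate is only as good as the assumption that the outbound and return delays are equal on average.

```c
#include <assert.h>
#include <stddef.h>

/* One round trip: the module notes its clock, asks the reference (lmb)
 * for its time, and notes its clock again when the reply arrives.
 * Hypothetical type, introduced only for this sketch. */
struct clock_sample {
    long send_time;     /* module clock when the request left  */
    long ref_time;      /* reference clock value in the reply  */
    long receive_time;  /* module clock when the reply arrived */
};

/* Estimate (reference clock - module clock), averaged over n samples.
 * Each sample compares the reference reading with the midpoint of the
 * round trip on the module's clock, i.e. m = (m + N) - M/2. */
double estimate_offset(const struct clock_sample *s, size_t n)
{
    double sum = 0.0;
    size_t i;

    assert(s != NULL && n > 0);
    for (i = 0; i < n; i++) {
        double midpoint = s[i].send_time
                        + (s[i].receive_time - s[i].send_time) / 2.0;
        sum += s[i].ref_time - midpoint;
    }
    return sum / (double)n;
}
```

With samples {1000, 1510, 1020} and {2000, 2515, 2030} the midpoints are 1010 and 2015, so both samples give an offset of 500 and so does their average.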
jerry@oliveb.olivetti.com (Jerry Aguirre) (03/10/88)
Being able to synchronize all systems to the same time is nice. Having that time be the correct time is even nicer. I have several Vax750s and a Vax785 whose clocks run fast. By selecting systems with better clocks to be the masters I can work around this but the result is not ideal. Does anybody know where to adjust the real-time clock on a Vax? I could probably come up with an accurate frequency counter or just keep tweaking it until it is right but I can't find the crystal, much less a trimmer for it. Jerry Aguirre @ Olivetti ATC uunet!amdahl!oliveb!jerry
nather@ut-sally.UUCP (Ed Nather) (03/11/88)
> > In article <druhi.2710> aws@druhi.ATT.COM (SteereA) writes:
> > Hi,
> > I am looking for articles, references, implementations,
> > etc. for solving the problem of keeping N machines within
> > a specified time of one another. I appreciate any and all
> > pointers.
> >

I recently faced the problem of keeping the internal CPU clock in an IBM PC in step with a more accurate clock located in an interface card. The interface sends a data burst once per second through the serial port. I used the arrival of the first byte as a time tick.

The IBM PC keeps internal time by counting down its instruction clock frequency and it is possible to modify the value of the countdown if the timer interrupt is intercepted. No integer value of the countdown will give precisely 1 ms time ticks, which was needed for this application, so I alternate between one slightly too large, and one slightly too small.

The amount of time each countdown value remains active is adjusted by watching for drift between the two clocks; a pair of countdowns is averaged about every 90 sec, and any drift is noted (and the CPU clock is forced back into phase at the same time). After about 10 cycles, the accumulated drift is used to adjust the amount of time each countdown value is used for the next 10 cycles.

This constitutes a software servo that works quite well with all the PCs tested. Their instruction clock frequencies are quite different from one to the next, but the servo locks in after the first frequency change and stays locked thereafter.

Details at eleven ...

--
Ed Nather Astronomy Dept, U of Texas @ Austin {allegra,ihnp4}!{noao,ut-sally}!utastro!nather nather@astro.AS.UTEXAS.EDU
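The alternation between two countdown values described above can be sketched as follows. This is an illustration only, not the poster's code: the type and function names are hypothetical, and the 1,193,182 Hz figure used in the note below is the commonly cited PC timer input frequency, which the post itself does not give. A servo like the one described would go on to adjust the `extra` field from measured drift; this sketch shows only the fixed dithering.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical state for alternating between two countdown values. */
struct dither {
    uint32_t low;      /* smaller countdown value                  */
    uint32_t extra;    /* ticks per second that use low + 1        */
    uint32_t tick_hz;  /* desired ticks per second                 */
    uint32_t acc;      /* running remainder, always below tick_hz  */
};

struct dither dither_init(uint32_t timer_hz, uint32_t tick_hz)
{
    struct dither d;

    assert(tick_hz > 0 && timer_hz >= tick_hz);
    d.low = timer_hz / tick_hz;
    d.extra = timer_hz % tick_hz;
    d.tick_hz = tick_hz;
    d.acc = 0;
    return d;
}

/* Countdown value to load for the next tick.  The larger value is
 * spread evenly through each second instead of being bunched up. */
uint32_t dither_next(struct dither *d)
{
    d->acc += d->extra;
    if (d->acc >= d->tick_hz) {
        d->acc -= d->tick_hz;
        return d->low + 1;
    }
    return d->low;
}

/* Total timer counts consumed by n consecutive ticks. */
uint64_t dither_total(struct dither *d, uint32_t n)
{
    uint64_t total = 0;
    uint32_t i;

    for (i = 0; i < n; i++)
        total += dither_next(d);
    return total;
}
```

For a 1,193,182 Hz timer and 1000 ticks per second this gives countdowns of 1193 and 1194, with 182 of every 1000 ticks using the larger one, so 1000 ticks consume exactly 1,193,182 counts.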
hirshman@60600.dec.com (Bret H. {FS Tech Support@Sydney, Oz} SNE/G 4125546) (03/18/88)
> Being able to synchronize all systems to the same time is nice. Having
> that time be the correct time is even nicer. I have several Vax750s and
> a Vax785 whose clocks run fast. By selecting systems with better clocks
> to be the masters I can work around this but the result is not ideal.
>
> Does anybody know where to adjust the real-time clock on a Vax? I could
> probably come up with an accurate frequency counter or just keep
> tweaking it until it is right but I can't find the crystal, much less a
> trimmer for it.
>
> Jerry Aguirre @ Olivetti ATC
> uunet!amdahl!oliveb!jerry

I'm afraid there are no clock crystal frequency trimmers on any of the VAXes that I've come across. Even if there were, you would (a) almost certainly invalidate a DEC Maintenance Agreement by twiddling them yourself, (b) need to keep careful track of any maintenance done and adjust it all again if the relevant module was replaced, and (c) still have thermal drift and crystal ageing to worry about.

But don't despair! I can think of a number of ways to do what you want, most of which don't even need a frequency counter. But first a little background info:

I'll discuss VMS here because that's what I know, but I'm sure the basic ideas are quite applicable to Unix. VMS maintains the current system time in a software counter in memory. This is incremented at hardware clock interrupt time, which is always every 10 milliseconds for VMS on all current VAXes.

There are actually two hardware real-time clocks present in most VAXen, with different purposes and specifications. The first and most relevant one is the Interval Counter, a programmable 32 bit counter which is incremented at one microsecond intervals with a nominal .01% clock accuracy, i.e. +/- 8.64 seconds per day. This counter is present in all VAXes other than microVAXes, which have fixed unprogrammable 10 millisecond clock interrupts.
At boot time and after power fail restarts, VMS programs the Next Interval Count Register (internal processor register #25) with a value of -10,000 to produce clock interrupts at 10 millisecond intervals. These are the only times VMS touches the NICR, a handy fact which gives us our best method for tweaking the time. Again, I'd be surprised if the various Unixes did this much differently.

The other clock, which is architecturally optional for MicroVAX implementations, is the battery-backed Time Of Day/Time Of Year (TOD or TOY) clock. On the 725/30, 750, 780/2/5, 82/8300 series, 8600/50 and the 85/87/8800 series this is a 32 bit unsigned binary counter (the TODR, internal processor register #27) with 10 millisecond resolution and a clock accuracy of at least .0025%, i.e. +/- around 65 seconds per month.

The old microVAX/VAXstation I has no TOD/TOY clock at all. On the microVAX/VAXstation II & 2000, and I *think* on the microVAX III/3000 series, there is a battery backed MC146818 CMOS watch chip with 1 second resolution which is accessed as a series of 8 bit registers in I/O address space. I haven't been able to find any accuracy specs for the 32.768 kHz crystal oscillator which drives this. The watch chip is tricky to access, so see the KA630 CPU Module Users Guide (DEC order no. EK-KA630-UG) for more info.

Actually, the 82/8300 series also have one of these watch chips because their TODR register is volatile and the software must reload it from the watch chip after a power down. See the KA820 CPU Technical Manual (EK-KA820-TM) if you want to access the watch chip on these VAXes.

All the previously mentioned registers are only accessible with the CPU in kernel mode. The only time the TOD/TOY clock is read is at boot time, after power fail restarts, or when a SET TIME command/$SETIME system service call with no parameters is executed.
Note that there is *no* VMS system service call available that will simply read the TOD/TOY register without simultaneously updating the current system time in memory.

So,

1) It follows that the quickest and nastiest method of improving time accuracy is to periodically update the current time using the TOD/TOY clock or some other time reference. For VMS this can be as simple as submitting a two line batch command file that does a SET TIME then resubmits itself for X minutes later.

PRO:- Really simple. Works on VAXes without programmable interval counters.

CON:- Really Ugly. This has the big disadvantage that the time corrects in a discontinuous fashion and may well double back on itself. Many applications wouldn't like that at all. Must have a TOD/TOY register or other machine readable time source.

2) Determine the percentage error of the system time by comparing it against a known time standard over a period of a few days. Do a once-only change to the NICR value of a compensating amount by running a suitable program as soon as possible after system initialisation. For a nominal NICR value of -10,000 this should theoretically allow a time precision of one part in 10,000 or +/- 8.6 seconds a day. By dithering the NICR value (changing it up and down by one count at precalculated intervals) you could get greater precision.

PRO:- No machine readable external or internal time reference required, easy to implement, accuracy good for most purposes. Works about as well as does adjusting crystal frequency.

CON:- Doesn't compensate for thermal drift and ageing. Does a mediocre job of synchronising clocks on multiple VAXes. Sensitive to modules being replaced. Difficult to get really high accuracy. Can't be done on VAXes with no programmable interval counter.

3) Use a modification of method (2) a[...]

# ARPA INTERNET: hirshman@ripper.DEC.COM
# Snail: Digital Equipment Corp. P/L, 18 Glen Street,
#        Eastwood, NSW 2122, AUSTRALIA

DISCLAIMER: The above opinions are mine (and probably mine alone, *sigh!*).
----------
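The arithmetic behind method (2) in the post above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the measured error is assumed to come from comparing the system clock against a trusted reference by hand, and actually loading the result into the NICR (which needs kernel mode) is not shown.

```c
#include <assert.h>

#define NOMINAL_COUNTS  10000.0   /* 1 us counts per 10 ms interrupt (from the post) */
#define SECONDS_PER_DAY 86400.0

/* Counts that would make the interrupt interval exactly 10 ms of true
 * time, given that the system clock gained `gained_seconds` (negative
 * if it lost time) while `reference_seconds` passed on the reference. */
static double ideal_counts(double gained_seconds, double reference_seconds)
{
    return NOMINAL_COUNTS * (reference_seconds + gained_seconds)
                          / reference_seconds;
}

/* Nearest whole-count NICR value (negative, as in the post). */
long corrected_nicr(double gained_seconds, double reference_seconds)
{
    assert(reference_seconds > 0.0);
    assert(reference_seconds + gained_seconds > 0.0);
    return -(long)(ideal_counts(gained_seconds, reference_seconds) + 0.5);
}

/* Error left over after rounding to a whole count, in seconds per day
 * (positive means the clock still gains). */
double residual_seconds_per_day(double gained_seconds, double reference_seconds)
{
    double ideal = ideal_counts(gained_seconds, reference_seconds);
    double used = (double)(-corrected_nicr(gained_seconds, reference_seconds));

    return (ideal - used) / ideal * SECONDS_PER_DAY;
}
```

A clock that gains 20 seconds a day would get an NICR value of -10,002 and would still gain roughly 2.7 seconds a day, which is the gap that dithering the value by one count is meant to close.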
hirshman@gidday.dec.com (Bret H., Tech Support @ Sydney, Oz SNE-G 4125546) (03/26/88)
A while ago I posted a note making some suggestions about possible methods for correcting and synchronising the system time on VAXes. A couple of the methods involved using the VAX TODR (Time Of Day Register) as a reference. This is all well and good for *most* VAXes, though I didn't really give enough information on the VAX/VMS TODR format for anyone to use it successfully without more research.

BUT (and this is a big but, sportsfans!) if you value your system uptime *DON'T* read the TODR on any of the 85xx/87xx/88xx/89xx series VAXes (the Nautilus family), and especially don't write to it! You might corrupt your system time at best, and hang your VAX at worst.

The reasons for this are too complex to explain here in what is meant to be a quick warning note. Suffice it to say that the implementation of the TODR on Nautilus-family VAXes is a lot more complex than I led you to believe in my original posting. In other words, I blew it. Sorry about that, folks!

Also, if anybody posted any queries or comments on my original note I'm afraid I didn't see them. In the best traditions of Murphy's Law, my news feed went down for more than a week within hours of my first posting. Typical! :-) So please send me mail as well as posting, or just send me mail. It's a lot more reliable for me.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Bret A. Hirshman, Esq.                     "The makers may make
#                                             and the users may use,
# DEC EasyNet: RIPPER::HIRSHMAN               but the fixers must fix
# USENET: hirshman@ripper.DEC.COM             with but minimal clues"
#     or ..!{decwrl,decuac}!ripper.dec.com!hirshman
# ARPA INTERNET: hirshman@ripper.DEC.COM                    Anon.
# Snail: Digital Equipment Corp. P/L, 18 Glen Street,
#        Eastwood, NSW 2122, AUSTRALIA

DISCLAIMER: The above opinions are mine (and probably mine alone, *sigh!*).
----------