[comp.arch] Time synchronization in a Distributed Environment

aws@druhi.ATT.COM (SteereA) (02/24/88)

Hi,
  I am looking for articles, references, implementations,
etc. for solving the problem of keeping N machines within
a specified time of one another.  I appreciate any and all
pointers.

	Thanks,
	Andy Steere
	  (303) 538-4128
 	  ihnp4!druhi!aws

pase@ogcvax.UUCP (Douglas M. Pase) (02/28/88)

In article <druhi.2710> aws@druhi.ATT.COM (SteereA) writes:
>Hi,
>  I am looking for articles, references, implementations,
>etc. for solving the problem of keeping N machines within
>a specified time of one another.  I appreciate any and all
>pointers.
>

This article may be of some use to you

%A Leslie Lamport
%T Time, Clocks, and the Ordering of Events in a Distributed System
%J Communications of the ACM
%V 21
%N 7
%P 558-565
%D July 1978
%K lam78

--
Doug Pase  --  ...ucbvax!tektronix!ogcvax!pase  or  pase@cse.ogc.edu (CSNet)

firth@sei.cmu.edu (Robert Firth) (02/29/88)

In article <druhi.2710> aws@druhi.ATT.COM (SteereA) writes:

]  I am looking for articles, references, implementations,
]etc. for solving the problem of keeping N machines within
]a specified time of one another.  I appreciate any and all
]pointers.

In article <1571@ogcvax.UUCP> pase@ogcvax.UUCP (Douglas M. Pase) writes:

]This article may be of some use to you
]
]%A Leslie Lamport
]%T Time, Clocks, and the Ordering of Events in a Distributed System
]%J Communications of the ACM
]%V 21
]%N 7
]%P 558-565
]%D July 1978
]%K lam78

I'd second that.  This is an excellent article on the subject.
However, whereas the first half is generally useful, the second
part - where Lamport talks about real clocks rather than 'logical'
clocks - applies only in the special circumstance that all the
processors are in the same inertial frame of reference (as the
author indicates in a footnote).
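
For readers without the paper to hand, the 'logical' clock of the first half can be sketched in a few lines of C. This is only an illustration of the two rules as usually stated (tick on every local event, and on receipt move past the sender's timestamp); the names lamport_clock, lc_tick and lc_receive are made up here and do not come from the article.

```c
/* A logical clock is just a counter; it orders events and says
   nothing about wall-clock time. */
typedef struct {
    unsigned long counter;
} lamport_clock;

/* Rule 1: advance the clock for each local event (a send counts as
   an event).  The returned value is the event's timestamp. */
unsigned long lc_tick(lamport_clock *c)
{
    return ++c->counter;
}

/* Rule 2: on receiving a message stamped msg_ts, move the clock up to
   at least msg_ts, then tick for the receive event itself. */
unsigned long lc_receive(lamport_clock *c, unsigned long msg_ts)
{
    if (msg_ts > c->counter)
        c->counter = msg_ts;
    return ++c->counter;
}
```

A sender would attach the result of lc_tick() to each outgoing message, and the receiver would pass that value to lc_receive(), so a receive is always stamped later than the matching send.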

In general, you CANNOT keep two clocks synchronized within an
arbitrary delta, for reasons explained by Einstein these many
years ago.

delp@udel.EDU (Gary Delp) (03/01/88)

The Lamport Article is good.  Dave Mills has been doing it for the
Internet for a good while.  You might look at:

RFC 956, 957, and 958 and the references therein. See also the latest issue
of CCR and several byzantia scattered over JACM and dissertations.

To millisecond granularity, over a widely spread network with
considerable jitter, this is a solved problem.
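
As a rough illustration of how a time exchange can cope with network delay, here is the four-timestamp calculation familiar from the Network Time Protocol, written as a small C sketch. The function names and the use of plain long millisecond values are assumptions made for this example; see the RFCs for the actual formats and filtering.

```c
/* t1: request sent      (client clock)
   t2: request received  (server clock)
   t3: reply sent        (server clock)
   t4: reply received    (client clock)
   All four values must be in the same unit, e.g. milliseconds. */

/* Estimated server clock minus client clock.  The estimate is exact
   only when the outbound and return delays are equal. */
long clock_offset(long t1, long t2, long t3, long t4)
{
    return ((t2 - t1) + (t3 - t4)) / 2;
}

/* Time spent on the network, excluding the server's processing time. */
long round_trip_delay(long t1, long t2, long t3, long t4)
{
    return (t4 - t1) - (t3 - t2);
}
```

For example, with a server running 500 ms ahead, 20 ms of delay each way and 5 ms of server processing, the timestamps 1000, 1520, 1525, 1045 give an offset of 500 and a round-trip delay of 40.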
-- 
Gary Delp
123 Evans Hall, EE, U of D, Newark, DE 19716;  (302)-451-6653 or 2405
UUCP:  ...!{{allegra,ihnp4}!berkeley,harvard}!delp@udel.edu
CSNET:  delp%udel.edu@relay.cs.net      ARPA: delp@udel.edu

rajaei@ttds.UUCP (Hassan Rajaei) (03/05/88)

In article <1571@ogcvax.UUCP> pase@ogcvax.UUCP (Douglas M. Pase) writes:
>In article <druhi.2710> aws@druhi.ATT.COM (SteereA) writes:
>>Hi,
>>  I am looking for articles, references, implementations,
>>etc. for solving the problem of keeping N machines within
>>a specified time of one another.  I appreciate any and all
>>pointers.
>>
>
>This article may be of some use to you
>
>%A Leslie Lamport
>%T Time, Clocks, and the Ordering of Events in a Distributed System
>%J Communications of the ACM
>%V 21
>%N 7
>%P 558-565
>%D July 1978
>%K lam78

The Virtual Time theory and its implementation, the Time Warp mechanism, may
help as well. There is a good article on the subject:

Virtual Time, David Jefferson, ACM Trans. on Prog. Lang. & Syst.,
Vol. 7, No. 3, July 1985, pp. 404-425

There is a Time Warp Operating System (TWOS) on the Caltech Mark III Hypercube
which implements the TW mechanism. You may find the article in:
ACM Operating Systems Review, Vol. 21, No. 5, pp. 77-93, "Distributed Simulation
and the Time Warp Operating System", D. Jefferson et al.

Hassan Rajaei
rajaei@ttds.tds.kth.se

tkevans@fallst.UUCP (Tim Evans) (03/05/88)

In article <1571@ogcvax.UUCP>, pase@ogcvax.UUCP (Douglas M. Pase) writes:
> In article <druhi.2710> aws@druhi.ATT.COM (SteereA) writes:
> Hi,
>   I am looking for articles, references, implementations,
> etc. for solving the problem of keeping N machines within
> a specified time of one another.  I appreciate any and all
> pointers.
> 
> 
In an environment where Fusion (tm) network software runs, you can hack a
way of coordinating date/time among a group of machines.  (This works in
Sys V, at least.)

Select one machine as the master, and manually keep its date/time correct.
Set up a cron job on each of the others (it must run with root authority) that
executes the following command using Fusion's (tm) 'rx':

	date `rx machine_name date '+%m%d%H%M%y'`

I don't know for certain, but other network software presumably can also
execute a command remotely on another machine in the network.
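
To make the format string in that command concrete: '+%m%d%H%M%y' asks the master for its time as mmddHHMMyy, which is the argument form the System V date command accepts for setting the clock. The C sketch below produces the same string with strftime(); the helper name sysv_date_arg is invented for this example.

```c
#include <stddef.h>
#include <time.h>

/* Write *tm into buf as mmddHHMMyy (month, day, hour, minute, two-digit
   year).  Returns the number of characters written, or 0 if buf is too
   small. */
size_t sysv_date_arg(char *buf, size_t len, const struct tm *tm)
{
    return strftime(buf, len, "%m%d%H%M%y", tm);
}
```

Note that the string carries no seconds, so this method can only bring the machines to within about a minute of the master, plus whatever delay the remote command adds.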

edw@IUS1.CS.CMU.EDU (Eddie Wyatt) (03/09/88)

> >   I am looking for articles, references, implementations,
> > etc. for solving the problem of keeping N machines within
> > a specified time of one another.  I appreciate any and all
> > pointers.


  Here's a procedure that I designed and coded.  If anyone has
any questions, complaints, or criticism, mail me.



/**************************************************************************
 *                                                                        *
 *                            sync_clocks                                 *
 *                                                                        *
 **************************************************************************

   Purpose :  This function helps synchronize the clocks of the module
            and lmb.  Lmb time is considered the correct time.  It also
	    determines whether the time unit is seconds or milliseconds.
	    The method used is as follows :
 
 
                t1 = clock value on lmb side when it receives the message
                t2 = clock value on module side when it sends the message
                m = time disparity between lmb clock and module clock
                    (this is the value of interest) 
                N = distribution representing the one-way network
                    transmission time 
                M = distribution representing the time to send a message to 
                    a machine and get a response (round trip)

		Assumptions - processing time is negligible, and both
		directions have the same mean transmission time.
 
 
                t1 = t2 + m + N
 
                E[M] = E[N + N] = E[N] + E[N] = 2*E[N]
 
                E[t1 - t2] = E[m + N] = E[m] + E[N] = m + E[M]/2
 
                m = E[t1 - t2] - E[M]/2 
            
 


   Programmer :Eddie Wyatt
 
   Date : December 1986 (Feb 1987)

   Input : None

   Output : None

   Locals : 
     i - loop increment
     send_time - the module time at which a message is sent
     receive_time - the module time at which the reply is received
                    (receive_time - send_time ~= M)
     lmb_time - the time on the lmb side
                    (lmb_time - send_time ~= m + N)

   Globals : 
     time_units - time is in seconds or milliseconds.
     time_disp - is modified to be equal to the time disparity between
                 the lmb clock and the module clock
     port - not modified

 ************************************************************************/

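LIB_EXPORT void sync_clocks(port)
    {
    register int  i, send_time, receive_time, lmb_time;

    time_disp = 0;
    /* read the time unit (seconds or milliseconds) from port */
    time_units = (TIME_UNITS) Nreceiveint(port);


    for (i = 0; i < NUMOFCLOCKSAMPLES; i++)
        {
        /* module clock just before the request goes out */
        send_time = (time_units == INSECONDS) ? (int) time(NULLPTR(long))
					      : get_time_in_msec();
        Nsendint(1,port);
        lmb_time = Nreceiveint(port);	/* reply carries the lmb clock */
        /* module clock just after the reply arrives */
        receive_time = (time_units == INSECONDS) ? (int) time(NULLPTR(long))
				    		 : get_time_in_msec();
        /* Same value as 2*lmb_time - receive_time - send_time, but adding
           the two small differences avoids overflow on large clock values. */
        time_disp += (lmb_time - send_time) + (lmb_time - receive_time);
        }

    /* each sample contributes twice the disparity, hence the factor of 2 */
    time_disp /=(2*NUMOFCLOCKSAMPLES);
    }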
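
The averaging step can be tried without a network by feeding recorded samples to a small stand-alone function. This is only a sketch of the same arithmetic; the name estimate_disparity and the array-based interface are assumptions made for the example and are not part of the posted library.

```c
/* Average disparity (lmb clock minus module clock) over n round trips.
   send[i], recv[i]: module clock when the request left and when the
                     reply arrived;
   lmb[i]:           lmb clock value carried in the reply.
   Returns 0 if n is not positive. */
long estimate_disparity(const long *send, const long *lmb,
                        const long *recv, int n)
{
    long sum = 0;
    int i;

    if (n <= 0)
        return 0;
    for (i = 0; i < n; i++)
        sum += (lmb[i] - send[i]) + (lmb[i] - recv[i]);
    return sum / (2L * n);
}
```

With a true disparity of 250 and one-way delays of (10,10), (5,15) and (15,5), the samples send = {1000, 2000, 3000}, lmb = {1260, 2255, 3265}, recv = {1020, 2020, 3020} average out to 250, even though the second and third samples are individually off by 5 in opposite directions.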
-- 

Eddie Wyatt 				e-mail: edw@ius1.cs.cmu.edu

nather@ut-sally.UUCP (Ed Nather) (03/11/88)

> > In article <druhi.2710> aws@druhi.ATT.COM (SteereA) writes:
> > Hi,
> >   I am looking for articles, references, implementations,
> > etc. for solving the problem of keeping N machines within
> > a specified time of one another.  I appreciate any and all
> > pointers.
> > 

I recently faced the problem of keeping the internal CPU clock in an IBM PC
in step with a more accurate clock located in an interface card.  The interface
sends a data burst once per second through the serial port.  I used the arrival
of the first byte as a time tick.

The IBM PC keeps internal time by counting down its instruction clock frequency
and it is possible to modify the value of the countdown if the timer interrupt
is intercepted.  No integer value of the countdown will give precisely 1 ms
time ticks, which was needed for this application, so I alternate between one
slightly too large, and one slightly too small.  The amount of time each
countdown value remains active is adjusted by watching for drift between the
two clocks; a pair of countdowns is averaged about every 90 sec, and any drift
is noted (and the CPU clock is forced back into phase at the same time).  After
about 10 cycles, the accumulated drift is used to adjust the amount of time
each countdown value is used for the next 10 cycles.

This constitutes a software servo that works quite well with all the PCs tested.
Their instruction clock frequencies are quite different from one to the next,
but the servo locks in after the first frequency change and stays locked
thereafter.
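
The idea of alternating between two countdown values, with a correction driven by observed drift, can be sketched in C as below. This is not the code described above, just one way to express the same idea; the struct, the function names, and the 1,193,182 Hz figure used in the example (the nominal PC timer input frequency) are assumptions for illustration.

```c
/* State for choosing between two adjacent countdown values. */
struct dither {
    long freq_hz;  /* current estimate of the timer input frequency */
    long tick_hz;  /* desired tick rate, e.g. 1000 for 1 ms ticks */
    long err;      /* fractional counts owed so far, scaled by tick_hz */
};

/* Return the countdown value to load for the next tick.  The smaller
   value is used until the accumulated remainder reaches one whole
   count, then the larger value is used once to pay it back. */
long next_countdown(struct dither *d)
{
    long lo = d->freq_hz / d->tick_hz;

    d->err += d->freq_hz % d->tick_hz;
    if (d->err >= d->tick_hz) {
        d->err -= d->tick_hz;
        return lo + 1;
    }
    return lo;
}

/* Servo step: if the local clock gained gained_ticks (negative if it
   lost) against the reference over interval_s seconds, the frequency
   estimate was too low (too high), so scale it by the same ratio. */
void servo_adjust(struct dither *d, long gained_ticks, long interval_s)
{
    d->freq_hz += d->freq_hz * gained_ticks / (d->tick_hz * interval_s);
}
```

With freq_hz = 1193182 and tick_hz = 1000, one thousand calls to next_countdown() return 1193 or 1194 and add up to exactly 1193182 counts, so the ticks average 1 ms even though neither countdown value gives it on its own.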

Details at eleven ...

-- 
Ed Nather
Astronomy Dept, U of Texas @ Austin
{allegra,ihnp4}!{noao,ut-sally}!utastro!nather
nather@astro.AS.UTEXAS.EDU