[comp.arch] Time Sync in MP System

markw@hpsal2.HP.COM (Mark Williams) (06/03/89)

How can I initialize and maintain time synchronization in a
multiprocessor system?  To put this in context, I am seeking an
approach that can be used by the IEEE CSR Standard, which will
be included by reference in Futurebus+ and SCI.  

We have a 64 bit system clock counter register in each node in
the system, which must be initialized  --  and resynchronized due
to local oscillator drift.  The clock counts in nanoseconds, but
nodes can increment by larger quantities if they lack a GHz
oscillator   :<)
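
To illustrate "increment by larger quantities": a minimal sketch in C of a nanosecond counter driven by a slower oscillator. The type ns_counter, the function ns_counter_tick, and the fractional-remainder field are hypothetical names chosen for illustration; they are not taken from the CSR draft. The sketch assumes the oscillator frequency is a whole number of Hz no greater than 1 GHz.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NS_PER_SEC 1000000000u

/* Hypothetical model: a 64-bit nanosecond count advanced by an
 * oscillator of hz ticks per second (assumed 0 < hz <= 1 GHz). */
typedef struct {
    uint64_t ns;    /* the 64-bit counter, in nanoseconds            */
    uint32_t frac;  /* leftover fraction of a ns, in units of 1/hz ns */
    uint32_t hz;    /* local oscillator frequency                     */
} ns_counter;

/* Called once per oscillator tick.  Adds the whole nanoseconds per
 * tick and carries the remainder so no time is lost when the period
 * is not an integer number of nanoseconds. */
static void ns_counter_tick(ns_counter *c)
{
    assert(c != NULL && c->hz > 0 && c->hz <= NS_PER_SEC);
    c->ns   += NS_PER_SEC / c->hz;
    c->frac += NS_PER_SEC % c->hz;
    if (c->frac >= c->hz) {     /* accumulated a full extra nanosecond */
        c->frac -= c->hz;
        c->ns   += 1;
    }
}
```

With a 40 MHz oscillator each tick adds 25; with 30 MHz the ticks add 33, 33, 34, so three ticks still total 100 ns.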

If you have a reference on this topic, I would appreciate a posting.

Thanks, Mark Williams (408) 447-0762      EMAIL:  markw@hpda.HP.COM
US MAIL: Mark Williams, MS 42U5, Hewlett Packard, 19447 Pruneridge Av,
Cupertino, Ca 95014

aglew@mcdurb.Urbana.Gould.COM (06/05/89)

>How can I initialize and maintain time synchronization in a
>multiprocessor system?  To put this in context, I am seeking an
>approach that can be used by the IEEE CSR Standard, which will
>be included by reference in Futurebus+ and SCI.  
>
>We have a 64 bit system clock counter register in each node in
>the system, which must be initialized  --  and resynchronized due
>to local oscillator drift.  The clock counts in nanoseconds, but
>nodes can increment by larger quantities if they lack a GHz
>oscillator   :<)

Do not resynchronize the counter!  That sort of resynchronization
leads to "time warps" which dramatically reduce the usefulness
of the clock counter for high resolution time measurements,
performance evaluation, etc.

Instead, implement both the counter register and a correction register.
The correction register has to be added to the counter to get a
"time of day" measurement, but the raw time, at whatever rate the
local oscillator is ticking, can always be read.
    For small interval measurements the drift of the oscillator is 
probably less than the "warp" that you would apply to resynch the
timers.

Provide ways of reading both the raw time and the correction value
together. This does not have to be an atomic operation, but it should
be fairly precise.
    Note that the correction register changes much less often than
the local clock counter, so it is not unreasonable to define a
READ_CLOCK_AND_CORRECTION operation as two separate reads.
    Note also that providing both a raw and correction counter
avoids a race for incrementing the counter - the raw counter
is incremented only by the oscillator, the correction only by the
time synch interface.
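
The two-register scheme above might be sketched in C as follows. This is only an illustration under stated assumptions: node_clock, clock_sample, and the lower-case function names are hypothetical, the correction is assumed to be a signed offset in the same units as the raw counter, and the re-read loop is an optional extra (the scheme as described does not require the pair of reads to be atomic).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one node's clock registers. */
typedef struct {
    volatile uint64_t raw;        /* advanced only by the local oscillator */
    volatile int64_t  correction; /* written only by the time-sync interface */
} node_clock;

typedef struct {
    uint64_t raw;
    int64_t  correction;
} clock_sample;

/* READ_CLOCK_AND_CORRECTION as two separate reads.  The correction
 * changes rarely, so re-reading it afterwards is a cheap way to
 * notice an update that landed between the two reads. */
static clock_sample read_clock_and_correction(const node_clock *c)
{
    clock_sample s;
    assert(c != NULL);
    do {
        s.correction = c->correction;
        s.raw        = c->raw;
    } while (s.correction != c->correction);
    return s;
}

/* Time of day = raw + correction (arithmetic wraps modulo 2^64). */
static uint64_t time_of_day(clock_sample s)
{
    return s.raw + (uint64_t)s.correction;
}

/* Interval measurements use the raw counter only, so a change in
 * the correction between the two samples cannot "warp" the result. */
static uint64_t interval(clock_sample start, clock_sample end)
{
    return end.raw - start.raw;
}
```

Because only the oscillator writes raw and only the sync interface writes correction, neither writer has to read-modify-write the other's register.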

(Actually, you would probably want a raw clock, current correction, 
and a deviation, with the deviation slowly "warping" its way into the
correction register in such a manner that raw+correction is a 
monotonic time of day clock suitable for timestamping. raw is useful
for interval measurements, and raw+deviation is the truest estimate
of the time of day).
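
One way the three-value variant could look, again as a hedged C sketch rather than a definitive design. The names slewed_clock, clock_advance, timestamp, best_estimate, and the constant SLEW_DIVISOR are all assumptions made for illustration. The correction moves toward the deviation by at most 1/SLEW_DIVISOR of the elapsed raw time, so raw+correction can never run backwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed slew limit: the correction may move by at most
 * elapsed/SLEW_DIVISOR per update.  Any value >= 2 keeps
 * raw + correction from ever decreasing. */
#define SLEW_DIVISOR 1000u

typedef struct {
    uint64_t raw;        /* oscillator-driven count, never adjusted     */
    int64_t  correction; /* offset currently applied to timestamps      */
    int64_t  deviation;  /* latest offset estimate from the sync source */
} slewed_clock;

/* Advance the raw count by elapsed_ns, then let the deviation
 * "warp" gradually into the correction. */
static void clock_advance(slewed_clock *c, uint64_t elapsed_ns)
{
    assert(c != NULL);
    c->raw += elapsed_ns;

    int64_t max_step = (int64_t)(elapsed_ns / SLEW_DIVISOR);
    int64_t gap = c->deviation - c->correction;
    if (gap >  max_step) gap =  max_step;
    if (gap < -max_step) gap = -max_step;
    c->correction += gap;
}

/* Monotonic time of day, suitable for timestamping. */
static uint64_t timestamp(const slewed_clock *c)
{
    return c->raw + (uint64_t)c->correction;
}

/* Truest current estimate of the time of day; may jump. */
static uint64_t best_estimate(const slewed_clock *c)
{
    return c->raw + (uint64_t)c->deviation;
}
```

For example, if the sync source reports that the node is 5000 ns fast (deviation = -5000) and clock_advance is called every 1000 ns, the correction drops by 1 ns per call and the timestamp advances by 999 ns instead of 1000 until the two estimates agree.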


I may sound a bit cock-eyed here, but I am deadly serious!
Please talk to someone who has actually tried to do very-high-res
measurements before defining a clock standard - myself, 
Agrawala at Maryland or Thareja at AT&T. Maybe somebody in the 
POSIX Real Time group (although I don't think that there are too 
many performance people there anymore). This is exactly the sort
of thing that we were talking about at the UNIX Performance
Evaluation Birds-of-a-Feather at SIGMETRICS '89 (the BOF was a 
bit of a flop, but we did bring some ideas forward).


Andy "Krazy" Glew   aglew@urbana.mcd.mot.com   uunet!uiucdcs!mcdurb!aglew
   Motorola Microcomputer Division, Champaign-Urbana Design Center
	   1101 E. University, Urbana, Illinois 61801, USA.
   
My opinions are my own, and are not the opinions of my employer, or
any other organisation. I indicate my company only so that the reader
may account for any possible bias I may have towards our products.

roelof@idca.tds.PHILIPS.nl (R. Vuurboom) (06/05/89)

In article <28200330@mcdurb> aglew@mcdurb.Urbana.Gould.COM writes:
>
>>How can I initialize and maintain time synchronization in a
>>multiprocessor system?  To put this in context, I am seeking an
>>approach that can be used by the IEEE CSR Standard, which will
>>be included by reference in Futurebus+ and SCI.  
>>
>Do not resynchronize the counter!  That sort of resynchronization
>leads to "time warps" which dramatically reduce the usefulness
>of the clock counter for high resolution time measurements,
>performance evaluation, etc.
>
I am inclined to agree with you. A while back I was doing some work on a
Distributed Real-time Multiprocessor Operating System (called appropriately 
enough DRM System) and we spent a lot of time worrying about this
synchronization issue. The conclusion we drew was that it really wasn't
worth the bother. If an application could weather the "time warps"
introduced during resynchronization then it could probably weather
the oscillator drift in the first place (milliseconds per year?).
Conversely, if an application demanded that sort of synchronization,
then it could have a rough time weathering its way through synchronization 
bouts. 

In the days of granddad's tick-along all of this could have been pretty
hot stuff, but with the accuracy of today's clocks this particular 
synchronization issue between distributed processors may have become a 
moot point.
-- 
Roelof Vuurboom  SSP/V3   Philips TDS Apeldoorn, The Netherlands   +31 55 432226
domain: roelof@idca.tds.philips.nl             uucp:  ...!mcvax!philapd!roelof