[comp.sys.dec] Time synchronization in a Distributed Environment

jerry@oliveb.olivetti.com (Jerry Aguirre) (03/10/88)

Being able to synchronize all systems to the same time is nice.  Having
that time be the correct time is even nicer.  I have several Vax750s and
a Vax785 whose clocks run fast.  By selecting systems with better clocks
to be the masters I can work around this but the result is not ideal.

Does anybody know where to adjust the real-time clock on a Vax?  I could
probably come up with an accurate frequency counter or just keep
tweaking it until it is right but I can't find the crystal, much less a
trimmer for it.

					Jerry Aguirre @ Olivetti ATC
					uunet!amdahl!oliveb!jerry

hirshman@60600.dec.com (Bret H. {FS Tech Support@Sydney, Oz} SNE/G 4125546) (03/18/88)

hirshman@gidday.dec.com (Bret H. {FS Tech Support@Sydney, Oz} SNE/G 4125546) (03/21/88)

> Being able to synchronize all systems to the same time is nice.  Having
> that time be the correct time is even nicer.  I have several Vax750s and
> a Vax785 whose clocks run fast.  By selecting systems with better clocks
> to be the masters I can work around this but the result is not ideal.
> 
> Does anybody know where to adjust the real-time clock on a Vax?  I could
> probably come up with an accurate frequency counter or just keep
> tweaking it until it is right but I can't find the crystal, much less a
> trimmer for it.
> 
>                                         Jerry Aguirre @ Olivetti ATC
>                                         uunet!amdahl!oliveb!jerry

I'm afraid there are no clock crystal frequency trimmers on any of the VAXes
that I've come across. Even if there were, you would (a) almost certainly
invalidate a DEC Maintenance Agreement by twiddling them yourself, (b) need
to keep careful track of any maintenance done and adjust it all again if the
relevant module was replaced, and (c) you'd still have thermal drift and 
crystal ageing to worry about. But don't despair! I can think of a number of
ways to do what you want, most of which don't even need a frequency counter.
But first a little background info: 

I'll discuss VMS here because that's what I know, but I'm sure the basic ideas
are quite applicable to Unix. VMS maintains the current system time in a 
software counter in memory. This is incremented at hardware clock interrupt
time, which is always every 10 milliseconds for VMS on all current VAXes.

There are actually two hardware real-time clocks present in most VAXen, with
different purposes and specifications. The first and most relevant one is the
Interval Counter, a programmable 32 bit counter which is incremented at one
microsecond intervals with a nominal .01% clock accuracy, i.e. +/- 8.64 seconds
per day. This counter is present in all VAXes other than microVAXes, which have
fixed unprogrammable 10 millisecond clock interrupts. At boot time and after
power fail restarts, VMS programs the Next Interval Count Register (internal
processor register #25) with a value of -10,000 to produce clock interrupts at
10 millisecond intervals. These are the only times VMS touches the NICR, a
handy fact which gives us our best method for tweaking the time. Again, I'd be
surprised if the various Unixes did this much differently.

The other clock, which is architecturally optional for MicroVAX
implementations, is the battery-backed Time Of Day/Time Of Year (TOD or TOY)
clock. On the 725/30, 750, 780/2/5, 82/8300 series, 8600/50 and the 85/87/8800
series this is a 32 bit unsigned binary counter (the TODR, internal processor
register #27) with 10 millisecond resolution and a clock accuracy of at least
.0025%, i.e. +/- around 65 seconds per month.

The old microVAX/VAXstation I has no TOD/TOY clock at all. On the
microVAX/VAXstation II & 2000, and I *think* on the microVAX III/3000 series,
there is a battery backed MC146818 CMOS watch chip with 1 second resolution
which is accessed as a series of 8 bit registers in I/O address space. I
haven't been able to find any accuracy specs for the 32.768 kHz crystal
oscillator which drives this. The watch chip is tricky to access, so see the
KA630 CPU Module Users Guide (DEC order no. EK-KA630-UG) for more info.
Actually, the 82/8300 series also have one of these watch chips because their
TODR register is volatile and the software must reload it from the watch chip
after a power down. See the KA820 CPU Technical Manual (EK-KA820-TM) if you
want to access the watch chip on these VAXes.

All the previously mentioned registers are only accessible with the CPU in
kernel mode.
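
The drift figures quoted for the two clocks follow directly from their
percentage tolerances. A minimal sketch of that arithmetic, in Python rather
than MACRO; the function name is made up for illustration and a month is
assumed to be 30 days:

```python
SECONDS_PER_DAY = 24 * 60 * 60

def worst_case_drift(tolerance_percent, days):
    """Worst-case drift in seconds over `days` days for a given tolerance."""
    return tolerance_percent / 100.0 * SECONDS_PER_DAY * days

# Interval Counter: nominal .01% accuracy, quoted per day.
interval_counter = worst_case_drift(0.01, 1)
# TODR: .0025% accuracy, quoted per month (assumed here to be 30 days).
todr = worst_case_drift(0.0025, 30)

print(f"Interval Counter: +/- {interval_counter:.2f} s/day")
print(f"TODR:             +/- {todr:.1f} s/month")
```

The same function gives a feel for any other tolerance, e.g. a measured error
expressed as a percentage.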

The only time the TOD/TOY clock is read is at boot time, after power fail
restarts, or when a SET TIME command/$SETIME system service call with no
parameters is executed. Note that there is *no* VMS system service call
available that will simply read the TOD/TOY register without simultaneously
updating the current system time in memory. 

So,

1) It follows that the quickest and nastiest method of improving time accuracy
is to periodically update the current time using the TOD/TOY clock or some
other time reference. For VMS this can be as simple as submitting a two line
batch command file that does a SET TIME then resubmits itself for X minutes
later. 
PRO:- Really simple. Works on VAXes without programmable interval counters.
CON:- Really Ugly. This has the big disadvantage that the time corrects in a
 discontinuous fashion and may well double back on itself. Many applications
 wouldn't like that at all. Must have a TOD/TOY register or other machine
 readable time source.
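
To get a feel for how big the discontinuity in method (1) is, the size of each
step can be estimated from the drift rate and the resubmit interval. This is
only a rough sketch: the 20 seconds/day gain and the 60 minute interval are
made-up example numbers, the reference is assumed to be exact, and the function
name is hypothetical.

```python
CLOCK_TICK_S = 0.010   # VMS clock interrupt period, 10 milliseconds

def step_at_resync(gain_s_per_day, resync_minutes):
    """Seconds the system time jumps back at each resync.

    A negative result means the clock was running slow and jumps forward.
    """
    return gain_s_per_day * resync_minutes / (24 * 60)

# Example values only: a clock gaining 20 s/day, reset once an hour.
step = step_at_resync(gain_s_per_day=20.0, resync_minutes=60)
print(f"each resync steps the clock back {step:.3f} s "
      f"(about {step / CLOCK_TICK_S:.0f} clock ticks are repeated)")
```

With these numbers every hourly SET TIME makes the system time double back by
a little under a second, which is exactly the behaviour many applications
object to.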

2) Determine the percentage error of the system time by comparing it against
a known time standard over a period of a few days. Do a once-only change to the
NICR value of a compensating amount by running a suitable program as soon as
possible after system initialisation. For a nominal NICR value of -10,000 each
count is one part in 10,000 or about 8.6 seconds a day, so rounding to the
nearest count should theoretically leave a residual error of no more than
about +/- 4.3 seconds a day. By dithering the NICR value (changing it up and
down by one count at precalculated intervals) you could get greater precision.
PRO:- No machine readable external or internal time reference required, easy to
 implement, accuracy good for most purposes. Works about as well as does
 adjusting crystal frequency.
CON:- Doesn't compensate for thermal drift and ageing. Does a mediocre job of
 synchronising clocks on multiple VAXes. Sensitive to modules being replaced.
 Difficult to get really high accuracy. Can't be done on VAXes with no
 programmable interval counter. 
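
The calculation behind method (2) can be sketched as follows. This is an
illustration only, assuming the system clock was compared against a trusted
reference and found to have gained a known number of seconds; the function
names and the example figures (60 seconds gained over 3 days) are hypothetical.
A clock that runs fast needs more ticks per interrupt, i.e. a NICR value of
larger magnitude.

```python
import math

NOMINAL_TICKS = 10_000      # nominal 1 us ticks per 10 ms clock interrupt
SECONDS_PER_DAY = 86_400

def ideal_ticks(gain_seconds, elapsed_seconds):
    """Ideal (fractional) tick count per clock interrupt.

    gain_seconds is how far the system clock ran ahead of the reference
    (negative if it fell behind) over elapsed_seconds of reference time.
    """
    rate = (elapsed_seconds + gain_seconds) / elapsed_seconds
    return NOMINAL_TICKS * rate

def fixed_nicr(ideal):
    """Nearest whole NICR value and the residual drift in seconds per day."""
    ticks = round(ideal)
    residual = (ideal / ticks - 1.0) * SECONDS_PER_DAY
    return -ticks, residual

def dither_plan(ideal, slots):
    """Share `slots` equal-length slots between the two neighbouring values."""
    low = math.floor(ideal)
    high_slots = round((ideal - low) * slots)
    return {-low: slots - high_slots, -(low + 1): high_slots}

# Example values only: the clock gained 60 seconds over 3 days.
ideal = ideal_ticks(gain_seconds=60.0, elapsed_seconds=3 * SECONDS_PER_DAY)
nicr, residual = fixed_nicr(ideal)
plan = dither_plan(ideal, slots=100)
print(nicr, round(residual, 2), plan)
```

For this example the nearest fixed value is -10002, which still leaves the
clock gaining roughly 2.7 seconds a day. The dither plan says how many slots
out of 100 to spend at -10002 and how many at -10003 so that the average comes
closer to the ideal; how the slots are timed on a real system is left open.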

3) Use a modification of method (2) along with a master time reference. The 
reference can be the TOD/TOY clock on a particular VAX or, better, some other
higher accuracy machine readable add-on clock. As before, calculate a basic
correction factor for the clock on each system as a starting point. Monitor
the master reference at regular intervals, either as an absolute time or as a
source of periodic interrupts. Compare the obtained time or interval with the
per-system clock and adjust the NICR value slightly to speed up or slow down
the per-system clock accordingly, thus tracking the master reference very
closely without any sudden discontinuous time changes.
PRO:- Accuracy virtually as good as that of master reference source and can
 be of even higher resolution. Can provide very precise synchronisation of
 multiple VAXes. Degrades gracefully to method (2) if master reference is lost.
 Almost unaffected by local clock frequency drift or component replacement.
CON:- Care must be taken to make this method tolerant of manual changes to the
 system time and date without getting confused. May require special clock
 hardware. Use as a multi-VAX synchronisation technique requires a communication
 medium with a maximum propagation delay preferably less than the resolution
 of the system clocks. Can't be done on VAXes with no programmable interval
 timer. Accessing the TOD/TOY clock is a non-trivial exercise on microVAXes.
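
The tracking idea in method (3) can be tried out on paper before going anywhere
near kernel mode. Below is a small simulation, not VMS code: it assumes the
offset from the master can be measured once per comparison period, and it
trims the tick count in proportion to that offset, with a clamp so the clock is
only ever slewed gently. All names, the gain of 100 counts per second of
offset, the 60 second period and the clamp of 100 counts are assumptions chosen
for the example.

```python
NOMINAL_TICKS = 10_000   # nominal 1 us ticks per 10 ms clock interrupt

def track_master(true_rate, base_ticks, start_offset, period=60.0,
                 gain=100.0, max_trim=100, steps=200):
    """Simulate slewing a local clock toward a master reference.

    true_rate    -- local oscillator speed relative to the master (1.0 = exact)
    base_ticks   -- tick count per interrupt from the one-off calibration
    start_offset -- initial local-minus-master error in seconds
    Returns a list of (ticks, offset) pairs, one per comparison period.
    """
    offset = start_offset
    history = []
    for _ in range(steps):
        # Ahead of the master: count more ticks per interrupt to slow down.
        trim = round(gain * offset)
        trim = max(-max_trim, min(max_trim, trim))
        ticks = base_ticks + trim
        # Local time gained per second of master time with this tick count.
        drift = NOMINAL_TICKS * true_rate / ticks - 1.0
        offset += drift * period
        history.append((ticks, offset))
    return history

# Example values only: oscillator 20 s/day fast, clock starts 0.5 s ahead.
history = track_master(true_rate=1 + 20 / 86_400, base_ticks=10_002,
                       start_offset=0.5)
final_ticks, final_offset = history[-1]
print(f"after {len(history)} periods: NICR {-final_ticks}, "
      f"offset {final_offset * 1000:.1f} ms")
```

In this run the half second error is slewed out over a handful of periods and
the clock then hovers within a few milliseconds of the master, hopping between
two adjacent NICR values. The local time never steps; only its rate changes. A
real implementation would also have to cope with manual SET TIME changes and
with the delay in reading the master, as noted in the CON list.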

4) For the total perfectionist for whom nothing else will do: Modify the VAX
backplane and/or modules to allow the use of your own external precision clock
signal, preferably locked to some broadcast frequency reference. For some VAXes
only a relatively small backplane mod is required to do it. Yes, a very few
(self maintenance!) users have done this, and no, I'm not going to broadcast
the requisite details over the net. Convince me that you *Really Need To Know*,
and I might relent in individual cases.
PRO:- Superb accuracy. In theory will work on all VAXes.
CON:- Tricky to do without violating FCC interference rules, and probably
 expensive. Definitely expensive if you make a mistake and blow something up.
 Your local DEC Field Service branch almost certainly won't want to know you.
 Gives poor long-suffering service techs headaches if they have to replace
 failed modified components. 


I haven't actually tried any of these methods, but they look like they should
work. Naturally, neither DEC nor myself can accept responsibility for the
consequences if you try them. Caveat emptor! I'm including a VMS MACRO program
and instructions to perform the basic method (2), but I'll leave it to some
other generous soul to write the code for multiple-VAX method (3). You'll need
to assemble and link the following program first, then you'll need to add a
command line in SYSTARTUP.COM to run the program at boot time. This sample
program needs to run as a detached process with CMKRNL privilege. It assembles
OK but I didn't have a VAX with a NICR to test it on, so beware.

Commands to assemble and link:

$ MACRO/LIS SET_CLOCK.MAR
$ LINK/EXE=SYS$SYSTEM:SET_CLOCK.EXE SET_CLOCK

Commands to be installed in SYSTARTUP.COM:

$ SET PROC/PRIV=CMKRNL
$ RUN SYS$SYSTEM:SET_CLOCK -
	/UIC		= [1,4] -
	/PROCESS_NAME	= "SET_CLOCK" -
	/ERROR		= SYS$MANAGER:SET_CLOCK.LOG -
	/INPUT		= NL: -
	/OUTPUT		= NL: -
	/PRIVILEGE	= CMKRNL


----------------------- cut here for SET_CLOCK.MAR ---------------------------
;
	.TITLE	MODIFY_NICR
	.IDENT	'X01-00'

;++
; ABSTRACT:
;	This program modifies the normal value set in
;	the Next Interval Count Register so that the
;	clock interrupts are generated at intervals as
;	close to the nominal 10 milliseconds apart
;	as possible.
;
; ENVIRONMENT:
;	Any VAX which fully implements the NICR register.
;	Running this program on a VAX without a NICR will
;	almost certainly cause a system fatal bugcheck.
;	This program must run as a detached process in
;	order to declare and receive power recovery ASTs.
;	This is necessary since upon a warm restart VMS
;	sets the NICR to its default value of -10,000. We
;	request notification of power recovery and then
;	modify the NICR upon a restart.
;--

	.SBTTL	DECLARATIONS

NICR	=	25	; Internal register number for NICR

; Value to load into NICR register

	.PSECT	DATA NOEXE
INTRVL:	.LONG	-10000		; Set this value to minus the number
				; of nominally 1uSec ticks required
				; to get a real 10mSec. The normal
				; value is -10,000; a larger magnitude
				; slows the system clock down


	.PSECT 	CODE EXE
	.SBTTL	MOD_NICR - Modify the clock speed

;++
; DESCRIPTION:
;	This routine sets the NICR to the new value and declares
;	a power recovery AST so that it can be done again upon
;	recovery from a power fail if battery backup is present.
;
; Inputs:	None
;
; Outputs:	NICR reg is modified
;		A power recovery AST is declared
;--

	.ENTRY	MOD_NICR ^M<>

; Request power recovery notification and set up the NICR

	CALLS	#0, POWER_BACK
	BLBC	R0, DONE	; exit if error

; Hibernate while waiting for power recovery notification

	$HIBER_S

; Exit

DONE:	$EXIT_S	R0

;

	.SBTTL	POWER_BACK - Power recovery AST routine

;++
; DESCRIPTION:
;	Entered upon a boot or warm restart to set the NICR
;	back to the altered value so that the clock runs at
;	the correct speed. Performs a change mode to kernel.
;
; Inputs:	None
;
; Outputs:	NICR reg is modified
;		A power recovery AST is declared or redeclared
;--

	.ENTRY	POWER_BACK ^M<>

	$SETPRA_S POWER_BACK	; (Re)declare the AST
	BLBC	R0, 10$
	$CMKRNL_S DO_IT
10$:	RET


	.SBTTL	DO_IT - Modify the register in kernel mode

;++
; DESCRIPTION:
;	Routine to actually modify the NICR, which must be
;	done in kernel mode.
;
; Calling sequence:	Kernel mode
;
; Inputs:	None
;
; Outputs:	NICR reg is modified
;--

	.ENTRY	DO_IT ^M<>

	MTPR	INTRVL, #NICR
	MOVL	#1, R0			; Put success code in R0
	RET				; All done



	.END	MOD_NICR



~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                          #      Bret A. Hirshman, Esq.
 "The makers may make     #
 and the users may use,   #  DEC EasyNet: RIPPER::HIRSHMAN
 but the fixers must fix  #  USENET: hirshman@ripper.DEC.COM
 with but minimal clues"  #    or ..!{decwrl,decuac}!ripper.dec.com!hirshman
                          #  ARPA INTERNET: hirshman@ripper.DEC.COM
                 Anon.    #  Snail: Digital Equipment Corp. P/L, 18 Glen Street,
                          #         Eastwood, NSW 2122, AUSTRALIA

DISCLAIMER: The above opinions are mine (and probably mine alone, *sigh!*).
----------  

hirshman@gidday.dec.com (Bret H., Tech Support @ Sydney, Oz SNE-G 4125546) (03/26/88)

A while ago I posted a note making some suggestions about possible methods
for correcting and synchronising the system time on VAXes.  A couple of the
methods involved using the VAX TODR (Time Of Day Register) as a reference.
This is all well and good for *most* VAXes, though I didn't really give enough
information on the VAX/VMS TODR format for anyone to use it successfully
without more research.

BUT (and this is a big but, sportsfans!) if you value your system uptime *DON'T*
read the TODR on any of the 85xx/87xx/88xx/89xx series VAXes (the Nautilus
family), and especially don't write to it!  You might corrupt your system time
at best, and hang your VAX at worst.

The reasons for this are too complex to explain here in what is meant to be a
quick warning note.  Suffice it to say that the implementation of the TODR on 
Nautilus-family VAXes is a lot more complex than I led you to believe in my
original posting.  In other words, I blew it.  Sorry about that, folks!

Also, if anybody posted any queries or comments on my original note I'm afraid
I didn't see them.  In the best traditions of Murphy's Law, my news feed went
down for more than a week within hours of my first posting.  Typical! :-)
So please send me mail as well as posting, or just send me mail.  It's a lot
more reliable for me.

