deckel@relay.nswc.navy.mil (06/04/90)
In the Read This First for SunOS 4.0.3 there is a section called "Known Problems". Under the subsection called "Kernel", it talks about a time-of-day sync problem which causes the system clock to advance six minutes a year. It suggests a workaround in which a cron job is set up to run each night just after midnight to reset the time of day (add 1 second per day).

We have our server, a Sun-3/160 running 4.0.3, set up as the timehost. The other workstations on the network synchronize their system clocks with the timehost using the "rdate" command. This works just fine, but my supervisor is complaining that the server's time is fast.

What I need to know is: is this update of the system clock necessary? Has anyone else set up their 4.0.3 machines to update the system clock in this way? When my supervisor talked to one of the tech reps from Sun, she said she had never heard of this problem or this workaround. So now he keeps questioning whether or not it is necessary.

Any ideas or opinions would be appreciated.

Debbie Eckel
Naval Surface Warfare Center
deckel@relay.nswc.navy.mil
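As a rough check of the figures quoted above, six minutes a year spread evenly over the year comes to just under one second a day, which matches the size of the one-second nightly adjustment. The sketch below only does that arithmetic; the function name and the 365-day year are assumptions made for illustration, and it says nothing about the direction of the drift.

```python
SECONDS_PER_MINUTE = 60
DAYS_PER_YEAR = 365  # assumption: a non-leap year


def daily_drift_seconds(minutes_per_year):
    """Spread a yearly drift (in minutes) evenly over the days of a year."""
    return minutes_per_year * SECONDS_PER_MINUTE / DAYS_PER_YEAR


if __name__ == "__main__":
    # Six minutes a year is the figure quoted from the Read This First notes.
    print(round(daily_drift_seconds(6), 3))  # prints 0.986
```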
limes@ouroborous.eng.sun.com (Greg Limes) (06/07/90)
There has been enough traffic relating to this problem recently that perhaps a definitive statement from the engineer involved would be in order, especially since this involves a direct binary kernel patch -- as someone mentioned earlier, you want to be completely and absolutely sure of your sources on this stuff.

Here are some excerpts from the bug report involved; since I wrote 99% of the report, I don't expect anyone will begrudge the extraction ... and having written this up once before, there is no need for me to write new text now.

Yes, there are other problems with the clock; I hate it when people ask me why their $5 watch tells time better than their $50k computer. No, I do not know what the problem is; with this patch in place, the kernel's notion of the time can drift a couple of seconds from the chip, but over the long term the accuracy is determined by a chip that is supposed to be accurate.

Public Summary: Once a week, SPARCsystem 300 series machines will run their real-time clocks alternately 10% fast and 10% slow attempting to synchronize to a time calculated improperly from the clock chip.

Description: Manufacturing has been complaining that once a week, all of their Stingrays' clocks start going bonkers. First they run slow for a while, then they run fast for a while, and so on (or vice-versa). This always starts at 5pm, and is reported by the date comparison script as a deviation from the server.

Evaluation: The support routine in the kernel that generates the unix-format time (seconds past 00:00GMT 01JAN70) does some sanity checking on the data, and returns a zero if the clock appears to be uninitialized. Part of this check is to see if the WEEKDAY is out of the restricted range from 0 through 6. This range needs to be widened, since the TOD chip returns an entirely different range, 1 through 7.

Work around: To get around this bug, reset the time of day just after midnight GMT (i.e. 5pm PST).
The easiest way to do this is to log in as "root", do a "crontab -e", and add the following line to the list:

    0 17 * * * /bin/date -a 1

For time zones other than PST, you will need to adjust the hour that this happens. For instance, for systems running in Eastern Standard Time, this would be

    0 21 * * * /bin/date -a 1

Alternately, one can apply the following patch to the SunOS 4.0.3 kernel (SPARC only!) to fix the problem:

To patch a running kernel: (will be fixed until next reboot)

    # adb -w -k /vmunix /dev/mem
    todget+0x1e0/X
    <old value should be 80a3e006>
    todget+0x1e0/W 80a3e007
    $q
    #

To patch the kernel on disk:

    # adb -w /vmunix
    todget+0x1e0?X
    <old value should be 80a3e006>
    todget+0x1e0?W 80a3e007
    $q
    #

To patch future kernels:

    # adb -w /sys/sun4/OBJ/clock.o
    todget+0x1e0?X
    <old value should be 80a3e006>
    todget+0x1e0?W 80a3e007
    $q
    #

Greg Limes limes@sun.com ...!sun!limes 73327,2473 CGDB02A [choose one]
A Fool and his Money are soon Partying ...
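The range check described in the Evaluation can be modelled with a small sketch. This is a Python illustration, not the kernel's source: the function names, the sample timestamp, and the idea that the check is a simple upper bound are all assumptions. The only detail taken from the patch is that it changes the last hex digit of one word from 6 to 7, which would be consistent with raising an upper limit from 6 to 7.

```python
def weekday_is_sane(weekday, max_weekday):
    """Hypothetical model of the sanity check: accept 0 through max_weekday."""
    return 0 <= weekday <= max_weekday


def unix_time_or_zero(seconds, weekday, max_weekday):
    """Return the computed time, or 0 if the clock looks uninitialized."""
    if not weekday_is_sane(weekday, max_weekday):
        return 0
    return seconds


if __name__ == "__main__":
    sample_time = 644000000  # arbitrary example value, not from the report
    for weekday in range(1, 8):  # the chip is said to report 1 through 7
        before = unix_time_or_zero(sample_time, weekday, max_weekday=6)
        after = unix_time_or_zero(sample_time, weekday, max_weekday=7)
        print(weekday, before, after)
```

With the limit at 6, the model returns zero only on the day the chip reports 7, which fits the once-a-week pattern in the Description; with the limit at 7, every value the chip reports passes.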