P.E.Smee@gdr.bath.ac.uk (05/29/90)
We're having problems with the real-time clock on a SPARCserver 330 (SunOS 4). Basically, every so often it slips by an (apparently) randomish amount of time. After losing a couple seconds a day all last week (which I could live with) it decided on Monday night to suddenly set itself back by an hour and 13 minutes.

Anyone else seen anything like this? Any ideas where to start looking? It doesn't happen often enough for us to have detected a pattern yet.

Things we *think* we have checked and eliminated (unless we're being hacked by someone clever enough to clean up their tracks):

1) Doesn't appear to be due to power outages. (The supply is a notionally clean UPS anyway.)

2) Access on 'date' and relatives looks good; no unaccounted uses of 'date' in lastcomm. Similarly, no reboots at a time which could explain it. No unaccounted-for logins.

3) We don't believe we are configured to allow the time to be set by remote machines. The server only 'knows' (at TCP/IP level) about two other machines anyway, both of which have the right time and are under our control. (Might we be missing something here?)

4) Our crontabs look good to us.

Paul Smee, Computing Service, University of Bristol, Bristol BS8 1UD, UK
P.Smee@bristol.ac.uk - ..!uunet!ukc!bsmail!p.smee - Tel +44 272 303132
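On the point above about not yet having detected a pattern: one way to catch a sudden step like the 1 hour 13 minute setback is to log the time-of-day clock alongside an independent elapsed-time counter and flag any interval where the two disagree. What follows is only a minimal sketch in present-day Python, not something that would have run on the SunOS 4 machine in question. The names `detect_jumps` and `take_sample` and the one-second threshold are illustrative assumptions. Note also that on a given system the monotonic counter may be driven by the same tick source as the kernel clock, in which case this would catch a step but not a slow loss of ticks.

```python
import time


def take_sample():
    """Return one (monotonic_seconds, wall_clock_seconds) reading."""
    return (time.monotonic(), time.time())


def detect_jumps(samples, threshold=1.0):
    """Flag intervals where the wall clock and the monotonic clock disagree.

    samples   -- list of (monotonic_seconds, wall_clock_seconds) pairs,
                 in the order they were taken
    threshold -- disagreement in seconds above which an interval is reported
                 (hypothetical default; tune to taste)

    Returns a list of (index, discrepancy) tuples, where index is the
    position of the later sample and a negative discrepancy means the
    wall clock fell behind (or was set back).
    """
    jumps = []
    for i in range(1, len(samples)):
        mono_prev, wall_prev = samples[i - 1]
        mono_now, wall_now = samples[i]
        # How far the wall clock moved, minus how much time really elapsed.
        discrepancy = (wall_now - wall_prev) - (mono_now - mono_prev)
        if abs(discrepancy) > threshold:
            jumps.append((i, discrepancy))
    return jumps
```

Collecting a `take_sample()` reading every minute or so and running `detect_jumps` over the log would at least give the time of each step, which could then be checked against cron runs, logins, and network activity.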
exspes@gdr.bath.ac.uk (P E Smee) (06/01/90)
In article <8245@brazos.Rice.edu> P.E.Smee@gdr.bath.ac.uk writes:
>X-Sun-Spots-Digest: Volume 9, Issue 188, message 3
>
>We're having problems with the real-time clock on a SPARCserver 330 (SunOS
>4).

Had several responses to my plea. I've tried to respond to them all, but a couple of them bounced. Many were 'if you find out, tell me'. There were enough of those that I suspect there are even more I haven't heard from, so would like to also post my synopsis in the news.

Scott Leadley (cc.rochester.edu) offers:

> This is a known problem with SunOS 4.0.n on the Sun 4/330. Sun may
> have a bug fix for it, but the workaround that they gave us (a
> couple of months ago) is to put:
>
>     0 21 * * * /bin/date -a 1 1> /tmp/datelog 2>&1
>
> in the root crontab.

His surmise is that the hardware TOD clock remains correct (because his system seems to come up with the right time if you reboot), and that it is the software 'kernel clock' which slips. And that date -a might force the kernel to resync with the hardware. From Sun's response (below) this sounds likely. Though I would expect that it wouldn't STOP the jumps, but simply force you back into sync now and again so you didn't get too far out. (On the other hand ours only jumps once or twice a week, so maybe a daily resync would do it.)

Ralph Finch (California Dept of Water Resources) mentioned that there were patches for this. Actually, he even offered to send them. I passed on this. Our root is paranoid and won't allow anything on the system unless he personally knows the originator (or at least who he can sue :-), or can check it myself. And I can't expect the other staff people to obey my rules unless I do.

However, this was still very useful, as it meant that it wasn't us doing something stupid. It's a new machine type for us. We couldn't believe that a 30000 pound (sterling, more or less -- not sure what we actually spent) machine couldn't keep time as well as my 30 pound wristwatch, so were convinced it was us.

Armed with this, we went after Sun technical support hotline. They provided us with a one location patch -- in fact, a 1-bit patch -- to apply in 3 places. (Object patch; we didn't get a source license.) I don't expect you to trust me any more than I trust strangers, so I'm not gonna tell you what it is. Besides, I don't know what, if any, mods you've made to things, or if UK SunOS is identical to US SunOS. Ring your Sun support center.

The official Sun description is:

| There is a bug in SunOS 4.0.3 which causes the Sun 4300 processor board to
| be unable to synchronize the kernel's notion of the time of day with the
| TOD chip.
|
| This applies ONLY to SunOS 4.0.3 for the Sun 4 ...

Tell them you know about their guilty secret, and you want the answer.

Thanks everyone. Cheers...

Paul Smee, Computing Service, University of Bristol, Bristol BS8 1UD, UK
P.Smee@bristol.ac.uk - ..!uunet!ukc!bsmail!p.smee - Tel +44 272 303132
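
For anyone unsure how to read the crontab workaround quoted earlier in this post: the entry is five schedule fields followed by a command. The sketch below, in present-day Python, just splits the line into those parts. `split_crontab_entry` is a hypothetical helper written for illustration, not part of cron or any library, and it does no validation of the schedule fields.

```python
def split_crontab_entry(line):
    """Split a crontab line into its five schedule fields and the command.

    Illustrative helper only: it does not expand ranges, lists, or '*'.
    """
    # The first five whitespace-separated fields are the schedule;
    # everything after them is the command, passed to the shell as-is.
    fields = line.split(None, 5)
    if len(fields) < 6:
        raise ValueError("expected five schedule fields and a command")
    names = ("minute", "hour", "day_of_month", "month", "day_of_week")
    entry = dict(zip(names, fields[:5]))
    entry["command"] = fields[5]
    return entry


# The workaround line exactly as quoted in the post.
entry = split_crontab_entry("0 21 * * * /bin/date -a 1 1> /tmp/datelog 2>&1")
for name, value in entry.items():
    print(f"{name:13} {value}")
```

Read that way, the entry fires at minute 0 of hour 21 on every day, i.e. once a day at 21:00. The `1> /tmp/datelog` part sends the command's standard output to that file and `2>&1` sends standard error to the same place, so whatever 'date' prints ends up in /tmp/datelog.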