ken@gvax.cs.cornell.edu (Ken Birman) (10/03/89)
Ellen Vorhees noted that ISIS logs on her system sometimes contain large numbers of messages like these: ... Time is now Mon Oct 2 10:22:59 1989 Time warp! Adjusting timers by 1.943 secs Time warp! Adjusting timers by 2.671 secs Time warp! Adjusting timers by 5.622 secs ... Time is now Mon Oct 2 10:25:05 1989 Time warp! Adjusting timers by 2.507 secs ... Time is now Mon Oct 2 10:26:58 1989 Time warp! Adjusting timers by 1.366 secs She asks if this is normal for ISIS, and what it means. Well, it isn't normal. The message arises as follows. The ISIS protocols process, bin/protos, spends its time in a main loop where it does a select() waiting for messages to arrive, with a timeout that wakes it up when it should next take a timer based action. If it wakes up AFTER the timeout should have expired, protos complains about a "time warp" and prints the delay. Thus, a warp of 2.302 seconds would mean that protos expected a timeout to interrupt its select 2.302 seconds ago, and was surprised that it didn't. Protos always adjusts its timers when this happens so that they don't all timeout a once, before it has time to read any messages that might now be on the input queues. For delays below about 1.25 seconds nothing is printed at all. Normally, you get one message like this a day, around 2am when UNIX runs its clock demon. This demon resets the system clock, and ISIS notices the jump next time it runs. Aside from printing a message, there should be no visible effect at all. A log with many "Time warp" messages means something odd is going on. This would normally be a sign that some process has gotten very large and is paging heavily (thrashing). On some versions of UNIX, it could also mean that there is a UNIX bug in the select(2) system call, such that the timeout doesn't always work. Or, it might be due to someone using "nice" to run a compute-bound process at very high priority, starving protos for CPU cycles. To look for an explanation, you might start by running vmstat and watching for a pattern -- like very high paging rates. With 'ps' you can see how large processes have gotten, and with the ISIS snapshot commands you can even get some memory statistics. Ken