lindsay@MATHOM.GANDALF.CS.CMU.EDU (Donald Lindsay) (09/04/90)
In article <26196@bellcore.bellcore.com> mo@bellcore.com (Michael O'Dell) writes: [describing the timer facilities of the Prisma] >TOE - Time Of Eternity clock > 64-bit up-counter, zeroed at machine reset but not randomly > loadable, incrementing continuously at the machine cycle clock > rate, readable with one alternate-address-space-load-double. >QCT - Quantum Count-down Timer > 32-bit down-counter which generates an interrupt when the counter > goes to zero. The upcounter idea is just right. The downcounter, however, has problems. To explain why this is so, let me tell you about repetitive scheduling. Many real time systems have tasks which must be performed every T ticks, and schedule creep isn't acceptable. That is, the 10th event can happen at 10T + epsilon, but it mustn't happen at 10T + 10 epsilon. [There are several valid reasons for this. Perhaps the task will sample data, which we want to graph. Perhaps we want to FFT the data - which also puts bounds on epsilon. Perhaps we have to preserve a statically determined relationship between N such tasks.] If the clocks are coarse, repetitive scheduling is impossible or else easy. The problem introduced by a fine clock is determining the _exact_ value to put in the downcounter. Some software has figured out the time of the next event: but then we have to know what Now is. On some machines, Now is hard. The time Now was scheduled to be, has gone past: the last interupt's latency has intervened. Worse, this latency is affected by interrupt lockout - perhaps the kernel didn't want to be interrupted. On some machines, latency even depends on who was interrupted - e.g. the floating point registers are only saved if the interrupted task was using them. Plus, there's the usual hardware stuff like memory contention. The crude answer is: Now = InterruptScheduledTime + FudgeFactor; where the fudge factor makes the average about right. The Prisma design is obviously better - just consult the TOE register. But are we out of the woods? No: if we code something like NextTime = ...the minimum of some set of times...; DownCountRegister = NextTime - TOE_Register; then we still get schedule creep. The assignment takes a nonzero number of clocks, and worse, it may take a variable number, given cache misses, bus contention, higher priority interrupts, whatever. So it's back to DownCountRegister = NextTime - TOE_Register - small_fudge_factor; which will (of course) become wrong when you switch compilers... We could eliminate the whole headache by scrapping downcounters, and instead having a comparator on the upcounter. When the upcounter equals the value in its Compare Register, then there's an interrupt. Period. I'm asking for _very_ simple hardware. There's no need for greater-than semantics, because the kernel software can postcheck for that (if TOE is there to check against): NextTime = ...the minimum of some set of times...; TOE_CompareRegister = NextTime; /* COWABUNGA */ if( NextTime <= TOE_Register ){ /* check versus Now */ ...unset the compare hardware.. ...mark the process as runnable... } (For five extra points: why code a postcheck, not a precheck?) In summary: I'm recommending some very simple hardware: an upcounter, a latch, and an interrupt when they're equal. But the low price isn't the virtue: the virtue is that the handler can be Right instead of just fudged. -- Don D.C.Lindsay
jgk@osc.COM (Joe Keane) (09/06/90)
In article <10383@pt.cs.cmu.edu> lindsay@MATHOM.GANDALF.CS.CMU.EDU (Donald Lindsay) writes: >But are we out of the woods? No: if we code something like > > NextTime = ...the minimum of some set of times...; > DownCountRegister = NextTime - TOE_Register; > >then we still get schedule creep. I don't follow you here. Whether or not we get schedule creep depends only on how we compute NextTime, which is an absolute time value. As long as this is computed from the last desired NextTime, and not from when the last interrupt was actually handled, we should be OK. There is some error in setting DownCountRegister, but it does not accumulate. The same issue comes up with the Unix select system call. It takes a time interval as a parameter, while i think it should take an absolute time.
lindsay@gandalf.cs.cmu.edu (Donald Lindsay) (09/07/90)
In article <3760@osc.COM> jgk@osc.UUCP (Joe Keane) writes: >> NextTime = ...the minimum of some set of times...; >> DownCountRegister = NextTime - TOE_Register; >> >>then we still get schedule creep. >I don't follow you here. Whether or not we get schedule creep >depends only on how we compute NextTime, which is an absolute time >value. You are correct, I misspoke. This case gives jitter, not creep, because the TOE upcounter is always there to refer to, and in absolute units. (As someone else pointed out, higher level software is the correct place to keep the mapping of absolute ticks to absolute real-world time units.) While I'm at it, I'll point out that the code above is pseudocode. Those are 64-bit quantities, which one obtains and manipulates piecemeal. Also, there's usually some futzing with the control registers of some faraway chip. Hence, it's that much more of a pain to come up with the fudge factors. Plus, if the code is interruptible, it's hard to nail down the average case and the upper bound. I'm sure that someone out there is saying "So, lock out interrupts". That's bad: one should never lock out interrupts, given a choice. (It makes other handler's worst-cases worse.) (Nor is it good to design something that _has_ to be the highest priority device.) The beauty of the upcounter/comparator design, is that the handler _doesn't_ get more complicated (or impossible) if it has to be interruptable. The pseudocode I gave will work regardless. To be fair to the downcounter design, there are timer chips out there with a halfway fix. They have a reload register, and the downcounter is atomically reloaded from it at the interrupt. It has been pointed out to me that this fix is quite good, if you can leave some single value permanently in the reload register. However, if the software has to reload the reload register, then there's a race. -- Don D.C.Lindsay
aglew@crhc.uiuc.edu (Andy Glew) (09/07/90)
Terminology: We want regularly scheduled events at times n*Tp Jitter errors give us events at times n*Tp+epsilon_n, where epsilon_n is some small random variable. Drift errors are of the form n*Tp + SUM for i from 1 to n of epsilon_i. Note that since epsilon_i may not have a symmetric distribution the errors may accumulate. Some of the previous posters have called drift errors creep errors. Let's not get too hung up on terminology. Some jitter is inherent in any timer event (except perhaps hardware signals caused by the timer) due to software indeterminacy, caches, bus traffic, etc. Drift, however, can be compensated for in most conventional timers. In a count-down timer, you can try to compensate for drift if the timer continues to count down. You read the count down timer, compare it to what it should have been, use it to adjust the next timer value you write. Essentially, this gives you twice the jitter, but you can adjust for drift long term. If you have a reload register, it can be set for regular periods of one cyclic process. But if you have two cyclic processes of incommensurate cycles, you have to reload the timer register anyway, which gives the same increased jitter problem. If you have an atomic exchange with timer operation, you can use this to adjust for drift - it's equivalent to reading the underflowing timer register, with reduced indeterminacy between the timer read and the timer load. Atomic add to timer can be used in much the same way. All of these let you compensate for this cycle's error on the next cycle, giving you essentially twice the jitter. As Donald Lindsey has written, an absolute timer with comparators can be used to eliminate drift. It also reduces jitter, in that you only have to worry about the indeterminacy of one timer interrupt. A more serious example of drift is between two separate time sources - like when your "microsecond" timer is not quite in agreement with the wall clock. Typically, then you have to use the same sort of time warping methodologies that the fellow from Wien was talking about. But then you have to worry about distinguishing between people who really want your best approximation of a rate (I want this done 1000 times a second for all of this satellite's pass-over) and those who want your best approximation of an exact time. -- Andy Glew, a-glew@uiuc.edu [get ph nameserver from uxc.cso.uiuc.edu:net/qi]