josh@klaatu.rutgers.edu (J Storrs Hall) (04/27/89)
One commonly reads in articles about arbiter circuits that it has been proven that the problem of metastability cannot be completely avoided. Is there an actual proof anywhere? --JoSH
rpw3@amdcad.AMD.COM (Rob Warnock) (04/27/89)
josh@klaatu.rutgers.edu (J Storrs Hall) writes:
+---------------
| One commonly reads in articles about arbiter circuits that
| it has been proven that the problem of metastability cannot
| be completely avoided. Is there an actual proof anywhere?
| --JoSH
+---------------

The way I heard it, just as momentum & position are "conjugate" quantities subject to Heisenberg Uncertainty limits if you try to measure both at the same time, so are energy & time. A synchronizer tries to measure with absolute precision whether an energy (the "AND" of data and clock, typically) is above or below a threshold, and tries to do the measurement in a finite time. You can't do both. So that's the impossibility, at some very fundamental level. But most real synchronizers have failure rates far worse than the Heisenberg limit...

Most of the circuits I've seen that claimed to "solve" the synchronizer problem either (1) simply pushed the energy-threshold decision around to a part of the circuit where you normally wouldn't think to look for it ("solution" by sweeping under the rug -- but the dirt's still there), or (2) "hide" the fact that they can delay making a decision for a while in some cases.

The real trick to making a good (not perfect) synchronizer is getting a latch stage with a very high gain-bandwidth product "around the loop". This shows up as a small "rho" parameter in the MTBF equation. There are published papers (especially the one by Tom Chaney, at Washington University in St. Louis) which give measured values of "delta" (a.k.a. "t0") and "rho" for various commercial parts. (The Fairchild 74F74 & 74F374 and AMD Am298xx parts are pretty good. I generally don't use anything else.)

You can also make a two-stage synchronizer, where the second stage can fail only if the first stage comes out of metastable just as the second stage clocks. [You have to bias the output of the first stage so a first-stage metastable looks like a clean one or zero to the second stage.]
Depending on your environment, this is sometimes better than clocking a single-stage synchronizer at 1/2 the rate.

Rob Warnock
Systems Architecture Consultant

UUCP: {amdcad,fortune,sun}!redwood!rpw3
DDD: (415)572-2607
USPS: 627 26th Ave, San Mateo, CA 94403
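[The MTBF equation Rob mentions is commonly written in the form below. This is a sketch for illustration only: the parameter names follow Rob's post ("rho" for the resolution time constant, "t0"/"delta" for the metastability window), but the numeric values are invented, not measured data for any real part.]

```python
import math

def synchronizer_mtbf(t_resolve, rho, t0, f_clock, f_data):
    """One common form of the synchronizer MTBF equation:

        MTBF = exp(t_resolve / rho) / (t0 * f_clock * f_data)

    rho:  resolution time constant (small when the latch has a high
          gain-bandwidth product "around the loop")
    t0:   metastability window parameter (a.k.a. "delta")
    """
    return math.exp(t_resolve / rho) / (t0 * f_clock * f_data)

# Hypothetical numbers, for illustration only:
mtbf = synchronizer_mtbf(t_resolve=40e-9, rho=1.5e-9, t0=1e-9,
                         f_clock=10e6, f_data=1e6)
print(f"MTBF = {mtbf:.3e} seconds")
```

[Note how the exponential dependence on t_resolve/rho rewards even a modest improvement in the latch's gain-bandwidth product.]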
afscian@violet.waterloo.edu (Anthony Scian) (04/27/89)
In article <Apr.26.15.07.22.1989.3661@klaatu.rutgers.edu> josh@klaatu.rutgers.edu (J Storrs Hall) writes:
>One commonly reads in articles about arbiter circuits that
>it has been proven that the problem of metastability cannot
>be completely avoided. Is there an actual proof anywhere?

Yes, but I'm not sure where it was proven; try these references:

T.J. Chaney, "A Comprehensive Bibliography on Synchronizers and Arbiters," Technical Memorandum #306C, Institute for Biomedical Computing, Washington University, St. Louis

T.J. Chaney and C.E. Molnar, "Anomalous Behavior of Synchronizer and Arbiter Circuits," IEEE Transactions on Computers, Vol. C-22 (1973), pp. 421-422

L.R. Marino, "General Theory of Metastable Operation," IEEE Transactions on Computers, Vol. C-30, No. 2 (1981), pp. 107-115

Most of the work is being done at Washington University; you should contact either Chaney or Molnar for further information.

Anthony
//// Anthony Scian afscian@violet.uwaterloo.ca afscian@violet.waterloo.edu
//// "I can't believe the news today, I can't close my eyes and make it go away" -U2
segall@caip.rutgers.edu (Ed Segall) (04/28/89)
rpw3@amdcad.UUCP writes:
> A synchronizer tries to measure with absolute
> precision whether an energy (the "AND" of data and clock, typically) is above
> or below a threshold, and tries to do the measurement in a finite time. You
> can't do both. So that's the impossibility, at some very fundamental level.
> But most real synchronizers have failure rates far worse than the Heisenberg
> limit...

From your description, this doesn't seem to prove that metastability is necessary. If the state of a line is 0, and it asynchronously changes to 1, a carefully designed synchronizer wouldn't mind if the transition isn't noticed on the first succeeding clock edge. Rather, it would want either a clean 0 or a clean 1. If the line stays 1, it would definitely want to see a clean 1 by the next edge. Notice that a consequence of this is that asynchronous pulses shorter than one clock period are not guaranteed to be noticed. Of course, if you want to catch short pulses, you would put a pulse catcher in front of the synchronizer.

What your 'uncertainty' explanation seems to imply is that it is impossible for a synchronizer to always give the right answer - e.g. that it might say zero when it should say 1. This would seem to be a fatal flaw unless you confine the errors to transitions only (as I explained above). I think most systems can be designed to handle a one-cycle delay in noticing _valid_ transitions. What they can't handle is invalid logic levels, which may result from metastable states.

Of course, since I haven't read the paper you referenced, I can't tell if it covers this situation. Would you post more detailed references? Is the paper you refer to actually: Chaney, T.J., Littlefield, W.M., and Ornstein, S.M., "Beware the Synchronizer," in Digest of Papers of the Sixth Annual IEEE Computer Society International Conference, San Francisco, CA, Sept. 1972? It's referenced in Fletcher.
If anyone out there would like a simple explanation of why metastable states occur when flip-flops are used as synchronizers, see: Hodges and Jackson, "Analysis and Design of Digital Integrated Circuits," McGraw-Hill, 1983.

--Ed
--
uucp: {...}!rutgers!caip.rutgers.edu!segall
arpa: segall@caip.rutgers.edu
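[The textbook explanation referenced above boils down to a simple model, sketched here. This is not from the thread: near the metastable point a cross-coupled latch behaves roughly like dv/dt = v/tau, so a small initial imbalance v0 grows as v(t) = v0*exp(t/tau), and the time to reach a valid logic level grows without bound as v0 shrinks. The tau and v_swing values are invented for illustration.]

```python
import math

def resolution_time(v0, tau, v_swing):
    """Time for an initial imbalance v0 to regenerate to a full
    logic swing, given v(t) = v0 * exp(t/tau)."""
    return tau * math.log(v_swing / v0)

tau = 1e-9       # assumed regeneration time constant (1 ns)
v_swing = 1.0    # assumed voltage needed for a clean logic level

# Each 1000x reduction in the initial imbalance adds only a fixed
# increment (tau * ln(1000)) of resolution time -- but since v0 can be
# arbitrarily small, no fixed decision deadline can always be met.
for v0 in (1e-3, 1e-6, 1e-9):
    print(f"v0 = {v0:.0e} V -> t = {resolution_time(v0, tau, v_swing)*1e9:.1f} ns")
```

[This is why the failure probability is exponential in the time allowed, rather than ever reaching zero.]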
henry@amdcad.AMD.COM (Henry Choy) (04/28/89)
In article <25423@amdcad.AMD.COM> rpw3@amdcad.UUCP (Rob Warnock) writes:
>josh@klaatu.rutgers.edu (J Storrs Hall) writes:
>+---------------
>| One commonly reads in articles about arbiter circuits that
>| it has been proven that the problem of metastability cannot
>| be completely avoided. Is there an actual proof anywhere?
>| --JoSH
>+---------------
>
>The real trick to making a good (not perfect) synchronizer is getting
>a latch stage with a very high gain-bandwidth product "around the loop".

The GBP also depends heavily on the capacitive loading at the output of the loop. If the synchronizer is designed on chip, I'll try to minimize the caps (diffusion, interconnect, fanout) by using a small output driver, and not let anything else touch the cross-coupled nodes.

>You can also make a two-stage synchronizer, where the second stage can
>fail only if the first stage comes out of metastable just as the second
>stage clocks. [You have to bias the output of the first stage so a first-
>stage metastable looks like a clean one or zero to the second stage.]
>Depending on your environment, this is sometimes better than clocking
>a single-stage synchronizer at 1/2 the rate.

My feeling is that the propagation delay of the second stage significantly reduces the available time for synchronization. So instead of having a probability of failure (with respect to the available time) of Pr(T - Tprop), the two-stage synchronizer has a probability of approximately Pr(T/2 - Tprop)**2. But Pr(t) is exponential:

    Pr(t) = K e^(-t/tau)        (tau = sqrt(1/GB))

    Pr(T/2 - Tprop) = Pr(T - Tprop) * e^(+T/(2 tau))

So,

    (Pr(T/2 - Tprop))^2
    ------------------- = K e^(Tprop/tau)
      Pr(T - Tprop)

Typically, Tprop >> tau and K is O(1).

Just my opinion, Rob.

A paper by L. Marino (IEEE Trans. on Comp., Feb. '81) and another by Lindsay Kleeman & A. Cantoni (same, Jan. '87) have proofs on metastability problems. Several papers on metastability (all published in JSSC) deserve credit: Veendrick (4/'80), Flannagan (8/'85), Sakurai (8/'88).

Henry Choy
Advanced Micro Devices, Inc.
henry@amdcad.AMD.com
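[Henry's algebra can be checked numerically. This is a sketch only; K, tau, T, and Tprop below are made-up values chosen so the arithmetic doesn't underflow, not parameters of any real part.]

```python
import math

def pr_fail(t, K, tau):
    # Henry's exponential failure model: Pr(t) = K * exp(-t/tau)
    return K * math.exp(-t / tau)

K = 1.0
tau = 0.5e-9     # assumed resolution time constant
T = 5e-9         # assumed clock period
Tprop = 1e-9     # assumed second-stage propagation delay

single = pr_fail(T - Tprop, K, tau)          # one stage, full period
double = pr_fail(T/2 - Tprop, K, tau) ** 2   # two stages, half period each

# Henry's result: the ratio should equal K * exp(Tprop/tau),
# i.e. the two-stage scheme is WORSE by that factor when Tprop >> tau.
ratio = double / single
print(ratio, K * math.exp(Tprop / tau))
```

[With Tprop = 2*tau the ratio comes out to e^2, in agreement with the closed form above.]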
jjb@sequent.UUCP (Jeff Berkowitz) (04/28/89)
In article <25423@amdcad.AMD.COM> rpw3@amdcad.UUCP (Rob Warnock) writes:
>You can also make a two-stage synchronizer, where the second stage can
>fail only if the first stage comes out of metastable just as the second
>stage clocks. [You have to bias the output of the first stage so a first-
>stage metastable looks like a clean one or zero to the second stage.]
>Depending on your environment, this is sometimes better than clocking
>a single-stage synchronizer at 1/2 the rate.

I believe that Digital used to take all 74S74 parts (back when they were an important part of real CPUs) and run a very precise test on (Tsetup + Thold) - the actual length of the "window" surrounding the positive-going clock edge during which data had to be stable in order to get a clean "1" on Q in a guaranteed (short) time. Real parts were quite variable with respect to this parameter. (The spec for the 74S74 is 3ns setup, 2ns hold from my TTL data book - supposedly they found that some parts were three orders of magnitude better, in the area of a few picoseconds.) The ones that were particularly good were given a special part number and were then used as the second stage of the circuit described above. This served as a practical engineering approach to minimizing the failure rate.

What do current VLSI designers do to minimize the likelihood of metastable failure?
--
Jeff Berkowitz N6QOM uunet!sequent!jjb
Sequent Computer Systems Custom Systems Group
henry@amdcad.AMD.COM (Henry Choy) (04/29/89)
In article <Apr.27.15.59.55.1989.4457@caip.rutgers.edu> segall@caip.rutgers.edu (Ed Segall) writes:
>From your description, this doesn't seem to prove that metastability
>is necessary. If the state of a line is 0, and it asynchronously
>changes to 1, a carefully designed synchronizer wouldn't mind if the
>transition isn't noticed on the first succeeding clock edge. Rather,
>it would want either a clean 0 or a clean 1. If the line stays 1, it
>would definitely want to see a clean 1 by the next edge.

But even a carefully designed synchronizer would have a FINITE probability of failure, even if it is 10^-10. Those transitions that occur just as the synchronizer samples the line can ALWAYS happen.

>fatal flaw unless you confine the errors to be on transitions only (as
>I explained above). I think most systems can be designed to handle a
>one-cycle delay in noticing _valid_ transitions.

Then again, not all systems can handle a one-cycle delay. And even if you can use a one-cycle delay, the cycle time is shrinking over the years (20MHz -> 33MHz -> 45MHz -> ???) and will continue to shrink. Sooner or later one cycle of settling time isn't going to be worth much to a relatively slow synchronizer.

Henry Choy
Advanced Micro Devices, Inc.
Disclaimer: I do not represent the company on the net.
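[Henry's closing point can be made concrete with the usual exponential MTBF model. A sketch with invented parameters (tau, t0, t_prop, f_data are illustrative, not measured): one cycle of settling time buys exponentially less reliability as the clock speeds up, because the time left to resolve is 1/f_clock - t_prop.]

```python
import math

tau = 1.5e-9     # assumed resolution time constant
t0 = 1e-9        # assumed metastability window parameter
t_prop = 5e-9    # assumed flop propagation delay
f_data = 1e6     # assumed asynchronous event rate

def mtbf_at(f_clock):
    # settling time left in one cycle after the flop's propagation delay
    t_resolve = 1.0 / f_clock - t_prop
    return math.exp(t_resolve / tau) / (t0 * f_clock * f_data)

# MTBF collapses by many orders of magnitude across the 20 -> 45 MHz
# progression Henry mentions:
for f in (20e6, 33e6, 45e6):
    print(f"{f/1e6:.0f} MHz: MTBF = {mtbf_at(f):.2e} s")
```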