news@ti-csl.csc.ti.com (USENET News System) (11/20/89)
From: oh@m2.csc.ti.com (Stephen Oh)
Path: m2!oh

>No. If there's significant noise, you want to use ARMA estimators -- the MA
>process zeros are necessary for accurately representing the white noise
>itself. See, e.g., Steven M. Kay, Modern Spectral Estimation, ISBN
>0-13-598582-X, p. 131 -- he suggests using an ARMA(2,2) model to estimate
>the spectra of a sinusoid (actually an AR(2) process) in white noise.

Hmmm, good point. You are talking about the following system:

    x(t) = a1 x(t-1) + a2 x(t-2) + n(t)
    y(t) = x(t) + e(t)

where n(t) and e(t) are white noise processes. Then,

    y(t) = a1 x(t-1) + a2 x(t-2) + n(t) + e(t)
         = a1 (y(t-1) - e(t-1)) + a2 (y(t-2) - e(t-2)) + n(t) + e(t)
         = a1 y(t-1) + a2 y(t-2) + n(t) + e(t) - a1 e(t-1) - a2 e(t-2)

This equation looks like an ARMA(2,2) model, even though it is not exactly
the ARMA(2,2) we want. But, what the heck, n(t) and e(t) are mutually
independent white noise processes. So, if e(t) is the dominating noise
process, y(t) appears to obey an ARMA(2,2) process, which is not EXACTLY an
AR(2) model.

>No. Note that ARMA models are no more "statistically stable" than DFTs --
>small variations in the input data may have large effects on the model
>parameters. See Kay p. 331, Figure 10.6(b) for an example.

Then what about Kay p. 327, Figure 10.5(b), which is another algorithm with
the same ARMA modeling? But I agree that this kind of parametric approach is
sensitive to small variations. As far as I understand, if you choose a
higher order for an ARMA model, you are safer from those effects.

>The reason that
>the FFT is done in segments usually has to do with available input storage
>or desired output resolution, and the reason that the segments are
    ^^^^^^^^^^^^^^^^^^^^^^^^^

What do you mean by desired output resolution via segments? A case where you
want lower resolution? The resolution achievable by the FFT is only 1/N,
where N is the number of samples. That's why we call AR or ARMA approaches
"high resolution frequency estimators."
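The algebra above can be checked numerically. The sketch below (the
coefficients a1, a2 and the noise variances are arbitrary illustrative
choices, not values from the thread) simulates an AR(2) process observed in
additive white noise and verifies that the AR residual of y(t) is exactly
the MA(2)-shaped quantity n(t) + e(t) - a1 e(t-1) - a2 e(t-2):

```python
import random

random.seed(0)
a1, a2 = 1.5, -0.75    # arbitrary stable AR(2) coefficients (poles inside unit circle)
N = 1000
n = [random.gauss(0.0, 1.0) for _ in range(N)]   # process noise n(t)
e = [random.gauss(0.0, 0.5) for _ in range(N)]   # observation noise e(t)

# AR(2) state: x(t) = a1 x(t-1) + a2 x(t-2) + n(t)
x = [0.0] * N
for t in range(2, N):
    x[t] = a1 * x[t-1] + a2 * x[t-2] + n[t]

# Observation: y(t) = x(t) + e(t)
y = [x[t] + e[t] for t in range(N)]

# The AR residual of y is n(t) + e(t) - a1 e(t-1) - a2 e(t-2): a moving
# average of white noise, not white noise -- which is exactly why y behaves
# like an ARMA(2,2) process rather than an AR(2) process.
resid = [y[t] - a1 * y[t-1] - a2 * y[t-2] for t in range(2, N)]
ma = [n[t] + e[t] - a1 * e[t-1] - a2 * e[t-2] for t in range(2, N)]
```

The two sequences agree to floating-point precision, confirming the
derivation term by term.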
>But then again, it depends on what you want to estimate. If you "know" that
>your input data consists of two sinusoids in white noise, all the cost
>tradeoffs change, and I wouldn't be surprised to find that the ARMA model is
>cheaper, because you can use a very low-order model (ARMA(4,4)) to get a
>good estimate. The DFT *is* the ML spectral estimator, in the absence of
>any a priori model whatsoever of the input data, and it's very cheap to
>compute via the FFT. If you have a data model, the DFT is, as I understand
>it, a poorer choice.

Again, for the resolution of the PSD, parametric approaches are *A LOT*
better than FFT-based PSD.

>And again, if you only have 16 data points, and can't obtain more, order
>analysis is really uninteresting, since the size of the implicit constants
>dominates, and besides, neither method takes significant time. There,
>questions of what exactly it is you're looking for in those 16 data points
>become dominant, and will usually govern your choice of analytic techniques.

True.

+----+----+----+----+----+----+----+----+----+----+----+----+----+
| Stephen Oh    oh@csc.ti.com         |    Texas Instruments    |
| Speech and Image Understanding Lab. | Computer Science Center |
+----+----+----+----+----+----+----+----+----+----+----+----+----+
ashok@atrp.mit.edu (Ashok C. Popat) (11/23/89)
In article <98990@ti-csl.csc.ti.com> oh@m2.UUCP (Stephen Oh) writes:
>
>Again, for the resolution of the PSD, parametric approaches are *A LOT*
>better than FFT-based PSD.
>

In applications, you don't always have a good a priori formal model. Unless
you have a formal model that's *useful* for your application, parametric
estimation is worthless.

Suppose I gave you some data (say 10^6 samples) and told you that the source
was ergodic, but nothing else. How would you estimate the spectrum? If you
used an ARMA model, how would you decide what the order of the model should
be? Wouldn't you have much more confidence in an averaged-periodogram (i.e.,
DFT-based) estimate? I would.

Ashok Chhabedia Popat    MIT Rm 36-665    (617) 253-7302
rob@kaa.eng.ohio-state.edu (Rob Carriere) (11/23/89)
In article <1989Nov22.170850.21777@athena.mit.edu>, ashok@atrp.mit.edu
(Ashok C. Popat) writes:
> In applications, you don't always have a good a priori formal model.
> Unless you have a formal model that's *useful* for your application,
> parametric estimation is worthless.
>
> Suppose I gave you some data (say 10^6 samples) and told you that the
> source was ergodic, but nothing else. How would you estimate the
> spectrum? If you used an ARMA model, how would you decide what the
> order of the model should be? Wouldn't you have much more confidence
> in an averaged-periodogram (i.e., DFT-based) estimate? I would.

Not necessarily. The DFT is quite good at some things, not at others. If you
give me recorded data that I can play with for a while, I would probably run
an FFT, several different periodograms, ARMA or Prony models of several
orders, and whatever else the data made me feel like. After doing all that,
I'd feel reasonably confident I could tell you something about your data.

If averaged periodograms showed different behavior in different segments of
the data, that means you also want to look at parametric models over subsets
of the data.

In short, if I knew that little, no ONE technique would make me happy. And
finally, the fact that the DFT is non-parametric does not mean that you
aren't making assumptions about the data (in fact, you're assuming
periodicity -- something that doesn't always make sense either).

SR
"But the real reason is, I just like to play."
ashok@atrp.mit.edu (Ashok C. Popat) (11/27/89)
In article <3589@quanta.eng.ohio-state.edu> rob@kaa.eng.ohio-state.edu (Rob
Carriere) writes:
>In article <1989Nov22.170850.21777@athena.mit.edu>, ashok@atrp.mit.edu
>(Ashok C. Popat) writes:
>> In applications, you don't always have a good a priori formal model.
>> Unless you have a formal model that's *useful* for your application,
>> parametric estimation is worthless.
>>
>> Suppose I gave you some data (say 10^6 samples) and told you that the
>> source was ergodic, but nothing else. How would you estimate the
>> spectrum? If you used an ARMA model, how would you decide what the
>> order of the model should be? Wouldn't you have much more confidence
>> in an averaged-periodogram (i.e., DFT-based) estimate? I would.
>
>Not necessarily. The DFT is quite good at some things, not at others. If
>you give me recorded data that I can play with for a while, I would probably
>run an FFT, several different periodograms, ARMA or Prony models of several
>orders, and whatever else the data made me feel like. After doing all that,
>I'd feel reasonably confident I could tell you something about your data.

Sounds reasonable on the surface --- try a few well-known techniques, then
sort of mentally average the results to conclude something about the data.
The problem is that there is absolutely no justification for trying some of
the techniques. What you want is a consistent, unbiased estimate of the
spectrum of an (unknown) ergodic random process, given a bunch of samples.
Averaging periodograms (e.g., Welch's method) gives you a consistent,
asymptotically unbiased estimate. What a parametric technique gives you
depends strongly on the assumed model (which isn't given as part of the
problem).

>If averaged periodograms showed different behavior in different segments of
>the data, that means you also want to look at parametric models over subsets
>of the data.

Nope, ergodicity implies stationarity. You'd have to attribute the behavior
to chance.
>In short, if I knew that little, no ONE technique would make me happy.

Any consistent, unbiased, and efficient estimate should make you happy. An
estimate based on unfounded assumptions should not.

>And finally, the fact that the DFT is non-parametric does not mean that you
>aren't making assumptions about the data (in fact, you're assuming
>periodicity -- something that doesn't always make sense either).

You are making assumptions, but periodicity isn't one of them. Remember,
DFT-based spectral estimation *doesn't* mean simply computing the DFT of the
data. In fact, it is well known that a single periodogram (the magnitude
squared of the DFT) is a particularly shitty spectral estimate, since it is
biased and since its variance doesn't diminish with the amount of data you
use (see Oppenheim and Schafer, Chapt. 11). DFT-based spectral estimation
usually involves some sort of averaging of short, modified periodograms.

Now, an argument can be made that this type of estimation also assumes
something about the data, but the concept is subtle. I suggest Ronald
Christensen's "Entropy Minimax Sourcebook" for philosophical/technical
discussions of problems in statistical inference.
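The averaging of short, modified periodograms described above can be
sketched in a few lines. This is a bare-bones illustration of a Welch-style
estimate (the Hann window, segment length, and non-overlapping segments are
simplifying choices; practical implementations overlap segments and use an
FFT):

```python
import cmath, math, random

def welch_psd(data, seg_len):
    """Average modified periodograms over non-overlapping segments:
    window each segment, take |DFT|^2, normalize by window power, average."""
    w = [0.5 - 0.5 * math.cos(2 * math.pi * i / seg_len) for i in range(seg_len)]
    u = sum(wi * wi for wi in w)            # window power, for normalization
    nseg = len(data) // seg_len
    psd = [0.0] * seg_len
    for s in range(nseg):
        seg = [data[s * seg_len + i] * w[i] for i in range(seg_len)]
        for k in range(seg_len):            # direct DFT; use an FFT in practice
            X = sum(seg[i] * cmath.exp(-2j * math.pi * k * i / seg_len)
                    for i in range(seg_len))
            psd[k] += abs(X) ** 2 / (u * nseg)
    return psd

# For unit-variance white noise the estimate should hover near 1 in every
# bin; averaging over segments is what shrinks the per-bin variance.
random.seed(1)
white = [random.gauss(0.0, 1.0) for _ in range(4096)]
psd = welch_psd(white, 64)
```

With 64 segments averaged, the per-bin fluctuation is roughly 1/8 of a
single periodogram's, which is the consistency property being argued for.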
oh@m2.csc.ti.com (Stephen Oh) (11/27/89)
In article <1989Nov22.170850.21777@athena.mit.edu> ashok@atrp.mit.edu (Ashok
C. Popat) writes:
>In article <98990@ti-csl.csc.ti.com> oh@m2.UUCP (Stephen Oh) writes:
>>
>>Again, for the resolution of the PSD, parametric approaches are *A LOT*
>>better than FFT-based PSD.
>>
>
>In applications, you don't always have a good a priori formal model.
>Unless you have a formal model that's *useful* for your application,
>parametric estimation is worthless.
>
>Suppose I gave you some data (say 10^6 samples) and told you that the
>source was ergodic, but nothing else. How would you estimate the
>spectrum? If you used an ARMA model, how would you decide what the
>order of the model should be? Wouldn't you have much more confidence
>in an averaged-periodogram (i.e., DFT-based) estimate? I would.

Your assumption is too strong. You have 10^6 samples with ergodicity? What
if you have 10^6 samples that are only wide-sense stationary? What if you
have 10^6 samples that are only partially w.s.s.?

BTW, I said that parametric approaches are better than FFTs in terms of
resolution. If we have only 100 samples and the separation of two
frequencies is less than 0.01, there is no way to resolve the two
frequencies using any FFT-based method. But AR or ARMA can. :-) :-)

Also, there are several methods to determine the order of the model, such as
AIC, MDL, CAT, etc.
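The FFT half of the resolution claim is easy to demonstrate. The sketch
below (frequencies 0.200 and 0.206 are invented examples; their separation
of 0.006 is below the 1/N = 0.01 Fourier limit for N = 100) takes the DFT of
two equal-amplitude sinusoids and counts local maxima in the band of
interest -- the two tones appear as a single merged peak:

```python
import cmath, math

N = 100
f1, f2 = 0.200, 0.206    # separation 0.006 < 1/N = 0.01 (illustrative values)
x = [math.cos(2 * math.pi * f1 * t) + math.cos(2 * math.pi * f2 * t)
     for t in range(N)]

# Direct N-point DFT magnitude (an FFT gives the same values, just faster).
mag = [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / N) for t in range(N)))
       for k in range(N // 2)]

# Count local maxima in the bins around the two tones (bins 20 and ~20.6):
# with separation below 1/N, only one merged peak survives.
peaks = [k for k in range(17, 25) if mag[k] > mag[k-1] and mag[k] > mag[k+1]]
```

Repeating the experiment with a separation larger than 1/N (say f2 = 0.23)
produces two distinct local maxima, which is the Fourier resolution limit in
action.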
ashok@atrp.mit.edu (Ashok C. Popat) (11/29/89)
In article <99691@ti-csl.csc.ti.com> Stephen Oh writes:
>In article <1989Nov22.170850.21777@athena.mit.edu> ashok@atrp.mit.edu
>(Ashok C. Popat) writes:
>>Suppose I gave you some data (say 10^6 samples) and told you that the
>>source was ergodic, but nothing else. How would you estimate the
>>spectrum? If you used an ARMA model, how would you decide what the
>>order of the model should be? Wouldn't you have much more confidence
>>in an averaged-periodogram (i.e., DFT-based) estimate? I would.
>
>Your assumption is too strong. You have 10^6 samples with ergodicity?
>What if you have 10^6 samples that are only wide-sense stationary?
>What if you have 10^6 samples that are only partially w.s.s.?

I'm not exactly sure what you mean by "too strong" --- it's a "given" in the
problem. Are you saying that in many applications, waveforms cannot be
usefully modeled as ergodic? If so, I'll buy that. I guess I shouldn't have
used "ergodic," since that lumps too many assumptions together. How about
agreeing that any piecewise stationary process we discuss is ergodic over
each stationary piece (if I hadn't brought up ergodicity in the first place,
this would not have been worth mentioning, since we'd have to assume
piecewise ergodicity to infer anything at all).

That leaves the issue of stationarity. Now, from what I remember of
stochastic processes, wide-sense and strict-sense stationarity are the same
if you're dealing exclusively with second-order statistics (e.g., the power
spectrum). I guess then I could have described my hypothetical source as
being WSS over 10^6 samples. A poor model for speech and images, but
realistic in other applications.

>BTW, I said that parametric approaches are better than FFTs in terms of
>resolution. If we have only 100 samples and the separation of two
>frequencies is less than 0.01, there is no way to resolve the two
>frequencies using any FFT-based method. But AR or ARMA can. :-) :-)

Good point. I thought about this, and here's what I came up with.
The duration-bandwidth uncertainty principle says (for continuous-time
waveforms) that

    delta_t * delta_f >= 1/pi

where delta_t is the time window size and delta_f is the frequency
resolution (see William Siebert, _Circuits, Signals, and Systems_). I'm sure
a similar result applies in the discrete-time case, but I don't have a
reference off hand --- I'll assume it has the same form.

Now if you're starting with only 100 samples, the uncertainty principle says
that there's simply not enough information in the data to get a
high-resolution spectrum. If you do manage to get a high-resolution
spectrum, the necessary added information must have come from the model, not
the data. What do you think?

>Also, there are several methods to determine the order of the model, such
>as AIC, MDL, CAT, etc.

Any recommended reading on these techniques?
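One concrete discrete-time counterpart of this tradeoff: an N-point
rectangular window has a Dirichlet-kernel spectrum whose mainlobe is 2/N
wide null-to-null, so any N-sample DFT smears features separated by much
less than 1/N. A minimal check, assuming nothing beyond the DFT definition:

```python
import cmath, math

N = 100

def W(f):
    """Magnitude of the DTFT of the length-N rectangular window at
    frequency f (cycles/sample) -- the Dirichlet kernel."""
    return abs(sum(cmath.exp(-2j * math.pi * f * t) for t in range(N)))

peak = W(0.0)        # mainlobe peak: equals N
null = W(1.0 / N)    # first exact null: sum of all N-th roots of unity is 0
mid = W(0.5 / N)     # half a bin away: still large, i.e. inside the mainlobe
```

The first null sits at f = 1/N, so two spectral features closer than about
1/N fall under the same mainlobe and cannot be separated by windowed-DFT
methods -- consistent with the point being made above.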
rob@kaa.eng.ohio-state.edu (Rob Carriere) (11/29/89)
In article <1989Nov26.194904.1376@athena.mit.edu> ashok@atrp.mit.edu (Ashok
C. Popat) writes:
>In article <3589@quanta.eng.ohio-state.edu> rob@kaa.eng.ohio-state.edu (Rob
>Carriere) writes:
>>In article <1989Nov22.170850.21777@athena.mit.edu>, ashok@atrp.mit.edu
>>(Ashok C. Popat) writes:
>>> Unless you have a formal model that's *useful* for your application,
>>> parametric estimation is worthless.
>>> Suppose I gave you some data (say 10^6 samples) and told you that the
>>> source was ergodic, but nothing else.
>>Not necessarily. The DFT is quite good at some things, not at others. If
>>you give me recorded data that I can play with for a while, I would
>>probably run an FFT, several different periodograms, ARMA or Prony models
>>of several orders, and whatever else the data made me feel like. After
>>doing all that, I'd feel reasonably confident I could tell you something
>>about your data.
>
> Sounds reasonable on the surface --- try a few well-known techniques,
> then sort of mentally average the results to conclude something about
> the data. The problem is that there is absolutely no justification
> for trying some of the techniques. What you want is a consistent,
> unbiased estimate of the spectrum of an (unknown) ergodic random
> process, given a bunch of samples. Averaging periodograms (e.g.,
> Welch's method) gives you a consistent, asymptotically unbiased
> estimate. What a parametric technique gives you depends strongly on
> the assumed model (which isn't given as part of the problem).

Well, that's good. It seems the surface agrees with the inside here :-)

What I want is something that gives me a good idea of what is going on.
Depending on the circumstances, a consistent, unbiased estimate may or may
not cut it as a good idea. The standard counterexample is to try ML on some
data with two closely spaced spectral peaks. It is not at all hard to set
the stage so that ML will fail miserably to separate the peaks.
If my interest is primarily in the number of spectral peaks present, as it
is in some applications, then it is going to be small consolation indeed to
know that at least the variance has been minimized.

If you are saying that we should have more knowledge of what parametric
methods do when the model doesn't fit reality, I entirely agree. There is a
body of knowledge, but it is entirely empirical and ad hoc.

>>If averaged periodograms showed different behavior in different segments of
>>the data, that means you also want to look at parametric models over
>>subsets of the data.
>
> Nope, ergodicity implies stationarity. You'd have to attribute the
> behavior to chance.

More probably, I'd attribute it to an unwarranted assumption of ergodicity.
I don't know how these things are done elsewhere, but I've seen too many
cases where ergodicity or even stationarity was assumed just because. If I
saw clear trends between segments of the data, I'd be _very_ unlikely to
attribute them to chance.

>>In short, if I knew that little, no ONE technique would make me happy.
>
> Any consistent, unbiased, and efficient estimate should make you
> happy. An estimate based on unfounded assumptions should not.

If minimal variance is what I'm after, yes. However, all these things tend
to have the word "asymptotically" before all the good stuff. All too often
that means "whenever you have about 10 times more data." Consider also that
since the speed of convergence typically depends on the (unknown)
characteristics of the data, there are "unfounded assumptions" no matter
where you turn.

>>finally, the fact that the DFT is non-parametric does not mean that you
>>aren't making assumptions about the data (in fact, you're assuming
>>periodicity -- something that doesn't always make sense either)
>
> You are making assumptions, but periodicity isn't one of them.
> Remember, DFT-based spectral estimation *doesn't* mean simply
> computing the DFT of the data.

Yes.
I goofed. Apologies for spreading disinformation. SR
oh@m2.csc.ti.com (Stephen Oh) (11/30/89)
In article <1989Nov28.185555.4259@athena.mit.edu> ashok@atrp.mit.edu (Ashok
C. Popat) writes:
>I'm not exactly sure what you mean by "too strong" --- it's a "given"
>in the problem. Are you saying that in many applications, waveforms
>cannot be usefully modeled as ergodic?

Yes.

>I guess then I could have described my hypothetical source as being WSS
>over 10^6 samples. A poor model for speech and images, but realistic in
>other applications.

Sure, you can do that. But I still wonder whether 10^6 WSS samples are
realistic, since 10^6 samples is a lot of data. Don't you think? When I said
partially w.s.s., I meant that for some portions of the data it is w.s.s.,
but over the whole data it is not.

>Good point. I thought about this, and here's what I came up with. The
>duration-bandwidth uncertainty principle says (for continuous-time
>waveforms) that
>
>    delta_t * delta_f >= 1/pi
>
>where delta_t is the time window size and delta_f is the frequency
>resolution (see William Siebert, _Circuits, Signals, and Systems_).
>I'm sure a similar result applies in the discrete-time case, but I
>don't have a reference off hand --- I'll assume it has the same form.

The resolution is 1/N, where N is the number of samples.

>Now if you're starting with only 100 samples, the uncertainty
>principle says that there's simply not enough information in the data
>to get a high-resolution spectrum. If you do manage to get a
>high-resolution spectrum, the necessary added information must have
>come from the model, not the data. What do you think?

I don't know why you brought up the uncertainty principle, but is there any
measure showing that 100 samples are not enough to get a high-resolution
estimate? (This is not a flame; I just want to know.) Your statement is
true, though. From Kay's book (ISBN 0-13-598582-X): "Windowing of data or
ACF values makes the implicit assumption that the unobserved data or ACF
values outside the window are zero, which is normally an unrealistic
assumption.
A smeared spectral estimate is a consequence of the windowing. Often we have
more knowledge about the process from which the data samples are taken, or
at least we are able to make a more reasonable assumption than to assume the
data or ACF values are zero outside the window." And I agree with Kay.

>Any recommended reading on these techniques?

1. H. Akaike, "A New Look at the Statistical Model Identification," IEEE
   Trans. Automat. Contr., vol. AC-19, 1974.
2. E. J. Hannan, "The Estimation of the Order of an ARMA Process," Ann.
   Statist., vol. 8, 1980.
3. R. L. Kashyap, "Optimal Choice of AR and MA Parts in Autoregressive
   Moving Average Models," IEEE Trans. Pattern Anal. Mach. Intell., vol.
   PAMI-4, 1982.
4. L. Marple, Digital Spectral Analysis with Applications, Prentice-Hall,
   1987.
5. S. Kay, Modern Spectral Estimation, Prentice-Hall, 1988.
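To make one of these criteria concrete: the sketch below applies AIC to AR
(rather than full ARMA) order selection, which keeps the fitting simple --
Levinson-Durbin on the sample autocorrelation gives the prediction-error
power for every order in one pass, and AIC(p) = N ln(sigma_p^2) + 2p trades
fit against model size. The AR(2) test signal and its coefficients are
invented for the example:

```python
import math, random

random.seed(2)
N = 2000
# Synthetic AR(2) data: x(t) = 1.5 x(t-1) - 0.75 x(t-2) + n(t)
x = [0.0, 0.0]
for _ in range(N - 2):
    x.append(1.5 * x[-1] - 0.75 * x[-2] + random.gauss(0.0, 1.0))

# Biased sample autocorrelation r[0..pmax] (positive semidefinite estimate).
pmax = 8
r = [sum(x[t] * x[t + k] for t in range(N - k)) / N for k in range(pmax + 1)]

# Levinson-Durbin recursion: prediction-error power for AR orders 0..pmax.
a = []                       # AR coefficients of the current order
sigma2 = [r[0]]              # order-0 error power = signal power
for p in range(1, pmax + 1):
    k = (r[p] - sum(a[i] * r[p - 1 - i] for i in range(p - 1))) / sigma2[-1]
    a = [ai - k * aj for ai, aj in zip(a, reversed(a))] + [k]
    sigma2.append(sigma2[-1] * (1.0 - k * k))

# AIC(p) = N ln(sigma_p^2) + 2p; pick the minimizing order.
aic = [N * math.log(s) + 2 * p for p, s in enumerate(sigma2)]
best = aic.index(min(aic))
```

The error power sigma2 is non-increasing in the order by construction; the
2p penalty is what stops AIC from always choosing the largest model. With
enough data from a genuine AR(2) source, the minimizer lands at or just
above order 2.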