[comp.dsp] FFTs of Low Frequency Signals

bart@videovax.tv.tek.com (Bart Massey) (11/10/89)

In article <2586@irit.oakhill.UUCP> diaz@irit.UUCP (Rafael Diaz) writes:
> In article <10208@cadnetix.COM> jensen@cadnetix.COM () writes:
> >What I am wondering about is the behavior of the transform when the
> >signals being examined have significant energies in the lower frequency
> >ranges.
> 
> Of all spectral estimation techniques, the FFT/DFT-based ones have the LARGEST
> variance; therefore, using them for low frequency, high resolution detection
> is the worst choice.
> 
> What you need is a good ARMA based technique, namely Burg and LS. 

I don't think your choice of estimation technique really has very much to do
with the frequency band of interest.  It's true that AR and ARMA models are
much better estimators of the characteristics of a single sinusoid than the
DFT, but that's sort of independent of frequency.  As other posters have
pointed out, the problem with really low frequency signals is that the
corresponding record lengths are very large, and most algorithms for signal
estimation have costs which grow as the square of the record length or so.

The typical technique for coping with the problem is, I believe, to "make the
signals higher-frequency", through the magic of decimation.  Assume that all
the signals of interest are in the range 0-1KHz, and you're sampling at 32KHz
(32000 samples/sec).  If necessary, run a low-pass filter over the input, to
remove signals in the range 1-16KHz.  Note that the record length of this
filter will itself be very large, but at least it's a linear process, so
*relatively* cheap.  Now, just take every 16th sample!
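
For concreteness, here's a rough numpy/scipy sketch of that filter-then-decimate
step.  The test tone, filter length, and cutoff are arbitrary choices of mine,
not anything from the original posting:

import numpy as np
from scipy import signal

fs = 32000            # original sample rate, Hz
M = 16                # decimation factor -> new rate fs/M = 2 kHz
t = np.arange(0, 1.0, 1.0 / fs)
x = np.sin(2 * np.pi * 60.0 * t) + 0.1 * np.random.randn(t.size)  # 60 Hz tone + noise

# FIR low-pass with a cutoff a bit below the new Nyquist frequency (1 kHz)
h = signal.firwin(numtaps=255, cutoff=900.0, fs=fs)
y = signal.lfilter(h, 1.0, x)[::M]         # filter, then keep every M-th sample

mag = np.abs(np.fft.rfft(y))               # analyze at the reduced rate fs/M
freqs = np.fft.rfftfreq(y.size, d=M / fs)
print(freqs[np.argmax(mag)])               # peak lands near 60 Hz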

What've you just done?  You've mapped the sample frequency domain 0-1KHz
onto the range 0-16KHz!  Now, if you do e.g. a 16-point FFT on your new
record, your bin centers will be at 500Hz, 1500Hz, .. , 15500Hz in the new
range, which corresponds to (500/16)Hz .. (15500/16)Hz, i.e. 31.25Hz ..
968.75Hz in the original domain.  Note that even with the FFT cost of
O(N lg N), you have saved quite a bit over doing the FFT directly on the
original data.  In particular, for the FFT on the original data, the cost
was roughly 256 * 8 = 2048 units, whereas the low-pass filter plus the short
FFT costs roughly 256 + 16 * 4 = 320 units -- a significant savings, even
though you get the same amount of info about the band in question either way!

Note that using ARMA techniques on the decimated data provides an even bigger
efficiency win, since these techniques are typically O(N^2) or worse!  In short
(no pun intended :-), decimate your data -- IMHO you'll be glad you did!

					Bart Massey
					..tektronix!videovax.tv.tek.com!bart
					..tektronix!reed.bitnet!bart

mhorne@ka7axd.WV.TEK.COM (Michael T. Horne) (11/11/89)

In a recent article by Bart Massey:
>
>The typical technique for coping with the problem is, I believe, to "make
>the signals higher-frequency", through the magic of decimation.  Assume that
>all the signals of interest are in the range 0-1KHz, and you're sampling at
>32KHz (32000 samples/sec).  If necessary, run a low-pass filter over the
>input, to remove signals in the range 1-16KHz.  Note that the record length
>of this filter will itself be very large, but at least it's a linear process,
>so *relatively* cheap.  Now, just take every 16th sample!
>
>What've you just done?  You've mapped the sample frequency domain 0-1KHz
>onto the range 0-16KHz!  Now, if you do e.g. a 16-point FFT on your new
>record, your bin centers will be at 500Hz, 1500Hz, .. , 15500Hz in the new
>range, which corresponds to (500/16)Hz .. (15500/16)Hz, i.e. 31.25Hz ..
>968.75Hz in the original domain.  Note that even with the FFT cost of
>O(N lg N), you have saved quite a bit over doing the FFT directly on the
>original data.  In particular, for the FFT on the original data, the cost
>was roughly 256 * 8 = 2048 units, whereas the low-pass filter plus the short
>FFT costs roughly 256 + 16 * 4 = 320 units -- a significant savings, even
>though you get the same amount of info about the band in question either way!

I agree in part with Bart's comments; however, decimating the data will not
directly provide you with any better frequency domain resolution.  This is
most easily shown by example.  We'll use Bart's sampling criteria as shown
above, that is:

	F  = 32 kHz			(original sampling rate)
	M  = 16				(decimation factor to use)
	F' = 2 kHz			(desired sampling rate, from F/M)
	Fc = 1 kHz			(signals of interest are below Fc)


The form of a standard sampling rate converter that decimates by a factor M
is:

                        +------+      +---+
             x(n) ----->| h(n) |----->| M |-------> y(m)
                        +------+      +---+

h(n) is designed to filter out spectral components in the range of F/M to
(F - F/M).  As Bart describes above, this would be a filter that passes
signals from 0 to ~1 kHz, and suppresses signals (ideally) from  1 to 16 kHz.
The `M' box is the decimator that throws out M-1 of every M filtered samples
(i.e. every M-th sample is kept).  Typical signals for the above system might
look like this (using the same frequency scale for all plots):


        |
        |------------------------//--     --//------------------------|
        |                            \   /                            |
|X(w)|  |                             \ /                             |
        +-----|-----|-----|-----//-----|-----//-----|-----|-----|-----|
        0     1     2     3            16           29    30    31    32 kHz

        |
        |----                                                     ----|
        |    \                                                   /    |
|H(w)|  |     |                                                 |     |
        +-----|-----|-----|-----//-----|-----//-----|-----|-----|-----|
        0     1     2     3            16           29    30    31    32 kHz

        |
        |----   ---------   ----//-----------//----   ---------   ----|
        |    \ /         \ /                       \ /         \ /    |
|Y(w')| |     |           |                         |           |     |
        +-----|-----|-----|-----//-----|-----//-----|-----|-----|-----|
        0     1     2     3            16           29    30    31    32 kHz


Filtering the input data stream reduces the spectral content to just the band
of interest, while the act of throwing away samples reduces the sampling
frequency to F/M.  For example, if you originally took 256 samples at
32 kHz, the output of the sample rate converter with M=16 will be 256/16 = 16
samples at 32kHz/16 = 2kHz sample rate (this assumes a continuous stream
of data points to keep the filter `primed').

The frequency resolution of an N-pt DFT is given by:

		resolution = (sample rate)/(# of samples taken)

which is really just the inverse of the duration of the sampling sequence.
If you take the DFT of the 256 sample sequence, you will find that the
`bins' are separated by 32kHz/256 = 125 Hz.  Note, however, that after you
have decimated the signal to just 16 samples at a 2 kHz sample rate, you
still have 2kHz/16 = 125 Hz resolution.  You have *not* increased your
frequency domain resolution by changing the sample rate (through decimation).
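
To put numbers on this (these are just the figures from the example above,
checked with numpy's FFT frequency helper):

import numpy as np

F, N, M = 32000.0, 256, 16          # original rate, record length, decimation factor

print(F / N)                        # 256 samples at 32 kHz -> 125.0 Hz resolution
print((F / M) / (N / M))            # 16 samples at 2 kHz   -> 125.0 Hz again

# The DFT bin centers make the same point:
print(np.fft.fftfreq(N, d=1 / F)[:4])        # [  0. 125. 250. 375.]
print(np.fft.fftfreq(N // M, d=M / F)[:4])   # [  0. 125. 250. 375.]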

The point that I am trying to make is that just doing decimation doesn't
help you achieve more frequency resolution (though it can increase your
time-domain amplitude resolution through noise reduction); however, it does
provide you with smaller data sets to work with, a very desirable property.
You *can* get better resolution by doing 1) zero padding of the data, or
2) sampling longer at F and then decimating to reduce the data set, then
performing the DFT on the reduced data.  Both of these methods effectively
increase your sample duration, thereby increasing your frequency domain
resolution.

As a side note, there are many techniques known for reducing the computational
requirements of the filter section.  For example, since you are only
keeping every M-th output from the filter, you don't need to calculate the
filter output for those samples you throw out.  This reduces the
effective filtering rate from F calculations/sec to F/M calculations/sec,
a significant difference for large M.  Also, multi-stage filters are possible
that allow you to relax the filtering requirements in the first few
stages (by inserting "don't care" bands).  For example, you can effectively
achieve a decimate-by-16 sample rate converter by cascading two decimate-by-4
converters, yet use considerably fewer filtering taps in the two M=4 stages
combined than in the single M=16 stage.  For extended info on this topic,
I'd strongly recommend getting a copy of Crochiere & Rabiner's
"Multirate Digital Signal Processing" (Prentice-Hall).


>In short (no pun intended :-), decimate your data -- IMHO you'll be glad
>you did!

When your bands of interest are << your sampling rate, this is a very
good recommendation from a computational point of view.

>					Bart Massey
>					..tektronix!videovax.tv.tek.com!bart
>					..tektronix!reed.bitnet!bart

Mike Horne
mhorne@ka7axd.wv.tek.com
...!tektronix!mhorne%ka7axd.wv.tek.com

harrison@sunny.DAB.GE.COM (Gregory Harrison) (11/11/89)

In article <Nov.8.16.32.08.1989.2581@caip.rutgers.edu> fineberg@caip.rutgers.edu (Adam B. Fineberg) writes:
>only be improved by increasing the observation time.  Localizing in
>frequency is always inversely proportional to localizing in time (the
>uncertainty principle).  Therefore if you want to localize in
>frequency (have better resolution) you must give up localization in
>time (use longer observation time).  Without some form of constrained
>optimization that is the best you can hope for.

The Wigner Distribution permits excellent time-frequency localization,
far surpassing an FFT in this regard.  Certain other difficulties 
arise with the Wigner Distribution (WD), primarily the generation
of cross-term interference which looks like signal energy where 
there is none.  But the cross-term interference occurs at well
defined time-frequency locations and as such its effect can 
be avoided in many circumstances.  

The time-frequency localization avoids smearing in time and frequency.
Highly precise characterizations of nonstationary signals may
be gained using the WD, or Discrete WD (DWD).
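
For anyone who wants to see the cross-term behavior Greg mentions, here is a
small numpy sketch of a discrete pseudo-Wigner distribution (my own toy
implementation, with an arbitrary lag window and two arbitrary test tones --
not production DWD code):

import numpy as np

def pseudo_wigner(x, L=64):
    """Pseudo-Wigner distribution of a complex (analytic) signal x.
    Row n is 2*Re{FFT over lag m of x[n+m]*conj(x[n-m])}, |m| < L.
    With N = 2*L, frequency bin k corresponds to k*fs/(2*N) Hz."""
    N = 2 * L
    W = np.zeros((len(x), N))
    m = np.arange(-L + 1, L)
    for n in range(len(x)):
        ok = (n + m >= 0) & (n + m < len(x)) & (n - m >= 0) & (n - m < len(x))
        r = np.zeros(N, dtype=complex)
        r[m[ok] % N] = x[n + m[ok]] * np.conj(x[n - m[ok]])
        W[n] = 2.0 * np.real(np.fft.fft(r))
    return W

fs = 1024.0
t = np.arange(512) / fs
x = np.exp(2j * np.pi * 128.0 * t) + np.exp(2j * np.pi * 384.0 * t)  # two tones
W = pseudo_wigner(x, L=64)
freqs = np.arange(128) * fs / 256.0      # bin k -> k*fs/(2N) Hz
print(np.sort(freqs[np.argsort(W[256])[-3:]]))
# -> [128. 256. 384.]: the two genuine tones, plus a cross-term sitting at
# their midpoint (256 Hz), where there is no actual signal energy.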

Greg Harrison
My opinions are not intended to reflect those of GE.

bart@videovax.tv.tek.com (Bart Massey) (11/12/89)

In article <5305@orca.WV.TEK.COM> mhorne@ka7axd.wv.tek.com (Mike Horne) writes:
> I agree in part with Bart's comments, however decimating the data will not
> provide you with any better frequency domain resolution directly.

He's right, you know.  The original question was about highly oversampled
VLF data, and how to cope with it without doing a million-point FFT.  I was
responding to a posting which (in my possibly flawed reading) claimed that
the best way to get around this was to instead use an ARMA model on the
data, so that you could pick the signal out using a short record length and
save computation.  My claim is that, *given that one is allowed to accumulate
a long enough input record*, accumulating more data and decimating is a
better way to reduce the computation for this measurement, regardless of
what technique is finally used to actually make the measurement.

Or, as I said in my original posting, ARMA estimators are certainly better
than DFTs at picking a single sinusoid out of white noise, regardless of the
ratio of input frequency to sample rate, and regardless of record length.
The chief disadvantages of using this class of techniques over the FFT are
that (1) they may be more computationally expensive than an FFT, and (2)
they make stronger assumptions about the form of their input than the FFT,
and thus tend to give wrong or misleading answers in cases of unexpected
input.

					Bart Massey
					..tektronix!videovax.tv.tek.com!bart
					..tektronix!reed.bitnet!bart

bart@videovax.tv.tek.com (Bart Massey) (11/12/89)

In article <5305@orca.WV.TEK.COM> mhorne%ka7axd.wv.tek.com@relay.cs.net writes:
> 
> You *can* get better resolution by doing 1) zero padding of the data, or
> 2) sampling longer at F and then decimating to reduce the data set, then
> performing the DFT on the reduced data.  Both of these methods effectively
> increase your sample duration, thereby increasing your frequency domain
> resolution.


No!  (2) will increase your resolution, (1) will not!

Certainly, zero-padding is not equivalent to increasing the sample duration
-- if you take a longer sequence of samples, you don't expect to obtain the
original sequence with a bunch of zeros appended.  As it turns out,
zero-padding the DFT input is equivalent to applying sin(x)/x interpolation
to the DFT output.  Zero-padding thus can't be adding resolution, since you
can obtain the same effect by applying a post-transformation to the output
of the original DFT on the original data.  What the zero-padding *does* do is
increase the *accuracy* of the *representation* of the data!  In other
words, in a DFT there is usually better resolution than the obvious
representation will indicate.

Given a fixed record length, the DFT can be shown to be at least as good as
any other estimator of the spectral composition of a signal, provided you
have no special a priori knowledge about the source of that signal.  There
are really only two choices for increasing your resolution: you can take
into account a better model of the signal source, as e.g. ARMA techniques
do, or you can obtain more data.

But, if you just want to better use the resolution you have, input
zero-padding, a cheap form of sin(x)/x interpolation for DFT output, is a
useful technique.
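
A small numpy experiment along these lines: two tones spaced closer than the
short record's bin width, analyzed two ways.  The rates, tone frequencies,
and the crude peak counter are all arbitrary choices for illustration:

import numpy as np

fs = 8000.0
f1, f2 = 1000.0, 1031.25                 # 31.25 Hz apart

def tones(n):
    t = np.arange(n) / fs
    return np.cos(2 * np.pi * f1 * t) + np.cos(2 * np.pi * f2 * t)

def peaks(mag):
    """Interior local maxima above half the global maximum."""
    idx = np.flatnonzero((mag[1:-1] > mag[:-2]) & (mag[1:-1] > mag[2:])) + 1
    return idx[mag[idx] > 0.5 * mag.max()]

padded = np.abs(np.fft.rfft(tones(128), 1024))   # short record, zero-padded 8x
longer = np.abs(np.fft.rfft(tones(1024)))        # 8x longer record, no padding

print(len(peaks(padded)), len(peaks(longer)))    # 1 2
# The zero-padded short record draws a smooth, finely-sampled curve, but the
# two tones still merge into one lump; the longer record separates them.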

					Bart Massey
					..tektronix!videovax.tv.tek.com!bart
					..tektronix!reed.bitnet!bart

fineberg@caip.rutgers.edu (Adam B. Fineberg) (11/14/89)

>  harrison@sunny.DAB.GE.COM writes:

>  The Wigner Distribution permits excellent time frequency
>  localization far surpassing an FFT in this regard.

The Wigner distribution only provides increased time-frequency
localization for non-stationary signals.  It still does not provide
better localization than the uncertainty principle allows.  The question was
about estimating the frequency of a stationary signal.  The only
concern was whether or not more data (longer observation time) was
needed to accurately estimate the frequency.  The Wigner distribution
would not provide any additional information over the FFT but would be
significantly more computationally expensive.

harrison@sunny.DAB.GE.COM (Gregory Harrison) (11/14/89)

In article <10208@cadnetix.COM> jensen@cadnetix.COM () writes:
>
>	The signal of interest is 0.5Hz.  For various reasons, I'm sampling at
>	a 128Hz rate.  If I collect samples for 1 sec. (128 samples), then I
>	have only captured 1/2 of a cycle of the 0.5Hz signal.
>
>	If I do an FFT on the 128 samples, will the 0.5Hz signal show up in the
>	FFT output? I'm aware that it would be spread across the discrete 

Sampling only half a sine wave will give you a DC level in the FFT.  You must
capture more than exactly half (or less than exactly half) a cycle to get
non-DC energy.  If you want to get energy that is due to the frequency of the sine
wave, and not just attributable to the discontinuities at the edges of
the input time series, I believe that you must capture at least a full
sine wave.

Greg Harrison
My opinions are not intended to express those of GE.

oh@m2.csc.ti.com (Stephen Oh) (11/15/89)

In article <5622@videovax.tv.tek.com> bart@videovax.tv.tek.com (Bart Massey) writes:
>
>Or, as I said in my original posting, ARMA estimators are certainly better
>than DFTs at picking a single sinusoid out of white noise, regardless of the
>ratio of input frequency to sample rate, and regardless of record length.
>The chief disadvantages of using this class of techniques over the FFT are
>that (1) they may be more computationally expensive than an FFT, and (2)
>they make stronger assumptions about the form of their input than the FFT,
>and thus tend to give wrong or misleading answers in cases of unexpected
>input.

I agree with Bart that ARMA estimators are better than FFTs. However, I don't
see any reason (except one, which I will discuss later) why we should use
ARMA estimators instead of AR estimators. If you want to pick sinusoids out of
white noise with moderate computations, you should use AR estimators.
(For sinusoid estimation, the MLE of sinusoids (direct parametric approach) is
the best in terms of performance, but it requires HEAVY computations.) The
main advantage of AR over ARMA is lower computational expense. There are *lots*
of techniques available. Please refer to Marple's and Kay's books.
If you want to pick deep valleys (instead of peaks) out of the PSD,
you should definitely use ARMA instead of AR.
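
As a concrete illustration of the AR route, here is a bare-bones Yule-Walker
AR spectral estimate in numpy (just one of the many AR techniques in Marple's
and Kay's books; the tone, noise level, and model order are arbitrary):

import numpy as np

def ar_psd(x, p, nfft=1024):
    """Yule-Walker AR(p) power spectrum estimate on a grid of nfft frequencies."""
    x = x - x.mean()
    r = np.correlate(x, x, 'full')[len(x) - 1:len(x) + p] / len(x)   # r[0..p]
    R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])
    a = np.linalg.solve(R, -r[1:p + 1])              # AR coefficients
    sigma2 = r[0] + np.dot(a, r[1:p + 1])            # prediction error variance
    w = np.linspace(0, np.pi, nfft)                  # radians/sample
    A = 1 + np.exp(-1j * np.outer(w, np.arange(1, p + 1))) @ a
    return w, sigma2 / np.abs(A) ** 2

fs = 1000.0
t = np.arange(256) / fs
x = 2 * np.sin(2 * np.pi * 123.0 * t) + np.random.randn(t.size)  # tone in white noise
w, P = ar_psd(x, p=8)
print(w[np.argmax(P)] * fs / (2 * np.pi))    # AR spectral peak, close to 123 Hz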

One more comment: 
      In order to improve the statistical stability of FFTs,
      it is very common to use a pseudo-ensemble average over segments of data.
      This reduces the resolution of the PSD.
      So if you want to increase the resolution as well as improve the
      stability, you need *lots* of computations. Because of this fact, I
      don't think that AR estimators are computationally more expensive
      than FFTs.
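
A quick illustration of that stability/resolution tradeoff, using scipy's
welch() purely as a convenient averaged-periodogram routine (the segment
lengths and test signal are arbitrary):

import numpy as np
from scipy import signal

fs = 1000.0
x = np.sin(2 * np.pi * 123.0 * np.arange(8192) / fs) + np.random.randn(8192)

f1, P1 = signal.welch(x, fs=fs, nperseg=8192)   # one long segment
f2, P2 = signal.welch(x, fs=fs, nperseg=256)    # ~63 short, averaged segments

print(f1[1] - f1[0], f2[1] - f2[0])             # bins: ~0.12 Hz vs ~3.9 Hz
n1, n2 = P1[f1 > 200], P2[f2 > 200]             # noise floor, away from the tone
print(n1.std() / n1.mean(), n2.std() / n2.mean())
# The single long segment has fine bins but a noise floor that fluctuates by
# roughly 100%; averaging the short segments smooths the floor considerably
# at the cost of coarser bins.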


+----+----+----+----+----+----+----+----+----+----+----+----+----+
|  Stephen Oh         oh@csc.ti.com     |  Texas Instruments     |
|  Speech and Image Understanding Lab.  | Computer Science Center|
+----+----+----+----+----+----+----+----+----+----+----+----+----+

mhorne@ka7axd.WV.TEK.COM (Michael T. Horne) (11/15/89)

In a recent article by Bart Massey:
> > 
> > You *can* get better resolution by doing 1) zero padding of the data, or
> > 2) sampling longer at F and then decimating to reduce the data set...
> 
> No!  (2) will increase your resolution, (1) will not!
> 
> As it turns out, zero-padding the DFT input is equivalent to applying
> sin(x)/x interpolation to the DFT output.  Zero-padding thus can't be
> adding resolution, since you can obtain the same effect by applying a
> post-transformation to the output  of the original DFT on the original data.
> What the zero-padding *does* do is increase the *accuracy* of the
> *representation* of the data!

I stand corrected.  Some quick math shows that the apparent resolution gain
from zero-padding is really just an increase in resolution between samples
in the frequency domain, i.e. interpolation, as Bart has described above.
The zero-padding simply decreases the frequency-domain sampling increment,
providing more frequency-domain samples, yet no new information has been
added.

It would appear that for any finite time sequence in which less than a full
cycle of the frequency components of interest is captured, it is difficult to
discern these components with the FFT, apparently because these signals are
too close to w=0, where the negative- and positive-frequency components tend
to `interact'.

> But, if you just want to better use the resolution you have, input
> zero-padding, a cheap form of sin(x)/x interpolation for DFT output, is a
> useful technique.

This is interesting.  Considering the fact that the complexity of the DFT
is O(N log N), wouldn't some sort of direct interpolation of the DFT output
be faster?  At first glance, it would appear that sample-rate conversion
techniques can be applied directly to frequency-domain data, much as is
done for time-domain data.  It would seem that if the time-domain data set has
met the Nyquist criterion, then so has the frequency-domain data (that is, the
imaginary and real data parts, not the sqrt(i^2 + r^2)), and for large L
(interpolation) values, you might get a big win in computation rate compared
to a zero-padded DFT.  Has anyone looked into this, or is there something
fundamental that I am overlooking?

Mike Horne
mhorne%ka7axd.wv.tek.com@relay.cs.net

bart@videovax.tv.tek.com (Bart Massey) (11/17/89)

In article <98204@ti-csl.csc.ti.com> oh@m2.UUCP (Stephen Oh) writes:
> I agree with Bart that ARMA estimators are better than FFTs. However, I don't
> see any reason (except one, which I will discuss later) why we should use
> ARMA estimators instead of AR estimators. If you want to pick sinusoids out of
> white noise with moderate computations,  you should use AR estimators.

No.  If there's significant noise, you want to use ARMA estimators -- the MA
process zeros are necessary for accurately representing the white noise
itself.  See, e.g., Steven M. Kay, Modern Spectral Estimation, ISBN
0-13-598582-X, p. 131 -- he suggests using an ARMA(2,2) model to estimate
the spectrum of a sinusoid (actually an AR(2) process) in white noise.

> One more comment: 
>       In order to improve the statistical stability of FFTs,
>       it is very common to use a pseudo-ensemble average over segments of data.
>       This reduces the resolution of the PSD.
>       So if you want to increase the resolution as well as improve the
>       stability, you need *lots* of computations. Because of this fact, I
>       don't think that AR estimators are computationally more expensive
>       than FFTs.

No.  Note that ARMA models are no more "statistically stable" than DFTs --
small variations in the input data may have large effects on the model
parameters.  See Kay p. 331, Figure 10.6(b) for an example.  The reason that
the FFT is done in segments usually has to do with available input storage
or desired output resolution, and the reason that the segments are
overlapped somewhat ("pseudo ensemble averaging") is often to avoid inducing
artificial "end effects" in the data rather than for statistical stability.
If you have an ARMA model, you should certainly run it over all the data you
can accumulate, or until you're certain that its parameters are sufficiently
stable, and the same is true for a DFT.

The other important thing to note is that (at least as I understand it) ARMA
estimators are roughly O(N^2) ops per data point, whereas the FFT is
O(N lg N).  This means that the overlap isn't very expensive compared to the
ARMA model.  In fact, you can do O(N/lg N) FFTs on N points in the same
amount of time it takes to run an ARMA model of equivalent resolving power
-- e.g. if N = 1000, then O(N/lg N) == 100.  This leaves *plenty* of time
for overlaps in the FFTs.

But then again, it depends on what you want to estimate.  If you "know" that
your input data consists of two sinusoids in white noise, all the cost
tradeoffs change, and I wouldn't be surprised to find that the ARMA model is
cheaper, because you can use a very low-order model (ARMA(4,4)) to get a
good estimate.  The DFT *is* the ML spectral estimator, in the absence of
any a priori model whatsoever of the input data, and it's very cheap to
compute via the FFT.  If you have a data model, the DFT is, as I understand
it, a poorer choice.

And again, if you only have 16 data points, and can't obtain more, order
analysis is really uninteresting, since the size of the implicit constants
dominates, and besides, neither method takes significant time.  There,
questions of what exactly it is you're looking for in those 16 data points
become dominant, and will usually govern your choice of analytic techniques.

					Bart Massey
					..tektronix!videovax.tv.tek.com!bart
					..tektronix!reed.bitnet!bart

smit@enel.ucalgary.ca (Theo Smit) (11/23/89)

In article <5340@orca.WV.TEK.COM> mhorne%ka7axd.wv.tek.com@relay.cs.net writes:
>
>This is interesting.  Considering the fact that the complexity of the DFT
>is O(N log N), wouldn't some sort of direct interpolation of the DFT output
>be faster?  At first glance, it would appear that sample-rate conversion
>techniques can be applied directly to frequency-domain data, much as is
>done for time-domain data.  It would seem that if the time-domain data set has
>met the Nyquist criterion, then so has the frequency-domain data (that is, the
>imaginary and real data parts, not the sqrt(i^2 + r^2)), and for large L
>(interpolation) values, you might get a big win in computation rate compared
>to a zero-padded DFT.  Has anyone looked into this, or is there something
>fundamental that I am overlooking?
>
>Mike Horne
>mhorne%ka7axd.wv.tek.com@relay.cs.net

If you add a pile of zeros to the end of the DFT input, you are wasting 
computational time. Consider the case where you have a sequence of four
samples, and you want to interpolate them into a 16 point DFT. I know, this
is trivial, but it illustrates the point.
Anyway, you take x0 to x3 and add 12 zeros to the end:

xn:	x0 x1 x2 x3  0  0  0  0  0  0  0  0  0  0  0  0

n:	0  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15

Then bit-reverse the data order before doing a decimation-in-time FFT:

x(bitrev(n)):	x0 0  0  0  x2 0  0  0  x1 0  0  0  x3 0  0  0

n:		0  8  4  12 2  10 6  14 1  9  5  13 3  11 7  15

I would like to draw a 16 point DIT FFT butterfly diagram at this point,
but that's a pain, so you'll have to wait until the middle of 1990 when
my paper on this comes out in the IEEE ASSP journal. :-)

Anyway...

We have the FFT butterfly:
		A' = A + BW         A -   - A'
				       \ /
				        x
				       / \
		B' = A - BW	    B -W  - B'

(You know what I mean)

Now, in the first pass, the B terms are zero for all of the butterflies,
and the A terms are zero for half of the butterflies.
We know:
		A + 0W = A	(1)

		0 + 0W = 0	(2)

So half of the butterflies can be eliminated outright, and the other half
can be replaced by a replication of the upper input at both outputs.
The second pass gives us a situation where equation (1) applies to all
butterflies, so in the first two passes all we have done is to copy a
bunch of numbers around. Not too hard, even for a microprocessor. Then we
go on to do pass 3 and 4 just like in a normal FFT, and presto! we've cut
the time required to do the FFT in half. You can eliminate the time required
to copy the numbers by rewriting 'pass 3' (or wherever you have to start
calculating values) to take all inputs from the same location. In general,
if you want to interpolate from 2^m points to 2^n points, you can eliminate
the first n-m passes of the FFT.
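
Here is a sketch in Python/numpy of that pruning idea as I read it (this is
my own toy code, not Theo's EZFFT): when a length-P record is zero-padded to
N = P * 2^k, the first k passes only copy values around, so they can be
replaced by a single replication step and the remaining passes run as usual.

import numpy as np

def bit_reverse(i, bits):
    """Reverse the low `bits` bits of the integer i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r

def zero_padded_fft(x, N):
    """FFT of x zero-padded to length N (both powers of two), skipping the
    copy-only passes of a radix-2 decimation-in-time FFT."""
    P = len(x)
    m, n = P.bit_length() - 1, N.bit_length() - 1    # P = 2^m, N = 2^n
    L = N // P                                       # padding factor, 2^(n-m)
    # State after the first n-m passes: block g holds x[bitrev_m(g)], L times.
    a = np.repeat([x[bit_reverse(g, m)] for g in range(P)], L).astype(complex)
    length = 2 * L                                   # resume at pass n-m+1
    while length <= N:
        w_step = np.exp(-2j * np.pi / length)
        for start in range(0, N, length):
            w = 1.0
            for j in range(length // 2):
                u = a[start + j]
                v = a[start + j + length // 2] * w
                a[start + j] = u + v                 # standard DIT butterfly
                a[start + j + length // 2] = u - v
                w *= w_step
        length *= 2
    return a

x = np.random.randn(4)
print(np.allclose(zero_padded_fft(x, 16), np.fft.fft(x, 16)))   # True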

This work, and ways to efficiently look at parts of the FFT output without
calculating all of it, has been published previously by Screenivas and Rao
in 1977 (I think - I don't have the reference handy). I extended their method
to include the case where you want to place the interpolating zeros in
the center of the sequence, which is handy if you're trying to go from the
frequency domain to the time domain.

(A side note: It takes a lot of time to get submissions published. We did this
stuff in 1987, and received word about a month ago that they were going
to publish this in 1990.)

If anyone is interested, I can provide code of our efficient zooming FFT
(That's right, EZFFT) in C.

Theo Smit (smit@enel.ucalgary.ca)