smit@enelc.uucp (01/13/90)
/*
*
* Efficient Zooming Fast Fourier Transform (EZFFT)
*
* FFT Sinc Interpolation for Center Padded Data
*
* Copyright (C) 1989 Theo J. Smit
*
* To compile : cc -O4 -ffpa -o ezfft ezfft.c -lm
* -f68881
* -fswitch
*
* To use binary input and output files, add -DBINARY in the above 'cc'
* statement. Default is ascii (human-readable) form. The input file
* should have one real and one imaginary datum per line. Output
* will also be one complex data point per line.
*
* Algorithm description:
*
* This is a 1D fft, intended to interpolate data sequences that look
* like this:
* |-- --|
* | \ / |
* | \ / |
* |____\___/____|
*
* This is what most frequency spectra look like (roughly), so this
* algorithm is most useful for inverse transforming and interpolating
* simultaneously.
*
* The algorithm is based on the fact that the data occupies known
* places in the input data sequence. The zeros that are usually
* inserted in the center give known results for the first several
* passes, so these calculations can be eliminated.
*
* We define the (DIT) butterfly operation as:
*
* A' = A + BW(k)
*
* B' = A - BW(k)
*
* where A and B are complex quantities, and W(k) = exp(2*pi*k/N), where
* N is the zoomed FFT point size. A will be referred to as the 'upper'
* input to the butterfly, and B is the 'lower' input. Similarly,
* A' and B' are the upper and lower outputs, respectively.
* It helps to draw these things out as usually done in the texts,
* or as in the AMD Am29540 data sheets.
*
* We also assume there are M = 2^m input points, to be zoomed into
* N = 2^n FFT points. If we don't want to compute all of the output points,
* we can prune the output to get only the first R = 2^r output points.
*
* End padding the data sequence is a trivial case of efficient zooming,
* since for the first several passes, the lower (B) inputs are always
* zero, which means A' = B' = A. Therefore, we can eliminate the first
* (n-m) passes of the FFT, simply replicate the data points as required,
* and perform all of the remaining butterflies.
*
* Center padding the data is more involved, since in some cases the upper input
* of the butterfly will be zero, giving the result:
*
* A' = BW(k)
*
* B' = -BW(k)
*
* This means we will need to actually do some calculating before we can
* start the (n-m+1) pass of the FFT. The algorithm I developed sums up all of
* the W(k) along each path to the n-m+1 pass (this is easy to do in polar form)
* and multiplies signs as required, to generate a special set of coefficients
* for the (n-m+1) pass, W'(k).
* We then perform this pass using the W'(k), and the rest of the passes using
* the W(k) coefficients.
*
* Pruning:
*
* Often it is not required to look at all of the output data. In this case,
* we don't need to perform all of the last pass of butterflies. If we restrict
* ourselves to looking only at the first R = 2^r output points, we can
* systematically eliminate the calculation of all unnecessary butterflies.
* This program also does that.
*
* The end padding and output pruning have been previously described in IEEE
* papers by Screenivas and Rao, Skinner, and Markel.
*
* This center padding algorithm will be described more fully in the (hopefully)
* September issue of IEEE ASSP (transactions?) magazine.
*
* I'm placing this program in the public domain. If you like it - great. If
* you don't - oh well. If you use this algorithm in a commercial application,
* I'd like to hear about it.
*
* Theo Smit
* Department of Electrical Engineering
* University of Calgary
* 2500 University Drive, NW
* Calgary AB T2N 1N4
* CANADA
*
* smit@enel.ucalgary.ca
*
* DISCLAIMER: Use the contents of this program and its output at your own risk.
*/
#include <stdio.h>
#include <strings.h>
#include <sys/file.h>
#include <math.h>
#define TRUE (1)
#define FALSE (0)
#define R_PERM (0400)
#define W_PERM (0644)
#define R_OPT (O_RDONLY)
#define W_OPT (O_WRONLY|O_CREAT|O_TRUNC)
typedef struct _complex { float re, im; } complex;
typedef struct _polar {float mag, pha;} polar;
char *malloc(), /* System space allocation routine */
*progname; /* Pointer to program name */
double atof(); /* System ascii to double converter */
int *rbit(), /* Bit erversal routine */
ilog2(); /* Routine to find log to base 2 */
main(argc, argv)
int argc;
char **argv;
{
FILE *infp = stdin,
*outfp = stdout,
*fopen();
char *inname, /* Input file name pointer */
*outname, /* Output file name pointer */
S[80];
int i,j,
points = 256, /* Number of points in input */
zoomfact = 1, /* Input zoom factor */
prunefact = 1, /* Output pruning factor */
zoompts = 256, /* Number of points after zooming */
prunepts = 256, /* Number of points after pruning */
*brev, /* Pointer to bitreversed array */
infd = 0, /* Input file descriptor (default=stdin)*/
outfd = 1, /* Output file descriptor (stdout) */
NAME = FALSE, /* Flag set when filenames used */
INVERSE = FALSE;/* Flag set for inverse FFT */
complex *data, /* Pointer to data array */
*coef, /* Pointer to coefficient array */
*f_coef; /* Pointer to first pass coefficients */
progname = argv[0]; /* Decode command line args */
if (argc != 4 && argc < 5) {
fprintf (stderr,"usage: %s [infile] points zoom prune [inv]\n",
progname);
exit(1);
}
if (argc == 4) {
points = atoi (argv[1]);
zoomfact = atoi(argv[2]);
prunefact = atoi(argv[3]);
}
else {
inname = argv[1];
NAME = TRUE;
outname = malloc((unsigned)(strlen(inname)+strlen(".fft")));
strcpy (outname, inname);
strcat (outname,".fft");
points = atoi (argv[2]);
zoomfact = atoi(argv[3]);
prunefact = atoi(argv[4]);
if (argc == 6)
if (strcmp(argv[5],"inv") == 0) INVERSE = TRUE;
}
if (NAME == TRUE) { /* Open input and output files */
#ifdef BINARY
if ((infd = open(inname,R_OPT,R_PERM)) == -1) abort(inname);
if ((outfd = open(outname,W_OPT,W_PERM)) == -1) abort(outname);
#else
if ((infp = fopen(inname,"r")) == NULL) abort(inname);
if ((outfp = fopen(outname,"w")) == NULL) abort(outname);
#endif BINARY
}
zoompts = points * zoomfact; /* Allocate space for arrays */
prunepts = zoompts/prunefact;
data = (complex *)malloc((unsigned)(zoompts * sizeof(complex)));
coef = (complex *)malloc((unsigned)(zoompts/2 * sizeof(complex)));
f_coef = (complex *)malloc((unsigned)(zoomfact * sizeof(complex)));
/* Read in input data */
#ifdef BINARY
fprintf(stderr,"%s: read %d bytes\n",progname,
read(infd,data,points*sizeof(complex)));
close (infd);
#else
for(i = 0; i < points; i++) {
fgets(S, 80, infp);
sscanf(S, "%lf%lf", &(data[i].re), &(data[i].im));
}
#endif BINARY
/* Calculate coefficients */
calc_coefs(coef, zoompts, INVERSE);
calc_first_coefs(f_coef, points, zoomfact, INVERSE);
/* Rearrange data for FFT */
brev = rbit (points);
swap_data(data, brev, points, zoomfact);
for(i = 0; i < zoompts; i++)
printf("%d: %lf %lf\n", i, data[i].re,data[i].im);
/* Do FFT */
fft(data, coef, f_coef, zoompts, zoomfact, prunefact);
/* Write output data */
#ifdef BINARY
fprintf(stderr,"%s: wrote %d bytes\n",progname,
write(outfd,(char *)data,prunepts*sizeof(complex)));
close (outfd);
#else
for(i = 0; i < prunepts; i++) {
fprintf(outfp, "%lf %lf\n", data[i].re, data[i].im);
}
fclose (outfp);
#endif BINARY
free ((char *)data);
free ((char *)coef);
free ((char *)f_coef);
free ((char *)brev);
}
/************************************************************************/
/* the FFT is done by doing a pass using special coefficients to perform
* the zooming; followed by several (or none) normal passes, followed by
* several partial passes to effect the output pruning
*
* Some parts of this algorithm should be changed for greater efficiency
* (most notably the recurring floating point division to address W(k)).
*/
fft(data, coef, f_coef, points, zoom, prune)
complex *data, /* Pointer to data */
*coef, /* Pointer to coefficients */
*f_coef; /* Pointer to first pass coefficients */
int points, /* Number of points */
zoom, /* Zoom factor */
prune; /* Prune factor */
{
int i,
j,
pass, /* Pass number */
start, /* Where to begin a set of butterflies */
prunepts, /* Number of points after pruning */
logpts, /* log to base 2 of points */
logzoom, /* log to base 2 of zoom factor */
logprune, /* log to base 2 of prune factor */
_do = 1, /* Number of butterflies to do in a set */
skip = 2; /* # of data points to skip between sets*/
complex *a, /* Index of first point in butterfly */
*b, /* Index of second point in butterfly */
*a2, /* Index of first source for butterfly */
*b2, /* Index of second source for butterfly */
A, /* Butterfly operation = */
B, /* A' = A + BW */
W, /* B' = A - BW */
BW; /* Use these for temporary storage */
/* For a given zoomfactor n, we */
/* can skip the first log2(n) */
logpts = ilog2(points); /* passes */
logzoom = ilog2(zoom);
logprune = ilog2(prune);
for(i = 0; i < logzoom; i++) {
_do *= 2; /* Have to get right skip factor*/
skip *= 2;
}
a2 = data + _do - 1;
b2 = a2 + _do;
A = *a2;
B = *b2;
for(j = 0; j < _do; j++) { /* First butterfly set on first */
a = data + j; /* pass uses normal coefficients*/
b = a + _do;
W=coef[(int)(points*((float)j/(float)_do))/2];
BW.re = B.re*W.re - B.im*W.im;
BW.im = B.im*W.re + B.re*W.im;
a->re = A.re + BW.re;
a->im = A.im + BW.im;
b->re = A.re - BW.re;
b->im = A.im - BW.im;
}
for(i = 1; i < points/skip; i++) { /* Rest of first real pass - */
start = i * skip; /* special coefficients */
a2 = data + start + _do - 1;
b2 = a2 + _do;
A = *a2;
B = *b2;
for(j = 0; j < _do; j++) {
a = data + start + j;
b = a + _do;
W=f_coef[j];
BW.re = B.re*W.re - B.im*W.im;
BW.im = B.im*W.re + B.re*W.im;
a->re = A.re + BW.re;
a->im = A.im + BW.im;
b->re = A.re - BW.re;
b->im = A.im - BW.im;
}
}
_do *= 2;
skip *= 2;
/* Now we do normal FFT passes */
for (pass = logzoom + 1; pass < logpts - logprune; pass++) {
for(i = 0; i < points/skip; i++) {
start = i * skip;
for(j = 0; j < _do; j++) {
a = data + start + j;
b = a + _do;
W=coef[(int)(points*((float)j/(float)_do))/2];
A = *a;
B = *b;
BW.re = B.re*W.re - B.im*W.im;
BW.im = B.im*W.re + B.re*W.im;
a->re = A.re + BW.re;
a->im = A.im + BW.im;
b->re = A.re - BW.re;
b->im = A.im - BW.im;
}
}
_do *= 2;
skip *= 2;
}
prunepts = points / prune;
for (; pass < logpts; pass++) { /* Output pruning passes */
for(i = 0; i < points/skip; i++) {
start = i * skip;
for(j = 0; j < prunepts; j++) {
a = data + start + j;
b = a + _do;
W=coef[(int)(points*((float)j/(float)_do))/2];
A = *a;
B = *b;
BW.re = B.re*W.re - B.im*W.im;
BW.im = B.im*W.re + B.re*W.im;
a->re = A.re + BW.re;
a->im = A.im + BW.im;
}
}
_do *= 2;
skip *= 2;
}
}
/************************************************************************/
swap_data(data, brev, points, zoom)
complex *data; /* Pointer to data */
int *brev, /* Pointer to bitreversed array */
points, /* Number of points */
zoom; /* Zoom factor */
{
int i, j;
complex dummy; /* Temporary space */
/* First we will reorder the data */
for(i = 1; i < points-1; ++i) {
if(brev[i] < i) {
dummy = data[i];
data[i] = data[brev[i]];
data[brev[i]] = dummy;
}
}
/* Then move'er out! */
for(i = points-1; i >= 0; i--)
data[(i+1) * zoom - 1] = data[i];
/* move out to last spot */
}
/************************************************************************/
int *rbit(points)
int points; /* Number of points to be bitreversed */
{
int i,
j,
logpts, /* Log base 2 of points */
fiddle = 1, /* Half of some (or all) of the points */
*brev; /* Pointer to bitreversed array */
/* Allocate space for array & initialize*/
brev = (int *)malloc((unsigned)(points*sizeof(int)));
brev[0] = 0;
logpts = ilog2 (points);
for (i = 0; i < logpts; i++) { /* Generate bitreversed sequence*/
for(j = 0; j < fiddle; j++) {
brev[j] *= 2;
brev[j + fiddle] = brev[j]+1;
}
fiddle *= 2;
}
return(brev); /* Return pointer to array */
}
/************************************************************************/
int ilog2(num)
int num;
{
int result = 0;
while (num > 1){
num /= 2;
result += 1;
}
return (result);
}
/************************************************************************/
calc_first_coefs(fcoefs, points, zoom, inv)
complex *fcoefs; /* Pointer to coefficient array */
int points, /* Number of points */
zoom, /* Zoom factor */
inv; /* Flag set for inverse FFT */
{
int i, j;
float sign; /* Sign of phases (depends on inv) */
polar *phases; /* Phase correction factors */
phases = (polar *)malloc ((unsigned)(2*zoom*sizeof(polar)));
for (i = 0; i < zoom; i++) { /* Initialize correction factors*/
phases[i].mag = 1.0;
phases[i].pha = 0.0;
}
for (i = 1; i <= zoom; i *= 2) { /* Calculate correction */
for( j = 0; j < i; j++) {
phases[j].pha += (float)j/(float)i;
phases[j+i].mag = -1.0*phases[j].mag;
phases[j+i].pha = phases[j].pha;
}
}
sign = ((inv == TRUE) ? (1.0) : (-1.0));
for (i = 0; i < zoom; i++) { /* Convert to rectangular */
fcoefs[i].re = phases[i].mag * cos(sign * M_PI*phases[i].pha);
fcoefs[i].im = phases[i].mag * sin(sign * M_PI*phases[i].pha);
}
free((char*)phases);
}
/************************************************************************/
calc_coefs(coefs, points, inv)
complex *coefs; /* Pointer to coefficients */
int points, /* Number of points */
inv; /* Flag set for inverse FFT */
{
int i;
float angle = 0.0, /* Angle of complex sinusoid */
inc; /* Increment for complex sinusoid */
inc = ((inv == TRUE) ? (1.0) : (-1.0)) * M_PI / (float)(points/2);
for (i = 0; i < points/2; i++) {/* We only need N/2 coefficients */
/* to do an N point FFT */
coefs[i].re = cos(angle);
coefs[i].im = sin(angle);
angle += inc;
}
}
/************************************************************************/
abort(error)
char *error;
{
fprintf(stderr,"ERROR! %s!\n",error);
exit(1);
}smit@.ucalgary.ca (Theo Smit) (01/13/90)
Additional notes to my previous posting: Obviously, the compile line in the source was intended for Sun-3 systems. Modify to suit your tastes. I will not be reading news regularly from now on, since I start another job at NovAtel Communications on the 15th. I will try to post a forwarding address if at all possible. However, personal email should still work if you use smit@enel.ucalgary.ca If this doesn't work, you can send messages to me via knud@enel.ucalgary.ca (Steve Knudsen) He's a regular reader of comp.dsp, so I will hear about comments through him, if my acount here gets toasted sooner than I think it will be. happy dsp'ing, Theo Smit.