smit@enelc.uucp (01/13/90)
/* * * Efficient Zooming Fast Fourier Transform (EZFFT) * * FFT Sinc Interpolation for Center Padded Data * * Copyright (C) 1989 Theo J. Smit * * To compile : cc -O4 -ffpa -o ezfft ezfft.c -lm * -f68881 * -fswitch * * To use binary input and output files, add -DBINARY in the above 'cc' * statement. Default is ascii (human-readable) form. The input file * should have one real and one imaginary datum per line. Output * will also be one complex data point per line. * * Algorithm description: * * This is a 1D fft, intended to interpolate data sequences that look * like this: * |-- --| * | \ / | * | \ / | * |____\___/____| * * This is what most frequency spectra look like (roughly), so this * algorithm is most useful for inverse transforming and interpolating * simultaneously. * * The algorithm is based on the fact that the data occupies known * places in the input data sequence. The zeros that are usually * inserted in the center give known results for the first several * passes, so these calculations can be eliminated. * * We define the (DIT) butterfly operation as: * * A' = A + BW(k) * * B' = A - BW(k) * * where A and B are complex quantities, and W(k) = exp(2*pi*k/N), where * N is the zoomed FFT point size. A will be referred to as the 'upper' * input to the butterfly, and B is the 'lower' input. Similarly, * A' and B' are the upper and lower outputs, respectively. * It helps to draw these things out as usually done in the texts, * or as in the AMD Am29540 data sheets. * * We also assume there are M = 2^m input points, to be zoomed into * N = 2^n FFT points. If we don't want to compute all of the output points, * we can prune the output to get only the first R = 2^r output points. * * End padding the data sequence is a trivial case of efficient zooming, * since for the first several passes, the lower (B) inputs are always * zero, which means A' = B' = A. Therefore, we can eliminate the first * (n-m) passes of the FFT, simply replicate the data points as required, * and perform all of the remaining butterflies. * * Center padding the data is more involved, since in some cases the upper input * of the butterfly will be zero, giving the result: * * A' = BW(k) * * B' = -BW(k) * * This means we will need to actually do some calculating before we can * start the (n-m+1) pass of the FFT. The algorithm I developed sums up all of * the W(k) along each path to the n-m+1 pass (this is easy to do in polar form) * and multiplies signs as required, to generate a special set of coefficients * for the (n-m+1) pass, W'(k). * We then perform this pass using the W'(k), and the rest of the passes using * the W(k) coefficients. * * Pruning: * * Often it is not required to look at all of the output data. In this case, * we don't need to perform all of the last pass of butterflies. If we restrict * ourselves to looking only at the first R = 2^r output points, we can * systematically eliminate the calculation of all unnecessary butterflies. * This program also does that. * * The end padding and output pruning have been previously described in IEEE * papers by Screenivas and Rao, Skinner, and Markel. * * This center padding algorithm will be described more fully in the (hopefully) * September issue of IEEE ASSP (transactions?) magazine. * * I'm placing this program in the public domain. If you like it - great. If * you don't - oh well. If you use this algorithm in a commercial application, * I'd like to hear about it. * * Theo Smit * Department of Electrical Engineering * University of Calgary * 2500 University Drive, NW * Calgary AB T2N 1N4 * CANADA * * smit@enel.ucalgary.ca * * DISCLAIMER: Use the contents of this program and its output at your own risk. */ #include <stdio.h> #include <strings.h> #include <sys/file.h> #include <math.h> #define TRUE (1) #define FALSE (0) #define R_PERM (0400) #define W_PERM (0644) #define R_OPT (O_RDONLY) #define W_OPT (O_WRONLY|O_CREAT|O_TRUNC) typedef struct _complex { float re, im; } complex; typedef struct _polar {float mag, pha;} polar; char *malloc(), /* System space allocation routine */ *progname; /* Pointer to program name */ double atof(); /* System ascii to double converter */ int *rbit(), /* Bit erversal routine */ ilog2(); /* Routine to find log to base 2 */ main(argc, argv) int argc; char **argv; { FILE *infp = stdin, *outfp = stdout, *fopen(); char *inname, /* Input file name pointer */ *outname, /* Output file name pointer */ S[80]; int i,j, points = 256, /* Number of points in input */ zoomfact = 1, /* Input zoom factor */ prunefact = 1, /* Output pruning factor */ zoompts = 256, /* Number of points after zooming */ prunepts = 256, /* Number of points after pruning */ *brev, /* Pointer to bitreversed array */ infd = 0, /* Input file descriptor (default=stdin)*/ outfd = 1, /* Output file descriptor (stdout) */ NAME = FALSE, /* Flag set when filenames used */ INVERSE = FALSE;/* Flag set for inverse FFT */ complex *data, /* Pointer to data array */ *coef, /* Pointer to coefficient array */ *f_coef; /* Pointer to first pass coefficients */ progname = argv[0]; /* Decode command line args */ if (argc != 4 && argc < 5) { fprintf (stderr,"usage: %s [infile] points zoom prune [inv]\n", progname); exit(1); } if (argc == 4) { points = atoi (argv[1]); zoomfact = atoi(argv[2]); prunefact = atoi(argv[3]); } else { inname = argv[1]; NAME = TRUE; outname = malloc((unsigned)(strlen(inname)+strlen(".fft"))); strcpy (outname, inname); strcat (outname,".fft"); points = atoi (argv[2]); zoomfact = atoi(argv[3]); prunefact = atoi(argv[4]); if (argc == 6) if (strcmp(argv[5],"inv") == 0) INVERSE = TRUE; } if (NAME == TRUE) { /* Open input and output files */ #ifdef BINARY if ((infd = open(inname,R_OPT,R_PERM)) == -1) abort(inname); if ((outfd = open(outname,W_OPT,W_PERM)) == -1) abort(outname); #else if ((infp = fopen(inname,"r")) == NULL) abort(inname); if ((outfp = fopen(outname,"w")) == NULL) abort(outname); #endif BINARY } zoompts = points * zoomfact; /* Allocate space for arrays */ prunepts = zoompts/prunefact; data = (complex *)malloc((unsigned)(zoompts * sizeof(complex))); coef = (complex *)malloc((unsigned)(zoompts/2 * sizeof(complex))); f_coef = (complex *)malloc((unsigned)(zoomfact * sizeof(complex))); /* Read in input data */ #ifdef BINARY fprintf(stderr,"%s: read %d bytes\n",progname, read(infd,data,points*sizeof(complex))); close (infd); #else for(i = 0; i < points; i++) { fgets(S, 80, infp); sscanf(S, "%lf%lf", &(data[i].re), &(data[i].im)); } #endif BINARY /* Calculate coefficients */ calc_coefs(coef, zoompts, INVERSE); calc_first_coefs(f_coef, points, zoomfact, INVERSE); /* Rearrange data for FFT */ brev = rbit (points); swap_data(data, brev, points, zoomfact); for(i = 0; i < zoompts; i++) printf("%d: %lf %lf\n", i, data[i].re,data[i].im); /* Do FFT */ fft(data, coef, f_coef, zoompts, zoomfact, prunefact); /* Write output data */ #ifdef BINARY fprintf(stderr,"%s: wrote %d bytes\n",progname, write(outfd,(char *)data,prunepts*sizeof(complex))); close (outfd); #else for(i = 0; i < prunepts; i++) { fprintf(outfp, "%lf %lf\n", data[i].re, data[i].im); } fclose (outfp); #endif BINARY free ((char *)data); free ((char *)coef); free ((char *)f_coef); free ((char *)brev); } /************************************************************************/ /* the FFT is done by doing a pass using special coefficients to perform * the zooming; followed by several (or none) normal passes, followed by * several partial passes to effect the output pruning * * Some parts of this algorithm should be changed for greater efficiency * (most notably the recurring floating point division to address W(k)). */ fft(data, coef, f_coef, points, zoom, prune) complex *data, /* Pointer to data */ *coef, /* Pointer to coefficients */ *f_coef; /* Pointer to first pass coefficients */ int points, /* Number of points */ zoom, /* Zoom factor */ prune; /* Prune factor */ { int i, j, pass, /* Pass number */ start, /* Where to begin a set of butterflies */ prunepts, /* Number of points after pruning */ logpts, /* log to base 2 of points */ logzoom, /* log to base 2 of zoom factor */ logprune, /* log to base 2 of prune factor */ _do = 1, /* Number of butterflies to do in a set */ skip = 2; /* # of data points to skip between sets*/ complex *a, /* Index of first point in butterfly */ *b, /* Index of second point in butterfly */ *a2, /* Index of first source for butterfly */ *b2, /* Index of second source for butterfly */ A, /* Butterfly operation = */ B, /* A' = A + BW */ W, /* B' = A - BW */ BW; /* Use these for temporary storage */ /* For a given zoomfactor n, we */ /* can skip the first log2(n) */ logpts = ilog2(points); /* passes */ logzoom = ilog2(zoom); logprune = ilog2(prune); for(i = 0; i < logzoom; i++) { _do *= 2; /* Have to get right skip factor*/ skip *= 2; } a2 = data + _do - 1; b2 = a2 + _do; A = *a2; B = *b2; for(j = 0; j < _do; j++) { /* First butterfly set on first */ a = data + j; /* pass uses normal coefficients*/ b = a + _do; W=coef[(int)(points*((float)j/(float)_do))/2]; BW.re = B.re*W.re - B.im*W.im; BW.im = B.im*W.re + B.re*W.im; a->re = A.re + BW.re; a->im = A.im + BW.im; b->re = A.re - BW.re; b->im = A.im - BW.im; } for(i = 1; i < points/skip; i++) { /* Rest of first real pass - */ start = i * skip; /* special coefficients */ a2 = data + start + _do - 1; b2 = a2 + _do; A = *a2; B = *b2; for(j = 0; j < _do; j++) { a = data + start + j; b = a + _do; W=f_coef[j]; BW.re = B.re*W.re - B.im*W.im; BW.im = B.im*W.re + B.re*W.im; a->re = A.re + BW.re; a->im = A.im + BW.im; b->re = A.re - BW.re; b->im = A.im - BW.im; } } _do *= 2; skip *= 2; /* Now we do normal FFT passes */ for (pass = logzoom + 1; pass < logpts - logprune; pass++) { for(i = 0; i < points/skip; i++) { start = i * skip; for(j = 0; j < _do; j++) { a = data + start + j; b = a + _do; W=coef[(int)(points*((float)j/(float)_do))/2]; A = *a; B = *b; BW.re = B.re*W.re - B.im*W.im; BW.im = B.im*W.re + B.re*W.im; a->re = A.re + BW.re; a->im = A.im + BW.im; b->re = A.re - BW.re; b->im = A.im - BW.im; } } _do *= 2; skip *= 2; } prunepts = points / prune; for (; pass < logpts; pass++) { /* Output pruning passes */ for(i = 0; i < points/skip; i++) { start = i * skip; for(j = 0; j < prunepts; j++) { a = data + start + j; b = a + _do; W=coef[(int)(points*((float)j/(float)_do))/2]; A = *a; B = *b; BW.re = B.re*W.re - B.im*W.im; BW.im = B.im*W.re + B.re*W.im; a->re = A.re + BW.re; a->im = A.im + BW.im; } } _do *= 2; skip *= 2; } } /************************************************************************/ swap_data(data, brev, points, zoom) complex *data; /* Pointer to data */ int *brev, /* Pointer to bitreversed array */ points, /* Number of points */ zoom; /* Zoom factor */ { int i, j; complex dummy; /* Temporary space */ /* First we will reorder the data */ for(i = 1; i < points-1; ++i) { if(brev[i] < i) { dummy = data[i]; data[i] = data[brev[i]]; data[brev[i]] = dummy; } } /* Then move'er out! */ for(i = points-1; i >= 0; i--) data[(i+1) * zoom - 1] = data[i]; /* move out to last spot */ } /************************************************************************/ int *rbit(points) int points; /* Number of points to be bitreversed */ { int i, j, logpts, /* Log base 2 of points */ fiddle = 1, /* Half of some (or all) of the points */ *brev; /* Pointer to bitreversed array */ /* Allocate space for array & initialize*/ brev = (int *)malloc((unsigned)(points*sizeof(int))); brev[0] = 0; logpts = ilog2 (points); for (i = 0; i < logpts; i++) { /* Generate bitreversed sequence*/ for(j = 0; j < fiddle; j++) { brev[j] *= 2; brev[j + fiddle] = brev[j]+1; } fiddle *= 2; } return(brev); /* Return pointer to array */ } /************************************************************************/ int ilog2(num) int num; { int result = 0; while (num > 1){ num /= 2; result += 1; } return (result); } /************************************************************************/ calc_first_coefs(fcoefs, points, zoom, inv) complex *fcoefs; /* Pointer to coefficient array */ int points, /* Number of points */ zoom, /* Zoom factor */ inv; /* Flag set for inverse FFT */ { int i, j; float sign; /* Sign of phases (depends on inv) */ polar *phases; /* Phase correction factors */ phases = (polar *)malloc ((unsigned)(2*zoom*sizeof(polar))); for (i = 0; i < zoom; i++) { /* Initialize correction factors*/ phases[i].mag = 1.0; phases[i].pha = 0.0; } for (i = 1; i <= zoom; i *= 2) { /* Calculate correction */ for( j = 0; j < i; j++) { phases[j].pha += (float)j/(float)i; phases[j+i].mag = -1.0*phases[j].mag; phases[j+i].pha = phases[j].pha; } } sign = ((inv == TRUE) ? (1.0) : (-1.0)); for (i = 0; i < zoom; i++) { /* Convert to rectangular */ fcoefs[i].re = phases[i].mag * cos(sign * M_PI*phases[i].pha); fcoefs[i].im = phases[i].mag * sin(sign * M_PI*phases[i].pha); } free((char*)phases); } /************************************************************************/ calc_coefs(coefs, points, inv) complex *coefs; /* Pointer to coefficients */ int points, /* Number of points */ inv; /* Flag set for inverse FFT */ { int i; float angle = 0.0, /* Angle of complex sinusoid */ inc; /* Increment for complex sinusoid */ inc = ((inv == TRUE) ? (1.0) : (-1.0)) * M_PI / (float)(points/2); for (i = 0; i < points/2; i++) {/* We only need N/2 coefficients */ /* to do an N point FFT */ coefs[i].re = cos(angle); coefs[i].im = sin(angle); angle += inc; } } /************************************************************************/ abort(error) char *error; { fprintf(stderr,"ERROR! %s!\n",error); exit(1); }
smit@.ucalgary.ca (Theo Smit) (01/13/90)
Additional notes to my previous posting: Obviously, the compile line in the source was intended for Sun-3 systems. Modify to suit your tastes. I will not be reading news regularly from now on, since I start another job at NovAtel Communications on the 15th. I will try to post a forwarding address if at all possible. However, personal email should still work if you use smit@enel.ucalgary.ca If this doesn't work, you can send messages to me via knud@enel.ucalgary.ca (Steve Knudsen) He's a regular reader of comp.dsp, so I will hear about comments through him, if my acount here gets toasted sooner than I think it will be. happy dsp'ing, Theo Smit.