[comp.dsp] EZFFT Source

smit@enelc.uucp (01/13/90)

/*
 *
 *	Efficient Zooming Fast Fourier Transform  (EZFFT)
 *
 *	FFT Sinc Interpolation for Center Padded Data
 *
 *	Copyright (C) 1989 Theo J. Smit
 *
 *	To compile : cc -O4 -ffpa    -o ezfft ezfft.c -lm
 *                          -f68881
 *                          -fswitch
 *
 * To use binary input and output files, add -DBINARY in the above 'cc'
 * statement. Default is ascii (human-readable) form. The input file
 * should have one real and one imaginary datum per line. Output
 * will also be one complex data point per line.
 *
 * Algorithm description:
 *
 * This is a 1D fft, intended to interpolate data sequences that look
 * like this:
 *		|--         --|
 *              |  \       /  |
 *              |   \     /   |
 *              |____\___/____|
 *
 * This is what most frequency spectra look like (roughly), so this
 * algorithm is most useful for inverse transforming and interpolating
 * simultaneously.
 *
 * The algorithm is based on the fact that the data occupies known
 * places in the input data sequence. The zeros that are usually
 * inserted in the center give known results for the first several
 * passes, so these calculations can be eliminated.
 *
 * We define the (DIT) butterfly operation as:
 *
 *		A' = A + BW(k)
 *
 *		B' = A - BW(k)
 *
 * where A and B are complex quantities, and W(k) = exp(2*pi*k/N), where
 * N is the zoomed FFT point size. A will be referred to as the 'upper'
 * input to the butterfly, and B is the 'lower' input. Similarly, 
 * A' and B' are the upper and lower outputs, respectively.
 * It helps to draw these things out as usually done in the texts,
 * or as in the AMD Am29540 data sheets.
 *
 * We also assume there are M = 2^m input points, to be zoomed into
 * N = 2^n FFT points. If we don't want to compute all of the output points,
 * we can prune the output to get only the first R = 2^r output points.
 * 
 * End padding the data sequence is a trivial case of efficient zooming,
 * since for the first several passes, the lower (B) inputs are always
 * zero, which means A' = B' = A. Therefore, we can eliminate the first
 * (n-m) passes of the FFT, simply replicate the data points as required,
 * and perform all of the remaining butterflies.
 *
 * Center padding the data is more involved, since in some cases the upper input
 * of the butterfly will be zero, giving the result:
 *
 *		A' = BW(k)
 *
 *		B' = -BW(k)
 *
 * This means we will need to actually do some calculating before we can
 * start the (n-m+1) pass of the FFT. The algorithm I developed sums up all of
 * the W(k) along each path to the n-m+1 pass (this is easy to do in polar form)
 * and multiplies signs as required, to generate a special set of coefficients
 * for the (n-m+1) pass, W'(k).
 * We then perform this pass using the W'(k), and the rest of the passes using
 * the W(k) coefficients.
 *
 * Pruning:
 *
 * Often it is not required to look at all of the output data. In this case,
 * we don't need to perform all of the last pass of butterflies. If we restrict
 * ourselves to looking only at the first R = 2^r output points, we can
 * systematically eliminate the calculation of all unnecessary butterflies.
 * This program also does that.
 *
 * The end padding and output pruning have been previously described in IEEE
 * papers by Screenivas and Rao, Skinner, and Markel.
 *
 * This center padding algorithm will be described more fully in the (hopefully)
 * September issue of IEEE ASSP (transactions?) magazine.
 *
 * I'm placing this program in the public domain. If you like it - great. If
 * you don't - oh well. If you use this algorithm in a commercial application,
 * I'd like to hear about it.
 *
 * Theo Smit
 * Department of Electrical Engineering
 * University of Calgary
 * 2500 University Drive, NW
 * Calgary AB T2N 1N4
 * CANADA
 *
 * smit@enel.ucalgary.ca
 *
 * DISCLAIMER: Use the contents of this program and its output at your own risk.
 */
#include <stdio.h>
#include <strings.h>
#include <sys/file.h>
#include <math.h>

#define TRUE	(1)
#define FALSE	(0)
#define R_PERM	(0400)
#define W_PERM	(0644)
#define R_OPT	(O_RDONLY)
#define	W_OPT	(O_WRONLY|O_CREAT|O_TRUNC)

typedef struct _complex { float re, im; } complex;
typedef struct _polar {float mag, pha;} polar;

char	*malloc(),		/* System space allocation routine	*/
	*progname;		/* Pointer to program name		*/
double	atof();			/* System ascii to double converter	*/
int	*rbit(),		/* Bit erversal routine			*/
	ilog2();		/* Routine to find log to base 2	*/

main(argc, argv)
int	argc;
char	**argv;
{
	FILE	*infp = stdin,
		*outfp = stdout,
		*fopen();

	char	*inname,	/* Input file name pointer		*/
		*outname,	/* Output file name pointer		*/
		S[80];

	int	i,j,
		points = 256,	/* Number of points in input 		*/
		zoomfact = 1,	/* Input zoom factor			*/
		prunefact = 1,	/* Output pruning factor		*/
		zoompts = 256,	/* Number of points after zooming	*/
		prunepts = 256,	/* Number of points after pruning	*/
		*brev,		/* Pointer to bitreversed array		*/
		infd = 0,	/* Input file descriptor (default=stdin)*/
		outfd = 1,	/* Output file descriptor (stdout)	*/
		NAME = FALSE,	/* Flag set when filenames used		*/
		INVERSE = FALSE;/* Flag set for inverse FFT		*/

	complex	*data,		/* Pointer to data array		*/
		*coef,		/* Pointer to coefficient array		*/
		*f_coef;	/* Pointer to first pass coefficients	*/

	progname = argv[0];		/* Decode command line args	*/
	if (argc != 4 && argc < 5) {
		fprintf (stderr,"usage: %s [infile] points zoom prune [inv]\n",
				progname);
		exit(1);
	}
	if (argc == 4) { 
		points = atoi (argv[1]);
		zoomfact = atoi(argv[2]);
		prunefact = atoi(argv[3]);
	}
	else {
		inname = argv[1];
		NAME = TRUE;
		outname = malloc((unsigned)(strlen(inname)+strlen(".fft")));
		strcpy (outname, inname);
		strcat (outname,".fft");
		points = atoi (argv[2]);
		zoomfact = atoi(argv[3]);
		prunefact = atoi(argv[4]);
		if (argc == 6) 
			if (strcmp(argv[5],"inv") == 0) INVERSE = TRUE;
	}

	if (NAME == TRUE) {		/* Open input and output files	*/
#ifdef BINARY
		if ((infd = open(inname,R_OPT,R_PERM)) == -1) abort(inname);
		if ((outfd = open(outname,W_OPT,W_PERM)) == -1) abort(outname);
#else
		if ((infp = fopen(inname,"r")) == NULL) abort(inname);
		if ((outfp = fopen(outname,"w")) == NULL) abort(outname);
#endif BINARY
	}

	zoompts = points * zoomfact;	/* Allocate space for arrays	*/
	prunepts = zoompts/prunefact;

	data = (complex *)malloc((unsigned)(zoompts * sizeof(complex)));
	coef = (complex *)malloc((unsigned)(zoompts/2 * sizeof(complex)));
	f_coef = (complex *)malloc((unsigned)(zoomfact * sizeof(complex)));

					/* Read in input data		*/
#ifdef BINARY
	fprintf(stderr,"%s: read %d bytes\n",progname,
		read(infd,data,points*sizeof(complex)));
	close (infd);
#else
	for(i = 0; i < points; i++) {
		fgets(S, 80, infp);
		sscanf(S, "%lf%lf", &(data[i].re), &(data[i].im));
	}
#endif BINARY
					/* Calculate coefficients	*/
	calc_coefs(coef, zoompts, INVERSE);
	calc_first_coefs(f_coef, points, zoomfact, INVERSE);

					/* Rearrange data for FFT	*/
	brev = rbit (points);
	swap_data(data, brev, points, zoomfact);

	for(i = 0; i < zoompts; i++)
		printf("%d: %lf %lf\n", i, data[i].re,data[i].im);
					/* Do FFT			*/
	fft(data, coef, f_coef, zoompts, zoomfact, prunefact);

					/* Write output data		*/
#ifdef BINARY
	fprintf(stderr,"%s: wrote %d bytes\n",progname,
		write(outfd,(char *)data,prunepts*sizeof(complex)));
	close (outfd);
#else
	for(i = 0; i < prunepts; i++) {
		fprintf(outfp, "%lf %lf\n", data[i].re, data[i].im);
	}
	fclose (outfp);
#endif BINARY

	free ((char *)data);
	free ((char *)coef);
	free ((char *)f_coef);
	free ((char *)brev);
}

/************************************************************************/
/* the FFT is done by doing a pass using special coefficients to perform
 * the zooming; followed by several (or none) normal passes, followed by
 * several partial passes to effect the output pruning
 *
 * Some parts of this algorithm should be changed for greater efficiency
 * (most notably the recurring floating point division to address W(k)).
 */
fft(data, coef, f_coef, points, zoom, prune)
complex	*data,			/* Pointer to data			*/
	*coef,			/* Pointer to coefficients		*/
	*f_coef;		/* Pointer to first pass coefficients	*/
int	points,			/* Number of points			*/
	zoom,			/* Zoom factor				*/
	prune;			/* Prune factor				*/
	{
	int	i,
		j,
		pass,		/* Pass number				*/
		start,		/* Where to begin a set of butterflies	*/
		prunepts,	/* Number of points after pruning	*/
		logpts,		/* log to base 2 of points		*/
		logzoom,	/* log to base 2 of zoom factor		*/
		logprune,	/* log to base 2 of prune factor	*/
		_do = 1,	/* Number of butterflies to do in a set	*/
		skip = 2;	/* # of data points to skip between sets*/

	complex *a,		/* Index of first point in butterfly	*/
		*b,		/* Index of second point in butterfly	*/
		*a2,		/* Index of first source for butterfly	*/
		*b2,		/* Index of second source for butterfly	*/
		A,		/* Butterfly operation =		*/
		B,		/*		A' = A + BW		*/
		W,		/* 		B' = A - BW		*/
		BW;		/* Use these for temporary storage	*/
			
					/* For a given zoomfactor n, we	*/
					/* can skip the first log2(n) 	*/
	logpts = ilog2(points);		/* passes			*/
	logzoom = ilog2(zoom);
	logprune = ilog2(prune);
	for(i = 0; i < logzoom; i++) {
		_do *= 2;		/* Have to get right skip factor*/
		skip *= 2;
	}

	a2 = data + _do - 1;
	b2 = a2 + _do;
	A = *a2;
	B = *b2;
	for(j = 0; j < _do; j++) {	/* First butterfly set on first	*/
		a = data + j;		/* pass uses normal coefficients*/
		b = a + _do;		
		W=coef[(int)(points*((float)j/(float)_do))/2];
		BW.re = B.re*W.re - B.im*W.im;
		BW.im = B.im*W.re + B.re*W.im;
		a->re = A.re + BW.re;
		a->im = A.im + BW.im;
		b->re = A.re - BW.re;
		b->im = A.im - BW.im;
	}

	for(i = 1; i < points/skip; i++) { /* Rest of first real pass - */
		start = i * skip;	   /* special coefficients 	*/
		a2 = data + start + _do - 1;
		b2 = a2 + _do;
		A = *a2;
		B = *b2;
		for(j = 0; j < _do; j++) {
			a = data + start + j;
			b = a + _do;
			W=f_coef[j];
			BW.re = B.re*W.re - B.im*W.im;
			BW.im = B.im*W.re + B.re*W.im;
			a->re = A.re + BW.re;
			a->im = A.im + BW.im;
			b->re = A.re - BW.re;
			b->im = A.im - BW.im;
		}
	}
	_do *= 2;
	skip *= 2;
					/* Now we do normal FFT passes	*/
	for (pass = logzoom + 1; pass < logpts - logprune; pass++) {
		for(i = 0; i < points/skip; i++) {
			start = i * skip;
			for(j = 0; j < _do; j++) {
				a = data + start + j;
				b = a + _do;
				W=coef[(int)(points*((float)j/(float)_do))/2];
				A = *a;
				B = *b;
				BW.re = B.re*W.re - B.im*W.im;
				BW.im = B.im*W.re + B.re*W.im;
				a->re = A.re + BW.re;
				a->im = A.im + BW.im;
				b->re = A.re - BW.re;
				b->im = A.im - BW.im;
			}
		}
		_do *= 2;
		skip *= 2;
	}
	prunepts = points / prune;
	for (; pass < logpts; pass++) {	/* Output pruning passes	*/
		for(i = 0; i < points/skip; i++) {
			start = i * skip;
			for(j = 0; j < prunepts; j++) {
				a = data + start + j;
				b = a + _do;
				W=coef[(int)(points*((float)j/(float)_do))/2];
				A = *a;
				B = *b;
				BW.re = B.re*W.re - B.im*W.im;
				BW.im = B.im*W.re + B.re*W.im;
				a->re = A.re + BW.re;
				a->im = A.im + BW.im;
			}
		}
		_do *= 2;
		skip *= 2;
	}
}
/************************************************************************/
swap_data(data, brev, points, zoom)
complex	*data;			/* Pointer to data			*/
int	*brev,			/* Pointer to bitreversed array		*/
	points,			/* Number of points			*/
	zoom;			/* Zoom factor				*/
	{
	int 	i, j;
	complex	dummy;		/* Temporary space			*/

					/* First we will reorder the data */
	for(i = 1; i < points-1; ++i) {
		if(brev[i] < i) {
			dummy = data[i];
			data[i] = data[brev[i]];
			data[brev[i]] = dummy;
		}
	}
					/* Then move'er out!		*/
	for(i = points-1; i >= 0; i--) 
			data[(i+1) * zoom - 1] = data[i];
					/* move out to last spot	*/
}

/************************************************************************/
int *rbit(points)
int	points;			/* Number of points to be bitreversed	*/
	{
	int	i,
		j,
		logpts,		/* Log base 2 of points			*/
		fiddle = 1,	/* Half of some (or all) of the points	*/
		*brev;		/* Pointer to bitreversed array		*/

				/* Allocate space for array & initialize*/
	brev = (int *)malloc((unsigned)(points*sizeof(int)));
	brev[0] = 0;

	logpts = ilog2 (points);

	for (i = 0; i < logpts; i++) {	/* Generate bitreversed sequence*/
		for(j = 0; j < fiddle; j++) {
			brev[j] *= 2;
			brev[j + fiddle] = brev[j]+1;
		}
		fiddle *= 2;
	}
	return(brev);		/* Return pointer to array 		*/
}
 
/************************************************************************/
int ilog2(num)
int num;
	{
	int	result = 0;
	
	while (num > 1){
		num /= 2;
		result += 1;
	}
	return (result);
}
 
/************************************************************************/
calc_first_coefs(fcoefs, points, zoom, inv)
complex	*fcoefs;		/* Pointer to coefficient array		*/
int	points,			/* Number of points			*/
	zoom,			/* Zoom factor				*/
	inv;			/* Flag set for inverse FFT		*/

	{
	int	i, j;
	float	sign;		/* Sign of phases (depends on inv)	*/
	polar	*phases;	/* Phase correction factors		*/

	phases = (polar *)malloc ((unsigned)(2*zoom*sizeof(polar)));

	for (i = 0; i < zoom; i++) {	/* Initialize correction factors*/
		phases[i].mag = 1.0;
		phases[i].pha = 0.0;
	}

	for (i = 1; i <= zoom; i *= 2) { /* Calculate correction	*/
		for( j = 0; j < i; j++) {
			phases[j].pha += (float)j/(float)i;
			phases[j+i].mag = -1.0*phases[j].mag;
			phases[j+i].pha = phases[j].pha;
		}
	}

	sign = ((inv == TRUE) ? (1.0) : (-1.0));

	for (i = 0; i < zoom; i++) {	/* Convert to rectangular	*/

		fcoefs[i].re = phases[i].mag * cos(sign * M_PI*phases[i].pha);
		fcoefs[i].im = phases[i].mag * sin(sign * M_PI*phases[i].pha);
	}
	free((char*)phases);
}
/************************************************************************/
calc_coefs(coefs, points, inv)
complex	*coefs;			/* Pointer to coefficients		*/
int	points,			/* Number of points			*/
	inv;			/* Flag set for inverse FFT		*/
	{
	int	i;
	float	angle = 0.0,	/* Angle of complex sinusoid		*/
		inc;		/* Increment for complex sinusoid	*/

	inc = ((inv == TRUE) ? (1.0) : (-1.0)) * M_PI / (float)(points/2);

	for (i = 0; i < points/2; i++) {/* We only need N/2 coefficients */
					/* to do an N point FFT	*/
		coefs[i].re = cos(angle);
		coefs[i].im = sin(angle);
		angle += inc;
	}
}
/************************************************************************/
abort(error)
char	*error;
	{
	fprintf(stderr,"ERROR! %s!\n",error);
	exit(1);
}

smit@.ucalgary.ca (Theo Smit) (01/13/90)

Additional notes to my previous posting:

Obviously, the compile line in the source was intended for Sun-3 systems.
Modify to suit your tastes.

I will not be reading news regularly from now on, since I start another
job at NovAtel Communications on the 15th. I will try to post a forwarding
address if at all possible. However, personal email should still work if
you use

smit@enel.ucalgary.ca

If this doesn't work, you can send messages to me via

knud@enel.ucalgary.ca (Steve Knudsen)

He's a regular reader of comp.dsp, so I will hear about comments through
him, if my acount here gets toasted sooner than I think it will be.

happy dsp'ing,

Theo Smit.