[comp.dsp] 48k to 44.1k sample rate conversion

joe@media-lab.media.mit.edu.MEDIA.MIT.EDU (Joe Chung) (05/13/91)

In the digital audio world, a lot of sound is recorded at a 48k sample
rate. A CD, however, is at 44.1k. As we all know, the classic
technique of sample rate conversion would have us upsample to the
lowest common multiple, low-pass filter, and downsample to the desired
rate. The only problem is that the LCM of 44.1k and 48k is 21.168 Meg!
A 4 minute stereo soundfile at 48k takes up about 50 megabytes, which
means that its 21.168M upsampled version would be over 22 gigabytes!

Obviously, sample rate converters don't store the entire intermediate,
upsampled version, but I was wondering if anyone has a good idea of 
what sorts of schemes they do use. Do they do some sort of overlap-
save method akin to block convolution? Does anyone have any code
to do this?

- Joe Chung

mark@adec23.uucp (Mark Salyzyn) (05/13/91)

>As we all know, the classic technique of sample rate conversion would have us
>upsample to the lowest common multiple, low-pass filter, and downsample to the
>desired rate. The only problem is that the LCM of 44.1k and 48k is 21.168 Meg!

Why not, on the fly in a DSP chip for example, perform a first-order hold
function (linear interpolation) and use the interpolated values to feed a
44.1K sampling rate output function?  You could perform the first-order hold
at the full upsampled rate to prove the point, but all you need do is figure
out the interpolated value at a 44.1K rate.  The DSP chip would still have to
be pretty fast, since a TMS320C25 would be hard pressed to do this and
perform its A/D and D/A interface functions ...

Sorry, no code, since that would be telling secrets :-) :-)

Ciao, Mark Salyzyn  mark@adec23.UUCP

hanke@nessie.cs.id.ethz.ch (Norbert Hanke) (05/14/91)

In article <5826@media-lab.media.mit.edu.MEDIA.MIT.EDU> joe@media-lab.media.mit.edu (Joe Chung) writes:
( some stuff deleted )

>rate. The only problem is that the LCM of 44.1k and 48k is 21.168 Meg!

It is 7.056 MHz: 48000 = 2^7 * 3 * 5^3 and 44100 = 2^2 * 3^2 * 5^2 * 7^2,
so the least common multiple is 2^7 * 3^2 * 5^3 * 7^2 = 7,056,000.

>Obviously, sample rate converters don't store the entire intermediate,
>upsampled version, but I was wondering if anyone has a good idea of 
>what sorts of schemes they do use. Do they do some sort of overlap-
>save method akin to block convolution? Does anyone have any code
>to do this?

Sample rate conversion is not that difficult: when done by a FIR filter
the filter coefficients have to be calculated for the 7.056 MHz sampling
frequency. But fortunately, 146 out of 147 input samples (at 7.056 MHz) are
0 for 48 kHz input, and 159 out of 160 output samples don't have to be
calculated. These two facts together make the computing power requirements
reasonable for sampling rate converters. More difficult, however, is the
calculation of about 10000 filter coefficients for a high quality sampling
rate converter.
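
(To put rough numbers on the computing load: with the ~10000-tap prototype,
each output sample touches only about 10000/147 = 68 nonzero input samples,
i.e. roughly 68 * 44100 = 3 million multiply-accumulates per second.)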

Norbert Hanke
ETH Zurich, Switzerland 

bweeks@sirius.UVic.CA (Brent Weeks) (05/14/91)

I recently tackled the sample-rate conversion problem in some of my
own work.  The paper that got me on the right track is:

"Interpolation and Decimation of Digital Signals - A Tutorial Review"
by Ronald E. Crochiere and Lawrence R. Rabiner.
This paper is in Proceedings of the IEEE, Vol. 69, No. 3, 1981, pp.
300-331.

In short, efficient implementation of Sample Rate Conversion involves
filtering short windows (of length Q) of the input data using a
polyphase filter (FIR with Q taps) operating at the output sample
rate.  The polyphase filter has periodically time-varying
coefficients.

The sample rate converter should be quite do-able on a DSP Chip.  Use
an up-conversion factor of 160 and a down-conversion factor of 147.
(44.1 * 160/147 = 48)  The polyphase filter cycles through 160 states,
then repeats.  The filter coefficients would be stored in 160*Q words.
For each output sample, besides managing windowing of the input data,
just Q multiply and accumulates need to be performed.  Typical values
of Q range from 5 to 20.
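
To make the structure concrete, here is a rough C sketch of such a polyphase
converter (untested and illustrative only; the names are mine, and h[] is
assumed to be an Nh-tap low-pass prototype designed at L times the input
rate, with unity DC gain):

void resample(const double *x, int nx, double *y, int ny,
              const double *h, int Nh, int L, int M)
{
    int n, m, k, phase;
    long pos, base, i;
    double acc;

    for (n = 0; n < ny; n++) {
        pos   = (long)n * M;     /* position in the notional upsampled stream */
        phase = (int)(pos % L);  /* which of the L coefficient phases to use  */
        base  = pos / L;         /* newest input sample under the filter      */
        acc   = 0.0;
        /* Only every L-th tap lands on a real (non-stuffed) input sample,
           so each output costs about Nh/L multiply-accumulates.  */
        for (m = phase, k = 0; m < Nh; m += L, k++) {
            i = base - k;
            if (i >= 0 && i < nx)
                acc += h[m] * x[i];
        }
        y[n] = L * acc;          /* restore the gain lost to zero-stuffing */
    }
}

Going from 44.1 kHz up to 48 kHz this is called with L = 160 and M = 147, so
the inner loop runs Q = Nh/160 times per output sample; going from 48 kHz
down to 44.1 kHz you would use L = 147 and M = 160.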

As far as determining the filter coefficients goes, read the paper!

I have implemented a sample rate converter along these lines in
Matlab.  It is good enough for my work, but is too slow for signals
greater than a few thousand samples in length.

				Brent Weeks
				bweeks@fermat.uvic.ca

fede@bernina.ethz.ch (Federico Bonzanigo) (05/15/91)

In article <1991May13.173129.18295@bernina.ethz.ch> hanke@nessie.cs.id.ethz.ch (Norbert Hanke) writes:
>
>Sample rate conversion is not that difficult: when done by a FIR filter
>the filter coefficients have to be calculated for the 7.056 MHz sampling
>frequency. But fortunately, 146 out of 147 input samples (at 7.056 MHz) are
>0 for 48 kHz input, and 159 out of 160 output samples don't have to be
>calculated. These two facts together make the computing power requirements
>reasonable for sampling rate converters. More difficult, however, is the
>calculation of about 10000 filter coefficients for a high quality sampling
>rate converter.
>
>Norbert Hanke
>ETH Zurich, Switzerland 

I have to add the following:

- It should be possible to design a FIR filter of such a length with a
suitable Remez algorithm and computer power. A filter of length up to
ca. 15000 has been designed on an Alliant FX80. A modified
Parks-McClellan program should do the job. The modifications should
reduce the arithmetic errors and use better initial values to ensure
the convergence of the Remez algorithm. Some of them are described in
the last two sections of a paper of mine "Some Improvements to the
Design Programs for Equiripple FIR Filters", Proc. ICASSP 82, pp.
274-277.  Other improvements aimed at reducing the computation time
will surely help, such as the ones described in: A. Antoniou, "New
Improved Method for the Design of Weighted-Chebyshev, Nonrecursive,
Digital Filters", IEEE Trans. on Circuits and Systems, Vol. CAS-10, pp.
740-750 (Oct.  1983).  Circuits and Systems, Vol. CAS-30, No. 10, pp.
740-750.  Obviously I claim to have a better suited program...

- A trick that has been used in this field to avoid the direct design
of such a long filter is to design a filter of length around 1000-2000
using the Remez algorithm and then to interpolate the coefficients to
the desired filter length. 


Federico Bonzanigo
  Electrical Engineering Dept.
  Swiss Federal Institute of Technology (ETH)
  ETH-Zentrum
  CH-8092 Zurich, Switzerland
E-mail:	      bonzanigo@nimbus.ethz.ch
EARN/BITNET:  BONZANIGO@CZHETH5A.bitnet
EUNET/UUCP:   fede@ethz.UUCP or ...!mcvax!chx400!ethz!fede
VAX/PSI MAIL: PSI%022847931149411::BONZANIGO
Phone: +41 (1) 256-5134      (+ = whatever you have to dial
Fax:   +41 (1) 251-2172       to call outside your country)
Telex: 817 115 vaw ch

mfvargo@netcom.COM (Michael Vargo) (05/16/91)

I have been following this thread, and it seems the approaches are quite
elegant, but maybe a little bit complicated.

1. If there really is significant energy in the 22,050 to 24,000 Hz band,
then a low pass filter will be required to prevent aliasing on the
downsample.  Maybe this signal is dog whistles, so something will be needed.
It seems a simple FIR filter could be applied to the 48kHz sampled data to
wipe this out.  I have an Excel spreadsheet to calculate coefs if you're
interested.  But, I wouldn't bother.  Of course, I play guitar. :-)

2. An interpolation can be done using a function similar to the following.
I wrote this to downsample Mac sounds before converting them to mu-law PCM
at 8kHz.


int crush(unsigned char *x, unsigned char *y, int ix, int iy)
{
	/*  This function will take ix samples and crush them into iy	*/
	/*  samples (iy <= ix) by linear interpolation, to down sample	*/
	/*  a data stream to a lower data rate.  Note that i*ix must	*/
	/*  stay within int range, so keep the buffers modest.		*/

	int	i, ipoint;
	int	delta;

	for ( i = 0 ; i < iy ; i++ )
	{
		ipoint = (i*ix)/iy;	/* input sample at or before this output */
		if ( ipoint+1 >= ix )	/* don't read past the end of x[] */
		{
			y[i] = x[ix-1];
			continue;
		}
		delta = x[ipoint+1] - x[ipoint];
		/* scale by the fractional position, (i*ix - ipoint*iy)/iy */
		delta = (delta*((i*ix)-(ipoint*iy)))/iy;
		y[i] = x[ipoint] + delta;
	}
	return 0;
}
In this case, the samples are only a byte wide.  You could go crazy and use
double floats.  This function used to do that but it was quite slow.  It
seems my Mac IIsi is missing the math co-processor!
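
(For example, with hypothetical buffers in[] and out[], where in[] holds n
samples at 48kHz, crush(in, out, n, n/6) would take them down to 8kHz.)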

Now my question:

Does the act of interpolation actually do some sort of low pass filtering?
I think not but maybe it could be looked at as some sort of averaging?

lance@motcsd.csd.mot.com (lance.norskog) (05/17/91)

In princeton.edu:pub/music/resample.tar.Z is a source code shar for
a general resampling filter.  It appears to do the sophisticated
analog processing that previous posters have claimed is the right way.
I haven't tried it out yet.  (I only have a Sound Blaster, so 
cheesy linear interpolation is probably fine for me.)

Lance Norskog

cas@media-lab.media.mit.edu.MEDIA.MIT.EDU (Scud) (05/17/91)

In article <1991May16.052920.21060@netcom.COM> mfvargo@netcom.COM (Michael Vargo) writes:
>Now my question:
>
>Does the act of interpolation actually do some sort of low pass filtering?
>I think not but maybe it could be looked at as some sort of averaging?

No. It's a cascade of Hilbert transforms. With the right sequence of phase
shifts it has been determined by this newsgroup that just about any DSP
operation is possible.

Cas

hanke@nessie.cs.id.ethz.ch (Norbert Hanke) (05/17/91)

In article <1991May16.052920.21060@netcom.COM> mfvargo@netcom.COM (Michael Vargo) writes:
(some stuff deleted)
>1. If there really is significant energy in the 22,050 to 24,000 Hz band,
>then a low pass filter will be required to prevent aliasing on the
>downsample.  Maybe this signal is dog whistles, so something will be needed.
>It seems a simple FIR filter could be applied to the 48kHz sampled data to
>wipe this out.  I have an Excel spreadsheet to calculate coefs if you're

The ~10000-Tap FIR filter I described earlier does exactly that. It has
to be designed as a FIR filter for a sampling frequency of 7.056 MHz with
the following properties: passband 0..20 kHz, stopband 24.1 kHz..fs/2
(or 22.05 kHz..fs/2 if you don't want to have alias products in the range
20..22.05 kHz).

>Does the act of interpolation actually do some sort of low pass filtering?
>I think not but maybe it could be looked at as some sort of averaging?

An FIR filter does nothing more than a weighted interpolation between samples.
Simple interpolation is FIR filtering with all filter coefficients having the
same value, which results in a sin(x)/x - shaped frequency response.
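
(In formula form: an N-point average running at sample rate fs has magnitude
response |sin(pi*N*f/fs) / (N*sin(pi*f/fs))|, which for f well below fs is
essentially that sin(x)/x shape.)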

Norbert Hanke
ETH Zurich, Switzerland

eyh@quiet.Eng.Sun.COM (Chuck Hsiao) (05/18/91)

My paper "Polyphase Filter Matrix for Rational Sampling Rate Conversions" on
Proc. of ICASSP '87, pp. 2173-2176 should help for this design need. The
polyphase filter matrix downsamples first and then upsamples, producing
exactly the same output.  So the intermediate sampling rate is 300 Hz instead
of 7.056 MHz.
What a difference!
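
(300 Hz is the greatest common divisor of the two rates: 48000 = 160 * 300
and 44100 = 147 * 300.)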

- Chuck

lou@caber.valid.com (Louis K. Scheffer) (05/18/91)

joe@media-lab.media.mit.edu.MEDIA.MIT.EDU (Joe Chung) writes:


>In the digital audio world, a lot of sound is recorded at a 48k sample
>rate. A CD, however, is at 44.1k. As we all know, the classic
>technique of sample rate conversion would have us upsample to the
>lowest common multiple, low-pass filter, and downsample to the desired
>rate. The only problem is that the LCM of 44.1k and 48k is 21.168 Meg!
>A 4 minute stereo soundfile at 48k takes up about 50 megabytes, which
>means that its 21.168M upsampled version would be over 22 gigabytes!

>Obviously, sample rate converters don't store the entire intermediate,
>upsampled version, but I was wondering if anyone has a good idea of 
>what sorts of schemes they do use. Do they do some sort of overlap-
>save method akin to block convolution? Does anyone have any code
>to do this?

The most straightforward way is as follows:  You interpolate 146 zeros
between every sample to get a data rate of 7.056 MHz.  You then
design a digital filter to low-pass this to 20 KHz.  This can be an
FIR filter with perfect phase and arbitrarily good frequency response,
subject only to the limits of your computing power.  One DSP chip
will easily get you +-0.05 db.  You then discard 159 of every 160
samples to get 44.1 KHz.  Since there is no content above 22.05 KHz,
this loses nothing.  Assuming you have done all high-precision math up to this
point, then the last step, rounding back to 16 bits, is the only
one that introduces any error.  This error will most likely appear
as white noise spread over the whole spectrum (just like the noise
introduced on the first A/D conversion).  The level should be very
low (-96 db, more or less).

Computationally, you never store the interpolated 0s, and you never
compute the samples that will be discarded.  Say the FIR filter with
a 7 MHz rate and a 20KHz cutoff has 20000 coefficients (just a guess).
Since 146/147 of the data are 0, you must do 20000/147 = 136 multiplies
to get each output point.  Hence you keep the last 136 samples at
48 KHz, and multiply by 136 coefficients (spaced 147 apart) from the
FIR filter.  Since all possible phases between the two signals are
present (otherwise you did not have the least common multiple) you must
keep all 20000 coefficients.  Each output point uses a different 136
element subset up to the 147th output point, which uses the same
coefficients as the first output point, and the cycle repeats.
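
(Concretely, output point n uses the coefficients h[p], h[p+147], h[p+294],
..., where p = (160*n) mod 147.  Since 160 mod 147 = 13, and 13 is relatively
prime to 147, the phase p walks through all 147 possible values before it
repeats.)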

So the only question is computing the coefficients to use in the FIR
filter.  This is conceptually straightforward (See, for example, the
fortran program in Rabiner and Gold, "Theory and Application of
Digital Signal Processing") but such a long filter may stress the
limits of computer arithmetic.  Since you only need to compute it
once, however, it should be simple to adapt the algorithm to higher
precision arithmetic.

I have the coefficients lying around somewhere for 48.0 -> 44.0 KHz
conversion (an easier problem) but not for 48.0 -> 44.1.  If someone
computes them, or has them, maybe they could post a message saying
where they can be found.

Hope this helps,
 Lou Scheffer

paulr@syma.sussex.ac.uk (Paul Russell) (05/18/91)

From article <3962@motcsd.csd.mot.com>, by lance@motcsd.csd.mot.com (lance.norskog):
> In princeton.edu:pub/music/resample.tar.Z is a source code shar for
> a general resampling filter.

Just thought I'd better mention that it's

    princeton.edu:pub/music/resample.src.Z
                                     ^^^

//Paul

-- 
           Paul Russell, Department of Experimental Psychology
         University of Sussex, Falmer, Brighton BN1 9QG, England
     Janet: paulr@uk.ac.sussex.syma  Nsfnet: paulr@syma.sussex.ac.uk
    Bitnet: paulr%sussex.syma@ukacrl.bitnet  Usenet: ...ukc!syma!paulr

lou@caber.valid.com (Louis K. Scheffer) (05/19/91)

bweeks@sirius.UVic.CA (Brent  Weeks) writes:

>In short, efficient implementation of Sample Rate Conversion involves
>filtering short windows (of length Q) of the input data using a
>polyphase filter (FIR with Q taps) operating at the output sample
>rate.  The polyphase filter has periodically time-varying
>coefficients.

>The sample rate converter should be quite do-able on a DSP Chip.  Use
>an up-conversion factor of 160 and a down-conversion factor of 147.
>(44.1 * 160/147 = 48)  The polyphase filter cycles through 160 states,
>then repeats.  The filter coefficients would be stored in 160*Q words.
>For each output sample, besides managing windowing of the input data,
>just Q multiply and accumulates need to be performed.  Typical values
>of Q range from 5 to 20.

   For audio work, you will need more coefficients.  You probably want
   delta(passband) = 0.01 (about .1 db ripple)
   delta(stopband) = 0.00001 (100 db of rejection)
   delta F = 2 KHz (flat to 20, down 100 db by 22 KHz)
   F = 7 MHz
   Looking at the chart, D(delta(s), delta(p)) is about 4, so
   you need about F/deltaF * D = 14000 coefficients.  Hence Q is about
   14000/160 = 88 for good audio work.

   This is more than the examples in the paper because audio work
      a) tries to get real close to the sampling limit, leading to steep
	 transition bands.
      b) tries to get very flat response in the passband
      c) tries to get very good signal to noise. (100 db)
   All of these increase Q.

parks@bennett.berkeley.edu (Tom Parks) (05/19/91)

In article <457@valid.valid.com>, lou@caber.valid.com (Louis K. Scheffer) writes:

|> The most straightforward way is as follows:  You interpolate 146 zeros
|> between every sample to get a data rate of 7.056 MHz.  You then
|> design a digital filter to low-pass this to 20 KHz.

This filter would have a cutoff frequency of approximately 0.003 times the
sampling rate, which is not a very easy filter to design.  *MANY* filter
coefficients would be required for such a filter.  Instead, look at the prime
factorization of the sampling rates:

44100 = 2^2 * 3^2 * 5^2 * 7^2
48000 = 2^7 * 3   * 5^3

To convert from 48000 to 44100, you can cascade several sample rate converting
filters with much smaller ratios.  For example, increase the sampling rate by a
factor of 3, then decrease by a factor of 2, increase by a factor of 7,
decrease by a factor of 2, decrease by a factor of 5, increase by a factor of
7, then decrease by a factor of 2 three times.
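
(Checking the arithmetic, the rate goes 48000 -> 144000 -> 72000 -> 504000 ->
252000 -> 50400 -> 352800 -> 176400 -> 88200 -> 44100 Hz, landing exactly on
44.1 kHz.)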

Some of these filters can have *VERY* broad transition bands since previous
filters will have removed components which could cause aliasing.  This means
that you will only have to design a set of very simple filters, instead of
having to design one filter with a huge number of coefficients.  You'll find
that the computational requirement to implement the cascaded filters is lower.
You can also play around with the order in which you raise and lower the
sampling rate.

Tom Parks	parks@janus.Berkeley.EDU
University of California, Berkeley

tom@syssoft.com (Rodentia) (05/22/91)

In article <5147@syma.sussex.ac.uk> paulr@syma.sussex.ac.uk (Paul Russell) writes:
 [stuff deleted]
>Just thought I'd better mention that it's
>
>    princeton.edu:pub/music/resample.src.Z
>-- 
>           Paul Russell, Department of Experimental Psychology
>         University of Sussex, Falmer, Brighton BN1 9QG, England
>     Janet: paulr@uk.ac.sussex.syma  Nsfnet: paulr@syma.sussex.ac.uk
>    Bitnet: paulr%sussex.syma@ukacrl.bitnet  Usenet: ...ukc!syma!paulr

I was all set to have this mailed over when I found out that the ftp
server at bitftp1@pucc.bitnet now talks to BITNET, EARN, and some other
net customers only.

Can anyone direct me to an ftp server that will talk to me?  Or would
some kind soul offer to mail this file to me?  (please start with the
offer, not the file; I'd hate to stuff my mailbox with the same thing).
Is it small enough to post?  (again, there should be some negotiation
before 5 copies of the same thing get posted).

Thank you for your assistance.

-- 
Thomas Roden                                      | tom@syssoft.com
Systems and Software, Inc.                        | Voice: (714) 833-1700 x454 
"If the Beagle had sailed here, Darwin would have | FAX:   (714) 833-1900
come up with a different theory altogether." - me |

lou@caber.valid.com (Louis K. Scheffer) (05/22/91)

mfvargo@netcom.COM (Michael Vargo) writes:

>2. An interpolation can be done using a function similar to the following.
>I wrote this to downsample Mac sounds before converting them to mu-law PCM
>at 8kHz.

[...code omitted ...]

>Now my question:

>Does the act of interpolation actually do some sort of low pass filtering?
>I think not but maybe it could be looked at as some sort of averaging?

Yes - interpolation is a low-pass filter.  The easiest way to see this is
to note that interpolating is exactly equal to convolving with a time function
that looks like:

	 *                  <- 1
	* *
       *   *
      *     *
******       *************  <- 0

 |   |   |   |   |   |   |  <input sample rate

Since a convolution is the same as a filter, this is a filter, in this case
a low pass.  If you want to know the exact response, you look at the
transform of the triangle function.  This is most easily derived by noting
that a triangle is the convolution of a rectangle with itself.  Since a
one-sample-wide rectangle has response sin(f)/f, where the first 0 occurs at
the input sampling rate, the triangle above has response sin^2(f)/f^2, with
the first 0 also at the input sampling rate.  You can also look at this as
averaging with the above function as a weighting factor, and get the same
answer.
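
(To attach rough numbers, assuming a 48 KHz input rate: that sin^2(x)/x^2
response is already about 5 dB down at 20 KHz, and the image of a 20 KHz
component, which lands at 28 KHz, is only about 11 dB down; that is why plain
linear interpolation is considered poor for high-quality audio.)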

  -Lou Scheffer

lance@motcsd.csd.mot.com (lance.norskog) (05/24/91)

lou@caber.valid.com (Louis K. Scheffer) writes:
>mfvargo@netcom.COM (Michael Vargo) writes:
>>Does the act of interpolation actually do some sort of low pass filtering?
>>I think not but maybe it could be looked at as some sort of averaging?
>Yes - interpolation is a low-pass filter.  The easiest way to see this is
>to note that interpolating is exactly equal to convolving with a time function
>that looks like:

> ... explanation of filtration as a triangular-shaped filter 

Ok, now we're getting somewhere.  The SPUT sound operating system
for DOS (shareware by Adrienne Cousins) has high-pass and low-pass
filtration options for playing back sound samples.  SPUT is a DOS
TSR (background program) comprising 5K of 8086 code, so we know
it doesn't have 13,000 coefficients.

If interpolation is being used for the low-pass operation, is it
also being used for the high-pass problem?  You need high-pass
for systems to chop out the bass, because those little desktop
speakers can't handle it and it just distorts the midrange.

Lance

lou@caber.valid.com (Louis K. Scheffer) (05/26/91)

parks@bennett.berkeley.edu (Tom Parks) writes:

>In article <457@valid.valid.com>, lou@caber.valid.com (Louis K. Scheffer) writes:

>|> The most straightforward way is as follows:  You interpolate 146 zeros
>|> between every sample to get a data rate of 7.056 MHz.  You then
>|> design a digital filter to low-pass this to 20 KHz.

>This filter would have a cutoff frequency of approximately 0.003 times the
>sampling rate, which is not a very easy filter to design.  *MANY* filter
>coefficients would be required for such a filter.  

True, but you only design it once (and it's done already, I've got the
coefficients for those who want them).  It takes about 10 CPU hours on a fast
machine to design the filter (using Remez exchange, anyway).  It takes 14000
coefficients to get +- 0.05 db passband, 104 db rejection stopband.

>Instead, look at the prime factorization of the sampling rates:

>44100 = 2^2 * 3^2 * 5^2 * 7^2
>48000 = 2^7 * 3   * 5^3

>To convert from 48000 to 44100, you can cascade several sample rate converting
>filters with much smaller ratios.  For example, increase the sampling rate by a
>factor of 3, then decrease by a factor of 2, increase by a factor of 7,
>decrease by a factor of 2, decrease by a factor of 5, increase by a factor of
>7, then decrease by a factor of 2 three times.

>Some of these filters can have *VERY* broad transition bands since previous
>filters will have removed components which could cause aliasing.  This means
>that you will only have to design a set of very simple filters, instead of
>having to design one filter with a huge number of coefficients.

Agreed, but you only design it once, and the program source is published in
"Theory and Application of Digital Signal Processing" by Rabiner and Gold.
(Actually, you need to tune it up a little for such large filters, but that's
only a few days work.)

>You'll find
>that the computational requirement to implement the cascaded filters is lower.
>You can also play around with the order in which you raise and lower the
>sampling rate.

This approach will work, but it's not clear the computational requirement is
any less.  Direct convolution by the 14000 tap filter takes about 95 multiply
adds per output point (since 146/147 of the input samples are 0).  In your
example, the very first filter does:

	increase by 3
	decrease by 2 at 144 KHz  (but 2/3 inputs are 0)
	throw away 1 of 2, rate now 72 KHz.

This filter looks like a 3x oversampling filter, and will require
about 130 coefficients, for a 130/3 = 43 multiplies per output sample (and
output is at a 72 KHz rate, so this implies 43 * 72/44.1  = 70 multiplies per
(44.1 KHz) output sample.

You then need to:

	increase by 7
	decrease by 5 at 504 KHz (6/7 of the samples are zero)
	throw away 4 of every 5 samples, rate now 100.8 KHz
	decrease by 2 at 100.8 KHz (all non-zero)
	throw away 1/2 to get 50.4 KHz
	increase by 7 to get 352.8 KHz.
	reduce by 2 at 352.8 KHz (6/7 of the samples are zero)
	throw away 1/2 to get 176.4 KHz.
	reduce by two at 176.4 (all non-zero)
	throw away 1/2
	reduce by two at 88.2 (all non-zero)
	throw away 1/2 to get 44.1

These remaining filters are simpler, as you pointed out,
but I am sure you cannot do them all for less than 30 multiplies/output sample.

Thus if you implement the filters by multiply/adds, this takes more operations,
plus the book-keeping is *much* more complex.  The direct approach is a one
or two instruction loop on most DSP chips, for 100 or 200 cycles per output
point.  I doubt you can do the bookkeeping alone in less than 100 cycles
per output point for the method above.

However, maybe you can do some of the reduction filters with so few
coefficients that you can do adds and shifts instead of multiply/adds.  In
this case the cascaded scheme might be an overall improvement if multiplies are
expensive.  These filters will not be very flat, but this can be compensated
for by the first filter, which must be a multiply/add variety anyway,
and can be tuned up to have an arbitrary frequency response at almost
no additional cost.  Another advantage is that you do not need to store
14000 coefficients.  200 will probably do.

In short, the Achilles' heel of this scheme is the very first filter, which
MUST have a sharp cutoff to prevent aliasing.  This filter alone takes 0.7
of the work of the direct method, and the rest of the work/book-keeping
will make the method worse in practice, except in special cases.

-Lou Scheffer

parks@bennett.berkeley.edu (Tom Parks) (05/27/91)

In article <498@valid.valid.com>, lou@caber.valid.com (Louis K. Scheffer) writes:
|> True, but you only design it once (and it's done already, I've got the
|> coefficients for those who want them).  It takes about 10 CPU hours on a fast
|> machine to design the filter (using Remez exchange, anyway).  It takes 14000
|> coefficients to get +- 0.05 db passband, 104 db rejection stopband.

Ok, I went ahead and did a design for converting from 48 kHz to 44.1 kHz.
I cascaded 3 FIR filters to do this.  Their sample rate conversion ratios
were 3:2, 7:5, and 7:16, and they required 147, 75, and 53 taps respectively.
Each filter had a pass band ripple of less than 0.05 dB and stop band
rejection of over 100 dB.  It took me less than an hour of my time to design
these filters, and it took less than 1 minute of CPU time on a SPARCstation.

|> Agreed, but you only design it once, and the program source is published in
|> "Theory and Application of Digital Signal Processing" by Rabiner and Gold.
|> (Actually, you need to tune it up a little for such large filters, but that's
|> only a few days work.)

You only have to design this set of filters once, too, and I didn't have to spend any time tuning up any software.

|> This approach will work, but it's not clear the computational requirement is
|> any less.  Direct convolution by the 14000 tap filter takes about 95 multiply
|> adds per output point (since 146/147 of the input samples are 0).  In your
|> example, the very first filter does:
|> 
|> 	increase by 3
|> 	decrease by 2 at 144 KHz  (but 2/3 inputs are 0)
|> 	throw away 1 of 2, rate now 72 KHz.
|> 
|> This filter looks like a 3x oversampling filter, and will require
|> about 130 coefficients, for a 130/3 = 43 multiplies per output sample (and
|> output is at a 72 KHz rate, so this implies 43 * 72/44.1  = 70 multiplies per
|> (44.1 KHz) output sample.

The first filter requires 49 macs per 72 kHz sample, or 3528000 macs/second.
The second filter requires 10.7 macs per 100.8 kHz sample, or 1080000 macs/sec.
The third filter requires 7.6 macs per 44.1 kHz sample, or 333900 macs/second.
The total is 4941900 macs to produce 44100 samples, or 112 macs/sample.
So this is comparable to the 14000 tap filter implementation as far as macs/second are concerned.

|> Thus if you implement the filters by multiply/adds, this takes more operations,
|> plus the book-keeping is *much* more complex.  The direct approach is a one
|> or two instruction loop on most DSP chips, for 100 or 200 cycles per output
|> point.  I doubt you can do the bookkeeping alone in less than 100 cycles
|> per output point for the method above.
|>

Since the three filters are cascaded, there isn't really any additional
overhead.  Each filter is a one or two instruction loop, and there's one of
these loops for each filter.

So my implementation requires 112 multiply-accumulates per output sample
(compared to 95), storage for 275 filter taps (compared to 14000), and takes
less than a minute of CPU time to design (compared to 10 hours).

Tom Parks

lou@caber.valid.com (Louis K. Scheffer) (05/27/91)

lance@motcsd.csd.mot.com (lance.norskog) writes:

>Ok, now we're getting somewhere.  The SPUT sound operating system
>for DOS (shareware by Adrienne Cousins) has high-pass and low-pass
>filtration options for playing back sound samples.  SPUT is a DOS
>TSR (background program) comprising 5K of 8086 code, so we know
>it doesn't have 13,000 coefficients.

>If interpolation is being used for the low-pass operation, is it
>also being used for the high-pass problem?

I don't know this software, so I'm guessing.  It is probably not
interpolation, but a simple digital filter.  These work by taking
the last N samples, multiplying them by a bunch of constants
a[1],...,a[n] and adding the results.  You can (optionally) take
the computed output and use it in the computation on the next
sample as well.  You only need a few coefficients (maybe 10) to
get low-pass, high-pass, and bandpass responses, with responses
like analog audio equipment. The zillion coefficient case is only
when you are trying to do something tricky.
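
For instance, a one-pole recursive low-pass needs just one coefficient and a
couple of operations per sample.  A minimal sketch (made-up coefficient,
nothing to do with SPUT's actual code):

/* One-pole recursive low-pass: y[n] = a*x[n] + (1-a)*y[n-1].
   The coefficient a (0 < a <= 1) sets the cutoff; x - lowpass(x)
   gives a complementary high-pass to chop out the bass.  */
double lowpass(double x, double *state, double a)
{
    *state = a * x + (1.0 - a) * *state;
    return *state;
}

Call it once per sample, e.g. out = lowpass(in, &s, 0.1) with s initialized
to 0; smaller a means a lower cutoff.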

-Lou Scheffer

lou@caber.valid.com (Louis K. Scheffer) (05/28/91)

parks@bennett.berkeley.edu (Tom Parks) writes:

>In article <498@valid.valid.com>, lou@caber.valid.com (Louis K. Scheffer) writes:

[...Lots of discussion deleted, with Lou advocating direct convolution and
Tom advocating cascaded filters...]
>|> 14000 coefficients gets +- 0.05 db passband, 104 db rejection stopband.

>Ok, I went ahead and did a design for converting from 48 kHz to 44.1 kHz.
>I cascaded 3 FIR filters to do this.  Their sample rate conversion ratios were
>3:2, 7:5, and 7:16 and they required 147, 75, and 53 taps respectively.
>Each filter had a pass band ripple of less than 0.05 dB and stop band
>rejection of over 100 dB.

This is not quite as good a filter, since at some point all the ripples will
add up to .15 db of ripple.  If you have N cascaded filters then the sum of
their errors should be the same as the single filter.  You probably do best
by making the small filters really flat and putting most of the error in the
big one.

>|> plus the book-keeping is *much* more complex.  The direct approach is a one
>|> or two instruction loop on most DSP chips, for 100 or 200 cycles per output
>|> point.  I doubt you can do the bookkeeping alone in less than 100 cycles
>|> per output point for the method above.

>Since the three filters are cascaded, there isn't really any additional
>overhead.  Each filter is a one or two instruction loop,
>and there's one of these loops for each filter.

You're right - the book-keeping overhead per
output point can be made negligible with either method.  However, the
cascaded method will require considerably more code.  It takes about the same
amount of code per filter, plus the code to figure out when to run each
filter.  This does not look like a code size problem, though (perhaps 60?
instructions vs 10 for direct).

However, the direct method has the advantage of taking a uniform time per
output point.  Although the average of the cascaded filter is not much worse,
it will be quite a bit more variable.  In a real time environment such as
digital audio, this means either planning for the peak computational needs
of the cascaded filter, or using considerable RAM for buffering.

[... 275 coefficients required for cascaded filter ... ]

If you need to go both from 48->44.1 and 44.1->48, you can use the same
coefficients in the direct case, but you will need another completely
different filter in the cascaded case.  However, if you needed one of the
intermediate rates (perhaps using the 100.8 KHz for oversampled analog output,
for example) you can get this for free with the cascaded filter.

Thus the direct approach has the advantages:
   Fewer multiply accumulates for equivalent specs, 95 vs 112
   Simpler code (10 instructions vs 60? )
   Lower peak compute requirement per point.

The cascaded approach offers:
   Fewer coefficients to store. 275 (550 for both ways) vs 14000
   Some intermediate rates available for free
   Easier to design (1 min cpu vs 10 hours)

-Lou Scheffer

mhorne@jomega.rain.com (Michael T. Horne) (05/30/91)

In a recent article by lou@caber.valid.com (Louis K. Scheffer):
> 
> This is not quite as good a filter, since at some point all the ripples will
> add up to .15 db of ripple...
> 
> ...However, the cascaded method will require considerably more code....
> 
> ...However, the direct method has the advantage of taking a uniform time per
> output point....
> 
> Thus the direct approach has the advantages:
>    Fewer multiply accumulates for equivalent specs, 95 vs 112
>    Simpler code (10 instructions vs 60? )
>    Lower peak compute requirement per point.
> 
> The cascaded approach offers:
>    Fewer coefficients to store. 275 (550 for both ways) vs 14000
>    Some intermediate rates available for free
>    Easier to design (1 min cpu vs 10 hours)
> 

I trust that everyone is aware that any design should be done within the
context of an actual problem.  One can hardly claim implementation advantages
or disadvantages without context.  Either method can be the better choice in
a given implementation with specific constraints (e.g. computation time,
passband quality, available code space, etc.).

Mike

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Michael T. Horne                  DSP Hardware and Software Consulting Services
jOmega Systems                                   E-mail: mhorne@jomega.rain.com
Beaverton, OR                                             Phone: (503) 649-8957

tom@syssoft.com (Rodentia) (05/30/91)

I have been following this discussion for some time, and I have a few 
comments and questions.  Warning:  I am not a DSP engineer by trade, but 
I suspect there are many readers with my level of expertise.

1)  How bad is interpolation?  Clearly, the signal probably did not look
    like the linear interpolation of the samples (I know, look is not
    important; hear or dB is), but it seems to be a "reasonable" 
    approximation.  Roughly how much dB or % distortion is added?
    I have read here that a linear interpolation is a special form of
    low pass filter.  Is its problem that it doesn't have high or band
    pass characteristics as well?

2)  How good is digital filtering?  Does the filtered signal actually 
    include the sample points, or can the filter turn a V into a U 
    (imagine a U that doesn't go down as far) at a high frequency 
    area?  Is this considered distortion or just a loss of amplitude
    (that can be compensated by rescaling)?

3)  It appears that a filter's instantaneous response is based on its
    current internal state (charge on the capacitors, etc.) and the
    input signal.  It seems that it should be possible to construct
    equations that continuously reflect the output of a filter between
    samples.  It also seems that these equations would be exponential 
    or perhaps 2nd-order or greater.

    I know that this would involve more complex math and perhaps numeric
    errors (underflow, etc.), but it seems to have the advantage of
    generating only those samples needed (as opposed to a high
    intermediate frequency).  Has anyone compared this more mathematical
    approach to the current number-crunching mac (multiply-accumulate)
    approach?  Is the theoretical distortion any different?  Are computers
    just so much better at the simpler math (with the increased possibility
    of pipelining) that the more mathematical approach has been discarded
    for the time being?

Sorry if that dragged on a bit.  Please respond by followup as I truly
believe this to be of interest, for historical reasons if nothing else
(you see, I have reinvented some of the greatest solutions to problems
of all times, and this is probably just another of them ;-{) ).
-- 
Thomas Roden                                      | tom@syssoft.com
Systems and Software, Inc.                        | Voice: (714) 833-1700 x454 
"If the Beagle had sailed here, Darwin would have | FAX:   (714) 833-1900
come up with a different theory altogether." - me |

rob@columbia.eng.ohio-state.edu (Rob Carriere) (05/31/91)

In article <1991May29.204046.12494@syssoft.com> tom@syssoft.com (Rodentia)
writes: 
>1)  How bad is interpolation?  Clearly, the signal probably did not look
>    like the linear interpolation of the samples (I know, look is not
>    important; hear or dB is), but it seems to be a "reasonable" 
>    approximation.  Roughly how much dB or % distortion is added?
>    I have read here that a linear interpolation is a special form of
>    low pass filter.  Is its problem that it doesn't have high or band
>    pass characteristics as well?

An interpolator is indeed a simple low-pass filter.  The problem is mainly with
the `simple'.  I couldn't give you distortion numbers off the top of my head;
maybe someone else can help here?

>2)  How good is digital filtering?  Does the filtered signal actually 
>    include the sample points, or can the filter turn a V into a U 
>    (imagine a U that doesn't go down as far) at a high frequency 
>    area?  Is this considered distortion or just a loss of amplitude
>    (that can be compensated by rescaling)?

There exists a rather surprising result that says that for bandlimited signals
sampled at a sufficiently high frequency, perfect reconstruction is possible.
`The sufficiently high frequency' should be at least twice the bandwidth of
the signal.  This sample rate is known as the Nyquist rate.  For example,
suppose that the original signal has a bandwidth of 20KHz, then both 44.1KHz
and 48KHz meet the Nyquist criterion, so it is possible to perfectly
reconstruct the signal from the 48K sampled version, and resample at 44.1K
_without_any_loss_of_information_or_fidelity_.
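
(Written out, the reconstruction is the classical sinc interpolation: with
sample period T = 1/fs,

    x(t) = sum over n of x[nT] * sin(pi*(t-nT)/T) / (pi*(t-nT)/T)

and the sum is exact for any signal bandlimited below fs/2.)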

The digital filtering techniques attempt to perform this trick without having
to explicitly reconstruct the analog signal, i.e. go directly from one
digital representation to the other.  Again, this is possible without losing
anything.  Whether you actually lose anything and if so, how much, depends on
how sophisticated a design you are willing to tolerate.

>3)  It appears that a filter's instantaneous response is based on its
>    current internal state (charge on the capacitors, etc.) and the
>    input signal.  It seems that it should be possible to construct
>    equations that continuously reflect the output of a filter between
>    samples.  It also seems that these equations would be exponential 
>    or perhaps 2nd-order or greater.

If the filter is linear, it can be described by a linear differential
equation.  Solutions to such equations look like (a sum of) exponential
functions plus other stuff that depends on the input. (Slightly more
mathematically, it is the input signal convolved with exponential functions.)

>    I know that this would involve more complex math and perhaps numeric
>    errors (underflow, etc.), but it seems to have the advantage of
>    generating only those samples needed (as opposed to a high
>    intermediate frequency).  Has anyone compared this more mathematical
>    approach to the current number-crunching mac (multiply-accumulate)
>    approach?  Is the theoretical distortion any different?  Are computers
>    just so much better at the simpler math (with the increased possibility
>    of pipelining) that the more mathematical approach has been discarded
>    for the time being?

As long as the Nyquist criterion is satisfied[1], this approach doesn't give
you anything extra, as the digital filter can be made to reproduce the
behavior of the analog filter.  (It doesn't help with the resampling either,
as you have to simulate the analog filter in discrete time -- the computer
doesn't do any other kind -- so you get the exact same resampling choices.)

On the other hand, simulating an analog filter would be a significant
computational load compared to the digital approach.

However, your general idea isn't so strange: there exist several design
methods for digital filters that work by first designing an analog filter that
will do the job and then `translating' the analog design into a digital one.

For the problem at hand though, it might be more enlightening to think of the
filters not as filters, but as sophisticated interpolators.  This is the
reverse viewpoint of your first paragraph, where you looked at a linear
interpolator as a simple filter; here we think of a filter as a fancy
interpolator.

This shows you that the filter is functioning at a much higher sample rate
than the signal (or it couldn't interpolate between the data points).  In
fact, the filter operates at the LCM of 48K and 44.1K (~7.056M).  Since you
only want the `bottom' 44.1K of the signal (out of the 7.056M that pours out
of the filter), the filter is going to be some kind of a low-pass filter.  It
could be very simple, like a linear interpolator, or very complex, like some
of the designs that have been discussed here in the past couple of days.  The
question is how close you want to come to the ideal resampling without
information loss.

SR
[1] And if the Nyquist criterion isn't satisfied, you're in trouble anyway. :-)
---

toma@hpsad.HP.COM (Tom Anderson) (05/31/91)

>   How bad is interpolation?  Clearly, the signal probably did not look
>   like the linear interpolation of the samples (I know, look is not
>   important; hear or dB is), but it seems to be a "reasonable" 
>   approximation.

  I have been using interpolation in a sampled system with irregularly
spaced time points.  This is a different and more data dependent problem
than the sample rate conversion being discussed here, but my conclusions
are:

1. Linear interpolation can be real bad.
2. Interpolation by Gauss' method is good for time domain data.
3. Pade interpolation should be good for frequency domain data, but I 
   haven't verified this yet.
4. Interpolation by Gauss' method and Pade interpolation are really slow.

Tom Anderson    toma@hpsad.hp.com     "It's only hardware"

piet@cs.ruu.nl (Piet van Oostrum) (05/31/91)

A long time ago I programmed a simple low-pass digital filter. I did the
design by applying my knowledge of Linear Algebra and Fourier transforms.
In fact I wrote the filter equations as an expression in the
differentiation operator and approximated that with a Taylor series. That
translated in a three-or-so stage filter (Too long ago to remember the
details). However, I did'nt have any formal training in DSP. Therefore the
following question:

Can you recommend an introductory textbook on DSP?
-- 
Piet* van Oostrum, Dept of Computer Science, Utrecht University,
Padualaan 14, P.O. Box 80.089, 3508 TB Utrecht, The Netherlands.
Telephone: +31 30 531806   Uucp:   uunet!mcsun!ruuinf!piet
Telefax:   +31 30 513791   Internet:  piet@cs.ruu.nl   (*`Pete')