dsp@oakhill.UUCP (DSP Account) (08/03/86)
CASCADABLE ADAPTIVE FINITE IMPULSE RESPONSE DIGITAL FILTER

This document provides a technical summary of the DSP56200, a Cascadable Adaptive Finite Impulse Response (CAFIR) digital filter chip. For users not familiar with digital filtering, an appendix on this topic is also included.

The DSP56200 is an algorithm-specific digital signal processing (DSP) peripheral designed to perform the computationally intensive tasks associated with digital filtering. The DSP56200 performs two principal functions: FIR filtering and adaptive FIR filtering using the Least-Mean-Square (LMS) algorithm. A flexible chip-cascading scheme enables the user to easily build filters with extended tap lengths and/or increased speed. Its performance, features, and simple interface make the DSP56200 a natural solution for problems such as echo cancelling, telephone line equalization, noise cancelling, conventional filtering, and many other DSP applications.

Key features of the DSP56200 include the following:

+---------------------------------------------------------------------------+
| Notes for below:                                                          |
| [1] Options unique to adaptive filtering mode                             |
| [2] The DSP56200 implements a true cascade. A true cascade means that     |
|     the error term in the adaptive filter mode is calculated from the     |
|     partial sums of ALL chips in the cascade.                             |
+---------------------------------------------------------------------------+

* 3 Modes of Operation
    - Single FIR Filter
    - Dual FIR Filter (2 Independent FIR Filters)
    - Single Adaptive FIR Filter

* High Performance Hardware
    - 24x16 bit Multiplication with 40 bit Accumulation
    - 10.25 MHz Internal Operation
    - Single-Cycle Multiply-Accumulate
    - Single-Cycle Update of a Coefficient
    - Ultra-Low-Power Standby Mode
    - 256 x 24 bit Coefficient RAM
    - 256 x 16 bit Data RAM
    - Unused RAM Available for System Storage
    - 28 pin DIP package

* Architecture Optimized for Digital Filtering
    - 3 Execution Units Operate in Parallel
    - Multiple Internal Buses

* Digital Filtering Features
    - 16-bit Rounding Option on Filtered Output
    - dc Tap Option
    - Programmable Filter Tap Length (4 to 256 taps)
    - Programmable Loop Gain [1]
    - Programmable Coefficient Leakage Term [1]
    - Adaptation Disable Capability [1]
    - LMS Adaptation Algorithm Used in Adaptive Filter Mode

* Unlimited Cascadability Offers [2]
    - Longer-Tap Filters
    - Higher Sampling Rates

* High Throughput Rates - Examples of Possible Configurations
    - 227 KHz FIR Filter (32 taps, 1 DSP56200)
    - 37 KHz FIR Filter (256 taps, 1 DSP56200)
    - 115 KHz Adaptive Filter (256 taps, 8 cascaded DSP56200s)
    - 19 KHz Adaptive Filter (256 taps, 1 DSP56200)
    - Many Other Configurations Possible

* Simple Interface to Popular Hosts
    - Microprocessors
    - Microcomputers
    - General Purpose Digital Signal Processors

[Figure 1. Functional Signal Groups. The DSP56200 block is shown with power pins Vdd and Vss; host interface pins D0-D7 (bidirectional), A0-A3, ~CS, ~RD, and ~WR (inputs); clock inputs CLOCK and START; and cascade interface pins SDI, SSI, and SEI (inputs) and SDO and SSO (outputs).]

ARCHITECTURE DESCRIPTION

The DSP56200 offers two advantages over general purpose DSPs: performance and minimal development time. The high performance of the DSP56200 results from an algorithm-specific architecture featuring three execution units (the Asynchronous Parallel Interface, the Cascade Interface, and the Computation Unit) which execute in parallel (Figure 2). Development time is minimized because the DSP56200 is an off-the-shelf chip which requires no software development. The algorithms are implemented in hardware.

Careful design of the interface units has virtually eliminated the need for "glue" logic when interfacing to a host processor or cascading DSP56200 chips. The parallel interface resembles that of a fast static RAM, and allows interfacing to fast, general purpose DSPs and MPUs having tight timing requirements. The cascade interface performs all the functions associated with cascading, thereby simplifying the design of multi-chip systems.

The Computation Unit is responsible for performing the arithmetic necessary in FIR and adaptive FIR filtering (Figure 3). It contains the hardware necessary for implementing a 256-tap FIR filter with adaptation capability, including a 256 x 24-bit Coefficient RAM, a 256 x 16-bit Data RAM, and an Arithmetic Unit. The Data RAM is configured as a variable length circular queue, allowing it to function as a virtual shift register.
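A circular queue can stand in for a shift register because only an index moves when a sample arrives, not the stored data. The sketch below illustrates the idea in Python. The class name, method names, and the particular logical-to-physical index mapping are illustrative assumptions and are not a description of the chip's internal addressing.

```python
class CircularDelayLine:
    """Variable-length circular queue acting as a virtual shift register."""

    def __init__(self, num_taps):
        self.buf = [0] * num_taps   # physical storage (stand-in for the Data RAM)
        self.newest = 0             # physical index of the newest sample

    def push(self, sample):
        """Store a new sample and return the oldest one, which it overwrites."""
        self.newest = (self.newest - 1) % len(self.buf)
        oldest = self.buf[self.newest]
        self.buf[self.newest] = sample
        return oldest

    def tap(self, i):
        """Return x(n-i): logical tap i, where tap 0 is the newest sample."""
        return self.buf[(self.newest + i) % len(self.buf)]
```

Each push costs one write regardless of the filter length, whereas a literal shift register would move every stored sample.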
The Arithmetic Unit configures itself for both the LMS adaptation algorithm and the FIR filter calculation, implements the available digital filtering options, performs a 24x16 bit multiply-accumulate in 100 ns, and performs a coefficient update in 100 ns. It is further described in the Internal Arithmetic Description.

[Figure 2. Parallelism within the DSP56200. The chip contains three units: the Asynchronous Parallel Interface Unit, connected to the external Data Bus; the Computation Unit; and the Cascade Interface Unit, connected to the external Cascade Bus.]

[Figure 3. Computation Unit Block Diagram. The Parallel Interface connects to the 8-bit external data bus and to two Bus Controllers, one serving the Coefficient RAM (256x24) and one serving the Data RAM (256x16, per the text above). Both Bus Controllers feed the Arithmetic Unit. The diagram also shows an error input, a new data sample input, and a last data sample output.]

MODES OF OPERATION

The DSP56200 is designed to operate in one of three modes: Single FIR Filter, Dual FIR Filter, or Single Adaptive FIR Filter. Functional diagrams of these three configurations are shown in Figure 4. Users unfamiliar with digital filtering should refer to Appendix 1.

FIR filtering is a simple way to perform digital filtering.
The Single FIR Filter mode is used to implement one FIR filter, either on a single chip or on several DSP56200s in a cascade (with up to 256 taps per chip). Using software design tools, filter coefficients can be calculated to obtain the desired frequency response for performing functions such as differentiating, bandpass filtering, and lowpass filtering. In this mode, the DSP56200 is configured to implement the FIR filtering equation (see Appendix 1).

The Dual FIR Filter mode is an extension of the Single FIR Filter mode. It allows two independent inputs, x1(n) and x2(n), to be FIR filtered using only one DSP56200. Note that both filters must use the same number of taps, up to a maximum of 128. The DSP56200 is not adaptive or cascadable in the Dual FIR Filter mode.

The Single Adaptive FIR Filter mode provides a unique solution to problems such as echo cancelling and adaptive equalization of telephone lines. This mode is used to implement one adaptive filter, either on a single chip or on several DSP56200s in cascade. Adaptive filters must perform the FIR filter multiply-accumulate operation, followed by an adaptation operation to modify the coefficients. The DSP56200 updates every filter coefficient once during each sample period using the Least-Mean-Square (LMS) adaptation algorithm.

[Figure 4. DSP56200 Configurations. (a) Single FIR Filter Mode: x(n) enters H(z), which produces y(n). (b) Dual FIR Filter Mode: x1(n) enters H1(z), which produces y1(n), and x2(n) enters H2(z), which produces y2(n). (c) Single Adaptive FIR Filter Mode: x(n) and d(n) enter the Adaptive Filter, which produces e(n); e(n) is also fed back to the Adaptive Filter.]

INTERNAL ARITHMETIC DESCRIPTION

The key to the accuracy of the DSP56200 is its Arithmetic Unit. Accuracy is affected by the number of bits in the coefficient and the errors due to rounding.
In FIR filter applications, the actual frequency response will deviate from the user's desired response if not enough bits are used to represent the coefficients. Wider coefficient word widths also enable adaptive filters to more closely approximate a desired impulse response, resulting in smaller error terms. Roundoff errors decrease as the size of the accumulator and coefficients increase. The DSP56200 uses a 24-bit coefficient, and will accept data words having up to 16 bits. The results of the multiply-accumulate operation are stored in a 40-bit accumulator. Both the 16-bit data samples and 24-bit coefficients are represented as signed fractional numbers.

In FIR filtering applications, the filtering process is described by the following equation:

    y(n) = \sum_{i=0}^{N-1} h(i) \, x(n-i)                          ( 1 )

The "ith" coefficient is represented by h(i), and x(n-i) represents one of the previous data samples. In this mode, the Arithmetic Unit is configured as a multiplier-accumulator as shown in Figure 5.

Adaptive filtering is a two-step process. First, the error term (the filter's output) is calculated; second, the error term is used to update the coefficients. The error calculation is described by:

    e(n) = d(n) - \sum_{i=0}^{N-1} h(i) \, x(n-i)                   ( 2 )

where d(n) represents the reference (desired) input of the adaptive filter. During this first step, an FIR filtering operation is performed, so the Arithmetic Unit is again configured as shown in Figure 5. The second step involves modifying each coefficient using the LMS adaptation equation:

    h_{new}(i) = h_{old}(i) + K \, e \, x(n-i) \pm \mathrm{Leakage}     ( 3 )

where K is a programmable loop gain, and e is the error term calculated in equation 2. The Arithmetic Unit is configured as shown in Figure 6 during the adaptation step, allowing for single-cycle updates.
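The two steps of equations 2 and 3 can be sketched in a few lines of floating-point Python. This is only an illustration of the arithmetic, not a model of the chip's fixed-point hardware. The function names, the use of floats, and the choice to take the sign of the leakage from the sign of the old coefficient (so that leakage moves each coefficient toward zero) are assumptions made for the example.

```python
import random

def fir_error(h, taps, d):
    """Equation 2: e(n) = d(n) - sum over i of h(i) * x(n-i)."""
    return d - sum(hi * xi for hi, xi in zip(h, taps))

def lms_update(h, taps, e, k, leakage=0.0):
    """Equation 3, with leakage applied toward zero (assumed sign rule)."""
    new_h = []
    for hi, xi in zip(h, taps):
        if hi > 0:
            leaked = hi - leakage
        elif hi < 0:
            leaked = hi + leakage
        else:
            leaked = hi
        new_h.append(leaked + k * e * xi)
    return new_h

def adapt(true_h, k=0.1, steps=2000, leakage=0.0, seed=0):
    """Adapt h toward an unknown FIR response, driven by broadband noise."""
    rng = random.Random(seed)
    h = [0.0] * len(true_h)
    taps = [0.0] * len(true_h)                          # taps[i] holds x(n-i)
    e = 0.0
    for _ in range(steps):
        taps = [rng.uniform(-1.0, 1.0)] + taps[:-1]     # shift in a new sample
        d = sum(t * x for t, x in zip(true_h, taps))    # reference input d(n)
        e = fir_error(h, taps, d)                       # step 1: error term
        h = lms_update(h, taps, e, k, leakage)          # step 2: update
    return h, e
```

Calling adapt([0.5, -0.25, 0.125]) drives the coefficients toward the given response and the error term toward zero, because the input is broadband and the reference is noise-free.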
When a narrowband signal such as a sine wave is applied to the filter's input, the coefficients drift from their true values because there is not enough frequency content present in the signal. The coefficients will erroneously grow in magnitude, resulting in a larger error term. The leakage term is included to compensate for this. This term is an 8-bit user-programmable constant located in bits 2**-16 through 2**-23. The leakage term slowly pushes the coefficients towards zero, effectively offsetting the slow growth in coefficient magnitude. The term is small enough that it does not affect the convergence of the filter, but adequate to prevent the slow coefficient drift occurring for narrowband input signals.

Rounding is another function performed within the Arithmetic Unit. When the DSP56200 is multiply-accumulating (equation 1 or 2), the final result can optionally be rounded to a sixteen-bit fractional result prior to being output. Also, when updating coefficients (equation 3), each new coefficient is convergently rounded to a twenty-four-bit fractional number, and then written back into the 24-bit Coefficient RAM.

The Arithmetic Unit also includes a dc tap option. This option is useful for an adaptive filter which has a dc component on one of the filter's inputs. An adaptive filter with a dc tap can synthesize a signal to cancel this component (such as a dc signal introduced through an A/D converter). The dc tap can also be used to add a dc offset to the output when configured as an FIR filter.

[Figure 5. Arithmetic Unit in Multiply-Accumulate Mode. h(i) and x(n-i) feed a multiplier; the product feeds an adder whose result is stored in the 40-bit Accumulator; the accumulator output is fed back to the adder.]

[Figure 6. Arithmetic Unit in Coefficient Update Mode. Ke and x(n-i) feed a multiplier; h(i) old and Leakage feed a +/- unit; the two results are summed and passed through Convergent Rounding, which produces h(i) new and an Overflow indication.]

PERFORMANCE

In measuring the performance of the DSP56200, it is important to consider the nature of real-time applications. In a real-time DSP system dealing with digitized waveforms, all processing associated with the current sample must be completed before a new sample arrives. The simplified flowchart in Figure 7 shows the flow of control in a real-time environment. The performance of the DSP56200 is therefore measured by the amount of time it requires to complete the processing associated with the current sample. This length of time determines the minimum time allowable between samples, and is referred to as the minimum sampling period. The reciprocal of the minimum sampling period is the system's maximum sampling frequency, fs.

[Figure 7. Flowchart for Real-Time Systems. Initialize System; then loop: Input a New Sample; Perform All Processing Associated with the Current Sample; test "Has a New Sample Arrived?" (No: keep waiting; Yes: return to Input a New Sample).]

There are four parameters which determine the maximum sampling rate of the DSP56200:

    - The number of DSP56200s cascaded together
    - The number of taps used on each DSP56200
    - The clock frequency of the DSP56200
    - The selected mode of operation

The formulas below are used to calculate the DSP56200's maximum sampling frequency for a given system.
In many cases, this maximum rate can be increased by cascading more DSP56200 chips together and using fewer taps on each chip.

    \mathrm{Maximum}\ f_s \le f_{ck} \,/\, \#\mathrm{cycles}

Where:

    f_{ck} = DSP56200 input clock frequency

    \#\mathrm{cycles} = \begin{cases}
        12 + N + q  & \text{Single FIR Filter Mode} \\
        18 + 2N + q & \text{Dual FIR Filter Mode} \\
        17 + 2N + r & \text{Single Adaptive Filter Mode}
    \end{cases}

    q = \begin{cases} 29 + n - N & (29 + n - N) > 0 \\ 0 & \text{otherwise} \end{cases}

    r = \begin{cases} 30 + n - N & (30 + n - N) > 0 \\ 0 & \text{otherwise} \end{cases}

    n = Number of chips cascaded together
    N = Number of taps used on each chip

The intermediate terms, q and r, can never be negative. Also note that in Dual FIR Filter mode, N is the number of taps available for each of the two independent filters, and therefore N <= 128. Table I shows some performance figures for the DSP56200 in different configurations assuming a 10.0 MHz external clock. The tap counts in Table I are per chip; for example, a single chip running a 32-tap FIR filter needs 12 + 32 + 0 = 44 cycles, giving 10.0 MHz / 44 = 227 KHz.

TABLE I - DSP56200 Performance Figures

                    SINGLE CHIP
    --------------------------------------------
    |     Maximum Sampling Frequency (KHz)     |
    --------------------------------------------
    |                  |    Number of Taps     |
    |       Mode       |  32    64   128   256 |
    --------------------------------------------
    | FIR Filter       | 227   132    71    37 |
    | Adaptive Filter  | 123    69    37    19 |
    | Dual FIR Filters | 122    68    36     * |
    --------------------------------------------

                    FOUR CHIPS
    --------------------------------------------
    |     Maximum Sampling Frequency (KHz)     |
    --------------------------------------------
    |                  |    Number of Taps     |
    |       Mode       |  32    64   128   256 |
    --------------------------------------------
    | FIR Filter       | 222   132    71    37 |
    | Adaptive Filter  | 120    69    37    19 |
    --------------------------------------------

                    EIGHT CHIPS
    --------------------------------------------
    |     Maximum Sampling Frequency (KHz)     |
    --------------------------------------------
    |                  |    Number of Taps     |
    |       Mode       |  32    64   128   256 |
    --------------------------------------------
    | FIR Filter       | 204   132    71    37 |
    | Adaptive Filter  | 115    69    37    19 |
    --------------------------------------------

                    SIXTEEN CHIPS
    --------------------------------------------
    |     Maximum Sampling Frequency (KHz)     |
    --------------------------------------------
    |                  |    Number of Taps     |
    |       Mode       |  32    64   128   256 |
    --------------------------------------------
    | FIR Filter       | 175   132    71    37 |
    | Adaptive Filter  | 105    69    37    19 |
    --------------------------------------------

SIGNAL DESCRIPTION

The DSP56200 is a 28-pin dual-in-line package (DIP) integrated circuit (Figure 1). Its signals can logically be grouped into the following categories:

    Host Interface
    Cascade Interface
    Clocks
    Power

Descriptions of these signals are presented in the following paragraphs.

HOST INTERFACE

D0-D7 (Data Bus)
These eight pins provide a bidirectional data bus for communication with a host processor. The pins remain in the high-impedance state unless both ~RD and ~CS are asserted.

A0-A3 (Register Address pins)
A0-A3 select (in conjunction with the least significant bit of the Configuration Register) which register will be addressed when the Chip Select line is brought low and a read or write operation is performed.

~CS (Chip Select)
This pin (active low) enables accesses to the chip operating registers. When not asserted, the D0-D7 lines will go into the high-impedance state and all access to the chip will be disabled.

~RD (Read Strobe)
When ~RD is asserted, the contents of the register specified by A0-A3 will be driven onto D0-D7. When ~RD is high, pins D0-D7 go into the high-impedance state.

~WR (Write Strobe)
This pin (active low) enables host writes to the register specified by A0-A3. Data on D0-D7 must be valid for the specified setup time before the rising edge of ~WR.

CASCADE INTERFACE

SDI (Serial Data Input)
This pin is used in the cascade mode to receive data from the last tap of the data shift register in the preceding DSP56200 chip. It connects to the SDO pin of the previous chip in cascade. If the chip is first in cascade or is used in standalone mode, this pin should be grounded.

SDO (Serial Data Output)
This pin is used in the cascade mode to pass the last data sample in the data shift register to the next DSP56200 in the cascade. It connects to the SDI pin of the next DSP56200. The output of this pin is typically not connected if the chip is used in standalone mode or if it is the last chip in a cascaded system.

SSI (Serial Sum Input)
This pin is used in the cascade mode to receive the partial sums from preceding stages. If the chip is first in the cascade or is used in standalone mode, this pin should be grounded.

SSO (Serial Sum Out)
This pin is primarily used in the cascade mode to pass the partial sums to the next DSP56200 in the cascade. The SSO pin is usually connected to the SSI pin of the next chip in the cascade. In the adaptive filter mode, the SSO pin of the last chip in the cascade is connected to the SEI pin on all chips in the cascade, including itself. This pin should not be connected in standalone FIR modes (single and dual FIR filters).

SEI (Serial Error Input)
This pin is used in the adaptive filter mode. It provides the means of receiving the error term output from the last chip in the cascade. In standalone Adaptive Filter mode, this pin is tied to the chip's own SSO pin. In cascade Adaptive Filter mode, this pin is tied to the SSO pin of the last chip in the cascade. This pin should be grounded in the Single FIR Filter mode or the Dual FIR Filter mode.

CLOCKS AND POWER

CLOCK (Clock Input)
This pin accepts the input clock for the DSP56200. The internal and external clocking frequencies are the same, and the maximum frequency for this input is 10.25 MHz. In cascaded systems, all DSP56200s must be driven from the same clock source. When CLOCK is held low, the device enters a power-down mode.

START (Start Processing Command)
This pin is used to provide a second clock to the chip at the system's sampling rate. This clock must be synchronized with the signal on the CLOCK pin in order to ensure proper operation.

VDD (Power) - 3 pins

VSS (Ground) - 3 pins

INTERFACE DESCRIPTION

The DSP56200 has been designed for simple interfacing to a large number of host processors. The chip's asynchronous interface meets the tight timing margins of fast DSP processors, and it also interfaces to slower microcomputer/microprocessor and DSP chips. The parallel interface resembles that of a fast static RAM.

The speed required of the host processor is determined by two considerations: the time required for I/O with the DSP56200, and the time required by the host for additional processing of a sample. The sum of these intervals must be less than the time between samples (the sampling period). Slower, single-chip microcomputers may be adequate hosts in systems with lower sampling rates, and one host may even be able to support several different filters. For high sample rate applications, a fast DSP processor may be required to complete the necessary I/O in the allotted time.

REGISTER MODEL

The DSP56200 is initialized and accessed by the host through a set of control and data transfer registers. In addition, the registers provide access to values in the Coefficient and Data RAMs, allowing unused memory to be used as auxiliary system storage. All register access occurs through the Asynchronous Parallel Interface Unit. The control and data registers are double buffered. Information is only passed from the first buffer into the second when the chip receives a pulse on its START pin. Note that no host access is allowed while the signal on the START pin is asserted.

Registers in the DSP56200 have been divided into 2 banks of sixteen registers (Figure 9). Bank 0 contains the registers commonly accessed during real-time processing, and Bank 1 registers are used for initializing the chip. The two banks share one common register, the Configuration Register, located at address 0F (hexadecimal) in each bank.
Switching banks is done by complementing the least significant bit of this common register. Once the desired bank has been selected, the registers are accessed using A0-A3, ~RD or ~WR, and ~CS. The registers have been ordered so that they can be accessed by the host using a simple autoincrementing address mode.

Upon power-up, the user must initialize the chip's registers and RAMs. This normally involves writing into the Configuration register to select the mode of operation and to select Bank 1. Then, the FIR Tap Length register is written, which programs the number of taps and also resets the chip. Upon reset, the chip timing is initialized and the contents of the Data RAM are undefined. Next, the Configuration register is accessed to change to Bank 0. Then the RAMs are usually initialized by writing a valid coefficient into each location of the Coefficient RAM and a zero into each location of the Data RAM. One Data RAM location and one Coefficient RAM location can be written each sampling period. The DSP56200 is then ready for real-time filtering.
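The register part of this power-up sequence can be sketched as a list of host writes. The Python below is a host-side illustration only: the function and constant names are invented for the example, the addresses and bit positions are taken from the Programming Model and Configuration Register descriptions in this document, and the RAM initialization that follows (one location per sampling period) is not shown.

```python
# Addresses from the Programming Model; bit positions from the
# Configuration Register description. All names are illustrative.
CONFIG_REG = 0xF              # present in both banks
FTL_REG = 0x1                 # Bank 1, FIR Tap Length

BANK1 = 1 << 0                # Bank 0 / Bank 1 Select
LEAKAGE_EN = 1 << 1
DC_TAP_EN = 1 << 2
ADAPT_DISABLE = 1 << 3
ROUND16 = 1 << 4
NOT_FIRST_IN_CASCADE = 1 << 5
DUAL_FIR = 1 << 6
ADAPTIVE = 1 << 7

def init_writes(num_taps, mode_bits):
    """Return the (address, value) writes for the register setup steps."""
    if mode_bits & DUAL_FIR and mode_bits & ADAPTIVE:
        raise ValueError("bits 7 and 6 both set: operation undefined")
    max_taps = 128 if mode_bits & DUAL_FIR else 256
    if not 4 <= num_taps <= max_taps:
        raise ValueError("tap count out of range for this mode")
    mode_bits &= 0xFE                        # bank bit is managed below
    return [
        (CONFIG_REG, mode_bits | BANK1),     # select mode and Bank 1
        (FTL_REG, num_taps - 1),             # taps minus one; resets the chip
        (CONFIG_REG, mode_bits),             # change back to Bank 0
    ]
```

For a 256-tap adaptive filter, init_writes(256, ADAPTIVE) yields writes of 81, FF, and 80 (hexadecimal) to addresses F, 1, and F respectively. How each write is driven onto A0-A3, D0-D7, ~CS, and ~WR depends on the host and is left out.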

+---------------------------------------------------------------------------+
| Notes for below:                                                          |
| [1] The Configuration register is readable at 0E and 0F (hexadecimal)     |
| [2] Not Available for Use                                                 |
+---------------------------------------------------------------------------+

           BANK 0 (write)              BANK 0 (read)
           --------------              -------------
    Adr    Register Name               Register Name
     0     X1 - High                   Output - 3
     1     X1 - Low                    Output - 2
     2     D - High                    Output - 1
     3     D - Low                     Output - 0
     4     K - High                    Last Tap 1 - High
     5     K - Low                     Last Tap 1 - Low
     6     X2 - High                   Last Tap 2 - High
     7     X2 - Low                    Last Tap 2 - Low
     8     Data RAM - High             Data RAM - High
     9     Data RAM - Low              Data RAM - Low
     A     Coeff RAM - High            Coeff RAM - High
     B     Coeff RAM - Mid             Coeff RAM - Mid
     C     Coeff RAM - Low             Coeff RAM - Low
     D     RAM Address                 [2]
     E     [2]                         Configuration register [1]
     F     Configuration register      Configuration register [1]

           BANK 1 (write)              BANK 1 (read)
           --------------              -------------
    Adr    Register Name               Register Name
     0     Leakage                     [2]
     1     FIR Tap Length (FTL)        [2]
     2     [2]                         [2]
     3     [2]                         [2]
     4     [2]                         [2]
     5     [2]                         [2]
     6     [2]                         [2]
     7     [2]                         [2]
     8     [2]                         [2]
     9     [2]                         [2]
     A     [2]                         [2]
     B     [2]                         [2]
     C     [2]                         [2]
     D     [2]                         [2]
     E     [2]                         Configuration Register [1]
     F     Configuration Register      Configuration Register [1]

                    Figure 9. Programming Model

REGISTER DESCRIPTION

X1 Register, X2 Register
The X1 Register is a 16-bit register that functions as the data input register for both the FIR mode and the Adaptive FIR Filter mode. Data from the X1 register is copied into the Data RAM shift register once per START cycle. In the dual FIR mode an additional register called X2 functions as the second filter's data input register. The X2 register operates in a manner similar to X1.
Data from the X2 register is copied into the Data RAM shift register of the second FIR filter once per START cycle.

D Register (Adaptive FIR Filter Mode Only)
The D Register is a 16-bit register that functions as the reference (echo) input when the DSP56200 is used in the echo cancel/adaptive FIR filter mode. It is represented as d(n) in the adaptive filtering equations. This register is not used in the nonadaptive modes of operation.

K Register (Adaptive FIR Filter Mode Only)
The K register is a 16-bit register used only in the adaptive filtering mode. K is the loop gain used in the LMS adaptation process. It is multiplied by the error term and tap x(n-i) to generate an updated value for coefficient h(i) (see Appendix 1). Caution: K must always be a positive number, i.e., Bit 7 of the Most Significant Byte should be zero at all times.

Configuration Register
The Configuration Register is used to configure the modes and options of the DSP56200. The bits are defined as follows:

    Bit    Function
     7     FIR / Adaptive FIR
     6     Single FIR / Dual FIR
     5     Position in Cascade
     4     Rounding Enable
     3     Coeff. Update Disable
     2     dc Tap Enable
     1     Leakage Enable
     0     Bank 0 / Bank 1 Select

Bit 7, FIR / Adaptive FIR, determines whether the chip will operate as a finite impulse response (FIR) filter or as an adaptive FIR filter. This bit is set to "1" for an adaptive FIR filter.

Bit 6, Single FIR / Dual FIR, determines whether the filter will be configured as a single-channel or dual-channel FIR filter. In the dual mode, the tap lengths of both filters are the same and are controlled by the FIR Tap Length (FTL) register. The maximum value for the FTL in dual mode is limited to 127 (128 taps).
In the dual mode, Output Data Bytes 3 and 2 contain channel 1 output while Output Data Bytes 1 and 0 contain channel 2 output data (refer to Figure 9). Table II summarizes the valid configurations for the chip:

TABLE II. DSP56200 Mode Configurations

    Adaptive/Nonadaptive       Single/Dual
    (Configuration Bit 7)      (Configuration Bit 6)      Mode
             0                          0                 Single FIR Filter
             0                          1                 Dual FIR Filter
             1                          0                 Single Adaptive FIR Filter
             1                          1                 (Operation Undefined)

Bit 5, Position in Cascade, selects whether the chip is configured to operate as stand-alone/first in cascade, or in cascade but not the first chip. The first configuration is selected when this bit is a zero. The second configuration is selected when this bit is a one. This bit must be set to a zero whenever the DSP56200 is used in the Dual FIR Filter Mode.

TABLE III. DSP56200 Cascade Configurations

    Configuration Bit 5      System Configuration
             0               Single DSP56200 System (Standalone)
             0               First DSP56200 in a Cascaded System
             1               Not First DSP56200 in a Cascaded System

Bit 4, 16 Bit Rounding, selects whether the filter output will be represented as a 32-bit result in the Output Register, or as a rounded 16-bit result. In the latter case, data bytes 3 and 2 in the Output Register contain the valid 16-bit rounded result, and data bytes 1 and 0 contain invalid data. This bit is set to "1" for 16 bit rounding.

Bit 3, Adaptation Disable, is used to disable the LMS algorithm in the adaptive filter mode. When this bit is set to a "1", the chip will continue to compute error terms using the last set of updated filter coefficients. In echo cancelling applications, this bit is typically set when 'double talk' is detected.

Bit 2, dc Tap Enable, is set to a "1" to turn on the dc tap. The dc tap looks like a tap in the data shift register with a fixed value of 7FFF (hexadecimal). The dc tap is multiplied by its corresponding (last) coefficient during the FIR filtering phase, and is also used when updating the last coefficient in adaptive filter mode.
The dc tap is normally used in the adaptive filter mode to remove any dc offset in the A/D converter. The dc tap is substituted for the last tap in the data shift register when it is enabled. The last tap data is not lost, however, and will be transferred correctly.

Bit 1, Leakage Enable, is set to a "1" when the use of leakage is desired in the coefficient update calculation.

Bit 0, Register Bank Select, selects which Register Bank is to be accessed. The Configuration register appears in both Bank 0 and Bank 1, allowing this bit to always be available for control of register bank selection. This bit is set to "1" for access to Bank 1.

Leakage Register (Adaptive FIR Filter Mode Only)
The leakage register is used in the adaptive filter mode when coefficient update is enabled (Bit 3 of the Configuration register, Adaptation Disable, is "0") and when the Leakage Enable bit (Bit 1) is set to a "1". Leakage is an 8-bit magnitude used to control coefficient drift in adaptive filtering.

FIR Tap Length Register
This 8-bit register determines the number of taps used in the FIR filter. The register is loaded with the number of taps minus one. For example, if a 256-tap filter is desired, this register is loaded with 255. Valid values range from 3 to 255 in Single FIR or Single Adaptive FIR Filter modes, and from 3 to 127 in the Dual FIR mode. Writing to this register also resets the chip. Normally this register is written by the host upon power-up.

RAM Address Register
This 8-bit register specifies which location will be selected during accesses to the Coefficient and Data RAM. It allows access to taps being used within the filter and to any memory not used in the FIR filter calculation, and automatically postincrements once each sampling period. Note that the RAM used for auxiliary system storage will not be overwritten during the FIR filtering process.

Coefficient RAM Access Register
This 24-bit register allows the user to read or write any location in the Coefficient RAM.
The host processor reads a RAM location by asynchronously writing the desired address into the RAM Address Register, waiting for two pulses to occur on the START pin (i.e., 2 sample periods), and then reading the value of this register. A write is performed by asynchronously writing the desired value to this register. The actual write operation to the Coefficient RAM occurs during the following sampling period. All locations in the Coefficient RAM can be accessed using the RAM Address Register.

Data RAM Access Register
This 16-bit register allows the user to read or write any location in the Data RAM. The host processor reads a RAM location by asynchronously writing the desired address into the RAM Address Register, waiting for two pulses to occur on the START pin (i.e., 2 sample periods), and then reading the value of this register. A write is performed by asynchronously writing the desired value to this register. The actual write operation to the Data RAM occurs during the following sampling period. If the desired address resides within the FIR filter structure, the DSP56200 automatically performs a logical-to-physical address conversion and correctly accesses the desired filter tap.

Last Tap 1, Last Tap 2
The Last Tap 1 Register provides the user with a copy of the last data sample in the data shift register. When in the Dual FIR Filter mode, Last Tap 2 contains the last data sample for the second FIR filter. If the dc tap option is selected, these registers still contain the last data sample rather than the constant 7FFF (hexadecimal) used in dc tap calculations. These registers are provided only for convenience and are not required when cascading chips, since the last tap is also transmitted serially to the next chip in the cascade. These registers, in conjunction with the X1 and X2 input registers, are useful for signal power calculations.
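One way a host might combine the input and last-tap values for a signal power estimate is a running sum of squares over the samples currently held in the filter: add the square of the sample entering and subtract the square of the sample leaving. The Python sketch below assumes that approach. The function names are illustrative, the shift register is simulated with a list, and the host is assumed to read the last tap before the new sample displaces it. The document itself does not prescribe this method.

```python
def update_energy(energy, new_sample, leaving_sample):
    """Slide a sum of squares forward by one sample."""
    return energy + new_sample * new_sample - leaving_sample * leaving_sample

def running_energy(samples, window):
    """Yield the energy of the last `window` samples after each new sample."""
    line = [0] * window             # stand-in for the data shift register
    energy = 0
    for x in samples:
        last_tap = line[-1]         # value a host would read from Last Tap 1
        energy = update_energy(energy, x, last_tap)
        line = [x] + line[:-1]      # new sample enters, oldest is discarded
        yield energy
```

The update costs two multiplies per sample regardless of the filter length, which is why having both ends of the shift register available to the host is convenient.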
Output Data Register

The Output Data Register bytes 3 through 0 contain the final FIR or adaptive FIR filter output. In the single channel mode the result will be 4 bytes (32 bits) unless the 16-bit rounding mode is set. In this case, bytes 3 and 2 contain a valid rounded 16-bit result and bytes 1 and 0 contain invalid data. Only 16 bits of output are available for each channel in Dual FIR Filter mode. Bytes 3 and 2 contain the output for the first FIR filter and bytes 1 and 0 contain the output for the second.

APPENDIX 1 - DIGITAL FILTER BASICS

1. Finite Impulse Response Filters

Finite impulse response (FIR) filtering is a common method of performing digital filtering. This filtering process can be viewed as a weighted moving average of past data values. The FIR filtering equation (A-1) describes how the filter's output is related to its input:

    y(n) = \sum_{i=0}^{N-1} h(i) x(n-i)                    (A-1)

where
    n      = the time index
    i      = the filter tap index
    N      = the number of taps in the FIR filter
    h(i)   = the "ith" coefficient in the FIR filter
    x(n-i) = the "ith" most recent data sample

Thus, the FIR filtering process is an accumulation of N product terms. The coefficients are usually pre-calculated using software design tools, and are based on the user's desired frequency response. These coefficients represent the impulse response of the desired filter. Equation A-1 describes the discrete convolution of a digitized input waveform with an impulse response function.

FIR filtering can be readily implemented in hardware as shown in Figure A-1. Each z**-1 represents a memory element with a delay of one sample period. N of these memory elements are cascaded together to form a tapped shift register of length N. This shift register is used to store the N most current data samples, each of which is referred to as a "filter tap". The width of the shift register is the number of bits used to represent the data samples - 16 bits in the case of the DSP56200.
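Equation A-1 and the tapped shift register can be sketched in software as follows. This is a floating-point illustration only; the names `fir_filter`, `taps`, and `acc` are hypothetical, and the DSP56200's fixed-point arithmetic and rounding are not modeled.

```python
def fir_filter(h, samples):
    """Apply equation A-1 to a sequence of input samples.

    h       -- list of N coefficients h(0) .. h(N-1)
    samples -- input samples x(n), oldest first
    Returns the list of outputs y(n), one per input sample.
    """
    taps = [0.0] * len(h)              # shift register, taps[i] holds x(n-i)
    outputs = []
    for x in samples:
        taps = [x] + taps[:-1]         # shift in the new sample, drop the oldest
        acc = 0.0
        for coeff, tap in zip(h, taps):
            acc += coeff * tap         # one multiply-accumulate per tap
        outputs.append(acc)
    return outputs
```

Feeding a unit impulse through the sketch returns the coefficients themselves, which illustrates why the coefficients are called the impulse response: `fir_filter([0.5, 0.25, 0.125], [1, 0, 0, 0])` gives `[0.5, 0.25, 0.125, 0.0]`.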
At the start of a sample period, a new data sample is entered into the shift register and the oldest data sample is discarded. Hardware is also required to calculate and accumulate the product terms. This is usually done with a single multiplier-accumulator (MAC) in order to conserve hardware. The MAC multiplies a coefficient with its corresponding data sample and adds this product with the sum of the previous products already in the accumulator. Thus, N multiply-accumulates occur each time a new sample is shifted into the FIR filter.

            -------          -------                 -------
 Input,    |   -1  |        |   -1  |               |   -1  |        Last
 x(n) ---->|  z    |---+--->|  z    |---+-- ... --->|  z    |---+--> Tap
           |       |   |    |       |   |           |       |   |
            -------    |     -------    |            -------    |
                       v                v                       v
                      ---              ---                     ---
              h(0) ->| X |     h(1) ->| X |   ...    h(N-1) ->| X |
                      ---              ---                     ---
                       |                |                       |
                       |                 ------                 |
                       |                       |                |
                       |                       v                |
                       |                      ---               |
                        -------------------->| + |<-------------
                                              ---
                                               |
                                               v
                                         Output, y(n)

    Figure A-1. Structure of a Finite Impulse Response (FIR) Filter

FIR filters have numerous advantages. One benefit of digital hardware is that there is no drift with temperature, voltage or age, giving digital filters a significant advantage over their analog counterparts. In addition, FIR filters can be designed to have a linear phase characteristic, and are inherently stable since they have no poles. The linear phase characteristic is desirable since it ensures that an input data signal is always delayed by a constant length of time, independent of its frequency content.

FIR filters find uses in many areas of signal processing. Typical applications include differentiating, notch filtering, bandsplitting filters, matched filtering, filter banks, interpolation, and Hilbert Transforms.

2. Adaptive Filters

Adaptive filters are a special class of filters used to solve a unique set of signal processing problems. These filters have two inputs, x(n) and d(n), which are usually correlated in some manner (Figure A-2).
One input, x(n), is passed through an internal time varying filter, which tries to form an estimate of the desired input, d(n). The parameters of this internal filter eventually converge to a point where the internal filter can accurately estimate the desired input, minimizing the adaptive filter's output. This output represents the difference or error, e(n), between the desired signal and the estimate of the desired signal. There are two aspects to adaptive filters - the internal filter structure and the adaptation algorithm.

 d(n) --------------------------------
                                    + |
                                      v
             --------------      -   ---
 x(n) ----->|     FIR      |------->| + |-------> e(n)
            |  Structure   |         ---     |
             --------------                  |
                    ^                        |
                    |                        |
                     ------------------------

    Figure A-2. Block Diagram of an Adaptive Filter

The most common adaptive filter implementation is based on an FIR filter structure whose coefficients are adapted using the Least-Mean-Square (LMS) algorithm. First, the signal x(n) is FIR filtered using equation A-1 to provide an estimate of the second input, d(n). The error in the estimate is then calculated:

    e(n) = d(n) - \sum_{i=0}^{N-1} h(i) x(n-i)             (A-2)

This error term is then used to modify every FIR filter coefficient using the following equation derived from the LMS adaptation algorithm:

    h_{new}(i) = h_{old}(i) + K e x(n-i)                   (A-3)

where
    h_{new}(i) = the value of the updated "ith" coefficient to be used during the next sample period
    h_{old}(i) = the value of the "ith" coefficient during the current sample period
    K          = the gain constant
    e          = the calculated error term
    x(n-i)     = the "ith" most recent data sample

At the beginning of the next sample period, two new samples, x(n) and d(n), are shifted into the system and the process repeats. After several iterations, the FIR filter coefficients converge to values which consistently minimize the mean square error (the filter's output). At this point, the adaptive filter is now able to estimate the d-input by passing the x-input through the FIR filtering hardware.
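Equations A-1 through A-3 can be exercised with a small floating-point sketch. Everything here is an illustrative assumption rather than a model of the chip: the names `lms_step` and `identify`, the gain K = 0.1, the uniform random input, and the made-up "unknown" system that generates d(n) from x(n). The DSP56200's fixed-point arithmetic and leakage term are not modeled.

```python
import random


def lms_step(h, x_hist, d, K):
    """One sample period of LMS. x_hist[i] holds x(n-i)."""
    y = sum(hi * xi for hi, xi in zip(h, x_hist))            # equation A-1
    e = d - y                                                # equation A-2
    h_new = [hi + K * e * xi for hi, xi in zip(h, x_hist)]   # equation A-3
    return h_new, e


def identify(unknown, n_samples=2000, K=0.1, seed=1):
    """Adapt an FIR filter until it mimics the `unknown` impulse response."""
    rng = random.Random(seed)
    h = [0.0] * len(unknown)
    x_hist = [0.0] * len(unknown)
    errors = []
    for _ in range(n_samples):
        x_hist = [rng.uniform(-1.0, 1.0)] + x_hist[:-1]      # shift in x(n)
        d = sum(u * xi for u, xi in zip(unknown, x_hist))    # desired input d(n)
        h, e = lms_step(h, x_hist, d, K)
        errors.append(e)
    return h, errors
```

Calling `identify([0.5, -0.25, 0.125, 0.0625])` drives the error toward zero in this noise-free setup, and the returned coefficients approach the impulse response of the system relating x(n) to d(n).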
It is interesting to note that once the filter has converged, the coefficients of the internal FIR filter resemble the impulse response of a filter whose input is x(n) and output is d(n). Equation A-3 is derived from an approximation of an equation from the steepest descent algorithm, where the minimum error point is found by updating in the direction opposite the gradient [1,2].

The unique characteristics of adaptive filters allow them to be used in applications such as echo cancellers, adaptive equalization of telephone lines, noise cancelling, system modeling, prediction, deconvolution, and adaptive control.

Adaptive Filtering Reference Books:

1. M. Bellanger, Digital Processing of Signals, John Wiley & Sons, New York, N.Y., 1984.
2. B. Widrow and S.D. Stearns, Adaptive Signal Processing, Prentice-Hall Inc., Englewood Cliffs, N.J., 1985.

APPENDIX 2 - ECHO CANCELLING WITH THE DSP56200

One popular application of adaptive filters is in the area of echo cancellation. Consider the telephone system shown in Figure A-3. A 4-wire link is used for long distance transmission and is converted to a local 2-wire link through a hybrid transformer. Ideally, all of the transmitted message should pass through the transformer to the 2-wire link, but due to impedance mismatches, some of the signal is actually reflected into the Rx path at the hybrid. This reflection results in an audible echo received by the speaker. As the distance of transmission increases, the delay of the echo also increases.

 Tx -->--------------------+-------------
                           |             \
                           | x(n)         \
                           v               \
                       ---------            \
                      |         |            \
 Long                 |         |           ----------       Local
 Distance        ---->|         |          |   4:2    |<---> 2-Wire
 Link           |     |         |          |  Hybrid  |      Link
                |     |         |           ----------
                |     |         |            /
                |      ---------            /
                |          |               /
                |          v              /
           e(n) |         ---   d(n)     /
 Rx --<---------+--------| + |<----------
                          ---

    Figure A-3. Echo Canceller Application

An adaptive filter provides an excellent solution to this problem.
The filter synthesizes an impulse response modeling the hybrid connection and the transmission delays. The input is passed through the FIR structure, producing a synthetic echo which is then subtracted from the actual echo. As the adaptive filter converges, the error decreases, resulting in less echo returned to the speaker.

The Motorola DSP56200 is well suited to solving the above problem. Long delays can be cancelled by a set of chips in cascade. The adaptation process can be inhibited with a control bit, eliminating any incorrect adaptation when both parties talk simultaneously, a situation called doubletalk. A programmable leakage term is included on the chip to prevent the coefficient drift which results from narrowband input signals such as tones. The DSP56200 can also be used in a similar manner to provide acoustic echo cancellation for speaker phones.
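The effect of inhibiting adaptation during doubletalk can be illustrated with a small simulation. This is a sketch under stated assumptions, not a model of the DSP56200: the name `run_canceller`, the 4-tap echo path, the gain K = 0.1, the random signals, and the idea that doubletalk is known to start at sample 2000 are all made up for illustration. Doubletalk detection itself, the leakage term, and fixed-point effects are not modeled.

```python
import random


def run_canceller(echo_path, freeze_on_doubletalk, K=0.1, seed=7):
    """Return the final distance between the filter and the true echo path."""
    rng = random.Random(seed)
    h = [0.0] * len(echo_path)
    x_hist = [0.0] * len(echo_path)
    for n in range(3000):
        doubletalk = n >= 2000                               # assumed to be known
        x_hist = [rng.uniform(-1.0, 1.0)] + x_hist[:-1]      # far-end sample x(n)
        echo = sum(p * xi for p, xi in zip(echo_path, x_hist))
        near = rng.uniform(-1.0, 1.0) if doubletalk else 0.0  # near-end speech
        d = echo + near                                      # signal on the Rx path
        e = d - sum(hi * xi for hi, xi in zip(h, x_hist))    # echo-cancelled output
        if not (doubletalk and freeze_on_doubletalk):
            # LMS update (equation A-3); skipped while adaptation is inhibited
            h = [hi + K * e * xi for hi, xi in zip(h, x_hist)]
    return sum((hi - p) ** 2 for hi, p in zip(h, echo_path)) ** 0.5
```

Comparing `run_canceller(path, True)` with `run_canceller(path, False)` for the same `path` shows the point of the control bit: with adaptation frozen, the converged coefficients are preserved, while the run that keeps adapting is pulled away from the echo path by the near-end speech.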