tj@xn.ll.mit.edu (Thomas E. Jones) (11/19/90)
Since many people had asked about low rate speech vocoding and Sinusoidal coding references, I thought I'd post them.

Sinusoidal Coding:

This method gives very good quality speech and is also quite robust, working well under high background noise or music. Robert J. McAulay and Thomas F. Quatieri can be contacted at: Speech Systems Technology Group, MIT Lincoln Lab, 244 Wood Street, Lexington, MA 02172 (617)981-7240.

(1) R. J. McAulay and T. F. Quatieri, "Speech Processing Based on a Sinusoidal Model," The Lincoln Laboratory Journal, Fall 1988, V. 1, N. 2.

(2) R. J. McAulay and T. F. Quatieri, "Speech Analysis/Synthesis Based on a Sinusoidal Representation," IEEE Trans. Acoust. Speech Signal Process. ASSP-34, 744 (1986).

(3) R. J. McAulay and T. F. Quatieri, "Multirate Sinusoidal Transform Coding at Rates from 2.4 kbps to 8.0 kbps," Int. Conf. on Acoustics, Speech, and Signal Processing 87 (IEEE, New York, 1987), p. 1645.

(4) T. F. Quatieri and R. J. McAulay, "Sinewave-based Phase Dispersion for Audio Preprocessing," Int. Conf. on Acoustics, Speech, and Signal Processing 88 (IEEE, New York, 1988), p. 2558.

300 BPS LPC-Segment Vocoder:

This algorithm has plenty of areas left for improvement. Implementing it requires a good deal of hardware, in both speed and memory, but I believe that off-the-shelf DSP plug-in boards with lots of memory may soon make it implementable without quite as much time and cost. Chunks of speech spectra from a codebook are matched to incoming speech, and the index of the best match (plus a couple of other parameters) is transmitted. Codebook design, different weightings when calculating the Euclidean distance, and codebook size reduction are just a few of the many areas that could be investigated to improve quality.

This algorithm was developed by BBN, and a real-time implementation was made by Dr. Edward Hofstadter at MIT Lincoln Lab, who can be reached at the same address above.
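For readers who want to see the codebook matching step in concrete terms, here is a minimal sketch in Python. It is only an illustration of a nearest-neighbour search under a weighted Euclidean distance, not the BBN algorithm itself. It assumes each chunk of speech spectra has already been reduced to a fixed-length vector of numbers, and the names (weighted_sq_distance, nearest_codeword, encode, decode) and the toy codebook are all made up for the example.

```python
from typing import List, Optional, Sequence, Tuple

Vector = Sequence[float]


def weighted_sq_distance(a: Vector, b: Vector,
                         weights: Optional[Vector] = None) -> float:
    """Squared Euclidean distance with optional per-dimension weights."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    if weights is None:
        weights = [1.0] * len(a)  # plain, unweighted distance
    if len(weights) != len(a):
        raise ValueError("weights must match the vector length")
    return sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b))


def nearest_codeword(chunk: Vector, codebook: Sequence[Vector],
                     weights: Optional[Vector] = None) -> Tuple[int, float]:
    """Return (index, distance) of the codebook entry closest to chunk."""
    if not codebook:
        raise ValueError("codebook is empty")
    best_index, best_dist = -1, float("inf")
    for i, codeword in enumerate(codebook):
        d = weighted_sq_distance(chunk, codeword, weights)
        if d < best_dist:  # ties keep the lowest index
            best_index, best_dist = i, d
    return best_index, best_dist


def encode(chunks: Sequence[Vector], codebook: Sequence[Vector],
           weights: Optional[Vector] = None) -> List[int]:
    """Transmitter side: replace each chunk by the index of its best match."""
    return [nearest_codeword(c, codebook, weights)[0] for c in chunks]


def decode(indices: Sequence[int],
           codebook: Sequence[Vector]) -> List[List[float]]:
    """Receiver side: look the indices back up in the same codebook."""
    return [list(codebook[i]) for i in indices]


if __name__ == "__main__":
    # Toy 3-entry codebook of 2-dimensional vectors (hypothetical values).
    toy_codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]
    toy_chunks = [[0.9, 1.2], [3.0, 0.1], [0.1, -0.2]]
    print(encode(toy_chunks, toy_codebook))  # prints [1, 2, 0]
```

Only the indices cross the channel, which is where the low bit rate comes from: a codebook of N entries costs about log2(N) bits per chunk. Changing the weights changes which entry wins, which is why the choice of weighting is one of the places to experiment.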
William Russell at BBN consulted on the real-time implementation.

(1) S. Roucos, A. Wilgus, and William Russell, "A Segment Vocoder Algorithm for Real-Time Implementation," ICASSP '87 (IEEE, New York, 1987), p. 1949.

Thomas E. Jones - tj@xn.ll.mit.edu
--
tj@xn.ll.mit.edu or tj@ll-xn.arpa (one of these should work)
Thomas E. Jones, home (617) 924-8326 work (617) 981-5093