rjk@sequent.UUCP (Robert Kelley) (09/27/89)
I came across an article:

Kronland-Martinet, Richard. 1988. "The Wavelet Transform for Analysis, Synthesis, and Processing of Speech and Music Sounds" Computer Music Journal 12(4):11-20.

It describes a method of signal analysis and synthesis which seems ideally suited to the pitch-shifting or CD speed-altering task mentioned here recently.

I'm fascinated with this article and would like to learn more about this technique. It reminds me of a windowed DFT in a way, except that the analyzing function is windowed, not the input data. I would like to hear about anyone experimenting with this technique, or even thinking about it.

By the way, the EDN article confused me because what was being described was not an FFT but a DFT.
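To make the "windowed analyzing function" comparison concrete, here is a rough sketch in plain Python. It is not taken from the Kronland-Martinet article: the Gaussian envelope, the function names, and the defaults (`window_s`, `cycles`) are all assumptions chosen for illustration. Both coefficients correlate the signal with a windowed complex sinusoid; the only difference is that the windowed DFT keeps the window length fixed, while the wavelet-style version stretches or shrinks the window so it always spans the same number of cycles.

```python
import cmath
import math


def gaussian_windowed_sinusoid(freq_hz, sample_rate, duration_s):
    """Complex sinusoid at freq_hz under a Gaussian envelope (assumed shape)."""
    half = int(round(duration_s * sample_rate / 2))
    # Envelope spans roughly +/- 3 standard deviations over duration_s.
    sigma = max(duration_s * sample_rate / 6.0, 1e-9)
    return [
        math.exp(-0.5 * (n / sigma) ** 2)
        * cmath.exp(2j * math.pi * freq_hz * n / sample_rate)
        for n in range(-half, half + 1)
    ]


def correlate_at(signal, kernel, center):
    """Inner product of the signal with the kernel centred on sample `center`."""
    half = len(kernel) // 2
    total = 0j
    for k, w in enumerate(kernel):
        i = center - half + k
        if 0 <= i < len(signal):  # samples outside the signal count as zero
            total += signal[i] * w.conjugate()
    return total


def windowed_dft_coeff(signal, freq_hz, sample_rate, center, window_s=0.02):
    """Same window length at every analysis frequency."""
    kernel = gaussian_windowed_sinusoid(freq_hz, sample_rate, window_s)
    return correlate_at(signal, kernel, center)


def wavelet_coeff(signal, freq_hz, sample_rate, center, cycles=8.0):
    """Window length is a fixed number of cycles, so it shrinks as freq rises."""
    kernel = gaussian_windowed_sinusoid(freq_hz, sample_rate, cycles / freq_hz)
    norm = math.sqrt(sum(abs(w) ** 2 for w in kernel))  # unit-energy kernel
    return correlate_at(signal, kernel, center) / norm


if __name__ == "__main__":
    sample_rate = 8000
    tone = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(2000)]
    for f in (220, 440, 880):
        print(f, abs(wavelet_coeff(tone, f, sample_rate, center=1000)))
```

Run on a 440 Hz test tone, the magnitude should peak at the 440 Hz analysis frequency. The point of the sketch is only the kernel construction: in `wavelet_coeff` the time resolution improves at high frequencies because the kernel gets shorter, which a fixed-window DFT cannot do.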
malcolm@Apple.COM (Malcolm Slaney) (10/01/89)
In article <22313@sequent.UUCP> rjk@sequent.UUCP (Robert Kelley) writes:
>I came across an article:
>Kronland-Martinet, Richard. 1988. "The Wavelet Transform for Analysis,
> Synthesis, and Processing of Speech and Music Sounds" Computer Music
> Journal 12(4):11-20.
>It describes a method of signal analysis and synthesis which seems ideally
>suited to the pitch-shifting or CD speed-altering task mentioned here
>recently.

I think the Wavelet Transform is a partially developed rehash of two ideas that have been around for a long time. If you are interested in recognition and detection then you should look at the Scale Space theory first proposed by Andy Witkin somewhere around 1984 in the AI literature. The other idea is often called the Wigner distribution and was proposed in the 50's. The latest reference to it I have seen is an MIT thesis by Riley called something like "Time Frequency Representation of Speech." The thesis was published as a book last year.

Drop me a note if you need better references.

Malcolm Slaney
Speech and Hearing Project
malcolm@apple.com
dean@image.soe.clarkson.edu (Dean Swan) (10/03/89)
From article <35148@apple.Apple.COM>, by malcolm@Apple.COM (Malcolm Slaney):
> In article <22313@sequent.UUCP> rjk@sequent.UUCP (Robert Kelley) writes:
>>I came across an article:
>>Kronland-Martinet, Richard. 1988. "The Wavelet Transform for Analysis,
>> Synthesis, and Processing of Speech and Music Sounds" Computer Music
>> Journal 12(4):11-20.
>>It describes a method of signal analysis and synthesis which seems ideally
>>suited to the pitch-shifting or CD speed-altering task mentioned here
>>recently.
>
> I think the Wavelet Transform is a partially developed rehash of two ideas
> that have been around for a long time. If you are interested in recognition
> and detection then you should look at the Scale Space theory first proposed
> by Andy Witkin somewhere around 1984 in the AI literature. The other idea
> is often called the Wigner distribution and was proposed in the 50's. The
> latest reference to it I have seen is an MIT thesis by Riley called something
> like "Time Frequency Representation of Speech." The thesis was published as
> a book last year.
>
> Drop me a note if you need better references.
>
> Malcolm Slaney
> Speech and Hearing Project
> malcolm@apple.com