boris@PRODIGAL.PSYCH.ROCHESTER.EDU (Boris "transceiver" Goldowsky) (11/05/89)
Ever since I heard Laurie Anderson I've been wondering... how exactly do they do this filtering to change the sound of someone's voice, or to make it sound like 3 people singing at once...? Enlighten me, O net! (or recommend a good book) Bng
jsd@GAFFA.MIT.EDU (Jon Drukman) (11/05/89)
In article <3801@ur-cc.UUCP> boris@prodigal.psych.rochester.edu writes:
>Ever since I heard Laurie Anderson I've been wondering... how exactly
>do they do this filtering to change the sound of someone's voice, or
>to make it sound like 3 people singing at once...?

Well, there's two kinds of boxes. The 'vocoder' takes a microphone input and modulates the signal with an external source, usually a synthesizer circuit of some kind playing one note. The vocoder works great for speech, not so much for singing, although there's a brilliant passage in "Boom! There She Was" by Scritti Politti which features Roger Troutman singing some scat vocals through a vocoder which is being modulated by a Minimoog...

The harmonizer is a device which takes your voice and electronically alters the frequencies in it (how, I'm not exactly sure) to produce a harmony line with it.

With the advent of digital sampling technology, all this stuff is now a piece of cake, and you can buy cheap boxes to do it. I have a cartridge for my computer which, when coupled with appropriate software, can transform the pitch of any incoming signal. If you put a didgeridoo into it playing only one note (since that's all they can play) and then played a melody on a MIDI keyboard, it would 'play' the melody with a didgeridoo sound.

I used this effect for the live version of "Running Up That Hill" which was done at the Katemas party in July. I used it as basically a digital version of the Chipmunks vocal effect. It has the advantage that you can sing in your normal voice at normal speed and you still come out sounding like a weasel on helium, whereas the Chipmunks stuff was all done by speeding up the tape and involved speaking... really... slowly... so that the pacing came out right when the tape was played back fast.

All clear?

+---------------------- Is there any ESCAPE from NOISE? ----------------------+
|  |   |\       | jsd@gaffa.mit.edu | "I like George Bush, but this `kinder,  |
| \|on |/rukman | jsd@umass.bitnet  | gentler' crap is killing us." - D.Trump |
+-----------------------------------------------------------------------------+
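[Editorial sketch] The tape-speed ("Chipmunks") effect described above can be illustrated with a few lines of Python. This is only a rough model, not a description of any particular box: the function names, the 8000 Hz rate, the 220 Hz test tone, and the use of linear interpolation are all assumptions made for the example. It shows why plain fast playback raises the pitch and shortens the sound at the same time.

```python
import math

def resample(samples, ratio):
    """Play `samples` back `ratio` times faster (linear interpolation)."""
    out = []
    pos = 0.0
    while pos < len(samples) - 1:
        i = int(pos)
        frac = pos - i
        out.append(samples[i] * (1.0 - frac) + samples[i + 1] * frac)
        pos += ratio
    return out

def zero_crossing_freq(samples, rate):
    """Rough pitch estimate: upward zero crossings per second."""
    ups = sum(1 for a, b in zip(samples, samples[1:]) if a < 0.0 <= b)
    return ups * rate / len(samples)

RATE = 8000  # samples per second (assumed for the example)
tone = [math.sin(2 * math.pi * 220 * n / RATE) for n in range(RATE)]  # 1 s
fast = resample(tone, 2.0)  # "tape" running at double speed

print("original:", len(tone) / RATE, "s,", zero_crossing_freq(tone, RATE), "Hz")
print("sped up: ", len(fast) / RATE, "s,", zero_crossing_freq(fast, RATE), "Hz")
```

The sped-up copy comes out an octave higher but only half as long, which is why the tape version needed the slow delivery and why a real-time pitch shifter has to do something extra to keep the timing.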
donley@BLAKE.ACS.WASHINGTON.EDU (E. Donley Olson) (11/05/89)
In article <3801@ur-cc.UUCP> boris@prodigal.psych.rochester.edu writes:
>
>Ever since I heard Laurie Anderson I've been wondering... how exactly
>do they do this filtering to change the sound of someone's voice, or
>to make it sound like 3 people singing at once...?
>

Jon Drukman has answered some of this... regarding vocoders and such.

Harmonizers are strange little boxes. I believe that the harmonizers may work by digitally sampling (or otherwise) the incoming sound, i.e. Laurie's voice, and then playing it back "sped up" almost instantaneously... The problem with this method is that you have to "drop" or repeat bits of the sound when you do this, because when you play it back "sped up" it takes less time for the sound to occur, and when you play it back "slowed down" it takes too MUCH time to play it back. The solution found in tape decks that use this technique is to sample very quickly and to "cut out" the pieces that are extra, or stick in an extra sample every period if the voice is sped up. This is the least likely method because the "cuts" leave annoying glitches in the sound.

The other possible way this is accomplished is by doing some sort of frequency counting on the incoming sound and then dividing the frequency by a certain amount (or multiplying to make her sound like Dolly Parton). I once built a divider of this sort, but it was rather crude... But it might be what they do... People have been able to digitize the human voice into square waves, so why not...

I would NOT expect that harmonizers do complete spectral analysis on samples in real time -- even the Fairlight doesn't do that!

Any other possibilities?

- Eo
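[Editorial sketch] The frequency-dividing idea above can be mocked up in software. This is a guess at the general principle, not a reconstruction of the divider mentioned in the post or of any commercial harmonizer: the signal is squared up with a comparator and fed to a toggle flip-flop, which halves the frequency. All names, the 440 Hz input, and the 8000 Hz rate are assumptions for the example.

```python
import math

def square_up(samples):
    """Hard-limit a signal to +1/-1, like a comparator."""
    return [1 if s >= 0.0 else -1 for s in samples]

def divide_by_two(square):
    """Toggle the output on every rising edge of the input (a flip-flop)."""
    out = []
    state = 1
    prev = square[0]
    for s in square:
        if prev < 0 and s > 0:
            state = -state
        out.append(state)
        prev = s
    return out

def rising_edges(square):
    """Count -1 to +1 transitions."""
    return sum(1 for a, b in zip(square, square[1:]) if a < 0 < b)

RATE = 8000  # samples per second (assumed)
voice = [math.sin(2 * math.pi * 440 * n / RATE + 0.1) for n in range(RATE)]

squared = square_up(voice)
octave_down = divide_by_two(squared)

in_edges = rising_edges(squared)
out_edges = rising_edges(octave_down)
print("input rising edges per second: ", in_edges)
print("output rising edges per second:", out_edges)
```

The output follows the input one octave down, but it is a bare square wave: the loudness envelope and the timbre of the original are gone, which fits the "rather crude" verdict.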
bloch%mandrill@UCSD.EDU (Steve Bloch) (11/05/89)
boris@prodigal.psych.rochester.edu writes:
>Ever since I heard Laurie Anderson I've been wondering... how exactly
>do they do this filtering to change the sound of someone's voice, or
>to make it sound like 3 people singing at once...?

Jon Drukman writes:
>Well, there's two kinds of boxes. The 'vocoder' ...
>
>The harmonizer is a device which takes your voice and electronically
>alters the frequencies in it (how, I'm not exactly sure) to produce a
>harmony line with it.

Donley describes a sample-and-chop approach and a frequency-dividing approach.
>I would NOT expect that harmonizers do complete spectral analysis
>on samples in real time -- even the Fairlight doesn't do that!

Of course, the special-purpose FFT chips are getting faster every day. Let's see... to do it in real time, assuming mono input and say a 24 kHz sampling rate, you need to do a 1024-point FFT in 40 msec. That's within the capabilities of current hardware, I think. 'Course, you'd only get a precision of 24 Hz, which wouldn't be good enough to produce a clean harmony in the 500-2000 Hz range (1-3 quarter-tones). If you can do a 2048-point FFT in real time (here you have 80 msec to do it), the precision becomes 12 Hz. Check the newsgroup comp.dsp for more accurate answers.

>Any other possibilities?

Back to Jon:
>With the advent of digital sampling technology, all this stuff is now
>a piece of cake, and you can buy cheap boxes to do it. I have a
>cartridge for my computer which when coupled with appropriate software
>can transform the pitch of any incoming signal. If you put a didgeridoo
>into it playing only one note (since that's all they can play) and
>then played a melody on a MIDI keyboard, it would 'play' the melody
>with a didgeridoo sound.

If I understand this right, it's just straight play-back-faster-or-slower, which changes the ADSR parameters if you take it very far (like, more than half an octave or so).
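[Editorial sketch] The FFT estimate earlier in this post uses round numbers. The arithmetic can be checked as below, assuming the frequency precision is one bin width (sample rate divided by FFT size) and the time budget is however long it takes to collect one frame of samples. The function names, and the choice to express the error in quarter-tones of 50 cents each, are assumptions for the example.

```python
import math

def fft_frame_stats(sample_rate_hz, n_points):
    """Return (bin width in Hz, time to collect one frame in ms)."""
    bin_hz = sample_rate_hz / n_points
    frame_ms = 1000.0 * n_points / sample_rate_hz
    return bin_hz, frame_ms

def error_in_quarter_tones(freq_hz, error_hz):
    """Express an absolute error at freq_hz in quarter-tones (50 cents)."""
    cents = 1200.0 * math.log2((freq_hz + error_hz) / freq_hz)
    return cents / 50.0

for n_points in (1024, 2048):
    bin_hz, frame_ms = fft_frame_stats(24000, n_points)
    print(f"{n_points}-point FFT at 24 kHz: "
          f"bin width {bin_hz:.2f} Hz, frame time {frame_ms:.1f} ms")
    for freq in (500.0, 2000.0):
        qt = error_in_quarter_tones(freq, bin_hz)
        print(f"  one bin at {freq:.0f} Hz is about {qt:.2f} quarter-tones")
```

Because a bin has a fixed width in Hz, the same error is a much larger musical interval at low pitches than at high ones.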
But if all you want to do is echo a voice at a particular pitch (or several pitches), and you don't want to change what pitch it is too often, you can do it with an IIR filter, using very short digital feedback to build resonances at whatever pitches strike your fancy. I've always assumed that was how Laurie did it, as it's computationally very easy (to resonate two pitches, you need a four-pole filter, which only requires three adds and four multiplies per sample, and an 8086 can do that.) The only problem is that DESIGNING the IIR filter, figuring out the coefficients to suit the pitches you want to resonate, takes some work, and a grasp of complex analysis doesn't hurt. You don't want to do it in real-time.

By the way, you notice that whenever Laurie has her voice echoed on a particular fixed harmony it "rings" for a while? That's a direct effect of the IIR ("infinite impulse response" means that technically it rings forever, but it may drop below audibility in less than a second). How long it rings depends on how precise you want your pitches to be; if you want absolutely perfect tuning, it WILL ring forever, without attenuating.

Boris writes again:
>Or suggest a good book to read? [or words to that effect]

How about _Digital_Audio_Signal_Processing_(an_Anthology)_, edited by John Strawn, Wm. Kaufmann 1985? Or you could type "g comp.dsp".

"Writers are a funny breed -- I should know." -- Jane Siberry
bloch%cs@ucsd.edu
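[Editorial sketch] The resonating IIR filter described above can be tried out with two standard two-pole resonators run in parallel, giving four poles in total. This is one simple way to get the effect, assumed for illustration; nothing here is known to match the equipment actually used. The function names, the pitches (440 and 660 Hz), the 8000 Hz rate, and the pole radius r are all assumptions. A radius just under 1 gives a ring that dies away; a radius of exactly 1 gives the "rings forever" case.

```python
import math

def resonator_coeffs(freq_hz, sample_rate_hz, r):
    """Feedback coefficients for poles at radius r, angle 2*pi*f/fs."""
    w = 2.0 * math.pi * freq_hz / sample_rate_hz
    return 2.0 * r * math.cos(w), -(r * r)

def run_resonator(x, b1, b2):
    """y[n] = x[n] + b1*y[n-1] + b2*y[n-2] (two multiplies per sample)."""
    y1 = y2 = 0.0
    out = []
    for s in x:
        y = s + b1 * y1 + b2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out

def two_pitch_echo(x, f1, f2, sample_rate_hz, r=0.999):
    """Two resonators in parallel: four poles, four multiplies per sample."""
    a = run_resonator(x, *resonator_coeffs(f1, sample_rate_hz, r))
    b = run_resonator(x, *resonator_coeffs(f2, sample_rate_hz, r))
    return [p + q for p, q in zip(a, b)]

def peak(xs):
    """Largest absolute sample value."""
    return max(abs(v) for v in xs)

RATE = 8000  # samples per second (assumed)
impulse = [1.0] + [0.0] * (RATE - 1)  # a single click, then one second of silence

damped = two_pitch_echo(impulse, 440.0, 660.0, RATE, r=0.999)
undamped = two_pitch_echo(impulse, 440.0, 660.0, RATE, r=1.0)

print("r=0.999: start", peak(damped[:200]), "end", peak(damped[-200:]))
print("r=1.0:   start", peak(undamped[:200]), "end", peak(undamped[-200:]))
```

With r=0.999 the ring has faded to a small fraction of its starting level after one second, while with r=1.0 it is still going at about full strength. The coefficients are computed once per pitch, outside the per-sample loop, which matches the point that designing the filter is the part you don't want to do in real time.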