[comp.dsp] speech + noise

mlaprade@x102a.harris-atd.com (laprade maria 42641) (04/05/91)

I have a clean speaker database where I use the following to extract
features:
use energy and zero crossing to find voiced speech,
pre-emphasize and high pass filter,
autocorrelate.
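
For readers who want to see the steps listed above in concrete form, here is
a minimal sketch in Python with NumPy. It is not the poster's code. The frame
length, hop size, pre-emphasis coefficient, number of lags, and both
thresholds are assumptions chosen for illustration, and all function names
are hypothetical. The separate high-pass filter is not shown; pre-emphasis
is itself a first-order high-pass.

```python
import numpy as np

def frame_signal(x, frame_len=240, hop=80):
    """Split a 1-D signal into overlapping frames, one per row.
    Assumes len(x) >= frame_len; 240/80 samples are illustrative values."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def short_time_energy(frames):
    """Sum of squared samples in each frame."""
    return np.sum(frames ** 2, axis=1)

def zero_crossing_count(frames):
    """Number of sign changes in each frame (zeros counted as positive)."""
    signs = np.sign(frames)
    signs[signs == 0] = 1
    return np.sum(signs[:, 1:] != signs[:, :-1], axis=1)

def pre_emphasize(x, alpha=0.95):
    """First-order pre-emphasis y[n] = x[n] - alpha * x[n-1]; alpha is assumed."""
    return np.append(x[0], x[1:] - alpha * x[:-1])

def autocorrelate(frame, max_lag=12):
    """Autocorrelation of one frame for lags 0..max_lag."""
    full = np.correlate(frame, frame, mode="full")
    mid = len(frame) - 1  # index of lag 0 in the full output
    return full[mid:mid + max_lag + 1]

def extract_features(x, energy_thresh, zc_thresh, max_lag=12):
    """Keep frames with high energy and few zero crossings (treated as
    voiced), then autocorrelate the pre-emphasized version of each one."""
    frames = frame_signal(x)
    voiced = (short_time_energy(frames) > energy_thresh) & \
             (zero_crossing_count(frames) < zc_thresh)
    emph_frames = frame_signal(pre_emphasize(x))
    return np.array([autocorrelate(f, max_lag) for f in emph_frames[voiced]])
```

Note that the voiced/unvoiced decision here depends on two fixed thresholds,
so anything that shifts the energy or zero crossing statistics of the input
changes how many frames survive.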

I was satisfied with those results and decided to add noise. I added
white noise and processed it exactly the same way. For most of the speakers
I have lost about half the number of features extracted. (I think I'm
losing them because the pitch is calculated at too high a frequency. I
do know that the zero crossing value has now doubled, and the energy
threshold has increased about 100x for an SNR of 10dB.) I thought that was
reasonable. However, I am left with fewer than 10 samples for one particular
speaker.
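
As an illustration of the noise step described above, the sketch below adds
white Gaussian noise at a requested SNR and counts zero crossings before and
after. It is an assumption that the SNR is defined from the average power of
the whole signal (silences included) and that the noise is Gaussian; the post
does not say how either was done. The function names are hypothetical.

```python
import numpy as np

def add_white_noise(x, snr_db, rng=None):
    """Return x plus white Gaussian noise scaled to the requested SNR in dB.
    Signal power is the mean square of the whole signal (an assumption)."""
    rng = np.random.default_rng() if rng is None else rng
    signal_power = np.mean(x ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=x.shape)
    return x + noise

def count_zero_crossings(x):
    """Number of sign changes in x (zeros counted as positive)."""
    signs = np.where(x >= 0, 1, -1)
    return int(np.sum(signs[1:] != signs[:-1]))
```

Comparing count_zero_crossings(x) with count_zero_crossings(add_white_noise(x, 10))
on a low-frequency test tone shows the count going up, since white noise has
energy at all frequencies. A voicing test that requires a low zero crossing
count will therefore tend to reject more frames once noise is added.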

My question is: am I incorrect to add in white noise, or do I need to
modify my processing equations? I'm neither a speech nor a signal
processing type engineer; I just need a database to feed to my neural
network, so please be explicit with your help. Thanks.



-- 
Maria Laprade			ARPA: mlaprade@x102a.harris-atd.com
Harris Corporation - GASD	UUCP: ...uunet!x102a!mlaprade
Palm Bay, Florida		voice: (407)727-4920

malcolm@Apple.COM (Malcolm Slaney) (04/06/91)

In article laprade@x102a.ess.harris.com (laprade maria 42641) writes:
>My question is am I incorrect to add in white noise, or do I need to
>modify my processing equations. I'm neither a speech nor a signal
>processing type engineer

Ooooh, I love it....the blind (somebody who doesn't understand the problem)
leading the blind (a neural net).

Nothing personal......but wouldn't it make more sense to do NN research with 
a problem that you understand?

If not, check out Ron Cole's paper in the Albuquerque ICASSP (1990).
He compares several different representations as input to a NN.  Your data
isn't necessarily bad, but your representations aren't robust in the
face of noise.  Also check out Melvyn Hunt's paper in the NY ICASSP.

								Malcolm