Quantile based noise estimation for spectral subtraction and Wiener filtering
Frequently Asked Questions (10)
Q2. What future work is mentioned in the paper "Quantile based noise estimation for spectral subtraction and Wiener filtering"?
The latter seems superior according to experimental evidence (Section 5).
Q3. How much noise is in the frequency bands?
In roughly 80-90% of the frames, the signal energy in a frequency band is low, i.e. close to the noise energy level; only 10-20% of the time does the band carry high-energy, voiced speech.
Q4. How many mel frequency coefficients are passed to the recognizer?
After a discrete cosine transform of the logarithmic filterbank outputs, the authors obtain 12 mel frequency cepstral coefficients, which, augmented by 12 regression coefficients, are passed to the recognizer.
Q5. How many frames are used to estimate the power spectrum of a speech recognizer?
As the estimation is more reliable if more data is available, the authors use the notation N(ω, t) to denote an estimate of N(ω) using all frames from the beginning of the utterance up to frame t.
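A minimal sketch of how such a running estimate N(ω, t) could be maintained frame by frame, assuming it is computed as a quantile over all frames seen so far. The class name `RunningQuantileNoise`, the quantile indexing convention, and the toy input are illustrative assumptions, not taken from the paper.

```python
import bisect


class RunningQuantileNoise:
    """Hypothetical helper: per-band quantile over frames 0..t."""

    def __init__(self, num_bands, q=0.5):
        self.q = q
        # One sorted list of past power values per frequency band.
        self.sorted_bands = [[] for _ in range(num_bands)]

    def update(self, power_frame):
        """Add frame t and return the estimate N(w, t) for every band."""
        estimate = []
        for values, x in zip(self.sorted_bands, power_frame):
            bisect.insort(values, x)  # keep the band history sorted
            # Assumed convention: take index floor(q * (count - 1)).
            idx = int(self.q * (len(values) - 1))
            estimate.append(values[idx])
        return estimate


# Toy single-band signal: the estimate is refined as frames arrive.
tracker = RunningQuantileNoise(num_bands=1, q=0.5)
history = [tracker.update([x])[0] for x in [3.0, 1.0, 9.0, 2.0]]
```

Keeping the full sorted history per band makes the memory use grow with the length of the utterance, which is the price of using all frames up to t.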
Q6. What is the relative reduction for the optimal choice of q?
The word error rate without any noise reduction method is 11.7%, i.e. the relative reduction is 26% for the optimal choice q = 0.55.
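As a worked check of what these two figures imply, the short sketch below converts the baseline word error rate and the relative reduction into an absolute word error rate. The function name is hypothetical, and the resulting value is derived from the rounded numbers quoted above rather than reported in this text.

```python
def wer_after_reduction(baseline_wer, relative_reduction):
    """Absolute WER implied by a baseline WER and a relative reduction."""
    return baseline_wer * (1.0 - relative_reduction)


# 11.7% baseline with a 26% relative reduction.
implied_wer = wer_after_reduction(11.7, 0.26)
print(f"implied WER: {implied_wer:.1f}%")
```

Because both inputs are rounded, the implied value of roughly 8.7% should be read as approximate.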
Q7. How is the recursion initialized?
The recursion is initialized by N(ω, 0) = X(ω, 0), which reflects the assumption that the first frame of an utterance does not contain speech.
Q8. How much car noise was inserted?
In order to verify this theoretical consideration by an experiment, the authors inserted 0.5 seconds of car noise from a BMW 540 at 50 km/h before the beginning of each sound file of the test set.
Q9. What is the cost of the quantile based noise estimation method?
The quantile based noise estimation method gives significantly better results but is more expensive in terms of computing time and memory.
Q10. How can the authors estimate the noise power spectrum from the speech signal?
This observation can be used to estimate a noise power spectrum N(ω) from the observed speech signal X(ω, t) by taking the q-th quantile over time in every frequency band.
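The idea of taking the q-th quantile over time in every frequency band can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function name, the list-of-frames input layout, the quantile indexing convention, and the toy data are all hypothetical.

```python
def quantile_noise_estimate(power_frames, q=0.5):
    """Return one noise power value N(w) per frequency band.

    power_frames: list of frames, each a list of power values X(w, t).
    q: quantile in [0, 1]; q = 0.5 corresponds to the median.
    """
    if not power_frames:
        raise ValueError("need at least one frame")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")
    num_frames = len(power_frames)
    num_bands = len(power_frames[0])
    # Assumed convention: index floor(q * (T - 1)) of the sorted values.
    idx = int(q * (num_frames - 1))
    noise = []
    for band in range(num_bands):
        # Sort this band's power values over time and pick the quantile.
        values = sorted(frame[band] for frame in power_frames)
        noise.append(values[idx])
    return noise


# Toy input: 10 frames, 2 bands; most frames sit near the noise floor.
frames = [[1.0, 2.0]] * 8 + [[50.0, 2.0], [80.0, 40.0]]
estimate = quantile_noise_estimate(frames, q=0.55)
```

In the toy data the few high-energy frames stand in for speech, so a quantile near the median ignores them and returns the low level that dominates each band.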