A vector Taylor series approach for environment-independent speech recognition
Citations
Application of Hidden Markov Models in Speech Recognition
Ideal ratio mask estimation using deep neural networks for robust speech recognition
Power-normalized cepstral coefficients (PNCC) for robust speech recognition
Exemplar-Based Sparse Representations for Noise Robust Automatic Speech Recognition
Histogram equalization of speech representation for robust speech recognition
References
Acoustical and environmental robustness in automatic speech recognition
Probabilistic optimum filtering for robust speech recognition
A fast and flexible implementation of parallel model combination
Multivariate-Gaussian-based cepstral normalization for robust speech recognition
Environmental adaptation for robust speech recognition
Related Papers (5)
The aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions
Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
Frequently Asked Questions (8)
Q2. What is the key of the new VTS algorithms?
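The key of the new VTS algorithms is to approximate the generic vector function $g(x)$ with a vector Taylor series: $g(x) \approx g(x_0) + g'(x_0)(x - x_0) + \tfrac{1}{2} g''(x_0)(x - x_0)(x - x_0) + \cdots$, where $g(x_0)$ is the vector function evaluated at a particular vector point $x_0$.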
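As a rough illustration of the idea, the sketch below compares a first-order Taylor expansion with the exact function near the expansion point. The function used here, log(1 + exp(n - x)) applied elementwise, is an assumption chosen for illustration (a common log-spectral form for additive noise with no channel term); it is not quoted from the paper, and the names `g`, `g_grad`, and `taylor_first_order` are hypothetical.

```python
import numpy as np

def g(x, n):
    """Assumed elementwise mismatch function: log(1 + exp(n - x))."""
    return np.logaddexp(0.0, n - x)

def g_grad(x, n):
    """Elementwise derivative of g with respect to x: -1 / (1 + exp(x - n))."""
    return -1.0 / (1.0 + np.exp(x - n))

def taylor_first_order(x, x0, n):
    """First-order expansion of g around x0.

    Because the assumed g acts elementwise, its Jacobian is diagonal,
    so an elementwise product stands in for the matrix product.
    """
    return g(x0, n) + g_grad(x0, n) * (x - x0)

x0 = np.array([2.0, 0.0, -1.0])   # expansion point (illustrative values)
n = np.zeros(3)                   # assumed log-spectral noise vector
x = x0 + 0.1                      # a point close to x0

exact = g(x, n)
approx = taylor_first_order(x, x0, n)
max_error = float(np.max(np.abs(exact - approx)))
```

The approximation error grows with the distance from the expansion point, which is why the choice of that point matters in practice.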
Q3. What is the description of the VTS algorithm?
The authors speculate that more generic polynomial approximations, optimized to minimize the error in the parameters of the distribution of z, may give even better performance.
Q4. How was the effectiveness of the VTS algorithm evaluated?
The effectiveness of the VTS algorithms was evaluated by artificially contaminating utterances from the CMU census database [3] and from the ARPA Wall Street Journal task with white noise at different SNRs.
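A minimal sketch of this kind of artificial contamination is shown below, assuming the SNR is defined as the ratio of average signal power to average noise power over the whole utterance. The function name `add_white_noise` and the sine-wave stand-in for an utterance are hypothetical; the paper's exact procedure is not described here.

```python
import numpy as np

def add_white_noise(signal, snr_db, rng):
    """Add white Gaussian noise scaled to a target SNR in dB."""
    signal = np.asarray(signal, dtype=float)
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, 1.0, size=signal.shape)
    # Rescale so the realized noise power matches the target exactly.
    noise *= np.sqrt(noise_power / np.mean(noise ** 2))
    return signal + noise, noise

rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
clean = np.sin(2.0 * np.pi * 440.0 * t)   # stand-in for a speech utterance

noisy, noise = add_white_noise(clean, snr_db=10.0, rng=rng)
measured_snr_db = 10.0 * np.log10(np.mean(clean ** 2) / np.mean(noise ** 2))
```

Repeating the call with different `snr_db` values yields the same utterance at several noise levels.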
Q5. What is the purpose of the VTS algorithm?
Once the pdf of the noisy speech is computed, minimum mean square estimation (MMSE) can be used to predict the unobserved clean speech sequence.
Q6. What is the function of the linear filter?
As in previous papers, the authors assume a model of the environment in which speech is corrupted by unknown additive stationary noise and linearly filtered by an unknown channel: $Z(\omega) = X(\omega)\,|H(\omega)|^2 + N(\omega)$, where $Z(\omega)$ represents the power spectrum of the degraded speech, $X(\omega)$ is the power spectrum of the clean speech, $H(\omega)$ is the transfer function of the linear filter, and $N(\omega)$ is the power spectrum of the additive noise.
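To see how such a model leads to a nonlinear vector function, it can help to move to the log-spectral domain. The short derivation below assumes the model takes the standard form $Z = X\,|H|^2 + N$ and introduces the log-domain symbols $z$, $x$, $q$, and $n$ as illustrative notation.

```latex
% Assumed model: Z(\omega) = X(\omega)\,|H(\omega)|^2 + N(\omega)
% Assumed log-domain notation:
%   z = \log Z,\quad x = \log X,\quad q = \log |H|^2,\quad n = \log N
\begin{align*}
z &= \log\!\left(e^{x + q} + e^{n}\right) \\
  &= x + q + \log\!\left(1 + e^{\,n - x - q}\right) \\
  &= x + g(x, n, q),
  \qquad g(x, n, q) = q + \log\!\left(1 + e^{\,n - x - q}\right).
\end{align*}
```

Under these assumptions the degraded log spectrum equals the clean log spectrum plus a term that depends nonlinearly on the clean speech, the noise, and the channel.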
Q7. What is the MMSE estimate of the speech?
Once the parameters of the distribution of z are computed, an MMSE estimate is used to calculate the clean speech given the observed noisy speech.
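One way such an estimate could be formed is sketched below, under several simplifying assumptions that are not taken from the text above: clean speech is modeled by a Gaussian mixture with diagonal covariances, the noise log spectrum `n` is known, there is no channel term, and each mixture component of the noisy speech keeps the clean variance with its mean shifted by `g` evaluated at the clean mean (a zeroth-order approximation). The names `g` and `mmse_clean_estimate` are hypothetical.

```python
import numpy as np

def g(x, n):
    """Assumed elementwise mismatch function: log(1 + exp(n - x))."""
    return np.logaddexp(0.0, n - x)

def mmse_clean_estimate(z, weights, means, variances, n):
    """Return (x_hat, posteriors) for one noisy frame z.

    z: (D,), weights: (K,), means and variances: (K, D), n: (D,).
    """
    shifts = g(means, n)                      # per-component mean shift, (K, D)
    noisy_means = means + shifts
    # Diagonal-covariance Gaussian log-likelihood of z under each component.
    log_lik = -0.5 * np.sum(
        np.log(2.0 * np.pi * variances) + (z - noisy_means) ** 2 / variances,
        axis=1,
    )
    log_post = np.log(weights) + log_lik
    log_post -= np.logaddexp.reduce(log_post)  # normalize in the log domain
    post = np.exp(log_post)
    # Subtract the posterior-weighted shift from the observation.
    return z - post @ shifts, post

weights = np.array([0.5, 0.5])
means = np.array([[5.0, 5.0], [0.0, 0.0]])
variances = np.ones((2, 2))
n = np.zeros(2)
z = means[0] + g(means[0], n)                 # frame generated by component 0

x_hat, post = mmse_clean_estimate(z, weights, means, variances, n)
```

With very low noise the shift vanishes and the estimate reduces to the observation itself, which is a quick sanity check on the sketch.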
Q8. What are the main advantages of the CDCN algorithm?
Still other algorithms (e.g. [6]) use knowledge of noise statistics and extensive computation to adapt the HMMs of clean speech to a new environment.