Proceedings ArticleDOI

Non-negative subspace projection during conventional MFCC feature extraction for noise robust speech recognition

Abstract
An additional feature processing algorithm using Non-negative Matrix Factorization (NMF) is proposed to be included during the conventional extraction of Mel-frequency cepstral coefficients (MFCC) for achieving noise robustness in HMM-based speech recognition. The proposed approach reconstructs log-Mel filterbank outputs of speech data from a set of building blocks that form the bases of a speech subspace. The bases are learned using the standard NMF of training data. A variation of learning the bases is proposed, which uses histogram-equalized activation coefficients during training, to achieve noise robustness. The proposed methods give up to 5.96% absolute improvement in recognition accuracy on the Aurora-2 task over a baseline with standard MFCCs, and up to 13.69% improvement when combined with other feature normalization techniques such as Histogram Equalization (HEQ) and Heteroscedastic Linear Discriminant Analysis (HLDA).
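
To make the described pipeline concrete, the following is a minimal sketch, assuming a scikit-learn NMF model applied to non-negative Mel filterbank energies before the log/DCT stages; the function names, the number of bases, and other parameters are illustrative and not taken from the paper.

```python
import numpy as np
from scipy.fftpack import dct
from sklearn.decomposition import NMF

# Assumed input: non-negative Mel filterbank energy matrices of
# shape (num_filters, num_frames). This is a sketch, not the paper's exact recipe.

def learn_speech_bases(mel_train, num_bases=40):
    """Learn non-negative bases spanning a speech subspace from training data."""
    model = NMF(n_components=num_bases, init="nndsvda", max_iter=500)
    model.fit(mel_train.T)          # scikit-learn factorizes X ~ activations @ bases
    return model                    # model.components_ holds the learned bases

def project_onto_speech_subspace(model, mel_utterance):
    """Reconstruct (possibly noisy) filterbank outputs from the learned bases."""
    activations = model.transform(mel_utterance.T)     # non-negative activations
    reconstructed = activations @ model.components_    # back to the filterbank domain
    return reconstructed.T

def mfcc_from_filterbank(mel_spec, num_ceps=13):
    """Resume conventional MFCC extraction: log compression followed by a DCT."""
    log_mel = np.log(np.maximum(mel_spec, 1e-10))
    return dct(log_mel, type=2, axis=0, norm="ortho")[:num_ceps]

# Usage: the bases are learned once on training data, and every utterance is
# projected onto the speech subspace before the log and DCT stages, e.g.
#   model = learn_speech_bases(mel_train)
#   features = mfcc_from_filterbank(project_onto_speech_subspace(model, mel_utt))
```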


Citations
Posted Content

Feature Normalisation for Robust Speech Recognition

TL;DR: An MLLR-based, computationally efficient run-time noise adaptation method in the SPLICE framework is proposed, along with a modification to the training process of the SPLICE algorithm for noise-robust speech recognition.
Book ChapterDOI

Stressed Speech Recognition Using Similarity Measurement on Inner Product Space

TL;DR: From experiments, it is observed that the stress information of stressed speech is not present in the complement cosine (1 − cosine) measure of the stressed speech on different inner product spaces.
References
Journal ArticleDOI

Learning the parts of objects by non-negative matrix factorization

TL;DR: An algorithm for non-negative matrix factorization is demonstrated that is able to learn parts of faces and semantic features of text, in contrast to other methods that learn holistic, not parts-based, representations.

Learning parts of objects by non-negative matrix factorization

D. D. Lee
TL;DR: In this article, non-negative matrix factorization is used to learn parts of faces and semantic features of text, in contrast to principal components analysis and vector quantization, which learn holistic, not parts-based, representations.
Proceedings Article

Algorithms for Non-negative Matrix Factorization

TL;DR: Two different multiplicative algorithms for non-negative matrix factorization are analyzed: one can be shown to minimize the conventional least-squares error, while the other minimizes the generalized Kullback-Leibler divergence (a sketch of the least-squares update appears after this reference list).
Journal ArticleDOI

Exemplar-Based Sparse Representations for Noise Robust Automatic Speech Recognition

TL;DR: The results show that the hybrid system performed substantially better than source separation or missing data mask estimation at lower signal-to-noise ratios (SNRs), achieving up to 57.1% accuracy at SNR = -5 dB.
Journal ArticleDOI

Heteroscedastic discriminant analysis and reduced rank HMMs for improved speech recognition

TL;DR: Theoretical results are applied to the problem of speech recognition, and word-error reductions are observed in systems that employed both diagonal- and full-covariance heteroscedastic Gaussian models tested on the TI-DIGITS database.
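
The multiplicative updates analyzed in the "Algorithms for Non-negative Matrix Factorization" reference above can be sketched as follows. This is a minimal illustration of the least-squares variant only, with random initialization and a fixed iteration count chosen for the example; it is not the reference's exact implementation.

```python
import numpy as np

def nmf_multiplicative(V, rank, num_iters=200, eps=1e-10, seed=0):
    """Lee-Seung multiplicative updates minimizing the least-squares
    (Frobenius) error ||V - W @ H||^2 for a non-negative matrix V."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps   # non-negative bases
    H = rng.random((rank, m)) + eps   # non-negative activations
    for _ in range(num_iters):
        # Element-wise multiplicative updates; eps avoids division by zero.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H
```

Because each factor is rescaled by a ratio of non-negative terms, the updates preserve non-negativity of W and H at every iteration; the KL-divergence variant analyzed in the same reference follows the same multiplicative pattern with different numerator and denominator terms.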