Proceedings ArticleDOI
Linear feature space projections for speaker adaptation
George Saon, Geoffrey Zweig, Mukund Padmanabhan, et al.
Vol. 1, pp. 325–328
TLDR
The well-known technique of constrained maximum likelihood linear regression is extended to compute a projection (instead of a full rank transformation) on the feature vectors of the adaptation data, and the resulting ML transformation is shown to be equivalent to performing a speaker-dependent heteroscedastic discriminant (or HDA) projection.
Abstract
We extend the well-known technique of constrained maximum likelihood linear regression (MLLR) to compute a projection (instead of a full rank transformation) on the feature vectors of the adaptation data. We model the projected features with phone-dependent Gaussian distributions and also model the complement of the projected space with a single class-independent, speaker-specific Gaussian distribution. Subsequently, we compute the projection and its complement using maximum likelihood techniques. The resulting ML transformation is shown to be equivalent to performing a speaker-dependent heteroscedastic discriminant (or HDA) projection. Our method is in contrast to traditional approaches, which use a single speaker-independent projection and execute speaker adaptation in the resulting subspace. Experimental results on Switchboard show a 3% relative improvement in the word error rate over constrained MLLR in the projected subspace only.
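The HDA projection described in the abstract reduces to classical LDA when every class shares a single covariance matrix. As a hedged illustration of that homoscedastic special case only (a minimal sketch, not the paper's speaker-dependent ML estimator; the function name and setup are assumptions for this example):

```python
import numpy as np

def lda_projection(X, y, p):
    """Homoscedastic special case (LDA) of an HDA-style projection:
    find the p directions that best separate the classes in X."""
    classes = np.unique(y)
    mu = X.mean(axis=0)
    n = X.shape[1]
    Sw = np.zeros((n, n))  # within-class scatter
    Sb = np.zeros((n, n))  # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        d = (mc - mu)[:, None]
        Sb += len(Xc) * (d @ d.T)
    # Generalized eigenproblem Sb v = lambda * Sw v;
    # the top-p eigenvectors form the p x n projection matrix.
    evals, evecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(-evals.real)
    return evecs.real[:, order[:p]].T

# Usage: two well-separated 2-D classes projected onto one dimension.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([0.0, 0.0], 0.1, (50, 2)),
               rng.normal([5.0, 0.0], 0.1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
P = lda_projection(X, y, 1)  # 1 x 2 projection matrix
```

Full HDA drops the shared-covariance assumption and requires iterative optimization; the paper additionally estimates the projection per speaker under ML.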
Citations
Journal ArticleDOI
An overview of noise-robust automatic speech recognition
TL;DR: A thorough overview of modern noise-robust techniques for ASR developed over the past 30 years is provided and methods that are proven to be successful and that are likely to sustain or expand their future applicability are emphasized.
Proceedings ArticleDOI
High-performance HMM adaptation with joint compensation of additive and convolutive distortions via Vector Taylor Series
TL;DR: A model-domain environment-robust adaptation algorithm, which demonstrates high performance in the standard Aurora 2 speech recognition task and adaptation of the dynamic portion of the HMM mean and variance parameters is critical to the success of the algorithm.
Proceedings ArticleDOI
Joint uncertainty decoding for noise robust speech recognition
Hank Liao, Mark J. F. Gales, et al.
TL;DR: This paper describes a new approach within this framework, joint uncertainty decoding, which is compared with the uncertainty decoding version of SPLICE, standard SPLICE, and a new form of front-end CMLLR, and is evaluated on a medium vocabulary speech recognition task with artificially added noise.
Journal ArticleDOI
A unified framework of HMM adaptation with joint compensation of additive and convolutive distortions
TL;DR: A model-domain environment robust adaptation algorithm, which demonstrates high performance in the standard Aurora 2 speech recognition task without discriminative training of the HMM system, using the clean-trained complex HMM backend as the baseline system for the unsupervised model adaptation.
Book ChapterDOI
Automatic Speech Recognition
Gerasimos Potamianos, Lori Lamel, Matthias Wölfel, Jing Huang, Etienne Marcheret, Claude Barras, Xuan Zhu, John McDonough, Javier Hernando, Dusan Macho, Climent Nadeu, et al.
TL;DR: In this paper, an ASR system for close-talking microphones is presented, where a best-case acoustic channel scenario is used to compare against a similar scenario in a CHIL environment.
References
Journal ArticleDOI
Maximum likelihood linear transformations for HMM-based speech recognition
TL;DR: The paper compares the two possible forms of model-based transforms: unconstrained, where any combination of mean and variance transform may be used, and constrained, which requires the variance transform to have the same form as the mean transform.
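The constrained form summarized above is what makes model-space MLLR usable as a feature-space transform. As a brief reminder of the standard equivalence from the CMLLR literature (notation assumed here, not taken from this summary): tying the variance transform to the mean transform,

```latex
\hat{\mu} = A'\mu + b', \qquad \hat{\Sigma} = A'\,\Sigma\,A'^{\top}
\;\Longleftrightarrow\;
\hat{x} = A x + b, \qquad A = A'^{-1}, \quad b = -A'^{-1} b'
```

so a single affine transform of the observations reproduces the constrained model-space adaptation.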
Journal ArticleDOI
Semi-tied covariance matrices for hidden Markov models
TL;DR: A new form of covariance matrix is introduced which allows a few "full" covariance matrices to be shared over many distributions, whilst each distribution maintains its own "diagonal" covariance matrix.
Proceedings ArticleDOI
A compact model for speaker-adaptive training
TL;DR: A novel approach to estimating the parameters of continuous density HMMs for speaker-independent (SI) continuous speech recognition that jointly annihilates the inter-speaker variation and estimates the HMM parameters of the SI acoustic models.
Journal ArticleDOI
Heteroscedastic discriminant analysis and reduced rank HMMs for improved speech recognition
N. Kumar, Andreas G. Andreou, et al.
TL;DR: Theoretical results are applied to the problem of speech recognition, and word-error reduction is observed in systems that employed both diagonal and full covariance heteroscedastic Gaussian models, tested on the TI-DIGITS database.
Proceedings ArticleDOI
Maximum likelihood discriminant feature spaces
TL;DR: A new approach to HDA is presented by defining an objective function that maximizes the class discrimination in the projected subspace while ignoring the rejected dimensions; it is shown that, under diagonal covariance Gaussian modeling constraints, applying a diagonalizing linear transformation to the HDA space increases classification accuracy even though HDA alone degrades recognition performance.
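One common way to write an HDA-style objective of the kind this summary sketches (the notation is assumed for illustration, not quoted from the paper): with $\theta$ the $p \times n$ projection, $B$ the between-class scatter, $W_j$ the covariance of class $j$, and $N_j$ its sample count,

```latex
H(\theta) \;=\; \sum_{j} N_j \,\log
\frac{\lvert \theta B \theta^{\top} \rvert}{\lvert \theta W_j \theta^{\top} \rvert}
```

which reduces to the familiar LDA trace-ratio criterion when all $W_j$ are equal, and is maximized only over the retained $p$ dimensions.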