Proceedings ArticleDOI
Linear feature space projections for speaker adaptation
George Saon, Geoffrey Zweig, Mukund Padmanabhan, et al.
Vol. 1, pp. 325–328
TLDR
The well-known technique of constrained maximum likelihood linear regression is extended to compute a projection (instead of a full rank transformation) on the feature vectors of the adaptation data, and the resulting ML transformation is shown to be equivalent to performing a speaker-dependent heteroscedastic discriminant (or HDA) projection.
Abstract
We extend the well-known technique of constrained maximum likelihood linear regression (MLLR) to compute a projection (instead of a full rank transformation) on the feature vectors of the adaptation data. We model the projected features with phone-dependent Gaussian distributions and also model the complement of the projected space with a single class-independent, speaker-specific Gaussian distribution. Subsequently, we compute the projection and its complement using maximum likelihood techniques. The resulting ML transformation is shown to be equivalent to performing a speaker-dependent heteroscedastic discriminant (or HDA) projection. Our method is in contrast to traditional approaches, which use a single speaker-independent projection and execute speaker adaptation in the resulting subspace. Experimental results on Switchboard show a 3% relative improvement in the word error rate over constrained MLLR in the projected subspace only.
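The HDA projection described in the abstract reduces to classical LDA when every class shares a single covariance matrix. As a hedged illustration of that homoscedastic special case only (a minimal sketch, not the paper's speaker-dependent ML estimator; the function name and setup are assumptions for this example):

```python
import numpy as np

def lda_projection(X, y, p):
    """Homoscedastic special case (LDA) of an HDA-style projection:
    find the p directions that best separate the classes in X."""
    classes = np.unique(y)
    mu = X.mean(axis=0)
    n = X.shape[1]
    Sw = np.zeros((n, n))  # within-class scatter
    Sb = np.zeros((n, n))  # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        d = (mc - mu)[:, None]
        Sb += len(Xc) * (d @ d.T)
    # Generalized eigenproblem Sb v = lambda * Sw v;
    # the top-p eigenvectors form the p x n projection matrix.
    evals, evecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(-evals.real)
    return evecs.real[:, order[:p]].T

# Usage: two well-separated 2-D classes projected onto one dimension.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([0.0, 0.0], 0.1, (50, 2)),
               rng.normal([5.0, 0.0], 0.1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
P = lda_projection(X, y, 1)  # 1 x 2 projection matrix
```

Full HDA drops the shared-covariance assumption and requires iterative optimization; the paper additionally estimates the projection per speaker under ML.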
Citations
Journal ArticleDOI
An overview of noise-robust automatic speech recognition
TL;DR: A thorough overview of modern noise-robust techniques for ASR developed over the past 30 years is provided and methods that are proven to be successful and that are likely to sustain or expand their future applicability are emphasized.
Proceedings ArticleDOI
High-performance HMM adaptation with joint compensation of additive and convolutive distortions via Vector Taylor Series
TL;DR: A model-domain environment-robust adaptation algorithm, which demonstrates high performance in the standard Aurora 2 speech recognition task and adaptation of the dynamic portion of the HMM mean and variance parameters is critical to the success of the algorithm.
Proceedings ArticleDOI
Joint uncertainty decoding for noise robust speech recognition
Hank Liao, Mark J. F. Gales, et al.
TL;DR: This paper describes a new approach within this framework, joint uncertainty decoding, which is compared with the uncertainty decoding version of SPLICE, standard SPLICE, and a new form of front-end CMLLR, and is evaluated on a medium vocabulary speech recognition task with artificially added noise.
Journal ArticleDOI
A unified framework of HMM adaptation with joint compensation of additive and convolutive distortions
TL;DR: A model-domain environment robust adaptation algorithm, which demonstrates high performance in the standard Aurora 2 speech recognition task without discriminative training of the HMM system, using the clean-trained complex HMM backend as the baseline system for the unsupervised model adaptation.
Book ChapterDOI
Automatic Speech Recognition
Gerasimos Potamianos, Lori Lamel, Matthias Wölfel, Jing Huang, Etienne Marcheret, Claude Barras, Xuan Zhu, John McDonough, Javier Hernando, Dusan Macho, Climent Nadeu, et al.
TL;DR: In this paper, an ASR system for close-talking microphones is presented, where a best-case acoustic channel scenario is used to compare against a similar scenario in a CHIL environment.
References
Journal ArticleDOI
Maximum likelihood linear transformations for HMM-based speech recognition
TL;DR: The paper compares the two possible forms of model-based transforms: unconstrained, where any combination of mean and variance transform may be used, and constrained, which requires the variance transform to have the same form as the mean transform.
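The constrained form summarized above is what makes model-space MLLR usable as a feature-space transform. As a brief reminder of the standard equivalence from the CMLLR literature (notation assumed here, not taken from this summary): tying the variance transform to the mean transform,

```latex
\hat{\mu} = A'\mu + b', \qquad \hat{\Sigma} = A'\,\Sigma\,A'^{\top}
\;\Longleftrightarrow\;
\hat{x} = A x + b, \qquad A = A'^{-1}, \quad b = -A'^{-1} b'
```

so a single affine transform of the observations reproduces the constrained model-space adaptation.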
Journal ArticleDOI
Semi-tied covariance matrices for hidden Markov models
TL;DR: A new form of covariance matrix is introduced which allows a few "full" covariance matrices to be shared over many distributions, whilst each distribution maintains its own "diagonal" covariance matrix.
Proceedings ArticleDOI
A compact model for speaker-adaptive training
TL;DR: A novel approach to estimating the parameters of continuous density HMMs for speaker-independent (SI) continuous speech recognition that jointly annihilates the inter-speaker variation and estimates the HMM parameters of the SI acoustic models.
Journal ArticleDOI
Heteroscedastic discriminant analysis and reduced rank HMMs for improved speech recognition
N. Kumar, Andreas G. Andreou, et al.
TL;DR: Theoretical results are applied to the problem of speech recognition, and word-error reduction is observed in systems that employed both diagonal and full covariance heteroscedastic Gaussian models, tested on the TI-DIGITS database.
Proceedings ArticleDOI
Maximum likelihood discriminant feature spaces
TL;DR: A new approach to HDA is presented by defining an objective function that maximizes the class discrimination in the projected subspace while ignoring the rejected dimensions; it is shown that, under diagonal covariance Gaussian modeling constraints, applying a diagonalizing linear transformation to the HDA space increases classification accuracy even though HDA alone degrades recognition performance.
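One common way to write an HDA-style objective of the kind this summary sketches (the notation is assumed for illustration, not quoted from the paper): with $\theta$ the $p \times n$ projection, $B$ the between-class scatter, $W_j$ the covariance of class $j$, and $N_j$ its sample count,

```latex
H(\theta) \;=\; \sum_{j} N_j \,\log
\frac{\lvert \theta B \theta^{\top} \rvert}{\lvert \theta W_j \theta^{\top} \rvert}
```

which reduces to the familiar LDA trace-ratio criterion when all $W_j$ are equal, and is maximized only over the retained $p$ dimensions.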