Proceedings ArticleDOI

Linear feature space projections for speaker adaptation

George Saon, +2 more
Vol. 1, pp. 325–328
TLDR
The well-known technique of constrained maximum likelihood linear regression is extended to compute a projection (instead of a full rank transformation) on the feature vectors of the adaptation data, and the resulting ML transformation is shown to be equivalent to performing a speaker-dependent heteroscedastic discriminant (or HDA) projection.
Abstract
We extend the well-known technique of constrained maximum likelihood linear regression (MLLR) to compute a projection (instead of a full rank transformation) on the feature vectors of the adaptation data. We model the projected features with phone-dependent Gaussian distributions and also model the complement of the projected space with a single class-independent, speaker-specific Gaussian distribution. Subsequently, we compute the projection and its complement using maximum likelihood techniques. The resulting ML transformation is shown to be equivalent to performing a speaker-dependent heteroscedastic discriminant (or HDA) projection. Our method is in contrast to traditional approaches which use a single speaker-independent projection, and execute speaker adaptation in the resulting subspace. Experimental results on Switchboard show a 3% relative improvement in the word error rate over constrained MLLR in the projected subspace only.
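As a rough illustration of the construction described in the abstract, the sketch below scores a candidate full-rank matrix M for a single speaker: its top p rows act as the projection, whose output is modeled with class-dependent Gaussians, while the remaining rows model the complement with a single class-independent, speaker-specific Gaussian. This is a minimal numeric sketch under our own assumptions about shapes and estimators, not the paper's implementation; maximizing such a score over M is what yields the speaker-dependent, HDA-like projection the abstract refers to.

```python
import numpy as np

def speaker_projection_score(M, feats, labels, p):
    """HDA-style ML score for one speaker's adaptation data (illustrative only).

    M      : (n, n) full-rank transform; rows 0..p-1 form the projection theta.
    feats  : (N, n) adaptation feature vectors for this speaker.
    labels : (N,) phone-class label of each frame.
    p      : dimension of the retained (projected) subspace.
    """
    N, n = feats.shape
    theta, phi = M[:p], M[p:]

    # Class-dependent Gaussians in the projected subspace.
    class_term = 0.0
    for c in np.unique(labels):
        X_c = feats[labels == c]
        Sigma_c = np.cov(X_c, rowvar=False, bias=True)
        class_term += len(X_c) * np.linalg.slogdet(theta @ Sigma_c @ theta.T)[1]

    # Single class-independent, speaker-specific Gaussian on the complement.
    Sigma_all = np.cov(feats, rowvar=False, bias=True)
    comp_term = N * np.linalg.slogdet(phi @ Sigma_all @ phi.T)[1]

    # Jacobian term for the full-rank change of variables; higher score is better.
    return N * np.linalg.slogdet(M)[1] - 0.5 * (class_term + comp_term)
```

In practice one would maximize a score of this form over M and keep only the top p rows as the speaker's projection; how the paper actually carries out that optimization is not reproduced here.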

Citations
Journal ArticleDOI

An overview of noise-robust automatic speech recognition

TL;DR: A thorough overview of noise-robust techniques for ASR developed over the past 30 years is provided, with emphasis on methods that have proven successful and are likely to sustain or expand their applicability.
Proceedings ArticleDOI

High-performance hmm adaptation with joint compensation of additive and convolutive distortions via Vector Taylor Series

TL;DR: A model-domain, environment-robust adaptation algorithm is presented that achieves high performance on the standard Aurora 2 speech recognition task; adapting the dynamic portion of the HMM mean and variance parameters is shown to be critical to the algorithm's success.
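To make "joint compensation of additive and convolutive distortions" concrete, the sketch below writes the usual mismatch function that VTS-style methods linearize, here in the log-filterbank domain for simplicity (the cited work adapts HMM parameters, typically in the cepstral domain); the variable names are placeholders, not the paper's notation.

```python
import numpy as np

def noisy_log_spectrum(x, h, n):
    """y = x + h + log(1 + exp(n - x - h)): clean speech x, channel h, noise n."""
    return x + h + np.log1p(np.exp(n - x - h))
```

A first-order Taylor expansion of this function around current estimates of the clean-speech, channel, and noise parameters is what allows the HMM means and variances to be updated.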
Proceedings ArticleDOI

Joint uncertainty decoding for noise robust speech recognition

TL;DR: This paper describes a new approach within this framework, joint uncertainty decoding, which is compared with the uncertainty decoding version of SPLICE, standard SPLICE, and a new form of front-end CMLLR, and is evaluated on a medium-vocabulary speech recognition task with artificially added noise.
Journal ArticleDOI

A unified framework of HMM adaptation with joint compensation of additive and convolutive distortions

TL;DR: A model-domain, environment-robust adaptation algorithm is presented that achieves high performance on the standard Aurora 2 speech recognition task without discriminative training of the HMM system, using the clean-trained complex HMM backend as the baseline system for unsupervised model adaptation.
Book ChapterDOI

Automatic Speech Recognition

TL;DR: In this paper, an ASR system for close-talking microphones is presented, where a best-case acoustic channel scenario is used to compare against a similar scenario in a CHIL environment.
References
Journal ArticleDOI

Maximum likelihood linear transformations for HMM-based speech recognition

TL;DR: The paper compares the two possible forms of model-based transforms: unconstrained, where any combination of mean and variance transform may be used, and constrained, which requires the variance transform to have the same form as the mean transform.
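As a small illustration of the "constrained" case above, in which the mean and variance share the same transform, the sketch below scores a frame by applying the transform purely in feature space and adding the Jacobian term; A, b, and the Gaussian parameters are placeholders rather than anything taken from the paper.

```python
import numpy as np

def cmllr_log_likelihood(x, A, b, mean, var_diag):
    """Per-frame log-likelihood of a diagonal Gaussian under a feature-space (constrained) transform."""
    x_hat = A @ x + b                      # transformed observation
    diff = x_hat - mean
    ll = -0.5 * np.sum(diff ** 2 / var_diag + np.log(2 * np.pi * var_diag))
    return ll + np.linalg.slogdet(A)[1]    # Jacobian compensation: log|det A|
```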
Journal ArticleDOI

Semi-tied covariance matrices for hidden Markov models

TL;DR: A new form of covariance matrix is introduced which allows a few "full" covariance matrices to be shared over many distributions, while each distribution maintains its own "diagonal" covariance matrix.
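A minimal sketch of that sharing scheme, in our own notation: each Gaussian keeps its own diagonal covariance, and a single shared matrix H turns it into an effective full covariance; equivalently, frames can be scored by transforming them with A = inv(H) and evaluating a diagonal Gaussian plus a log-determinant term. H and the diagonals below are illustrative placeholders.

```python
import numpy as np

def semi_tied_full_cov(H, diag_var):
    """Effective full covariance of one Gaussian: its own diagonal wrapped by the shared H."""
    return H @ np.diag(diag_var) @ H.T

def semi_tied_loglik(x, mean, diag_var, H):
    """Equivalent scoring: transform the centered feature with A = inv(H),
    evaluate a diagonal Gaussian, and add the log-determinant of A."""
    A = np.linalg.inv(H)
    z = A @ (x - mean)
    ll = -0.5 * np.sum(z ** 2 / diag_var + np.log(2 * np.pi * diag_var))
    return ll + np.linalg.slogdet(A)[1]
```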
Proceedings ArticleDOI

A compact model for speaker-adaptive training

TL;DR: A novel approach is presented for estimating the parameters of continuous-density HMMs for speaker-independent (SI) continuous speech recognition, which jointly annihilates inter-speaker variation and estimates the HMM parameters of the SI acoustic models.
Journal ArticleDOI

Heteroscedastic discriminant analysis and reduced rank HMMs for improved speech recognition

TL;DR: The theoretical results are applied to the problem of speech recognition, and word-error reductions are observed in systems that employed both diagonal and full covariance heteroscedastic Gaussian models, tested on the TI-DIGITS database.
Proceedings ArticleDOI

Maximum likelihood discriminant feature spaces

TL;DR: A new approach to HDA is presented by defining an objective function which maximizes the class discrimination in the projected subspace while ignoring the rejected dimensions, and it is shown that, under diagonal covariance Gaussian modeling constraints, applying a diagonalizing linear transformation to the HDA space results in increased classification accuracy even though HDA alone actually degrades the recognition performance.
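The objective sketched below is one common way to write an HDA-style criterion of the kind described above: class discrimination is measured only through the p x n projection theta, so the rejected dimensions do not enter the score. The between-class scatter B, the per-class covariances, and the counts are placeholders, and the paper's exact formulation may differ.

```python
import numpy as np

def hda_objective(theta, B, class_covs, class_counts):
    """Sum_j N_j * (log|theta B theta^T| - log|theta W_j theta^T|), evaluated in the projected subspace."""
    score = 0.0
    proj_B = np.linalg.slogdet(theta @ B @ theta.T)[1]
    for W_j, N_j in zip(class_covs, class_counts):
        proj_W = np.linalg.slogdet(theta @ W_j @ theta.T)[1]
        score += N_j * (proj_B - proj_W)
    return score
```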