Speaker and Session Variability in GMM-Based Speaker Verification

doi:10.1109/TASL.2007.894527

Journal Article

Patrick Kenny, +3 more

- 01 May 2007 -

IEEE Transactions on Audio, Speech, and ...

- Vol. 15, Iss: 4, pp 1448-1460

TLDR

A corpus-based approach to speaker verification in which maximum-likelihood II criteria are used to train a large-scale generative model of speaker and session variability which is called joint factor analysis is presented.

Abstract:

We present a corpus-based approach to speaker verification in which maximum-likelihood II criteria are used to train a large-scale generative model of speaker and session variability which we call joint factor analysis. Enrolling a target speaker consists in calculating the posterior distribution of the hidden variables in the factor analysis model and verification tests are conducted using a new type of likelihood II ratio statistic. Using the NIST 1999 and 2000 speaker recognition evaluation data sets, we show that the effectiveness of this approach depends on the availability of a training corpus which is well matched with the evaluation set used for testing. Experiments on the NIST 1999 evaluation set using a mismatched corpus to train factor analysis models did not result in any improvement over standard methods, but we found that, even with this type of mismatch, feature warping performs extremely well in conjunction with the factor analysis model, and this enabled us to obtain very good results (equal error rates of about 6.2%).
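
The equal error rate quoted in the abstract is the operating point at which the false-acceptance rate and the false-rejection rate coincide. The following is a minimal sketch of how such a figure could be computed from verification scores; the function name `equal_error_rate` and the toy scores are illustrative assumptions, not the authors' evaluation code or data.

```python
def equal_error_rate(target_scores, impostor_scores):
    """Return (eer, threshold) at the point where FAR and FRR are closest.

    Hypothetical helper: a trial is accepted when its score >= threshold.
    """
    # Every observed score is a candidate decision threshold.
    thresholds = sorted(set(target_scores) | set(impostor_scores))
    best = None
    for t in thresholds:
        # False rejection: a genuine (target) trial scores below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance: an impostor trial scores at or above the threshold.
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            # Report the average of the two rates at the closest crossing.
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


if __name__ == "__main__":
    # Toy scores, invented purely for illustration.
    targets = [2.0, 1.5, 0.2, 1.8]
    impostors = [-1.0, 0.5, -0.3, 0.1]
    eer, threshold = equal_error_rate(targets, impostors)
    print(f"EER = {eer:.2%} at threshold {threshold}")
```

With finitely many trials the two error rates rarely cross exactly, so this sketch reports the average of the two rates at the threshold where they are closest.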

Citations

Journal Article

Front-End Factor Analysis for Speaker Verification

Najim Dehak, +4 more

- 01 May 2011 -

IEEE Transactions on Audio, Speech, and ...

TL;DR: Extending the authors' previous work, this paper proposes a new speaker representation for speaker verification: a simple factor analysis defines a new low-dimensional speaker- and channel-dependent space, named the total variability space because it models both speaker and channel variabilities.

Journal Article

An overview of text-independent speaker recognition: From features to supervectors

Tomi Kinnunen, +1 more

- 01 Jan 2010 -

Speech Communication

TL;DR: This paper starts with the fundamentals of automatic speaker recognition, covering feature extraction and speaker modeling, and then elaborates on advanced computational techniques that address robustness and session variability.

Proceedings Article

Deep neural networks for small footprint text-dependent speaker verification

Ehsan Variani, +4 more

TL;DR: Experimental results show the DNN-based speaker verification system achieves good performance compared to a popular i-vector system on a small footprint text-dependent speaker verification task and is more robust to additive noise and outperforms the i-vector system at low False Rejection operating points.

Journal Article

Joint Factor Analysis Versus Eigenchannels in Speaker Recognition

Patrick Kenny, +3 more

- 01 May 2007 -

IEEE Transactions on Audio, Speech, and ...

TL;DR: It is shown how the two approaches to the problem of session variability in Gaussian mixture model (GMM)-based speaker verification, eigenchannels and joint factor analysis, can be implemented using essentially the same software at all stages except for the enrollment of target speakers.

Bayesian Speaker Verification with Heavy-Tailed Priors.

Patrick Kenny

TL;DR: A new approach to speaker verification is described which is based on a generative model of speaker and channel effects but differs from Joint Factor Analysis in several respects, including that each utterance is represented by a low-dimensional feature vector rather than by a high-dimensional set of Baum-Welch statistics.

References

Journal Article

Speaker Verification Using Adapted Gaussian Mixture Models

Douglas A. Reynolds, +2 more

- 01 Jan 2000 -

Digital Signal Processing

TL;DR: The major elements of MIT Lincoln Laboratory's Gaussian mixture model (GMM)-based speaker verification system used successfully in several NIST Speaker Recognition Evaluations (SREs) are described.

Journal Article

Score Normalization for Text-Independent Speaker Verification Systems

Roland Auckenthaler, +2 more

- 01 Jan 2000 -

Digital Signal Processing

TL;DR: The test normalization method is extended to use knowledge of the handset type, and the world, cohort, and zero normalization techniques are explained.

Journal Article

Joint Factor Analysis Versus Eigenchannels in Speaker Recognition

Patrick Kenny, +3 more

- 01 May 2007 -

IEEE Transactions on Audio, Speech, and ...

TL;DR: It is shown how the two approaches to the problem of session variability in Gaussian mixture model (GMM)-based speaker verification, eigenchannels and joint factor analysis, can be implemented using essentially the same software at all stages except for the enrollment of target speakers.

Journal Article

EM Algorithms for ML Factor Analysis.

Donald B. Rubin, +1 more

- 01 Mar 1982 -

Psychometrika

TL;DR: In this paper, the authors present EM algorithms for maximum likelihood factor analysis under both exploratory and confirmatory models. The algorithms are essentially the same in both cases and involve only simple least squares regression operations; the largest matrix inversion required is for a q × q symmetric matrix, where q is the number of factors.

Feature Warping for Robust Speaker Verification

Jason W. Pelecanos, +1 more

TL;DR: In this paper, the authors proposed a target mapping method that warps the distribution of a cepstral feature stream to a standardised distribution over a specified time interval, which is robust to channel mismatch, additive noise and, to some extent, non-linear effects attributed to transducers.
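
The warping described in this summary can be sketched as a sliding-window rank mapping: each frame's value is ranked within a window centred on it, and the rank is mapped to the matching quantile of a standard normal target. The code below is a minimal illustration under stated assumptions, not the authors' implementation: the function name `warp_features`, the default window of 301 frames, the mid-rank quantile formula and the choice of a standard normal target are all assumptions made here.

```python
from statistics import NormalDist


def warp_features(stream, window=301):
    """Warp one cepstral coefficient stream toward a standard normal target.

    Hypothetical sketch: `stream` is a list of floats for a single
    coefficient, `window` is the (assumed) sliding-window length in frames.
    """
    std_normal = NormalDist()  # assumed target: mean 0, standard deviation 1
    half = window // 2
    warped = []
    for i, value in enumerate(stream):
        # Window centred on frame i, truncated at the ends of the stream.
        lo = max(0, i - half)
        hi = min(len(stream), i + half + 1)
        segment = stream[lo:hi]
        n = len(segment)
        # Ascending rank: 1 + number of window frames strictly below this one.
        rank = 1 + sum(v < value for v in segment)
        # Mid-rank quantile stays strictly inside (0, 1), as inv_cdf requires.
        warped.append(std_normal.inv_cdf((rank - 0.5) / n))
    return warped


if __name__ == "__main__":
    clean = [3.0, 1.0, 2.0, 5.0, 4.0]
    # A gain and offset applied to every frame leave the ranks unchanged.
    shifted = [10.0 * v + 5.0 for v in clean]
    print(warp_features(clean, window=5))
    print(warp_features(shifted, window=5))
```

Because only the ranks within the window enter the mapping, any monotonically increasing distortion of the stream, such as a constant offset or gain, leaves the warped output unchanged. This is one way to see why a method of this kind tolerates channel mismatch.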
