Journal Article
Efficient text-independent speaker verification with structural Gaussian mixture models and neural network
Bing Xiang, T. Berger +1 more
TLDR
The experimental results show that computational reduction by a factor of 17 can be achieved with 5% relative reduction in equal error rate (EER) compared with the baseline, and the SGMM-SBM shows some advantages over the recently proposed hash GMM, including higher speed and better verification performance.
Abstract:
We present an integrated system with structural Gaussian mixture models (SGMMs) and a neural network to achieve both computational efficiency and high accuracy in text-independent speaker verification. A structural background model (SBM) is constructed first by hierarchically clustering all Gaussian mixture components in a universal background model (UBM). In this way the acoustic space is partitioned into multiple regions at different levels of resolution. For each target speaker, an SGMM can be generated through multilevel maximum a posteriori (MAP) adaptation from the SBM. During testing, only a small subset of Gaussian mixture components is scored for each feature vector, which reduces the computational cost significantly. Furthermore, the scores obtained in different layers of the tree-structured models are combined via a neural network for the final decision. Different configurations are compared in experiments conducted on the telephony speech data used in the NIST speaker verification evaluation. The experimental results show that computational reduction by a factor of 17 can be achieved with 5% relative reduction in equal error rate (EER) compared with the baseline. The SGMM-SBM also shows some advantages over the recently proposed hash GMM, including higher speed and better verification performance.
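The pruning idea in the abstract, scoring only a small subset of mixture components per feature vector, can be illustrated with a minimal two-level sketch. Everything here is an assumption made for illustration: the toy one-dimensional model, the names `log_gauss`, `full_score`, and `tree_score`, and the choice to follow only the single best coarse node. It does not reproduce the paper's multilevel SBM, its MAP adaptation, or the neural-network score fusion.

```python
import math


def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def logsumexp(values):
    """Numerically stable log(sum(exp(v)))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def full_score(x, leaves):
    """Baseline: evaluate every leaf component (weight, mean, var)."""
    terms = [math.log(w) + log_gauss(x, m, v) for w, m, v in leaves]
    return logsumexp(terms), len(leaves)


def tree_score(x, tree):
    """Evaluate the coarse nodes, then only the children of the best one."""
    best = max(tree, key=lambda node: log_gauss(x, node["mean"], node["var"]))
    terms = [math.log(w) + log_gauss(x, m, v) for w, m, v in best["children"]]
    # Cost = coarse nodes examined + leaf components actually scored.
    return logsumexp(terms), len(tree) + len(best["children"])


# Hypothetical toy model: 4 coarse nodes, each covering 4 leaf Gaussians.
tree = []
for c in (-6.0, -2.0, 2.0, 6.0):
    children = [(1.0 / 16, [c + d], [0.25]) for d in (-0.75, -0.25, 0.25, 0.75)]
    tree.append({"mean": [c], "var": [1.0], "children": children})
leaves = [leaf for node in tree for leaf in node["children"]]

x = [2.1]
full_ll, full_n = full_score(x, leaves)
tree_ll, tree_n = tree_score(x, tree)
print(full_n, tree_n, full_ll - tree_ll)
```

In this toy setting the pruned score drops only components that lie far from the feature vector, so the log-likelihood changes very little while fewer Gaussians are evaluated. With more levels and more components per node, the savings would grow, which is the effect the abstract reports.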
Citations
Journal Article
An overview of text-independent speaker recognition: From features to supervectors
Tomi Kinnunen, Haizhou Li +1 more
TL;DR: This paper starts with the fundamentals of automatic speaker recognition, covering feature extraction and speaker modeling, and then elaborates on advanced computational techniques that address robustness and session variability.
Book Chapter
Statistical Pattern Recognition
TL;DR: This chapter introduces the subject of statistical pattern recognition (SPR) by considering how features are defined and emphasizes that the nearest neighbor algorithm achieves error rates comparable with those of an ideal Bayes’ classifier.
Journal Article
Real-time speaker identification and verification
TL;DR: This paper focuses on optimizing vector quantization (VQ) based speaker identification, reducing the number of test vectors by pre-quantizing the test sequence prior to matching, and the number of speakers by pruning out unlikely speakers during the identification process.
Journal Article
Speaker Identification Using Instantaneous Frequencies
Marco Grimaldi, Fred Cummins +1 more
TL;DR: A novel parametrization of speech based on the AM-FM representation of the speech signal is presented, and the utility of these features is assessed in the context of speaker identification.
Patent
Speaker verification system
TL;DR: A text-independent speaker verification system uses mel-frequency cepstral coefficient analysis in the feature extraction blocks, template modeling with vector quantization in the pattern matching blocks, and an adaptive threshold with an adaptive decision verdict; it is implemented in a stand-alone device using less powerful microprocessors and smaller data storage devices.
References
Journal Article
Maximum likelihood from incomplete data via the EM algorithm
Book Chapter
Learning internal representations by error propagation
TL;DR: This chapter contains sections titled: The Problem, The Generalized Delta Rule, Simulation Results, Some Further Generalizations, Conclusion.
Book
Learning internal representations by error propagation
TL;DR: This work states the problem, presents the generalized delta rule, and reports simulation results.
Book
Introduction to Statistical Pattern Recognition
TL;DR: This completely revised second edition presents an introduction to statistical pattern recognition, which is appropriate as a text for introductory courses in pattern recognition and as a reference book for workers in the field.
Journal Article
Speaker Verification Using Adapted Gaussian Mixture Models
TL;DR: The major elements of MIT Lincoln Laboratory's Gaussian mixture model (GMM)-based speaker verification system used successfully in several NIST Speaker Recognition Evaluations (SREs) are described.
Related Papers (5)
Speaker Verification Using Adapted Gaussian Mixture Models
Robust text-independent speaker identification using Gaussian mixture speaker models
Douglas A. Reynolds, Richard Rose +1 more