Journal Article
Rapid speaker adaptation in eigenvoice space
TL;DR: This paper introduces the eigenvoice approach, a model-based speaker adaptation algorithm that constrains the adapted model to be a linear combination of a small number of basis vectors obtained offline from a set of reference speakers, greatly reducing the number of free parameters to be estimated from adaptation data.
Abstract:
This paper describes a new model-based speaker adaptation algorithm called the eigenvoice approach. The approach constrains the adapted model to be a linear combination of a small number of basis vectors obtained offline from a set of reference speakers, and thus greatly reduces the number of free parameters to be estimated from adaptation data. These "eigenvoice" basis vectors are orthogonal to each other and guaranteed to represent the most important components of variation between the reference speakers. Experimental results for a small-vocabulary task (letter recognition) given in the paper show that the approach yields major improvements in performance for tiny amounts of adaptation data. For instance, we obtained 16% relative improvement in error rate with one letter of supervised adaptation data, and 26% relative improvement with four letters of supervised adaptation data. After a comparison of the eigenvoice approach with other speaker adaptation algorithms, the paper concludes with a discussion of future work.
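The constraint described in the abstract can be illustrated with a minimal sketch. It assumes each speaker model is flattened into a single parameter vector (a "supervector"), that the orthogonal basis comes from principal component analysis of the reference speakers, and that the weights for a new speaker are fitted by ordinary least squares on whichever parameters the adaptation data covers. The abstract does not state how the weights are estimated, so the least-squares step is a stand-in, and the names `train_eigenvoices`, `adapt`, and `mask` are hypothetical.

```python
import numpy as np


def train_eigenvoices(supervectors, k):
    """Offline step: derive k orthonormal basis vectors from reference speakers.

    supervectors: array of shape (n_speakers, dim), one flattened model per row.
    Returns the mean supervector (dim,) and a basis of shape (k, dim).
    """
    mean = supervectors.mean(axis=0)
    centered = supervectors - mean
    # Rows of vt are orthonormal principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:k]


def adapt(mean, basis, observed, mask):
    """Adaptation step: estimate only k weights for a new speaker.

    observed: (dim,) parameter estimates; only entries where mask is True are trusted.
    mask: boolean (dim,) array marking parameters covered by the adaptation data.
    Assumption: plain least squares replaces the paper's actual estimator.
    """
    a = basis[:, mask].T            # (n_observed, k) design matrix
    b = (observed - mean)[mask]     # observed offsets from the mean speaker
    weights, *_ = np.linalg.lstsq(a, b, rcond=None)
    # The adapted model is the mean plus a linear combination of the basis vectors.
    return mean + weights @ basis, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic reference speakers that vary along 3 hidden directions (illustrative only).
    hidden = rng.normal(size=(3, 50))
    references = rng.normal(size=(20, 3)) @ hidden + 5.0
    mean, basis = train_eigenvoices(references, k=3)

    new_speaker = rng.normal(size=3) @ hidden + 5.0
    mask = np.zeros(50, dtype=bool)
    mask[:10] = True                # pretend only 10 of 50 parameters were observed
    adapted, weights = adapt(mean, basis, new_speaker, mask)
    print("weights:", weights)
    print("max abs error:", np.abs(adapted - new_speaker).max())
```

The point of the design is that only `k` numbers are fitted for the new speaker, however large the full model is, which is why very little adaptation data can suffice.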
Citations
Journal Article
Statistical Parametric Speech Synthesis
TL;DR: This paper gives a general overview of techniques in statistical parametric speech synthesis, and contrasts these techniques with the more conventional unit selection technology that has dominated speech synthesis over the last ten years.
Journal Article
Speaker Recognition by Machines and Humans: A tutorial review
John H. L. Hansen, Taufiq Hasan
TL;DR: A comparative study of human versus machine speaker recognition is concluded, with an emphasis on prominent speaker-modeling techniques that have emerged in the last decade for automatic systems.
Journal Article
An overview of noise-robust automatic speech recognition
TL;DR: A thorough overview of modern noise-robust techniques for ASR developed over the past 30 years is provided and methods that are proven to be successful and that are likely to sustain or expand their future applicability are emphasized.
Journal Article
Eigenvoice modeling with sparse training data
TL;DR: This work derives an exact solution to the problem of maximum likelihood estimation of the supervector covariance matrix used in extended MAP (or EMAP) speaker adaptation and shows how it can be regarded as a new method of eigenvoice estimation.
Journal Article
Speech Synthesis Based on Hidden Markov Models
TL;DR: This paper gives a general overview of hidden Markov model (HMM)-based speech synthesis, which has recently been demonstrated to be very effective in synthesizing speech.
References
Book
Principal Component Analysis
TL;DR: In this article, the authors present a graphical representation of data using Principal Component Analysis (PCA) for time series and other non-independent data, as well as a generalization and adaptation of principal component analysis.
Journal Article
Eigenfaces for recognition
Matthew Turk, Alex Pentland
TL;DR: A near-real-time computer system that can locate and track a subject's head, and then recognize the person by comparing characteristics of the face to those of known individuals, and that is easy to implement using a neural network architecture.
Journal Article
Application of the Karhunen-Loeve procedure for the characterization of human faces
Michael Kirby, Lawrence Sirovich
TL;DR: The use of natural symmetries (mirror images) in a well-defined family of patterns (human faces) is discussed within the framework of the Karhunen-Loeve expansion, which results in an extension of the data and imposes even and odd symmetry on the eigenfunctions of the covariance matrix.
Journal Article
Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models
TL;DR: An important feature of the method is that arbitrary adaptation data can be used (no special enrolment sentences are needed), and that adaptation performance improves as more data is used.
Journal Article
Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains
Jean-Luc Gauvain, Chin-Hui Lee
TL;DR: A framework for maximum a posteriori (MAP) estimation of hidden Markov models (HMM) is presented, and Bayesian learning is shown to serve as a unified approach for a wide range of speech recognition applications.