Open Access Proceedings Article

A computationally efficient approach to warp factor estimation in VTLN using EM algorithm and sufficient statistics.

TLDR
This paper develops a computationally efficient approach to warp factor estimation in Vocal Tract Length Normalization (VTLN) whose recognition performance is comparable to that of conventional VTLN.
Abstract
In this paper, we develop a computationally efficient approach for warp factor estimation in Vocal Tract Length Normalization (VTLN). We have recently shown that warped features can be obtained by a linear transformation of the unwarped features. Using the resulting warp matrices, we show that warp factor estimation can be performed efficiently in an EM framework. The sufficient statistics are collected by aligning the unwarped utterances only once. The likelihoods of the warped features, which are necessary for warp factor estimation, are then computed by appropriately modifying the sufficient statistics using the warp matrices. Using the OGI, TIDIGITS and RM tasks, we show that this approach achieves recognition performance comparable to that of conventional VTLN while being computationally more efficient.
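The idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes diagonal-covariance Gaussians, a Jacobian term of the form log|det A|, and a precomputed dictionary of warp matrices. The function names (`collect_stats`, `auxiliary`, `best_warp`) and all array shapes are hypothetical and chosen only for illustration. The point it shows is that once the occupancy, first-order and second-order statistics of the unwarped features are accumulated from a single alignment, the EM auxiliary function for every candidate warp factor can be evaluated from those statistics alone, without recomputing or realigning the features.

```python
import numpy as np


def collect_stats(X, gamma):
    """Accumulate sufficient statistics from unwarped features.

    X:     (T, D) unwarped feature vectors.
    gamma: (T, M) Gaussian posteriors from a single alignment pass.
    """
    occ = gamma.sum(axis=0)                            # (M,)   sum_t gamma
    first = gamma.T @ X                                # (M, D) sum_t gamma * x
    second = np.einsum("tm,ti,tj->mij", gamma, X, X)   # (M, D, D) sum_t gamma * x x^T
    return occ, first, second


def auxiliary(A, stats, means, variances, use_jacobian=True):
    """EM auxiliary function for features warped as y = A x.

    Terms that do not depend on A are dropped. Assumes diagonal
    covariances; means and variances both have shape (M, D).
    """
    occ, first, second = stats
    lin = first @ A.T                                  # sum_t gamma * (A x)_d
    quad = np.einsum("di,mij,dj->md", A, second, A)    # sum_t gamma * (A x)_d^2
    sq_err = quad - 2.0 * means * lin + occ[:, None] * means ** 2
    q = -0.5 * np.sum(sq_err / variances)
    if use_jacobian:
        # Assumed Jacobian compensation; A must be non-singular.
        q += occ.sum() * np.linalg.slogdet(A)[1]
    return q


def best_warp(warp_matrices, stats, means, variances):
    """Pick the warp factor whose matrix maximises the auxiliary function.

    warp_matrices: dict mapping warp factor -> (D, D) matrix (assumed given).
    """
    scores = {alpha: auxiliary(A, stats, means, variances)
              for alpha, A in warp_matrices.items()}
    return max(scores, key=scores.get), scores
```

In this sketch `collect_stats` is called once per speaker or utterance, after which `best_warp` scans the candidate warp factors at a cost that depends only on the number of Gaussians and the feature dimension, not on the number of frames.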


Citations
Proceedings Article

Prior Information for Rapid Speaker Adaptation

TL;DR: A count-smoothing framework for incorporating prior information is extended to allow for the use of different forms of dynamic prior and improve the robustness of transform estimation on small amounts of data.
Journal Article

Vocal Tract Length Normalization for Statistical Parametric Speech Synthesis

TL;DR: This paper presents an efficient implementation of VTLN using expectation maximization and addresses the key challenges faced in implementing VTLN for synthesis.
Journal Article

VTLN Using Analytically Determined Linear-Transformation on Conventional MFCC

TL;DR: A method to analytically obtain a linear-transformation on the conventional Mel frequency cepstral coefficients (MFCC) features that corresponds to conventional vocal tract length normalization (VTLN)-warped MFCC features, thereby simplifying the VTLN processing.

Implementation of VTLN for statistical speech synthesis

TL;DR: The EM formulation helps to embed the feature normalization in the HMM training and enables the use of multiple (appropriate) warping factors for different state clusters of the same speaker.
Proceedings Article

Acoustic Class Specific VTLN-Warping Using Regression Class Trees

TL;DR: This paper shows that, within this VTLN framework and using a regression class tree, one can obtain separate VTLN warping for different acoustic classes, and reports the recognition performance of the proposed acoustic-class-specific warp factor.
References
Journal Article

Maximum likelihood linear transformations for HMM-based speech recognition

TL;DR: The paper compares the two possible forms of model-based transforms: unconstrained, where any combination of mean and variance transform may be used, and constrained, which requires the variance transform to have the same form as the mean transform.
Journal Article

A frequency warping approach to speaker normalization

TL;DR: An efficient means for estimating a linear frequency warping factor and a simple mechanism for implementing frequency warping by modifying the filterbank in mel-frequency cepstrum feature analysis are presented.
Proceedings Article

Using VTLN for broadcast news transcription.

TL;DR: A new, simple, linear approximation to VTLN allows the Jacobian to be exactly computed and can be highly efficient in terms of warp factor estimation and application of the warp factors.
Proceedings Article

Linear transformation approach to VTLN using dynamic frequency warping.

TL;DR: The proposed DFW approach provides computational advantage of not having to recompute features for each warp factor in VTLN and can obtain a transformation matrix for any arbitrary warping even when it does not know the functional form or mapping of the warping function.