Open Access Proceedings Article
Approaches to Language Identification using Gaussian Mixture Models and Shifted Delta Cepstral Features
Pedro A. Torres-Carrasquillo, Elliot Singer, Mary A. Kohler, Richard J. Greene, Douglas A. Reynolds, John R. Deller
TLDR
Two GMM-based approaches to language identification that use shifted delta cepstra (SDC) feature vectors to achieve LID performance comparable to that of the best phone-based systems are described.
Abstract:
Published results indicate that automatic language identification (LID) systems that rely on multiple-language phone recognition and n-gram language modeling produce the best performance in formal LID evaluations. By contrast, Gaussian mixture model (GMM) systems, which measure acoustic characteristics, are far more efficient computationally but have tended to provide inferior levels of performance. This paper describes two GMM-based approaches to language identification that use shifted delta cepstra (SDC) feature vectors to achieve LID performance comparable to that of the best phone-based systems. The approaches include both acoustic scoring and a recently developed GMM tokenization system that is based on a variation of phonetic recognition and language modeling. System performance is evaluated on both the CallFriend and OGI corpora.
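The abstract names shifted delta cepstra but does not define them. As a rough illustration only, the sketch below stacks k delta blocks per frame, where block i is the difference c(t + iP + d) - c(t + iP - d) between cepstral frames. The function name, the default values d=1, P=3, k=7, and the choice to clamp out-of-range frame indices at the edges are all assumptions made for this example, not details taken from the paper.

```python
from typing import List, Sequence


def shifted_delta_cepstra(
    cepstra: Sequence[Sequence[float]],
    d: int = 1,  # assumed delta spread (frames on each side)
    p: int = 3,  # assumed shift between consecutive delta blocks
    k: int = 7,  # assumed number of stacked delta blocks
) -> List[List[float]]:
    """Return one stacked SDC vector per input frame.

    Each output vector has k * N entries, where N is the number of
    cepstral coefficients per frame. Block i holds
    c(t + i*p + d) - c(t + i*p - d).
    """
    if not cepstra:
        return []
    last = len(cepstra) - 1

    def frame(index: int) -> Sequence[float]:
        # Assumption: indices outside the utterance are clamped to the
        # first or last frame instead of being dropped or zero-padded.
        return cepstra[min(max(index, 0), last)]

    features: List[List[float]] = []
    for t in range(len(cepstra)):
        stacked: List[float] = []
        for i in range(k):
            ahead = frame(t + i * p + d)
            behind = frame(t + i * p - d)
            stacked.extend(a - b for a, b in zip(ahead, behind))
        features.append(stacked)
    return features
```

With N cepstral coefficients per frame, each output vector has k * N entries, so the stacked feature covers a much longer time span than a conventional single delta. Other edge-handling choices, such as discarding frames near the utterance boundaries, would be equally plausible.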