Open Access Proceedings Article

Approaches to Language Identification using Gaussian Mixture Models and Shifted Delta Cepstral Features

Abstract
Published results indicate that automatic language identification (LID) systems that rely on multiple-language phone recognition and n-gram language modeling produce the best performance in formal LID evaluations. By contrast, Gaussian mixture model (GMM) systems, which measure acoustic characteristics, are far more efficient computationally but have tended to provide inferior levels of performance. This paper describes two GMM-based approaches to language identification that use shifted delta cepstra (SDC) feature vectors to achieve LID performance comparable to that of the best phone-based systems. The approaches include both acoustic scoring and a recently developed GMM tokenization system that is based on a variation of phonetic recognition and language modeling. System performance is evaluated on both the CallFriend and OGI corpora.
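The shifted delta cepstra mentioned in the abstract can be illustrated with a minimal sketch. It follows the commonly used definition, in which each SDC vector stacks k delta blocks, with block i computed as c(t + iP + d) - c(t + iP - d). The function name, the default values d=1, P=3, k=7, and the choice to drop frames that lack full context are assumptions made for illustration; the abstract does not state the paper's actual configuration.

```python
def shifted_delta_cepstra(cepstra, d=1, P=3, k=7):
    """Stack k shifted delta blocks per frame.

    cepstra: list of frames, each a list of N cepstral coefficients.
    d: delta spread, P: shift between blocks, k: number of blocks.
    The defaults are illustrative assumptions, not values from the paper.
    Returns a list of vectors of length k * N, one per frame that has
    full context on both sides (edge frames are dropped in this sketch).
    """
    num_frames = len(cepstra)
    first = d                                    # need frame t - d
    last = num_frames - 1 - ((k - 1) * P + d)    # need frame t + (k-1)P + d
    sdc = []
    for t in range(first, last + 1):
        vec = []
        for i in range(k):
            ahead = cepstra[t + i * P + d]
            behind = cepstra[t + i * P - d]
            # Block i: delta computed around frame t + i*P.
            vec.extend(a - b for a, b in zip(ahead, behind))
        sdc.append(vec)
    return sdc


# Toy input: 30 frames of 2 coefficients that grow linearly with time.
frames = [[float(t), 2.0 * t] for t in range(30)]
sdc = shifted_delta_cepstra(frames)
```

With k blocks of N coefficients each, every output vector has k * N dimensions, so a single frame carries temporal information spanning roughly (k - 1) * P + 2d frames. Real front ends often pad the edges instead of dropping frames; that choice is left open here.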

References
Journal ArticleDOI

Comparison of four approaches to automatic language identification of telephone speech

TL;DR: Four approaches for automatic language identification of speech utterances are compared: Gaussian mixture model (GMM) classification; single-language phone recognition followed by language-dependent, interpolated n-gram language modeling (PRLM); parallel PRLM, which uses multiple single-language phone recognizers, each trained in a different language; and language-dependent parallel phone recognition (PPR).
Proceedings Article

The OGI multi-language telephone speech corpus.

TL;DR: The recording protocol, data collection procedure, ongoing corpus development, preliminary results of the statistical evaluation of the 10 languages, and plans to provide orthographic transcriptions of the speech are described.
Proceedings ArticleDOI

Language identification using Gaussian mixture model tokenization

TL;DR: Phone tokenization followed by n-gram language modeling has consistently provided good results for the task of language identification; this work generalizes the technique by using Gaussian mixture models as the basis for tokenizing.
Proceedings ArticleDOI

Language identification using shifted delta cepstra

TL;DR: This paper presents a method for finding the optimal parameters for identifying a set of languages, specifies these parameters for a language identification task, and provides a performance comparison.