Proceedings Article

Improving of Open-Set Language Identification by Using Deep SVM and Thresholding Functions

TLDR
This paper proposes a deep SVM based LID back-end system to improve target-language identification and defines three OOS thresholding formulations, which are used to decide whether a speech segment belongs to a target language or an OOS language.
Abstract
State-of-the-art language identification (LID) systems are based on an iVector feature extractor front-end followed by a multi-class recognition back-end. Identification accuracy degrades considerably when LID systems face open-set languages. Compared to the in-set identification task, the open-set task is better suited to mimic the real challenge of language identification. In this paper, we propose an approach to out-of-set (OOS) data detection in the context of open-set language identification with zero knowledge of the OOS languages. The study emphasizes both in-set (target) language identification and OOS language detection. Accordingly, we propose a deep SVM based LID back-end system to improve target-language identification. In addition, we define three OOS thresholding formulations, which are used to decide whether a speech segment belongs to a target language or an OOS language. The experimental results demonstrate the effectiveness of the deep SVM back-end system compared to state-of-the-art techniques. Moreover, the thresholding functions perfectly detect and reject the OOS data. A relative decrease of 6% in Equal Error Rate (EER) over classical OOS detection methods is reported in discriminating target and OOS languages.
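The abstract does not spell out the three thresholding formulations, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a generic rule: accept the top-scoring target language only if its score reaches a threshold, otherwise label the segment OOS. It also shows one common way to estimate EER and how a relative EER decrease is computed. The function names (`open_set_decision`, `equal_error_rate`, `relative_reduction`) and all numbers are hypothetical illustrations, not values from the paper.

```python
def open_set_decision(scores, threshold):
    """Return the index of the best-scoring target language, or -1 for OOS.

    Assumption: a plain max-score threshold, used here only as a stand-in
    for the paper's (unspecified) thresholding formulations.
    """
    best = max(range(len(scores)), key=lambda i: scores[i])
    return best if scores[best] >= threshold else -1


def equal_error_rate(target_scores, oos_scores):
    """Estimate EER by sweeping the threshold over all observed scores.

    target_scores: detection scores of segments that are truly in-set.
    oos_scores:    detection scores of segments that are truly out-of-set.
    """
    best_gap, eer = float("inf"), 1.0
    for t in sorted(set(target_scores) | set(oos_scores)):
        # False-reject rate: in-set segments scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False-accept rate: OOS segments scored at or above the threshold.
        far = sum(s >= t for s in oos_scores) / len(oos_scores)
        gap = abs(frr - far)
        if gap < best_gap:
            best_gap, eer = gap, (frr + far) / 2
    return eer


def relative_reduction(baseline, improved):
    """Relative decrease of `improved` with respect to `baseline`."""
    return (baseline - improved) / baseline


# Hypothetical scores for a three-language target set.
decision_in_set = open_set_decision([0.1, 0.7, 0.2], threshold=0.5)
decision_oos = open_set_decision([0.3, 0.4, 0.3], threshold=0.5)

# Hypothetical detection scores, chosen only to exercise the function.
eer = equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.1, 0.2, 0.3, 0.6])
```

Under this reading, a relative decrease of 6% means the new EER equals 0.94 times the baseline EER; it is not a drop of six absolute percentage points.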

