Proceedings ArticleDOI

Improving of Open-Set Language Identification by Using Deep SVM and Thresholding Functions

01 Oct 2017-pp 796-802
TL;DR: This paper proposes a deep SVM based LID back-end system to improve target-language identification, and defines three OOS thresholding formulations used to decide whether a speech segment is a target or an OOS language.
Abstract: State-of-the-art language identification (LID) systems are based on an iVector feature extractor front-end followed by a multi-class recognition back-end. Identification accuracy degrades considerably when LID systems face open-set languages. Compared to the in-set identification task, the open-set task better mimics the real challenge of language identification. In this paper, we propose an approach to the problem of out-of-set (OOS) data detection in the context of open-set language identification with zero knowledge of the OOS languages. The main feature of this study is its emphasis on in-set (target) language identification on the one hand, and on OOS language detection on the other. Accordingly, we propose a deep SVM based LID back-end system to improve target-language identification. Alongside it, we define three OOS thresholding formulations, which are used to decide whether a speech segment is a target or an OOS language. The experimental results demonstrate the effectiveness of the deep SVM back-end system as compared to state-of-the-art techniques. Besides that, the thresholding functions perfectly detect and reject the OOS data. A relative decrease of 6% in Equal Error Rate (EER) over classical OOS detection methods is reported in discriminating target and OOS languages.
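The paper's three thresholding formulations are not reproduced in this abstract; as an illustration only, a generic max-score rejection rule of the kind used for OOS detection might look like the following sketch (the `detect_oos` helper and the threshold value are hypothetical, not the paper's):

```python
def detect_oos(scores, threshold=0.5):
    """Generic max-score rejection rule: if the best in-set back-end
    score falls below the threshold, reject the segment as
    out-of-set (OOS); otherwise return the winning target language."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    if scores[best] < threshold:
        return "OOS", None
    return "target", best

# A confident in-set segment vs. a low-confidence one
print(detect_oos([0.1, 0.8, 0.05]))   # ('target', 1)
print(detect_oos([0.2, 0.3, 0.25]))   # ('OOS', None)
```

In practice the threshold is tuned on a development set to balance target misses against OOS false acceptances.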
Citations
Journal ArticleDOI
TL;DR: This paper proposes using image-based features derived from speech signal patterns for spoken language identification, achieving a highest accuracy of 99.96%, which outperforms the state-of-the-art reported results.
Abstract: Speech recognition-based applications are common in Western countries, but far less so in East Asia; language complexity is potentially one of the primary reasons behind this lag. Besides, multilingual countries like India need language identification (of words and phrases) from speech signals. Unlike previous works, in this paper we propose to use speech signal patterns for spoken language identification, relying on image-based features. The concept is primarily inspired by the fact that a speech signal can be read/visualized. In our experiment, we use spectrograms (as image data) and deep learning for spoken language classification. Using the IIIT-H Indic speech database for Indic languages, we achieve a highest accuracy of 99.96%, which outperforms the state-of-the-art reported results. Furthermore, for a relative decrease of 4018.60% in the signal-to-noise ratio, accuracy drops by only 0.50%, indicating that our concept is fairly robust.
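The spectrogram-as-image idea can be sketched with SciPy; the synthetic signal, sampling rate, and STFT parameters below are illustrative assumptions, not the cited paper's exact setup:

```python
import numpy as np
from scipy.signal import spectrogram

# Synthetic 1-second, 8 kHz signal: two tones standing in for speech
fs = 8000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 1200 * t)

# Time-frequency image; in the cited work such images are fed to a CNN
f, seg_t, Sxx = spectrogram(x, fs=fs, nperseg=256, noverlap=128)
log_spec = 10 * np.log10(Sxx + 1e-10)  # log scale, as typically plotted

print(log_spec.shape)  # (129 frequency bins for nperseg=256, time frames)
```

Each utterance then becomes a 2-D array that can be saved as an image and classified like any other picture.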

20 citations

Journal ArticleDOI
TL;DR: This paper proposes image-based features for speech signal classification, since different languages can be distinguished by visualizing their speech patterns; the highest accuracy obtained was 94.51%.
Abstract: Like other applications under the purview of pattern classification, analyzing speech signals is crucial. People often mix different languages while talking, which complicates this task. This happens mostly in India, since different languages are used from one state to another; the southern part of India in particular suffers from this situation, making it important to distinguish its languages. In this paper, we propose image-based features for speech signal classification, since different patterns can be identified by visualizing the speech signals. Modified Mel-frequency cepstral coefficient (MFCC) features, namely MFCC-Statistics Grade (MFCC-SG), were extracted, visualized using plotting techniques, and then fed to a convolutional neural network. In this study, we used the top four languages, namely Telugu, Tamil, Malayalam, and Kannada. Experiments were performed on more than 900 hours of data collected from YouTube, yielding over 150,000 images, and the highest accuracy of 94.51% was obtained.

8 citations


Cites methods from "Improving of Open-Set Language Iden..."

  • ...[29] used a deep SVM for detecting out-of-set languages in the task of language identification and presented three formulations for the out-of-set languages as well....


Journal ArticleDOI
TL;DR: Speech recognition in multilingual scenario is not trivial in the case when multiple languages are used in one conversation and language must be identified before speech recognition as such...
Abstract: Speech recognition in multilingual scenario is not trivial in the case when multiple languages are used in one conversation. Language must be identified before we process speech recognition as such...

7 citations

Journal ArticleDOI
TL;DR: This work tackles the open-set task by adapting two modern-day state-of-the-art approaches to closed-set language identification: the first using a CRNN with attention and the second using a TDNN.
Abstract: While most modern speech Language Identification methods are closed-set, we want to see if they can be modified and adapted for the open-set problem. When switching to the open-set problem, the solution gains the ability to reject an audio input when it fails to match any of our known language options. We tackle the open-set task by adapting two modern-day state-of-the-art approaches to closed-set language identification: the first using a CRNN with attention and the second using a TDNN. In addition to enhancing our input feature embeddings using MFCCs, log spectral features, and pitch, we will be attempting two approaches to out-of-set language detection: one using thresholds, and the other essentially performing a verification task. We will compare both the performance of the TDNN and the CRNN, as well as our detection approaches.
Journal ArticleDOI
TL;DR: In this article, the authors propose a semi-open-set approach to the spoken dialect recognition task, in which a closed-set model is exposed to inputs from unknown classes, with utterances from other unknown classes also included.
References
Proceedings Article
01 Jan 2002
TL;DR: Two GMM-based approaches to language identification that use shifted delta cepstra (SDC) feature vectors to achieve LID performance comparable to that of the best phone-based systems are described.
Abstract: Published results indicate that automatic language identification (LID) systems that rely on multiple-language phone recognition and n-gram language modeling produce the best performance in formal LID evaluations. By contrast, Gaussian mixture model (GMM) systems, which measure acoustic characteristics, are far more efficient computationally but have tended to provide inferior levels of performance. This paper describes two GMM-based approaches to language identification that use shifted delta cepstra (SDC) feature vectors to achieve LID performance comparable to that of the best phone-based systems. The approaches include both acoustic scoring and a recently developed GMM tokenization system that is based on a variation of phonetic recognition and language modeling. System performance is evaluated on both the CallFriend and OGI corpora.

459 citations


"Improving of Open-Set Language Iden..." refers background in this paper

  • ...Several systems have demonstrated the effectiveness of i-vector representation over low-level acoustic features such as Mel-frequency cepstral coefficients (MFCC) and shifted-delta cepstral coefficients (SDC) [2]–[4]....


  • ...It consistently outperforms its high-level counterparts, including Gaussian mixture models (GMM) [2], [4], [11] and Gaussian Mixture Model-Universal Background Model (GMM-UBM) [2], [12]....


Proceedings ArticleDOI
27 Aug 2011
TL;DR: In this paper, a new language identification system is presented based on the total variability approach previously developed in the field of speaker identification and various techniques are employed to extract the most salient features in the lower dimensional i-vector space.
Abstract: In this paper, a new language identification system is presented based on the total variability approach previously developed in the field of speaker identification. Various techniques are employed to extract the most salient features in the lower dimensional i-vector space and the system developed results in excellent performance on the 2009 LRE evaluation set without the need for any post-processing or backend techniques. Additional performance gains are observed when the system is combined with other acoustic systems.

438 citations

Journal ArticleDOI
TL;DR: The proposed VSM approach leads to a discriminative classifier backend, which is demonstrated to give superior performance over likelihood-based n-gram language modeling (LM) backend for long utterances.
Abstract: We propose a novel approach to automatic spoken language identification (LID) based on vector space modeling (VSM). It is assumed that the overall sound characteristics of all spoken languages can be covered by a universal collection of acoustic units, which can be characterized by the acoustic segment models (ASMs). A spoken utterance is then decoded into a sequence of ASM units. The ASM framework furthers the idea of language-independent phone models for LID by introducing an unsupervised learning procedure to circumvent the need for phonetic transcription. Analogous to representing a text document as a term vector, we convert a spoken utterance into a feature vector with its attributes representing the co-occurrence statistics of the acoustic units. As such, we can build a vector space classifier for LID. The proposed VSM approach leads to a discriminative classifier backend, which is demonstrated to give superior performance over likelihood-based n-gram language modeling (LM) backend for long utterances. We evaluated the proposed VSM framework on 1996 and 2003 NIST Language Recognition Evaluation (LRE) databases, achieving an equal error rate (EER) of 2.75% and 4.02% in the 1996 and 2003 LRE 30-s tasks, respectively, which represents one of the best results reported on these popular tasks
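The term-vector analogy above maps a decoded acoustic-unit sequence to co-occurrence (e.g. bigram) counts, just as a text document becomes a term vector. A minimal sketch with a hypothetical unit vocabulary (the helper name and toy data are illustrative):

```python
from collections import Counter

def cooccurrence_vector(units, vocab):
    """Map a decoded acoustic-unit sequence to a bigram count vector,
    analogous to a term vector for a text document (VSM back-end)."""
    bigrams = Counter(zip(units, units[1:]))
    return [bigrams.get(bg, 0) for bg in vocab]

# Toy vocabulary of unit bigrams and a decoded utterance
vocab = [("a", "b"), ("b", "a"), ("b", "c")]
v = cooccurrence_vector(["a", "b", "a", "b", "c"], vocab)
print(v)  # [2, 1, 1]
```

A discriminative classifier (e.g. an SVM) is then trained on these fixed-length vectors, one per utterance.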

248 citations

Proceedings Article
06 Dec 2010
TL;DR: It is demonstrated that linear MKL regularised with the p-norm squared, or with certain Bregman divergences, can indeed be trained using SMO, and the resulting algorithm retains both simplicity and efficiency and is significantly faster than state-of-the-art specialised p- norm MKL solvers.
Abstract: Our objective is to train p-norm Multiple Kernel Learning (MKL) and, more generally, linear MKL regularised by the Bregman divergence, using the Sequential Minimal Optimization (SMO) algorithm. The SMO algorithm is simple, easy to implement and adapt, and efficiently scales to large problems. As a result, it has gained widespread acceptance and SVMs are routinely trained using SMO in diverse real world applications. Training using SMO has been a long standing goal in MKL for the very same reasons. Unfortunately, the standard MKL dual is not differentiable, and therefore can not be optimised using SMO style co-ordinate ascent. In this paper, we demonstrate that linear MKL regularised with the p-norm squared, or with certain Bregman divergences, can indeed be trained using SMO. The resulting algorithm retains both simplicity and efficiency and is significantly faster than state-of-the-art specialised p-norm MKL solvers. We show that we can train on a hundred thousand kernels in approximately seven minutes and on fifty thousand points in less than half an hour on a single core.

190 citations

Proceedings Article
01 Jan 2003
TL;DR: This paper describes and evaluates three techniques that have been applied to the language identification problem: phone recognition, Gaussian mixture modeling, and support vector machine classification and an approach to dealing with the problem of out-of-set data.
Abstract: Formal evaluations conducted by NIST in 1996 demonstrated that systems that used parallel banks of tokenizer-dependent language models produced the best language identification performance. Since that time, other approaches to language identification have been developed that match or surpass the performance of phone-based systems. This paper describes and evaluates three techniques that have been applied to the language identification problem: phone recognition, Gaussian mixture modeling, and support vector machine classification. A recognizer that fuses the scores of three systems that employ these techniques produces a 2.7% equal error rate (EER) on the 1996 NIST evaluation set and a 2.8% EER on the NIST 2003 primary condition evaluation set. An approach to dealing with the problem of out-of-set data is also discussed.

176 citations


"Improving of Open-Set Language Iden..." refers background or methods in this paper

  • ...This approach is straightforward and fast for application and in contrast to existing works [18], [20], [21], does not rely on the use of additional data from OOS languages which may not be available....


  • ...For instance, in [20]–[23], non-target languages data are pooled from different resources to build OOS corpus, which can be costly and time-consuming....


  • ...[20] NIST LRE 1996&2003 OOS modeling GMM, SVM and tokenizer Campbell et al....

