Deep learning for spoken language identification: Can we visualize speech signal patterns?

doi:10.1007/S00521-019-04468-3

Journal ArticleDOI

Himadri Mukherjee, +6 more

- 01 Dec 2019 -

Neural Computing and Applications

- Vol. 31, Iss: 12, pp 8483-8501

TLDR

This paper proposes to use speech signal patterns for spoken language identification with image-based features, and reports a highest accuracy of 99.96%, which outperforms the state-of-the-art reported results.

Abstract:

Speech recognition-based applications are widely used in Western countries, but not to a similar extent in East Asia. Language complexity could be one of the primary reasons behind this lag. Multilingual countries like India also need to be considered, so that language identification (of words and phrases) becomes possible from speech signals. Unlike previous work, this paper proposes to use speech signal patterns for spoken language identification, with image-based features. The concept is primarily inspired by the fact that a speech signal can be read/visualized. In our experiments, we use spectrograms (as image data) and deep learning for spoken language classification. Using the IIIT-H Indic speech database for Indic languages, we achieve a highest accuracy of 99.96%, which outperforms the state-of-the-art reported results. Furthermore, for a relative decrease of 4018.60% in the signal-to-noise ratio, accuracy decreases by only 0.50%, which suggests that the concept is fairly robust.
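
The abstract's idea of treating a speech signal as an image can be illustrated with a minimal sketch. The code below is not the authors' pipeline: the frame length, hop size, Hann window, log scaling and the synthetic 1 kHz test tone are all assumptions chosen for illustration, and the function names `hann`, `spectrogram` and `to_grayscale` are hypothetical. It uses only the Python standard library to turn a signal into a grayscale spectrogram of the kind an image classifier could consume.

```python
import cmath
import math


def hann(n):
    """Return an n-point Hann window (assumed window choice)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def spectrogram(signal, frame_len=64, hop=32):
    """Magnitude spectrogram: one row per frame, one column per frequency bin."""
    window = hann(frame_len)
    n_bins = frame_len // 2 + 1  # non-negative frequencies only
    rows = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        row = []
        for k in range(n_bins):
            # Direct DFT of the windowed frame at bin k (slow but dependency-free).
            acc = sum(
                frame[t] * cmath.exp(-2j * math.pi * k * t / frame_len)
                for t in range(frame_len)
            )
            row.append(abs(acc))
        rows.append(row)
    return rows


def to_grayscale(spec):
    """Log-scale the magnitudes and map them to integer pixels in 0..255."""
    log_spec = [[math.log10(v + 1e-10) for v in row] for row in spec]
    lo = min(min(r) for r in log_spec)
    hi = max(max(r) for r in log_spec)
    span = (hi - lo) or 1.0
    return [[int(round(255 * (v - lo) / span)) for v in row] for row in log_spec]


# Synthetic stand-in for a speech clip: a 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
tone_hz = 1000
signal = [math.sin(2 * math.pi * tone_hz * n / sample_rate) for n in range(512)]

spec = spectrogram(signal)
image = to_grayscale(spec)

# Each bin spans sample_rate / frame_len = 125 Hz, so the tone lands in bin 8.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
```

In practice a library FFT would replace the direct DFT, and the resulting pixel grid would be saved as an image or fed to a convolutional network. Those steps are omitted here because the abstract does not specify them.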

Citations

Journal ArticleDOI

A bibliometric analysis on deep learning during 2007–2019

Yang Li, +3 more

- 28 Jun 2020 -

International Journal of Machine Learnin...

TL;DR: A comprehensive analysis of DL publications from 2007 to 2019 is presented, providing preliminary knowledge of DL for researchers interested in this area and a conclusive analysis for those who want to do further research in it.

Journal ArticleDOI

A Hybrid Meta-Heuristic Feature Selection Method for Identification of Indian Spoken Languages From Audio Signals

Aankit Das, +5 more

- 01 Oct 2020 -

IEEE Access

TL;DR: A new nature-inspired feature selection (FS) algorithm is developed by hybridizing Binary Bat Algorithm with Late Acceptance Hill-Climbing (LAHC) to select the optimal subset from the said feature vectors in order to reduce the model complexity and help it train faster.

Journal ArticleDOI

Hybrid Feature Selection Method Based on Harmony Search and Naked Mole-Rat Algorithms for Spoken Language Identification From Audio Signals

Samarpan Guha, +5 more

- 05 Oct 2020 -

IEEE Access

TL;DR: A new hybrid Feature Selection (FS) algorithm has been developed using the versatile Harmony Search (HS) algorithm and a new nature-inspired algorithm called the Naked Mole-Rat (NMR) algorithm to select the best subset of features and reduce the model complexity to help it train faster.

Journal ArticleDOI

Automatic spoken language identification using MFCC based time series features

Mainak Biswas, +4 more

- 03 Jan 2022 -

Multimedia Tools and Applications

TL;DR: This work proposes a model for the recognition of Indian and foreign languages, and augments data with noise of varying loudness taken from diverse environments to make the model robust to noise from everyday life.

Journal ArticleDOI

A CNN-BiLSTM based hybrid model for Indian language identification

Himanish Shekhar Das, +1 more

- 01 Nov 2021 -

Applied Acoustics

TL;DR: A CNN-based bidirectional long short-term memory (BiLSTM) model has been proposed for Indian language identification with special emphasis on Northeastern languages, and results show that the ResNet-50 based model has achieved accuracy up to 98.10% as compared to 97.70% for the VGG-16 based model.

References

Journal ArticleDOI

Deep learning

Yann LeCun, +4 more

- 28 May 2015 -

Nature

TL;DR: Deep learning is making major advances in solving problems that have resisted the best attempts of the artificial intelligence community for many years, and will have many more successes in the near future because it requires very little engineering by hand and can easily take advantage of increases in the amount of available computation and data.

Book

Deep Learning

Ian Goodfellow, +2 more

TL;DR: Deep learning is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts; it is used in many applications such as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames.

Book

Pattern Classification

Peter E. Hart, +2 more

Journal ArticleDOI

The WEKA data mining software: an update

Mark Hall, +5 more

- 16 Nov 2009 -

Sigkdd Explorations

TL;DR: This paper provides an introduction to the WEKA workbench, reviews the history of the project, and, in light of the recent 3.6 stable release, briefly discusses what has been added since the last stable version (Weka 3.4) released in 2003.

Posted Content

MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications

Andrew Howard, +7 more

- 17 Apr 2017 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: This work introduces two simple global hyper-parameters that efficiently trade off between latency and accuracy and demonstrates the effectiveness of MobileNets across a wide range of applications and use cases including object detection, finegrain classification, face attributes and large scale geo-localization.

Related Papers (5)

Spoken language processing techniques for sign language recognition and translation

State of the art in continuous speech recognition

Towards mixed language speech recognition systems

Discovering linguistic structures in speech : models and applications

Automatic language identification with sequences of language-independent phoneme clusters