Journal Article

Deep learning for spoken language identification: Can we visualize speech signal patterns?

TLDR
This paper proposes to use speech signal patterns for spoken language identification with image-based features, and reports a highest accuracy of 99.96%, which outperforms the state-of-the-art reported results.
Abstract
Speech recognition-based applications are widely used in Western countries, but not to a similar extent in East Asia. Language complexity could be one of the primary reasons behind this lag. Multilingual countries such as India also need to be considered, so that languages can be identified (from words and phrases) through speech signals. Unlike previous works, in this paper we propose to use speech signal patterns for spoken language identification, where image-based features are used. The concept is primarily inspired by the fact that a speech signal can be read/visualized. In our experiment, we use spectrograms (as image data) and deep learning for spoken language classification. Using the IIIT-H Indic speech database for Indic languages, we achieve a highest accuracy of 99.96%, which outperforms the state-of-the-art reported results. Furthermore, for a relative decrease of 4018.60% in the signal-to-noise ratio, accuracy decreases by only 0.50%, which indicates that our concept is fairly robust.
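To make the "speech signal as an image" idea concrete, here is a minimal sketch of how a spectrogram can be computed from a one-dimensional signal. It is not the authors' pipeline: the abstract does not state their window, hop size, or network, so `frame_len`, `hop`, the Hann window, and the synthetic tone standing in for speech are all assumptions made for illustration. It uses only the Python standard library and a naive DFT; a real system would more likely use an optimized FFT routine.

```python
import cmath
import math


def hann(n):
    """Symmetric Hann window of length n (assumed window choice)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def spectrogram(signal, frame_len=64, hop=32):
    """Magnitude spectrogram via a naive short-time DFT.

    Returns a list of frames; each frame is a list of magnitudes for
    frequency bins 0 .. frame_len // 2. Rows are time, columns are frequency,
    so the result can be treated as a grayscale image.
    """
    window = hann(frame_len)
    n_bins = frame_len // 2 + 1
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        # Cut one overlapping frame and taper its edges with the window
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        mags = []
        for k in range(n_bins):
            acc = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            mags.append(abs(acc))
        frames.append(mags)
    return frames


# Hypothetical input: a 1 kHz tone sampled at 8 kHz instead of real speech.
# Bin spacing is 8000 / 64 = 125 Hz, so the tone falls in bin 8.
sample_rate = 8000
tone_hz = 1000.0
signal = [math.sin(2 * math.pi * tone_hz * t / sample_rate) for t in range(256)]

spec = spectrogram(signal)
# Index of the strongest frequency bin in each time frame
peak_bins = [max(range(len(f)), key=f.__getitem__) for f in spec]
```

Each row of `spec` is one time slice and each column one frequency bin, which is the two-dimensional layout an image classifier such as a convolutional network would take as input.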


Citations
Journal Article

A bibliometric analysis on deep learning during 2007–2019

TL;DR: A comprehensive analysis of DL publications from 2007 to 2019 is presented, providing preliminary knowledge of DL for researchers who are interested in this area and a conclusive analysis for those who want to do further research in it.
Journal Article

A Hybrid Meta-Heuristic Feature Selection Method for Identification of Indian Spoken Languages From Audio Signals

TL;DR: A new nature-inspired feature selection (FS) algorithm is developed by hybridizing Binary Bat Algorithm with Late Acceptance Hill-Climbing (LAHC) to select the optimal subset from the said feature vectors in order to reduce the model complexity and help it train faster.
Journal Article

Hybrid Feature Selection Method Based on Harmony Search and Naked Mole-Rat Algorithms for Spoken Language Identification From Audio Signals

TL;DR: A new hybrid Feature Selection (FS) algorithm has been developed using the versatile Harmony Search (HS) algorithm and a new nature-inspired algorithm called the Naked Mole-Rat (NMR) algorithm to select the best subset of features and reduce the model complexity to help it train faster.
Journal Article

Automatic spoken language identification using MFCC based time series features

TL;DR: This work proposes a model for the recognition of Indian and foreign languages, and augments data with noise of varying loudness taken from diverse environments to make the model robust to noise from everyday life.
Journal Article

A CNN-BiLSTM based hybrid model for Indian language identification

TL;DR: A CNN-based bidirectional long short-term memory (BiLSTM) model is proposed for Indian language identification, with special emphasis on Northeastern languages; results show that the ResNet-50 based model achieves accuracy up to 98.10%, compared to 97.70% for the VGG-16 based model.
References
Journal Article

Deep learning

TL;DR: Deep learning is making major advances in solving problems that have resisted the best attempts of the artificial intelligence community for many years, and will have many more successes in the near future because it requires very little engineering by hand and can easily take advantage of increases in the amount of available computation and data.
Book

Deep Learning

TL;DR: Deep learning, as described in this book, is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts; it is used in many applications such as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and video games.
Journal Article

The WEKA data mining software: an update

TL;DR: This paper provides an introduction to the WEKA workbench, reviews the history of the project, and, in light of the recent 3.6 stable release, briefly discusses what has been added since the last stable version (Weka 3.4) released in 2003.
Posted Content

MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications

TL;DR: This work introduces two simple global hyper-parameters that efficiently trade off between latency and accuracy and demonstrates the effectiveness of MobileNets across a wide range of applications and use cases including object detection, finegrain classification, face attributes and large scale geo-localization.