Proceedings ArticleDOI

Recurrent convolutional neural network for speech processing

TLDR
A recently developed deep learning model, the recurrent convolutional neural network (RCNN), is proposed for speech processing; it inherits merits of recurrent neural networks (RNN) and convolutional neural networks (CNN) and is competitive with previous methods in terms of accuracy and efficiency.
Abstract
Different neural networks have exhibited excellent performance on various speech processing tasks, and they usually have specific advantages and disadvantages. We propose to use a recently developed deep learning model, the recurrent convolutional neural network (RCNN), for speech processing, which inherits merits of recurrent neural networks (RNN) and convolutional neural networks (CNN). The core module can be viewed as a convolutional layer embedded with an RNN, which enables the model to capture both temporal and frequency dependence in the spectrogram of the speech in an efficient way. The model is tested on the TIMIT corpus for phoneme recognition and on IEMOCAP for emotion recognition. Experimental results show that the model is competitive with previous methods in terms of accuracy and efficiency.
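The abstract describes the core module as a convolutional layer embedded with an RNN: a feed-forward convolution over the frequency axis of each spectrogram frame, plus a recurrent convolution of the previous hidden state over time. A minimal single-channel numpy sketch of this idea follows; the kernel names `w_ff` and `w_rec` and the exact update rule are illustrative assumptions, not the paper's exact architecture.

```python
import numpy as np

def conv1d(x, w):
    """'Same' 1-D convolution along the frequency axis (zero padding)."""
    pad = len(w) // 2
    xp = np.pad(x, pad)
    return np.array([np.dot(xp[i:i + len(w)], w) for i in range(len(x))])

def rcl_step(frame, state, w_ff, w_rec):
    """One recurrent-convolutional step: feed-forward convolution of the
    current frame plus convolution of the previous hidden state, then ReLU.
    (Illustrative update rule, not the paper's exact formulation.)"""
    return np.maximum(0.0, conv1d(frame, w_ff) + conv1d(state, w_rec))

def rcnn_layer(spectrogram, w_ff, w_rec):
    """Run the recurrent convolution over the time axis of a
    (time, frequency) spectrogram."""
    state = np.zeros(spectrogram.shape[1])
    outputs = []
    for frame in spectrogram:                        # recurrence over time
        state = rcl_step(frame, state, w_ff, w_rec)  # convolution over frequency
        outputs.append(state)
    return np.stack(outputs)

rng = np.random.default_rng(0)
spec = rng.standard_normal((20, 40))   # 20 time frames, 40 frequency bins
out = rcnn_layer(spec, rng.standard_normal(3) * 0.1, rng.standard_normal(3) * 0.1)
print(out.shape)  # (20, 40): one hidden vector per frame
```

Sharing one small recurrent kernel across all frequency bins is what keeps this efficient: the parameter count is independent of the number of bins, unlike a fully-connected RNN over the frame.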


Citations
Journal ArticleDOI

A Survey on Deep Learning: Algorithms, Techniques, and Applications

TL;DR: A comprehensive review of historical and recent state-of-the-art approaches in visual, audio, and text processing; social network analysis; and natural language processing is presented, followed by an in-depth analysis of pivotal and groundbreaking advances in deep learning applications.
Journal ArticleDOI

Speech Emotion Recognition Using Deep Learning Techniques: A Review

TL;DR: An overview of deep learning techniques is presented, and recent literature applying these methods to speech-based emotion recognition is discussed, covering the databases used, the emotions extracted, contributions made toward speech emotion recognition, and related limitations.
Proceedings Article

Gated recurrent convolution neural network for OCR

TL;DR: A new architecture named Gated RCNN (GRCNN), inspired by the recently proposed Recurrent Convolutional Neural Network for general image classification, is combined with a BLSTM to recognize text in natural images.
Journal ArticleDOI

Deep Neural Network for Respiratory Sound Classification in Wearable Devices Enabled by Patient Specific Model Tuning

TL;DR: A deep CNN-RNN model that classifies respiratory sounds based on Mel-spectrograms is proposed, achieving a state-of-the-art score on the ICBHI’17 dataset; deep learning models are shown to successfully learn domain-specific knowledge when pre-trained with breathing data, producing significantly superior performance compared to generalized models.
References
Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

TL;DR: A deep convolutional neural network consisting of five convolutional layers, some followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax achieved state-of-the-art performance on ImageNet classification.
Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

TL;DR: This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
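The TL;DR above hinges on stacking very small convolution filters to build depth. A quick numpy check of why that works: composing two convolutions is equivalent to convolving their kernels, so n stacked 3-tap filters cover the same receptive field as one (2n + 1)-tap filter while using fewer parameters and more non-linearities. The kernels below are arbitrary illustrative values.

```python
import numpy as np

# Two stacked 3-tap filters act like one 5-tap filter:
# the composite kernel is the convolution of the two kernels.
k1 = np.array([1.0, 2.0, 1.0])
k2 = np.array([1.0, 0.0, -1.0])
composite = np.convolve(k1, k2)
print(len(composite))  # 5 -> 5-tap receptive field from two 3-tap layers

# More generally, n stacked size-3 kernels give a (2n + 1)-tap receptive field.
rf = np.ones(1)
for _ in range(3):
    rf = np.convolve(rf, np.ones(3))
print(len(rf))  # 7 -> three 3-tap layers match one 7-tap filter
```

The same arithmetic in 2-D is why two stacked 3x3 layers match a 5x5 receptive field and three match 7x7, which is the design choice the 16-19 layer configurations exploit.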
Proceedings Article

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

TL;DR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
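The batch normalization transform summarized above normalizes each feature over the mini-batch, then applies a learnable scale and shift. A minimal numpy sketch of the training-time forward pass (the function name and toy data are illustrative; running statistics for inference are omitted):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize each feature over the mini-batch, then scale and shift.
    x: (batch, features); gamma, beta: learnable per-feature parameters."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

rng = np.random.default_rng(1)
x = rng.normal(loc=5.0, scale=3.0, size=(64, 8))  # shifted, scaled activations
y = batch_norm(x, gamma=np.ones(8), beta=np.zeros(8))
print(np.allclose(y.mean(axis=0), 0.0, atol=1e-6))  # True: zero mean per feature
```

With `gamma = 1` and `beta = 0` the output has roughly zero mean and unit variance per feature regardless of the input distribution, which is what reduces the internal covariate shift the title refers to.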