Framewise phoneme classification with bidirectional LSTM and other neural network architectures

Open Access Proceedings Article

Alex Graves, +1 more

- Vol. 18, pp 602-610

TLDR

A modified, full gradient version of the LSTM learning algorithm is used for framewise phoneme classification on the TIMIT database. The results support the view that contextual information is crucial to speech processing and suggest that bidirectional networks outperform unidirectional ones.

Abstract:

In this paper, we present bidirectional Long Short Term Memory (LSTM) networks, and a modified, full gradient version of the LSTM learning algorithm. We evaluate Bidirectional LSTM (BLSTM) and several other network architectures on the benchmark task of framewise phoneme classification, using the TIMIT database. Our main findings are that bidirectional networks outperform unidirectional ones, and Long Short Term Memory (LSTM) is much faster and also more accurate than both standard Recurrent Neural Nets (RNNs) and time-windowed Multilayer Perceptrons (MLPs). Our results support the view that contextual information is crucial to speech processing, and suggest that BLSTM is an effective architecture with which to exploit it.
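The bidirectional, framewise setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a plain tanh recurrent layer stands in for the LSTM cell, the weights are random and untrained, and all names and sizes (`run_rnn`, `bidirectional_framewise`, `params`, the layer dimensions) are hypothetical. It only shows how each frame's class posterior can draw on both past and future frames.

```python
import math
import random

def make_matrix(rows, cols, rng):
    """Small random weight matrix stored as a list of rows (untrained)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    """Matrix-vector product for list-of-rows matrices."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def run_rnn(frames, w_in, w_rec):
    """Plain tanh recurrence (assumed stand-in for an LSTM layer).

    Returns one hidden vector per input frame, in input order.
    """
    h = [0.0] * len(w_rec)
    states = []
    for x in frames:
        pre = [a + b for a, b in zip(matvec(w_in, x), matvec(w_rec, h))]
        h = [math.tanh(p) for p in pre]
        states.append(h)
    return states

def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def bidirectional_framewise(frames, params):
    """Return one class-posterior vector per frame.

    The forward layer reads frames first-to-last; the backward layer reads
    them last-to-first and its states are re-reversed to align with time.
    """
    fwd = run_rnn(frames, params["w_in_f"], params["w_rec_f"])
    bwd = run_rnn(frames[::-1], params["w_in_b"], params["w_rec_b"])[::-1]
    # Concatenate both hidden states at each frame before the output layer.
    return [softmax(matvec(params["w_out"], hf + hb)) for hf, hb in zip(fwd, bwd)]

# Toy sizes chosen only for illustration.
rng = random.Random(0)
n_features, n_hidden, n_classes, n_frames = 3, 4, 5, 6
params = {
    "w_in_f": make_matrix(n_hidden, n_features, rng),
    "w_rec_f": make_matrix(n_hidden, n_hidden, rng),
    "w_in_b": make_matrix(n_hidden, n_features, rng),
    "w_rec_b": make_matrix(n_hidden, n_hidden, rng),
    "w_out": make_matrix(n_classes, 2 * n_hidden, rng),
}
frames = [[rng.uniform(-1, 1) for _ in range(n_features)] for _ in range(n_frames)]
posteriors = bidirectional_framewise(frames, params)
# Framewise decision: the most probable class index at every frame.
labels = [max(range(n_classes), key=p.__getitem__) for p in posteriors]
```

In this sketch, the forward state at a given frame depends only on that frame and earlier ones, while the backward state depends on that frame and later ones. Concatenating the two is what gives every frame access to context in both directions.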

Citations

Proceedings Article

Document Image Binarization using LSTM: A Sequence Learning Approach

Muhammad Zeshan Afzal, +5 more

TL;DR: The proposed approach significantly outperforms standard binarization approaches both for F-Measure and OCR accuracy with the availability of enough training samples.

Proceedings Article

Large-Scale Item Categorization in e-Commerce Using Multiple Recurrent Neural Networks

Jung-Woo Ha, +2 more

TL;DR: Experimental results show that the method improves categorization accuracy over both a single-RNN model and a standard classification model using unigram-based bag-of-words; the influence of the model parameters and the chosen attributes on categorization performance is also investigated.

Posted Content

Temporal Convolutional Neural Networks for Diagnosis from Lab Tests

Narges Razavian, +1 more

- 25 Nov 2015 -

arXiv: Learning

TL;DR: This work introduces a multi-resolution convolutional neural network for early detection of multiple diseases from irregularly measured sparse lab values and shows that the temporal signatures learned via convolution are significantly more predictive than baselines commonly used for early disease diagnosis.

Proceedings Article

Hardware architecture of Bidirectional Long Short-Term Memory Neural Network for Optical Character Recognition

Vladimir Rybalkin, +3 more

TL;DR: It is shown that a computationally intensive visual recognition task benefits from being migrated to a dedicated hardware accelerator, which outperforms a high-performance CPU in terms of runtime while consuming less energy than low-power systems, with negligible loss of recognition accuracy.

Posted Content

Gated Word-Character Recurrent Language Model

Yasumasa Miyamoto, +1 more

- 06 Jun 2016 -

arXiv: Computation and Language

TL;DR: This paper introduces a gate that adaptively finds the optimal mixture of the character-level and word-level inputs, combining the two representations into the final vector representation of a word.

References

Journal Article

Long short-term memory

Sepp Hochreiter, +1 more

- 01 Nov 1997 -

Neural Computation

TL;DR: A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.

Book

Neural networks for pattern recognition

Christopher M. Bishop

TL;DR: This is the first comprehensive treatment of feed-forward neural networks from the perspective of statistical pattern recognition, and is designed as a text, with over 100 exercises, to benefit anyone involved in the fields of neural computation and pattern recognition.

Journal Article

Bidirectional recurrent neural networks

Mike Schuster, +1 more

- 01 Nov 1997 -

IEEE Transactions on Signal Processing

TL;DR: It is shown how the proposed bidirectional structure can be easily modified to allow efficient estimation of the conditional posterior probability of complete symbol sequences without making any explicit assumption about the shape of the distribution.
