Framewise phoneme classification with bidirectional LSTM and other neural network architectures

Open Access Proceedings Article

Alex Graves, +1 more

- Vol. 18, pp 602-610

TLDR

A modified, full gradient version of the LSTM learning algorithm is applied to framewise phoneme classification on the TIMIT database. The results support the view that contextual information is crucial to speech processing, and suggest that bidirectional networks outperform unidirectional ones.

Abstract:

In this paper, we present bidirectional Long Short Term Memory (LSTM) networks, and a modified, full gradient version of the LSTM learning algorithm. We evaluate Bidirectional LSTM (BLSTM) and several other network architectures on the benchmark task of framewise phoneme classification, using the TIMIT database. Our main findings are that bidirectional networks outperform unidirectional ones, and that LSTM is much faster and also more accurate than both standard Recurrent Neural Nets (RNNs) and time-windowed Multilayer Perceptrons (MLPs). Our results support the view that contextual information is crucial to speech processing, and suggest that BLSTM is an effective architecture with which to exploit it.
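The idea of framewise classification with a bidirectional network can be illustrated with a minimal forward-pass sketch in plain Python. This is not the authors' implementation: it assumes a common LSTM formulation with input, forget and output gates (which may differ from the variant used in the paper), random untrained weights, and toy dimensions. The names `LSTMLayer`, `blstm_framewise`, `rand_matrix` and the sizes below are hypothetical and chosen only for illustration. One layer reads the frames left to right, a second reads them right to left, and each frame is classified from the two hidden states taken together.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng, scale=0.1):
    # Hypothetical helper: small random weights, untrained.
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def softmax(v):
    m = max(v)
    e = [math.exp(a - m) for a in v]
    s = sum(e)
    return [a / s for a in e]


class LSTMLayer:
    """Forward pass of one LSTM layer (assumed variant: input, forget, output gates)."""

    def __init__(self, n_in, n_hidden, rng):
        self.n_hidden = n_hidden
        # One weight matrix per gate, acting on [frame, previous hidden state, bias].
        self.W = {g: rand_matrix(n_hidden, n_in + n_hidden + 1, rng)
                  for g in ("i", "f", "o", "g")}

    def run(self, frames):
        h = [0.0] * self.n_hidden  # hidden state
        c = [0.0] * self.n_hidden  # cell state
        outputs = []
        for x in frames:
            z = list(x) + h + [1.0]
            i = [sigmoid(a) for a in matvec(self.W["i"], z)]
            f = [sigmoid(a) for a in matvec(self.W["f"], z)]
            o = [sigmoid(a) for a in matvec(self.W["o"], z)]
            g = [math.tanh(a) for a in matvec(self.W["g"], z)]
            c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
            h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
            outputs.append(h)
        return outputs


def blstm_framewise(frames, fwd, bwd, W_out):
    """Return one class-probability vector per frame."""
    h_fwd = fwd.run(frames)              # past context for each frame
    h_bwd = bwd.run(frames[::-1])[::-1]  # future context, re-aligned to frame order
    return [softmax(matvec(W_out, hf + hb + [1.0]))
            for hf, hb in zip(h_fwd, h_bwd)]


# Toy usage: 6 frames of 4 features each, 5 output classes (all sizes are assumptions).
rng = random.Random(0)
n_features, n_hidden, n_classes = 4, 3, 5
frames = [[rng.uniform(-1, 1) for _ in range(n_features)] for _ in range(6)]
fwd = LSTMLayer(n_features, n_hidden, rng)
bwd = LSTMLayer(n_features, n_hidden, rng)
W_out = rand_matrix(n_classes, 2 * n_hidden + 1, rng)
posteriors = blstm_framewise(frames, fwd, bwd, W_out)
labels = [max(range(n_classes), key=lambda k: p[k]) for p in posteriors]
```

In this sketch the forward layer's output at a given frame depends only on that frame and earlier ones, while the combined output also depends on later frames through the backward layer. That access to context on both sides is the property the abstract attributes to bidirectional networks.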

Citations

Posted Content

Failures of Gradient-Based Deep Learning

Shai Shalev-Shwartz, +2 more

- 23 Mar 2017 -

arXiv: Learning

TL;DR: This work describes four types of simple problems, for which the gradient-based algorithms commonly used in deep learning either fail or suffer from significant difficulties.

Journal ArticleDOI

Exploiting deep neural networks for detection-based speech recognition

Sabato Marco Siniscalchi, +3 more

- 01 Apr 2013 -

Neurocomputing

TL;DR: It is shown that DNNs can be used to boost the classification accuracy of basic speech units, such as phonetic attributes (phonological features) and phonemes, and that this yields improved word recognition accuracy, better than previously reported word lattice rescoring results.

Proceedings ArticleDOI

ECG-based biometrics using recurrent neural networks

Ronald Salloum, +1 more

TL;DR: This work proposes recurrent neural networks (RNNs) as an effective solution to two problems in electrocardiogram (ECG)-based biometrics: identification/classification and authentication.

Proceedings ArticleDOI

Deep Learning approach for sentiment analysis of short texts

Abdalraouf Hassan, +1 more

TL;DR: Empirical results show that ConvLstm achieved comparable performance with fewer parameters on sentiment analysis tasks. The model uses an LSTM as a substitute for the pooling layer in a CNN, to reduce the loss of detailed local information and to capture long-term dependencies in sequences of sentences.

Proceedings ArticleDOI

Context-sensitive learning for enhanced audiovisual emotion classification (Extended abstract)

Angeliki Metallinou, +5 more

TL;DR: The experimental results indicate that incorporating long-term temporal context is beneficial for emotion recognition systems that encounter a variety of emotional manifestations and context-sensitive approaches outperform those without context for classification tasks such as discrimination between valence levels or between clusters in the valence-activation space.

References

Journal ArticleDOI

Long short-term memory

Sepp Hochreiter, +1 more

- 01 Nov 1997 -

Neural Computation

TL;DR: A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.

Book

Neural networks for pattern recognition

Christopher M. Bishop

TL;DR: This is the first comprehensive treatment of feed-forward neural networks from the perspective of statistical pattern recognition, and is designed as a text, with over 100 exercises, to benefit anyone involved in the fields of neural computation and pattern recognition.

Journal ArticleDOI

Bidirectional recurrent neural networks

Mike Schuster, +1 more

- 01 Nov 1997 -

IEEE Transactions on Signal Processing

TL;DR: It is shown how the proposed bidirectional structure can be easily modified to allow efficient estimation of the conditional posterior probability of complete symbol sequences without making any explicit assumption about the shape of the distribution.
