Open Access Proceedings Article

Framewise phoneme classification with bidirectional LSTM and other neural network architectures

Alex Graves, Jürgen Schmidhuber
Vol. 18, pp. 602–610
TLDR
A modified, full-gradient version of the LSTM learning algorithm is applied to framewise phoneme classification on the TIMIT database; the results support the view that contextual information is crucial to speech processing, and suggest that bidirectional networks outperform unidirectional ones.
Abstract
In this paper, we present bidirectional Long Short Term Memory (LSTM) networks, and a modified, full gradient version of the LSTM learning algorithm. We evaluate Bidirectional LSTM (BLSTM) and several other network architectures on the benchmark task of framewise phoneme classification, using the TIMIT database. Our main findings are that bidirectional networks outperform unidirectional ones, and Long Short Term Memory (LSTM) is much faster and also more accurate than both standard Recurrent Neural Nets (RNNs) and time-windowed Multilayer Perceptrons (MLPs). Our results support the view that contextual information is crucial to speech processing, and suggest that BLSTM is an effective architecture with which to exploit it.
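As a rough illustration of the setup the abstract describes, the sketch below builds a bidirectional LSTM that emits one phoneme posterior per input frame. It uses PyTorch rather than the authors' implementation, and the feature, hidden, and class sizes (26 inputs, 100 units per direction, 61 TIMIT phoneme classes) are assumptions for illustration, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class FramewiseBLSTM(nn.Module):
    """Bidirectional LSTM emitting one phoneme posterior per frame.

    All sizes are illustrative guesses, not the paper's exact setup."""
    def __init__(self, n_features=26, n_hidden=100, n_phonemes=61):
        super().__init__()
        self.blstm = nn.LSTM(n_features, n_hidden,
                             batch_first=True, bidirectional=True)
        # Forward and backward hidden states are concatenated: 2 * n_hidden.
        self.out = nn.Linear(2 * n_hidden, n_phonemes)

    def forward(self, frames):              # frames: (batch, time, features)
        context, _ = self.blstm(frames)     # (batch, time, 2 * n_hidden)
        return self.out(context)            # per-frame class scores

model = FramewiseBLSTM()
logits = model(torch.randn(8, 300, 26))     # 8 utterances, 300 frames each
loss = nn.CrossEntropyLoss()(logits.flatten(0, 1),
                             torch.randint(0, 61, (8 * 300,)))
```

Because every frame gets its own label, the loss is an ordinary per-frame cross-entropy; the bidirectional layer is what gives each frame access to both past and future context.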



Citations
Journal Article

Intra- and inter-epoch temporal context network (IITNet) using sub-epoch features for automatic sleep scoring on raw single-channel EEG

TL;DR: The results suggest that the most recent two minutes of raw single-channel EEG are a reasonable input for efficient and reliable sleep scoring with deep neural networks, and that intra-epoch temporal context learning with a deep residual network improves overall performance.
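For orientation only, here is a minimal sketch of the two-level idea that summary describes: a small 1-D CNN encodes each sub-epoch of raw EEG (intra-epoch context), and a bidirectional LSTM relates sub-epochs across the recent past (inter-epoch context). All layer sizes, and the five sleep-stage classes, are assumptions; this is not the published IITNet.

```python
import torch
import torch.nn as nn

class IITNetSketch(nn.Module):
    """Sub-epoch CNN encoder + bidirectional LSTM over sub-epochs.
    Sizes are invented for illustration, not taken from the paper."""
    def __init__(self, n_classes=5, feat=64):
        super().__init__()
        self.encoder = nn.Sequential(          # per-sub-epoch 1-D CNN
            nn.Conv1d(1, feat, kernel_size=7, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1))
        self.context = nn.LSTM(feat, feat, batch_first=True,
                               bidirectional=True)
        self.head = nn.Linear(2 * feat, n_classes)

    def forward(self, x):                      # x: (batch, sub_epochs, samples)
        b, s, t = x.shape
        z = self.encoder(x.reshape(b * s, 1, t)).squeeze(-1)  # (b*s, feat)
        ctx, _ = self.context(z.reshape(b, s, -1))            # (b, s, 2*feat)
        return self.head(ctx[:, -1])           # score the current epoch

scores = IITNetSketch()(torch.randn(4, 40, 600))  # -> (4, 5)
```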
Proceedings Article

Binarized-BLSTM-RNN based Human Activity Recognition

TL;DR: A binarized recurrent neural network whose weight parameters, inputs, and intermediate hidden-layer outputs are all binary-valued, so evaluation and training require only basic bit logic, which is expected to substantially improve power efficiency.
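A minimal sketch of the generic binarization trick that summary alludes to: constrain values to {-1, +1} in the forward pass and use a straight-through estimator in the backward pass so gradient-based training still works. This is a common recipe, not necessarily the cited paper's exact scheme.

```python
import torch

class BinarizeSTE(torch.autograd.Function):
    """Forward: map values to {-1, +1}. Backward: pass gradients through
    unchanged (straight-through estimator), since the sign function has
    zero gradient almost everywhere."""
    @staticmethod
    def forward(ctx, x):
        # torch.sign would map exact zeros to 0, so use an explicit where.
        return torch.where(x >= 0, torch.ones_like(x), -torch.ones_like(x))

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output

real_weights = torch.randn(4, 4, requires_grad=True)
binary_weights = BinarizeSTE.apply(real_weights)  # used in matmuls instead
```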
Proceedings Article

Morphological Inflection Generation with Hard Monotonic Attention

TL;DR: Presents a neural model for morphological inflection generation that employs a hard attention mechanism, inspired by the nearly monotonic alignment commonly found between the characters of a word and its inflection.
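The control loop behind hard monotonic attention can be sketched in a few lines: the model attends to exactly one input character at a time and, at each step, either writes an output character or advances the attention pointer one position to the right. The `policy` function below is a hypothetical stand-in for the trained network.

```python
def hard_monotonic_decode(input_chars, policy, max_steps=100):
    """Greedy decoding with hard monotonic attention (sketch).

    `policy(attended_char, output_so_far)` is assumed to return either
    the special action "<step>" or an output character to write."""
    pos, output = 0, []
    for _ in range(max_steps):
        action = policy(input_chars[pos], output)
        if action == "<step>":
            if pos == len(input_chars) - 1:
                break                    # input fully consumed
            pos += 1                     # move attention monotonically right
        else:
            output.append(action)        # write an output character
    return "".join(output)
```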
Journal Article

A hybrid information model based on long short-term memory network for tool condition monitoring

TL;DR: A hybrid information model for tool condition monitoring that combines a stacked long short-term memory (LSTM) network with a nonlinear regression model to predict tool wear.
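As a rough sketch of how a stacked LSTM feeding a nonlinear regression head might be wired for this kind of task; the sensor count, depth, and layer sizes below are invented for illustration and are not taken from the cited paper.

```python
import torch
import torch.nn as nn

class ToolWearRegressor(nn.Module):
    """Stacked LSTM over windows of sensor readings, followed by a small
    nonlinear regression head mapping the final hidden state to a scalar
    wear estimate. All sizes are illustrative only."""
    def __init__(self, n_sensors=7, n_hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_sensors, n_hidden, num_layers=2,
                            batch_first=True)
        self.regressor = nn.Sequential(
            nn.Linear(n_hidden, 32), nn.ReLU(), nn.Linear(32, 1))

    def forward(self, x):               # x: (batch, time, n_sensors)
        _, (h_n, _) = self.lstm(x)      # h_n: (num_layers, batch, n_hidden)
        return self.regressor(h_n[-1])  # one wear estimate per sequence

wear = ToolWearRegressor()(torch.randn(16, 200, 7))  # -> (16, 1)
```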
Journal Article

Tracking continuous emotional trends of participants during affective dyadic interactions using body language and speech information

TL;DR: A Gaussian mixture model-based approach computes a mapping from observed audio-visual cues to an underlying emotional state, and sheds light on how expressive body language is modulated by underlying emotional states in dyadic interactions.
References
Journal Article

Long short-term memory

TL;DR: A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
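The "constant error carousel" is the memory cell's additive self-connection. A single step of the forget-gated LSTM update below makes the idea concrete; this is the standard modern formulation rather than the exact 1997 variant, and the weight layout is an assumption for illustration.

```python
import torch

def lstm_step(x, h, c, W, U, b):
    """One LSTM time step. The cell update c = f * c + i * g is the
    'constant error carousel': an additive path along which error can
    flow over long time lags when the forget gate f stays near 1.

    Shapes: x (batch, n_in), h and c (batch, n_hidden),
            W (n_in, 4*n_hidden), U (n_hidden, 4*n_hidden), b (4*n_hidden)."""
    gates = x @ W + h @ U + b
    i, f, g, o = gates.chunk(4, dim=-1)          # input, forget, cell, output
    i, f, o = i.sigmoid(), f.sigmoid(), o.sigmoid()
    c = f * c + i * g.tanh()                     # additive carousel update
    h = o * c.tanh()
    return h, c
```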
Book

Neural networks for pattern recognition

TL;DR: This is the first comprehensive treatment of feed-forward neural networks from the perspective of statistical pattern recognition, and is designed as a text, with over 100 exercises, to benefit anyone involved in the fields of neural computation and pattern recognition.
Journal Article

Bidirectional recurrent neural networks

TL;DR: It is shown how the proposed bidirectional structure can be easily modified to allow efficient estimation of the conditional posterior probability of complete symbol sequences without making any explicit assumption about the shape of the distribution.

Gradient Flow in Recurrent Nets: the Difficulty of Learning Long-Term Dependencies

TL;DR: Analyzes why error gradients in standard recurrent networks decay or blow up exponentially with the length of the time lag, making long-term dependencies hard to learn, and discusses remedies such as LSTM.