Training and Analysing Deep Recurrent Neural Networks

Open Access Proceedings Article

Michiel Hermans, +1 more

- Vol. 26, pp 190-198

TL;DR

This work studies the effect of a hierarchy of recurrent neural networks on processing time series, and shows that they reach state-of-the-art performance for recurrent networks in character-level language modelling when trained with stochastic gradient descent.

Abstract:

Time series often have a temporal hierarchy, with information that is spread out over multiple time scales. Common recurrent neural networks, however, do not explicitly accommodate such a hierarchy, and most research on them has been focusing on training algorithms rather than on their basic architecture. In this paper we study the effect of a hierarchy of recurrent neural networks on processing time series. Here, each layer is a recurrent network which receives the hidden state of the previous layer as input. This architecture allows us to perform hierarchical processing on difficult temporal tasks, and more naturally capture the structure of time series. We show that they reach state-of-the-art performance for recurrent networks in character-level language modelling when trained with simple stochastic gradient descent. We also offer an analysis of the different emergent time scales.
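The stacking described in the abstract, where each layer is a recurrent network that takes the hidden state of the layer below as its input, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the tanh nonlinearity, the layer sizes, the random initialisation, and the names `init_layers` and `forward` are all assumptions chosen for illustration, and no output layer or training step is included.

```python
import numpy as np


def init_layers(input_size, hidden_sizes, seed=0):
    """Create weights for a stack of simple recurrent layers (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    layers = []
    prev_size = input_size
    for size in hidden_sizes:
        layers.append({
            # Weights from the layer below (or the raw input for the first layer).
            "W_in": rng.normal(0.0, 0.1, size=(size, prev_size)),
            # Recurrent weights from this layer's own previous hidden state.
            "W_rec": rng.normal(0.0, 0.1, size=(size, size)),
            "b": np.zeros(size),
        })
        prev_size = size
    return layers


def forward(layers, inputs):
    """Run a sequence of shape (T, input_size) through the stack.

    Returns one array of shape (T, hidden_size) per layer.
    """
    states = [np.zeros(layer["b"].shape[0]) for layer in layers]
    history = [[] for _ in layers]
    for x_t in inputs:
        below = x_t
        for i, layer in enumerate(layers):
            # Each layer combines the current state of the layer below
            # with its own state from the previous time step.
            states[i] = np.tanh(
                layer["W_in"] @ below + layer["W_rec"] @ states[i] + layer["b"]
            )
            history[i].append(states[i])
            below = states[i]
    return [np.stack(h) for h in history]


# Assumed toy dimensions: 4 input features, three layers of 8 units, 10 time steps.
layers = init_layers(input_size=4, hidden_sizes=[8, 8, 8])
inputs = np.random.default_rng(1).normal(size=(10, 4))
per_layer_states = forward(layers, inputs)
```

Keeping the per-layer state histories separate, as `forward` does here, is one way to inspect how each layer responds over time, which is the kind of per-layer analysis the abstract alludes to.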

Citations

Book

Deep Learning: Methods and Applications

Li Deng, +1 more

TL;DR: This monograph provides an overview of general deep learning methodology and its applications to a variety of signal and information processing tasks, including natural language and text processing, information retrieval, and multimodal information processing empowered by multi-task deep learning.

Posted Content

An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling

Shaojie Bai, +2 more

- 04 Mar 2018 -

arXiv: Learning

TL;DR: A systematic evaluation of generic convolutional and recurrent architectures for sequence modeling concludes that the common association between sequence modeling and recurrent networks should be reconsidered, and that convolutional networks should be regarded as a natural starting point for sequence modeling tasks.

Proceedings Article

Long Short-Term Memory Recurrent Neural Network Architectures for Large Scale Acoustic Modeling

Hasim Sak, +2 more

TL;DR: The first distributed training of LSTM RNNs using asynchronous stochastic gradient descent optimization on a large cluster of machines is introduced, and it is shown that a two-layer deep LSTM RNN where each LSTM layer has a linear recurrent projection layer can exceed state-of-the-art speech recognition performance.

Posted Content

Visualizing and Understanding Recurrent Networks

Andrej Karpathy, +2 more

- 05 Jun 2015 -

arXiv: Learning

TL;DR: This work uses character-level language models as an interpretable testbed to provide an analysis of LSTM representations, predictions and error types, and reveals the existence of interpretable cells that keep track of long-range dependencies such as line lengths, quotes and brackets.

Proceedings Article

Long Short Term Memory Networks for Anomaly Detection in Time Series.

Pankaj Malhotra, +3 more

TL;DR: The efficacy of stacked LSTM networks for anomaly/fault detection in time series on ECG, space shuttle, power demand, and multi-sensor engine dataset is demonstrated.

References

Journal Article

Long short-term memory

Sepp Hochreiter, +1 more

- 01 Nov 1997 -

Neural Computation

TL;DR: A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.

Journal Article

Reducing the Dimensionality of Data with Neural Networks

Geoffrey E. Hinton, +1 more

- 28 Jul 2006 -

Science

TL;DR: In this article, an effective way of initializing the weights that allows deep autoencoder networks to learn low-dimensional codes that work much better than principal components analysis as a tool to reduce the dimensionality of data is described.

Journal Article

A fast learning algorithm for deep belief nets

Geoffrey E. Hinton, +2 more

- 01 Jul 2006 -

Neural Computation

TL;DR: A fast, greedy algorithm is derived that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory.

Proceedings Article

Speech recognition with deep recurrent neural networks

Alex Graves, +2 more

TL;DR: This paper investigates deep recurrent neural networks, which combine the multiple levels of representation that have proved so effective in deep networks with the flexible use of long range context that empowers RNNs.

Proceedings Article

Extracting and composing robust features with denoising autoencoders

Pascal Vincent, +3 more

TL;DR: This work introduces and motivates a new training principle for unsupervised learning of a representation based on the idea of making the learned representations robust to partial corruption of the input pattern.
