scispace - formally typeset
Open AccessProceedings Article

Training and Analysing Deep Recurrent Neural Networks

Michiel Hermans, +1 more
- Vol. 26, pp 190-198
Reads0
Chats0
TLDR
This work studies the effect of a hierarchy of recurrent neural networks on processing time series, and shows that they reach state-of-the-art performance for recurrent networks in character-level language modelling when trained with stochastic gradient descent.
Abstract
Time series often have a temporal hierarchy, with information that is spread out over multiple time scales. Common recurrent neural networks, however, do not explicitly accommodate such a hierarchy, and most research on them has been focusing on training algorithms rather than on their basic architecture. In this pa- per we study the effect of a hierarchy of recurrent neural networks on processing time series. Here, each layer is a recurrent network which receives the hidden state of the previous layer as input. This architecture allows us to perform hi- erarchical processing on difficult temporal tasks, and more naturally capture the structure of time series. We show that they reach state-of-the-art performance for recurrent networks in character-level language modelling when trained with sim- ple stochastic gradient descent. We also offer an analysis of the different emergent time scales.

read more

Content maybe subject to copyright    Report

Citations
More filters
Book

Deep Learning: Methods and Applications

Li Deng, +1 more
TL;DR: This monograph provides an overview of general deep learning methodology and its applications to a variety of signal and information processing tasks, including natural language and text processing, information retrieval, and multimodal information processing empowered by multi-task deep learning.
Posted Content

An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling

TL;DR: A systematic evaluation of generic convolutional and recurrent architectures for sequence modeling concludes that the common association between sequence modeling and recurrent networks should be reconsidered, and convolutionals should be regarded as a natural starting point for sequence modeled tasks.
Proceedings ArticleDOI

Long Short-Term Memory Recurrent Neural Network Architectures for Large Scale Acoustic Modeling

TL;DR: The first distributed training of LSTM RNNs using asynchronous stochastic gradient descent optimization on a large cluster of machines is introduced and it is shown that a two-layer deep LSTm RNN where each L STM layer has a linear recurrent projection layer can exceed state-of-the-art speech recognition performance.
Posted Content

Visualizing and Understanding Recurrent Networks

TL;DR: This work uses character-level language models as an interpretable testbed to provide an analysis of LSTM representations, predictions and error types, and reveals the existence of interpretable cells that keep track of long-range dependencies such as line lengths, quotes and brackets.
Proceedings Article

Long Short Term Memory Networks for Anomaly Detection in Time Series.

TL;DR: The efficacy of stacked LSTM networks for anomaly/fault detection in time series on ECG, space shuttle, power demand, and multi-sensor engine dataset is demonstrated.
References
More filters
Journal ArticleDOI

Long short-term memory

TL;DR: A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
Journal ArticleDOI

Reducing the Dimensionality of Data with Neural Networks

TL;DR: In this article, an effective way of initializing the weights that allows deep autoencoder networks to learn low-dimensional codes that work much better than principal components analysis as a tool to reduce the dimensionality of data is described.
Journal ArticleDOI

A fast learning algorithm for deep belief nets

TL;DR: A fast, greedy algorithm is derived that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory.
Proceedings ArticleDOI

Speech recognition with deep recurrent neural networks

TL;DR: This paper investigates deep recurrent neural networks, which combine the multiple levels of representation that have proved so effective in deep networks with the flexible use of long range context that empowers RNNs.
Proceedings ArticleDOI

Extracting and composing robust features with denoising autoencoders

TL;DR: This work introduces and motivate a new training principle for unsupervised learning of a representation based on the idea of making the learned representations robust to partial corruption of the input pattern.