Practical recommendations for gradient-based training of deep architectures

Open Access, Posted Content

- 24 Jun 2012 -

TLDR

Overall, this chapter describes elements of the practice used to successfully and efficiently train and debug large-scale and often deep multi-layer neural networks and closes with open questions about the training difficulties observed with deeper architectures.

Abstract:

Learning algorithms related to artificial neural networks and in particular for Deep Learning may seem to involve many bells and whistles, called hyper-parameters. This chapter is meant as a practical guide with recommendations for some of the most commonly used hyper-parameters, in particular in the context of learning algorithms based on back-propagated gradient and gradient-based optimization. It also discusses how to deal with the fact that more interesting results can be obtained when allowing one to adjust many hyper-parameters. Overall, it describes elements of the practice used to successfully and efficiently train and debug large-scale and often deep multi-layer neural networks. It closes with open questions about the training difficulties observed with deeper architectures.

Citations

Journal Article

Representation Learning: A Review and New Perspectives

Yoshua Bengio, +2 more

- 01 Aug 2013 -

IEEE Transactions on Pattern Analysis an...

TL;DR: Recent work in the area of unsupervised feature learning and deep learning is reviewed, covering advances in probabilistic models, autoencoders, manifold learning, and deep networks.

Journal Article

A survey on deep learning in medical image analysis

Geert Litjens, +8 more

- 01 Dec 2017 -

Medical Image Analysis

TL;DR: This paper reviews the major deep learning concepts pertinent to medical image analysis and summarizes over 300 contributions to the field, most of which appeared in the last year, to survey the use of deep learning for image classification, object detection, segmentation, registration, and other tasks.

Journal Article

Brain tumor segmentation with Deep Neural Networks

Mohammad Havaei, +8 more

- 01 Jan 2017 -

Medical Image Analysis

TL;DR: A fast and accurate fully automatic method for brain tumor segmentation which is competitive both in terms of accuracy and speed compared to the state of the art, and introduces a novel cascaded architecture that allows the system to more accurately model local label dependencies.

Journal Article

Deep Learning in Remote Sensing: A Comprehensive Review and List of Resources

Xiao Xiang Zhu, +6 more

- 01 Dec 2017 -

IEEE Geoscience and Remote Sensing Magaz...

TL;DR: The challenges of using deep learning for remote-sensing data analysis are analyzed, recent advances are reviewed, and resources are provided that the authors hope will make deep learning in remote sensing seem ridiculously simple.

Journal Article

Methods for interpreting and understanding deep neural networks

Grégoire Montavon, +4 more

- 01 Feb 2018 -

Digital Signal Processing

TL;DR: The second part of the tutorial focuses on the recently proposed layer-wise relevance propagation (LRP) technique, for which the author provides theory, recommendations, and tricks to make the most efficient use of it on real data.

References

Journal Article

Gradient-based learning applied to document recognition

Yann LeCun, +6 more

TL;DR: In this article, a graph transformer network (GTN) is proposed for handwritten character recognition, which can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters.

Journal Article

Visualizing Data using t-SNE

Laurens van der Maaten, +1 more

- 01 Jan 2008 -

Journal of Machine Learning Research

TL;DR: A new technique called t-SNE visualizes high-dimensional data by giving each datapoint a location in a two- or three-dimensional map. It is a variation of Stochastic Neighbor Embedding that is much easier to optimize and produces significantly better visualizations by reducing the tendency to crowd points together in the center of the map.

Journal Article

Learning representations by back-propagating errors

David E. Rumelhart, +2 more

- 01 Jan 1988 -

Nature

TL;DR: Back-propagation repeatedly adjusts the weights of the connections in the network so as to minimize a measure of the difference between the actual output vector of the net and the desired output vector, which helps to represent important features of the task domain.

Journal Article

Bagging predictors

Leo Breiman

TL;DR: Tests on real and simulated data sets using classification and regression trees and subset selection in linear regression show that bagging can give substantial gains in accuracy.

Journal Article

A fast learning algorithm for deep belief nets

Geoffrey E. Hinton, +2 more

- 01 Jul 2006 -

Neural Computation

TL;DR: A fast, greedy algorithm is derived that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory.

Related Papers (5)

Dropout: a simple way to prevent neural networks from overfitting

Nitish Srivastava, +4 more

- 01 Jan 2014 -

Journal of Machine Learning Research

arXiv: Learning

ImageNet Classification with Deep Convolutional Neural Networks

Deep Learning

Gradient-based learning applied to document recognition

Adam: A Method for Stochastic Optimization