Open Access, Posted Content

Practical recommendations for gradient-based training of deep architectures

Abstract
Learning algorithms related to artificial neural networks and in particular for Deep Learning may seem to involve many bells and whistles, called hyper-parameters. This chapter is meant as a practical guide with recommendations for some of the most commonly used hyper-parameters, in particular in the context of learning algorithms based on back-propagated gradient and gradient-based optimization. It also discusses how to deal with the fact that more interesting results can be obtained when allowing one to adjust many hyper-parameters. Overall, it describes elements of the practice used to successfully and efficiently train and debug large-scale and often deep multi-layer neural networks. It closes with open questions about the training difficulties observed with deeper architectures.


Citations
Journal Article

Representation Learning: A Review and New Perspectives

TL;DR: Recent work in the area of unsupervised feature learning and deep learning is reviewed, covering advances in probabilistic models, autoencoders, manifold learning, and deep networks.
Journal Article

A survey on deep learning in medical image analysis

TL;DR: This paper reviews the major deep learning concepts pertinent to medical image analysis and summarizes over 300 contributions to the field, most of which appeared in the last year, to survey the use of deep learning for image classification, object detection, segmentation, registration, and other tasks.
Journal Article

Brain tumor segmentation with Deep Neural Networks

TL;DR: A fast and accurate fully automatic method for brain tumor segmentation which is competitive both in terms of accuracy and speed compared to the state of the art, and introduces a novel cascaded architecture that allows the system to more accurately model local label dependencies.
Journal Article

Deep Learning in Remote Sensing: A Comprehensive Review and List of Resources

TL;DR: The challenges of using deep learning for remote-sensing data analysis are analyzed, recent advances are reviewed, and resources are provided that the authors hope will make deep learning in remote sensing seem ridiculously simple.
Journal Article

Methods for interpreting and understanding deep neural networks

TL;DR: The second part of the tutorial focuses on the recently proposed layer-wise relevance propagation (LRP) technique, for which the author provides theory, recommendations, and tricks, to make most efficient use of it on real data.
References
Journal Article

Gradient-based learning applied to document recognition

TL;DR: In this article, a graph transformer network (GTN) is proposed for handwritten character recognition, which can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters.
Journal Article

Visualizing Data using t-SNE

TL;DR: A new technique called t-SNE that visualizes high-dimensional data by giving each datapoint a location in a two or three-dimensional map, a variation of Stochastic Neighbor Embedding that is much easier to optimize, and produces significantly better visualizations by reducing the tendency to crowd points together in the center of the map.
Journal Article

Learning representations by back-propagating errors

TL;DR: Back-propagation repeatedly adjusts the weights of the connections in the network so as to minimize a measure of the difference between the actual output vector of the net and the desired output vector, which helps to represent important features of the task domain.
Journal Article

Bagging predictors

Leo Breiman
TL;DR: Tests on real and simulated data sets using classification and regression trees and subset selection in linear regression show that bagging can give substantial gains in accuracy.
Journal Article

A fast learning algorithm for deep belief nets

TL;DR: A fast, greedy algorithm is derived that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory.