Journal ArticleDOI

A fast learning algorithm for deep belief nets

TLDR
A fast, greedy algorithm is derived that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory.
Abstract
We show how to use "complementary priors" to eliminate the explaining-away effects that make inference difficult in densely connected belief nets that have many hidden layers. Using complementary priors, we derive a fast, greedy algorithm that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory. The fast, greedy algorithm is used to initialize a slower learning procedure that fine-tunes the weights using a contrastive version of the wake-sleep algorithm. After fine-tuning, a network with three hidden layers forms a very good generative model of the joint distribution of handwritten digit images and their labels. This generative model gives better digit classification than the best discriminative learning algorithms. The low-dimensional manifolds on which the digits lie are modeled by long ravines in the free-energy landscape of the top-level associative memory, and it is easy to explore these ravines by using the directed connections to display what the associative memory has in mind.
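To make the layer-wise procedure concrete, here is a minimal NumPy sketch of greedy pretraining with one-step contrastive divergence (CD-1), the per-layer update used to train each restricted Boltzmann machine; the random stand-in data, learning rate, and epoch count are illustrative assumptions, with hidden-layer sizes echoing the paper's digit model.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample(p):
    # Sample binary units from their Bernoulli probabilities.
    return (rng.random(p.shape) < p).astype(p.dtype)

def train_rbm(v_data, n_hidden, lr=0.05, epochs=10):
    """Train one RBM layer with CD-1; returns the weights and a
    function mapping visible activities to hidden probabilities."""
    n_visible = v_data.shape[1]
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    b_v = np.zeros(n_visible)   # visible biases
    b_h = np.zeros(n_hidden)    # hidden biases
    for _ in range(epochs):
        # Positive phase: drive the hidden units from the data.
        p_h = sigmoid(v_data @ W + b_h)
        h = sample(p_h)
        # Negative phase: one reconstruction step (CD-1).
        p_v = sigmoid(h @ W.T + b_v)
        p_h_recon = sigmoid(p_v @ W + b_h)
        n = v_data.shape[0]
        W += lr * (v_data.T @ p_h - p_v.T @ p_h_recon) / n
        b_v += lr * (v_data - p_v).mean(axis=0)
        b_h += lr * (p_h - p_h_recon).mean(axis=0)
    return W, lambda v: sigmoid(v @ W + b_h)

# Greedy stacking: each layer is trained on the hidden activities
# of the layer below, exactly one layer at a time.
data = (rng.random((1000, 784)) < 0.5).astype(float)  # stand-in for binarized digits
layer_sizes = [500, 500, 2000]
activities, layers = data, []
for n_hidden in layer_sizes:
    W, up = train_rbm(activities, n_hidden)
    layers.append(W)
    activities = up(activities)
```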



Citations
Journal ArticleDOI

Deep learning

TL;DR: Deep learning is making major advances in solving problems that have resisted the best attempts of the artificial intelligence community for many years, and will have many more successes in the near future because it requires very little engineering by hand and can easily take advantage of increases in the amount of available computation and data.
Journal ArticleDOI

Generative Adversarial Nets

TL;DR: A new framework for estimating generative models via an adversarial process, in which two models are trained simultaneously: a generative model G that captures the data distribution and a discriminative model D that estimates the probability that a sample came from the training data rather than G.
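As a toy illustration of the alternating updates in that minimax game, the sketch below trains a one-parameter generator against a logistic discriminator on 1-D Gaussian data; the model forms, learning rate, and the non-saturating generator objective are illustrative assumptions, not the paper's networks.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

# Toy setup: real data is N(4, 1); the generator shifts noise by theta.
a, b = 1.0, 0.0      # discriminator D(x) = sigmoid(a*x + b)
theta = -2.0         # generator g(z) = z + theta
lr, n = 0.05, 256

for step in range(2000):
    x_real = rng.normal(4.0, 1.0, n)
    x_fake = rng.normal(0.0, 1.0, n) + theta
    d_real, d_fake = sigmoid(a * x_real + b), sigmoid(a * x_fake + b)
    # Discriminator ascends E[log D(x)] + E[log(1 - D(g(z)))].
    a += lr * ((1 - d_real) * x_real - d_fake * x_fake).mean()
    b += lr * ((1 - d_real) - d_fake).mean()
    # Generator ascends the non-saturating objective E[log D(g(z))].
    d_fake = sigmoid(a * x_fake + b)
    theta += lr * ((1 - d_fake) * a).mean()

print(f"learned generator shift theta = {theta:.2f}; real data mean is 4.0")
```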
Book

Deep Learning

TL;DR: Deep learning is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts; it is used in many applications such as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and video games.
Book

Reinforcement Learning: An Introduction

TL;DR: This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, ranging from the history of the field's intellectual foundations to the most recent developments and applications.
Journal Article

Dropout: a simple way to prevent neural networks from overfitting

TL;DR: It is shown that dropout improves the performance of neural networks on supervised learning tasks in vision, speech recognition, document classification and computational biology, obtaining state-of-the-art results on many benchmark data sets.
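A minimal sketch of the mechanism, using the now-common "inverted" scaling so that nothing changes at test time (the paper itself instead rescales the weights at test time); the shapes and drop probability are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(activations, p_drop=0.5, train=True):
    """Inverted dropout: zero each unit with probability p_drop during
    training and rescale the survivors, so test time needs no change."""
    if not train:
        return activations
    mask = rng.random(activations.shape) >= p_drop
    return activations * mask / (1.0 - p_drop)

h = rng.standard_normal((4, 8))      # a batch of hidden activations
h_train = dropout(h, p_drop=0.5)     # thinned network seen during training
h_test = dropout(h, train=False)     # full network used at test time
```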
References
Journal ArticleDOI

Projection Pursuit Regression

TL;DR: A nonparametric multiple regression method is presented that models the regression surface as a sum of general smooth functions of linear combinations of the predictor variables, fitted in an iterative manner.
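The model form is f(x) = sum over m of g_m(w_m . x). The sketch below fits one stage at a time to the current residuals, using random candidate directions and cubic-polynomial ridge functions as illustrative stand-ins for the paper's optimized directions and nonparametric smoother.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_ppr(X, y, n_stages=3, n_candidates=200):
    """Greedy projection pursuit: each stage fits a smooth ridge
    function of one projection to the current residuals."""
    stages, resid = [], y.astype(float).copy()
    for _ in range(n_stages):
        best = None
        for _ in range(n_candidates):
            w = rng.standard_normal(X.shape[1])
            w /= np.linalg.norm(w)
            z = X @ w
            coef = np.polyfit(z, resid, deg=3)     # ridge function g_m
            err = np.mean((resid - np.polyval(coef, z)) ** 2)
            if best is None or err < best[0]:
                best = (err, w, coef)
        _, w, coef = best
        resid -= np.polyval(coef, X @ w)           # iterate on the residuals
        stages.append((w, coef))
    return stages

def predict_ppr(stages, X):
    return sum(np.polyval(c, X @ w) for w, c in stages)

X = rng.standard_normal((500, 5))
y = np.sin(X[:, 0] + X[:, 1]) + 0.1 * rng.standard_normal(500)
model = fit_ppr(X, y)
```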
Book ChapterDOI

A view of the EM algorithm that justifies incremental, sparse, and other variants

TL;DR: An incremental variant of the EM algorithm is proposed, in which each E step recalculates the distribution for only one of the unobserved variables; it is shown empirically to give faster convergence in a mixture estimation problem.
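A minimal sketch of the incremental variant for a two-component 1-D Gaussian mixture; unit variances and equal mixing weights are simplifying assumptions that keep the E and M steps short.

```python
import numpy as np

rng = np.random.default_rng(0)

# Each partial E step updates one point's responsibilities and
# patches the sufficient statistics, rather than sweeping all data.
x = np.concatenate([rng.normal(-2, 1, 100), rng.normal(3, 1, 100)])
mu = np.array([-1.0, 1.0])                 # initial component means
r = np.full((x.size, 2), 0.5)              # responsibilities
S = r.T @ x                                # sum_i r_ik * x_i
N = r.sum(axis=0)                          # sum_i r_ik

for _ in range(5):
    for i in range(x.size):
        # Partial E step for point i (equal weights, unit variances).
        log_p = -0.5 * (x[i] - mu) ** 2
        r_new = np.exp(log_p - log_p.max())
        r_new /= r_new.sum()
        # Patch the statistics, then do the M step immediately.
        S += (r_new - r[i]) * x[i]
        N += r_new - r[i]
        r[i] = r_new
        mu = S / N
```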
Journal ArticleDOI

Boosting a weak learning algorithm by majority

TL;DR: An algorithm is presented for improving the accuracy of algorithms that learn binary concepts, by combining a large number of hypotheses, each generated by training the given learning algorithm on a different set of examples.
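A simplified committee in the spirit of this result: weak hypotheses are trained on differently resampled example sets that concentrate on previously misclassified points, then combined by unweighted majority vote. The resampling rule here is an illustrative stand-in for the paper's filtering scheme.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_stump(X, y):
    """Weak learner: best single-feature threshold classifier."""
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for s in (1, -1):
                pred = np.where(X[:, j] > t, s, -s)
                err = np.mean(pred != y)
                if best is None or err < best[0]:
                    best = (err, j, t, s)
    _, j, t, s = best
    return lambda Z: np.where(Z[:, j] > t, s, -s)

def boost_majority(X, y, n_rounds=15):
    """Train weak hypotheses on different example sets and combine
    them by unweighted majority vote."""
    weights = np.ones(len(y)) / len(y)
    hyps = []
    for _ in range(n_rounds):
        idx = rng.choice(len(y), size=len(y), p=weights)
        h = fit_stump(X[idx], y[idx])
        hyps.append(h)
        miss = (h(X) != y)
        weights = np.where(miss, weights * 2.0, weights)  # focus on errors
        weights /= weights.sum()
    return lambda Z: np.sign(sum(h(Z) for h in hyps))

X = rng.standard_normal((200, 2))
y = np.sign(X[:, 0] + X[:, 1])
clf = boost_majority(X, y)
```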
Journal ArticleDOI

Optimal Unsupervised Learning in a Single-Layer Linear Feedforward Neural Network

TL;DR: An optimality principle based on preserving maximal information in the output units is proposed, together with an unsupervised learning algorithm, based on a Hebbian learning rule, that achieves the desired optimality.
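A minimal sketch of a Hebbian rule of this kind, the generalized Hebbian algorithm, whose built-in deflation drives successive output units toward successive principal components of the input; the data, layer size, and learning rate are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Correlated stand-in data, centered so the rule finds principal components.
X = rng.standard_normal((2000, 8)) @ rng.standard_normal((8, 8))
X -= X.mean(axis=0)
n_out, lr = 3, 1e-3
W = 0.01 * rng.standard_normal((n_out, X.shape[1]))

for epoch in range(20):
    for x in X:
        y = W @ x
        # dW_ij = lr * (y_i x_j - y_i * sum_{k<=i} y_k W_kj):
        # Hebbian term plus deflation against earlier output units.
        W += lr * (np.outer(y, x) - np.tril(np.outer(y, y)) @ W)
```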
Journal ArticleDOI

Hierarchical Bayesian inference in the visual cortex

TL;DR: This work proposes a new theoretical setting based on the mathematical framework of hierarchical Bayesian inference for reasoning about the visual system, and suggests that the algorithms of particle filtering and Bayesian-belief propagation might model the interactive cortical computations involved.
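Of the two algorithms mentioned, particle filtering is the easier to sketch; below is a minimal bootstrap particle filter for a 1-D random-walk state with Gaussian observation noise, all of which are illustrative assumptions rather than anything specific from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def particle_filter(observations, n_particles=500, obs_std=1.0):
    """Bootstrap particle filter: predict, weight by the likelihood
    of the observation, then resample in proportion to the weights."""
    particles = rng.standard_normal(n_particles)             # prior samples
    means = []
    for z in observations:
        particles += 0.1 * rng.standard_normal(n_particles)  # predict
        w = np.exp(-0.5 * ((z - particles) / obs_std) ** 2)  # weight
        w /= w.sum()
        idx = rng.choice(n_particles, n_particles, p=w)      # resample
        particles = particles[idx]
        means.append(particles.mean())
    return np.array(means)

truth = np.cumsum(0.1 * rng.standard_normal(50))
estimates = particle_filter(truth + rng.standard_normal(50))
```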