Open Access · Journal Article

Dropout: a simple way to prevent neural networks from overfitting

TLDR
It is shown that dropout improves the performance of neural networks on supervised learning tasks in vision, speech recognition, document classification and computational biology, obtaining state-of-the-art results on many benchmark data sets.
Abstract
Deep neural nets with a large number of parameters are very powerful machine learning systems. However, overfitting is a serious problem in such networks. Large networks are also slow to use, making it difficult to deal with overfitting by combining the predictions of many different large neural nets at test time. Dropout is a technique for addressing this problem. The key idea is to randomly drop units (along with their connections) from the neural network during training. This prevents units from co-adapting too much. During training, dropout samples from an exponential number of different "thinned" networks. At test time, it is easy to approximate the effect of averaging the predictions of all these thinned networks by simply using a single unthinned network that has smaller weights. This significantly reduces overfitting and gives major improvements over other regularization methods. We show that dropout improves the performance of neural networks on supervised learning tasks in vision, speech recognition, document classification and computational biology, obtaining state-of-the-art results on many benchmark data sets.
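
As a rough illustration of the mechanism described in the abstract (not the paper's code), the sketch below drops hidden units at random during training and, at test time, approximates averaging over the thinned networks by scaling by the keep probability. The layer sizes, the keep probability of 0.5, and the helper name are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
p = 0.5  # probability of keeping a unit (illustrative value)

def hidden_layer(x, W, b, train=True):
    """One ReLU layer with dropout: units are dropped at random during
    training; at test time the full layer is used, scaled by the keep
    probability to approximate averaging over the thinned networks."""
    h = np.maximum(0.0, x @ W + b)
    if train:
        mask = rng.random(h.shape) < p   # sample one "thinned" network
        return h * mask
    return h * p                         # test-time approximation

# Toy usage with made-up shapes.
x = rng.standard_normal((4, 16))
W = rng.standard_normal((16, 8))
b = np.zeros(8)
h_train = hidden_layer(x, W, b, train=True)
h_test = hidden_layer(x, W, b, train=False)
```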



Citations
Journal Article

CT Super-Resolution GAN Constrained by the Identical, Residual, and Cycle Learning Ensemble (GAN-CIRCLE)

TL;DR: Wang et al. propose a semi-supervised deep learning approach to recover high-resolution (HR) CT images from low-resolution (LR) counterparts by enforcing cycle-consistency in terms of the Wasserstein distance.
Posted Content

Understanding Measures of Uncertainty for Adversarial Example Detection

TL;DR: This article highlights failure modes of MC dropout, a widely used approach for estimating uncertainty in deep models, and proposes improving the quality of uncertainty estimates with probabilistic model ensembles.
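
For context, here is a minimal sketch of Monte Carlo dropout, the uncertainty-estimation approach this entry analyses: dropout is left active at test time and the spread over repeated stochastic forward passes serves as a simple uncertainty signal. The weights, shapes, and function names are placeholder assumptions; this is not the detection method proposed in the cited work.

```python
import numpy as np

rng = np.random.default_rng(1)
keep_prob = 0.5  # illustrative

def stochastic_forward(x, W1, W2):
    """Single forward pass with dropout left ON (MC dropout)."""
    h = np.maximum(0.0, x @ W1)
    h *= rng.random(h.shape) < keep_prob
    logits = h @ W2
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def mc_dropout_predict(x, W1, W2, T=50):
    """Average T stochastic passes; the variance across passes is one
    simple uncertainty measure (the cited work compares several)."""
    probs = np.stack([stochastic_forward(x, W1, W2) for _ in range(T)])
    return probs.mean(axis=0), probs.var(axis=0)

# Toy usage with made-up weights.
x = rng.standard_normal((2, 16))
W1 = rng.standard_normal((16, 32)) * 0.1
W2 = rng.standard_normal((32, 10)) * 0.1
mean_p, var_p = mc_dropout_predict(x, W1, W2)
```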
Journal Article

Hyperspectral Image Classification With Markov Random Fields and a Convolutional Neural Network

TL;DR: A new supervised classification algorithm for remotely sensed hyperspectral images (HSIs) is proposed that integrates spectral and spatial information in a unified Bayesian framework and achieves better performance on one synthetic data set and two benchmark HSI data sets across a range of experimental settings.
Journal Article

Deep learning in vision-based static hand gesture recognition

TL;DR: This work applies deep learning to hand gesture recognition for all 24 hand gestures in Thomas Moeslund's gesture recognition database and shows that deeper, more biologically inspired neural networks can learn the complex hand gesture classification task with lower error rates.
Posted Content

Optimal Hyperparameters for Deep LSTM-Networks for Sequence Labeling Tasks

TL;DR: This paper evaluates the importance of different network design choices and hyperparameters for five common linguistic sequence tagging tasks and finds that some parameters, such as the pre-trained word embeddings or the last layer of the network, have a large impact on performance, while others, such as the number of LSTM layers or the number of recurrent units, are of minor importance.
References
Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

TL;DR: A deep convolutional neural network consisting of five convolutional layers, some of which are followed by max-pooling layers, and three fully connected layers with a final 1000-way softmax achieved state-of-the-art performance on ImageNet classification.
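
For orientation, a hedged PyTorch-style sketch of an AlexNet-like stack matching the description above (five convolutional layers, interleaved max-pooling, three fully connected layers, final 1000-way output) follows; the filter counts, kernel sizes, and the 227x227 input resolution are assumptions recalled from the original paper rather than details given in this summary.

```python
import torch
import torch.nn as nn

# AlexNet-like stack (approximate): 5 conv layers, some followed by
# max-pooling, then 3 fully connected layers and a 1000-way output.
model = nn.Sequential(
    nn.Conv2d(3, 96, kernel_size=11, stride=4), nn.ReLU(),
    nn.MaxPool2d(kernel_size=3, stride=2),
    nn.Conv2d(96, 256, kernel_size=5, padding=2), nn.ReLU(),
    nn.MaxPool2d(kernel_size=3, stride=2),
    nn.Conv2d(256, 384, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(384, 384, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(384, 256, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(kernel_size=3, stride=2),
    nn.Flatten(),
    nn.Dropout(0.5), nn.Linear(256 * 6 * 6, 4096), nn.ReLU(),
    nn.Dropout(0.5), nn.Linear(4096, 4096), nn.ReLU(),
    nn.Linear(4096, 1000),  # softmax is applied by the loss at training time
)

logits = model(torch.randn(1, 3, 227, 227))  # -> shape (1, 1000)
```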
Journal Article

Regression Shrinkage and Selection via the Lasso

TL;DR: A new method for estimation in linear models called the lasso is proposed, which minimizes the residual sum of squares subject to the sum of the absolute values of the coefficients being less than a constant.
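
In LaTeX notation, the constrained form of the lasso sketched above can be written as follows, assuming responses $y_i$, predictors $x_{ij}$, coefficients $\beta_j$, and a bound $t$:

```latex
\hat{\beta} = \arg\min_{\beta}
  \sum_{i=1}^{n} \Bigl( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Bigr)^2
\quad \text{subject to} \quad \sum_{j=1}^{p} |\beta_j| \le t .
```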
Journal Article

Reducing the Dimensionality of Data with Neural Networks

TL;DR: This article describes an effective way of initializing the weights that allows deep autoencoder networks to learn low-dimensional codes that work much better than principal components analysis as a tool to reduce the dimensionality of data.
Journal Article

A fast learning algorithm for deep belief nets

TL;DR: A fast, greedy algorithm is derived that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory.
Dissertation

Learning Multiple Layers of Features from Tiny Images

TL;DR: In this paper, the authors describe how to train a multi-layer generative model of natural images using a dataset of millions of tiny colour images.
Trending Questions (3)
What is overfitting in machine learning?

Overfitting is mentioned in the paper. It refers to a problem in machine learning where a model performs well on the training data but fails to generalize well to new, unseen data.

How does the number of parameters affect overfitting in deep learning?

The paper does not directly address how the number of parameters affects overfitting in deep learning.

What are the most common methods used to address overfitting in RNN?

The most common method used to address overfitting in RNNs is dropout.