Understanding deep learning requires rethinking generalization

Open AccessPosted Content

Understanding deep learning requires rethinking generalization

Chiyuan Zhang, +4 more

- 10 Nov 2016 -

arXiv: Learning

Chats0

TLDR

The authors showed that deep neural networks can fit a random labeling of the training data, and that this phenomenon is qualitatively unaffected by explicit regularization, and occurs even if the true images are replaced by completely unstructured random noise.

Abstract:

Despite their massive size, successful deep artificial neural networks can exhibit a remarkably small difference between training and test performance. Conventional wisdom attributes small generalization error either to properties of the model family, or to the regularization techniques used during training. Through extensive systematic experiments, we show how these traditional approaches fail to explain why large neural networks generalize well in practice. Specifically, our experiments establish that state-of-the-art convolutional networks for image classification trained with stochastic gradient methods easily fit a random labeling of the training data. This phenomenon is qualitatively unaffected by explicit regularization, and occurs even if we replace the true images by completely unstructured random noise. We corroborate these experimental findings with a theoretical construction showing that simple depth two neural networks already have perfect finite sample expressivity as soon as the number of parameters exceeds the number of data points as it usually does in practice. We interpret our experimental findings by comparison with traditional models.

Citations

PDF

Open Access

More filters

Journal ArticleDOI

A Survey of Methods for Explaining Black Box Models

Riccardo Guidotti, +5 more

- 22 Aug 2018 -

ACM Computing Surveys

TL;DR: In this paper, the authors provide a classification of the main problems addressed in the literature with respect to the notion of explanation and the type of black box decision support systems, given a problem definition, a black box type, and a desired explanation, this survey should help the researcher to find the proposals more useful for his own work.

...read moreread less

Journal ArticleDOI

DeepLabCut: markerless pose estimation of user-defined body parts with deep learning

Alexander Mathis, +7 more

- 20 Aug 2018 -

Nature Neuroscience

TL;DR: Using a deep learning approach to track user-defined body parts during various behaviors across multiple species, the authors show that their toolbox, called DeepLabCut, can achieve human accuracy with only a few hundred frames of training data.

...read moreread less

Proceedings Article

On calibration of modern neural networks

Chuan Guo, +3 more

TL;DR: This article found that depth, width, weight decay, and batch normalization are important factors influencing confidence calibration of neural networks, and that temperature scaling is surprisingly effective at calibrating predictions.

...read moreread less

Journal ArticleDOI

Deep learning for time series classification: a review

Hassan Ismail Fawaz, +4 more

- 01 Jul 2019 -

Data Mining and Knowledge Discovery

TL;DR: This article proposes the most exhaustive study of DNNs for TSC by training 8730 deep learning models on 97 time series datasets and provides an open source deep learning framework to the TSC community.

...read moreread less

Proceedings Article

Neural Tangent Kernel: Convergence and Generalization in Neural Networks

Arthur Jacot, +2 more

TL;DR: This talk will introduce this formalism and give a number of results on the Neural Tangent Kernel and explain how they give us insight into the dynamics of neural networks during training and into their generalization features.

...read moreread less

Collapse

References

PDF

Open Access

More filters

Proceedings ArticleDOI

Deep Residual Learning for Image Recognition

Kaiming He, +3 more

TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously, which won the 1st place on the ILSVRC 2015 classification task.

...read moreread less

Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

Alex Krizhevsky, +2 more

TL;DR: The state-of-the-art performance of CNNs was achieved by Deep Convolutional Neural Networks (DCNNs) as discussed by the authors, which consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax.

...read moreread less

Journal Article

Dropout: a simple way to prevent neural networks from overfitting

Nitish Srivastava, +4 more

- 01 Jan 2014 -

Journal of Machine Learning Research

TL;DR: It is shown that dropout improves the performance of neural networks on supervised learning tasks in vision, speech recognition, document classification and computational biology, obtaining state-of-the-art results on many benchmark data sets.

...read moreread less

Proceedings Article

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

Sergey Ioffe, +1 more

TL;DR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.

...read moreread less

Journal ArticleDOI

ImageNet Large Scale Visual Recognition Challenge

Olga Russakovsky, +11 more

- 01 Dec 2015 -

International Journal of Computer Vision

TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) as mentioned in this paper is a benchmark in object category classification and detection on hundreds of object categories and millions of images, which has been run annually from 2010 to present, attracting participation from more than fifty institutions.

...read moreread less

Collapse

Journal of Machine Learning Research

ImageNet Classification with Deep Convolutional Neural Networks

Alex Krizhevsky, +2 more

Gradient-based learning applied to document recognition

Yann LeCun, +6 more

Understanding deep learning requires rethinking generalization

Citations

A Survey of Methods for Explaining Black Box Models

DeepLabCut: markerless pose estimation of user-defined body parts with deep learning

On calibration of modern neural networks

Deep learning for time series classification: a review

Neural Tangent Kernel: Convergence and Generalization in Neural Networks

References

Deep Residual Learning for Image Recognition

ImageNet Classification with Deep Convolutional Neural Networks

Dropout: a simple way to prevent neural networks from overfitting

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

ImageNet Large Scale Visual Recognition Challenge

Related Papers (5)

Deep Residual Learning for Image Recognition

Very Deep Convolutional Networks for Large-Scale Image Recognition

Dropout: a simple way to prevent neural networks from overfitting

ImageNet Classification with Deep Convolutional Neural Networks

Gradient-based learning applied to document recognition