Open Access · Posted Content

A Closer Look at Memorization in Deep Networks

TLDR
The authors examine the role of memorization in deep learning, drawing connections to capacity, generalization, and adversarial robustness, showing that deep networks tend to prioritize learning simple patterns first.
Abstract
We examine the role of memorization in deep learning, drawing connections to capacity, generalization, and adversarial robustness. While deep networks are capable of memorizing noise data, our results suggest that they tend to prioritize learning simple patterns first. In our experiments, we expose qualitative differences in gradient-based optimization of deep neural networks (DNNs) on noise vs. real data. We also demonstrate that with appropriately tuned explicit regularization (e.g., dropout) we can degrade DNN training performance on noise datasets without compromising generalization on real data. Our analysis suggests that dataset-independent notions of effective capacity are unlikely to explain the generalization performance of deep networks when trained with gradient-based methods, because the training data itself plays an important role in determining the degree of memorization.
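The two central experiments can be sketched in a few lines: train the same small network on structured labels and on randomly permuted labels, compare how quickly each is fit, and optionally add dropout to suppress the noise fit. The sketch below is a minimal illustration of that setup, not the authors' code; the synthetic dataset, architecture, and hyperparameters are all assumptions.

```python
# Minimal sketch (not the paper's code): fit one small MLP on structured
# labels and another on randomly permuted labels, then compare training
# accuracy. All sizes and hyperparameters are illustrative.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Two Gaussian blobs: a learnable "real" pattern in 20 dimensions.
n, d = 1000, 20
X = torch.randn(n, d)
X[n // 2:] += 2.0                                   # shift the second blob
y_real = torch.cat([torch.zeros(n // 2), torch.ones(n // 2)]).long()
y_rand = y_real[torch.randperm(n)]                  # same labels, no pattern

def fit(y, epochs=50, p_drop=0.0):
    model = nn.Sequential(nn.Linear(d, 128), nn.ReLU(),
                          nn.Dropout(p_drop), nn.Linear(128, 2))
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    for _ in range(epochs):
        opt.zero_grad()
        nn.functional.cross_entropy(model(X), y).backward()
        opt.step()
    model.eval()
    with torch.no_grad():
        return (model(X).argmax(1) == y).float().mean().item()

# Real structure is fit quickly; pure noise takes far longer, and
# dropout (p_drop > 0) slows the noise fit more than the real fit.
print("train acc, real labels:  ", fit(y_real))
print("train acc, random labels:", fit(y_rand))
print("noise + dropout:         ", fit(y_rand, p_drop=0.5))
```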


Citations
Journal ArticleDOI

Understanding deep learning (still) requires rethinking generalization

TL;DR: These experiments establish that state-of-the-art convolutional networks for image classification, trained with stochastic gradient methods, easily fit a random labeling of the training data, and confirm that simple depth-two neural networks already have perfect finite-sample expressivity.
Journal ArticleDOI

Threat of Adversarial Attacks on Deep Learning in Computer Vision: A Survey

TL;DR: This paper presents a comprehensive survey of adversarial attacks on deep learning in computer vision, reviewing works that design adversarial attacks, analyze why such attacks exist, and propose defenses against them.
Journal ArticleDOI

Shortcut learning in deep neural networks

TL;DR: A set of recommendations for model interpretation and benchmarking is developed, highlighting recent advances in machine learning to improve robustness and transferability from the lab to real-world applications.
Posted Content

Co-teaching: Robust Training of Deep Neural Networks with Extremely Noisy Labels

TL;DR: Co-teaching, as discussed by the authors, trains two deep neural networks simultaneously and lets them teach each other on every mini-batch: first, each network feeds forward all the data and selects samples with possibly clean labels; second, the two networks exchange which data in the mini-batch should be used for training; finally, each network back-propagates on the data selected by its peer and updates itself.
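The mini-batch loop summarized above translates naturally into code. Below is a hedged sketch of one Co-teaching step, assuming the commonly described small-loss selection criterion and a fixed keep ratio (the paper anneals this ratio over epochs); the models and optimizers are placeholders, not the authors' implementation.

```python
# Sketch of one Co-teaching step. Assumptions: "possibly clean" means
# small per-sample loss, and the keep ratio is fixed for simplicity.
import torch
import torch.nn.functional as F

def co_teaching_step(model_a, model_b, opt_a, opt_b, x, y, keep_ratio=0.7):
    k = max(1, int(keep_ratio * len(y)))

    # Each network ranks the mini-batch by its own per-sample loss.
    loss_a = F.cross_entropy(model_a(x), y, reduction="none")
    loss_b = F.cross_entropy(model_b(x), y, reduction="none")
    idx_a = loss_a.argsort()[:k]   # samples A believes are clean
    idx_b = loss_b.argsort()[:k]   # samples B believes are clean

    # Cross-update: each network trains on its peer's selection.
    opt_a.zero_grad()
    F.cross_entropy(model_a(x[idx_b]), y[idx_b]).backward()
    opt_a.step()

    opt_b.zero_grad()
    F.cross_entropy(model_b(x[idx_a]), y[idx_a]).backward()
    opt_b.step()
```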
Proceedings ArticleDOI

Improved Adam Optimizer for Deep Neural Networks

TL;DR: The proposed method, normalized direction-preserving Adam (ND-Adam), enables more precise control of the direction and step size for updating weight vectors, leading to significantly improved generalization performance.
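The TL;DR leaves the update rule implicit. As a rough, hedged reading, ND-Adam constrains each hidden unit's incoming weight vector to the unit sphere and runs an Adam-style update on its direction. The sketch below implements that reading; the exact ND-Adam update (per-vector second moment, step-size schedule) may differ, so treat every detail here as an assumption rather than the published algorithm.

```python
import numpy as np

def nd_adam_step(w, g, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One sketched update of a unit-norm weight vector w given gradient g.

    Assumption-laden reading of ND-Adam: project the gradient onto the
    tangent space of the unit sphere, keep a vector first moment and a
    scalar second moment per weight vector, then renormalize w.
    """
    g_t = g - np.dot(g, w) * w                 # tangent-space projection
    m = b1 * m + (1 - b1) * g_t                # first moment (vector)
    v = b2 * v + (1 - b2) * np.dot(g_t, g_t)   # second moment (scalar)
    m_hat = m / (1 - b1 ** t)                  # bias correction
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    w = w / np.linalg.norm(w)                  # stay on the unit sphere
    return w, m, v
```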
References
Journal ArticleDOI

Approximation by superpositions of a sigmoidal function

TL;DR: It is demonstrated that finite linear combinations of compositions of a fixed, univariate function and a set of affine functionals can uniformly approximate any continuous function of n real variables with support in the unit hypercube.
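Concretely, the sums in question have the standard Cybenko form:

```latex
G(x) = \sum_{j=1}^{N} \alpha_j \,\sigma\!\left(w_j^{\top} x + \theta_j\right),
\qquad x \in [0,1]^n ,
```

where \sigma is a fixed sigmoidal function, and such sums are dense in C([0,1]^n): for any continuous f and \varepsilon > 0 there is a G with |G(x) - f(x)| < \varepsilon for all x in the unit hypercube.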
Posted Content

Explaining and Harnessing Adversarial Examples

TL;DR: The authors argue that the primary cause of neural networks' vulnerability to adversarial perturbation is their linear nature, supporting this view with new quantitative results and giving the first explanation of the most intriguing property of adversarial examples: their generalization across architectures and training sets.
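The "harnessing" side of this paper is the fast gradient sign method (FGSM), which follows directly from the linearity argument: take one epsilon-sized step in the direction that increases the loss fastest under a linear approximation, x_adv = x + eps * sign(grad_x J). The sketch below applies it to a logistic-regression stand-in, not the paper's ImageNet setup; the model and data are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
w, b = rng.normal(size=20), 0.0       # a "trained" linear model (stand-in)
x, y = rng.normal(size=20), 1.0       # one input with true label 1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# For binary cross-entropy on a linear model, the input-gradient of the
# loss is (p - y) * w, so FGSM perturbs by eps * sign((p - y) * w).
p = sigmoid(w @ x + b)
grad_x = (p - y) * w
eps = 0.1
x_adv = x + eps * np.sign(grad_x)     # x_adv = x + eps * sign(grad_x J)

print("p(y=1 | x)     =", sigmoid(w @ x + b))
print("p(y=1 | x_adv) =", sigmoid(w @ x_adv + b))  # pushed toward error
```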
Journal ArticleDOI

Discriminatory Analysis - Nonparametric Discrimination: Consistency Properties

TL;DR: In this paper, the discrimination problem is defined as follows: a random variable Z, of observed value z, is distributed over some space (say, p-dimensional) according to either distribution F or distribution G; the problem is to decide, on the basis of z, which of the two distributions Z follows.
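The nonparametric rule Fix and Hodges analyze is the nearest-neighbor rule: classify z by the label of its closest training point, or by a majority vote among its k closest. A minimal sketch, with made-up data standing in for samples from F and G:

```python
import numpy as np

def knn_classify(z, X, labels, k=1):
    """Assign z the majority label among its k nearest training points."""
    dists = np.linalg.norm(X - z, axis=1)
    nearest = labels[np.argsort(dists)[:k]]
    return np.bincount(nearest).argmax()

# Samples from two distributions: F (label 0) and G (label 1).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
labels = np.array([0] * 50 + [1] * 50)

print(knn_classify(np.array([2.5, 2.5]), X, labels, k=5))  # -> 1
```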
Posted Content

Adversarial examples in the physical world

TL;DR: This paper showed that machine learning systems are vulnerable to adversarial examples even in physical-world scenarios, demonstrated by feeding adversarial images captured with a cell-phone camera to an ImageNet Inception classifier and measuring the system's classification accuracy.