Open Access · Posted Content

A Closer Look at Memorization in Deep Networks

TLDR
The authors examine the role of memorization in deep learning, drawing connections to capacity, generalization, and adversarial robustness, showing that deep networks tend to prioritize learning simple patterns first.
Abstract
We examine the role of memorization in deep learning, drawing connections to capacity, generalization, and adversarial robustness. While deep networks are capable of memorizing noise data, our results suggest that they tend to prioritize learning simple patterns first. In our experiments, we expose qualitative differences in gradient-based optimization of deep neural networks (DNNs) on noise vs. real data. We also demonstrate that with appropriately tuned explicit regularization (e.g., dropout) we can degrade DNN training performance on noise datasets without compromising generalization on real data. Our analysis suggests that dataset-independent notions of effective capacity are unlikely to explain the generalization performance of deep networks when trained with gradient-based methods, because the training data itself plays an important role in determining the degree of memorization.
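The two central experiments can be sketched in a few lines: train the same small network on structured labels and on randomly permuted labels, compare how quickly each is fit, and optionally add dropout to suppress the noise fit. The sketch below is a minimal illustration of that setup, not the authors' code; the synthetic dataset, architecture, and hyperparameters are all assumptions.

```python
# Minimal sketch (not the paper's code): fit one small MLP on structured
# labels and another on randomly permuted labels, then compare training
# accuracy. All sizes and hyperparameters are illustrative.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Two Gaussian blobs: a learnable "real" pattern in 20 dimensions.
n, d = 1000, 20
X = torch.randn(n, d)
X[n // 2:] += 2.0                                   # shift the second blob
y_real = torch.cat([torch.zeros(n // 2), torch.ones(n // 2)]).long()
y_rand = y_real[torch.randperm(n)]                  # same labels, no pattern

def fit(y, epochs=50, p_drop=0.0):
    model = nn.Sequential(nn.Linear(d, 128), nn.ReLU(),
                          nn.Dropout(p_drop), nn.Linear(128, 2))
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    for _ in range(epochs):
        opt.zero_grad()
        nn.functional.cross_entropy(model(X), y).backward()
        opt.step()
    model.eval()
    with torch.no_grad():
        return (model(X).argmax(1) == y).float().mean().item()

# Real structure is fit quickly; pure noise takes far longer, and
# dropout (p_drop > 0) slows the noise fit more than the real fit.
print("train acc, real labels:  ", fit(y_real))
print("train acc, random labels:", fit(y_rand))
print("noise + dropout:         ", fit(y_rand, p_drop=0.5))
```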


Citations
Journal ArticleDOI

Understanding deep learning (still) requires rethinking generalization

TL;DR: These experiments establish that state-of-the-art convolutional networks for image classification, trained with stochastic gradient methods, easily fit a random labeling of the training data, and confirm that simple depth-two neural networks already have perfect finite-sample expressivity.
Journal ArticleDOI

Threat of Adversarial Attacks on Deep Learning in Computer Vision: A Survey

TL;DR: This paper presents a comprehensive survey of adversarial attacks on deep learning in computer vision, reviewing works that design adversarial attacks, analyze why such attacks exist, and propose defenses against them.
Journal ArticleDOI

Shortcut learning in deep neural networks

TL;DR: A set of recommendations for model interpretation and benchmarking is developed, highlighting recent advances in machine learning to improve robustness and transferability from the lab to real-world applications.
Posted Content

Co-teaching: Robust Training of Deep Neural Networks with Extremely Noisy Labels

TL;DR: Co-teaching, as discussed by the authors, trains two deep neural networks simultaneously and lets them teach each other on every mini-batch: first, each network feeds forward all the data and selects samples with possibly clean labels; second, the two networks exchange which data in the mini-batch should be used for training; finally, each network back-propagates on the data selected by its peer and updates itself.
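The mini-batch loop summarized above translates naturally into code. Below is a hedged sketch of one Co-teaching step, assuming the commonly described small-loss selection criterion and a fixed keep ratio (the paper anneals this ratio over epochs); the models and optimizers are placeholders, not the authors' implementation.

```python
# Sketch of one Co-teaching step. Assumptions: "possibly clean" means
# small per-sample loss, and the keep ratio is fixed for simplicity.
import torch
import torch.nn.functional as F

def co_teaching_step(model_a, model_b, opt_a, opt_b, x, y, keep_ratio=0.7):
    k = max(1, int(keep_ratio * len(y)))

    # Each network ranks the mini-batch by its own per-sample loss.
    loss_a = F.cross_entropy(model_a(x), y, reduction="none")
    loss_b = F.cross_entropy(model_b(x), y, reduction="none")
    idx_a = loss_a.argsort()[:k]   # samples A believes are clean
    idx_b = loss_b.argsort()[:k]   # samples B believes are clean

    # Cross-update: each network trains on its peer's selection.
    opt_a.zero_grad()
    F.cross_entropy(model_a(x[idx_b]), y[idx_b]).backward()
    opt_a.step()

    opt_b.zero_grad()
    F.cross_entropy(model_b(x[idx_a]), y[idx_a]).backward()
    opt_b.step()
```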
Proceedings ArticleDOI

Improved Adam Optimizer for Deep Neural Networks

TL;DR: The proposed method, normalized direction-preserving Adam (ND-Adam), enables more precise control of the direction and step size for updating weight vectors, leading to significantly improved generalization performance.
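The TL;DR leaves the update rule implicit. As a rough, hedged reading, ND-Adam constrains each hidden unit's incoming weight vector to the unit sphere and runs an Adam-style update on its direction. The sketch below implements that reading; the exact ND-Adam update (per-vector second moment, step-size schedule) may differ, so treat every detail here as an assumption rather than the published algorithm.

```python
import numpy as np

def nd_adam_step(w, g, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One sketched update of a unit-norm weight vector w given gradient g.

    Assumption-laden reading of ND-Adam: project the gradient onto the
    tangent space of the unit sphere, keep a vector first moment and a
    scalar second moment per weight vector, then renormalize w.
    """
    g_t = g - np.dot(g, w) * w                 # tangent-space projection
    m = b1 * m + (1 - b1) * g_t                # first moment (vector)
    v = b2 * v + (1 - b2) * np.dot(g_t, g_t)   # second moment (scalar)
    m_hat = m / (1 - b1 ** t)                  # bias correction
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    w = w / np.linalg.norm(w)                  # stay on the unit sphere
    return w, m, v
```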
References
Journal ArticleDOI

Approximation by superpositions of a sigmoidal function

TL;DR: It is demonstrated that finite linear combinations of compositions of a fixed, univariate function and a set of affine functionals can uniformly approximate any continuous function of n real variables with support in the unit hypercube.
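Concretely, the sums in question have the standard Cybenko form:

```latex
G(x) = \sum_{j=1}^{N} \alpha_j \,\sigma\!\left(w_j^{\top} x + \theta_j\right),
\qquad x \in [0,1]^n ,
```

where \sigma is a fixed sigmoidal function, and such sums are dense in C([0,1]^n): for any continuous f and \varepsilon > 0 there is a G with |G(x) - f(x)| < \varepsilon for all x in the unit hypercube.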
Posted Content

Explaining and Harnessing Adversarial Examples

TL;DR: The authors argue that the primary cause of neural networks' vulnerability to adversarial perturbation is their linear nature, supporting this view with new quantitative results and giving the first explanation of the most intriguing property of adversarial examples: their generalization across architectures and training sets.
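The "harnessing" side of this paper is the fast gradient sign method (FGSM), which follows directly from the linearity argument: take one epsilon-sized step in the direction that increases the loss fastest under a linear approximation, x_adv = x + eps * sign(grad_x J). The sketch below applies it to a logistic-regression stand-in, not the paper's ImageNet setup; the model and data are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
w, b = rng.normal(size=20), 0.0       # a "trained" linear model (stand-in)
x, y = rng.normal(size=20), 1.0       # one input with true label 1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# For binary cross-entropy on a linear model, the input-gradient of the
# loss is (p - y) * w, so FGSM perturbs by eps * sign((p - y) * w).
p = sigmoid(w @ x + b)
grad_x = (p - y) * w
eps = 0.1
x_adv = x + eps * np.sign(grad_x)     # x_adv = x + eps * sign(grad_x J)

print("p(y=1 | x)     =", sigmoid(w @ x + b))
print("p(y=1 | x_adv) =", sigmoid(w @ x_adv + b))  # pushed toward error
```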
Journal ArticleDOI

Discriminatory Analysis - Nonparametric Discrimination: Consistency Properties

TL;DR: In this paper, the discrimination problem is defined as follows: a random variable Z, of observed value z, is distributed over some space (say, p-dimensional) according to either distribution F or distribution G; the problem is to decide, on the basis of z, which of the two distributions Z follows.
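The nonparametric rule Fix and Hodges analyze is the nearest-neighbor rule: classify z by the label of its closest training point, or by a majority vote among its k closest. A minimal sketch, with made-up data standing in for samples from F and G:

```python
import numpy as np

def knn_classify(z, X, labels, k=1):
    """Assign z the majority label among its k nearest training points."""
    dists = np.linalg.norm(X - z, axis=1)
    nearest = labels[np.argsort(dists)[:k]]
    return np.bincount(nearest).argmax()

# Samples from two distributions: F (label 0) and G (label 1).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
labels = np.array([0] * 50 + [1] * 50)

print(knn_classify(np.array([2.5, 2.5]), X, labels, k=5))  # -> 1
```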
Posted Content

Adversarial examples in the physical world

TL;DR: This paper showed that machine learning systems are vulnerable to adversarial examples even in physical-world scenarios, demonstrated by feeding adversarial images captured with a cell-phone camera to an ImageNet Inception classifier and measuring the system's classification accuracy.