Open Access Proceedings Article

Beyond Synthetic Noise: Deep Learning on Controlled Noisy Labels

TLDR
In this article, the authors establish the first benchmark of controlled real-world label noise collected from the web, enabling web label noise to be studied in a controlled setting, and they show that their method achieves the best results both on this new dataset and on two public benchmarks (CIFAR and WebVision).
Abstract
Performing controlled experiments on noisy data is essential in understanding deep learning across noise levels. Due to the lack of suitable datasets, previous research has only examined deep learning on controlled synthetic label noise, and real-world label noise has never been studied in a controlled setting. This paper makes three contributions. First, we establish the first benchmark of controlled real-world label noise from the web. This new benchmark enables us to study the web label noise in a controlled setting for the first time. The second contribution is a simple but effective method to overcome both synthetic and real noisy labels. We show that our method achieves the best result on our dataset as well as on two public benchmarks (CIFAR and WebVision). Third, we conduct the largest study by far into understanding deep neural networks trained on noisy labels across different noise levels, noise types, network architectures, and training settings. The data and code are released at the following link: this http URL
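To make the idea of controlled noise levels concrete, the sketch below shows the standard symmetric-flip recipe for injecting synthetic label noise at a chosen level, which is the kind of synthetic noise the paper contrasts with its new web-noise benchmark; the function name and toy setup are illustrative, not the paper's code.

```python
import numpy as np

def flip_labels_symmetric(labels, num_classes, noise_level, seed=0):
    """Corrupt a fraction `noise_level` of the labels by flipping each selected
    label to a different class chosen uniformly at random (symmetric noise)."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels).copy()
    flip_idx = rng.choice(len(labels), size=int(noise_level * len(labels)), replace=False)
    for i in flip_idx:
        candidates = [c for c in range(num_classes) if c != labels[i]]
        labels[i] = rng.choice(candidates)
    return labels

# Example: 40% symmetric noise on a 10-class toy labeling.
clean = np.random.randint(0, 10, size=1000)
noisy = flip_labels_symmetric(clean, num_classes=10, noise_level=0.4)
print((clean != noisy).mean())  # roughly 0.4
```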


Citations
Posted Content

Sharpness-Aware Minimization for Efficiently Improving Generalization

TL;DR: This work introduces Sharpness-Aware Minimization (SAM), a novel and effective procedure for simultaneously minimizing loss value and loss sharpness, which improves model generalization across a variety of benchmark datasets and models and yields new state-of-the-art performance on several of them.
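As a rough illustration of the procedure described above, here is one SAM update on a toy least-squares problem in plain NumPy; the step size, the radius rho, and the toy loss are illustrative choices, not the paper's experimental setup.

```python
import numpy as np

def loss_and_grad(w, X, y):
    """Toy mean-squared-error loss 0.5*mean((Xw - y)^2) and its gradient."""
    r = X @ w - y
    return 0.5 * np.mean(r ** 2), X.T @ r / len(y)

def sam_step(w, X, y, lr=0.5, rho=0.05):
    """One Sharpness-Aware Minimization step:
    1) move to the (approximate) worst-case point inside an L2 ball of radius rho,
    2) descend using the gradient evaluated at that perturbed point."""
    _, g = loss_and_grad(w, X, y)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)   # ascent direction, scaled to radius rho
    _, g_adv = loss_and_grad(w + eps, X, y)       # gradient at the perturbed weights
    return w - lr * g_adv

rng = np.random.default_rng(0)
X, w_true = rng.normal(size=(64, 5)), rng.normal(size=5)
y = X @ w_true
w = np.zeros(5)
for _ in range(200):
    w = sam_step(w, X, y)
print(np.linalg.norm(w - w_true))  # small; SAM settles near w_true, up to a rho-sized bias
```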
Posted Content

Confident Learning: Estimating Uncertainty in Dataset Labels

TL;DR: This work builds on the assumption of a classification noise process to directly estimate the joint distribution between noisy (given) labels and uncorrupted (unknown) labels, resulting in a generalized confident learning (CL) approach that is provably consistent and experimentally performant.
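The counting idea behind that joint estimate can be sketched as follows, assuming out-of-sample predicted probabilities are available for every example; this is a simplified illustration of confident-learning-style counting, not the cleanlab library API.

```python
import numpy as np

def estimate_joint(pred_probs, noisy_labels, num_classes):
    """Count an example with given label i toward true class j when its predicted
    probability for j reaches the per-class confidence threshold t_j, then
    normalize the counts into an estimated joint distribution."""
    pred_probs = np.asarray(pred_probs)
    noisy_labels = np.asarray(noisy_labels)
    # t_j: average self-confidence over examples whose given label is j.
    thresholds = np.array([
        pred_probs[noisy_labels == j, j].mean() for j in range(num_classes)
    ])
    counts = np.zeros((num_classes, num_classes))
    for p, i in zip(pred_probs, noisy_labels):
        confident = np.where(p >= thresholds)[0]
        if len(confident) == 0:
            continue                                  # skip examples that clear no threshold
        j = confident[np.argmax(p[confident])]        # most likely true class among confident ones
        counts[i, j] += 1
    return counts / counts.sum()
```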
Proceedings ArticleDOI

Regularizing Generative Adversarial Networks under Limited Data

TL;DR: This paper proposes LeCam-GAN, a regularization approach for training robust GAN models on limited data, and theoretically shows a connection between the regularized loss and an f-divergence, which is found to be more robust under limited training data.
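A heavily hedged sketch of what such a regularizer can look like is given below: the discriminator's predictions on real images are pulled toward a moving average of its predictions on generated images and vice versa; the exact pairing, anchors, and weighting should be checked against the paper, and the names here are assumptions.

```python
import torch

def lecam_style_reg(d_real, d_fake, ema_real, ema_fake):
    """Regularization term added to the discriminator loss. ema_real and
    ema_fake are assumed to be exponential moving averages of the
    discriminator's outputs on real and generated images, tracked elsewhere."""
    return ((d_real - ema_fake) ** 2).mean() + ((d_fake - ema_real) ** 2).mean()

# Typically added with a small weight, e.g.:
# d_loss = adversarial_loss + 0.01 * lecam_style_reg(d_real, d_fake, ema_real, ema_fake)
```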
Proceedings Article

Sharpness-aware Minimization for Efficiently Improving Generalization

TL;DR: Sharpness-Aware Minimization (SAM), as discussed by the authors, minimizes loss value and loss sharpness simultaneously, which results in a min-max optimization problem on which gradient descent can be performed efficiently.
Posted Content

A Survey of Label-noise Representation Learning: Past, Present and Future.

TL;DR: This survey gives a formal definition of label-noise representation learning from the perspective of machine learning and explains why noisy labels degrade the performance of deep models.
References
Proceedings ArticleDOI

Deep Residual Learning for Image Recognition

TL;DR: In this article, the authors proposed a residual learning framework that eases the training of networks substantially deeper than those used previously; the resulting residual networks won first place in the ILSVRC 2015 classification task.
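The core building block is easy to state: the layer learns a residual function F(x) and adds the input back through an identity shortcut. Below is a minimal PyTorch sketch of a basic residual block; channel counts are illustrative, and the original network also uses strided projection shortcuts and bottleneck variants.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    """Basic residual block: output = ReLU(F(x) + x), where F is two 3x3 convolutions."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return F.relu(out + x)   # identity shortcut carries x past the convolutions

x = torch.randn(1, 64, 32, 32)
print(ResidualBlock(64)(x).shape)  # torch.Size([1, 64, 32, 32])
```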
Proceedings ArticleDOI

ImageNet: A large-scale hierarchical image database

TL;DR: A new database called “ImageNet” is introduced: a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than existing image datasets.
Journal Article

Dropout: a simple way to prevent neural networks from overfitting

TL;DR: It is shown that dropout improves the performance of neural networks on supervised learning tasks in vision, speech recognition, document classification and computational biology, obtaining state-of-the-art results on many benchmark data sets.
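As a quick illustration of the mechanism, here is the common "inverted dropout" formulation in NumPy: units are zeroed with probability p during training and the survivors are rescaled so the expected activation stays the same (the original paper instead rescales at test time); this is a generic sketch, not the authors' code.

```python
import numpy as np

def dropout(activations, p=0.5, training=True, rng=None):
    """Inverted dropout: zero each unit with probability p during training and
    scale survivors by 1/(1-p); at test time the layer is the identity."""
    if not training or p == 0.0:
        return activations
    rng = rng or np.random.default_rng()
    mask = rng.random(activations.shape) >= p
    return activations * mask / (1.0 - p)
```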
Proceedings Article

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

TL;DR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
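A minimal sketch of the training-mode transform, assuming a 2-D batch of features (inference would use running estimates of the mean and variance instead):

```python
import numpy as np

def batch_norm_train(x, gamma, beta, eps=1e-5):
    """Standardize each feature with the mini-batch mean and variance, then
    apply the learned scale (gamma) and shift (beta)."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta
```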
Posted Content

Rethinking the Inception Architecture for Computer Vision

TL;DR: This work explores ways to scale up networks that aim to utilize the added computation as efficiently as possible, through suitably factorized convolutions and aggressive regularization.
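One of the factorizations in question can be sketched as follows: an n x n convolution is replaced by a 1 x n convolution followed by an n x 1 convolution, keeping the receptive field while reducing the parameter count; the channel count and kernel size below are illustrative.

```python
import torch
import torch.nn as nn

# A 7x7 convolution factorized into 1x7 followed by 7x1: roughly 2*7*C^2 weights
# instead of 49*C^2 for the same receptive field (C = 64 channels here).
factorized = nn.Sequential(
    nn.Conv2d(64, 64, kernel_size=(1, 7), padding=(0, 3)),
    nn.Conv2d(64, 64, kernel_size=(7, 1), padding=(3, 0)),
)
print(factorized(torch.randn(1, 64, 32, 32)).shape)  # spatial size preserved
```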