Open Access Proceedings Article

Beyond Synthetic Noise: Deep Learning on Controlled Noisy Labels

TLDR
In this article, the authors establish the first benchmark of controlled real-world label noise collected from the web, enabling web label noise to be studied in a controlled setting, and they show that their method achieves the best results both on this new dataset and on two public benchmarks (CIFAR and WebVision).
Abstract
Performing controlled experiments on noisy data is essential in understanding deep learning across noise levels. Due to the lack of suitable datasets, previous research has only examined deep learning on controlled synthetic label noise, and real-world label noise has never been studied in a controlled setting. This paper makes three contributions. First, we establish the first benchmark of controlled real-world label noise from the web. This new benchmark enables us to study the web label noise in a controlled setting for the first time. The second contribution is a simple but effective method to overcome both synthetic and real noisy labels. We show that our method achieves the best result on our dataset as well as on two public benchmarks (CIFAR and WebVision). Third, we conduct the largest study by far into understanding deep neural networks trained on noisy labels across different noise levels, noise types, network architectures, and training settings. The data and code are released at the following link: this http URL
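To make the idea of controlled noise levels concrete, the sketch below shows the standard symmetric-flip recipe for injecting synthetic label noise at a chosen level, which is the kind of synthetic noise the paper contrasts with its new web-noise benchmark; the function name and toy setup are illustrative, not the paper's code.

```python
import numpy as np

def flip_labels_symmetric(labels, num_classes, noise_level, seed=0):
    """Corrupt a fraction `noise_level` of the labels by flipping each selected
    label to a different class chosen uniformly at random (symmetric noise)."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels).copy()
    flip_idx = rng.choice(len(labels), size=int(noise_level * len(labels)), replace=False)
    for i in flip_idx:
        candidates = [c for c in range(num_classes) if c != labels[i]]
        labels[i] = rng.choice(candidates)
    return labels

# Example: 40% symmetric noise on a 10-class toy labeling.
clean = np.random.randint(0, 10, size=1000)
noisy = flip_labels_symmetric(clean, num_classes=10, noise_level=0.4)
print((clean != noisy).mean())  # roughly 0.4
```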


Citations
Posted Content

Sharpness-Aware Minimization for Efficiently Improving Generalization

TL;DR: This work introduces Sharpness-Aware Minimization (SAM), a novel and effective procedure for simultaneously minimizing loss value and loss sharpness, which improves model generalization across a variety of benchmark datasets and models and yields new state-of-the-art performance on several of them.
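As a rough illustration of the procedure described above, here is one SAM update on a toy least-squares problem in plain NumPy; the step size, the radius rho, and the toy loss are illustrative choices, not the paper's experimental setup.

```python
import numpy as np

def loss_and_grad(w, X, y):
    """Toy mean-squared-error loss 0.5*mean((Xw - y)^2) and its gradient."""
    r = X @ w - y
    return 0.5 * np.mean(r ** 2), X.T @ r / len(y)

def sam_step(w, X, y, lr=0.5, rho=0.05):
    """One Sharpness-Aware Minimization step:
    1) move to the (approximate) worst-case point inside an L2 ball of radius rho,
    2) descend using the gradient evaluated at that perturbed point."""
    _, g = loss_and_grad(w, X, y)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)   # ascent direction, scaled to radius rho
    _, g_adv = loss_and_grad(w + eps, X, y)       # gradient at the perturbed weights
    return w - lr * g_adv

rng = np.random.default_rng(0)
X, w_true = rng.normal(size=(64, 5)), rng.normal(size=5)
y = X @ w_true
w = np.zeros(5)
for _ in range(200):
    w = sam_step(w, X, y)
print(np.linalg.norm(w - w_true))  # small; SAM settles near w_true, up to a rho-sized bias
```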
Posted Content

Confident Learning: Estimating Uncertainty in Dataset Labels

TL;DR: This work builds on the assumption of a classification noise process to directly estimate the joint distribution between noisy (given) labels and uncorrupted (unknown) labels, resulting in a generalized confident learning (CL) approach that is provably consistent and experimentally performant.
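The counting idea behind that joint estimate can be sketched as follows, assuming out-of-sample predicted probabilities are available for every example; this is a simplified illustration of confident-learning-style counting, not the cleanlab library API.

```python
import numpy as np

def estimate_joint(pred_probs, noisy_labels, num_classes):
    """Count an example with given label i toward true class j when its predicted
    probability for j reaches the per-class confidence threshold t_j, then
    normalize the counts into an estimated joint distribution."""
    pred_probs = np.asarray(pred_probs)
    noisy_labels = np.asarray(noisy_labels)
    # t_j: average self-confidence over examples whose given label is j.
    thresholds = np.array([
        pred_probs[noisy_labels == j, j].mean() for j in range(num_classes)
    ])
    counts = np.zeros((num_classes, num_classes))
    for p, i in zip(pred_probs, noisy_labels):
        confident = np.where(p >= thresholds)[0]
        if len(confident) == 0:
            continue                                  # skip examples that clear no threshold
        j = confident[np.argmax(p[confident])]        # most likely true class among confident ones
        counts[i, j] += 1
    return counts / counts.sum()
```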
Proceedings ArticleDOI

Regularizing Generative Adversarial Networks under Limited Data

TL;DR: This paper proposes LeCam-GAN, a regularization approach for training robust GAN models on limited data, and theoretically shows a connection between the regularized loss and an f-divergence, which is found to be more robust under limited training data.
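A heavily hedged sketch of what such a regularizer can look like is given below: the discriminator's predictions on real images are pulled toward a moving average of its predictions on generated images and vice versa; the exact pairing, anchors, and weighting should be checked against the paper, and the names here are assumptions.

```python
import torch

def lecam_style_reg(d_real, d_fake, ema_real, ema_fake):
    """Regularization term added to the discriminator loss. ema_real and
    ema_fake are assumed to be exponential moving averages of the
    discriminator's outputs on real and generated images, tracked elsewhere."""
    return ((d_real - ema_fake) ** 2).mean() + ((d_fake - ema_real) ** 2).mean()

# Typically added with a small weight, e.g.:
# d_loss = adversarial_loss + 0.01 * lecam_style_reg(d_real, d_fake, ema_real, ema_fake)
```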
Proceedings Article

Sharpness-aware Minimization for Efficiently Improving Generalization

TL;DR: Sharpness-Aware Minimization (SAM), as discussed by the authors, minimizes loss value and loss sharpness simultaneously, which results in a min-max optimization problem on which gradient descent can be performed efficiently.
Posted Content

A Survey of Label-noise Representation Learning: Past, Present and Future.

TL;DR: This survey gives a formal definition of label-noise representation learning from the perspective of machine learning and explains why noisy labels degrade the performance of deep models.
References
Proceedings ArticleDOI

Deep Residual Learning for Image Recognition

TL;DR: In this article, the authors proposed a residual learning framework that eases the training of networks substantially deeper than those used previously; the resulting residual networks won first place in the ILSVRC 2015 classification task.
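The core building block is easy to state: the layer learns a residual function F(x) and adds the input back through an identity shortcut. Below is a minimal PyTorch sketch of a basic residual block; channel counts are illustrative, and the original network also uses strided projection shortcuts and bottleneck variants.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    """Basic residual block: output = ReLU(F(x) + x), where F is two 3x3 convolutions."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return F.relu(out + x)   # identity shortcut carries x past the convolutions

x = torch.randn(1, 64, 32, 32)
print(ResidualBlock(64)(x).shape)  # torch.Size([1, 64, 32, 32])
```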
Proceedings ArticleDOI

ImageNet: A large-scale hierarchical image database

TL;DR: A new database called “ImageNet” is introduced: a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than existing image datasets.
Journal Article

Dropout: a simple way to prevent neural networks from overfitting

TL;DR: It is shown that dropout improves the performance of neural networks on supervised learning tasks in vision, speech recognition, document classification and computational biology, obtaining state-of-the-art results on many benchmark data sets.
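As a quick illustration of the mechanism, here is the common "inverted dropout" formulation in NumPy: units are zeroed with probability p during training and the survivors are rescaled so the expected activation stays the same (the original paper instead rescales at test time); this is a generic sketch, not the authors' code.

```python
import numpy as np

def dropout(activations, p=0.5, training=True, rng=None):
    """Inverted dropout: zero each unit with probability p during training and
    scale survivors by 1/(1-p); at test time the layer is the identity."""
    if not training or p == 0.0:
        return activations
    rng = rng or np.random.default_rng()
    mask = rng.random(activations.shape) >= p
    return activations * mask / (1.0 - p)
```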
Proceedings Article

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

TL;DR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
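A minimal sketch of the training-mode transform, assuming a 2-D batch of features (inference would use running estimates of the mean and variance instead):

```python
import numpy as np

def batch_norm_train(x, gamma, beta, eps=1e-5):
    """Standardize each feature with the mini-batch mean and variance, then
    apply the learned scale (gamma) and shift (beta)."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta
```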
Posted Content

Rethinking the Inception Architecture for Computer Vision

TL;DR: This work explores ways to scale up networks that aim to utilize the added computation as efficiently as possible, through suitably factorized convolutions and aggressive regularization.
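One of the factorizations in question can be sketched as follows: an n x n convolution is replaced by a 1 x n convolution followed by an n x 1 convolution, keeping the receptive field while reducing the parameter count; the channel count and kernel size below are illustrative.

```python
import torch
import torch.nn as nn

# A 7x7 convolution factorized into 1x7 followed by 7x1: roughly 2*7*C^2 weights
# instead of 49*C^2 for the same receptive field (C = 64 channels here).
factorized = nn.Sequential(
    nn.Conv2d(64, 64, kernel_size=(1, 7), padding=(0, 3)),
    nn.Conv2d(64, 64, kernel_size=(7, 1), padding=(3, 0)),
)
print(factorized(torch.randn(1, 64, 32, 32)).shape)  # spatial size preserved
```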