CleanNet: Transfer Learning for Scalable Image Classifier Training with Label Noise
Kuang-Huei Lee, Xiaodong He, Lei Zhang, Linjun Yang
- pp 5447-5456
TLDR
CleanNet is a joint neural embedding network that requires only a fraction of the classes to be manually verified in order to provide knowledge of label noise that can be transferred to other classes.
Abstract:
In this paper, we study the problem of learning image classification models with label noise. Existing approaches that depend on human supervision are generally not scalable, as manually identifying correct or incorrect labels is time-consuming, whereas approaches that do not rely on human supervision are scalable but less effective. To reduce the amount of human supervision for label noise cleaning, we introduce CleanNet, a joint neural embedding network, which requires only a fraction of the classes to be manually verified to provide knowledge of label noise that can be transferred to other classes. We further integrate CleanNet and a conventional convolutional neural network classifier into one framework for image classification learning. We demonstrate the effectiveness of the proposed algorithm on both the label noise detection task and the task of image classification on noisy data, using several large-scale datasets. Experimental results show that CleanNet can reduce the label noise detection error rate on held-out classes, where no human supervision is available, by 41.5% compared to current weakly supervised methods. It also achieves 47% of the performance gain of verifying all images with only 3.2% of images verified on an image classification task. Source code and dataset will be available at kuanghuei.github.io/CleanNetProject.
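As a rough illustration of how an embedding-based label noise detector might be used, the sketch below flags a label as noisy when an image embedding is not similar enough to the embedding of its labeled class. The abstract does not specify the scoring rule, so the cosine similarity, the `threshold` value, and the function names `cosine_similarity` and `is_label_noisy` are all assumptions made for illustration, not the authors' implementation.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_label_noisy(
    image_emb: Sequence[float],
    class_emb: Sequence[float],
    threshold: float = 0.5,  # hypothetical cutoff; would be tuned on verified classes
) -> bool:
    """Flag the label as noisy when the image is not similar enough to its class."""
    return cosine_similarity(image_emb, class_emb) < threshold


# Toy 2-D embeddings (made up for this example).
class_emb = [1.0, 0.0]
print(is_label_noisy([0.9, 0.1], class_emb))  # image close to its class embedding
print(is_label_noisy([0.0, 1.0], class_emb))  # image orthogonal to its class embedding
```

In a setup like this, the threshold would presumably be chosen using the manually verified classes and then applied unchanged to classes that have no verification labels.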
Citations
Book Chapter
Stacked Cross Attention for Image-Text Matching
TL;DR: Proposes Stacked Cross Attention to discover the full latent alignments using both image regions and words in a sentence as context and to infer image-text similarity, achieving state-of-the-art results on the MS-COCO and Flickr30K datasets.
Proceedings Article
Symmetric Cross Entropy for Robust Learning With Noisy Labels
TL;DR: The proposed Symmetric cross entropy Learning (SL) approach simultaneously addresses both the under learning and overfitting problem of CE in the presence of noisy labels, and empirically shows that SL outperforms state-of-the-art methods.
Posted Content
Learning from Noisy Labels with Deep Neural Networks: A Survey
TL;DR: A comprehensive review of 62 state-of-the-art robust training methods, all of which are categorized into five groups according to their methodological difference, followed by a systematic comparison of six properties used to evaluate their superiority.
Proceedings Article
DivideMix: Learning with Noisy Labels as Semi-supervised Learning
TL;DR: DivideMix models the per-sample loss distribution with a mixture model to dynamically divide the training data into a labeled set of clean samples and an unlabeled set of noisy samples, and trains the model on both sets in a semi-supervised manner.
Proceedings Article
Learning to Learn From Noisy Labeled Data
TL;DR: In this article, a meta-learning method is proposed to train the model such that after one gradient update using each set of synthetic noisy labels, the model does not overfit to the specific noise.
References
Proceedings Article
Deep Residual Learning for Image Recognition
TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously; the resulting networks won 1st place on the ILSVRC 2015 classification task.
Proceedings Article
ImageNet Classification with Deep Convolutional Neural Networks
TL;DR: A large, deep convolutional neural network, consisting of five convolutional layers (some of which are followed by max-pooling layers) and three fully-connected layers with a final 1000-way softmax, achieved state-of-the-art performance on ImageNet classification.
Proceedings Article
ImageNet: A large-scale hierarchical image database
TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.
Journal Article
Scikit-learn: Machine Learning in Python
Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, Jake Vanderplas, Alexandre Passos, David Cournapeau, Matthieu Brucher, Matthieu Perrot, Edouard Duchesnay
TL;DR: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems, focusing on bringing machine learning to non-specialists using a general-purpose high-level language.
Book Chapter
Microsoft COCO: Common Objects in Context
Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, C. Lawrence Zitnick
TL;DR: A new dataset is presented with the goal of advancing the state of the art in object recognition by placing it in the context of the broader question of scene understanding; the dataset gathers images of complex everyday scenes containing common objects in their natural context.