Open AccessPosted Content
Supervised Contrastive Learning.
Prannay Khosla,Piotr Teterwak,Chen Wang,Aaron Sarna,Yonglong Tian,Phillip Isola,Aaron Maschinot,Ce Liu,Dilip Krishnan +8 more
TLDR
In this paper, the authors extend the self-supervised batch contrastive approach to the fully supervised setting, allowing them to effectively leverage label information and achieve state-of-the-art performance in unsupervised training of deep image models.Abstract:
Contrastive learning applied to self-supervised representation learning has seen a resurgence in recent years, leading to state of the art performance in the unsupervised training of deep image models. Modern batch contrastive approaches subsume or significantly outperform traditional contrastive losses such as triplet, max-margin and the N-pairs loss. In this work, we extend the self-supervised batch contrastive approach to the fully-supervised setting, allowing us to effectively leverage label information. Clusters of points belonging to the same class are pulled together in embedding space, while simultaneously pushing apart clusters of samples from different classes. We analyze two possible versions of the supervised contrastive (SupCon) loss, identifying the best-performing formulation of the loss. On ResNet-200, we achieve top-1 accuracy of 81.4% on the ImageNet dataset, which is 0.8% above the best number reported for this architecture. We show consistent outperformance over cross-entropy on other datasets and two ResNet variants. The loss shows benefits for robustness to natural corruptions and is more stable to hyperparameter settings such as optimizers and data augmentations. Our loss function is simple to implement, and reference TensorFlow code is released at this https URL.read more
Citations
More filters
Posted Content
Unsupervised Learning of Visual Features by Contrasting Cluster Assignments
TL;DR: This paper proposes an online algorithm, SwAV, that takes advantage of contrastive methods without requiring to compute pairwise comparisons, and uses a swapped prediction mechanism where it predicts the cluster assignment of a view from the representation of another view.
Posted Content
A Survey on Contrastive Self-supervised Learning
TL;DR: This paper provides an extensive review of self-supervised methods that follow the contrastive approach, explaining commonly used pretext tasks in a contrastive learning setup, followed by different architectures that have been proposed so far.
Journal ArticleDOI
Computer Vision and Image Understanding
Sumeet Menon,David Chapman +1 more
TL;DR: Zhang et al. as discussed by the authors introduced methods to differentiate posed expressions from spontaneous ones by capturing global spatial patterns embedded in posed and spontaneous expressions, and incorporating gender and expression categories as privileged information during spatial pattern modeling.
Proceedings Article
Hard Negative Mixing for Contrastive Learning
TL;DR: It is argued that an important aspect of contrastive learning, i.e., the effect of hard negatives, has so far been neglected and proposed hard negative mixing strategies at the feature level, that can be computed on-the-fly with a minimal computational overhead.
Proceedings ArticleDOI
Self-supervised Graph Learning for Recommendation
TL;DR: This work explores self-supervised learning on user-item graph, so as to improve the accuracy and robustness of GCNs for recommendation, and implements it on the state-of-the-art model LightGCN, which has the ability of automatically mining hard negatives.
References
More filters
Proceedings ArticleDOI
Deep Residual Learning for Image Recognition
TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously, which won the 1st place on the ILSVRC 2015 classification task.
Proceedings Article
ImageNet Classification with Deep Convolutional Neural Networks
TL;DR: The state-of-the-art performance of CNNs was achieved by Deep Convolutional Neural Networks (DCNNs) as discussed by the authors, which consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax.
Proceedings Article
Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan,Andrew Zisserman +1 more
TL;DR: In this paper, the authors investigated the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting and showed that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 layers.
Proceedings ArticleDOI
ImageNet: A large-scale hierarchical image database
TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.
Proceedings ArticleDOI
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
TL;DR: BERT as mentioned in this paper pre-trains deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, which can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.