Open Access · Posted Content

Self-supervised Pretraining of Visual Features in the Wild

TLDR
Recently, self-supervised learning methods like MoCo, SimCLR, BYOL and SwAV have reduced the gap with supervised methods, as mentioned in this paper, but so far only on highly curated data; this work tests whether self-supervised learning can learn from any random image and from any unbounded dataset.
Abstract
Recently, self-supervised learning methods like MoCo, SimCLR, BYOL and SwAV have reduced the gap with supervised methods. These results have been achieved in a controlled environment, namely the highly curated ImageNet dataset. However, the premise of self-supervised learning is that it can learn from any random image and from any unbounded dataset. In this work, we explore whether self-supervision lives up to its expectations by training large models on random, uncurated images with no supervision. Our final SElf-supERvised (SEER) model, a RegNetY with 1.3B parameters trained on 1B random images with 512 GPUs, achieves 84.2% top-1 accuracy, surpassing the best self-supervised pretrained model by 1% and confirming that self-supervised learning works in a real-world setting. Interestingly, we also observe that self-supervised models are good few-shot learners, achieving 77.9% top-1 with access to only 10% of ImageNet. Code: this https URL
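The low-shot result corresponds to a simple recipe: take the self-supervised pretrained backbone and fine-tune it on a small labeled subset. The sketch below is a minimal illustration of that recipe in PyTorch, assuming a small torchvision RegNetY stands in for the released 1.3B-parameter SEER checkpoint and FakeData stands in for the 10% ImageNet split; it is not the authors' training code.

import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

# Stand-in backbone: torchvision's RegNetY-400MF; the actual SEER model is a much
# larger RegNetY (1.3B parameters) pretrained without labels.
backbone = models.regnet_y_400mf(weights=None)
backbone.fc = nn.Linear(backbone.fc.in_features, 1000)  # fresh classification head

# Stand-in for a ~10% labeled subset of ImageNet.
subset = datasets.FakeData(size=64, image_size=(3, 224, 224), num_classes=1000,
                           transform=transforms.ToTensor())
loader = DataLoader(subset, batch_size=16, shuffle=True)

optimizer = torch.optim.SGD(backbone.parameters(), lr=0.01, momentum=0.9)
criterion = nn.CrossEntropyLoss()

backbone.train()
for images, labels in loader:
    optimizer.zero_grad()
    loss = criterion(backbone(images), labels)
    loss.backward()
    optimizer.step()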


Citations
Journal Article · DOI

Self-supervised Learning: Generative or Contrastive.

TL;DR: This survey examines new self-supervised learning methods for representation learning in computer vision, natural language processing, and graph learning, and comprehensively reviews the existing empirical methods, grouping them into three main categories according to their objectives.
Posted Content

Emerging Properties in Self-Supervised Vision Transformers

TL;DR: In this paper, the authors ask whether self-supervised learning provides Vision Transformers (ViT) with new properties that stand out compared to convolutional networks (convnets). Beyond the fact that adapting self-supervised methods to this architecture works particularly well, they observe that self-supervised ViT features contain explicit information about the semantic segmentation of an image, which does not emerge as clearly with supervised ViTs, nor with convnets.
Journal Article · DOI

Artificial intelligence and machine learning for medical imaging: A technology review.

TL;DR: Artificial intelligence (AI) has recently become a very popular buzzword as a consequence of disruptive technical advances and impressive experimental results, notably in the field of image analysis and processing, as discussed by the authors.
Journal Article · DOI

Review on self-supervised image recognition using deep neural networks

TL;DR: Self-supervised learning, as discussed by the authors, is a form of unsupervised deep learning that allows the network to learn rich visual features that help in performing downstream computer vision tasks such as image classification, object detection, and image segmentation.
Posted Content · DOI

Self-Supervised Deep-Learning Encodes High-Resolution Features of Protein Subcellular Localization

TL;DR: In this article, a deep learning-based approach for fully self-supervised protein localization profiling and clustering is presented, which does not require pre-existing knowledge, categories, or annotations.
References
Proceedings Article · DOI

Revisiting Self-Supervised Visual Representation Learning

TL;DR: This study revisits numerous previously proposed self-supervised models, conducts a thorough large-scale study, and uncovers multiple crucial insights, including that standard recipes for CNN design do not always translate to self-supervised representation learning.
Journal Article · DOI

Discriminative Unsupervised Feature Learning with Exemplar Convolutional Neural Networks

TL;DR: In this article, surrogate classes are formed by applying a variety of transformations to randomly sampled image patches, with each patch defining its own class; the resulting feature representation is not class-specific, but provides robustness to the transformations that have been applied during training.
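As an illustration only, the following sketch shows one way such surrogate classes could be built, under the assumption that `patches` holds randomly sampled seed patches and that every transformed view inherits the index of its seed patch as its label; the original work's exact transformation set and training pipeline differ.

import torch
from torchvision import transforms

# Random transformations that define each surrogate class's samples.
augment = transforms.Compose([
    transforms.RandomResizedCrop(32, scale=(0.7, 1.0)),
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4, hue=0.1),
])

patches = torch.rand(100, 3, 32, 32)  # stand-in for randomly sampled image patches

def surrogate_batch(num_views=8):
    # Every transformed view keeps the label (index) of its seed patch.
    views, labels = [], []
    for label, patch in enumerate(patches):
        for _ in range(num_views):
            views.append(augment(patch))
            labels.append(label)
    return torch.stack(views), torch.tensor(labels)

x, y = surrogate_batch()  # train any classifier to predict y from x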
Posted Content

Training Deep Nets with Sublinear Memory Cost.

TL;DR: This work designs an algorithm that costs O(√n) memory to train an n-layer network, with only the computational cost of an extra forward pass per mini-batch, and shows that it is possible to trade computation for memory, giving a more memory-efficient training algorithm at a small extra computational cost.
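A minimal PyTorch sketch of the same compute-for-memory trade-off, using the built-in torch.utils.checkpoint utilities rather than the paper's own implementation: activations inside each segment are discarded after the forward pass and recomputed during backpropagation.

import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint_sequential

# A 16-block sequential stack; only segment boundaries keep their activations.
blocks = nn.Sequential(*[nn.Sequential(nn.Linear(512, 512), nn.ReLU()) for _ in range(16)])

x = torch.randn(8, 512, requires_grad=True)
# Split the stack into 4 segments (roughly sqrt(n)-style checkpointing):
# activations inside each segment are recomputed during the backward pass.
out = checkpoint_sequential(blocks, 4, x)
out.sum().backward()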
Posted Content

Large Batch Training of Convolutional Networks

TL;DR: It is argued that the current recipe for large-batch training (linear learning rate scaling with warm-up) is not general enough and training may diverge; a new training algorithm based on Layer-wise Adaptive Rate Scaling (LARS) is therefore proposed.
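The core of LARS is a layer-wise trust ratio that rescales the global learning rate by the ratio of the parameter norm to the update norm. The sketch below implements just that ratio on top of plain SGD, with momentum and the usual exclusion of biases and BatchNorm parameters omitted for brevity; it is not the paper's reference implementation.

import torch

def lars_step(params, lr=0.1, eta=0.001, weight_decay=1e-4):
    # One LARS-style update: a per-parameter (layer-wise) trust ratio scales the global lr.
    with torch.no_grad():
        for p in params:
            if p.grad is None:
                continue
            update = p.grad + weight_decay * p
            w_norm, u_norm = p.norm(), update.norm()
            trust_ratio = float(eta * w_norm / u_norm) if w_norm > 0 and u_norm > 0 else 1.0
            p.add_(update, alpha=-lr * trust_ratio)

# Toy usage on a single linear layer.
model = torch.nn.Linear(512, 512)
loss = model(torch.randn(32, 512)).pow(2).mean()
loss.backward()
lars_step(model.parameters(), lr=0.1)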
Proceedings Article · DOI

The iNaturalist Species Classification and Detection Dataset

TL;DR: The iNaturalist dataset, as discussed by the authors, contains 859,000 images from over 5,000 different species of plants and animals, captured in a wide variety of situations from all over the world.