Open Access · Posted Content
Self-supervised Pretraining of Visual Features in the Wild
Priya Goyal, Mathilde Caron, Benjamin Lefaudeux, Min Xu, Pengchao Wang, Vivek S. Pai, Mannat Singh, Vitaliy Liptchinsky, Ishan Misra, Armand Joulin, Piotr Bojanowski
TLDR
Self-supervised learning methods like MoCo, SimCLR, BYOL, and SwAV have recently reduced the gap with supervised methods, but only on the highly curated ImageNet dataset; this paper tests the premise that self-supervision can learn from any random image and any unbounded dataset by pretraining a 1.3B-parameter model on 1B uncurated images.
Abstract:
Recently, self-supervised learning methods like MoCo, SimCLR, BYOL and SwAV have reduced the gap with supervised methods. These results have been achieved in a controlled environment, that is, the highly curated ImageNet dataset. However, the premise of self-supervised learning is that it can learn from any random image and from any unbounded dataset. In this work, we explore whether self-supervision lives up to this expectation by training large models on random, uncurated images with no supervision. Our final SElf-supERvised (SEER) model, a RegNetY with 1.3B parameters trained on 1B random images with 512 GPUs, achieves 84.2% top-1 accuracy, surpassing the best self-supervised pretrained model by 1% and confirming that self-supervised learning works in a real-world setting. Interestingly, we also observe that self-supervised models are good few-shot learners, achieving 77.9% top-1 with access to only 10% of ImageNet. Code: this https URL
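SEER is pretrained with the SwAV objective scaled up to a billion uncurated images. As a rough illustration of that objective (a minimal sketch, not the SEER/VISSL implementation), the PyTorch snippet below computes SwAV's swapped-prediction loss: soft cluster codes for each augmented view, obtained with a few Sinkhorn-Knopp iterations, supervise the softmax prediction of the other view. All shapes, hyperparameters (`eps`, `temp`), and helper names are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def sinkhorn(scores: torch.Tensor, eps: float = 0.05, iters: int = 3) -> torch.Tensor:
    """Sinkhorn-Knopp: turn prototype scores into balanced soft assignments."""
    q = torch.exp(scores / eps).t()              # (K prototypes, B samples)
    q /= q.sum()
    K, B = q.shape
    for _ in range(iters):
        q /= q.sum(dim=1, keepdim=True); q /= K  # rows: spread mass over prototypes
        q /= q.sum(dim=0, keepdim=True); q /= B  # columns: one unit of mass per sample
    return (q * B).t()                           # (B, K), each row sums to 1

def swav_loss(z1: torch.Tensor, z2: torch.Tensor,
              prototypes: torch.Tensor, temp: float = 0.1) -> torch.Tensor:
    """Swapped prediction: the codes of one view supervise the other view."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    p1, p2 = z1 @ prototypes.t(), z2 @ prototypes.t()    # prototype scores per view
    q1, q2 = sinkhorn(p1), sinkhorn(p2)                  # soft cluster codes (no grad)
    return (-(q1 * F.log_softmax(p2 / temp, dim=1)).sum(dim=1).mean()
            - (q2 * F.log_softmax(p1 / temp, dim=1)).sum(dim=1).mean()) / 2
```

In the actual system this loss sits inside a large distributed training loop over random internet images; the sketch only captures the per-batch objective.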
Citations
Journal Article
Self-supervised Learning: Generative or Contrastive.
TL;DR: This survey examines new self-supervised learning methods for representation learning in computer vision, natural language processing, and graph learning, and comprehensively reviews the existing empirical methods in three main categories according to their objectives.
Posted Content
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron, Hugo Touvron, Ishan Misra, Hervé Jégou, Julien Mairal, Piotr Bojanowski, Armand Joulin
TL;DR: In this paper, the authors show that self-supervised learning gives Vision Transformers (ViT) properties that stand out compared to convolutional networks (convnets), beyond the fact that adapting self-supervised methods to this architecture works particularly well. In particular, self-supervised ViT features contain explicit information about the semantic segmentation of an image, which does not emerge as clearly with supervised ViTs, nor with convnets.
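The segmentation claim is typically inspected by visualizing the [CLS] self-attention of the last ViT layer. Here is a short, hedged sketch, assuming the public facebookresearch/dino torch.hub entry point and its get_last_selfattention helper (treat these names as assumptions if the repo changes):

```python
import torch

# Pretrained DINO ViT-S/16 from the public repo's torch.hub entry point.
model = torch.hub.load('facebookresearch/dino:main', 'dino_vits16')
model.eval()

img = torch.randn(1, 3, 224, 224)             # stand-in for a normalized image
with torch.no_grad():
    attn = model.get_last_selfattention(img)  # (1, heads, tokens, tokens)

# Attention from the [CLS] token to each 16x16 patch, one map per head;
# reshaped to the 14x14 patch grid, these act like coarse object masks.
n_heads = attn.shape[1]
cls_attn = attn[0, :, 0, 1:].reshape(n_heads, 14, 14)
```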
Journal Article
Artificial intelligence and machine learning for medical imaging: A technology review.
Ana M. Barragan-Montero, Umair Javaid, Gilmer Valdes, Dan Nguyen, Paul Desbordes, Benoît Macq, S. Willems, Liesbeth Vandewinckele, Mats Holmström, Fredrik Löfman, Steven Michiels, Kevin Souris, Edmond Sterpin, John Aldo Lee
TL;DR: Artificial intelligence (AI) has recently become a very popular buzzword, as a consequence of disruptive technical advances and impressive experimental results, notably in the field of image analysis and processing as discussed by the authors.
Journal Article
Review on self-supervised image recognition using deep neural networks
Kriti Ohri, Mukesh Kumar
TL;DR: Self-supervised learning as discussed by the authors is a form of unsupervised deep learning that allows the network to learn rich visual features that help in performing downstream computer vision tasks such as image classification, object detection, and image segmentation.
Posted Content
Self-Supervised Deep-Learning Encodes High-Resolution Features of Protein Subcellular Localization
TL;DR: In this article, a deep learning-based approach for fully self-supervised protein localization profiling and clustering is presented, which does not require pre-existing knowledge, categories, or annotations.
References
Book Chapter
Learning Visual Features from Large Weakly Supervised Data
TL;DR: In this paper, the authors explore the potential of leveraging massive, weakly-labeled image collections for learning good visual features, and train convolutional networks on a dataset of 100 million Flickr photos and comments.
Proceedings Article
Libri-Light: A Benchmark for ASR with Limited or No Supervision
Jacob Kahn, Morgane Riviere, Weiyi Zheng, Eugene Kharitonov, Qiantong Xu, Pierre-Emmanuel Mazaré, Julien Karadayi, Vitaliy Liptchinsky, Ronan Collobert, Christian Fuegen, Tatiana Likhomanenko, Gabriel Synnaeve, Armand Joulin, Abdelrahman Mohamed, Emmanuel Dupoux
TL;DR: In this article, the authors introduce a new collection of spoken English audio suitable for training speech recognition systems under limited or no supervision, which is derived from open-source audio books from the LibriVox project.
Proceedings Article
Self-labelling via simultaneous clustering and representation learning
TL;DR: In this paper, the authors propose to maximize the information between labels and input data indices, which turns unsupervised learning of deep neural networks into a label-assignment problem solved jointly with standard cross-entropy training.
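Concretely, the method alternates an assignment step with ordinary training. The sketch below is an illustrative reconstruction, not the authors' code: the E-step projects the network's class scores for all N images onto an equipartition with Sinkhorn-Knopp iterations (the paper uses a fast variant), and the M-step trains on the resulting pseudo-labels. The loader contract and all names are assumptions.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def balanced_pseudo_labels(logits: torch.Tensor, iters: int = 50) -> torch.Tensor:
    """E-step: Sinkhorn-Knopp projection of class scores onto an equipartition,
    so every cluster receives roughly the same number of images."""
    P = F.softmax(logits, dim=1).t()          # (K classes, N images)
    P /= P.sum()
    K, N = P.shape
    for _ in range(iters):
        P /= P.sum(dim=1, keepdim=True); P /= K
        P /= P.sum(dim=0, keepdim=True); P /= N
    return P.t().argmax(dim=1)                # one hard pseudo-label per image

def train_epoch(model, loader, opt, labels):
    """M-step: ordinary supervised training against the fixed pseudo-labels."""
    for idx, images in loader:                # loader yields (index, image) pairs
        loss = F.cross_entropy(model(images), labels[idx])
        opt.zero_grad(); loss.backward(); opt.step()
```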
Proceedings Article
Unsupervised Pretraining Transfers Well Across Languages
TL;DR: In this article, the authors investigate whether unsupervised pretraining with contrastive predictive coding (CPC), an algorithm proposed to pretrain ASR systems on unlabeled data, transfers well across languages.
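CPC's training signal is the InfoNCE loss: a context vector summarizing the past must identify the encoding of its true future step among in-batch negatives. A minimal PyTorch sketch follows; the bilinear predictor W and the temperature are illustrative assumptions (the original CPC uses a log-bilinear score without temperature scaling).

```python
import torch
import torch.nn.functional as F

def info_nce(context: torch.Tensor, future: torch.Tensor,
             W: torch.Tensor, temp: float = 0.1) -> torch.Tensor:
    """context: (B, D) summary of the past; future: (B, D) encoding of the
    true next step; W: (D, D) learned linear predictor."""
    pred = context @ W                          # predict the future embedding
    logits = pred @ future.t() / temp           # (B, B) similarity matrix
    targets = torch.arange(logits.size(0), device=logits.device)
    return F.cross_entropy(logits, targets)     # positives sit on the diagonal
```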
Proceedings Article
Unsupervised Learning by Predicting Noise
Piotr Bojanowski, Armand Joulin
TL;DR: In this article, the authors propose to fix a set of target representations, called Noise As Targets (NAT), and constrain the deep features to align to them, avoiding trivial solutions and feature collapse.
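Here is a hedged reconstruction of one NAT training step, assuming unit-normalized features and using scipy's linear_sum_assignment in place of the paper's cheaper batch-local update; all sizes and names are illustrative.

```python
import torch
import torch.nn.functional as F
from scipy.optimize import linear_sum_assignment

N, D = 10_000, 128                                 # illustrative dataset/feature sizes
targets = F.normalize(torch.randn(N, D), dim=1)    # fixed unit-norm noise targets
assignment = torch.randperm(N)                     # image index -> target index

def nat_step(model, images, idx):
    """One batch: re-match features to the targets currently held by the
    batch images, then regress the features onto their matched targets."""
    z = F.normalize(model(images), dim=1)          # (B, D) unit-norm features
    held = assignment[idx]                         # targets owned by this batch
    cost = -(z @ targets[held].t())                # matching maximizes dot product
    _, col = linear_sum_assignment(cost.detach().cpu().numpy())
    assignment[idx] = held[torch.as_tensor(col)]   # permute targets within the batch
    return -(z * targets[assignment[idx]]).sum(dim=1).mean()  # alignment loss
```

Because the targets are fixed and the assignment is a permutation, the features cannot all collapse to a single point, which is the failure mode the method is designed to avoid.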