scispace - formally typeset
Author

Benjamin Lefaudeux

Bio: Benjamin Lefaudeux is an academic researcher. The author has an h-index of 1 and has co-authored 1 publication, receiving 41 citations.

Papers
Posted Content
TL;DR: Recently, self-supervised learning methods like MoCo, SimCLR, BYOL and SwAV have reduced the gap with supervised methods, but only in a controlled environment: the highly curated ImageNet dataset. The premise of self-supervised learning, however, is that it can learn from any random image and from any unbounded dataset.
Abstract: Recently, self-supervised learning methods like MoCo, SimCLR, BYOL and SwAV have reduced the gap with supervised methods. These results have been achieved in a controlled environment, that is, the highly curated ImageNet dataset. However, the premise of self-supervised learning is that it can learn from any random image and from any unbounded dataset. In this work, we explore whether self-supervision lives up to its expectations by training large models on random, uncurated images with no supervision. Our final SElf-supERvised (SEER) model, a RegNetY with 1.3B parameters trained on 1B random images with 512 GPUs, achieves 84.2% top-1 accuracy, surpassing the best self-supervised pretrained model by 1% and confirming that self-supervised learning works in a real-world setting. Interestingly, we also observe that self-supervised models are good few-shot learners, achieving 77.9% top-1 with access to only 10% of ImageNet. Code: this https URL
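The headline numbers above (84.2% and 77.9%) are top-1 accuracy: the fraction of images whose highest-scoring class matches the ground-truth label. A minimal sketch of that metric on toy data (illustrative only; this is not the paper's evaluation pipeline):

```python
import numpy as np

def top1_accuracy(logits: np.ndarray, labels: np.ndarray) -> float:
    """Fraction of samples whose highest-scoring class equals the label."""
    preds = logits.argmax(axis=1)
    return float((preds == labels).mean())

# Toy logits for 4 samples over 3 classes.
logits = np.array([
    [2.0, 0.1, 0.3],   # predicts class 0
    [0.2, 1.5, 0.1],   # predicts class 1
    [0.1, 0.2, 3.0],   # predicts class 2
    [1.0, 2.0, 0.5],   # predicts class 1
])
labels = np.array([0, 1, 2, 0])   # the last prediction is wrong
acc = top1_accuracy(logits, labels)  # 3 of 4 correct -> 0.75
```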

42 citations


Cited by
Journal ArticleDOI
TL;DR: This survey reviews new self-supervised learning methods for representation learning in computer vision, natural language processing, and graph learning, and comprehensively organizes the existing empirical methods into three main categories according to their objectives.
Abstract: Deep supervised learning has achieved great success in the last decade. However, its deficiencies of dependence on manual labels and vulnerability to attacks have driven people to explore a better solution. As an alternative, self-supervised learning attracts many researchers for its soaring performance on representation learning in the last several years. Self-supervised representation learning leverages input data itself as supervision and benefits almost all types of downstream tasks. In this survey, we take a look into new self-supervised learning methods for representation in computer vision, natural language processing, and graph learning. We comprehensively review the existing empirical methods and summarize them into three main categories according to their objectives: generative, contrastive, and generative-contrastive (adversarial). We further investigate related theoretical analysis work to provide deeper thoughts on how self-supervised learning works. Finally, we briefly discuss open problems and future directions for self-supervised learning. An outline slide for the survey is provided.
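Of the survey's three categories, the contrastive family (e.g. MoCo, SimCLR) typically optimizes an InfoNCE-style objective: embeddings of two views of the same image should score higher against each other than against the other images in the batch. A minimal NumPy sketch of such a loss (an illustration of the generic idea; the function name and temperature value are assumptions, not from the survey):

```python
import numpy as np

def info_nce(z1: np.ndarray, z2: np.ndarray, temperature: float = 0.1) -> float:
    """Contrastive loss: row i of z1 should match row i of z2 (the positive),
    while all other rows of z2 act as negatives."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature              # pairwise cosine similarities
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))     # positives sit on the diagonal
```

Perfectly aligned pairs give a near-zero loss, while mismatched pairs are penalized, which is what pushes two views of the same image together in embedding space.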

576 citations

Posted Content
TL;DR: Self-supervised learning provides new properties to Vision Transformers (ViT) that stand out compared to convolutional networks (convnets). Beyond the fact that adapting self-supervised methods to this architecture works particularly well, self-supervised ViT features contain explicit information about the semantic segmentation of an image, which does not emerge as clearly with supervised ViTs, nor with convnets.
Abstract: In this paper, we question if self-supervised learning provides new properties to Vision Transformer (ViT) that stand out compared to convolutional networks (convnets). Beyond the fact that adapting self-supervised methods to this architecture works particularly well, we make the following observations: first, self-supervised ViT features contain explicit information about the semantic segmentation of an image, which does not emerge as clearly with supervised ViTs, nor with convnets. Second, these features are also excellent k-NN classifiers, reaching 78.3% top-1 on ImageNet with a small ViT. Our study also underlines the importance of momentum encoder, multi-crop training, and the use of small patches with ViTs. We implement our findings into a simple self-supervised method, called DINO, which we interpret as a form of self-distillation with no labels. We show the synergy between DINO and ViTs by achieving 80.1% top-1 on ImageNet in linear evaluation with ViT-Base.
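The k-NN evaluation mentioned above classifies a test image by a majority vote among its nearest neighbours in the frozen feature space, with no training of a classifier head. A toy NumPy sketch of that protocol (not DINO's actual evaluation code; the names and the cosine-similarity choice are assumptions):

```python
import numpy as np

def knn_predict(train_feats, train_labels, test_feats, k=3):
    """Classify each test feature by majority vote among its k nearest
    cosine-similarity neighbours in the (frozen) training features."""
    tr = train_feats / np.linalg.norm(train_feats, axis=1, keepdims=True)
    te = test_feats / np.linalg.norm(test_feats, axis=1, keepdims=True)
    sims = te @ tr.T                        # cosine similarity matrix
    nn = np.argsort(-sims, axis=1)[:, :k]   # indices of k nearest neighbours
    preds = []
    for row in train_labels[nn]:
        vals, counts = np.unique(row, return_counts=True)
        preds.append(vals[counts.argmax()])
    return np.array(preds)

# Toy "frozen features": two clusters along the two axes.
train_feats = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
train_labels = np.array([0, 0, 1, 1])
test_feats = np.array([[1.0, 0.05], [0.05, 1.0]])
preds = knn_predict(train_feats, train_labels, test_feats, k=3)
```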

557 citations

Journal ArticleDOI
TL;DR: Artificial intelligence (AI) has recently become a very popular buzzword as a consequence of disruptive technical advances and impressive experimental results, notably in the field of image analysis and processing.

66 citations

Journal ArticleDOI
TL;DR: Self-supervised learning is a form of unsupervised learning that allows a network to learn rich visual features that help in performing downstream computer vision tasks such as image classification, object detection, and image segmentation.
Abstract: Deep learning has brought significant developments in image understanding tasks such as object detection, image classification, and image segmentation. But the success of image recognition largely relies on supervised learning, which requires a huge number of human-annotated labels. To avoid the costly collection of labeled data, and in domains where very few standard pre-trained models exist, self-supervised learning comes to the rescue. Self-supervised learning is a form of unsupervised learning that allows the network to learn rich visual features that help in performing downstream computer vision tasks such as image classification, object detection, and image segmentation. This paper provides a thorough review of self-supervised learning, which has the potential to revolutionize the computer vision field using unlabeled data. First, the motivation for self-supervised learning and other annotation-efficient learning schemes is discussed. Then, the general pipeline for supervised learning and self-supervised learning is illustrated. Next, various handcrafted pretext tasks are explained that enable learning of visual features from unlabeled image datasets. The paper also highlights the recent breakthroughs in self-supervised learning using contrastive learning and clustering methods that are outperforming supervised learning. Finally, we compare the performance of self-supervised techniques on evaluation tasks such as image classification and detection. The paper concludes with practical considerations and open challenges of image recognition tasks in the self-supervised learning regime. Throughout the review, the core focus is on visual feature learning from images using self-supervised approaches.
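A classic instance of the handcrafted pretext tasks mentioned above is rotation prediction: the network must predict which of four rotations (0°, 90°, 180°, 270°) was applied to an image, so the supervision signal is generated from the unlabeled data itself. A toy sketch of the label generation (illustrative; not code from this review):

```python
import numpy as np

def rotation_pretext(image: np.ndarray):
    """Build (rotated_image, label) pairs for the rotation pretext task.
    Label k means the image was rotated by k * 90 degrees; a classifier
    trained to recover k learns visual features without human labels."""
    return [(np.rot90(image, k), k) for k in range(4)]

img = np.arange(4).reshape(2, 2)
pairs = rotation_pretext(img)   # 4 training pairs from one unlabeled image
```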

51 citations

Posted ContentDOI
29 Mar 2021-bioRxiv
TL;DR: In this article, a deep learning-based approach for fully self-supervised protein localization profiling and clustering is presented, which does not require pre-existing knowledge, categories, or annotations.
Abstract: Elucidating the diversity and complexity of protein localization is essential to fully understand cellular architecture. Here, we present cytoself, a deep learning-based approach for fully self-supervised protein localization profiling and clustering. cytoself leverages a self-supervised training scheme that does not require pre-existing knowledge, categories, or annotations. Applying cytoself to images of 1311 endogenously labeled proteins from the recently released OpenCell database creates a highly resolved protein localization atlas. We show that the representations derived from cytoself encapsulate highly specific features that can be used to derive functional insights for proteins on the sole basis of their localization. Finally, to better understand the inner workings of our model, we dissect the emergent features from which our clustering is derived, interpret these features in the context of the fluorescence images, and analyze the performance contributions of the different components of our approach.
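The clustering step described above (grouping proteins by their learned localization embeddings) can be illustrated in highly simplified form with plain k-means on feature vectors. This NumPy sketch only shows the generic idea; cytoself's actual training and clustering pipeline is considerably more involved:

```python
import numpy as np

def kmeans(X: np.ndarray, k: int, iters: int = 20, seed: int = 0):
    """Minimal k-means: alternate nearest-centre assignment and centre update."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        # Assign each point to its nearest centre (squared Euclidean distance).
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(axis=1)
        # Recompute each centre as the mean of its assigned points.
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers

# Toy "embeddings": two well-separated groups.
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
labels, centers = kmeans(X, 2)
```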

35 citations