Self-labelling via simultaneous clustering and representation learning

Open Access Proceedings Article

TLDR

The authors propose maximizing the information between labels and input data indices, a criterion that extends standard cross-entropy minimization to an optimal transport problem, for unsupervised learning of deep neural networks.

Abstract:

Combining clustering and representation learning is one of the most promising approaches for unsupervised learning of deep neural networks. However, doing so naively leads to ill-posed learning problems with degenerate solutions. In this paper, we propose a novel and principled learning formulation that addresses these issues. The method is obtained by maximizing the information between labels and input data indices. We show that this criterion extends standard cross-entropy minimization to an optimal transport problem, which we solve efficiently for millions of input images and thousands of labels using a fast variant of the Sinkhorn-Knopp algorithm. The resulting method is able to self-label visual data so as to train highly competitive image representations without manual labels. Compared to the best previous method in this class, namely DeepCluster, our formulation minimizes a single objective function for both representation learning and clustering; it also significantly outperforms DeepCluster in standard benchmarks.
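As a rough illustration of the optimal transport step the abstract refers to, the sketch below runs a plain Sinkhorn-Knopp scaling loop in NumPy. It is not the authors' fast variant. It assumes the network's softmax outputs are available as an N x K matrix and that labels should be spread equally over the K clusters; the function name `sinkhorn_labels`, the sharpness parameter `lam`, and the iteration count `n_iters` are hypothetical names chosen for this example.

```python
import numpy as np


def sinkhorn_labels(probs, lam=1.0, n_iters=200):
    """Balanced soft labels via plain Sinkhorn-Knopp scaling (illustrative sketch).

    probs: (N, K) array of strictly positive class probabilities.
    lam:   hypothetical sharpness exponent applied to the probabilities.
    Returns an (N, K) matrix whose rows sum to 1/N and columns to ~1/K.
    """
    n, k = probs.shape
    q = np.power(probs, lam)  # positive kernel built from the predictions
    q /= q.sum()
    for _ in range(n_iters):
        # Rescale columns so every cluster receives equal total mass 1/K.
        q *= (1.0 / k) / q.sum(axis=0, keepdims=True)
        # Rescale rows so every data point carries equal mass 1/N.
        q *= (1.0 / n) / q.sum(axis=1, keepdims=True)
    return q


def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


# Toy stand-in for network outputs: 60 samples, 4 clusters (made-up data).
rng = np.random.default_rng(0)
probs = softmax(0.5 * rng.standard_normal((60, 4)))
q = sinkhorn_labels(probs)
pseudo_labels = q.argmax(axis=1)  # hard self-labels read off the soft assignment
```

In a training loop of the kind the abstract describes, such pseudo-labels would serve as cross-entropy targets and be recomputed periodically as the representation changes. The equal-partition constraint is what rules out the degenerate solution of assigning every input to one label.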

Citations

The PASCAL Visual Object Classes Challenge

Jianguo Zhang

Posted Content

Unsupervised Learning of Visual Features by Contrasting Cluster Assignments

Mathilde Caron, +5 more

- 17 Jun 2020 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: This paper proposes an online algorithm, SwAV, that takes advantage of contrastive methods without computing pairwise comparisons; it uses a swapped prediction mechanism that predicts the cluster assignment of one view from the representation of another view.

Posted Content

Exploring Simple Siamese Representation Learning

Xinlei Chen, +1 more

- 20 Nov 2020 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: Surprising empirical results are reported that simple Siamese networks can learn meaningful representations even using none of the following: (i) negative sample pairs, (ii) large batches, (iii) momentum encoders.

Proceedings ArticleDOI

Exploring Simple Siamese Representation Learning

Xinlei Chen, +1 more

TL;DR: SimSiam uses a stop-gradient operation to prevent collapsing solutions in Siamese networks and achieves competitive results on ImageNet and downstream tasks; proof-of-concept experiments are reported to verify the role of stop-gradient.

Posted Content

Emerging Properties in Self-Supervised Vision Transformers

Mathilde Caron, +7 more

- 29 Apr 2021 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: This paper observes that self-supervised learning gives Vision Transformers (ViT) properties that stand out compared with convolutional networks (convnets), beyond the fact that adapting self-supervised methods to this architecture works particularly well. First, self-supervised ViT features contain explicit information about the semantic segmentation of an image, which does not emerge as clearly with supervised ViTs or with convnets.

References

Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

Alex Krizhevsky, +2 more

TL;DR: A deep convolutional neural network consisting of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax achieved state-of-the-art performance on ImageNet classification.

Proceedings ArticleDOI

ImageNet: A large-scale hierarchical image database

Jia Deng, +5 more

TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.

Journal ArticleDOI

Pattern Recognition and Machine Learning

Radford M. Neal

- 01 Aug 2007 -

Technometrics

TL;DR: This book covers a broad range of topics for regular factorial designs and presents all of the material in very mathematical fashion and will surely become an invaluable resource for researchers and graduate students doing research in the design of factorial experiments.

Journal ArticleDOI

The Pascal Visual Object Classes (VOC) Challenge

Mark Everingham, +4 more

- 01 Jun 2010 -

International Journal of Computer Vision

TL;DR: The state of the art in evaluated methods for both classification and detection is reviewed, including whether the methods are statistically different, what they are learning from the images, and what the methods find easy or confuse.

Dissertation

Learning Multiple Layers of Features from Tiny Images

Alex Krizhevsky

TL;DR: The authors describe how to train a multi-layer generative model of natural images using a dataset of millions of tiny colour images.

Related Papers (5)

Momentum Contrast for Unsupervised Visual Representation Learning

Deep Residual Learning for Image Recognition

Representation Learning with Contrastive Predictive Coding

Unsupervised Representation Learning by Predicting Image Rotations

Unsupervised Feature Learning via Non-parametric Instance Discrimination