Open Access · Posted Content

VICReg: Variance-Invariance-Covariance Regularization for Self-Supervised Learning

TLDR
VICReg combines an explicit variance term, which prevents representation collapse, with a decorrelation mechanism based on redundancy reduction and covariance regularization, and achieves results on par with the state of the art on several downstream tasks.
Abstract
Recent self-supervised methods for image representation learning are based on maximizing the agreement between embedding vectors from different views of the same image. A trivial solution is obtained when the encoder outputs constant vectors. This collapse problem is often avoided through implicit biases in the learning architecture, which often lack a clear justification or interpretation. In this paper, we introduce VICReg (Variance-Invariance-Covariance Regularization), a method that explicitly avoids the collapse problem with a simple regularization term on the variance of the embeddings along each dimension individually. VICReg combines the variance term with a decorrelation mechanism based on redundancy reduction and covariance regularization, and achieves results on par with the state of the art on several downstream tasks. In addition, we show that incorporating our new variance term into other methods helps stabilize the training and leads to performance improvements.
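
The three terms named in the abstract translate directly into a loss function. The snippet below is a minimal PyTorch sketch written from this description alone; the loss coefficients, the variance target of 1, and the epsilon inside the square root are illustrative assumptions rather than values quoted on this page.

```python
# Minimal sketch of a VICReg-style loss written from the abstract's description.
# Coefficients, the variance target of 1, and eps are illustrative assumptions.
import torch
import torch.nn.functional as F

def vicreg_loss(z_a, z_b, sim_coeff=25.0, std_coeff=25.0, cov_coeff=1.0, eps=1e-4):
    """z_a, z_b: (batch, dim) embeddings of two views of the same batch of images."""
    n, d = z_a.shape

    # Invariance: pull the two views' embeddings of each image together.
    inv_loss = F.mse_loss(z_a, z_b)

    # Variance: hinge loss keeping the standard deviation of every embedding
    # dimension above a target of 1, which prevents collapse to constant vectors.
    std_a = torch.sqrt(z_a.var(dim=0) + eps)
    std_b = torch.sqrt(z_b.var(dim=0) + eps)
    var_loss = F.relu(1.0 - std_a).mean() + F.relu(1.0 - std_b).mean()

    # Covariance: push off-diagonal covariance entries toward zero so that
    # embedding dimensions are decorrelated (the redundancy-reduction mechanism).
    z_a_c = z_a - z_a.mean(dim=0)
    z_b_c = z_b - z_b.mean(dim=0)
    cov_a = (z_a_c.T @ z_a_c) / (n - 1)
    cov_b = (z_b_c.T @ z_b_c) / (n - 1)

    def off_diagonal(m):
        return m - torch.diag(torch.diag(m))

    cov_loss = off_diagonal(cov_a).pow(2).sum() / d + off_diagonal(cov_b).pow(2).sum() / d

    return sim_coeff * inv_loss + std_coeff * var_loss + cov_coeff * cov_loss
```

Note that the variance term acts on each embedding dimension of each view separately, which is what lets the method avoid collapse without comparing negative pairs.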


Citations
Posted Content

Decoupled Contrastive Learning

TL;DR: The authors propose a decoupled contrastive objective for self-supervised learning (SSL), which treats two augmented views of the same image as a positive pair and views of other images as negatives to be pushed apart.
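
As a rough illustration of what a "decoupled" contrastive objective can look like, the sketch below starts from a standard InfoNCE-style loss over two batches of embeddings and drops the positive pair from the denominator; the cross-view-only negatives, cosine similarity, and temperature are assumptions made for this example, not details taken from this page.

```python
# Hedged sketch of a decoupled InfoNCE-style objective, assuming "decoupled"
# means removing the positive pair from the denominator of the contrastive loss.
import torch
import torch.nn.functional as F

def decoupled_contrastive_loss(z_a, z_b, temperature=0.1):
    """z_a, z_b: (batch, dim) embeddings of two augmented views; row i of each
    tensor comes from the same image, so (z_a[i], z_b[i]) is the positive pair."""
    z_a = F.normalize(z_a, dim=1)
    z_b = F.normalize(z_b, dim=1)
    logits = z_a @ z_b.T / temperature            # (batch, batch) similarities
    pos = logits.diagonal()                       # positive-pair similarities

    # Standard InfoNCE takes logsumexp over the full row, positives included;
    # the decoupled variant sums only over the off-diagonal (negative) entries.
    eye = torch.eye(logits.size(0), dtype=torch.bool, device=logits.device)
    neg = logits.masked_fill(eye, float("-inf")).logsumexp(dim=1)

    return (neg - pos).mean()
```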
Posted Content

3D Infomax improves GNNs for Molecular Property Prediction

TL;DR: In this article, the authors proposed pre-training a model to reason about the geometry of molecules given only their 2D molecular graphs by maximizing the mutual information between 3D summary vectors and the representations of a Graph Neural Network (GNN) such that they contain latent 3D information.
Posted Content

Can contrastive learning avoid shortcut solutions?

TL;DR: The authors propose implicit feature modification (IFM), a method for altering positive and negative samples to guide contrastive models towards capturing a wider variety of predictive features, which improves performance on vision and medical imaging tasks.
Posted Content

An Empirical Study of Graph Contrastive Learning

TL;DR: In this paper, the authors identify several critical design considerations within a general GCL paradigm, including augmentation functions, contrasting modes, contrastive objectives, and negative mining techniques, and conduct extensive, controlled experiments over a set of benchmark tasks on datasets across various domains.
Posted Content

AAVAE: Augmentation-Augmented Variational Autoencoders

TL;DR: In this article, the authors introduce augmentation-augmented variational autoencoders (AAVAE), a third approach to self-supervised learning based on autoencoding, which replaces the KL divergence regularization with data augmentations that explicitly encourage the internal representations to encode domain-specific invariances and equivariances.
References
Proceedings ArticleDOI

ClusterFit: Improving Generalization of Visual Representations

TL;DR: ClusterFit improves the robustness of visual representations learned during pre-training by clustering the pre-trained network's features with k-means and re-training the network on a new dataset using the cluster assignments as pseudo-labels.
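
The TL;DR above is essentially a two-step recipe, and the sketch below spells it out; the encoder interfaces, the scikit-learn k-means call, and the training-loop hyperparameters are illustrative placeholders rather than the paper's exact setup.

```python
# Sketch of the two-step recipe described above: cluster features from a
# pre-trained encoder with k-means, then train on the cluster assignments as
# pseudo-labels. Encoder interfaces and hyperparameters are placeholders.
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

@torch.no_grad()
def extract_features(encoder, loader, device="cpu"):
    encoder.eval()
    feats = [encoder(x.to(device)).flatten(1).cpu() for x, _ in loader]
    return torch.cat(feats)

def clusterfit(pretrained_encoder, new_encoder, feat_dim, loader,
               num_clusters=1000, epochs=10, device="cpu"):
    # Step 1: k-means on the pre-trained features -> one pseudo-label per image.
    features = extract_features(pretrained_encoder, loader, device)
    pseudo_labels = torch.as_tensor(
        KMeans(n_clusters=num_clusters).fit_predict(features.numpy()),
        dtype=torch.long,
    )

    # Step 2: train a network (plus a linear head) to predict the pseudo-labels.
    head = nn.Linear(feat_dim, num_clusters).to(device)
    params = list(new_encoder.parameters()) + list(head.parameters())
    opt = torch.optim.SGD(params, lr=0.1, momentum=0.9)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        offset = 0
        for x, _ in loader:  # assumes an unshuffled loader, matching extraction order
            y = pseudo_labels[offset:offset + len(x)].to(device)
            offset += len(x)
            logits = head(new_encoder(x.to(device)).flatten(1))
            opt.zero_grad()
            loss_fn(logits, y).backward()
            opt.step()
    return new_encoder, head
```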
Proceedings Article

Prototypical Contrastive Learning of Unsupervised Representations

TL;DR: Prototypical Contrastive Learning (PCL) introduces prototypes as latent variables to help find the maximum-likelihood estimate of the network parameters in an Expectation-Maximization framework.
Proceedings Article

Unsupervised Deep Learning by Neighbourhood Discovery

TL;DR: In this article, the authors introduce a generic unsupervised deep learning approach to training deep models without the need for any manual label supervision, which progressively discovers sample-anchored/centred neighbourhoods to reason about and learn the underlying class decision boundaries iteratively and cumulatively.
Proceedings ArticleDOI

Learning Representations by Predicting Bags of Visual Words

TL;DR: In this article, a self-supervised approach based on spatially dense image descriptions that encode discrete visual concepts, here called visual words, is proposed to learn perturbation-invariant and context-aware image features.
Posted Content

Understanding self-supervised Learning Dynamics without Contrastive Pairs

TL;DR: DirectPred directly sets the linear predictor based on the statistics of its inputs, without gradient training, and outperforms a linear predictor by 2.5% in 300-epoch training.
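
As one way to make "setting the predictor from the statistics of its inputs" concrete, the sketch below builds a symmetric predictor weight from the eigendecomposition of the embedding correlation matrix; the specific square-root spectral rule is an assumption made for illustration, not a detail quoted on this page.

```python
# Hedged sketch of setting a linear predictor from the statistics of its inputs.
# The spectral rule used here (eigenbasis of the embedding correlation matrix,
# eigenvalues mapped through a square root) is an illustrative assumption.
import torch

@torch.no_grad()
def predictor_from_statistics(z):
    """z: (batch, dim) embeddings that feed the predictor; returns (dim, dim) weights."""
    corr = (z.T @ z) / z.shape[0]                 # correlation matrix of the inputs
    eigvals, eigvecs = torch.linalg.eigh(corr)    # symmetric PSD -> real spectrum
    eigvals = eigvals.clamp(min=0.0)
    # Keep the eigenbasis of the correlation matrix and set the predictor's
    # eigenvalues to the square roots of the input eigenvalues.
    return eigvecs @ torch.diag(eigvals.sqrt()) @ eigvecs.T
```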