Open Access · Posted Content

VICReg: Variance-Invariance-Covariance Regularization for Self-Supervised Learning

TLDR
VICReg, as discussed by the authors, explicitly avoids representation collapse with a variance term on the embeddings, combines it with a decorrelation mechanism based on redundancy reduction and covariance regularization, and achieves results on par with the state of the art on several downstream tasks.
Abstract
Recent self-supervised methods for image representation learning are based on maximizing the agreement between embedding vectors from different views of the same image. A trivial solution is obtained when the encoder outputs constant vectors. This collapse problem is often avoided through implicit biases in the learning architecture, which often lack a clear justification or interpretation. In this paper, we introduce VICReg (Variance-Invariance-Covariance Regularization), a method that explicitly avoids the collapse problem with a simple regularization term on the variance of the embeddings along each dimension individually. VICReg combines the variance term with a decorrelation mechanism based on redundancy reduction and covariance regularization, and achieves results on par with the state of the art on several downstream tasks. In addition, we show that incorporating our new variance term into other methods helps stabilize the training and leads to performance improvements.
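As a rough illustration of the three terms described in the abstract, here is a minimal PyTorch-style sketch of a VICReg-like loss. The coefficient values follow the paper's reported defaults, but the projector architecture and training loop are omitted, and details may differ from the official implementation.

```python
import torch
import torch.nn.functional as F

def vicreg_loss(z_a, z_b, sim_coeff=25.0, std_coeff=25.0, cov_coeff=1.0, eps=1e-4):
    """Sketch of the VICReg objective for two batches of embeddings
    z_a, z_b of shape (N, D), one per view of the same images."""
    N, D = z_a.shape

    # Invariance: mean-squared error between the two views' embeddings.
    inv_loss = F.mse_loss(z_a, z_b)

    # Variance: hinge loss keeping the std of each dimension above 1.
    std_a = torch.sqrt(z_a.var(dim=0) + eps)
    std_b = torch.sqrt(z_b.var(dim=0) + eps)
    var_loss = torch.mean(F.relu(1.0 - std_a)) + torch.mean(F.relu(1.0 - std_b))

    # Covariance: push off-diagonal covariance entries toward zero,
    # decorrelating the embedding dimensions (redundancy reduction).
    z_a = z_a - z_a.mean(dim=0)
    z_b = z_b - z_b.mean(dim=0)
    cov_a = (z_a.T @ z_a) / (N - 1)
    cov_b = (z_b.T @ z_b) / (N - 1)
    off_a = cov_a - torch.diag(torch.diag(cov_a))
    off_b = cov_b - torch.diag(torch.diag(cov_b))
    cov_loss = off_a.pow(2).sum() / D + off_b.pow(2).sum() / D

    return sim_coeff * inv_loss + std_coeff * var_loss + cov_coeff * cov_loss
```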


Citations
Posted Content

Decoupled Contrastive Learning

TL;DR: The authors proposed a decoupled contrastive objective function for self-supervised learning (SSL), which treats two augmented views of the same image as a positive pair and all other samples as negatives to be pushed further apart, removing the positive pair from the denominator of the standard contrastive loss.
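A minimal sketch of the decoupling idea, assuming L2-normalized embeddings and a simplified cross-view InfoNCE form (the paper's full loss also uses within-view negatives and a weighting function, both omitted here):

```python
import torch
import torch.nn.functional as F

def decoupled_contrastive_loss(z1, z2, temperature=0.1):
    """Sketch of a decoupled InfoNCE loss for two batches of embeddings
    z1, z2 of shape (N, D): the positive pair is excluded from the
    denominator, decoupling the positive and negative terms."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    sim = z1 @ z2.T / temperature              # (N, N) cross-view similarities
    pos = torch.diag(sim)                      # positive pairs on the diagonal
    # Mask out the positive term before summing over the negatives.
    diag_mask = torch.eye(len(z1), dtype=torch.bool, device=z1.device)
    neg = torch.logsumexp(sim.masked_fill(diag_mask, float('-inf')), dim=1)
    return (neg - pos).mean()
```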
Posted Content

3D Infomax improves GNNs for Molecular Property Prediction

TL;DR: In this article, the authors proposed pre-training a model to reason about the geometry of molecules given only their 2D molecular graphs, by maximizing the mutual information between 3D summary vectors and the representations of a Graph Neural Network (GNN) so that those representations contain latent 3D information.
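The mutual-information maximization can be instantiated with a contrastive objective; below is a hedged NT-Xent-style sketch in which matching 2D/3D pairs of the same molecule are positives and all other pairs in the batch are negatives. The function name and shapes are illustrative, not the paper's code.

```python
import torch
import torch.nn.functional as F

def ntxent_2d_3d(h_2d, h_3d, temperature=0.1):
    """Sketch of a contrastive MI lower bound between a 2D GNN's graph
    representations h_2d and 3D summary vectors h_3d, both (N, D)."""
    h_2d, h_3d = F.normalize(h_2d, dim=1), F.normalize(h_3d, dim=1)
    logits = h_2d @ h_3d.T / temperature          # (N, N) similarities
    targets = torch.arange(len(h_2d), device=h_2d.device)  # diagonal = positives
    return F.cross_entropy(logits, targets)
```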
Posted Content

Can contrastive learning avoid shortcut solutions?

TL;DR: This paper proposed implicit feature modification (IFM), a method for altering positive and negative samples in order to guide contrastive models towards capturing a wider variety of predictive features, which improves performance on vision and medical imaging tasks.
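For dot-product similarity, the adversarial direction for a positive or negative embedding is the anchor itself, which makes the modification available in closed form. The sketch below illustrates this idea; the `epsilon` value, shapes, and loss form are assumptions for illustration, not the paper's exact configuration.

```python
import torch
import torch.nn.functional as F

def ifm_infonce(anchor, pos, negs, epsilon=0.1, temperature=0.1):
    """Sketch of implicit-feature-modification-style InfoNCE: perturb the
    positive away from the anchor and the negatives toward it, making
    both harder. anchor: (N, D), pos: (N, D), negs: (N, K, D), all
    assumed L2-normalized."""
    pos_adv = pos - epsilon * anchor                 # harder positives
    negs_adv = negs + epsilon * anchor.unsqueeze(1)  # harder negatives
    l_pos = (anchor * pos_adv).sum(dim=1, keepdim=True) / temperature   # (N, 1)
    l_neg = torch.einsum('nd,nkd->nk', anchor, negs_adv) / temperature  # (N, K)
    logits = torch.cat([l_pos, l_neg], dim=1)
    targets = torch.zeros(len(anchor), dtype=torch.long, device=anchor.device)
    return F.cross_entropy(logits, targets)          # positive sits at index 0
```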
Posted Content

An Empirical Study of Graph Contrastive Learning

TL;DR: In this paper, the authors identify several critical design considerations within a general graph contrastive learning (GCL) paradigm, including augmentation functions, contrasting modes, contrastive objectives, and negative mining techniques, and conduct extensive, controlled experiments over a set of benchmark tasks on datasets across various domains.
Posted Content

AAVAE: Augmentation-Augmented Variational Autoencoders

TL;DR: In this article, the authors introduce augmentation-augmented variational autoencoders (AAVAE), a third approach to self-supervised learning based on autoencoding, which replaces the KL divergence regularization with data augmentations that explicitly encourage the internal representations to encode domain-specific invariances and equivariances.
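A minimal sketch of one training step under this scheme, assuming user-supplied `encoder`, `decoder`, and `augment` callables; the deterministic encoding and plain reconstruction loss stand in for the paper's specific architecture and objective.

```python
import torch
import torch.nn.functional as F

def aavae_step(encoder, decoder, x, augment):
    """Sketch of an augmentation-augmented autoencoder step: the VAE's
    KL regularizer is dropped, and the model must instead reconstruct
    the original image x from an augmented view of it, encouraging
    invariance to the chosen augmentations."""
    x_aug = augment(x)            # e.g. random crop + color jitter
    z = encoder(x_aug)            # deterministic encoding; no KL term
    x_hat = decoder(z)
    return F.mse_loss(x_hat, x)   # reconstruct the *un-augmented* input
```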
References
Proceedings Article

Signature Verification using a "Siamese" Time Delay Neural Network

TL;DR: An algorithm for verifying signatures written on a pen-input tablet, based on a novel artificial neural network called a "Siamese" neural network, which consists of two identical sub-networks joined at their outputs.
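A minimal modern sketch of the weight-sharing idea (the original paper uses a time-delay network on pen-trajectory features; the MLP here is just a placeholder):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Siamese(nn.Module):
    """Two inputs pass through the *same* sub-network (shared weights),
    and the two outputs are compared with a distance function."""
    def __init__(self, in_dim=128, out_dim=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, out_dim)
        )

    def forward(self, x1, x2):
        e1, e2 = self.net(x1), self.net(x2)   # identical weights for both inputs
        return F.cosine_similarity(e1, e2)    # high for matching pairs
```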
Proceedings Article

Learning Deep Features for Scene Recognition using Places Database

TL;DR: A new scene-centric database called Places, containing over 7 million labeled pictures of scenes, is introduced along with new methods for comparing the density and diversity of image datasets; Places is shown to be as dense as other scene datasets while offering more diversity.
Posted Content

Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour

TL;DR: This paper empirically shows that on the ImageNet dataset large minibatches cause optimization difficulties, but that when these are addressed the trained networks exhibit good generalization, enabling visual recognition models to be trained on internet-scale data with high efficiency.
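The paper's main remedies are the linear learning-rate scaling rule and a gradual warmup phase, sketched below; the function name and defaults are illustrative.

```python
def scaled_lr(base_lr, batch_size, base_batch=256, step=0, warmup_steps=0):
    """Sketch of the linear scaling rule with gradual warmup: scale the
    learning rate proportionally to the minibatch size, and ramp it up
    linearly over the first few epochs of training."""
    target = base_lr * batch_size / base_batch   # linear scaling rule
    if step < warmup_steps:                      # gradual warmup
        return target * (step + 1) / warmup_steps
    return target
```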
Proceedings Article

Sinkhorn Distances: Lightspeed Computation of Optimal Transport

TL;DR: This work smooths the classic optimal transport problem with an entropic regularization term, and shows that the resulting optimum is also a distance which can be computed through Sinkhorn's matrix scaling algorithm at a speed that is several orders of magnitude faster than that of transport solvers.
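A compact NumPy sketch of Sinkhorn's matrix scaling iterations for the entropically regularized problem (no numerical-stability safeguards, which a practical implementation would add):

```python
import numpy as np

def sinkhorn(a, b, C, reg=0.1, n_iters=100):
    """Entropically regularized optimal transport between marginals
    a (n,) and b (m,) under cost matrix C (n, m); returns the plan."""
    K = np.exp(-C / reg)                 # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):             # alternate marginal corrections
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]   # plan = diag(u) K diag(v)
```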
Proceedings ArticleDOI

Unsupervised Feature Learning via Non-parametric Instance Discrimination

TL;DR: This work formulates the instance-level discrimination intuition as a non-parametric classification problem and uses noise-contrastive estimation to tackle the computational challenges imposed by the large number of instance classes.
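A simplified sketch of the non-parametric instance classifier: each image is its own class, and the classifier's "weights" are the stored embeddings themselves. The full method replaces the exhaustive softmax with noise-contrastive estimation and maintains the memory bank with momentum updates, both omitted here.

```python
import torch
import torch.nn.functional as F

def instance_discrimination_loss(features, indices, memory_bank, temperature=0.07):
    """features: (N, D) current embeddings, indices: (N,) instance ids,
    memory_bank: (num_instances, D) stored L2-normalized embeddings.
    A plain softmax over all instances stands in for NCE for clarity."""
    features = F.normalize(features, dim=1)
    logits = features @ memory_bank.T / temperature   # (N, num_instances)
    return F.cross_entropy(logits, indices)
```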