Ishan Misra

Researcher at Facebook

Publications -  90
Citations -  10759

Ishan Misra is an academic researcher at Facebook. The author has contributed to research in the topics of computer science and object detection. The author has an h-index of 29 and has co-authored 66 publications receiving 5410 citations. Previous affiliations of Ishan Misra include University of Massachusetts Amherst and Carnegie Mellon University.

Papers
Posted Content

Unsupervised Learning of Visual Features by Contrasting Cluster Assignments

TL;DR: This paper proposes an online algorithm, SwAV, that takes advantage of contrastive methods without requiring pairwise comparisons to be computed. It uses a swapped prediction mechanism that predicts the cluster assignment of one view from the representation of another view.
Proceedings Article

Self-Supervised Learning of Pretext-Invariant Representations

TL;DR: This work develops Pretext-Invariant Representation Learning (PIRL), which learns invariant representations based on pretext tasks. PIRL substantially improves the semantic quality of the learned image representations and sets a new state of the art in self-supervised learning from images.
Proceedings Article

Cross-Stitch Networks for Multi-task Learning

TL;DR: This paper proposes a cross-stitch unit that combines the activations from multiple networks and can be trained end-to-end to learn an optimal combination of shared and task-specific representations.
Book Chapter

Shuffle and Learn: Unsupervised Learning Using Temporal Order Verification

TL;DR: This paper presents an approach for learning a visual representation from the raw spatiotemporal signals in videos using a Convolutional Neural Network, and shows that this method captures information that is temporally varying, such as human pose.
Posted Content

Emerging Properties in Self-Supervised Vision Transformers

TL;DR: This paper examines whether self-supervised learning provides new properties to Vision Transformers (ViT) that stand out compared to convolutional networks (convnets). Beyond the fact that adapting self-supervised methods to this architecture works particularly well, the authors make several observations; first, self-supervised ViT features contain explicit information about the semantic segmentation of an image, which does not emerge as clearly with supervised ViTs, nor with convnets.