scispace - formally typeset

Carl Vondrick

Researcher at Columbia University

Publications: 119
Citations: 12,134

Carl Vondrick is an academic researcher at Columbia University. He has contributed to research in topics including computer science and object detection. He has an h-index of 37 and has co-authored 92 publications receiving 9,054 citations. Previous affiliations of Carl Vondrick include the University of California, Irvine and Google.

Papers
Posted Content

Generative Interventions for Causal Learning.

TL;DR: The authors introduce a framework for learning robust visual representations that generalize to new viewpoints, backgrounds, and scene contexts, and demonstrate state-of-the-art performance when generalizing from ImageNet to the ObjectNet dataset.
Posted Content

Learning to Learn Words from Narrated Video.

TL;DR: A framework is proposed that learns how to learn text representations from visual context; it significantly outperforms the state of the art in visual language modeling for acquiring new words and predicting new compositions.
Posted Content

We Have So Much In Common: Modeling Semantic Relational Set Abstractions in Videos.

TL;DR: This work combines visual features with natural language supervision to generate high-level representations of similarities across a set of videos, which allows the model to perform cognitive tasks such as set abstraction, set completion, and odd-one-out detection.
Proceedings ArticleDOI

Learning Goals from Failure

TL;DR: In this paper, a framework that predicts the goals behind observable human action in video is introduced. Although the model is trained with minimal supervision, it is able to predict the underlying goals in videos of unintentional action.
Proceedings ArticleDOI

Revealing Occlusions with 4D Neural Fields

TL;DR: A framework for learning to estimate 4D visual representations from monocular RGB-D video is introduced; it is able to persist objects even once they become obstructed by occlusions. The model encodes point clouds into a continuous representation, which permits it to attend across the spatiotemporal context to resolve occlusions.