Xudong Lin

Researcher at Columbia University

Publications -  37
Citations -  779

Xudong Lin is an academic researcher at Columbia University. The author has contributed to research on the topics Computer science and Event (computing). The author has an h-index of 7 and has co-authored 20 publications receiving 364 citations. Previous affiliations of Xudong Lin include Tsinghua University and Facebook.

Papers
Proceedings Article

Deep Adversarial Metric Learning

TL;DR: This paper proposes a deep adversarial metric learning (DAML) framework that generates synthetic hard negatives from the observed negative samples and is widely applicable to supervised deep metric learning methods.
Book Chapter

Deep Variational Metric Learning

TL;DR: This paper proposes a deep variational metric learning (DVML) framework that explicitly models the intra-class variance and disentangles the intra-class invariance, namely the class centers, and that can simultaneously generate discriminative samples to improve robustness.
Proceedings Article

DMC-Net: Generating Discriminative Motion Cues for Fast Compressed Video Action Recognition

TL;DR: In this paper, a lightweight generator network is proposed to reduce the noise in motion vectors and capture fine motion details, achieving a more discriminative motion cue (DMC) representation.
Journal Article

All in One: Exploring Unified Video-Language Pre-training

TL;DR: This work introduces an end-to-end video-language model, the all-in-one Transformer, that embeds raw video and textual signals into joint representations using a unified backbone architecture. It also introduces a token rolling operation that encodes temporal representations from video clips in a non-parametric manner.
Proceedings Article

CLIP-Event: Connecting Text and Images with Event Structures

TL;DR: This paper proposes a contrastive learning framework that pushes vision-language pretraining models to comprehend events and their associated argument (participant) roles. The framework uses text information extraction technologies to obtain event structural knowledge, and uses multiple prompt functions to contrast difficult negative descriptions generated by manipulating event structures.