SciSpace (formerly Typeset)

Yue Cao

Researcher at Microsoft

Publications: 75
Citations: 15,599

Yue Cao is an academic researcher at Microsoft. The author has contributed to research topics including computer science and hash functions, has an h-index of 32, and has co-authored 61 publications receiving 7,782 citations. Previous affiliations of Yue Cao include Tsinghua University and the University of Science and Technology of China.

Papers
Posted Content

Swin Transformer: Hierarchical Vision Transformer using Shifted Windows

TL;DR: This paper proposes a new vision Transformer, called Swin Transformer, whose representation is computed with shifted windows. The design addresses the differences between the language and vision domains, such as large variations in the scale of visual entities and the high resolution of pixels in images compared to words in text.
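The shifted-window idea can be sketched briefly: attention is computed within non-overlapping local windows, and alternating layers cyclically shift the feature map so that successive layers mix information across window borders. A minimal NumPy sketch of the partitioning step only (not the paper's implementation; the window size and shapes are illustrative):

```python
import numpy as np

def window_partition(x, ws):
    """Split an (H, W, C) feature map into non-overlapping ws x ws windows."""
    H, W, C = x.shape
    x = x.reshape(H // ws, ws, W // ws, ws, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, ws, ws, C)

def shifted_windows(x, ws):
    """Cyclically shift the map by ws//2 before partitioning (the paper's
    efficient batch computation), so window boundaries move between layers."""
    shifted = np.roll(x, shift=(-(ws // 2), -(ws // 2)), axis=(0, 1))
    return window_partition(shifted, ws)

# Toy 8x8 single-channel feature map.
x = np.arange(8 * 8).reshape(8, 8, 1)
regular = window_partition(x, 4)   # 4 windows of 4x4
shifted = shifted_windows(x, 4)    # same count, but offset by 2 pixels
```

Within each window, self-attention would then be applied independently, which is what makes the cost linear in image size rather than quadratic.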
Posted Content

Learning Transferable Features with Deep Adaptation Networks

TL;DR: A new Deep Adaptation Network (DAN) architecture is proposed, which generalizes deep convolutional neural networks to the domain adaptation scenario. DAN can learn transferable features with statistical guarantees and scales linearly via an unbiased estimate of the kernel embedding.
Proceedings Article

Learning Transferable Features with Deep Adaptation Networks

TL;DR: Deep Adaptation Network (DAN) embeds the hidden representations of all task-specific layers in a reproducing kernel Hilbert space, where the mean embeddings of different domain distributions can be explicitly matched.
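Matching mean embeddings in an RKHS amounts to minimizing the maximum mean discrepancy (MMD) between source and target features. A minimal NumPy sketch of a multi-kernel squared-MMD estimate (the biased V-statistic form; the Gaussian bandwidths are illustrative, not DAN's actual kernel family):

```python
import numpy as np

def gaussian_kernel(a, b, sigma):
    """Pairwise Gaussian kernel matrix between rows of a and b."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def mmd2(source, target, sigmas=(1.0, 2.0, 4.0)):
    """Biased estimate of squared MMD, summed over a small kernel family
    (the multi-kernel idea): E[k(s,s)] + E[k(t,t)] - 2 E[k(s,t)]."""
    total = 0.0
    for s in sigmas:
        kss = gaussian_kernel(source, source, s).mean()
        ktt = gaussian_kernel(target, target, s).mean()
        kst = gaussian_kernel(source, target, s).mean()
        total += kss + ktt - 2.0 * kst
    return total

rng = np.random.default_rng(0)
src = rng.normal(0.0, 1.0, (20, 3))   # "source domain" features
tgt = rng.normal(2.0, 1.0, (20, 3))   # "target domain" features, shifted mean
```

In DAN this quantity is added to the task loss for the adapted layers, so training pulls the two feature distributions together.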
Proceedings Article

GCNet: Non-Local Networks Meet Squeeze-Excitation Networks and Beyond

TL;DR: A simplified network based on a query-independent formulation maintains the accuracy of NLNet with significantly less computation. This simplified design shares a similar structure with the Squeeze-and-Excitation Network (SENet), and the resulting model generally outperforms both the simplified NLNet and SENet on major benchmarks for various recognition tasks.
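"Query-independent" means one attention map is shared by all query positions, so the non-local block collapses to pooling the whole feature map into a single context vector that is transformed and added back at every position (the SENet-like fusion). A hedged NumPy sketch; the bottleneck transform is simplified to a tanh MLP here, whereas the paper's block uses LayerNorm and ReLU:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def global_context_block(x, wk, w1, w2):
    """Sketch of a query-independent context block.
    x: (C, N) features over N = H*W flattened positions.
    wk: (1, C) produces one global attention map shared by all queries.
    w1, w2: bottleneck transform weights (illustrative shapes)."""
    attn = softmax(wk @ x)                 # (1, N): single attention map
    context = x @ attn.ravel()             # (C,): attention-pooled context
    delta = w2 @ np.tanh(w1 @ context)     # (C,): transformed context
    return x + delta[:, None]              # broadcast add to every position

rng = np.random.default_rng(1)
C, N, r = 8, 16, 2
x = rng.normal(size=(C, N))
out = global_context_block(
    x,
    rng.normal(size=(1, C)),
    rng.normal(size=(r, C)),
    rng.normal(size=(C, r)),
)
```

Because the context vector is computed once per feature map rather than once per query, the quadratic cost of the non-local block drops to linear in the number of positions.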
Posted Content

VL-BERT: Pre-training of Generic Visual-Linguistic Representations

TL;DR: A new pre-trainable generic representation for visual-linguistic tasks, called Visual-Linguistic BERT (VL-BERT), is introduced. It adopts the simple yet powerful Transformer model as the backbone and extends it to take both visual and linguistic embedded features as input.
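Taking both modalities as input can be sketched as projecting visual region features into the word-embedding space and concatenating the two sequences, so a single Transformer attends over words and image regions jointly. All names and shapes below are illustrative, not VL-BERT's actual API:

```python
import numpy as np

def vl_input(token_emb, region_feats, wv):
    """Build a joint input sequence: project visual region features to the
    token embedding size, then concatenate with the word embeddings.
    token_emb: (num_tokens, d_model), region_feats: (num_regions, d_visual),
    wv: (d_visual, d_model) projection -- all shapes illustrative."""
    visual = region_feats @ wv                       # (num_regions, d_model)
    return np.concatenate([token_emb, visual], axis=0)

rng = np.random.default_rng(2)
token_emb = rng.normal(size=(5, 16))      # e.g. 5 word embeddings
region_feats = rng.normal(size=(3, 32))   # e.g. 3 detected-region features
seq = vl_input(token_emb, region_feats, rng.normal(size=(32, 16)))
```

The resulting sequence would then be fed to a standard Transformer encoder, which is what lets the same backbone serve diverse visual-linguistic tasks.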