Yangguang Li
Publications - 23
Citations - 86
Yangguang Li is an academic researcher. The author has contributed to research in topics: Computer science & Engineering. The author has an h-index of 1, having co-authored 2 publications receiving 1 citation.
Papers
Journal ArticleDOI
Democratizing Contrastive Language-Image Pre-training: A CLIP Benchmark of Data, Model, and Supervision
TL;DR: This work proposes CLIP-benchmark, a first attempt to evaluate, analyze, and benchmark CLIP and its variants, and conducts a comprehensive analysis of three key factors: data, supervision, and model architecture.
Journal Article
SNCSE: Contrastive Learning for Unsupervised Sentence Embedding with Soft Negative Samples
TL;DR: This work takes the negation of original sentences as soft negative samples, and proposes a Bidirectional Margin Loss (BML) to introduce them into the traditional contrastive learning framework, which otherwise involves only positive and negative samples.
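The idea of a bidirectional margin loss can be illustrated with a minimal sketch: the gap between the positive-pair similarity and the soft-negative similarity is constrained to lie inside a margin interval. The function names, margin values, and exact formulation below are illustrative assumptions, not the paper's precise loss.

```python
import numpy as np

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def bidirectional_margin_loss(s_pos, s_soft_neg, alpha=0.1, beta=0.3):
    """Sketch of a bidirectional margin loss: penalize the model unless the
    soft-negative similarity sits below the positive-pair similarity by a
    gap within [alpha, beta]. Margin values are hypothetical."""
    delta = s_soft_neg - s_pos
    return max(0.0, delta + alpha) + max(0.0, -delta - beta)
```

With `s_pos = 0.9` and `s_soft_neg = 0.7`, the gap of 0.2 falls inside the margin band and the loss is zero; pushing the soft negative too close to (or too far from) the positive produces a positive penalty.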
Proceedings ArticleDOI
RePre: Improving Self-Supervised Vision Transformer with Reconstructive Pre-training
TL;DR: This paper incorporates local feature learning into self-supervised vision transformers via Reconstructive Pre-training (RePre), which extends contrastive frameworks by adding a branch for reconstructing raw image pixels in parallel with the existing contrastive objective.
Journal ArticleDOI
SupMAE: Supervised Masked Autoencoders Are Efficient Vision Learners
TL;DR: MAE is extended to a fully-supervised setting by adding a supervised classification branch, thereby enabling MAE to effectively learn global features from golden labels; its robustness on ImageNet variants and its transfer learning performance outperform those of MAE and standard supervised pre-training counterparts.
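The combined objective described above can be sketched as a masked-pixel reconstruction term plus a supervised classification term. The function name, loss weighting, and input shapes below are illustrative assumptions rather than the paper's exact formulation.

```python
import numpy as np

def supervised_mae_loss(pred_pixels, target_pixels, logits, label, weight=1.0):
    """Sketch of a SupMAE-style objective: MSE reconstruction on masked
    patches plus cross-entropy on a classification branch, combined with
    a hypothetical weighting term."""
    # Reconstruction term: mean squared error over masked pixels.
    recon = float(np.mean((pred_pixels - target_pixels) ** 2))
    # Classification term: cross-entropy from raw logits via log-softmax.
    z = logits - logits.max()
    log_probs = z - np.log(np.exp(z).sum())
    ce = float(-log_probs[label])
    return recon + weight * ce
```

Setting `weight=0.0` recovers a plain MAE-style reconstruction loss, which is one way to see the supervised branch as an additive extension.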
Journal ArticleDOI
Fast-BEV: Towards Real-time On-vehicle Bird's-Eye View Perception
Bin Huang,Yangguang Li,Enze Xie,Feng Liang,Luyang Wang,Ming Shen,Feng G. Liu,Tianqi Wang,Ping Luo,Jing Shao +9 more
TL;DR: Fast-BEV proposes a data augmentation strategy for both image and BEV space to avoid overfitting, and a multi-frame feature fusion mechanism to leverage temporal information.