Shi Pu
Researcher at Beijing University of Posts and Telecommunications
Publications - 12
Citations - 278
Shi Pu is an academic researcher from Beijing University of Posts and Telecommunications. The author has contributed to research on topics including computer science and eye tracking, has an h-index of 2, and has co-authored 7 publications receiving 197 citations. Previous affiliations of Shi Pu include Tencent.
Papers
Posted Content
Deep Attentive Tracking via Reciprocative Learning
TL;DR: Proposes a reciprocative learning algorithm that exploits visual attention for training deep classifiers: feed-forward and backward operations generate attention maps, which then serve as regularization terms coupled with the original classification loss.
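The attention-as-regularizer idea can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the attention map comes from a backward pass of the classifier score with respect to the input, and a regularizer built from that map is added to the classification loss. A linear scorer keeps the backward pass analytic, since for `s = w @ x` the gradient of `s` with respect to `x` is simply `w`; the region split and the regularizer form are assumptions chosen for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 16                            # flattened patch size (illustrative)
TARGET = slice(0, DIM // 2)         # stand-in for the target region
BACKGROUND = slice(DIM // 2, DIM)   # everything outside it

def attention_map(w):
    # "backward operation": |d s / d x| = |w| for the linear scorer
    return np.abs(w)

def attention_reg(w):
    # reward attention concentrated on the target region; high background
    # attention relative to target attention raises the penalty
    att = attention_map(w)
    return att[BACKGROUND].mean() - att[TARGET].mean()

def total_loss(w, x, y, lam=0.5):
    # logistic classification loss coupled with the attention regularizer
    s = w @ x
    return np.log1p(np.exp(-y * s)) + lam * attention_reg(w)

# weights focused on the target region incur a lower regularizer than
# weights focused on the background
w_focused = np.concatenate([np.ones(DIM // 2), 0.1 * np.ones(DIM // 2)])
w_diffuse = w_focused[::-1].copy()
x = rng.normal(size=DIM)
assert attention_reg(w_focused) < attention_reg(w_diffuse)
```

In the full method the backward pass would run through a deep network via backpropagation rather than a closed-form linear gradient; the coupling of classification loss and attention regularizer is the part this sketch preserves.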
Proceedings ArticleDOI
End-to-End Modeling via Information Tree for One-Shot Natural Language Spatial Video Grounding
Meng Li, Tianbao Wang, Haoyu Zhang, Shengyu Zhang, Zhou Zhao, Jiaxu Miao, Wenqiao Zhang, Wenming Tan, Jin Wang, Peng Wang, Shi Pu, Fei Wu +11 more
TL;DR: Proposes an end-to-end model via an Information Tree for One-Shot video grounding (IT-OS) that eliminates the interference of irrelevant frames through branch search and branch cropping, and introduces several self-supervised tasks based on the information tree to improve representation learning under insufficient labeling.
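The branch-search and branch-cropping step can be illustrated with a toy sketch. This is a hedged reconstruction, not IT-OS itself: frames are organized as a recursively split span (a binary "information tree"), each branch is scored against the query, and branches whose relevance falls below a threshold are cropped so irrelevant frames never reach later reasoning; the relevance function and threshold here are assumptions.

```python
import numpy as np

def relevance(frames, query):
    # toy relevance: mean cosine similarity between frames and the query
    sims = frames @ query / (np.linalg.norm(frames, axis=1)
                             * np.linalg.norm(query) + 1e-9)
    return sims.mean()

def search_and_crop(frames, query, thresh, min_len=2):
    # branch search: recursively split the frame span; keep a branch only
    # if it is sufficiently relevant, otherwise descend or crop it
    if relevance(frames, query) >= thresh:
        return [frames]
    if len(frames) <= min_len:
        return []                      # branch cropping: drop it entirely
    mid = len(frames) // 2
    return (search_and_crop(frames[:mid], query, thresh, min_len)
            + search_and_crop(frames[mid:], query, thresh, min_len))

query = np.array([1.0, 0.0])
relevant = np.tile([1.0, 0.1], (4, 1))   # frames aligned with the query
noise = np.tile([-1.0, 0.3], (4, 1))     # irrelevant frames
video = np.vstack([relevant, noise])
kept = search_and_crop(video, query, thresh=0.5)
assert all(relevance(seg, query) >= 0.5 for seg in kept)
```

Only the four query-aligned frames survive here; in the paper the tree nodes would hold learned multimodal features rather than raw vectors.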
Proceedings ArticleDOI
HERO: HiErarchical spatio-tempoRal reasOning with Contrastive Action Correspondence for End-to-End Video Object Grounding
Meng Li, Tianbao Wang, Haoyu Zhang, Shengyu Zhang, Zhou Zhao, Wenqiao Zhang, Jiaxu Miao, Shi Pu, Fei Wu +8 more
TL;DR: Introduces weakly-supervised contrastive learning that classifies video frames as action-consistent or action-independent based on the video-caption action semantic correspondence, and designs hierarchical reasoning layers that decouple fully connected multi-head attention and remove redundant interfering correlations.
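The weakly-supervised contrastive step can be sketched in miniature. This is an illustrative reconstruction, not the HERO implementation: frames are split into action-consistent and action-independent sets by cosine similarity to the caption's action embedding (no frame-level labels), and an InfoNCE-style loss treats the consistent frames as positives; the embeddings, threshold, and temperature are assumptions.

```python
import numpy as np

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def split_frames(frames, caption, thresh=0.0):
    # weak supervision: no frame labels, only similarity to the caption
    sims = np.array([cosine(f, caption) for f in frames])
    return sims >= thresh            # True = action-consistent

def contrastive_loss(frames, caption, consistent, tau=0.5):
    # InfoNCE-style: consistent frames are positives, the rest negatives
    sims = np.array([cosine(f, caption) for f in frames]) / tau
    logits = np.exp(sims)
    return -np.log(logits[consistent].sum() / logits.sum())

rng = np.random.default_rng(1)
caption = np.array([1.0, 0.0, 0.0])          # toy action embedding
frames = np.stack([caption + 0.05 * rng.normal(size=3) for _ in range(3)]
                  + [np.array([-1.0, 0.2, 0.0]) for _ in range(3)])
mask = split_frames(frames, caption)
assert mask[:3].all() and not mask[3:].any()
```

The loss is small when the true action-consistent frames are taken as positives and large when the roles are swapped, which is what drives the representation toward the caption's action semantics.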
Journal ArticleDOI
Learning Recurrent Memory Activation Networks for Visual Tracking
TL;DR: Proposes a recurrent memory activation network (RMAN) that exploits the untapped temporal coherence of the target appearance for visual tracking, built on top of a long short-term memory (LSTM) network with an additional memory activation layer.
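The LSTM-plus-memory-activation structure can be sketched as follows. This is an assumed structure for illustration, not the paper's code: a standard LSTM cell is followed by an extra learned gate that rescales the stored memory before the output read-out, standing in for selective activation of the target's temporal appearance cues.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class MemoryActivationCell:
    def __init__(self, in_dim, hid_dim, seed=0):
        rng = np.random.default_rng(seed)
        # one stacked weight matrix for the four standard LSTM gates
        self.W = rng.normal(scale=0.1, size=(4 * hid_dim, in_dim + hid_dim))
        self.b = np.zeros(4 * hid_dim)
        # extra memory-activation layer on top of the LSTM memory
        self.Wa = rng.normal(scale=0.1, size=(hid_dim, hid_dim))

    def step(self, x, h, c):
        z = self.W @ np.concatenate([x, h]) + self.b
        i, f, g, o = np.split(z, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)   # standard LSTM update
        # memory activation: gate the cell state before the output read-out
        a = sigmoid(self.Wa @ c)
        h = sigmoid(o) * np.tanh(a * c)
        return h, c

cell = MemoryActivationCell(in_dim=8, hid_dim=4)
h, c = np.zeros(4), np.zeros(4)
for x in np.random.default_rng(1).normal(size=(5, 8)):  # 5-frame sequence
    h, c = cell.step(x, h, c)
assert h.shape == (4,) and c.shape == (4,)
```

In a tracker, `x` would be a per-frame appearance feature and `h` would feed the classification head; the gate `a` is where this sketch departs from a vanilla LSTM.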