
Xiangpeng Li

Researcher at University of Electronic Science and Technology of China

Publications: 17
Citations: 942

Xiangpeng Li is an academic researcher at the University of Electronic Science and Technology of China. The author has contributed to research in topics including Computer science and Question answering, has an h-index of 8, and has co-authored 13 publications that have received 612 citations.

Papers
Journal Article

Beyond RNNs: Positional Self-Attention with Co-Attention for Video Question Answering

TL;DR: This work proposes a new architecture, Positional Self-Attention with Co-Attention (PSAC), which does not require RNNs for video question answering; it significantly outperforms the state of the art on three tasks and attains comparable results on the Count task.
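To make the core idea concrete, here is a minimal sketch of positional self-attention over frame features, assuming PyTorch; the module name, dimensions, and single-block structure are illustrative and not the paper's exact PSAC architecture.

```python
# Minimal sketch: self-attention over video frames with learned positional
# embeddings standing in for RNN ordering. Sizes are hypothetical.
import torch
import torch.nn as nn

class PositionalSelfAttention(nn.Module):
    def __init__(self, dim=512, num_frames=16, num_heads=8):
        super().__init__()
        self.pos = nn.Parameter(torch.zeros(1, num_frames, dim))  # position info
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, frames):                # frames: (B, T, dim)
        x = frames + self.pos                 # inject absolute frame position
        out, _ = self.attn(x, x, x)           # every frame attends to all frames
        return self.norm(frames + out)        # residual connection + norm

feats = torch.randn(2, 16, 512)               # toy batch of frame features
print(PositionalSelfAttention()(feats).shape)  # torch.Size([2, 16, 512])
```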
Journal Article

Self-Supervised Video Hashing with Hierarchical Binary Auto-encoder

TL;DR: This paper proposes a novel unsupervised video hashing framework, dubbed SSVH, which captures the temporal nature of videos in an end-to-end learning-to-hash fashion, and designs a hierarchical binary auto-encoder to model the temporal dependencies in videos at multiple granularities.
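As a rough illustration of the learning-to-hash idea, the sketch below binarizes an RNN encoder state with a straight-through sign function and reconstructs frame features from the code. It is a single-granularity simplification, assuming PyTorch, and omits the paper's hierarchical structure.

```python
# Minimal sketch: binary auto-encoder for video hashing. Layer sizes are
# illustrative, not the SSVH architecture.
import torch
import torch.nn as nn

class STESign(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        return torch.sign(x)                  # {-1, +1} binary code
    @staticmethod
    def backward(ctx, g):
        return g                              # straight-through gradient

class BinaryVideoAE(nn.Module):
    def __init__(self, feat_dim=2048, code_bits=256):
        super().__init__()
        self.enc = nn.LSTM(feat_dim, code_bits, batch_first=True)
        self.dec = nn.LSTM(code_bits, feat_dim, batch_first=True)

    def forward(self, frames):                # frames: (B, T, feat_dim)
        h, _ = self.enc(frames)
        code = STESign.apply(h[:, -1])        # hash the final encoder state
        # Feed the code at every step and reconstruct the frame features.
        rec, _ = self.dec(code.unsqueeze(1).expand(-1, frames.size(1), -1))
        return code, rec

code, rec = BinaryVideoAE()(torch.randn(2, 16, 2048))
print(code.shape, rec.shape)                  # (2, 256) (2, 16, 2048)
```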
Journal Article

Hierarchical LSTMs with Adaptive Attention for Visual Captioning

TL;DR: A hierarchical LSTM with adaptive attention (hLSTMat) approach for image and video captioning that uses spatial or temporal attention to select specific regions or frames when predicting related words, while adaptive attention decides whether to rely on the visual information or on the language context.
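A minimal sketch of the adaptive-attention idea, assuming PyTorch: a learned gate mixes an attended visual context with the language hidden state, so the model can lean on either source per word. Names and sizes are hypothetical, not the hLSTMat implementation.

```python
# Minimal sketch: adaptive attention gate between visual context and
# language state. Dimensions are illustrative.
import torch
import torch.nn as nn

class AdaptiveAttention(nn.Module):
    def __init__(self, dim=512):
        super().__init__()
        self.score = nn.Linear(dim, 1)        # attention over regions/frames
        self.gate = nn.Linear(dim, 1)         # visual-vs-language gate

    def forward(self, regions, hidden):       # regions: (B, N, dim); hidden: (B, dim)
        att = torch.softmax(self.score(regions + hidden.unsqueeze(1)), dim=1)
        visual = (att * regions).sum(dim=1)   # attended visual context
        beta = torch.sigmoid(self.gate(hidden))  # 1 -> language, 0 -> visual
        return beta * hidden + (1 - beta) * visual

ctx = AdaptiveAttention()(torch.randn(2, 16, 512), torch.randn(2, 512))
print(ctx.shape)                              # torch.Size([2, 512])
```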
Journal Article

Self-Supervised Video Hashing With Hierarchical Binary Auto-Encoder

TL;DR: Self-Supervised Video Hashing (SSVH), as discussed by the authors, proposes a hierarchical binary auto-encoder that models the temporal dependencies in videos at multiple granularities and embeds the videos into binary codes with less computation than a stacked architecture.
Proceedings Article

Learnable Aggregating Net with Diversity Learning for Video Question Answering

TL;DR: A novel architecture, Learnable Aggregating Net with Diversity learning (LAD-Net), for video question answering (V-VQA), which automatically aggregates adaptively weighted frame-level features to extract rich video (or question) contextual semantics by imitating Bag-of-Words (BoW) quantization.
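The BoW-style aggregation can be sketched as a soft assignment of frame features to learned "words" followed by weighted pooling, a NetVLAD-like simplification assuming PyTorch; the cluster count and dimensions are illustrative, not the LAD-Net design.

```python
# Minimal sketch: learnable BoW-style aggregation. Frames are softly
# assigned to learned "words" and pooled by assignment weight.
import torch
import torch.nn as nn

class SoftBoWAggregator(nn.Module):
    def __init__(self, dim=512, num_words=32):
        super().__init__()
        self.words = nn.Parameter(torch.randn(num_words, dim))  # codebook

    def forward(self, frames):                # frames: (B, T, dim)
        # Soft assignment of each frame to each learned word.
        assign = torch.softmax(frames @ self.words.t(), dim=-1)  # (B, T, K)
        # Weighted pooling: one aggregated descriptor per word.
        pooled = assign.transpose(1, 2) @ frames                 # (B, K, dim)
        return pooled.flatten(1)              # (B, K * dim) video descriptor

v = SoftBoWAggregator()(torch.randn(2, 16, 512))
print(v.shape)                                # torch.Size([2, 16384])
```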