Chenyu Gao
Researcher at Northwestern Polytechnical University
Publications - 9
Citations - 197
Chenyu Gao is an academic researcher from Northwestern Polytechnical University. The author has contributed to research in topics: Optical character recognition & Closed captioning. The author has an h-index of 4, co-authored 8 publications receiving 78 citations.
Papers
Posted Content
C^3 Framework: An Open-source PyTorch Code for Crowd Counting
TL;DR: This technical report presents an efficient and solid open-source toolkit for crowd counting, the Crowd Counting Code Framework (C$^3$F), which achieves state-of-the-art performance.
Journal ArticleDOI
MobileCount: An efficient encoder-decoder framework for real-time crowd counting
TL;DR: The proposed network achieves comparable counting performance with 1/10 of the FLOPs on a number of benchmarks, and a multi-layer knowledge distillation method is proposed to further boost the performance of MobileCount without increasing its FLOPs.
Posted Content
Simple is not Easy: A Simple Strong Baseline for TextVQA and TextCaps
TL;DR: This paper argues that a simple attention mechanism can do the job equally well or even better, without any bells and whistles of multi-modality encoder design, and finds that this simple baseline model consistently outperforms state-of-the-art (SOTA) models on two popular benchmarks, TextVQA and all three tasks of ST-VQA.
Posted Content
Structured Multimodal Attentions for TextVQA
TL;DR: An end-to-end structured multimodal attention (SMA) neural network is proposed to mainly solve the first two issues above.
Journal ArticleDOI
Structured Multimodal Attentions for TextVQA
TL;DR: The authors proposed an end-to-end structured multimodal attention (SMA) neural network to solve the problems of poor text reading ability, lack of textual-visual reasoning capacity, and choosing a discriminative answering mechanism over its generative counterpart.