Chenyu Gao

Researcher at Northwestern Polytechnical University

Publications -  9
Citations -  197

Chenyu Gao is an academic researcher from Northwestern Polytechnical University. The author has contributed to research in topics: Optical character recognition & Closed captioning. The author has an h-index of 4 and has co-authored 8 publications receiving 78 citations.

Papers
Posted Content

C^3 Framework: An Open-source PyTorch Code for Crowd Counting

TL;DR: This technical report presents the Crowd Counting Code Framework (C$^3$F), an efficient and solid open-source toolkit for crowd counting that achieves state-of-the-art performance.
Journal Article

MobileCount: An efficient encoder-decoder framework for real-time crowd counting

TL;DR: The proposed network achieves comparable counting performance with 1/10 of the FLOPs on a number of benchmarks, and a multi-layer knowledge distillation method is proposed to further boost the performance of MobileCount without increasing its FLOPs.
Posted Content

Simple is not Easy: A Simple Strong Baseline for TextVQA and TextCaps

TL;DR: This paper argues that a simple attention mechanism can do the same or even better job without any of the bells and whistles of multi-modality encoder design, and finds that this simple baseline model consistently outperforms state-of-the-art (SOTA) models on two popular benchmarks, TextVQA and all three tasks of ST-VQA.
Posted Content

Structured Multimodal Attentions for TextVQA

TL;DR: An end-to-end structured multimodal attention (SMA) neural network is proposed, mainly to address two weaknesses of current TextVQA models: poor text reading ability and a lack of textual-visual reasoning capacity.
Journal Article

Structured Multimodal Attentions for TextVQA.

TL;DR: The authors proposed an end-to-end structured multimodal attention (SMA) neural network to address three problems: poor text reading ability, a lack of textual-visual reasoning capacity, and the choice of a discriminative answering mechanism over a generative counterpart.