Miaojing Shi

Researcher at King's College London

Publications - 67
Citations - 1649

Miaojing Shi is an academic researcher at King's College London whose work spans topics including computer science and image retrieval. The author has an h-index of 16 and has co-authored 45 publications receiving 1123 citations. Previous affiliations of Miaojing Shi include Tongji University and the French Institute for Research in Computer Science and Automation (Inria).

Papers
Proceedings ArticleDOI

Crowd Counting via Scale-Adaptive Convolutional Neural Network

TL;DR: Proposes SaCNN, a scale-adaptive CNN with a single backbone of fixed small receptive fields; feature maps extracted from multiple layers are adapted to the same output size and combined to produce the final density map.
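The multi-layer fusion idea can be sketched in a few lines of PyTorch. The backbone blocks, channel widths, and pooling choice below are illustrative assumptions, not the paper's exact architecture:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SaCNNSketch(nn.Module):
    """Minimal sketch of a scale-adaptive counting CNN: features taken from
    several backbone depths are resized to one common resolution and fused
    into a single density map. All sizes here are illustrative."""

    def __init__(self):
        super().__init__()
        # Backbone of small (3x3) receptive fields, VGG-like (assumption).
        self.block1 = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
                                    nn.MaxPool2d(2))
        self.block2 = nn.Sequential(nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),
                                    nn.MaxPool2d(2))
        self.block3 = nn.Sequential(nn.Conv2d(128, 256, 3, padding=1), nn.ReLU(),
                                    nn.MaxPool2d(2))
        # 1x1 conv fuses the concatenated multi-layer features into a density map.
        self.head = nn.Conv2d(64 + 128 + 256, 1, kernel_size=1)

    def forward(self, x):
        f1 = self.block1(x)    # stride-2 features
        f2 = self.block2(f1)   # stride-4 features
        f3 = self.block3(f2)   # stride-8 features
        size = f3.shape[-2:]   # adapt all maps to the same output size
        f1 = F.adaptive_avg_pool2d(f1, size)
        f2 = F.adaptive_avg_pool2d(f2, size)
        return self.head(torch.cat([f1, f2, f3], dim=1))

model = SaCNNSketch()
img = torch.randn(1, 3, 224, 224)
print(model(img).sum())  # summing the density map gives the count estimate
```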
Proceedings ArticleDOI

Revisiting Perspective Information for Efficient Crowd Counting

TL;DR: Proposes a perspective-aware convolutional neural network (PACNN) for efficient crowd counting, which integrates perspective information into density regression to provide additional knowledge of the person-scale change across an image.
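A hedged sketch of the core idea: density maps predicted at two scales are blended per pixel with a weight derived from an estimated perspective map. The two-branch layout and the sigmoid gating are simplifying assumptions, not the exact PACNN design:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PerspectiveAwareFusion(nn.Module):
    """Sketch: two density maps from different backbone depths are blended
    per-pixel with a weight derived from a regressed perspective map, so
    regions with small (distant) people can favor the finer branch."""

    def __init__(self, c_fine=128, c_coarse=256):
        super().__init__()
        self.density_fine = nn.Conv2d(c_fine, 1, 1)      # finer-scale branch
        self.density_coarse = nn.Conv2d(c_coarse, 1, 1)  # coarser-scale branch
        self.perspective = nn.Conv2d(c_coarse, 1, 1)     # perspective-map regressor

    def forward(self, feat_fine, feat_coarse):
        size = feat_coarse.shape[-2:]
        d1 = self.density_fine(F.adaptive_avg_pool2d(feat_fine, size))
        d2 = self.density_coarse(feat_coarse)
        w = torch.sigmoid(self.perspective(feat_coarse))  # per-pixel gate in [0, 1]
        return w * d1 + (1.0 - w) * d2                    # perspective-guided blend
```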
Proceedings ArticleDOI

Point in, Box Out: Beyond Counting Persons in Crowds

TL;DR: Proposes a network that simultaneously detects the size and location of human heads and counts them in crowds, trained with a curriculum learning strategy that starts from images whose pseudo ground truth is relatively accurate and easy.
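The curriculum strategy, easy samples with reliable pseudo ground truth first and harder ones later, can be sketched as a scheduler over a difficulty-sorted dataset. The difficulty scores and the linear admission schedule below are placeholders, not the paper's exact scheme:

```python
from torch.utils.data import DataLoader, Subset

def curriculum_loader(dataset, difficulty, epoch, total_epochs, batch_size=8):
    """Sketch of curriculum learning: train first on images whose pseudo
    ground truth is accurate/easy (low difficulty score), then gradually
    admit harder images as training progresses.

    difficulty: one score per sample, lower = easier (assumed precomputed,
    e.g. from pseudo-ground-truth confidence)."""
    order = sorted(range(len(dataset)), key=lambda i: difficulty[i])
    # Fraction of data admitted grows linearly from 30% to 100% (assumption).
    frac = 0.3 + 0.7 * min(1.0, epoch / max(1, total_epochs - 1))
    keep = order[: max(1, int(frac * len(order)))]
    return DataLoader(Subset(dataset, keep), batch_size=batch_size, shuffle=True)
```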
Journal ArticleDOI

Database Saliency for Fast Image Retrieval

TL;DR: Thorough experiments suggest that the proposed saliency-inspired fast image retrieval scheme, S-sim, significantly speeds up online retrieval and outperforms state-of-the-art BoW-based image retrieval schemes.
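A minimal sketch of saliency-weighted bag-of-words scoring, where local features falling in salient regions contribute more to the histogram; the weighting and cosine ranking below are illustrative, not the S-sim formulation from the paper:

```python
import numpy as np

def saliency_weighted_bow(word_ids, saliency, vocab_size):
    """Build a BoW histogram in which each local feature's vote is scaled
    by the saliency of the region it came from (illustrative weighting)."""
    hist = np.zeros(vocab_size)
    np.add.at(hist, word_ids, saliency)  # saliency-weighted term frequency
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist

def rank_database(query_hist, db_hists):
    """Rank database images by cosine similarity to the query; pre-filtering
    the database to images sharing salient words would speed this up."""
    scores = db_hists @ query_hist
    return np.argsort(-scores)

# Toy usage with random visual words and saliency scores.
rng = np.random.default_rng(0)
q = saliency_weighted_bow(rng.integers(0, 1000, 200), rng.random(200), 1000)
db = np.stack([saliency_weighted_bow(rng.integers(0, 1000, 200),
                                     rng.random(200), 1000) for _ in range(5)])
print(rank_database(q, db))
```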
Proceedings ArticleDOI

Learning to Prompt for Open-Vocabulary Object Detection with Vision-Language Model

TL;DR: Proposes detection prompt (DetPro), a novel method that learns continuous prompt representations for open-vocabulary object detection on top of a pre-trained vision-language model; it outperforms the baseline ViLD in all settings.
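The continuous-prompt idea can be sketched as a bank of learnable context vectors prepended to each class-name token embedding before a frozen text encoder. The toy encoder, dimensions, and initialization below are stand-ins for the real CLIP components, not the DetPro implementation:

```python
import torch
import torch.nn as nn

class PromptLearnerSketch(nn.Module):
    """Sketch of continuous prompt learning (CoOp/DetPro-style): shared
    learnable context vectors are prepended to each class-name token
    embedding; a frozen text encoder maps the sequence to a per-class
    embedding used to classify region proposals."""

    def __init__(self, class_name_embs, text_encoder, n_ctx=8, dim=512):
        super().__init__()
        # Learnable context shared across classes (random init, assumption).
        self.ctx = nn.Parameter(torch.randn(n_ctx, dim) * 0.02)
        self.class_name_embs = class_name_embs  # (n_classes, n_tokens, dim), frozen
        self.text_encoder = text_encoder        # frozen text encoder (stand-in)

    def forward(self):
        n_classes = self.class_name_embs.shape[0]
        ctx = self.ctx.unsqueeze(0).expand(n_classes, -1, -1)
        prompts = torch.cat([ctx, self.class_name_embs], dim=1)
        return self.text_encoder(prompts)       # (n_classes, dim) class embeddings

# Toy usage: mean-pooling stands in for a real transformer text encoder.
toy_encoder = lambda seq: seq.mean(dim=1)
names = torch.randn(3, 4, 512)                  # 3 classes, 4 name tokens each
learner = PromptLearnerSketch(names, toy_encoder)
print(learner().shape)                          # torch.Size([3, 512])
```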