Drone-Based Object Counting by Spatially Regularized Regional Proposal Network
Meng-Ru Hsieh, Yen-Liang Lin, Winston H. Hsu
- pp 4165-4173
TLDR
This work presents CARPK, a new large-scale car parking lot dataset that contains nearly 90,000 cars captured from different parking lots. The authors describe it as, to the best of their knowledge, the first and largest drone-view dataset that supports object counting and provides bounding box annotations.
Abstract:
Existing counting methods often adopt regression-based approaches and cannot precisely localize the target objects, which hinders further analysis (e.g., high-level understanding and fine-grained classification). In addition, most prior work focuses on counting objects in static environments with fixed cameras. Motivated by the advent of unmanned flying vehicles (i.e., drones), we are interested in detecting and counting objects in such dynamic environments. We propose Layout Proposal Networks (LPNs) and spatial kernels to simultaneously count and localize target objects (e.g., cars) in videos recorded by a drone. Unlike conventional region proposal methods, we leverage spatial layout information (e.g., cars often park regularly) and introduce these spatially regularized constraints into our network to improve localization accuracy. To evaluate our counting method, we present a new large-scale car parking lot dataset (CARPK) that contains nearly 90,000 cars captured from different parking lots. To the best of our knowledge, it is the first and the largest drone-view dataset that supports object counting and provides bounding box annotations.
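The abstract contrasts regression-based counting with approaches that count and localize at the same time. As a rough illustration of the second idea only, the sketch below counts objects from a set of scored bounding boxes by thresholding the scores and applying greedy non-maximum suppression. It is not the authors' LPN and does not model spatial layout; the function names `iou` and `count_by_detection`, the thresholds, and the sample boxes are all assumptions made for this example.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def count_by_detection(
    boxes: List[Box],
    scores: List[float],
    score_thresh: float = 0.5,  # assumed value, not from the paper
    iou_thresh: float = 0.5,    # assumed value, not from the paper
) -> Tuple[int, List[Box]]:
    """Return (count, kept boxes) after score filtering and greedy NMS."""
    # Keep only confident detections, highest score first.
    order = sorted(
        (i for i, s in enumerate(scores) if s >= score_thresh),
        key=lambda i: scores[i],
        reverse=True,
    )
    kept: List[int] = []
    for i in order:
        # Drop a box if it overlaps too much with an already kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_thresh for j in kept):
            kept.append(i)
    return len(kept), [boxes[i] for i in kept]


if __name__ == "__main__":
    # Made-up detections: two overlapping boxes, one separate box, one weak box.
    demo_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 0, 30, 10), (40, 0, 50, 10)]
    demo_scores = [0.9, 0.8, 0.7, 0.3]
    count, kept_boxes = count_by_detection(demo_boxes, demo_scores)
    print(count, kept_boxes)
```

Because every counted object comes with a box, the count can be traced back to image locations, which is the property the abstract says regression-based counters lack.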
Citations
Proceedings Article
Generating High-Quality Crowd Density Maps Using Contextual Pyramid CNNs
TL;DR: A novel method called Contextual Pyramid CNN (CP-CNN) is presented for generating high-quality crowd density maps and count estimates by explicitly incorporating global and local contextual information of crowd images.
Book Chapter
The Unmanned Aerial Vehicle Benchmark: Object Detection and Tracking
Dawei Du, Yuankai Qi, Hongyang Yu, Yifan Yang, Kaiwen Duan, Guorong Li, Weigang Zhang, Qingming Huang, Qi Tian
TL;DR: A new unconstrained UAV benchmark dataset is proposed for object detection, single object tracking, and multiple object tracking. It poses new challenges, including high density, small objects, and camera motion, and a detailed quantitative study is performed using recent state-of-the-art algorithms for each task.
Proceedings Article
Bayesian Loss for Crowd Count Estimation With Point Supervision
TL;DR: This work proposes Bayesian loss, a novel loss function which constructs a density contribution probability model from the point annotations, and outperforms previous best approaches by a large margin on the latest and largest UCF-QNRF dataset.
Proceedings Article
FCN-rLSTM: Deep Spatio-Temporal Neural Networks for Vehicle Counting in City Cameras
TL;DR: Zhang et al. design a novel FCN-rLSTM network that jointly estimates vehicle density and vehicle count by connecting fully convolutional neural networks (FCN) with LSTM in a residual learning fashion.
Proceedings Article
Multi-Level Bottom-Top and Top-Bottom Feature Fusion for Crowd Counting
TL;DR: This work proposes a network that uses a multi-level bottom-top and top-bottom fusion method to combine information from shallower to deeper layers and vice versa at multiple levels, along with a principled way of generating scale-aware ground-truth density maps for training.
References
Proceedings Article
Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan, Andrew Zisserman
TL;DR: This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
Journal Article
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael S. Bernstein, Alexander C. Berg, Li Fei-Fei
TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) is a benchmark in object category classification and detection on hundreds of object categories and millions of images. It has been run annually from 2010 to present, attracting participation from more than fifty institutions.
Proceedings Article
You Only Look Once: Unified, Real-Time Object Detection
TL;DR: Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background, and outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.
Proceedings Article
Fast R-CNN
TL;DR: This work proposes Fast R-CNN, a Fast Region-based Convolutional Network method for object detection that employs several innovations to improve training and testing speed while also increasing detection accuracy, achieving a higher mAP on PASCAL VOC 2012.
Proceedings Article
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
TL;DR: Ren et al. propose a region proposal network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals.