Drone-Based Object Counting by Spatially Regularized Regional Proposal Network

doi:10.1109/ICCV.2017.446

Open Access, Proceedings Article

- pp 4165-4173

TLDR

This work presents CARPK, a new large-scale car parking lot dataset containing nearly 90,000 cars captured from different parking lots; the authors describe it as the first and largest drone-view dataset that supports object counting and provides bounding box annotations.

Abstract:

Existing counting methods often adopt regression-based approaches and cannot precisely localize the target objects, which hinders further analysis (e.g., high-level understanding and fine-grained classification). In addition, most prior work focuses on counting objects in static environments with fixed cameras. Motivated by the advent of unmanned flying vehicles (i.e., drones), we are interested in detecting and counting objects in such dynamic environments. We propose Layout Proposal Networks (LPNs) and spatial kernels to simultaneously count and localize target objects (e.g., cars) in videos recorded by a drone. Unlike conventional region proposal methods, we leverage spatial layout information (e.g., cars often park regularly) and introduce these spatially regularized constraints into our network to improve localization accuracy. To evaluate our counting method, we present a new large-scale car parking lot dataset (CARPK) that contains nearly 90,000 cars captured from different parking lots. To the best of our knowledge, it is the first and the largest drone-view dataset that supports object counting and provides bounding box annotations.
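
The abstract contrasts regression-based counting, which returns only a number, with counting by localization, where each counted object has a bounding box. The sketch below illustrates the second idea in its most generic form: score thresholding followed by greedy non-maximum suppression, with the count taken as the number of surviving boxes. It is not the authors' Layout Proposal Network and omits their spatial kernels entirely; the function names, thresholds, and example boxes are all hypothetical, chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical pixel coordinates


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def count_by_detection(
    boxes: Sequence[Box],
    scores: Sequence[float],
    score_thresh: float = 0.5,  # assumed value, not from the paper
    iou_thresh: float = 0.5,    # assumed value, not from the paper
) -> List[Box]:
    """Return the boxes kept after thresholding and greedy NMS.

    The object count is simply len() of the returned list, so every
    counted object also comes with a location.
    """
    # Drop low-confidence proposals, then visit the rest from most to least confident.
    candidates = sorted(
        (pair for pair in zip(scores, boxes) if pair[0] >= score_thresh),
        key=lambda pair: pair[0],
        reverse=True,
    )
    kept: List[Box] = []
    for _, box in candidates:
        # Keep a box only if it does not heavily overlap an already kept one.
        if all(iou(box, k) <= iou_thresh for k in kept):
            kept.append(box)
    return kept


# Toy proposals: two near-duplicates of one car, one separate car, one weak proposal.
example_boxes = [(0, 0, 10, 20), (1, 0, 11, 20), (12, 0, 22, 20), (30, 0, 40, 20)]
example_scores = [0.9, 0.8, 0.85, 0.3]
kept_boxes = count_by_detection(example_boxes, example_scores)
print(len(kept_boxes))  # prints 2
```

A regression-based counter would return only the number 2 here, whereas this style of counting also returns where the two cars are, which is the property the abstract argues is needed for further analysis.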

Citations

Proceedings Article

Generating High-Quality Crowd Density Maps Using Contextual Pyramid CNNs

Vishwanath A. Sindagi, +1 more

TL;DR: Presents Contextual Pyramid CNN (CP-CNN), a method that generates high-quality crowd density maps and count estimates by explicitly incorporating global and local contextual information from crowd images.

Book Chapter

The Unmanned Aerial Vehicle Benchmark: Object Detection and Tracking

Dawei Du, +8 more

TL;DR: Proposes a new unconstrained UAV benchmark dataset for object detection, single object tracking, and multiple object tracking that poses new challenges, including high density, small objects, and camera motion, and reports a detailed quantitative study using recent state-of-the-art algorithms for each task.

Proceedings Article

Bayesian Loss for Crowd Count Estimation With Point Supervision

Zhiheng Ma, +3 more

TL;DR: This work proposes Bayesian loss, a novel loss function which constructs a density contribution probability model from the point annotations, and outperforms previous best approaches by a large margin on the latest and largest UCF-QNRF dataset.

Proceedings Article

FCN-rLSTM: Deep Spatio-Temporal Neural Networks for Vehicle Counting in City Cameras

Shanghang Zhang, +3 more

TL;DR: Zhang et al. design FCN-rLSTM, a network that jointly estimates vehicle density and vehicle count by connecting fully convolutional neural networks (FCN) with LSTM in a residual learning fashion.

Proceedings Article

Multi-Level Bottom-Top and Top-Bottom Feature Fusion for Crowd Counting

Vishwanath A. Sindagi, +1 more

TL;DR: Proposes a network that uses multi-level bottom-top and top-bottom fusion to combine information from shallower to deeper layers and vice versa at multiple levels, together with a principled way of generating scale-aware ground-truth density maps for training.

References

Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

Karen Simonyan, +1 more

TL;DR: This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.

Journal Article

ImageNet Large Scale Visual Recognition Challenge

Olga Russakovsky, +11 more

01 Dec 2015 - International Journal of Computer Vision

TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) is a benchmark in object category classification and detection on hundreds of object categories and millions of images, which has been run annually from 2010 to present, attracting participation from more than fifty institutions.

Proceedings Article

You Only Look Once: Unified, Real-Time Object Detection

Joseph Redmon, +3 more

TL;DR: Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background, and outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.

Proceedings Article

Fast R-CNN

Ross Girshick

TL;DR: Fast R-CNN is a Fast Region-based Convolutional Network method for object detection that employs several innovations to improve training and testing speed while also increasing detection accuracy, and achieves a higher mAP on PASCAL VOC 2012 than R-CNN.

Proceedings Article

Faster R-CNN: towards real-time object detection with region proposal networks

Shaoqing Ren, +3 more

TL;DR: Ren et al. propose a region proposal network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals.

Related Papers (5)

SSD: Single Shot MultiBox Detector

Microsoft COCO: Common Objects in Context

Deep Residual Learning for Image Recognition

Faster R-CNN: towards real-time object detection with region proposal networks

You Only Look Once: Unified, Real-Time Object Detection