CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes
Yuhong Li, Xiaofan Zhang, Deming Chen
pp. 1091-1100
TLDR
CSRNet is composed of two major components: a convolutional neural network (CNN) as the front-end for 2D feature extraction and a dilated CNN as the back-end, which uses dilated kernels to deliver larger receptive fields and to replace pooling operations.
Abstract
We propose a network for Congested Scene Recognition called CSRNet to provide a data-driven, deep learning method that can understand highly congested scenes, perform accurate count estimation, and produce high-quality density maps. The proposed CSRNet is composed of two major components: a convolutional neural network (CNN) as the front-end for 2D feature extraction and a dilated CNN as the back-end, which uses dilated kernels to deliver larger receptive fields and to replace pooling operations. CSRNet is easy to train because of its pure convolutional structure. We demonstrate CSRNet on four datasets (the ShanghaiTech dataset, the UCF_CC_50 dataset, the WorldEXPO'10 dataset, and the UCSD dataset) and deliver state-of-the-art performance. On the ShanghaiTech Part_B dataset, CSRNet achieves 47.3% lower Mean Absolute Error (MAE) than the previous state-of-the-art method. We extend the targeted applications to counting other objects, such as vehicles in the TRANCOS dataset. Results show that CSRNet significantly improves the output quality, with 15.4% lower MAE than the previous state-of-the-art approach.
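The back-end idea described above can be illustrated concretely: a dilated kernel inserts gaps between the taps of a small kernel, so a 3x3 kernel with dilation rate 2 covers a 5x5 area, enlarging the receptive field without pooling and without adding parameters. Below is a minimal pure-Python sketch of this mechanism, not the authors' implementation (which uses standard deep learning layers):

```python
def effective_kernel_size(k, rate):
    """Side length of the area a k x k kernel covers at the given dilation rate."""
    return k + (k - 1) * (rate - 1)

def dilated_conv2d(x, w, rate):
    """Naive 'valid' 2D cross-correlation with a dilated kernel.

    x: H x W image as a list of lists; w: k x k kernel as a list of lists.
    Illustrative only: single channel, no padding, no stride.
    """
    k = len(w)
    k_eff = effective_kernel_size(k, rate)
    H, W = len(x), len(x[0])
    out = []
    for i in range(H - k_eff + 1):
        row = []
        for j in range(W - k_eff + 1):
            s = 0.0
            for a in range(k):
                for b in range(k):
                    # Dilation inserts (rate - 1) gaps between kernel taps,
                    # so tap (a, b) reads the input at offset (a*rate, b*rate).
                    s += w[a][b] * x[i + a * rate][j + b * rate]
            row.append(s)
        out.append(row)
    return out
```

With rate 1 this reduces to an ordinary convolution; with rate 2 each 3x3 kernel already sees a 5x5 neighborhood, which is why stacking such layers in the back-end grows the receptive field quickly while keeping the output resolution, unlike pooling.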
Citations
Book Chapter
Scale Aggregation Network for Accurate and Efficient Crowd Counting
TL;DR: A novel training loss combining Euclidean loss and local pattern consistency loss is proposed, which improves the model's performance in the authors' experiments and achieves performance superior to state-of-the-art methods with far fewer parameters.
Proceedings Article
Context-Aware Crowd Counting
TL;DR: In this article, an end-to-end trainable deep architecture that combines features obtained using multiple receptive field sizes and learns the importance of each such feature at each image location is proposed.
Proceedings Article
Learning From Synthetic Data for Crowd Counting in the Wild
TL;DR: A data collector and labeler is developed that can generate synthetic crowd scenes and annotate them automatically without any manpower, and a crowd counting method via domain adaptation is proposed, which frees humans from heavy data annotation.
Proceedings Article
ADCrowdNet: An Attention-Injective Deformable Convolutional Network for Crowd Understanding
TL;DR: An attention-injective deformable convolutional network for crowd understanding that addresses the accuracy degradation problem of highly congested noisy scenes, capturing crowd features more effectively and proving more resistant to various kinds of noise.
References
Proceedings Article
ImageNet Classification with Deep Convolutional Neural Networks
TL;DR: A deep convolutional neural network consisting of five convolutional layers, some followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax, which achieved state-of-the-art performance on ImageNet classification.
Proceedings Article
Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan, Andrew Zisserman
TL;DR: This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
Journal Article
Image quality assessment: from error visibility to structural similarity
TL;DR: A structural similarity (SSIM) index is proposed for image quality assessment based on the degradation of structural information, and is validated against subjective ratings on a database of images compressed with JPEG and JPEG2000.
Journal Article
Dropout: a simple way to prevent neural networks from overfitting
TL;DR: It is shown that dropout improves the performance of neural networks on supervised learning tasks in vision, speech recognition, document classification and computational biology, obtaining state-of-the-art results on many benchmark data sets.