Open Access · Proceedings ArticleDOI

Counting Everyday Objects in Everyday Scenes

TLDR
In this article, a divide-and-conquer model for counting the number of instances of object classes in natural, everyday images is proposed, inspired by the phenomenon of subitizing.
Abstract
We are interested in counting the number of instances of object classes in natural, everyday images. Previous counting approaches tackle the problem in restricted domains such as counting pedestrians in surveillance videos. Counts can also be estimated from outputs of other vision tasks like object detection. In this work, we build dedicated models for counting designed to tackle the large variance in counts, appearances, and scales of objects found in natural scenes. Our approach is inspired by the phenomenon of subitizing – the ability of humans to make quick assessments of counts given a perceptual signal, for small count values. Given a natural scene, we employ a divide-and-conquer strategy while incorporating context across the scene to adapt the subitizing idea to counting. Our approach offers consistent improvements over numerous baseline approaches for counting on the PASCAL VOC 2007 and COCO datasets. Subsequently, we study how counting can be used to improve object detection. We then show a proof-of-concept application of our counting methods to the task of Visual Question Answering, by studying the "how many?" questions in the VQA and COCO-QA datasets.
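The divide-and-conquer strategy above can be sketched as follows; the uniform grid partition and the per-cell subitizing-style predictor here are hypothetical simplifications for illustration, not the authors' exact architecture:

```python
import numpy as np

def aggregate_cell_counts(cell_counts):
    """Combine per-cell count estimates into one image-level count.

    cell_counts: 2-D grid of non-negative real-valued count predictions,
    one per image cell (a stand-in for the output of a small per-cell
    counting model; the real model also shares context across cells).
    """
    return int(round(float(np.sum(cell_counts))))

# Hypothetical 2x2 grid of per-cell predictions for one object class:
grid = np.array([[0.0, 1.2],
                 [0.8, 0.0]])
total = aggregate_cell_counts(grid)  # sums to 2.0, rounds to 2
```

Each cell sees only a small, easily "subitizable" number of objects, so the per-cell problem stays simple even when the image-level count is large.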



Citations
Posted Content

Automatically identifying, counting, and describing wild animals in camera-trap images with deep learning

TL;DR: This work investigates the ability to automatically, accurately, and inexpensively collect such data, which could help catalyze the transformation of many fields of ecology, wildlife biology, zoology, conservation biology, and animal behavior into "big data" sciences.
Proceedings Article

Compositional Attention Networks for Machine Reasoning

TL;DR: The MAC network is presented, a novel fully differentiable neural network architecture designed to facilitate explicit and expressive reasoning that is computationally and data efficient, in particular requiring 5x less data than existing models to achieve strong results.
Proceedings ArticleDOI

Representation Learning by Learning to Count

TL;DR: This paper uses two image transformations, scaling and tiling, in the context of counting to train a neural network with a contrastive loss, producing representations that perform on par with or exceed the state of the art on transfer-learning benchmarks.
Proceedings ArticleDOI

Context-Aware Crowd Counting

TL;DR: In this article, an end-to-end trainable deep architecture that combines features obtained using multiple receptive field sizes and learns the importance of each such feature at each image location is proposed.
Proceedings ArticleDOI

Bayesian Loss for Crowd Count Estimation With Point Supervision

TL;DR: This work proposes Bayesian loss, a novel loss function which constructs a density contribution probability model from the point annotations, and outperforms previous best approaches by a large margin on the latest and largest UCF-QNRF dataset.
References
Proceedings Article

Adam: A Method for Stochastic Optimization

TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
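The "adaptive estimates of lower-order moments" in the Adam summary can be made concrete with a minimal single-parameter sketch (hyperparameter values are the paper's defaults except the learning rate, which is enlarged here so the toy problem converges quickly):

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update using bias-corrected first and second moment estimates."""
    m = beta1 * m + (1 - beta1) * grad        # first moment (mean of gradients)
    v = beta2 * v + (1 - beta2) * grad ** 2   # second moment (uncentered variance)
    m_hat = m / (1 - beta1 ** t)              # bias correction for zero init
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Minimize f(x) = x^2 starting from x = 5.
theta, m, v = 5.0, 0.0, 0.0
for t in range(1, 2001):
    grad = 2.0 * theta
    theta, m, v = adam_step(theta, grad, m, v, t)
```

Because the update divides by the square root of the second-moment estimate, the effective step size is roughly bounded by `lr` regardless of the raw gradient magnitude.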
Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

TL;DR: A deep convolutional neural network consisting of five convolutional layers, some followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax achieves state-of-the-art image classification performance.
Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

TL;DR: This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
Proceedings Article

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

TL;DR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
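The normalization the Batch Normalization summary refers to can be sketched in a few lines; this is the training-time transform for fully-connected activations (the paper additionally tracks running statistics for inference, omitted here):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize each feature over the batch, then scale and shift.

    x: (batch, features) activations; gamma, beta: learned per-feature params.
    """
    mu = x.mean(axis=0)                    # per-feature batch mean
    var = x.var(axis=0)                    # per-feature batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)  # zero mean, unit variance
    return gamma * x_hat + beta            # restore representational capacity

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(64, 8))   # shifted, scaled activations
y = batch_norm(x, gamma=np.ones(8), beta=np.zeros(8))
```

With `gamma = 1` and `beta = 0`, each output feature has approximately zero mean and unit variance over the batch, which is what stabilizes the distribution of layer inputs during training.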