Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

Open Access, Posted Content

Shaoqing Ren, +3 more

- 04 Jun 2015 -

arXiv: Computer Vision and Pattern Recog...

TLDR

Faster R-CNN introduces a Region Proposal Network (RPN) that generates high-quality region proposals, which Fast R-CNN then uses for detection.

Abstract:

State-of-the-art object detection networks depend on region proposal algorithms to hypothesize object locations. Advances like SPPnet and Fast R-CNN have reduced the running time of these detection networks, exposing region proposal computation as a bottleneck. In this work, we introduce a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals. An RPN is a fully convolutional network that simultaneously predicts object bounds and objectness scores at each position. The RPN is trained end-to-end to generate high-quality region proposals, which are used by Fast R-CNN for detection. We further merge RPN and Fast R-CNN into a single network by sharing their convolutional features---using the recently popular terminology of neural networks with 'attention' mechanisms, the RPN component tells the unified network where to look. For the very deep VGG-16 model, our detection system has a frame rate of 5fps (including all steps) on a GPU, while achieving state-of-the-art object detection accuracy on PASCAL VOC 2007, 2012, and MS COCO datasets with only 300 proposals per image. In ILSVRC and COCO 2015 competitions, Faster R-CNN and RPN are the foundations of the 1st-place winning entries in several tracks. Code has been made publicly available.
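As a rough illustration of what "predicts object bounds and objectness scores at each position" and "only 300 proposals per image" can look like in practice, the sketch below lays reference boxes over every cell of a feature map and then keeps the highest-scoring ones. The abstract does not describe this mechanism; the stride of 16, the three scales, and the three aspect ratios are assumptions taken from the commonly cited configuration of the full paper, and the function names `make_anchors` and `top_proposals` are hypothetical. Box regression and non-maximum suppression, which a real pipeline applies before the final cut, are left out.

```python
from itertools import product


def make_anchors(feat_h, feat_w, stride=16,
                 scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return (x1, y1, x2, y2) reference boxes for every feature-map cell.

    stride, scales and ratios are assumed defaults, not values stated
    in the abstract.
    """
    anchors = []
    for row, col in product(range(feat_h), range(feat_w)):
        # Centre of this feature-map cell projected back onto the image.
        cx = (col + 0.5) * stride
        cy = (row + 0.5) * stride
        for scale, ratio in product(scales, ratios):
            # ratio is height / width; the box area stays at scale ** 2.
            w = scale / ratio ** 0.5
            h = scale * ratio ** 0.5
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors


def top_proposals(boxes, scores, n=300):
    """Keep the n boxes with the highest objectness scores."""
    ranked = sorted(zip(scores, boxes), key=lambda pair: pair[0], reverse=True)
    return [box for _, box in ranked[:n]]


if __name__ == "__main__":
    anchors = make_anchors(feat_h=2, feat_w=3)
    print(len(anchors))  # 2 * 3 cells, 9 boxes per cell
```

Because the boxes are a fixed function of the feature-map size, the only learned outputs per position are a score and a set of offsets for each box, which is one way to read the abstract's claim that proposals become nearly cost-free once the convolutional features are shared.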

Citations

Proceedings Article

Deep Residual Learning for Image Recognition

Kaiming He, +3 more

TL;DR: In this article, the authors propose a residual learning framework that eases the training of networks substantially deeper than those used previously; it won 1st place in the ILSVRC 2015 classification task.

Proceedings Article

You Only Look Once: Unified, Real-Time Object Detection

Joseph Redmon, +3 more

TL;DR: Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background, and outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.

Journal Article

Squeeze-and-Excitation Networks

Jie Hu, +4 more

TL;DR: This work proposes a novel architectural unit, termed the "Squeeze-and-Excitation" (SE) block, that adaptively recalibrates channel-wise feature responses by explicitly modelling interdependencies between channels. It finds that SE blocks produce significant performance improvements for existing state-of-the-art deep architectures at minimal additional computational cost.

Posted Content

MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications

Andrew Howard, +7 more

- 17 Apr 2017 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: This work introduces two simple global hyper-parameters that efficiently trade off between latency and accuracy, and demonstrates the effectiveness of MobileNets across a wide range of applications and use cases, including object detection, fine-grained classification, face attributes, and large-scale geo-localization.

Proceedings Article

Mask R-CNN

Kaiming He, +3 more

TL;DR: This work presents a conceptually simple, flexible, and general framework for object instance segmentation, which extends Faster R-CNN by adding a branch for predicting an object mask in parallel with the existing branch for bounding box recognition.

References

Posted Content

How good are detection proposals, really?

Jan Hosang, +2 more

- 26 Jun 2014 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: In this article, the authors provide an in-depth analysis of ten object proposal methods, along with four baselines, in terms of ground-truth annotation recall (on PASCAL VOC 2007 and ImageNet 2013), repeatability, and impact on DPM detector performance.

Posted Content

Object Detection Networks on Convolutional Feature Maps

Shaoqing Ren, +4 more

- 23 Apr 2015 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: Experiments show that, even with the effective ResNets and Faster R-CNN systems, the design of NoCs is an essential element of the 1st-place winning entries in the ImageNet and MS COCO 2015 challenges.

Posted Content

Object-Proposal Evaluation Protocol is 'Gameable'

Neelima Chavali, +3 more

- 21 May 2015 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: In this article, a nearly fully annotated version of the PASCAL VOC dataset is introduced to check whether object proposal techniques are overfitting to a particular list of categories.

Posted Content

DeePM: A Deep Part-Based Model for Object Detection and Semantic Part Localization

Jun Zhu, +2 more

- 23 Nov 2015 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: This paper annotates semantic parts for all 20 object categories on the PASCAL VOC 2012 dataset, which provides information on object pose, occlusion, viewpoint and functionality and presents an end-to-end Object-Part R-CNN which learns an implicit feature representation for jointly mapping an image ROI to the object and part bounding boxes.

Posted Content

R-CNN minus R

Karel Lenc, +1 more

- 23 Jun 2015 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: In this paper, the role of proposal generation in CNN-based detectors is investigated to determine whether it is a necessary modelling component, carrying essential geometric information not contained in the CNN, or a way of accelerating detection.

Related Papers (5)

Deep Residual Learning for Image Recognition

Very Deep Convolutional Networks for Large-Scale Image Recognition

You Only Look Once: Unified, Real-Time Object Detection

ImageNet Classification with Deep Convolutional Neural Networks

Microsoft COCO: Common Objects in Context