Proceedings ArticleDOI

Learning RoI Transformer for Oriented Object Detection in Aerial Images

TLDR
The core idea of RoI Transformer is to apply spatial transformations on RoIs and learn the transformation parameters under the supervision of oriented bounding box (OBB) annotations.
Abstract
Object detection in aerial images is an active yet challenging task in computer vision because of the bird's-eye view perspective, the highly complex backgrounds, and the varied appearances of objects. Especially when detecting densely packed objects in aerial images, methods relying on horizontal proposals for common object detection often introduce mismatches between the Regions of Interest (RoIs) and objects. This leads to a common misalignment between the final object classification confidence and localization accuracy. In this paper, we propose a RoI Transformer to address these problems. The core idea of RoI Transformer is to apply spatial transformations on RoIs and learn the transformation parameters under the supervision of oriented bounding box (OBB) annotations. RoI Transformer is lightweight and can be easily embedded into detectors for oriented object detection. Simply applying the RoI Transformer to Light-Head R-CNN achieves state-of-the-art performance on two common and challenging aerial datasets, i.e., DOTA and HRSC2016, with a negligible reduction in detection speed. Our RoI Transformer outperforms deformable Position-Sensitive RoI pooling when oriented bounding-box annotations are available. Extensive experiments have also validated the flexibility and effectiveness of our RoI Transformer.
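The core mechanism can be pictured with a short sketch. The module below is our own illustration, not the authors' code: the name RRoILearner, the tensor layouts, and the PyTorch framing are assumptions. A small head regresses rotated-box offsets (dx, dy, dw, dh, dθ) from the pooled features of a horizontal RoI; in the paper the decoded rotated RoI then feeds a rotated, position-sensitive RoI pooling step that extracts rotation-aligned features for the final classification and regression heads.

```python
import torch
import torch.nn as nn

class RRoILearner(nn.Module):
    """Illustrative sketch (not the paper's code): regress rotated-RoI offsets
    from pooled horizontal-RoI features, to be supervised by oriented
    bounding-box (OBB) annotations."""

    def __init__(self, in_features: int = 256 * 7 * 7):
        super().__init__()
        self.fc = nn.Linear(in_features, 5)  # (dx, dy, dw, dh, dtheta) per RoI

    def forward(self, roi_feats: torch.Tensor, hrois: torch.Tensor) -> torch.Tensor:
        # roi_feats: (N, C*H*W) flattened pooled features of horizontal RoIs
        # hrois:     (N, 4) horizontal RoIs as (cx, cy, w, h)
        d = self.fc(roi_feats)                        # (N, 5) predicted offsets
        cx = hrois[:, 0] + d[:, 0] * hrois[:, 2]      # shift centre by dx * w
        cy = hrois[:, 1] + d[:, 1] * hrois[:, 3]      # shift centre by dy * h
        w = hrois[:, 2] * torch.exp(d[:, 2])          # rescale width
        h = hrois[:, 3] * torch.exp(d[:, 3])          # rescale height
        theta = d[:, 4]                               # rotation (radians, in this sketch)
        return torch.stack([cx, cy, w, h, theta], dim=1)  # rotated RoIs, shape (N, 5)
```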


Citations
Journal ArticleDOI

Gliding Vertex on the Horizontal Bounding Box for Multi-Oriented Object Detection

TL;DR: Introduces an obliquity factor, based on the area ratio between the object and its horizontal bounding box, that guides the choice between horizontal and oriented detection for each object; five extra target variables are added to the regression head of Faster R-CNN at negligible extra computation cost.
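A minimal sketch of how such an obliquity factor could be computed, assuming the oriented object is given as a 4-point quadrilateral (the helper name and input convention are ours, not the paper's code):

```python
import numpy as np

def obliquity_factor(quad: np.ndarray) -> float:
    """Area ratio between an oriented object (a 4x2 array of vertices) and its
    horizontal bounding box; values near 1 indicate a nearly axis-aligned
    object, small values a strongly oriented one."""
    x, y = quad[:, 0], quad[:, 1]
    # Shoelace formula for the quadrilateral area
    poly_area = 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))
    hbb_area = (x.max() - x.min()) * (y.max() - y.min())  # horizontal box area
    return float(poly_area / hbb_area)
```

For example, a square rotated by 45 degrees gives a factor of 0.5 while an axis-aligned rectangle gives 1.0, which is the intuition behind using the factor to switch between horizontal and oriented outputs.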
Journal ArticleDOI

Align Deep Features for Oriented Object Detection

TL;DR: A single-shot alignment network (S2A-Net) consisting of two modules, a feature alignment module (FAM) and an oriented detection module (ODM), which achieves state-of-the-art performance on two commonly used aerial object datasets while keeping high efficiency.
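At a structural level the two-module design can be sketched as follows; the interfaces are our assumptions, meant only to show the FAM-to-ODM dataflow rather than the actual S2A-Net implementation:

```python
import torch.nn as nn

class S2ANetHeadSketch(nn.Module):
    """Structural sketch only: the feature alignment module (FAM) refines the
    initial anchors and aligns features to them; the oriented detection module
    (ODM) then predicts class scores and oriented-box offsets."""

    def __init__(self, fam: nn.Module, odm: nn.Module):
        super().__init__()
        self.fam = fam  # feature alignment module
        self.odm = odm  # oriented detection module

    def forward(self, feats, anchors):
        refined_anchors, aligned_feats = self.fam(feats, anchors)
        cls_scores, obb_deltas = self.odm(aligned_feats)
        return refined_anchors, cls_scores, obb_deltas
```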
Posted Content

R3Det: Refined Single-Stage Detector with Feature Refinement for Rotating Object

TL;DR: The key idea of the feature refinement module is to re-encode the position information of the current refined bounding box into the corresponding feature points through feature interpolation, realizing feature reconstruction and alignment.
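The interpolation step at the heart of this re-encoding can be sketched as plain bilinear sampling of a feature map at a fractional key-point location of the refined box (the function below is our illustration, not the paper's kernel):

```python
import torch

def bilinear_sample(feat: torch.Tensor, x: float, y: float) -> torch.Tensor:
    """Sample a (C, H, W) feature map at fractional location (x, y), e.g. the
    centre of a refined bounding box, so the feature is re-aligned to it."""
    x0, y0 = int(x), int(y)
    x1 = min(x0 + 1, feat.shape[2] - 1)
    y1 = min(y0 + 1, feat.shape[1] - 1)
    wx, wy = x - x0, y - y0
    return ((1 - wx) * (1 - wy) * feat[:, y0, x0] +
            wx * (1 - wy) * feat[:, y0, x1] +
            (1 - wx) * wy * feat[:, y1, x0] +
            wx * wy * feat[:, y1, x1])
```

In the paper, features sampled at the refined box's key points are written back into the feature map in a fully convolutional way; the sketch isolates only the interpolation itself.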
Book ChapterDOI

Arbitrary-Oriented Object Detection with Circular Smooth Label

TL;DR: Designs a new rotation detection baseline that addresses the boundary problem by transforming angle prediction from a regression problem into a classification task with little accuracy loss, achieving high-precision angle classification in contrast to previous works that use coarse granularity.
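A minimal sketch of what a circular smooth label can look like, assuming 1-degree bins and a Gaussian window (the paper studies several window functions; the function name and default radius here are our choices):

```python
import numpy as np

def circular_smooth_label(angle_deg: float, num_bins: int = 180, radius: float = 6.0) -> np.ndarray:
    """Encode an angle as a soft classification target: bins near the
    ground-truth angle get smoothly decaying weights, and the distance wraps
    around so bins on either side of the angular boundary stay neighbours."""
    bins = np.arange(num_bins)
    gt_bin = int(round(angle_deg)) % num_bins
    diff = np.abs(bins - gt_bin)
    dist = np.minimum(diff, num_bins - diff)           # circular distance
    label = np.exp(-(dist ** 2) / (2 * radius ** 2))   # Gaussian window
    label[dist > radius] = 0.0                         # truncate outside the window
    return label
```

Because the distance is circular, an angle of 179 degrees produces a label whose mass spills over into the bins near 0 degrees, which is exactly the boundary case that plain angle regression handles poorly.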
Proceedings ArticleDOI

Dynamic Refinement Network for Oriented and Densely Packed Object Detection

TL;DR: A dynamic refinement network consisting of two novel components: a feature selection module (FSM) that enables neurons to adjust their receptive fields in accordance with the shapes and orientations of target objects, and a dynamic refinement head (DRH) that empowers the model to refine predictions dynamically in an object-aware manner.
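A loose sketch of the feature-selection idea as we read it: parallel convolutions with differently shaped receptive fields are blended by a learned, per-location attention, so the effective receptive field can follow an object's shape and orientation. The branch shapes and module name below are our assumptions, not the paper's configuration:

```python
import torch
import torch.nn as nn

class FeatureSelectionSketch(nn.Module):
    """Blend conv branches with different receptive-field shapes using a
    per-location softmax attention (illustrative simplification)."""

    def __init__(self, channels: int):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, kernel_size=(3, 3), padding=(1, 1)),
            nn.Conv2d(channels, channels, kernel_size=(1, 5), padding=(0, 2)),
            nn.Conv2d(channels, channels, kernel_size=(5, 1), padding=(2, 0)),
        ])
        self.attn = nn.Conv2d(channels, len(self.branches), kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        weights = torch.softmax(self.attn(x), dim=1)           # (N, 3, H, W)
        outs = torch.stack([b(x) for b in self.branches], 1)   # (N, 3, C, H, W)
        return (weights.unsqueeze(2) * outs).sum(dim=1)        # (N, C, H, W)
```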
References
Proceedings ArticleDOI

Deep Residual Learning for Image Recognition

TL;DR: The authors propose a residual learning framework to ease the training of networks that are substantially deeper than those used previously, which won 1st place in the ILSVRC 2015 classification task.
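The residual idea itself fits in a few lines; the block below is a minimal, generic sketch rather than the exact ResNet block configuration:

```python
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Learn a residual F(x) and add it to the identity shortcut: y = F(x) + x."""

    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.body(x) + x)  # identity shortcut eases optimization
```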
Book ChapterDOI

Microsoft COCO: Common Objects in Context

TL;DR: A new dataset aimed at advancing the state of the art in object recognition by placing it in the context of the broader question of scene understanding, achieved by gathering images of complex everyday scenes containing common objects in their natural context.
Journal ArticleDOI

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

TL;DR: This work introduces a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals, and further merges RPN and Fast R-CNN into a single network by sharing their convolutional features.
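The RPN head is small enough to sketch directly; the version below predicts a single objectness logit per anchor for brevity (the paper uses a two-way softmax), and all names are ours:

```python
import torch.nn as nn

class RPNHeadSketch(nn.Module):
    """A 3x3 conv slides over the shared backbone feature map; two sibling
    1x1 convs predict objectness scores and box deltas for k anchors per
    location, so region proposals come nearly for free."""

    def __init__(self, in_channels: int = 256, num_anchors: int = 9):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, in_channels, 3, padding=1)
        self.relu = nn.ReLU(inplace=True)
        self.objectness = nn.Conv2d(in_channels, num_anchors, 1)       # score per anchor
        self.bbox_deltas = nn.Conv2d(in_channels, num_anchors * 4, 1)  # (dx, dy, dw, dh)

    def forward(self, feat):
        t = self.relu(self.conv(feat))
        return self.objectness(t), self.bbox_deltas(t)
```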
Posted Content

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

TL;DR: Faster R-CNN, as discussed by the authors, proposes a Region Proposal Network (RPN) to generate high-quality region proposals, which are used by Fast R-CNN for detection.
Proceedings ArticleDOI

Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation

TL;DR: R-CNN, as discussed by the authors, combines CNNs with bottom-up region proposals to localize and segment objects; when labeled training data is scarce, supervised pre-training for an auxiliary task followed by domain-specific fine-tuning yields a significant performance boost.