Soft-NMS — Improving Object Detection with One Line of Code

doi:10.1109/ICCV.2017.593

Open Access Proceedings Article

- pp 5562-5570

TLDR

Soft-NMS decays the detection scores of all other boxes as a continuous function of their overlap with the top-scoring box M, instead of suppressing them outright. Under traditional NMS, by contrast, an object that lies within the predefined overlap threshold of M is suppressed, which leads to a miss.

Abstract:

Non-maximum suppression is an integral part of the object detection pipeline. First, it sorts all detection boxes on the basis of their scores. The detection box M with the maximum score is selected and all other detection boxes with a significant overlap (using a pre-defined threshold) with M are suppressed. This process is recursively applied on the remaining boxes. As per the design of the algorithm, if an object lies within the predefined overlap threshold, it leads to a miss. To this end, we propose Soft-NMS, an algorithm which decays the detection scores of all other objects as a continuous function of their overlap with M. Hence, no object is eliminated in this process. Soft-NMS obtains consistent improvements for the coco-style mAP metric on standard datasets like PASCAL VOC2007 (1.7% for both R-FCN and Faster-RCNN) and MS-COCO (1.3% for R-FCN and 1.1% for Faster-RCNN) by just changing the NMS algorithm without any additional hyper-parameters. Using Deformable-RFCN, Soft-NMS improves state-of-the-art in object detection from 39.8% to 40.9% with a single model. Further, the computational complexity of Soft-NMS is the same as traditional NMS and hence it can be efficiently implemented. Since Soft-NMS does not require any extra training and is simple to implement, it can be easily integrated into any object detection pipeline. Code for Soft-NMS is publicly available on GitHub http://bit.ly/2nJLNMu.
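The rescoring loop described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' released code: the abstract does not state the decay function, so the sketch assumes a Gaussian penalty exp(-IoU²/sigma). The names `iou`, `soft_nms`, `sigma`, and `score_thresh` are hypothetical, and boxes are assumed to be (x1, y1, x2, y2) tuples.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.0):
    """Return (index, rescored) pairs in the order boxes are selected.

    sigma (assumed Gaussian width) and score_thresh (optional pruning of
    negligible scores) are illustrative parameters, not from the abstract.
    """
    scores = list(scores)  # copy so the caller's list is not modified
    remaining = list(range(len(boxes)))
    selected = []
    while remaining:
        # Pick the box M with the maximum current score.
        m = max(remaining, key=lambda i: scores[i])
        remaining.remove(m)
        selected.append((m, scores[m]))
        # Decay every other score as a continuous function of overlap with M,
        # rather than discarding boxes above a hard overlap threshold.
        for i in remaining:
            overlap = iou(boxes[m], boxes[i])
            scores[i] *= math.exp(-(overlap ** 2) / sigma)
        # With the default score_thresh of 0.0 no box is ever dropped.
        remaining = [i for i in remaining if scores[i] >= score_thresh]
    return selected
```

For example, calling `soft_nms` on two heavily overlapping boxes and one distant box keeps all three, but the overlapping lower-scored box is ranked last with a reduced score, whereas hard NMS would remove it if its overlap exceeded the threshold.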

Citations

Book Chapter

Acquisition of Localization Confidence for Accurate Object Detection

Borui Jiang, +4 more

TL;DR: IoU-Net learns to predict the IoU between each detected bounding box and its matched ground truth, which improves the NMS procedure by preserving accurately localized bounding boxes.

Journal Article

Cascade R-CNN: High Quality Object Detection and Instance Segmentation

Zhaowei Cai, +1 more

- 01 May 2021 -

IEEE Transactions on Pattern Analysis an...

TL;DR: Cascade R-CNN is a multi-stage object detection architecture composed of a sequence of detectors trained with increasing IoU thresholds; it significantly improves high-quality detection on generic and specific object datasets, including VOC, KITTI, CityPerson, and WiderFace.

Book Chapter

PersonLab: Person Pose Estimation and Instance Segmentation with a Bottom-Up, Part-Based, Geometric Embedding Model

George Papandreou, +5 more

TL;DR: A CNN detects individual keypoints and predicts their relative displacements, which allows keypoints to be grouped into person pose instances; semantic person pixels are then associated with their corresponding person instance, delivering instance-level person segmentations.

Posted Content

Sparse R-CNN: End-to-End Object Detection with Learnable Proposals

Peize Sun, +10 more

- 25 Nov 2020 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: Sparse R-CNN demonstrates accuracy, run-time and training convergence performance on par with the well-established detector baselines on the challenging COCO dataset, e.g., achieving 45.0 AP in standard 3× training schedule and running at 22 fps using ResNet-50 FPN model.

Posted Content

Cascade R-CNN: High Quality Object Detection and Instance Segmentation

Zhaowei Cai, +1 more

- 24 Jun 2019 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: Cascade R-CNN is a multi-stage object detection architecture composed of a sequence of detectors trained with increasing intersection over union (IoU) thresholds, which progressively improves hypothesis quality.

References

Proceedings Article

Deep Residual Learning for Image Recognition

Kaiming He, +3 more

TL;DR: A residual learning framework eases the training of networks that are substantially deeper than those used previously; the resulting residual networks won 1st place on the ILSVRC 2015 classification task.

Proceedings Article

Histograms of oriented gradients for human detection

Navneet Dalal, +1 more

TL;DR: It is shown experimentally that grids of histograms of oriented gradient (HOG) descriptors significantly outperform existing feature sets for human detection, and the influence of each stage of the computation on performance is studied.

Journal Article

A Computational Approach to Edge Detection

John Canny

- 01 Jun 1986 -

IEEE Transactions on Pattern Analysis an...

TL;DR: There is a natural uncertainty principle between detection and localization performance, which are the two main goals, and with this principle a single operator shape is derived which is optimal at any scale.

Proceedings Article

You Only Look Once: Unified, Real-Time Object Detection

Joseph Redmon, +3 more

TL;DR: Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background, and outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.

Proceedings Article

Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation

Ross Girshick, +3 more

TL;DR: R-CNN combines CNNs with bottom-up region proposals to localize and segment objects; when labeled training data is scarce, supervised pre-training for an auxiliary task followed by domain-specific fine-tuning yields a significant performance boost.

Related Papers (5)

SSD: Single Shot MultiBox Detector

Deep Residual Learning for Image Recognition

Feature Pyramid Networks for Object Detection

You Only Look Once: Unified, Real-Time Object Detection

Fast R-CNN