Soft-NMS — Improving Object Detection with One Line of Code

doi:10.1109/ICCV.2017.593

Open Access

Proceedings Article

- pp 5562-5570

TLDR

Soft-NMS decays the detection scores of all other boxes as a continuous function of their overlap with the top-scoring box M, instead of suppressing them outright. Under traditional NMS, by contrast, an object whose box overlaps M beyond the predefined threshold is discarded, which leads to a miss.

Abstract:

Non-maximum suppression is an integral part of the object detection pipeline. First, it sorts all detection boxes on the basis of their scores. The detection box M with the maximum score is selected and all other detection boxes with a significant overlap (using a pre-defined threshold) with M are suppressed. This process is recursively applied on the remaining boxes. As per the design of the algorithm, if an object lies within the predefined overlap threshold, it leads to a miss. To this end, we propose Soft-NMS, an algorithm which decays the detection scores of all other objects as a continuous function of their overlap with M. Hence, no object is eliminated in this process. Soft-NMS obtains consistent improvements for the COCO-style mAP metric on standard datasets like PASCAL VOC2007 (1.7% for both R-FCN and Faster-RCNN) and MS-COCO (1.3% for R-FCN and 1.1% for Faster-RCNN) by just changing the NMS algorithm without any additional hyper-parameters. Using Deformable-RFCN, Soft-NMS improves the state of the art in object detection from 39.8% to 40.9% with a single model. Further, the computational complexity of Soft-NMS is the same as traditional NMS and hence it can be efficiently implemented. Since Soft-NMS does not require any extra training and is simple to implement, it can be easily integrated into any object detection pipeline. Code for Soft-NMS is publicly available on GitHub at http://bit.ly/2nJLNMu.
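
The procedure described in the abstract can be sketched in a few lines of pure Python. This is a minimal illustration, not the authors' released code. The abstract does not state which decay function is used, so the Gaussian penalty `exp(-IoU^2 / sigma)`, the `sigma` default, the `score_floor` cutoff used to discard negligible scores, and all function and parameter names are assumptions made for this example. Passing `hard_threshold` switches to traditional NMS so the two behaviours can be compared on the same input.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_floor=0.001, hard_threshold=None):
    """Return (index, final_score) pairs in the order boxes are selected.

    hard_threshold=None -> soft variant: decay scores with a Gaussian penalty
                           (assumed form; see the lead-in above).
    hard_threshold=t    -> traditional NMS: drop boxes whose IoU with M exceeds t.
    """
    scores = list(scores)              # work on a copy; scores get rescaled
    remaining = list(range(len(boxes)))
    kept = []
    while remaining:
        # Select the box M with the maximum current score.
        m = max(remaining, key=lambda i: scores[i])
        remaining.remove(m)
        kept.append((m, scores[m]))
        survivors = []
        for i in remaining:
            overlap = iou(boxes[m], boxes[i])
            if hard_threshold is not None:
                if overlap > hard_threshold:
                    continue           # traditional NMS: suppress outright
            else:
                # Soft variant: the only change is this rescoring step.
                scores[i] *= math.exp(-(overlap ** 2) / sigma)
                if scores[i] < score_floor:
                    continue           # practical cutoff (assumption)
            survivors.append(i)
        remaining = survivors
    return kept


# Two heavily overlapping boxes plus one isolated box.
boxes = [(0, 0, 10, 10), (1, 0, 11, 10), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
hard = soft_nms(boxes, scores, hard_threshold=0.5)
soft = soft_nms(boxes, scores)
```

In this toy input the second box has an IoU of about 0.82 with the first. With `hard_threshold=0.5` it is removed entirely, while the soft variant keeps it with a reduced score, so it is ranked below the isolated third box rather than lost.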

Citations

Posted Content

YOLOv4: Optimal Speed and Accuracy of Object Detection

Alexey Bochkovskiy, +2 more

- 23 Apr 2020 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: This work uses new features (WRC, CSP, CmBN, SAT, Mish activation, Mosaic data augmentation, DropBlock regularization, and CIoU loss) and combines some of them to achieve state-of-the-art results: 43.5% AP for the MS COCO dataset at a real-time speed of ~65 FPS on a Tesla V100.

Posted Content

End-to-End Object Detection with Transformers

Nicolas Carion, +5 more

- 26 May 2020 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: This work presents a new method that views object detection as a direct set prediction problem, and demonstrates accuracy and run-time performance on par with the well-established and highly-optimized Faster RCNN baseline on the challenging COCO object detection dataset.

Posted Content

Swin Transformer: Hierarchical Vision Transformer using Shifted Windows

Ze Liu, +7 more

- 25 Mar 2021 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: This work proposes a new vision Transformer, called Swin Transformer, whose representation is computed with shifted windows. The design addresses differences between the language and vision domains, such as large variations in the scale of visual entities and the high resolution of pixels in images compared to words in text.

Book Chapter

End-to-End Object Detection with Transformers

Nicolas Carion, +5 more

TL;DR: DETR combines a set-based global loss that forces unique predictions via bipartite matching with a transformer encoder-decoder architecture that directly outputs the final set of predictions in parallel.

Journal Article

Deep Learning for Generic Object Detection: A Survey

Li Liu, +7 more

- 01 Feb 2020 -

International Journal of Computer Vision

TL;DR: A comprehensive survey of the recent achievements in this field brought about by deep learning techniques, covering many aspects of generic object detection: detection frameworks, object feature representation, object proposal generation, context modeling, training strategies, and evaluation metrics.
