Open Access, Proceedings Article

Soft-NMS — Improving Object Detection with One Line of Code

TLDR
Soft-NMS replaces the hard suppression step of non-maximum suppression. Standard NMS discards every box that overlaps the top-scoring detection M beyond a predefined threshold, which causes a miss whenever a real object lies within that threshold. Soft-NMS instead decays the detection scores of the other boxes as a continuous function of their overlap with M.
Abstract
Non-maximum suppression is an integral part of the object detection pipeline. First, it sorts all detection boxes on the basis of their scores. The detection box M with the maximum score is selected and all other detection boxes with a significant overlap (using a predefined threshold) with M are suppressed. This process is applied recursively to the remaining boxes. As per the design of the algorithm, if an object lies within the predefined overlap threshold, it leads to a miss. To this end, we propose Soft-NMS, an algorithm which decays the detection scores of all other objects as a continuous function of their overlap with M. Hence, no object is eliminated in this process. Soft-NMS obtains consistent improvements for the COCO-style mAP metric on standard datasets like PASCAL VOC2007 (1.7% for both R-FCN and Faster-RCNN) and MS-COCO (1.3% for R-FCN and 1.1% for Faster-RCNN) by just changing the NMS algorithm without any additional hyper-parameters. Using Deformable-RFCN, Soft-NMS improves the state of the art in object detection from 39.8% to 40.9% with a single model. Further, the computational complexity of Soft-NMS is the same as traditional NMS and hence it can be efficiently implemented. Since Soft-NMS does not require any extra training and is simple to implement, it can be easily integrated into any object detection pipeline. Code for Soft-NMS is publicly available on GitHub at http://bit.ly/2nJLNMu.
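
The procedure described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' released code. The abstract does not say which decay function is used, so the linear and Gaussian weightings below are assumptions, as are the function names and the parameters `iou_thresh`, `sigma`, and `score_floor` (a small cutoff for dropping boxes whose score has decayed to almost nothing). Boxes are assumed to be in `(x1, y1, x2, y2)` format with positive area.

```python
import numpy as np


def iou_one_to_many(box, boxes):
    """IoU between one box and an (N, 4) array of boxes, all (x1, y1, x2, y2)."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)


def soft_nms(boxes, scores, method="linear", iou_thresh=0.3,
             sigma=0.5, score_floor=0.001):
    """Return (kept indices in selection order, their rescored values)."""
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float).copy()  # do not mutate the input
    remaining = np.arange(len(scores))
    keep = []
    while remaining.size > 0:
        # Select M, the remaining box with the maximum (possibly decayed) score.
        top = np.argmax(scores[remaining])
        m = remaining[top]
        keep.append(m)
        remaining = np.delete(remaining, top)
        if remaining.size == 0:
            break
        overlap = iou_one_to_many(boxes[m], boxes[remaining])
        if method == "hard":
            # Traditional NMS: drop every box overlapping M beyond the threshold.
            weight = np.where(overlap > iou_thresh, 0.0, 1.0)
        elif method == "linear":
            # Assumed linear decay: scale by (1 - IoU) beyond the threshold.
            weight = np.where(overlap > iou_thresh, 1.0 - overlap, 1.0)
        elif method == "gaussian":
            # Assumed Gaussian decay: smooth penalty that grows with overlap.
            weight = np.exp(-(overlap ** 2) / sigma)
        else:
            raise ValueError(f"unknown method: {method}")
        scores[remaining] *= weight
        # Discard boxes whose score has fallen below the floor.
        remaining = remaining[scores[remaining] > score_floor]
    keep = np.array(keep, dtype=int)
    return keep, scores[keep]
```

Calling `soft_nms(boxes, scores, method="hard")` on two heavily overlapping boxes keeps only the higher-scoring one, while `method="linear"` or `method="gaussian"` keeps both and lowers the score of the second. Each iteration compares M against all remaining boxes, so the sketch has the same quadratic cost as traditional NMS, in line with the complexity claim in the abstract.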

Citations
Journal Article

Inverted Non-maximum Suppression for more Accurate and Neater Face Detection

Lian Liu, +1 more - 17 May 2023
TL;DR: The authors propose a new NMS method that operates in the reverse order of other NMS methods and performs well on low-quality and tiny face samples.
Posted Content

Mixture-Model-based Bounding Box Density Estimation for Object Detection

TL;DR: A new object detection network, Mixture-Model-based Object Detector (MMOD), that performs multi-object detection through density estimation using a mixture model, and outperforms other detection methods in terms of speed and performance trade-offs.
Proceedings Article

Estimating Maximum Likelihood using the Combined Linear and Nonlinear Function in NMS for Object Detection

TL;DR: In this paper, the authors combine linear and nonlinear functions to implement a novel fuzzy IoU-guided NMS that estimates the maximum likelihood (i.e., the bounding box in this case) in order to refine object detection.
Journal Article

Compact Sparse R-CNN: Speeding up sparse R-CNN by reducing iterative detection heads and simplifying feature pyramid network

Zihang He, +2 more - 01 May 2023
TL;DR: In this paper, the authors propose an iterative Hungarian assigner that encourages Sparse R-CNN to generate multiple proposals for each object at the inference stage, which lowers the miss rate when the number of iterative heads is small.
Journal Article

Efficient novel penultimate joint detector for shrimps selection employing convolutional pose machine

TL;DR: In this paper, a cascaded neural network is proposed to detect key points in multi-shrimp scenes. It has two stages: a shrimp detector based on YOLOv3, followed by a pose estimator based on the Convolutional Pose Machine (CPM).
References
Proceedings Article

Deep Residual Learning for Image Recognition

TL;DR: In this article, the authors propose a residual learning framework to ease the training of networks that are substantially deeper than those used previously; the framework won 1st place on the ILSVRC 2015 classification task.
Proceedings Article

Histograms of oriented gradients for human detection

TL;DR: It is shown experimentally that grids of histograms of oriented gradient (HOG) descriptors significantly outperform existing feature sets for human detection, and the influence of each stage of the computation on performance is studied.
Journal Article

A Computational Approach to Edge Detection

TL;DR: There is a natural uncertainty principle between detection and localization performance, which are the two main goals, and with this principle a single operator shape is derived which is optimal at any scale.
Proceedings Article

You Only Look Once: Unified, Real-Time Object Detection

TL;DR: Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background, and outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.
Proceedings Article

Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation

TL;DR: R-CNN combines CNNs with bottom-up region proposals to localize and segment objects. When labeled training data is scarce, supervised pre-training for an auxiliary task, followed by domain-specific fine-tuning, yields a significant performance boost.