Open Access · Journal Article

Focal Loss for Dense Object Detection

TLDR
Focal loss focuses training on a sparse set of hard examples and prevents the vast number of easy negatives from overwhelming the detector during training, which improves the accuracy of one-stage detectors.
Abstract
The highest accuracy object detectors to date are based on a two-stage approach popularized by R-CNN, where a classifier is applied to a sparse set of candidate object locations. In contrast, one-stage detectors that are applied over a regular, dense sampling of possible object locations have the potential to be faster and simpler, but have trailed the accuracy of two-stage detectors thus far. In this paper, we investigate why this is the case. We discover that the extreme foreground-background class imbalance encountered during training of dense detectors is the central cause. We propose to address this class imbalance by reshaping the standard cross entropy loss such that it down-weights the loss assigned to well-classified examples. Our novel Focal Loss focuses training on a sparse set of hard examples and prevents the vast number of easy negatives from overwhelming the detector during training. To evaluate the effectiveness of our loss, we design and train a simple dense detector we call RetinaNet. Our results show that when trained with the focal loss, RetinaNet is able to match the speed of previous one-stage detectors while surpassing the accuracy of all existing state-of-the-art two-stage detectors. Code is at: https://github.com/facebookresearch/Detectron.
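The abstract describes the loss only in words. As a minimal sketch, the form usually quoted from the full paper is FL(p_t) = -(1 - p_t)^gamma * log(p_t), where p_t is the predicted probability of the true class and gamma >= 0 is a focusing parameter, with an optional class weight alpha_t. The formula, the parameter names, and the toy numbers below are not stated in the abstract and should be read as assumptions for illustration, not as the Detectron implementation.

```python
import math


def cross_entropy(p, y):
    """Binary cross entropy for one example.

    p is the predicted probability of the foreground class, y is 0 or 1.
    """
    p_t = p if y == 1 else 1.0 - p  # probability assigned to the true class
    return -math.log(p_t)


def focal_loss(p, y, gamma=2.0, alpha=None):
    """Cross entropy scaled by (1 - p_t) ** gamma.

    gamma=0 recovers plain cross entropy. alpha, if given, weights
    foreground examples by alpha and background examples by 1 - alpha.
    """
    p_t = p if y == 1 else 1.0 - p
    loss = -((1.0 - p_t) ** gamma) * math.log(p_t)
    if alpha is not None:
        alpha_t = alpha if y == 1 else 1.0 - alpha
        loss *= alpha_t
    return loss


# Hypothetical batch: many easy background locations, few hard objects.
easy_negatives = [(0.01, 0)] * 100_000  # background, confidently rejected
hard_positives = [(0.10, 1)] * 100      # objects the model mostly misses


def total(loss_fn, examples):
    return sum(loss_fn(p, y) for p, y in examples)


ce_easy = total(cross_entropy, easy_negatives)
ce_hard = total(cross_entropy, hard_positives)
fl_easy = total(focal_loss, easy_negatives)
fl_hard = total(focal_loss, hard_positives)

print(f"CE share from easy negatives: {ce_easy / (ce_easy + ce_hard):.3f}")
print(f"FL share from easy negatives: {fl_easy / (fl_easy + fl_hard):.5f}")
```

In this toy batch the easy negatives contribute most of the summed cross entropy simply because there are so many of them, while under the focal loss with gamma = 2 their share falls well below 1% and the hard examples dominate the total. That is the down-weighting of well-classified examples the abstract refers to.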

