You Only Look Once: Unified, Real-Time Object Detection

doi:10.1109/CVPR.2016.91

Proceedings Article (Open Access)

Joseph Redmon, +3 more

- pp 779-788

TLDR

Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background. It also outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.

Abstract:

We present YOLO, a new approach to object detection. Prior work on object detection repurposes classifiers to perform detection. Instead, we frame object detection as a regression problem to spatially separated bounding boxes and associated class probabilities. A single neural network predicts bounding boxes and class probabilities directly from full images in one evaluation. Since the whole detection pipeline is a single network, it can be optimized end-to-end directly on detection performance. Our unified architecture is extremely fast. Our base YOLO model processes images in real-time at 45 frames per second. A smaller version of the network, Fast YOLO, processes an astounding 155 frames per second while still achieving double the mAP of other real-time detectors. Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background. Finally, YOLO learns very general representations of objects. It outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.
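The regression framing in the abstract can be illustrated with a minimal sketch of how a single output grid might be decoded into boxes. The layout assumed here (an S x S grid where each cell holds B boxes of `x, y, w, h, confidence` followed by C class probabilities) comes from the full paper, not from this page. The function name `decode_grid` and the `score_threshold` value are hypothetical choices for illustration only.

```python
from typing import List, Tuple

# (center_x, center_y, width, height, score, class_id), all relative to the image.
Detection = Tuple[float, float, float, float, float, int]


def decode_grid(pred: List[List[List[float]]], num_boxes: int, num_classes: int,
                score_threshold: float = 0.2) -> List[Detection]:
    """Turn an S x S x (B*5 + C) prediction grid into scored boxes.

    Assumed cell layout: B groups of (x, y, w, h, confidence), then C class
    probabilities shared by all boxes in the cell.
    """
    grid_size = len(pred)
    detections: List[Detection] = []
    for row in range(grid_size):
        for col in range(grid_size):
            cell = pred[row][col]
            class_probs = cell[num_boxes * 5: num_boxes * 5 + num_classes]
            best_class = max(range(num_classes), key=lambda k: class_probs[k])
            for b in range(num_boxes):
                x, y, w, h, conf = cell[b * 5: b * 5 + 5]
                # Class-specific score: box confidence times class probability.
                score = conf * class_probs[best_class]
                if score < score_threshold:
                    continue
                # x and y are offsets inside the cell; shift by the cell index
                # and normalize so the center is relative to the whole image.
                cx = (col + x) / grid_size
                cy = (row + y) / grid_size
                detections.append((cx, cy, w, h, score, best_class))
    return detections


# Hypothetical 2 x 2 grid, one box per cell, two classes.
empty = [0.0] * 7
grid = [[empty, [0.5, 0.5, 0.4, 0.6, 0.9, 0.1, 0.8]], [empty, empty]]
detections = decode_grid(grid, num_boxes=1, num_classes=2)
print(detections)
```

A complete detector would normally follow this step with non-maximum suppression to drop overlapping boxes; that step is omitted here to keep the sketch short.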

Citations

Book Chapter

SSD: Single Shot MultiBox Detector

Wei Liu, +6 more

TL;DR: The approach, named SSD, discretizes the output space of bounding boxes into a set of default boxes over different aspect ratios and scales per feature map location, which makes SSD easy to train and straightforward to integrate into systems that require a detection component.

Book Chapter

SSD: Single Shot MultiBox Detector

Wei Liu, +6 more

- 08 Dec 2015 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: SSD discretizes the output space of bounding boxes into a set of default boxes over different aspect ratios and scales per feature map location, and combines predictions from multiple feature maps with different resolutions to naturally handle objects of various sizes.
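The default boxes described in the SSD summaries above can be sketched as follows. This is an illustrative sketch, not the authors' implementation: it assumes a square feature map, one scale per map, and the common convention of width `scale * sqrt(ratio)` and height `scale / sqrt(ratio)`. The function name `default_boxes` and the example scales and ratios are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

# (center_x, center_y, width, height), all relative to the image size.
Box = Tuple[float, float, float, float]


def default_boxes(feature_map_size: int, scale: float,
                  aspect_ratios: Sequence[float]) -> List[Box]:
    """Generate one default box per aspect ratio at every feature map location."""
    boxes: List[Box] = []
    for row in range(feature_map_size):
        for col in range(feature_map_size):
            # Center each box on the middle of its feature map cell.
            cx = (col + 0.5) / feature_map_size
            cy = (row + 0.5) / feature_map_size
            for ratio in aspect_ratios:
                width = scale * math.sqrt(ratio)
                height = scale / math.sqrt(ratio)
                boxes.append((cx, cy, width, height))
    return boxes


# Coarser maps get larger scales, so large objects are matched on
# low-resolution maps and small objects on high-resolution ones.
all_boxes = default_boxes(4, 0.2, [1.0, 2.0, 0.5]) + default_boxes(2, 0.5, [1.0, 2.0, 0.5])
print(len(all_boxes))
```

Concatenating the boxes from several feature maps is one way to picture how predictions at different resolutions cover objects of different sizes.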

Proceedings Article

Focal Loss for Dense Object Detection

Tsung-Yi Lin, +4 more

TL;DR: This paper proposes to address the extreme foreground-background class imbalance encountered during training of dense detectors by reshaping the standard cross entropy loss such that it down-weights the loss assigned to well-classified examples, and develops a novel Focal Loss, which focuses training on a sparse set of hard examples and prevents the vast number of easy negatives from overwhelming the detector during training.
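The reshaping of cross entropy described above can be sketched for the binary case. The modulating factor `(1 - p_t) ** gamma` is the form given in the Focal Loss paper. The default `gamma = 2.0` is an assumption here, and the class-balancing weight that the paper also discusses is left out for brevity.

```python
import math


def focal_loss(p: float, y: int, gamma: float = 2.0) -> float:
    """Binary focal loss for one example.

    p is the predicted probability of the positive class (0 < p < 1),
    y is the ground-truth label (1 for foreground, 0 for background).
    """
    # p_t is the probability the model assigns to the true class.
    p_t = p if y == 1 else 1.0 - p
    # The modulating factor shrinks toward 0 as p_t approaches 1,
    # so well-classified examples contribute very little loss.
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def cross_entropy(p: float, y: int) -> float:
    """Standard binary cross entropy, i.e. focal loss with gamma = 0."""
    return focal_loss(p, y, gamma=0.0)


# Compare an easy negative (background predicted with low foreground
# probability) against a hard positive.
for p, y in [(0.05, 0), (0.1, 1)]:
    print(p, y, cross_entropy(p, y), focal_loss(p, y))
```

With `gamma = 2.0`, an example classified with `p_t = 0.9` keeps about 1% of its cross entropy loss, while an example with `p_t = 0.1` keeps about 81%, which is how the loss concentrates training on hard examples.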

Proceedings Article

YOLO9000: Better, Faster, Stronger

Joseph Redmon, +1 more

TL;DR: YOLO9000 is a state-of-the-art, real-time object detection system that can detect over 9000 object categories. A novel multi-scale training method lets the same model run at varying sizes, offering an easy tradeoff between speed and accuracy.

Posted Content

YOLO9000: Better, Faster, Stronger

Joseph Redmon, +1 more

- 25 Dec 2016 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: This paper introduces YOLO9000, a state-of-the-art, real-time object detection system that can detect over 9000 object categories. It proposes improvements to the YOLO detection method, both novel and drawn from prior work, and a method to jointly train on object detection and classification.

References

Book Chapter

Detecting People in Cubist Art

Shiry Ginosar, +3 more

TL;DR: This paper evaluates existing object detection methods on abstract renditions of objects in Cubist art, comparing human annotators to four state-of-the-art object detectors on a corpus of Picasso paintings. It demonstrates that while human perception significantly outperforms current methods, human perception and part-based models exhibit a similarly graceful degradation in object detection performance.

Posted Content

Detecting People in Cubist Art

Shiry Ginosar, +3 more

- 22 Sep 2014 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: Detectors trained on natural images can detect parts that characterize person figures in Cubist paintings that are not seen in paintings.

Posted Content

Do More Dropouts in Pool5 Feature Maps for Better Object Detection

Zhiqiang Shen, +1 more

- 24 Sep 2014 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: A novel approach is proposed which generates an edited version for each original CNN feature vector by applying the maximum entropy principle to abandon particular vectors.

Related Papers (5)

Deep Residual Learning for Image Recognition

Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation

ImageNet Classification with Deep Convolutional Neural Networks

Microsoft COCO: Common Objects in Context

Very Deep Convolutional Networks for Large-Scale Image Recognition
