Open Access, Proceedings Article

You Only Look Once: Unified, Real-Time Object Detection

Abstract
We present YOLO, a new approach to object detection. Prior work on object detection repurposes classifiers to perform detection. Instead, we frame object detection as a regression problem to spatially separated bounding boxes and associated class probabilities. A single neural network predicts bounding boxes and class probabilities directly from full images in one evaluation. Since the whole detection pipeline is a single network, it can be optimized end-to-end directly on detection performance. Our unified architecture is extremely fast. Our base YOLO model processes images in real-time at 45 frames per second. A smaller version of the network, Fast YOLO, processes an astounding 155 frames per second while still achieving double the mAP of other real-time detectors. Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background. Finally, YOLO learns very general representations of objects. It outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.
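
The regression framing in the abstract can be made concrete with a small sketch of how a single network output might be decoded into detections. The abstract does not state the output layout, so the values here are assumptions taken from the configuration commonly described for YOLO on PASCAL VOC: an S x S grid with S = 7, B = 2 boxes per cell, and C = 20 classes, giving a 7 x 7 x 30 output. The names `Detection`, `decode`, and `threshold` are hypothetical and exist only for this illustration; non-maximum suppression, which a complete pipeline would normally apply afterwards, is left out.

```python
from dataclasses import dataclass
from typing import List, Sequence

# Assumed layout (not stated in the abstract): an S x S grid, B boxes per cell,
# C classes; each cell holds B * 5 box values followed by C class probabilities.
S, B, C = 7, 2, 20


@dataclass
class Detection:
    x: float      # box centre, as a fraction of image width
    y: float      # box centre, as a fraction of image height
    w: float      # box width, as a fraction of image width
    h: float      # box height, as a fraction of image height
    class_id: int
    score: float  # box confidence * class probability


def decode(output: Sequence[Sequence[Sequence[float]]],
           threshold: float = 0.2) -> List[Detection]:
    """Turn one S x S x (B*5 + C) regression output into a list of detections."""
    detections = []
    for row in range(S):
        for col in range(S):
            cell = output[row][col]
            class_probs = cell[B * 5:]
            # Most likely class for this grid cell.
            class_id = max(range(C), key=lambda k: class_probs[k])
            for b in range(B):
                cx, cy, w, h, conf = cell[b * 5:b * 5 + 5]
                score = conf * class_probs[class_id]
                if score < threshold:
                    continue
                # (cx, cy) are offsets inside the cell; convert to image fractions.
                detections.append(Detection((col + cx) / S, (row + cy) / S,
                                            w, h, class_id, score))
    return detections


# Hypothetical output: every value zero except one confident box in the centre cell.
output = [[[0.0] * (B * 5 + C) for _ in range(S)] for _ in range(S)]
output[3][3][0:5] = [0.5, 0.5, 0.4, 0.6, 0.9]  # x, y, w, h, confidence
output[3][3][B * 5 + 11] = 0.8                 # probability of class 11
dets = decode(output)
```

Because every box and class score comes out of one forward pass, decoding reduces to a loop over the grid with a score threshold. There is no separate proposal stage or per-region classifier, which is the design choice the abstract credits for the speed.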

