Cut, Paste and Learn: Surprisingly Easy Synthesis for Instance Detection

doi:10.1109/ICCV.2017.146

Open Access Proceedings Article

- pp 1310-1319

TLDR

This paper proposes a simple approach to generate large annotated instance datasets with minimal effort. The approach outperforms existing synthesis approaches and, when combined with real images, improves relative performance by more than 21% on benchmark datasets.

Abstract:

A major impediment in rapidly deploying object detection models for instance detection is the lack of large annotated datasets. For example, finding a large labeled dataset containing instances in a particular kitchen is unlikely. Each new environment with new instances requires expensive data collection and annotation. In this paper, we propose a simple approach to generate large annotated instance datasets with minimal effort. Our key insight is that ensuring only patch-level realism provides enough training signal for current object detector models. We automatically ‘cut’ object instances and ‘paste’ them on random backgrounds. A naive way to do this produces pixel artifacts that lead to poor performance for trained models. We show how to make detectors ignore these artifacts during training and generate data that gives competitive performance on real data. Our method outperforms existing synthesis approaches and, when combined with real images, improves relative performance by more than 21% on benchmark datasets. In a cross-domain setting, our synthetic data combined with just 10% real data outperforms models trained on all real data.
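
The cut-and-paste step described in the abstract can be illustrated with a minimal NumPy sketch. It assumes a binary object mask is already available and performs a hard paste with no blending. The function name `paste_instance` and its parameters are hypothetical, and this is not the authors' implementation. The bounding box is derived directly from the mask, which is why the annotation comes for free. This is also the naive variant: a hard paste like this is the kind that leaves pixel artifacts at the object boundary.

```python
import numpy as np


def paste_instance(background, obj_rgb, obj_mask, top, left):
    """Paste a masked object crop onto a copy of the background.

    Hypothetical helper for illustration only (hard paste, no blending).
    Returns the composite image and the pasted object's bounding box as
    (x_min, y_min, x_max, y_max), with exclusive max coordinates.
    """
    h, w = obj_mask.shape
    bg_h, bg_w = background.shape[:2]
    if top < 0 or left < 0 or top + h > bg_h or left + w > bg_w:
        raise ValueError("object crop must fit inside the background")
    if not obj_mask.any():
        raise ValueError("object mask is empty")

    out = background.copy()
    region = out[top:top + h, left:left + w]   # view into the copy
    mask3 = obj_mask.astype(bool)[..., None]   # broadcast over channels
    region[...] = np.where(mask3, obj_rgb, region)

    # The annotation is read off the mask, so no manual labeling is needed.
    ys, xs = np.nonzero(obj_mask)
    box = (
        int(left + xs.min()),
        int(top + ys.min()),
        int(left + xs.max() + 1),
        int(top + ys.max() + 1),
    )
    return out, box


# Usage: place a toy 3x3 object at a random location on a blank background.
rng = np.random.default_rng(0)
background = np.zeros((8, 8, 3), dtype=np.uint8)
obj_rgb = np.full((3, 3, 3), 255, dtype=np.uint8)
obj_mask = np.array([[0, 1, 0], [1, 1, 1], [0, 1, 0]], dtype=np.uint8)
top = int(rng.integers(0, background.shape[0] - obj_mask.shape[0] + 1))
left = int(rng.integers(0, background.shape[1] - obj_mask.shape[1] + 1))
composite, box = paste_instance(background, obj_rgb, obj_mask, top, left)
```

Repeating the call with different backgrounds, objects, and positions yields an arbitrarily large set of image and box pairs. The abstract does not say how the detector is made to ignore the paste artifacts, so that part is deliberately left out of the sketch.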

Citations

Proceedings Article

CutMix: Regularization Strategy to Train Strong Classifiers With Localizable Features

Sangdoo Yun, +5 more

TL;DR: CutMix as discussed by the authors augments the training data by cutting and pasting patches among training images, where the ground truth labels are also mixed proportionally to the area of the patches.

Journal Article

Deep Learning for Generic Object Detection: A Survey

Li Liu, +7 more

- 01 Feb 2020 -

International Journal of Computer Vision

TL;DR: A comprehensive survey of the recent achievements in this field brought about by deep learning techniques, covering many aspects of generic object detection: detection frameworks, object feature representation, object proposal generation, context modeling, training strategies, and evaluation metrics.

Proceedings Article

Training Deep Networks with Synthetic Data: Bridging the Reality Gap by Domain Randomization

Jonathan Tremblay, +9 more

TL;DR: This work presents a system for training deep neural networks for object detection using synthetic images that relies upon the technique of domain randomization, in which the parameters of the simulator are randomized in non-realistic ways to force the neural network to learn the essential features of the object of interest.

Posted Content

Simple Copy-Paste is a Strong Data Augmentation Method for Instance Segmentation

Golnaz Ghiasi, +7 more

- 13 Dec 2020 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: A systematic study of the Copy-Paste augmentation for instance segmentation, in which objects are randomly pasted onto an image, finds that this simple mechanism is good enough and can provide solid gains on top of strong baselines.

Proceedings Article

Conditional Generative Adversarial Network for Structured Domain Adaptation

Weixiang Hong, +3 more

TL;DR: This work proposes a principled way to conduct structured domain adaptation for semantic segmentation by integrating a GAN into the FCN framework to mitigate the gap between source and target domains.

References

Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

Alex Krizhevsky, +2 more

TL;DR: This work achieved state-of-the-art performance with a deep convolutional neural network consisting of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax.

Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

Karen Simonyan, +1 more

TL;DR: This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.

Book Chapter

Microsoft COCO: Common Objects in Context

Tsung-Yi Lin, +7 more

TL;DR: A new dataset that aims to advance the state of the art in object recognition by placing it in the context of the broader question of scene understanding, built by gathering images of complex everyday scenes containing common objects in their natural context.

Proceedings Article

Fully convolutional networks for semantic segmentation

Jonathan Long, +2 more

TL;DR: The key insight is to build “fully convolutional” networks that take input of arbitrary size and produce correspondingly-sized output with efficient inference and learning.

Proceedings Article

You Only Look Once: Unified, Real-Time Object Detection

Joseph Redmon, +3 more

TL;DR: Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background, and outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.

Related Papers (5)

Deep Residual Learning for Image Recognition

Microsoft COCO: Common Objects in Context

SSD: Single Shot MultiBox Detector

ImageNet Classification with Deep Convolutional Neural Networks

ImageNet: A large-scale hierarchical image database