Learning to Segment Every Thing
Ronghang Hu, Piotr Dollár, Kaiming He, Trevor Darrell, Ross Girshick
- pp 4233-4241
TLDR
A new partially supervised training paradigm is proposed, together with a novel weight transfer function, that enables training instance segmentation models on a large set of categories, all of which have box annotations but only a small fraction of which have mask annotations.
Abstract:
Most methods for object instance segmentation require all training examples to be labeled with segmentation masks. This requirement makes it expensive to annotate new categories and has restricted instance segmentation models to ~100 well-annotated classes. The goal of this paper is to propose a new partially supervised training paradigm, together with a novel weight transfer function, that enables training instance segmentation models on a large set of categories, all of which have box annotations but only a small fraction of which have mask annotations. These contributions allow us to train Mask R-CNN to detect and segment 3000 visual concepts using box annotations from the Visual Genome dataset and mask annotations from the 80 classes in the COCO dataset. We evaluate our approach in a controlled study on the COCO dataset. This work is a first step towards instance segmentation models that have broad comprehension of the visual world.
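The abstract does not spell out the form of the weight transfer function, so the following is only a minimal sketch of the idea under stated assumptions: a small class-agnostic network maps each category's box-head weights to mask-head weights, and the mask loss is computed only for categories that have mask annotations. The two-layer ReLU MLP, all sizes, and the names `transfer`, `mask_loss`, `w_det`, `theta`, and `MASK_CLASSES` are hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: every class has box labels, only some have masks.
NUM_CLASSES = 6
MASK_CLASSES = [0, 1]
DET_DIM, SEG_DIM, HIDDEN = 8, 4, 16  # illustrative sizes only

# Per-class box-head weights, assumed to be learned from box annotations.
w_det = rng.normal(size=(NUM_CLASSES, DET_DIM))

# Parameters of the class-agnostic transfer function (assumed 2-layer MLP).
theta = {
    "W1": rng.normal(scale=0.1, size=(DET_DIM, HIDDEN)),
    "W2": rng.normal(scale=0.1, size=(HIDDEN, SEG_DIM)),
}


def transfer(w_det, theta):
    """Predict per-class mask-head weights from per-class box-head weights."""
    hidden = np.maximum(w_det @ theta["W1"], 0.0)  # ReLU is an assumption
    return hidden @ theta["W2"]


def mask_loss(features, targets, labels, w_seg, mask_classes):
    """Mean binary cross-entropy over RoIs whose class has mask annotations.

    features: (R, P, SEG_DIM) per-pixel features for R regions of P pixels
    targets:  (R, P) binary ground-truth masks
    labels:   (R,) class index of each region
    """
    losses = []
    for feat, target, label in zip(features, targets, labels):
        if label not in mask_classes:
            continue  # box-only class: contributes no mask supervision
        logits = feat @ w_seg[label]  # (P,) per-pixel logits for this class
        # Numerically stable binary cross-entropy with logits.
        bce = (np.maximum(logits, 0.0) - logits * target
               + np.log1p(np.exp(-np.abs(logits))))
        losses.append(bce.mean())
    return float(np.mean(losses)) if losses else 0.0


# Mask weights exist for every class, including those without mask labels.
w_seg = transfer(w_det, theta)
features = rng.normal(size=(3, 5, SEG_DIM))
targets = rng.integers(0, 2, size=(3, 5)).astype(float)
labels = np.array([0, 3, 4])  # only the first region is mask-supervised
print(w_seg.shape, mask_loss(features, targets, labels, w_seg, MASK_CLASSES))
```

In this sketch the transfer parameters receive gradient only from mask-annotated classes, yet `transfer` still yields mask weights for box-only classes at inference time, which is the property the partially supervised setting relies on.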