TensorMask: A Foundation for Dense Object Segmentation

doi:10.1109/ICCV.2019.00215

Open Access Proceedings Article

Xinlei Chen, +3 more

- pp 2061-2069

TL;DR

The tensor view yields large gains over baselines that ignore this structure and gives results comparable to Mask R-CNN, suggesting that TensorMask can serve as a foundation for novel advances in dense mask prediction and a more complete understanding of the task.

Abstract:

Sliding-window object detectors that generate bounding-box object predictions over a dense, regular grid have advanced rapidly and proven popular. In contrast, modern instance segmentation approaches are dominated by methods that first detect object bounding boxes, and then crop and segment these regions, as popularized by Mask R-CNN. In this work, we investigate the paradigm of dense sliding-window instance segmentation, which is surprisingly under-explored. Our core observation is that this task is fundamentally different than other dense prediction tasks such as semantic segmentation or bounding-box object detection, as the output at every spatial location is itself a geometric structure with its own spatial dimensions. To formalize this, we treat dense instance segmentation as a prediction task over 4D tensors and present a general framework called TensorMask that explicitly captures this geometry and enables novel operators on 4D tensors. We demonstrate that the tensor view leads to large gains over baselines that ignore this structure, and leads to results comparable to Mask R-CNN. These promising results suggest that TensorMask can serve as a foundation for novel advances in dense mask prediction and a more complete understanding of the task. Code will be made available.
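To make the "4D tensor" idea in the abstract concrete, the following toy sketch builds a dense mask representation in which every spatial location (y, x) holds its own V x U mask window. It is an illustration only, not the authors' implementation: the (V, U, H, W) axis order, the function names `window_mask`, `dense_mask_tensor`, and `shape`, and the zero-padding at image borders are all assumptions made for this example, and a single binary image mask stands in for real per-instance targets.

```python
# Toy illustration of a dense mask representation as a 4D structure.
# Assumptions (not stated in the abstract): axis order (V, U, H, W),
# zero-padding outside the image, and all function names below.

def window_mask(image_mask, y, x, V, U):
    """Return the V x U window of a binary mask centered at (y, x)."""
    H, W = len(image_mask), len(image_mask[0])
    crop = []
    for v in range(V):
        row = []
        for u in range(U):
            yy = y + v - V // 2  # image row covered by window row v
            xx = x + u - U // 2  # image column covered by window column u
            inside = 0 <= yy < H and 0 <= xx < W
            row.append(image_mask[yy][xx] if inside else 0)
        crop.append(row)
    return crop

def dense_mask_tensor(image_mask, V, U):
    """Build a nested list indexed as [v][u][y][x], i.e. shape (V, U, H, W)."""
    H, W = len(image_mask), len(image_mask[0])
    tensor = [[[[0] * W for _ in range(H)] for _ in range(U)] for _ in range(V)]
    for y in range(H):
        for x in range(W):
            crop = window_mask(image_mask, y, x, V, U)
            for v in range(V):
                for u in range(U):
                    tensor[v][u][y][x] = crop[v][u]
    return tensor

def shape(nested):
    """Return the dimensions of a regular nested list as a tuple."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)

# A 4 x 5 image containing one 2 x 2 object.
toy = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
tensor = dense_mask_tensor(toy, V=3, U=3)
print(shape(tensor))
```

The contrast with other dense tasks is in the shapes: semantic segmentation predicts one label per pixel over an (H, W) grid, whereas here each of the H x W locations carries a full V x U mask with its own spatial extent.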

Citations

Posted Content

Image Segmentation Using Deep Learning: A Survey

Shervin Minaee, +5 more

- 15 Jan 2020 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: A comprehensive review of recent pioneering efforts in semantic and instance segmentation, including convolutional pixel-labeling networks, encoder-decoder architectures, multiscale and pyramid-based approaches, recurrent networks, visual attention models, and generative models in adversarial settings, is provided.

Journal Article

Rethinking RGB-D Salient Object Detection: Models, Data Sets, and Large-Scale Benchmarks

Deng-Ping Fan, +4 more

- 01 May 2021 -

IEEE Transactions on Neural Networks and Learning Systems

TL;DR: It is demonstrated that D3Net can be used to efficiently extract salient object masks from real scenes, enabling effective background-changing application with a speed of 65 frames/s on a single GPU.

Book Chapter

Conditional Convolutions for Instance Segmentation

Zhi Tian, +2 more

TL;DR: A simpler instance segmentation method that improves both accuracy and inference speed on the COCO dataset and outperforms several recent methods, including well-tuned Mask R-CNN baselines, without needing longer training schedules.

Proceedings Article

PointRend: Image Segmentation As Rendering

Alexander Kirillov, +3 more

TL;DR: PointRend is a point-based rendering module that performs segmentation predictions at adaptively selected locations using an iterative subdivision algorithm, producing crisp object boundaries in regions that previous methods over-smooth.

Proceedings Article

BlendMask: Top-Down Meets Bottom-Up for Instance Segmentation

Hao Chen, +5 more

TL;DR: The proposed BlendMask can effectively predict dense per-pixel position-sensitive instance features with very few channels, and learn attention maps for each instance with merely one convolution layer, thus being fast in inference.

References

Proceedings Article

Deep Residual Learning for Image Recognition

Kaiming He, +3 more

TL;DR: A residual learning framework that eases the training of networks substantially deeper than those used previously; residual networks won 1st place on the ILSVRC 2015 classification task.

Book Chapter

Microsoft COCO: Common Objects in Context

Tsung-Yi Lin, +7 more

TL;DR: A dataset intended to advance the state of the art in object recognition by framing it within the broader problem of scene understanding; it gathers images of complex everyday scenes containing common objects in their natural context.

Proceedings Article

Fully convolutional networks for semantic segmentation

Jonathan Long, +2 more

TL;DR: The key insight is to build “fully convolutional” networks that take input of arbitrary size and produce correspondingly-sized output with efficient inference and learning.

Posted Content

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

Shaoqing Ren, +3 more

- 04 Jun 2015 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: Faster R-CNN introduces a Region Proposal Network (RPN) that generates high-quality region proposals, which Fast R-CNN then uses for detection.

Proceedings Article

Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation

Ross Girshick, +3 more

TL;DR: R-CNN combines CNNs with bottom-up region proposals to localize and segment objects; when labeled training data is scarce, supervised pre-training on an auxiliary task followed by domain-specific fine-tuning yields a significant performance boost.

Related Papers (5)

Mask R-CNN

Deep Residual Learning for Image Recognition

Microsoft COCO: Common Objects in Context

Feature Pyramid Networks for Object Detection

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks