Mask Encoding for Single Shot Instance Segmentation

doi:10.1109/CVPR42600.2020.01024

Open Access Proceedings Article

Rufeng Zhang, +4 more

- pp 10226-10235

TLDR

Instead of predicting the two-dimensional mask directly, MEInst distills it into a compact and fixed-dimensional representation vector, which allows the instance segmentation task to be incorporated into one-stage bounding-box detectors and results in a simple yet efficient instance segmentation framework.

Abstract:

To date, instance segmentation is dominated by two-stage methods, as pioneered by Mask R-CNN. In contrast, one-stage alternatives cannot compete with Mask R-CNN in mask AP, mainly because masks are difficult to represent compactly, which makes the design of one-stage methods very challenging. In this work, we propose a simple single-shot instance segmentation framework, termed mask encoding based instance segmentation (MEInst). Instead of predicting the two-dimensional mask directly, MEInst distills it into a compact and fixed-dimensional representation vector, which allows the instance segmentation task to be incorporated into one-stage bounding-box detectors and results in a simple yet efficient instance segmentation framework. The proposed one-stage MEInst achieves 36.4% in mask AP with a single model (ResNeXt-101-FPN backbone) and single-scale testing on the MS-COCO benchmark. We show that a much simpler and more flexible one-stage instance segmentation method can also achieve competitive performance. This framework can be easily adapted for other instance-level recognition tasks. Code is available at: git.io/AdelaiDet
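
The abstract does not spell out how a two-dimensional mask is distilled into a fixed-dimensional vector. As a minimal sketch, assuming a linear PCA-style encoder fitted on flattened training masks, the encode/decode round trip might look like the following. The 28×28 mask size, the 60-dimensional code, the toy rectangle masks, and every function name here are illustrative assumptions, not details taken from the text above.

```python
import numpy as np

def fit_mask_encoder(masks, n_components):
    """Fit a PCA basis on binary masks of shape (N, H, W). Hypothetical helper."""
    flat = masks.reshape(len(masks), -1).astype(np.float64)
    mean = flat.mean(axis=0)
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(flat - mean, full_matrices=False)
    return mean, vt[:n_components]

def encode(mask, mean, basis):
    """Project one (H, W) mask onto the basis, giving a fixed-length vector."""
    return basis @ (mask.reshape(-1).astype(np.float64) - mean)

def decode(code, mean, basis, shape, threshold=0.5):
    """Map a code vector back to a binary (H, W) mask by thresholding."""
    recon = code @ basis + mean
    return (recon.reshape(shape) >= threshold).astype(np.uint8)

def make_toy_masks(n, size, rng):
    """Random axis-aligned rectangles as stand-ins for instance masks."""
    masks = np.zeros((n, size, size), dtype=np.uint8)
    for m in masks:
        y0, x0 = rng.integers(0, size // 2, size=2)
        h, w = rng.integers(size // 4, size // 2, size=2)
        m[y0:y0 + h, x0:x0 + w] = 1
    return masks

rng = np.random.default_rng(0)
train = make_toy_masks(500, 28, rng)
mean, basis = fit_mask_encoder(train, n_components=60)

mask = train[0]
code = encode(mask, mean, basis)               # compact representation vector
recon = decode(code, mean, basis, mask.shape)  # back to a 2-D binary mask
iou = (mask & recon).sum() / max((mask | recon).sum(), 1)
print(code.shape, float(iou))
```

In a detector built along these lines, a regression branch would predict the code vector next to each bounding box, and the decoder would run only on the final detections. That wiring is omitted here and is an assumption about how such an encoder might be used.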

Citations

Posted Content

SOLOv2: Dynamic and Fast Instance Segmentation

Xinlong Wang, +4 more

- 23 Mar 2020 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: State-of-the-art results in object detection (from the authors' mask byproduct) and panoptic segmentation show the potential of SOLOv2 to serve as a new strong baseline for many instance-level recognition tasks besides instance segmentation.

Posted Content

FCOS: A simple and strong anchor-free object detector

Zhi Tian, +3 more

- 14 Jun 2020 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: In this article, a fully convolutional one-stage object detector (FCOS) is proposed to solve object detection in a per-pixel prediction fashion, analogous to other dense prediction problems such as semantic segmentation.

Posted Content

A Survey on Deep Learning-based Architectures for Semantic Segmentation on 2D images.

İrem Ülkü, +1 more

- 21 Dec 2019 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: This survey focuses on recent scientific developments in semantic segmentation, specifically on deep learning-based methods using 2D images, and chronologically categorises the approaches into three main periods: the pre- and early deep learning era, the fully convolutional era, and the post-FCN era.

Proceedings Article

Look Closer to Segment Better: Boundary Patch Refinement for Instance Segmentation

Chufeng Tang, +5 more

TL;DR: This paper proposes BPR, a post-processing refinement framework that improves boundary quality based on the results of any instance segmentation model by extracting and refining a series of small boundary patches along the predicted instance boundaries.

Proceedings Article

Spatial Feature Calibration and Temporal Fusion for Effective One-stage Video Instance Segmentation

Minghan Li, +3 more

TL;DR: Li et al. propose STMask, a simple yet effective one-stage video instance segmentation framework based on spatial calibration and temporal fusion, in which spatial feature calibration is performed with ground-truth bounding boxes.

References

Proceedings Article

Deep Residual Learning for Image Recognition

Kaiming He, +3 more

TL;DR: In this article, the authors propose a residual learning framework to ease the training of networks that are substantially deeper than those used previously; the approach won 1st place on the ILSVRC 2015 classification task.

Proceedings Article

ImageNet: A large-scale hierarchical image database

Jia Deng, +5 more

TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.

Book Chapter

Microsoft COCO: Common Objects in Context

Tsung-Yi Lin, +7 more

TL;DR: A new dataset is presented with the goal of advancing the state of the art in object recognition by placing it in the context of the broader question of scene understanding, gathering images of complex everyday scenes containing common objects in their natural context.

Proceedings Article

Fully convolutional networks for semantic segmentation

Jonathan Long, +2 more

TL;DR: The key insight is to build “fully convolutional” networks that take input of arbitrary size and produce correspondingly-sized output with efficient inference and learning.

Proceedings Article

You Only Look Once: Unified, Real-Time Object Detection

Joseph Redmon, +3 more

TL;DR: Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background, and outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.

Related Papers (5)

Microsoft COCO: Common Objects in Context

Mask R-CNN

Deep Residual Learning for Image Recognition

Feature Pyramid Networks for Object Detection

Path Aggregation Network for Instance Segmentation