Mask Encoding for Single Shot Instance Segmentation
Rufeng Zhang,Zhi Tian,Chunhua Shen,Mingyu You,Youliang Yan +4 more
- pp 10226-10235
TLDR
Instead of predicting the two-dimensional mask directly, MEInst distills it into a compact and fixed-dimensional representation vector, which allows the instance segmentation task to be incorporated into one-stage bounding-box detectors and results in a simple yet efficient instance segmentation framework.
Abstract:
To date, instance segmentation is dominated by two-stage methods, as pioneered by Mask R-CNN. In contrast, one-stage alternatives cannot compete with Mask R-CNN in mask AP, mainly because masks are difficult to represent compactly, which makes the design of one-stage methods very challenging. In this work, we propose a simple single-shot instance segmentation framework, termed mask encoding based instance segmentation (MEInst). Instead of predicting the two-dimensional mask directly, MEInst distills it into a compact and fixed-dimensional representation vector, which allows the instance segmentation task to be incorporated into one-stage bounding-box detectors and results in a simple yet efficient instance segmentation framework. The proposed one-stage MEInst achieves 36.4% mask AP with a single model (ResNeXt-101-FPN backbone) and single-scale testing on the MS-COCO benchmark. We show that a much simpler and more flexible one-stage instance segmentation method can also achieve competitive performance. This framework can be easily adapted for other instance-level recognition tasks. Code is available at: git.io/AdelaiDet
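The idea of distilling a two-dimensional mask into a compact, fixed-dimensional vector can be illustrated with a small sketch. The abstract does not say which encoder is used, so the linear PCA-style projection below is an assumption made for illustration only, as are the 28x28 mask size, the 60-dimensional code, the synthetic ellipse masks, and every function name. In a detector, a vector like `codes[i]` would be regressed per instance alongside the box, then decoded back to a mask.

```python
import numpy as np

SIZE = 28  # assumed mask resolution (hypothetical, not from the abstract)
DIM = 60   # assumed length of the compact code (hypothetical)


def make_masks(n, size=SIZE, seed=0):
    """Synthetic stand-in data: one random axis-aligned ellipse per mask."""
    rng = np.random.default_rng(seed)
    ys, xs = np.mgrid[0:size, 0:size]
    masks = np.empty((n, size, size), dtype=np.float64)
    for i in range(n):
        cy, cx = rng.uniform(8, size - 8, 2)
        ry, rx = rng.uniform(4, 12, 2)
        masks[i] = ((ys - cy) / ry) ** 2 + ((xs - cx) / rx) ** 2 <= 1.0
    return masks


def fit_encoder(masks, dim):
    """Fit a linear encoder: the mean mask plus the top `dim` principal directions."""
    flat = masks.reshape(len(masks), -1)
    mean = flat.mean(axis=0)
    _, _, vt = np.linalg.svd(flat - mean, full_matrices=False)
    return mean, vt[:dim]


def encode(masks, mean, basis):
    """Project flattened masks onto the basis, giving one vector per mask."""
    return (masks.reshape(len(masks), -1) - mean) @ basis.T


def decode(codes, mean, basis, size=SIZE):
    """Map vectors back to 2D and threshold at 0.5 to obtain binary masks."""
    flat = codes @ basis + mean
    return (flat > 0.5).reshape(len(codes), size, size)


def mean_iou(pred, target):
    """Average intersection-over-union between predicted and target masks."""
    target = target > 0.5
    inter = np.logical_and(pred, target).sum(axis=(1, 2))
    union = np.logical_or(pred, target).sum(axis=(1, 2))
    return float((inter / union).mean())


train = make_masks(500, seed=0)
held_out = make_masks(100, seed=1)

mean, basis = fit_encoder(train, DIM)
codes = encode(held_out, mean, basis)      # shape (100, 60)
recon = decode(codes, mean, basis)         # shape (100, 28, 28)
held_out_iou = mean_iou(recon, held_out)
print(f"code shape: {codes.shape}, held-out mean IoU: {held_out_iou:.3f}")
```

Each 784-pixel mask is reduced to a 60-number vector here, which is the kind of fixed-size target a one-stage box detector can predict with an extra regression branch. How much mask quality survives the round trip depends on the encoder and the data, so the printed IoU only describes this toy setup.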
Citations
Posted Content
SOLOv2: Dynamic and Fast Instance Segmentation
TL;DR: State-of-the-art results in object detection (from the authors' mask byproduct) and panoptic segmentation show the potential to serve as a new strong baseline for many instance-level recognition tasks besides instance segmentation.
Posted Content
FCOS: A simple and strong anchor-free object detector
TL;DR: In this article, a fully convolutional one-stage object detector (FCOS) is proposed to solve object detection in a per-pixel prediction fashion, analogous to other dense prediction problems such as semantic segmentation.
Posted Content
A Survey on Deep Learning-based Architectures for Semantic Segmentation on 2D images.
İrem Ülkü,Erdem Akagunduz +1 more
TL;DR: This survey focuses on recent scientific developments in semantic segmentation, specifically on deep learning-based methods using 2D images, and chronologically categorises the approaches into three main periods, namely the pre- and early deep learning era, the fully convolutional era, and the post-FCN era.
Proceedings Article
Look Closer to Segment Better: Boundary Patch Refinement for Instance Segmentation
TL;DR: This paper proposes BPR, a post-processing refinement framework that improves boundary quality based on the results of any instance segmentation model by extracting and refining a series of small boundary patches along the predicted instance boundaries.
Proceedings Article
Spatial Feature Calibration and Temporal Fusion for Effective One-stage Video Instance Segmentation
TL;DR: Li et al. propose STMask, a simple yet effective one-stage video instance segmentation framework based on spatial calibration and temporal fusion, in which spatial features are calibrated with ground-truth bounding boxes.
References
Proceedings Article
Deep Residual Learning for Image Recognition
TL;DR: In this article, the authors propose a residual learning framework that eases the training of networks substantially deeper than those used previously; the resulting networks won 1st place in the ILSVRC 2015 classification task.
Proceedings Article
ImageNet: A large-scale hierarchical image database
TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.
Book Chapter
Microsoft COCO: Common Objects in Context
Tsung-Yi Lin,Michael Maire,Serge Belongie,James Hays,Pietro Perona,Deva Ramanan,Piotr Dollár,C. Lawrence Zitnick +7 more
TL;DR: A new dataset that aims to advance the state of the art in object recognition by placing it in the context of the broader question of scene understanding, built by gathering images of complex everyday scenes containing common objects in their natural context.
Proceedings Article
Fully convolutional networks for semantic segmentation
TL;DR: The key insight is to build “fully convolutional” networks that take input of arbitrary size and produce correspondingly-sized output with efficient inference and learning.
Proceedings Article
You Only Look Once: Unified, Real-Time Object Detection
TL;DR: Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background, and outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.