MaskFusion: Real-Time Recognition, Tracking and Reconstruction of Multiple Moving Objects

doi:10.1109/ISMAR.2018.00024

Open Access, Proceedings Article

Martin Rünz, +2 more

- pp 10-20

TLDR

MaskFusion is a real-time, object-aware, semantic and dynamic RGB-D SLAM system that goes beyond traditional systems, which output a purely geometric map of a static scene.

Abstract:

We present MaskFusion, a real-time, object-aware, semantic and dynamic RGB-D SLAM system that goes beyond traditional systems, which output a purely geometric map of a static scene. MaskFusion recognizes, segments and assigns semantic class labels to different objects in the scene, while tracking and reconstructing them even when they move independently from the camera. As an RGB-D camera scans a cluttered scene, image-based instance-level semantic segmentation creates semantic object masks that enable real-time object recognition and the creation of an object-level representation for the world map. Unlike previous recognition-based SLAM systems, MaskFusion does not require known models of the objects it can recognize, and can deal with multiple independent motions. MaskFusion takes full advantage of instance-level semantic segmentation to fuse semantic labels into an object-aware map, unlike recent semantics-enabled SLAM systems that perform voxel-level semantic segmentation. We show augmented-reality applications that demonstrate the unique features of the map output by MaskFusion: instance-aware, semantic and dynamic. Code will be made available.
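The idea of fusing per-frame instance masks and class labels into an object-level map can be illustrated with a toy sketch. This is not the MaskFusion implementation: the `ObjectModel` class, the `fuse_frame` function, the IoU-based association, the `min_iou` threshold and the majority-vote labelling are all assumptions made for illustration, and masks are simplified to sets of 2D pixel coordinates rather than 3D surface models.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ObjectModel:
    """Hypothetical stand-in for one object in an object-level map."""
    object_id: int
    footprint: set  # pixels covered by the object's most recent mask
    label_votes: Counter = field(default_factory=Counter)

    @property
    def label(self):
        # Majority vote over all class labels fused so far.
        return self.label_votes.most_common(1)[0][0]


def iou(a, b):
    """Intersection-over-union of two pixel sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def fuse_frame(objects, detections, min_iou=0.3):
    """Fuse one frame of (mask, class_label) detections into the object list.

    Assumption: a detection belongs to the existing object whose last mask
    overlaps it most; otherwise it starts a new object.
    """
    for mask, class_label in detections:
        best = max(objects, key=lambda o: iou(o.footprint, mask), default=None)
        if best is not None and iou(best.footprint, mask) >= min_iou:
            best.footprint = set(mask)  # follow the object as it moves
            best.label_votes[class_label] += 1
        else:
            new = ObjectModel(len(objects), set(mask))
            new.label_votes[class_label] += 1
            objects.append(new)
    return objects


def box(x0, y0, x1, y1):
    """Helper: rectangular mask as a set of (x, y) pixels."""
    return {(x, y) for x in range(x0, x1) for y in range(y0, y1)}


# Three frames: one object drifts to the right, the other stays put.
# The last frame mislabels the moving object; the vote absorbs the error.
objects = []
fuse_frame(objects, [(box(0, 0, 10, 10), "cup"), (box(30, 30, 40, 40), "book")])
fuse_frame(objects, [(box(2, 0, 12, 10), "cup"), (box(30, 30, 40, 40), "book")])
fuse_frame(objects, [(box(4, 0, 14, 10), "bowl"), (box(30, 30, 40, 40), "book")])
```

In this sketch each object keeps its own label history, so a single misclassified frame does not change the fused label, and an object that moves between frames stays associated with the same map entry as long as consecutive masks overlap sufficiently. The sketch does not prevent two detections in one frame from matching the same object.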

Citations

Proceedings Article

Kimera: an Open-Source Library for Real-Time Metric-Semantic Localization and Mapping

Antoni Rosinol, +3 more

TL;DR: Kimera is an open-source C++ library for real-time metric-semantic visual-inertial SLAM that enables mesh reconstruction and semantic labeling in 3D.

Proceedings Article

Fusion++: Volumetric Object-Level SLAM

John McCormac, +4 more

TL;DR: In this article, Mask-RCNN instance segmentation is used to initialise compact per-object Truncated Signed Distance Function (TSDF) reconstructions with object size-dependent resolutions and a novel 3D foreground mask.

Proceedings Article

SuMa++: Efficient LiDAR-based Semantic SLAM

Xieyuanli Chen, +5 more

TL;DR: This work extends a recently published surfel-based mapping approach that exploits three-dimensional laser range scans by integrating semantic information, which makes it possible to reliably filter moving objects and to improve projective scan matching via semantic constraints.

Journal Article

Volumetric Instance-Aware Semantic Mapping and 3D Object Discovery

Margarita Grinvald, +6 more

TL;DR: This work presents an approach to incrementally build volumetric object-centric maps during online scanning with a localized RGB-D camera and demonstrates that the proposed approach for building instance-level semantic maps is competitive with state-of-the-art methods, while additionally being able to discover objects of unseen categories.

Proceedings Article

PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things

Gaku Narita, +3 more

TL;DR: PanopticFusion predicts pixel-wise panoptic labels (class labels for stuff regions and instance IDs for thing regions) for incoming RGB frames by fusing 2D semantic and instance segmentation outputs.

References

Proceedings Article

Deep Residual Learning for Image Recognition

Kaiming He, +3 more

TL;DR: The authors propose a residual learning framework that eases the training of networks substantially deeper than those used previously; it won 1st place in the ILSVRC 2015 classification task.

Posted Content

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

Shaoqing Ren, +3 more

- 04 Jun 2015 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: Faster R-CNN introduces a Region Proposal Network (RPN) to generate high-quality region proposals, which are used by Fast R-CNN for detection.

Proceedings Article

Mask R-CNN

Kaiming He, +3 more

TL;DR: This work presents a conceptually simple, flexible, and general framework for object instance segmentation, which extends Faster R-CNN by adding a branch for predicting an object mask in parallel with the existing branch for bounding box recognition.

Proceedings Article

Faster R-CNN: towards real-time object detection with region proposal networks

Shaoqing Ren, +3 more

TL;DR: Ren et al. propose a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals.

Proceedings Article

Mask R-CNN

Kaiming He, +3 more

TL;DR: This work presents a conceptually simple, flexible, and general framework for object instance segmentation that outperforms all existing, single-model entries on every task, including the COCO 2016 challenge winners.

Related Papers (5)

Mask R-CNN

Kaiming He, +3 more

ORB-SLAM2: An Open-Source SLAM System for Monocular, Stereo, and RGB-D Cameras

Raul Mur-Artal, +1 more

- 12 Jun 2017 -

IEEE Transactions on Robotics

KinectFusion: Real-time dense surface mapping and tracking

A benchmark for the evaluation of RGB-D SLAM systems

SLAM++: Simultaneous Localisation and Mapping at the Level of Objects