Joint Semantic Segmentation and 3D Reconstruction from Monocular Video

doi:10.1007/978-3-319-10599-4_45

Open Access Book Chapter

Abhijit Kundu, +4 more

- pp 703-718

TLDR

Improved 3D structure and temporally consistent semantic segmentation for difficult, large scale, forward moving monocular image sequences is demonstrated.

Abstract:

We present an approach for joint inference of 3D scene structure and semantic labeling for monocular video. Starting with a monocular image stream, our framework produces a 3D volumetric semantic + occupancy map, which is much more useful than the series of 2D semantic label images or the sparse point cloud produced by traditional semantic segmentation and Structure from Motion (SfM) pipelines, respectively. We derive a Conditional Random Field (CRF) model defined in 3D space that jointly infers the semantic category and occupancy of each voxel. Such joint inference in the 3D CRF paves the way for more informed priors and constraints, which are not possible when the two problems are solved separately in their traditional frameworks. We make use of class-specific semantic cues that constrain the 3D structure in areas where multiview constraints are weak. Our model comprises higher-order factors, which help when the depth is unobservable. We also make use of class-specific semantic cues to reduce the degree of such higher-order factors, or to approximately model them with unaries where possible. We demonstrate improved 3D structure and temporally consistent semantic segmentation for difficult, large-scale, forward-moving monocular image sequences.
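
To make the idea of a voxel-level CRF concrete, here is a minimal sketch of the general technique. It is not the authors' model. It assumes a single joint label per voxel ("free" for unoccupied space, otherwise a semantic class), made-up unary costs, and a simple Potts pairwise term over 6-connected neighbors. The higher-order factors described in the abstract are omitted, and inference uses iterated conditional modes (ICM), a simple local method chosen only for brevity. All names (`LABELS`, `toy_problem`, `icm`) are hypothetical.

```python
from itertools import product

# Hypothetical joint label set: "free" means unoccupied,
# any other label means occupied with that semantic class.
LABELS = ("free", "road", "building")


def neighbors(shape):
    """Yield each pair of 6-connected voxels exactly once."""
    nx, ny, nz = shape
    for x, y, z in product(range(nx), range(ny), range(nz)):
        for dx, dy, dz in ((1, 0, 0), (0, 1, 0), (0, 0, 1)):
            q = (x + dx, y + dy, z + dz)
            if q[0] < nx and q[1] < ny and q[2] < nz:
                yield (x, y, z), q


def energy(labels, unary, edges, w):
    """Sum of unary costs plus a Potts penalty w per disagreeing edge."""
    e = sum(unary[v][labels[v]] for v in labels)
    e += w * sum(labels[p] != labels[q] for p, q in edges)
    return e


def icm(unary, shape, w=0.5, sweeps=10):
    """Iterated conditional modes: greedily relabel one voxel at a time."""
    edges = list(neighbors(shape))
    adj = {v: [] for v in unary}
    for p, q in edges:
        adj[p].append(q)
        adj[q].append(p)
    n_labels = len(LABELS)
    # Start from the independent per-voxel minimum of the unary costs.
    labels = {v: min(range(n_labels), key=lambda l: unary[v][l]) for v in unary}
    for _ in range(sweeps):
        changed = False
        for v in unary:
            def local(l):
                # Cost of giving voxel v label l with its neighbors held fixed.
                return unary[v][l] + w * sum(l != labels[n] for n in adj[v])
            best = min(range(n_labels), key=local)
            if best != labels[v]:
                labels[v] = best
                changed = True
        if not changed:
            break
    return labels, energy(labels, unary, edges, w)


def toy_problem():
    """A 3x3x1 slab of 'road' evidence with one noisy center voxel."""
    shape = (3, 3, 1)
    unary = {v: [2.0, 0.2, 1.0] for v in product(range(3), range(3), range(1))}
    unary[(1, 1, 0)] = [2.0, 1.0, 0.8]  # weakly prefers "building"
    return unary, shape
```

In this toy problem, calling `icm(*toy_problem())` relabels the noisy center voxel as "road", because agreeing with its four neighbors outweighs its weak unary preference. This is the kind of smoothing a pairwise prior provides; the class-specific cues and higher-order terms in the abstract go beyond it.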

Citations

Proceedings Article

The Cityscapes Dataset for Semantic Urban Scene Understanding

Marius Cordts, +8 more

TL;DR: This work introduces Cityscapes, a benchmark suite and large-scale dataset to train and test approaches for pixel-level and instance-level semantic labeling, and exceeds previous attempts in terms of dataset size, annotation richness, scene variability, and complexity.

Proceedings Article

The SYNTHIA Dataset: A Large Collection of Synthetic Images for Semantic Segmentation of Urban Scenes

German Ros, +4 more

TL;DR: This paper generates a synthetic collection of diverse urban images, named SYNTHIA, with automatically generated class annotations, and conducts experiments with DCNNs that show how the inclusion of SYNTHIA in the training stage significantly improves performance on the semantic segmentation task.

Journal Article

Past, Present, and Future of Simultaneous Localization and Mapping: Toward the Robust-Perception Age

Cesar Cadena, +7 more

- 01 Dec 2016 -

IEEE Transactions on Robotics

TL;DR: Simultaneous localization and mapping (SLAM) consists of the concurrent construction of a model of the environment (the map) and the estimation of the state of the robot moving within it.

Journal Article

Past, Present, and Future of Simultaneous Localization And Mapping: Towards the Robust-Perception Age

Cesar Cadena, +7 more

- 19 Jun 2016 -

arXiv: Robotics

TL;DR: What is now the de-facto standard formulation for SLAM is presented, covering a broad set of topics including robustness and scalability in long-term mapping, metric and semantic representations for mapping, theoretical performance guarantees, active SLAM and exploration, and other new frontiers.

Book

Computer Vision for Autonomous Vehicles: Problems, Datasets and State-of-the-Art

Joel Janai, +3 more

TL;DR: This survey includes both the historically most relevant literature as well as the current state of the art on several specific topics, including recognition, reconstruction, motion estimation, tracking, scene understanding, and end-to-end learning for autonomous driving.

References

Proceedings Article

Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data

John Lafferty, +2 more

TL;DR: This work presents iterative parameter estimation algorithms for conditional random fields and compares the performance of the resulting models to HMMs and MEMMs on synthetic and natural-language data.

Proceedings Article

Are we ready for autonomous driving? The KITTI vision benchmark suite

Andreas Geiger, +2 more

TL;DR: The autonomous driving platform is used to develop novel challenging benchmarks for the tasks of stereo, optical flow, visual odometry/SLAM and 3D object detection, revealing that methods ranking high on established datasets such as Middlebury perform below average when being moved outside the laboratory to the real world.

Book

Probabilistic Graphical Models: Principles and Techniques

Daphne Koller, +1 more

TL;DR: The framework of probabilistic graphical models, presented in this book, provides a general approach for causal reasoning and decision making under uncertainty, allowing interpretable models to be constructed and then manipulated by reasoning algorithms.

Book

Probabilistic Robotics

Sebastian Thrun

TL;DR: This research presents a novel approach to planning and navigation algorithms that exploit statistics gleaned from uncertain, imperfect real-world environments to guide robots toward their goals and around obstacles.
