SegStereo: Exploiting Semantic Information for Disparity Estimation

doi:10.1007/978-3-030-01234-2_39

Open Access, Book Chapter

- pp 636-651

TLDR

This paper suggests that appropriate incorporation of semantic cues can greatly rectify predictions in commonly used disparity estimation frameworks. It proposes a unified model, SegStereo, which employs semantic features from segmentation and introduces a semantic softmax loss to improve the prediction accuracy of disparity maps.

Abstract:

Disparity estimation for binocular stereo images finds a wide range of applications. Traditional algorithms may fail on featureless regions, which could be handled by high-level cues such as semantic segments. In this paper, we suggest that appropriate incorporation of semantic cues can greatly rectify predictions in commonly used disparity estimation frameworks. Our method conducts semantic feature embedding and regularizes semantic cues as a loss term to improve disparity learning. Our unified model, SegStereo, employs semantic features from segmentation and introduces a semantic softmax loss, which helps improve the prediction accuracy of disparity maps. The semantic cues work well in both unsupervised and supervised settings. SegStereo achieves state-of-the-art results on the KITTI Stereo benchmark and produces decent predictions on both the CityScapes and FlyingThings3D datasets.
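The abstract does not spell out how the semantic softmax loss is computed, so the following is only a minimal sketch of one plausible reading: right-view semantic scores are warped into the left view using a disparity map, and a softmax cross-entropy is taken against left-view segmentation labels. The function names (warp_right_to_left, semantic_softmax_loss), the toy scene, and the choice to treat the warped map directly as class logits (with no learned classifier) are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def warp_right_to_left(right_feat, disparity):
    """Warp a right-view map (C, H, W) into the left view.

    Assumes rectified stereo: left pixel (y, x) matches right pixel
    (y, x - d). Samples with linear interpolation along x and clamps
    coordinates at the image border.
    """
    C, H, W = right_feat.shape
    xs = np.arange(W, dtype=np.float64)[None, :] - disparity  # (H, W)
    xs = np.clip(xs, 0.0, W - 1.0)
    x0 = np.floor(xs).astype(np.int64)
    x1 = np.minimum(x0 + 1, W - 1)
    w1 = xs - x0                      # weight of the right neighbour
    rows = np.arange(H)[:, None]      # broadcasts against (H, W) columns
    return (1.0 - w1) * right_feat[:, rows, x0] + w1 * right_feat[:, rows, x1]


def semantic_softmax_loss(logits, labels):
    """Mean softmax cross-entropy for logits (K, H, W) and labels (H, W)."""
    shifted = logits - logits.max(axis=0, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=0, keepdims=True))
    H, W = labels.shape
    picked = log_probs[labels, np.arange(H)[:, None], np.arange(W)[None, :]]
    return float(-picked.mean())


# Toy scene (hypothetical): two classes, true disparity of 2 pixels everywhere.
H, W, K = 4, 8, 2
right_labels = np.zeros((H, W), dtype=np.int64)
right_labels[:, 3:] = 1
left_labels = np.zeros((H, W), dtype=np.int64)
left_labels[:, 5:] = 1  # the class boundary appears 2 pixels further right

# Right-view class scores: confident one-hot logits (stand-in for a network).
right_logits = 5.0 * np.eye(K)[right_labels].transpose(2, 0, 1)  # (K, H, W)

good_disp = np.full((H, W), 2.0)
bad_disp = np.zeros((H, W))

warped_good = warp_right_to_left(right_logits, good_disp)
loss_good = semantic_softmax_loss(warped_good, left_labels)
loss_bad = semantic_softmax_loss(
    warp_right_to_left(right_logits, bad_disp), left_labels
)
print(loss_good, loss_bad)
```

In this toy setup the correct disparity aligns the warped semantic scores with the left-view labels, so it yields a lower loss than the zero-disparity guess. That is the sense in which a semantic term can act as a regularizer on disparity; in a trainable model the warp would need to be differentiable with respect to the disparity.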
