Open Access · Book Chapter

SegStereo: Exploiting Semantic Information for Disparity Estimation

TL;DR
This paper argues that appropriately incorporating semantic cues can substantially correct predictions in commonly used disparity estimation frameworks. It proposes SegStereo, a unified model that employs semantic features from segmentation and introduces a semantic softmax loss to improve the prediction accuracy of disparity maps.
Abstract
Disparity estimation for binocular stereo images has a wide range of applications. Traditional algorithms may fail on featureless regions, which can be handled with high-level cues such as semantic segments. In this paper, we suggest that appropriate incorporation of semantic cues can greatly rectify predictions in commonly used disparity estimation frameworks. Our method embeds semantic features and uses semantic cues as a regularizing loss term to improve disparity learning. Our unified model, SegStereo, employs semantic features from segmentation and introduces a semantic softmax loss, which helps improve the prediction accuracy of disparity maps. The semantic cues work well in both unsupervised and supervised settings. SegStereo achieves state-of-the-art results on the KITTI Stereo benchmark and produces decent predictions on both the CityScapes and FlyingThings3D datasets.
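
The abstract names a semantic softmax loss but does not spell out its form. The sketch below is one plausible reading, not the authors' implementation: right-view class scores are warped into the left view using the predicted disparity, and a per-pixel softmax cross-entropy is taken against left-view labels. The function names, the nearest-neighbour warp (a trainable network would need a differentiable sampler), and the toy data are all assumptions made for illustration.

```python
import math

def warp_right_to_left(right_logits, disparity):
    """Assumed warp: left pixel (y, x) samples the right view at (y, x - d)."""
    height, width = len(disparity), len(disparity[0])
    warped = [[None] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            src_x = x - int(round(disparity[y][x]))  # nearest-neighbour lookup
            if 0 <= src_x < width:
                warped[y][x] = right_logits[y][src_x]
    return warped

def softmax_cross_entropy(logits, label):
    """Numerically stable -log softmax(logits)[label]."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_sum - logits[label]

def semantic_softmax_loss(right_logits, disparity, left_labels):
    """Mean cross-entropy over left pixels whose warped source is in bounds."""
    warped = warp_right_to_left(right_logits, disparity)
    losses = [
        softmax_cross_entropy(scores, left_labels[y][x])
        for y, row in enumerate(warped)
        for x, scores in enumerate(row)
        if scores is not None
    ]
    return sum(losses) / len(losses) if losses else 0.0

# Toy 1x4 image with two classes (hypothetical data).
right_logits = [[[5.0, 0.0], [0.0, 5.0], [0.0, 5.0], [0.0, 5.0]]]
left_labels = [[0, 0, 1, 1]]
good_disparity = [[1.0, 1.0, 1.0, 1.0]]  # consistent with the labels
bad_disparity = [[0.0, 0.0, 0.0, 0.0]]   # misaligns the class boundary

loss_good = semantic_softmax_loss(right_logits, good_disparity, left_labels)
loss_bad = semantic_softmax_loss(right_logits, bad_disparity, left_labels)
```

In this toy case the disparity that aligns the warped scores with the left-view labels yields the lower loss, which is the kind of signal a semantic loss term could provide in featureless regions where photometric matching is ambiguous.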

