Bounding Boxes, Segmentations and Object Coordinates: How Important is Recognition for 3D Scene Flow Estimation in Autonomous Driving Scenarios?

doi:10.1109/ICCV.2017.281

Proceedings Article

pp. 2593-2602

TLDR

This paper investigates the importance of recognition granularity for 3D scene flow estimation, from coarse 2D bounding box estimates through 2D instance segmentations to fine-grained 3D object part predictions, and finds that the instance segmentation cue is by far the strongest in the authors' setting.

Abstract:

Existing methods for 3D scene flow estimation often fail in the presence of large displacements or local ambiguities, e.g., at texture-less or reflective surfaces. However, these challenges are omnipresent in dynamic road scenes, which are the focus of this work. Our main contribution is to overcome these 3D motion estimation problems by exploiting recognition. In particular, we investigate the importance of recognition granularity, from coarse 2D bounding box estimates through 2D instance segmentations to fine-grained 3D object part predictions. We compute these cues using CNNs trained on a newly annotated dataset of stereo images and integrate them into a CRF-based model for robust 3D scene flow estimation, an approach we term Instance Scene Flow. We analyze the importance of each recognition cue in an ablation study and observe that the instance segmentation cue is by far the strongest in our setting. We demonstrate the effectiveness of our method on the challenging KITTI 2015 scene flow benchmark, where we achieve state-of-the-art performance at the time of submission.

Citations

Proceedings Article

PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume

Deqing Sun, +3 more

TL;DR: PWC-Net uses the current optical flow estimate to warp the CNN features of the second image, builds a cost volume from the warped features and the features of the first image, and processes that cost volume with a CNN to estimate the optical flow; it achieves state-of-the-art performance on the MPI Sintel final pass and KITTI 2015 benchmarks.

Proceedings Article

GeoNet: Unsupervised Learning of Dense Depth, Optical Flow and Camera Pose

Zhichao Yin, +1 more

TL;DR: GeoNet proposes an adaptive geometric consistency loss to increase robustness to outliers and non-Lambertian regions, which resolves occlusions and texture ambiguities effectively.

Posted Content

GeoNet: Unsupervised Learning of Dense Depth, Optical Flow and Camera Pose

Zhichao Yin, +1 more

06 Mar 2018

arXiv: Computer Vision and Pattern Recog...

TL;DR: An adaptive geometric consistency loss is proposed to increase robustness to outliers and non-Lambertian regions, which resolves occlusions and texture ambiguities effectively. The method achieves state-of-the-art results in all three tasks, performing better than previous unsupervised methods and comparably with supervised ones.

Proceedings Article

M3D-RPN: Monocular 3D Region Proposal Network for Object Detection

Garrick Brazil, +1 more

TL;DR: M3D-RPN is able to significantly improve the performance of both monocular 3D Object Detection and Bird's Eye View tasks within the KITTI urban autonomous driving dataset, while efficiently using a shared multi-class model.

Book Chapter

SegStereo: Exploiting Semantic Information for Disparity Estimation

Guorun Yang, +4 more

TL;DR: This paper suggests that appropriate incorporation of semantic cues can greatly rectify prediction in commonly-used disparity estimation frameworks and proposes a unified model SegStereo, which employs semantic features from segmentation and introduces semantic softmax loss, which helps improve the prediction accuracy of disparity maps.

References

Book Chapter

Joint Optical Flow and Temporally Consistent Semantic Segmentation

Junhwa Hur, +1 more

TL;DR: A method for jointly estimating optical flow and temporally consistent semantic segmentation is proposed, which closely connects these two problem domains so that each benefits from the other.

Book Chapter

View-Consistent 3D Scene Flow Estimation over Multiple Frames

Christoph Vogel, +2 more

TL;DR: It is shown that such a view-consistent multi-frame scheme greatly improves scene flow computation in the presence of occlusions, and increases its robustness against adverse imaging conditions, such as specularities.

Book Chapter

6-DOF Model Based Tracking via Object Coordinate Regression

Alexander Krull, +5 more

TL;DR: This work builds on a recently developed state-of-the-art system for single image 6D pose estimation of known 3D objects, using the concept of so-called 3D object coordinates, and creates a new dataset, which will be made publicly available.

Proceedings Article

Pose Estimation of Kinematic Chain Instances via Object Coordinate Regression

Frank Michel, +5 more

TL;DR: This work considers the task of one-shot pose estimation of articulated object instances from an RGB-D image of objects with the topology of a kinematic chain of any length, i.e. objects are composed of a chain of parts interconnected by joints.

Book Chapter

A Continuous Optimization Approach for Efficient and Accurate Scene Flow

Zhaoyang Lv, +5 more

TL;DR: A continuous optimization method is proposed for solving dense 3D scene flow problems from stereo imagery, using a fine superpixel segmentation that is fixed a priori and a factor graph formulation that decomposes the problem into photometric, geometric, and smoothing constraints.

Related Papers (5)

Object scene flow for autonomous vehicles

Are we ready for autonomous driving? The KITTI vision benchmark suite

FlowNet: Learning Optical Flow with Convolutional Networks

FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks

PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume