Open Access · Proceedings Article · DOI

SceneEDNet: A Deep Learning Approach for Scene Flow Estimation

TL;DR
This paper presents a first effort to apply deep learning to the direct estimation of scene flow, using a fully convolutional neural network with an encoder-decoder (ED) architecture.
Abstract
Estimating scene flow in RGB-D videos is attracting considerable interest from computer vision researchers, owing to its potential applications in robotics. State-of-the-art techniques for scene flow estimation typically rely on knowledge of the scene structure in each frame and the correspondence between frames. However, with the availability of large RGB-D datasets captured with depth sensors, learning representations for scene flow estimation has become possible. This paper presents a first effort to apply deep learning to the direct estimation of scene flow, using a fully convolutional neural network with an encoder-decoder (ED) architecture. The proposed network, SceneEDNet, estimates the three-dimensional motion vectors of all scene points from a sequence of stereo images. Training for direct estimation of scene flow uses consecutive pairs of stereo images and the corresponding scene flow ground truth.
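As a small illustration of the training target described above (a sketch, not the authors' code): scene flow ground truth assigns each scene point a 3D motion vector, and a network trained for direct estimation can minimize the average end-point error between predicted and ground-truth flow fields. The array shapes and the function name `epe_loss` are assumptions for illustration only.

```python
import numpy as np

def epe_loss(pred_flow, gt_flow):
    """Average end-point error between predicted and ground-truth
    scene flow, each of shape (H, W, 3): one 3D motion vector
    (dX, dY, dZ) per scene point."""
    return np.mean(np.linalg.norm(pred_flow - gt_flow, axis=-1))

# Toy example: constant ground-truth motion, uniformly offset prediction.
H, W = 4, 5
gt = np.tile(np.array([1.0, 0.0, 0.5]), (H, W, 1))
pred = gt + 0.1
print(epe_loss(pred, gt))  # norm of (0.1, 0.1, 0.1) ≈ 0.1732
```

In practice such a loss is evaluated per pixel over the full-resolution flow field, which is why an encoder-decoder architecture with a dense, correspondingly-sized output is a natural fit.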


Citations
Posted Content

Self-Supervised Monocular Scene Flow Estimation

TL;DR: This work designs a single convolutional neural network (CNN) that successfully estimates depth and 3D motion simultaneously from a classical optical flow cost volume, and adopts self-supervised learning with 3D loss functions and occlusion reasoning to leverage unlabeled data.
Proceedings Article · DOI

DeepFaceFlow: In-the-Wild Dense 3D Facial Motion Estimation

TL;DR: DeepFaceFlow is a robust, fast, and highly accurate framework for the dense estimation of 3D non-rigid facial flow between pairs of monocular images. It is trained and tested on two very large-scale facial video datasets, one of them of the authors' own collection and annotation, with the aid of an occlusion-aware, 3D-based loss function.
Proceedings Article · DOI

A Conditional Adversarial Network for Scene Flow Estimation

TL;DR: In this article, a conditional adversarial network is proposed for scene flow estimation in depth videos. It applies loss functions at both ends, the generator and the discriminator, and can estimate both the optical flow and the disparity from the input stereo images simultaneously.
Posted Content

DeepFaceFlow: In-the-wild Dense 3D Facial Motion Estimation

TL;DR: This work proposes DeepFaceFlow, a robust, fast, and highly accurate framework for the dense estimation of 3D non-rigid facial flow between pairs of monocular images, and incorporates the framework into a full-head, state-of-the-art facial video synthesis method.
References
Proceedings Article · DOI

Fully convolutional networks for semantic segmentation

TL;DR: The key insight is to build “fully convolutional” networks that take input of arbitrary size and produce correspondingly-sized output with efficient inference and learning.
Journal Article · DOI

A taxonomy and evaluation of dense two-frame stereo correspondence algorithms

TL;DR: This paper has designed a stand-alone, flexible C++ implementation that enables the evaluation of individual components and that can easily be extended to include new algorithms.
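To make concrete the kind of algorithm such a taxonomy evaluates, here is a minimal sketch of one classical component: winner-take-all block matching with a sum-of-absolute-differences (SAD) cost. The function name, window size, and disparity range are illustrative assumptions, not details from the paper.

```python
import numpy as np

def sad_disparity(left, right, max_disp=3, half_win=1):
    """Winner-take-all stereo matching: for each left-image pixel,
    pick the horizontal disparity whose SAD cost over a square
    window against the right image is lowest."""
    H, W = left.shape
    disp = np.zeros((H, W), dtype=int)
    for y in range(half_win, H - half_win):
        for x in range(half_win, W - half_win):
            best_cost, best_d = np.inf, 0
            patch_l = left[y-half_win:y+half_win+1, x-half_win:x+half_win+1]
            for d in range(min(max_disp, x - half_win) + 1):
                patch_r = right[y-half_win:y+half_win+1,
                                x-d-half_win:x-d+half_win+1]
                cost = np.abs(patch_l - patch_r).sum()
                if cost < best_cost:
                    best_cost, best_d = cost, d
            disp[y, x] = best_d
    return disp

# Synthetic check: the right view is the left view shifted 2 px left,
# so the true disparity is 2 everywhere away from the borders.
rng = np.random.default_rng(0)
left = rng.random((8, 10))
right = np.roll(left, -2, axis=1)
disp = sad_disparity(left, right)
```

A stand-alone evaluation framework like the paper's makes each such component (matching cost, aggregation window, winner selection) swappable and measurable in isolation.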
Proceedings Article · DOI

FlowNet: Learning Optical Flow with Convolutional Networks

TL;DR: In this paper, the authors propose and compare two architectures: a generic one, and another that includes a layer correlating feature vectors at different image locations. They show that networks trained on unrealistic synthetic data still generalize very well to existing datasets such as Sintel and KITTI.
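The correlation layer mentioned in this TL;DR compares feature vectors from the two input images at a range of displacements. The sketch below is a simplified, single-pixel-stride version of that idea; the real FlowNet layer adds striding and a larger neighborhood, and the function name and shapes here are assumptions.

```python
import numpy as np

def correlation(f1, f2, max_disp=1):
    """For each location in feature map f1 (H, W, C), take the
    channel-wise dot product with f2 at every displacement
    (dy, dx) with |dy|, |dx| <= max_disp, normalized by C.
    Output shape: (H, W, (2*max_disp+1)**2)."""
    H, W, C = f1.shape
    D = 2 * max_disp + 1
    out = np.zeros((H, W, D * D))
    # Zero-pad f2 so shifted windows stay in bounds.
    f2p = np.pad(f2, ((max_disp, max_disp), (max_disp, max_disp), (0, 0)))
    k = 0
    for dy in range(-max_disp, max_disp + 1):
        for dx in range(-max_disp, max_disp + 1):
            shifted = f2p[max_disp+dy : max_disp+dy+H,
                          max_disp+dx : max_disp+dx+W, :]
            out[:, :, k] = (f1 * shifted).sum(axis=-1) / C
            k += 1
    return out

# Identical feature maps: the zero-displacement channel peaks at 1.
f = np.ones((3, 3, 4))
out = correlation(f, f)
```

Each output channel is thus a matching score for one candidate displacement, which is what lets the network reason explicitly about correspondence between the two frames.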
Posted Content

ShapeNet: An Information-Rich 3D Model Repository

TL;DR: ShapeNet is a collection of datasets containing 3D models from a multitude of semantic categories, organized under the WordNet taxonomy. It provides many semantic annotations for each 3D model, such as consistent rigid alignments, parts and bilateral symmetry planes, physical sizes, and keywords, as well as other planned annotations.
Proceedings Article

Depth Map Prediction from a Single Image using a Multi-Scale Deep Network

TL;DR: In this article, two deep network stacks are employed: one makes a coarse global prediction based on the entire image, and the other refines this prediction locally. The method achieves state-of-the-art results on both NYU Depth and KITTI.