SceneEDNet: A Deep Learning Approach for Scene Flow Estimation
Ravi Kumar Thakur, Snehasis Mukherjee, et al.
pp. 394–399
TLDR
This paper introduces a first effort to apply a deep learning method for direct estimation of scene flow, presenting a fully convolutional neural network with an encoder-decoder (ED) architecture.
Abstract:
Estimating scene flow in RGB-D videos is attracting much interest from computer vision researchers, due to its potential applications in robotics. State-of-the-art techniques for scene flow estimation typically rely on knowledge of the scene structure of the frame and the correspondence between frames. However, with the availability of large RGB-D data captured from depth sensors, learning representations for scene flow estimation has become possible. This paper introduces a first effort to apply a deep learning method for direct estimation of scene flow by presenting a fully convolutional neural network with an encoder-decoder (ED) architecture. The proposed network, SceneEDNet, estimates the three-dimensional motion vectors of all scene points from sequences of stereo images. Training for direct estimation of scene flow is done using consecutive pairs of stereo images and the corresponding scene flow ground truth.
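The abstract's encoder-decoder idea can be sketched in a few lines of PyTorch. This is a minimal illustration, not the paper's architecture: the layer counts, channel widths, and the convention of stacking two consecutive stereo pairs into a 12-channel input (2 frames × 2 views × 3 RGB channels) are assumptions for the sketch. The encoder downsamples with strided convolutions; the decoder upsamples back to full resolution and emits a 3-channel map of per-pixel 3D motion vectors.

```python
import torch
import torch.nn as nn

class SceneEDNetSketch(nn.Module):
    """Hypothetical minimal encoder-decoder (ED) network for dense scene flow.

    Input:  two consecutive stereo pairs stacked along channels
            (2 frames x 2 views x 3 RGB channels = 12 channels).
    Output: a 3-channel map of per-pixel 3D motion vectors (dX, dY, dZ).
    Layer counts and widths are illustrative, not those of SceneEDNet.
    """
    def __init__(self, in_ch: int = 12, out_ch: int = 3):
        super().__init__()
        # Encoder: strided convolutions halve the spatial resolution twice.
        self.encoder = nn.Sequential(
            nn.Conv2d(in_ch, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
        )
        # Decoder: transposed convolutions restore the input resolution,
        # so every scene point gets a motion vector (fully convolutional).
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, kernel_size=4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, out_ch, kernel_size=4, stride=2, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))

net = SceneEDNetSketch()
x = torch.randn(1, 12, 64, 96)   # one stacked stereo sequence, 64x96 pixels
flow = net(x)                    # shape (1, 3, 64, 96): (dX, dY, dZ) per pixel
```

Training such a network directly against scene flow ground truth, as the abstract describes, would amount to regressing `flow` toward the ground-truth motion field with a per-pixel loss (e.g. an endpoint-error-style L2 loss).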
Citations
Posted Content
Self-Supervised Monocular Scene Flow Estimation
Junhwa Hur, Stefan Roth, et al.
TL;DR: This work designs a single convolutional neural network (CNN) that successfully estimates depth and 3D motion simultaneously from a classical optical flow cost volume, and adopts self-supervised learning with 3D loss functions and occlusion reasoning to leverage unlabeled data.
Proceedings ArticleDOI
DeepFaceFlow: In-the-Wild Dense 3D Facial Motion Estimation
TL;DR: DeepFaceFlow is a robust, fast, and highly accurate framework for the dense estimation of 3D non-rigid facial flow between pairs of monocular images. It is trained and tested on two very large-scale facial video datasets, one of them collected and annotated by the authors, with the aid of an occlusion-aware, 3D-based loss function.
Proceedings ArticleDOI
A Conditional Adversarial Network for Scene Flow Estimation
TL;DR: In this article, a conditional adversarial network is proposed for scene flow estimation in depth videos. It uses loss functions at both ends, the generator and the discriminator, and is able to estimate both the optical flow and the disparity from the input stereo images simultaneously.
Posted Content
DeepFaceFlow: In-the-wild Dense 3D Facial Motion Estimation
TL;DR: This work proposes DeepFaceFlow, a robust, fast, and highly-accurate framework for the dense estimation of 3D non-rigid facial flow between pairs of monocular images, and incorporates its framework in a full-head state-of-the-art facial video synthesis method.
References
Proceedings ArticleDOI
Fully convolutional networks for semantic segmentation
TL;DR: The key insight is to build “fully convolutional” networks that take input of arbitrary size and produce correspondingly-sized output with efficient inference and learning.
Journal ArticleDOI
A taxonomy and evaluation of dense two-frame stereo correspondence algorithms
TL;DR: This paper has designed a stand-alone, flexible C++ implementation that enables the evaluation of individual components and that can easily be extended to include new algorithms.
Proceedings ArticleDOI
FlowNet: Learning Optical Flow with Convolutional Networks
Alexey Dosovitskiy, Philipp Fischer, Eddy Ilg, Philip Häusser, Caner Hazirbas, Vladimir Golkov, Patrick van der Smagt, Daniel Cremers, Thomas Brox
TL;DR: In this paper, the authors propose and compare two architectures: a generic architecture and another one including a layer that correlates feature vectors at different image locations, and show that networks trained on this unrealistic data still generalize very well to existing datasets such as Sintel and KITTI.
Posted Content
ShapeNet: An Information-Rich 3D Model Repository
Angel X. Chang, Thomas Funkhouser, Leonidas J. Guibas, Pat Hanrahan, Qixing Huang, Zimo Li, Silvio Savarese, Manolis Savva, Shuran Song, Hao Su, Jianxiong Xiao, Li Yi, Fisher Yu
TL;DR: ShapeNet contains 3D models from a multitude of semantic categories and organizes them under the WordNet taxonomy. It is a collection of datasets providing many semantic annotations for each 3D model, such as consistent rigid alignments, parts and bilateral symmetry planes, physical sizes, and keywords, as well as other planned annotations.
Proceedings Article
Depth Map Prediction from a Single Image using a Multi-Scale Deep Network
TL;DR: In this article, two deep network stacks are employed: one makes a coarse global prediction based on the entire image, and the other refines this prediction locally. The method achieves state-of-the-art results on both NYU Depth and KITTI.