Multi-view Supervision for Single-View Reconstruction via Differentiable Ray Consistency
Shubham Tulsiani, Tinghui Zhou, Alexei A. Efros, Jitendra Malik
pp. 209-217
TLDR
A differentiable formulation that allows computing gradients of the 3D shape given an observation from an arbitrary view is proposed by reformulating view consistency using a differentiable ray consistency (DRC) term, and it is shown that this formulation can be incorporated in a learning framework to leverage different types of multi-view observations.
Abstract
We study the notion of consistency between a 3D shape and a 2D observation and propose a differentiable formulation which allows computing gradients of the 3D shape given an observation from an arbitrary view. We do so by reformulating view consistency using a differentiable ray consistency (DRC) term. We show that this formulation can be incorporated in a learning framework to leverage different types of multi-view observations, e.g. foreground masks, depth, color images, and semantics, as supervision for learning single-view 3D prediction. We present an empirical analysis of our technique in a controlled setting. We also show that this approach allows us to improve over existing techniques for single-view reconstruction of objects from the PASCAL VOC dataset.
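The abstract's central idea, treating the consistency between a predicted occupancy grid and a 2D observation as an expected cost over where each ray terminates, can be illustrated with a minimal per-ray sketch. This is a simplified illustration of the general idea, not the paper's exact formulation: the function name `ray_consistency_loss`, the ordering convention, and the mask-based cost example are assumptions made here for clarity.

```python
import numpy as np

def ray_consistency_loss(occupancy, ray_costs):
    """Simplified per-ray consistency sketch (illustrative, not the paper's exact math).

    occupancy: (N,) occupancy probabilities of the N voxels a ray passes
        through, ordered from the camera outward.
    ray_costs: (N+1,) cost of the ray terminating in each voxel, plus a
        final entry for the ray escaping without hitting any voxel. For a
        foreground-mask observation, escaping can be made costly when the
        pixel is foreground, and terminating costly when it is background.
    """
    x = np.asarray(occupancy, dtype=float)
    # Probability the ray reaches voxel i without stopping earlier.
    transmittance = np.concatenate(([1.0], np.cumprod(1.0 - x)))
    # Probability the ray terminates at voxel i; last entry = escape probability.
    p_stop = np.append(transmittance[:-1] * x, transmittance[-1])
    # Expected cost over termination events; differentiable in the occupancies.
    return float(np.dot(p_stop, ray_costs))
```

For a foreground pixel penalized only when the ray escapes (costs `[0, 0, 0, 1]`), occupancies `[0.9, 0.1, 0.2]` give a loss equal to the escape probability, 0.1 * 0.9 * 0.8 = 0.072; because this expectation is differentiable in the occupancy probabilities, gradients flow back to the predicted 3D shape.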
Citations
Posted Content
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, Ren Ng
TL;DR: This work describes how to effectively optimize neural radiance fields to render photorealistic novel views of scenes with complicated geometry and appearance, and demonstrates results that outperform prior work on neural rendering and view synthesis.
Proceedings ArticleDOI
Unsupervised Learning of Depth and Ego-Motion from Video
TL;DR: In this paper, an unsupervised learning framework for the task of monocular depth and camera motion estimation from unstructured video sequences is presented, which uses single-view depth and multi-view pose networks with a loss based on warping nearby views to the target using the computed depth and pose.
Proceedings ArticleDOI
Occupancy Networks: Learning 3D Reconstruction in Function Space
TL;DR: In this paper, the authors propose Occupancy Networks, which implicitly represent the 3D surface as the continuous decision boundary of a deep neural network classifier, a representation well suited to learning-based 3D reconstruction.
Book ChapterDOI
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, Ren Ng
TL;DR: In this article, a fully-connected (non-convolutional) deep network is used to synthesize novel views of complex scenes by optimizing an underlying continuous volumetric scene function using a sparse set of input views.
Proceedings ArticleDOI
Neural 3D Mesh Renderer
TL;DR: In this article, an approximate gradient for rasterization is proposed to integrate rendering into neural networks, enabling single-image 3D mesh reconstruction with silhouette image supervision.
References
Proceedings ArticleDOI
Deep Residual Learning for Image Recognition
TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously, which won the 1st place on the ILSVRC 2015 classification task.
Journal ArticleDOI
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael S. Bernstein, Alexander C. Berg, Li Fei-Fei
TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) as mentioned in this paper is a benchmark in object category classification and detection on hundreds of object categories and millions of images, which has been run annually from 2010 to present, attracting participation from more than fifty institutions.
Proceedings ArticleDOI
Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation
TL;DR: RCNN as discussed by the authors combines CNNs with bottom-up region proposals to localize and segment objects, and when labeled training data is scarce, supervised pre-training for an auxiliary task, followed by domain-specific fine-tuning, yields a significant performance boost.
Book
The Ecological Approach to Visual Perception
TL;DR: The relationship between stimulation and stimulus information for visual perception is discussed in detail, along with experimental evidence for direct perception of motion in the world and movement of the self.
Journal ArticleDOI
The Pascal Visual Object Classes (VOC) Challenge
TL;DR: The state of the art among the evaluated methods for both classification and detection is reviewed, including whether the methods are statistically different, what they are learning from the images, and what the methods find easy or confusing.