Multi-view Supervision for Single-View Reconstruction via Differentiable Ray Consistency
Shubham Tulsiani, Tinghui Zhou, Alexei A. Efros, Jitendra Malik
pp. 209-217
TLDR
A differentiable formulation that allows computing gradients of the 3D shape given an observation from an arbitrary view is proposed by reformulating view consistency using a differentiable ray consistency (DRC) term, and it is shown that this formulation can be incorporated in a learning framework to leverage different types of multi-view observations.
Abstract
We study the notion of consistency between a 3D shape and a 2D observation and propose a differentiable formulation which allows computing gradients of the 3D shape given an observation from an arbitrary view. We do so by reformulating view consistency using a differentiable ray consistency (DRC) term. We show that this formulation can be incorporated in a learning framework to leverage different types of multi-view observations, e.g. foreground masks, depth, color images, and semantics, as supervision for learning single-view 3D prediction. We present an empirical analysis of our technique in a controlled setting. We also show that this approach allows us to improve over existing techniques for single-view reconstruction of objects from the PASCAL VOC dataset.
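The abstract's central idea, treating the consistency between a predicted occupancy grid and a 2D observation as an expected cost over where each ray terminates, can be illustrated with a minimal per-ray sketch. This is a simplified illustration of the general idea, not the paper's exact formulation: the function name `ray_consistency_loss`, the ordering convention, and the mask-based cost example are assumptions made here for clarity.

```python
import numpy as np

def ray_consistency_loss(occupancy, ray_costs):
    """Simplified per-ray consistency sketch (illustrative, not the paper's exact math).

    occupancy: (N,) occupancy probabilities of the N voxels a ray passes
        through, ordered from the camera outward.
    ray_costs: (N+1,) cost of the ray terminating in each voxel, plus a
        final entry for the ray escaping without hitting any voxel. For a
        foreground-mask observation, escaping can be made costly when the
        pixel is foreground, and terminating costly when it is background.
    """
    x = np.asarray(occupancy, dtype=float)
    # Probability the ray reaches voxel i without stopping earlier.
    transmittance = np.concatenate(([1.0], np.cumprod(1.0 - x)))
    # Probability the ray terminates at voxel i; last entry = escape probability.
    p_stop = np.append(transmittance[:-1] * x, transmittance[-1])
    # Expected cost over termination events; differentiable in the occupancies.
    return float(np.dot(p_stop, ray_costs))
```

For a foreground pixel penalized only when the ray escapes (costs `[0, 0, 0, 1]`), occupancies `[0.9, 0.1, 0.2]` give a loss equal to the escape probability, 0.1 * 0.9 * 0.8 = 0.072; because this expectation is differentiable in the occupancy probabilities, gradients flow back to the predicted 3D shape.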
Citations
Posted Content
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, Ren Ng
TL;DR: This work describes how to effectively optimize neural radiance fields to render photorealistic novel views of scenes with complicated geometry and appearance, and demonstrates results that outperform prior work on neural rendering and view synthesis.
Proceedings ArticleDOI
Unsupervised Learning of Depth and Ego-Motion from Video
TL;DR: In this paper, an unsupervised learning framework for the task of monocular depth and camera motion estimation from unstructured video sequences is presented, which uses single-view depth and multi-view pose networks with a loss based on warping nearby views to the target using the computed depth and pose.
Proceedings ArticleDOI
Occupancy Networks: Learning 3D Reconstruction in Function Space
TL;DR: In this paper, the authors propose Occupancy Networks, which implicitly represent the 3D surface as the continuous decision boundary of a deep neural network classifier, a representation well suited to learning-based 3D reconstruction.
Book ChapterDOI
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, Ren Ng
TL;DR: In this article, a fully-connected (non-convolutional) deep network is used to synthesize novel views of complex scenes by optimizing an underlying continuous volumetric scene function using a sparse set of input views.
Proceedings ArticleDOI
Neural 3D Mesh Renderer
TL;DR: In this article, an approximate gradient for rasterization is proposed to integrate rendering into neural networks, enabling single-image 3D mesh reconstruction with silhouette image supervision.
References
Proceedings ArticleDOI
Deep Residual Learning for Image Recognition
TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously, which won the 1st place on the ILSVRC 2015 classification task.
Journal ArticleDOI
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael S. Bernstein, Alexander C. Berg, Li Fei-Fei
TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) as mentioned in this paper is a benchmark in object category classification and detection on hundreds of object categories and millions of images, which has been run annually from 2010 to present, attracting participation from more than fifty institutions.
Proceedings ArticleDOI
Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation
TL;DR: RCNN as discussed by the authors combines CNNs with bottom-up region proposals to localize and segment objects, and when labeled training data is scarce, supervised pre-training for an auxiliary task, followed by domain-specific fine-tuning, yields a significant performance boost.
Book
The Ecological Approach to Visual Perception
TL;DR: The relationship between stimulation and stimulus information for visual perception is discussed in detail, along with experimental evidence for direct perception of motion in the world and movement of the self.
Journal ArticleDOI
The Pascal Visual Object Classes (VOC) Challenge
TL;DR: The state of the art among the evaluated methods for both classification and detection is reviewed, including whether the methods are statistically different, what they are learning from the images, and what the methods find easy or confusing.