Accelerating 3D Deep Learning with PyTorch3D

Open Access, Posted Content

- 16 Jul 2020 -

arXiv: Computer Vision and Pattern Recog...

TLDR

1. Accelerating 3D Deep Learning with PyTorch3D, arXiv 2007
2. Mesh R-CNN, ICCV 2019
3. SynSin: End-to-end View Synthesis from a Single Image, CVPR 2020
4. Fast Differentiable Raycasting for Neural Rendering using Sphere-based Representations

Abstract:

Deep learning has significantly improved 2D image recognition. Extending into 3D may advance many new applications including autonomous vehicles, virtual and augmented reality, authoring 3D content, and even improving 2D recognition. However, despite growing interest, 3D deep learning remains relatively underexplored. We believe that some of this disparity is due to the engineering challenges involved in 3D deep learning, such as efficiently processing heterogeneous data and reframing graphics operations to be differentiable. We address these challenges by introducing PyTorch3D, a library of modular, efficient, and differentiable operators for 3D deep learning. It includes a fast, modular differentiable renderer for meshes and point clouds, enabling analysis-by-synthesis approaches. Compared with other differentiable renderers, PyTorch3D is more modular and efficient, allowing users to more easily extend it while also gracefully scaling to large meshes and images. We compare the PyTorch3D operators and renderer with other implementations and demonstrate significant speed and memory improvements. We also use PyTorch3D to improve the state of the art for unsupervised 3D mesh and point cloud prediction from 2D images on ShapeNet. PyTorch3D is open-source and we hope it will help accelerate research in 3D deep learning.
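
The abstract names "efficiently processing heterogeneous data" as a core engineering challenge but does not say how such data is batched. As a minimal sketch, assuming the challenge is that point clouds in one batch have different numbers of points, the plain-Python example below shows two common layouts: a padded layout (every cloud extended to the largest size) and a packed layout (all points concatenated, with an index recording the source cloud). The function names `to_padded` and `to_packed` are hypothetical and chosen for illustration; this is not PyTorch3D's API.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]


def to_padded(clouds: List[List[Point]],
              pad: Point = (0.0, 0.0, 0.0)) -> List[List[Point]]:
    """Extend every cloud with `pad` entries up to the largest cloud's size."""
    max_len = max((len(c) for c in clouds), default=0)
    return [list(c) + [pad] * (max_len - len(c)) for c in clouds]


def to_packed(clouds: List[List[Point]]) -> Tuple[List[Point], List[int]]:
    """Concatenate all clouds; also return the source cloud index per point."""
    points: List[Point] = []
    cloud_index: List[int] = []
    for i, cloud in enumerate(clouds):
        points.extend(cloud)
        cloud_index.extend([i] * len(cloud))
    return points, cloud_index


# Three clouds with 2, 1, and 3 points (illustrative data only).
clouds = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 1.0, 0.0)],
    [(0.0, 0.0, 1.0), (1.0, 1.0, 0.0), (1.0, 1.0, 1.0)],
]

padded = to_padded(clouds)                       # rectangular, wastes some slots
packed_points, packed_index = to_packed(clouds)  # compact, needs an index
```

The padded layout is convenient for operations that expect a rectangular batch, while the packed layout avoids spending memory on padding; which one is preferable depends on the operator.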

Citations

Proceedings Article

AutoFlow: Learning a Better Training Set for Optical Flow

Deqing Sun, +8 more

TL;DR: AutoFlow takes a layered approach to rendering synthetic data, in which the motion, shape, and appearance of each layer are controlled by learnable hyperparameters, and achieves state-of-the-art accuracy when pre-training both PWC-Net and RAFT.

Inverse Rendering for Computer Graphics

Sato Imari

Posted Content

MVTN: Multi-View Transformation Network for 3D Shape Recognition

Abdullah Hamdi, +2 more

- 26 Nov 2020 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: Introduces the Multi-View Transformation Network (MVTN), which builds on advances in differentiable rendering to regress optimal viewpoints for 3D shape recognition and can make networks more robust to rotation and occlusion in the 3D domain.

Continuous shading of curved surfaces

Henri Gouraud

TL;DR: Presents a procedure for computing shaded pictures of curved surfaces. The surface is approximated by small polygons so that the hidden-parts problem is easy to solve, but the shading of each polygon is computed so that discontinuities of shade across the surface are eliminated and a smooth appearance is obtained.
