Author

Taylor Gordon

Bio: Taylor Gordon is an academic researcher who has contributed to research on deep learning and augmented reality. The author has an h-index of 1 and has co-authored 1 publication, which has received 127 citations.

Papers
Posted Content
TL;DR: 1. Accelerating 3D Deep Learning with PyTorch3D (arXiv, 2020); 2. Mesh R-CNN (ICCV 2019); 3. SynSin: End-to-end View Synthesis from a Single Image (CVPR 2020); 4. Fast Differentiable Raycasting for Neural Rendering using Sphere-based Representations.
Abstract: Deep learning has significantly improved 2D image recognition. Extending into 3D may advance many new applications including autonomous vehicles, virtual and augmented reality, authoring 3D content, and even improving 2D recognition. However, despite growing interest, 3D deep learning remains relatively underexplored. We believe that some of this disparity is due to the engineering challenges involved in 3D deep learning, such as efficiently processing heterogeneous data and reframing graphics operations to be differentiable. We address these challenges by introducing PyTorch3D, a library of modular, efficient, and differentiable operators for 3D deep learning. It includes a fast, modular differentiable renderer for meshes and point clouds, enabling analysis-by-synthesis approaches. Compared with other differentiable renderers, PyTorch3D is more modular and efficient, allowing users to more easily extend it while also gracefully scaling to large meshes and images. We compare the PyTorch3D operators and renderer with other implementations and demonstrate significant speed and memory improvements. We also use PyTorch3D to improve the state-of-the-art for unsupervised 3D mesh and point cloud prediction from 2D images on ShapeNet. PyTorch3D is open-source and we hope it will help accelerate research in 3D deep learning.

430 citations
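
The abstract above describes a modular differentiable renderer for meshes and point clouds. The following is a minimal sketch of rendering a mesh with PyTorch3D's public renderer API; the icosphere geometry, camera pose, lighting, and image size are illustrative choices, not taken from the paper's experiments.

```python
# A minimal sketch (not from the paper) of rendering a mesh with
# PyTorch3D's public renderer API. The icosphere, camera pose, lighting,
# and image size are illustrative choices.
import torch
from pytorch3d.utils import ico_sphere
from pytorch3d.renderer import (
    FoVPerspectiveCameras, MeshRasterizer, MeshRenderer, PointLights,
    RasterizationSettings, SoftPhongShader, TexturesVertex,
    look_at_view_transform,
)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Start from a unit icosphere and give it constant white vertex colors.
mesh = ico_sphere(level=2, device=device)
mesh.textures = TexturesVertex(
    verts_features=torch.ones_like(mesh.verts_padded())
)

# A camera looking at the origin from 2.7 units away.
R, T = look_at_view_transform(dist=2.7, elev=10.0, azim=30.0)
cameras = FoVPerspectiveCameras(R=R, T=T, device=device)

# Differentiable rasterization followed by soft Phong shading; gradients
# can flow from the rendered pixels back to the mesh vertices.
renderer = MeshRenderer(
    rasterizer=MeshRasterizer(
        cameras=cameras,
        raster_settings=RasterizationSettings(image_size=256),
    ),
    shader=SoftPhongShader(
        device=device, cameras=cameras,
        lights=PointLights(device=device, location=[[0.0, 2.0, 2.0]]),
    ),
)

images = renderer(mesh)  # (1, 256, 256, 4) RGBA tensor
```

An analysis-by-synthesis loop of the kind the abstract mentions would compare `images` against an observed photo and backpropagate the resulting loss into the mesh geometry.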


Cited by
Proceedings ArticleDOI
09 Aug 2021
TL;DR: A course presentation on loss functions for neural rendering, given by Jun-Yan Zhu.
Abstract: Loss functions for Neural Rendering, presented by Jun-Yan Zhu.

174 citations
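
The listing above preserves only the course title. Purely as an illustration of the kind of objective such a course covers, here is a common neural-rendering reconstruction loss combining a pixel-wise L1 term with a VGG-feature ("perceptual") term; the network choice, layer cut, and weighting are generic conventions, not taken from the course itself.

```python
# Illustrative only: a pixel-wise L1 loss plus a VGG-feature ("perceptual")
# loss, a combination widely used for supervising neural renderers.
# Nothing here is taken from the course material itself.
import torch
import torch.nn.functional as F
from torchvision.models import vgg16

# Frozen VGG16 feature extractor (first 16 layers, up to relu3_3).
_vgg = vgg16(weights="IMAGENET1K_V1").features[:16].eval()
for p in _vgg.parameters():
    p.requires_grad_(False)

def rendering_loss(pred, target, perceptual_weight=0.1):
    """pred, target: (N, 3, H, W) images in [0, 1]. For brevity this
    sketch skips the ImageNet mean/std normalization VGG expects."""
    l1 = F.l1_loss(pred, target)
    perceptual = F.mse_loss(_vgg(pred), _vgg(target))
    return l1 + perceptual_weight * perceptual
```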

Proceedings ArticleDOI
01 Jun 2021
TL;DR: AutoFlow renders synthetic training data for optical flow using a layered approach in which the motion, shape, and appearance of each layer are controlled by learnable hyperparameters; it achieves state-of-the-art accuracy in pre-training both PWC-Net and RAFT.
Abstract: Synthetic datasets play a critical role in pre-training CNN models for optical flow, but they are painstaking to generate and hard to adapt to new applications. To automate the process, we present AutoFlow, a simple and effective method to render training data for optical flow that optimizes the performance of a model on a target dataset. AutoFlow takes a layered approach to render synthetic data, where the motion, shape, and appearance of each layer are controlled by learnable hyperparameters. Experimental results show that AutoFlow achieves state-of-the-art accuracy in pre-training both PWC-Net and RAFT. Our code and data are available at autoflow-google.github.io.

79 citations
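
As a toy illustration of the layered idea in the AutoFlow abstract, the sketch below composites a foreground layer over a background with a learnable 2D translation, which directly defines the ground-truth optical flow inside the foreground mask. This is a simplification of the concept, not the authors' implementation; their code is at autoflow-google.github.io.

```python
# A toy, two-layer version of the layered idea described above: a
# foreground layer moves over a background by a learnable 2D translation,
# which directly defines the ground-truth flow inside the foreground mask.
# This is a sketch of the concept, not the authors' implementation.
import torch

def render_pair(background, foreground, mask, motion):
    """background, foreground: (3, H, W); mask: (1, H, W) in {0, 1};
    motion: (2,) translation in pixels, ordered (dy, dx)."""
    frame1 = mask * foreground + (1 - mask) * background
    # Integer shifts keep the toy simple; AutoFlow parameterizes the
    # motion, shape, and appearance of each layer far more richly.
    dy, dx = [int(m.round()) for m in motion]
    fg2 = torch.roll(foreground, shifts=(dy, dx), dims=(1, 2))
    mask2 = torch.roll(mask, shifts=(dy, dx), dims=(1, 2))
    frame2 = mask2 * fg2 + (1 - mask2) * background
    flow = mask * motion.view(2, 1, 1)  # per-pixel (dy, dx), zero elsewhere
    return frame1, frame2, flow
```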

05 Mar 2010

68 citations

Posted Content
TL;DR: The paper introduces the Multi-View Transformation Network (MVTN), which regresses optimal view-points for 3D shape recognition by building on advances in differentiable rendering, and shows that MVTN can provide network robustness against rotation and occlusion in the 3D domain.
Abstract: Multi-view projection methods have demonstrated their ability to reach state-of-the-art performance on 3D shape recognition. Those methods learn different ways to aggregate information from multiple views. However, the camera view-points for those views tend to be heuristically set and fixed for all shapes. To circumvent the lack of dynamism of current multi-view methods, we propose to learn those view-points. In particular, we introduce the Multi-View Transformation Network (MVTN) that regresses optimal view-points for 3D shape recognition, building upon advances in differentiable rendering. As a result, MVTN can be trained end-to-end along with any multi-view network for 3D shape classification. We integrate MVTN in a novel adaptive multi-view pipeline that can render either 3D meshes or point clouds. MVTN exhibits clear performance gains in the tasks of 3D shape classification and 3D shape retrieval without the need for extra training supervision. In these tasks, MVTN achieves state-of-the-art performance on ModelNet40, ShapeNet Core55, and the most recent and realistic ScanObjectNN dataset (up to 6% improvement). Interestingly, we also show that MVTN can provide network robustness against rotation and occlusion in the 3D domain.

62 citations
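
A rough sketch of the viewpoint-regression idea behind MVTN: a small MLP maps a global shape feature to per-view azimuth and elevation angles, which a differentiable renderer would then use to produce the views for the downstream multi-view classifier. The layer sizes and tanh-bounded angle ranges here are illustrative guesses, not the paper's architecture.

```python
# Sketch of MVTN-style viewpoint regression: a small MLP predicts
# per-view camera angles from a global shape descriptor. Layer sizes
# and angle ranges are illustrative guesses, not the paper's design.
import torch
import torch.nn as nn

class ViewpointRegressor(nn.Module):
    def __init__(self, feat_dim=1024, n_views=8):
        super().__init__()
        self.n_views = n_views
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim, 256), nn.ReLU(),
            nn.Linear(256, 2 * n_views),  # (azimuth, elevation) per view
        )

    def forward(self, shape_feat):
        # shape_feat: (N, feat_dim) global descriptor of the mesh/points.
        angles = torch.tanh(self.mlp(shape_feat)).view(-1, self.n_views, 2)
        azim = angles[..., 0] * 180.0   # degrees in [-180, 180]
        elev = angles[..., 1] * 90.0    # degrees in [-90, 90]
        return azim, elev  # fed to a differentiable renderer's cameras
```

Because the renderer is differentiable, the classification loss on the rendered views can backpropagate through the cameras into this regressor, which is what makes end-to-end training of the view-points possible.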

Henri Gouraud
01 Jan 1982
TL;DR: In this paper, a procedure for computing shaded pictures of curved surfaces is presented: the surface is approximated by small polygons to simplify the hidden-parts problem, but the shading of each polygon is computed so that discontinuities of shade are eliminated across the surface and a smooth appearance is obtained.
Abstract: A procedure for computing shaded pictures of curved surfaces is presented. The surface is approximated by small polygons in order to solve the hidden-parts problem easily, but the shading of each polygon is computed so that discontinuities of shade are eliminated across the surface and a smooth appearance is obtained. To achieve speed efficiency, the technique developed by Watkins is used, which makes a hardware implementation of this algorithm possible.

60 citations
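
The abstract describes the core of Gouraud shading: light each vertex once, then interpolate those intensities across each polygon so shade varies continuously over the surface. The sketch below shows the two steps, Lambertian vertex lighting and barycentric interpolation within a triangle; NumPy and the specific normals are illustrative choices, and the original's scanline and hardware efficiency techniques are omitted.

```python
# A compact sketch of Gouraud's scheme: light each vertex once, then
# linearly interpolate those vertex intensities across the triangle so
# shading varies smoothly and polygon edges disappear. NumPy is used
# for brevity; the original gains speed from Watkins-style scanline
# hardware, which this sketch omits.
import numpy as np

def vertex_intensity(normal, light_dir, ambient=0.1):
    """Lambertian intensity at one vertex."""
    n = normal / np.linalg.norm(normal)
    l = light_dir / np.linalg.norm(light_dir)
    return ambient + max(np.dot(n, l), 0.0)

def gouraud_shade(bary, vertex_intensities):
    """bary: (w0, w1, w2) barycentric coordinates of a pixel inside the
    triangle; vertex_intensities: the three precomputed intensities."""
    return float(np.dot(bary, vertex_intensities))

# Example: a pixel at the triangle's centroid averages the three values.
light = np.array([0.0, 0.0, 1.0])
normals = [np.array([0.0, 0.0, 1.0]),
           np.array([0.3, 0.0, 0.95]),
           np.array([-0.3, 0.0, 0.95])]
iv = [vertex_intensity(n, light) for n in normals]
print(gouraud_shade((1/3, 1/3, 1/3), iv))
```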