Posted Content

DiffusionNet: Discretization Agnostic Learning on Surfaces

TL;DR: In this paper, a simple diffusion layer is proposed for spatial communication on 3D mesh surfaces, and the spatial support of diffusion is optimized as a continuous network parameter ranging from purely local to totally global, removing the burden of manually choosing neighborhood sizes.
Abstract: We introduce a new approach to deep learning on 3D surfaces, based on the insight that a simple diffusion layer is highly effective for spatial communication. The resulting networks automatically generalize across different samplings and resolutions of a surface -- a basic property which is crucial for practical applications. Our networks can be discretized on various geometric representations such as triangle meshes or point clouds, and can even be trained on one representation then applied to another. We optimize the spatial support of diffusion as a continuous network parameter ranging from purely local to totally global, removing the burden of manually choosing neighborhood sizes. The only other ingredients in the method are a multi-layer perceptron applied independently at each point, and spatial gradient features to support directional filters. The resulting networks are simple, robust, and efficient. Here, we focus primarily on triangle mesh surfaces, and demonstrate state-of-the-art results for a variety of tasks including surface classification, segmentation, and non-rigid correspondence.
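To illustrate the core idea, the sketch below implements a diffusion layer with a learned diffusion time, evaluated in the eigenbasis of the discretized Laplace-Beltrami operator, which is one way the paper describes discretizing diffusion. The tensor names and the per-channel choice of diffusion times are illustrative assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn as nn

class LearnedTimeDiffusion(nn.Module):
    """Diffuse per-vertex features for a learned time per channel,
    evaluated in the spectral basis of the (discretized) Laplace-Beltrami
    operator. A sketch of the idea; names and details are illustrative."""

    def __init__(self, n_channels):
        super().__init__()
        # One learnable diffusion time per feature channel (kept positive via exp).
        self.log_t = nn.Parameter(torch.zeros(n_channels))

    def forward(self, x, evals, evecs, mass):
        # x:     (V, C) per-vertex features
        # evals: (K,)   Laplace-Beltrami eigenvalues
        # evecs: (V, K) corresponding eigenvectors
        # mass:  (V,)   lumped vertex masses (areas)
        t = torch.exp(self.log_t)                        # (C,) positive diffusion times
        # Project onto the spectral basis (mass-weighted inner product).
        coeffs = evecs.T @ (mass[:, None] * x)           # (K, C)
        # Diffusion for time t is exp(-lambda * t) in the spectral domain.
        decay = torch.exp(-evals[:, None] * t[None, :])  # (K, C)
        # Map back to vertices.
        return evecs @ (coeffs * decay)                  # (V, C)
```

Because the spatial support grows continuously with t, the learned times can range from nearly local smoothing to globally supported filters without any hand-chosen neighborhood size.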
Citations
Posted Content
TL;DR: SyNoRiM as mentioned in this paper is a method to jointly register multiple non-rigid shapes by synchronizing the maps relating learned functions defined on the point clouds, which achieves state-of-the-art performance in registration accuracy.
Abstract: We present SyNoRiM, a novel way to jointly register multiple non-rigid shapes by synchronizing the maps relating learned functions defined on the point clouds. Even though the ability to process non-rigid shapes is critical in various applications ranging from computer animation to 3D digitization, the literature still lacks a robust and flexible framework to match and align a collection of real, noisy scans observed under occlusions. Given a set of such point clouds, our method first computes the pairwise correspondences parameterized via functional maps. We simultaneously learn potentially non-orthogonal basis functions to effectively regularize the deformations, while handling the occlusions in an elegant way. To maximally benefit from the multi-way information provided by the inferred pairwise deformation fields, we synchronize the pairwise functional maps into a cycle-consistent whole thanks to our novel and principled optimization formulation. We demonstrate via extensive experiments that our method achieves a state-of-the-art performance in registration accuracy, while being flexible and efficient as we handle both non-rigid and multi-body cases in a unified framework and avoid the costly optimization over point-wise permutations by the use of basis function maps.
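For context, a pairwise functional map between two shapes can be estimated as a small least-squares problem relating corresponding functions expressed in per-shape bases. The snippet below is a generic, simplified version of that step, not SyNoRiM's actual formulation (which additionally learns non-orthogonal bases, handles occlusions, and synchronizes the maps into a cycle-consistent whole).

```python
import torch

def pairwise_functional_map(feats_a, feats_b, basis_a, basis_b):
    """Solve for C such that C @ (basis_a^T feats_a) ~= basis_b^T feats_b.
    A generic least-squares functional-map estimate; illustrative only.
    feats_*: (V_*, D) descriptor functions, basis_*: (V_*, K) basis functions."""
    # Express the shared descriptor functions in each shape's basis.
    A = basis_a.T @ feats_a   # (K, D) coefficients on shape A
    B = basis_b.T @ feats_b   # (K, D) coefficients on shape B
    # C maps coefficients on A to coefficients on B: minimize ||C A - B||_F.
    C = torch.linalg.lstsq(A.T, B.T).solution.T   # (K, K)
    return C
```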

4 citations

Posted Content
TL;DR: In this article, the authors transform the discretization of the surfaces into semi-regular meshes that have locally regular connectivity and a hierarchical meshing. This lets them apply the same spatial convolutional filters to local neighborhoods, define a pooling operator applicable to every semi-regular mesh, and visualize the underlying dynamics of unseen mesh sequences with an autoencoder trained on different classes of meshes.
Abstract: The analysis of deforming 3D surface meshes is accelerated by autoencoders since the low-dimensional embeddings can be used to visualize underlying dynamics. But, state-of-the-art mesh convolutional autoencoders require a fixed connectivity of all input meshes handled by the autoencoder. This is due to either the use of spectral convolutional layers or mesh dependent pooling operations. Therefore, the types of datasets that one can study are limited and the learned knowledge cannot be transferred to other datasets that exhibit similar behavior. To address this, we transform the discretization of the surfaces to semi-regular meshes that have a locally regular connectivity and whose meshing is hierarchical. This allows us to apply the same spatial convolutional filters to the local neighborhoods and to define a pooling operator that can be applied to every semi-regular mesh. We apply the same mesh autoencoder to different datasets and our reconstruction error is more than 50% lower than the error from state-of-the-art models, which have to be trained for every mesh separately. Additionally, we visualize the underlying dynamics of unseen mesh sequences with an autoencoder trained on different classes of meshes.
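The key property of a semi-regular mesh is that each coarse face is subdivided into a fixed number of finer faces, so pooling can be written as a fixed-fan-in aggregation. The sketch below shows one plausible way to pool face features across one hierarchy level, under the assumption of 1-to-4 subdivision with the children of each coarse face stored contiguously; this layout is a hypothetical simplification, not the paper's exact operator.

```python
import torch

def semi_regular_pool(face_feats):
    """Average-pool features over an assumed 1-to-4 face subdivision.
    face_feats: (F, C) with the 4 children of each coarse face stored
    contiguously; returns (F // 4, C). Illustrative layout assumption."""
    F, C = face_feats.shape
    assert F % 4 == 0, "expected a semi-regular mesh with 1-to-4 subdivision"
    return face_feats.view(F // 4, 4, C).mean(dim=1)
```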

2 citations

Journal ArticleDOI
TL;DR: Wu et al. as mentioned in this paper proposed a new convolutional neural network, called Multi Point-Voxel Convolution (MPVConv), for deep learning on point clouds.
Abstract: The existing 3D deep learning methods adopt either individual point-based features or local-neighboring voxel-based features, and demonstrate great potential for processing 3D data. However, the point-based models are inefficient due to the unordered nature of point clouds and the voxel-based models suffer from large information loss. Motivated by the success of recent point-voxel representation, such as PVCNN and DRINet, we propose a new convolutional neural network, called Multi Point-Voxel Convolution (MPVConv), for deep learning on point clouds. Integrating both the advantages of voxel and point-based methods, MPVConv can effectively increase the neighboring collection between point-based features and also promote independence among voxel-based features. Extensive experiments on benchmark datasets such as ShapeNet Part, S3DIS and KITTI for various tasks show that MPVConv improves the accuracy of the backbone (PointNet) by up to 36%, and achieves higher accuracy than the voxel-based model with up to 34× speedups. In addition, MPVConv outperforms the state-of-the-art point-based models with up to 8× speedups. Also, our MPVConv only needs 65% of the GPU memory required by the latest point-voxel-based model (DRINet). The source code of our method is attached in https://github.com/NWUzhouwei/MPVConv.
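As background on point-voxel convolutions in general (not MPVConv's exact architecture), a fused layer typically voxelizes point features, applies a dense 3D convolution, interpolates back to the points, and adds a per-point branch. The sketch below illustrates that pattern with a simplified nearest-voxel scatter/gather; all names and the fusion scheme are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SimplePointVoxelConv(nn.Module):
    """A minimal point-voxel block: per-point MLP + voxel-branch 3D conv.
    Illustrates the general point-voxel idea, not MPVConv itself."""

    def __init__(self, channels, resolution=16):
        super().__init__()
        self.r = resolution
        self.point_mlp = nn.Sequential(nn.Linear(channels, channels), nn.ReLU())
        self.voxel_conv = nn.Conv3d(channels, channels, kernel_size=3, padding=1)

    def forward(self, xyz, feats):
        # xyz: (N, 3) coordinates normalized to [0, 1]^3, feats: (N, C)
        N, C = feats.shape
        idx = (xyz.clamp(0, 1 - 1e-6) * self.r).long()          # (N, 3) voxel indices
        flat = idx[:, 0] * self.r * self.r + idx[:, 1] * self.r + idx[:, 2]
        # Scatter-mean point features into a dense voxel grid.
        grid = torch.zeros(self.r ** 3, C, device=feats.device)
        count = torch.zeros(self.r ** 3, 1, device=feats.device)
        grid.index_add_(0, flat, feats)
        count.index_add_(0, flat, torch.ones(N, 1, device=feats.device))
        grid = grid / count.clamp(min=1)
        vox = self.voxel_conv(grid.T.reshape(1, C, self.r, self.r, self.r))
        # Gather back to points (nearest voxel) and fuse with the point branch.
        vox_feats = vox.reshape(C, -1).T[flat]                   # (N, C)
        return self.point_mlp(feats) + vox_feats
```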

1 citation

Journal ArticleDOI
TL;DR: Li et al. as discussed by the authors proposed a self-supervised method to learn shape correspondences directly from a group of bone surfaces segmented from CT scans, without any supervision from time-consuming and error-prone manual annotations.
Abstract: Non-rigid registration between 3D surfaces is an important but notorious problem in medical imaging, because finding correspondences between non-isometric instances is mathematically non-trivial. We propose a novel self-supervised method to learn shape correspondences directly from a group of bone surfaces segmented from CT scans, without any supervision from time-consuming and error-prone manual annotations. Relying on a Siamese architecture, DiffusionNet as the feature extractor is jointly trained with a pair of randomly rotated and scaled copies of the same shape. The learned embeddings are aligned in spectral domain using eigenfunctions of the Laplace-Beltrami Operator. Additional normalization and regularization losses are incorporated to guide the learned embeddings towards a similar uniform representation over spectrum, which promotes the embeddings to encode multiscale features and advocates sparsity and diagonality of the inferred functional maps. Our method achieves state-of-the-art results among the unsupervised methods on several benchmarks, and presents greater robustness and efficacy in registering moderately deformed shapes. A hybrid refinement strategy is proposed to retrieve smooth and close-to-conformal point-to-point correspondences from the inferred functional map. Our method is orientation and discretization-invariant. Given a pair of near-isometric surfaces, our method automatically computes registration in high accuracy, and outputs anatomically meaningful correspondences. In this study, we show that it is possible to use neural networks to learn general embeddings from 3D shapes in a self-supervised way. The learned features are multiscale, informative, and discriminative, which might potentially benefit almost all types of morphology-related downstream tasks, such as diagnostics, data screening and statistical shape analysis in future.
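One standard way to turn an inferred functional map into point-to-point correspondences, which the authors then refine further, is a nearest-neighbor search between spectral embeddings. The snippet below shows that basic conversion step only; the paper's hybrid refinement strategy is not reproduced here.

```python
import numpy as np
from scipy.spatial import cKDTree

def functional_map_to_p2p(C, evecs_src, evecs_tgt):
    """Convert a functional map C (mapping spectral coefficients on the source
    to the target) into point-to-point correspondences target -> source via
    nearest neighbors in the spectral domain. Basic version, no refinement.
    C: (K, K), evecs_src: (V_src, K), evecs_tgt: (V_tgt, K)."""
    # Align source spectral embeddings into the target's spectral coordinates.
    aligned_src = evecs_src @ C.T          # (V_src, K)
    tree = cKDTree(aligned_src)
    _, matches = tree.query(evecs_tgt)     # nearest source vertex per target vertex
    return matches                          # (V_tgt,) indices into source vertices
```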

1 citation

Posted Content
TL;DR: DPFM as discussed by the authors uses the functional map framework, can be trained in a supervised or unsupervised manner, and learns descriptors directly from the data, thus improving both robustness and accuracy in challenging cases.
Abstract: We consider the problem of computing dense correspondences between non-rigid shapes with potentially significant partiality. Existing formulations tackle this problem through heavy manifold optimization in the spectral domain, given hand-crafted shape descriptors. In this paper, we propose the first learning method aimed directly at partial non-rigid shape correspondence. Our approach uses the functional map framework, can be trained in a supervised or unsupervised manner, and learns descriptors directly from the data, thus both improving robustness and accuracy in challenging cases. Furthermore, unlike existing techniques, our method is also applicable to partial-to-partial non-rigid matching, in which the common regions on both shapes are unknown a priori. We demonstrate that the resulting method is data-efficient, and achieves state-of-the-art results on several benchmark datasets. Our code and data can be found online: https://github.com/pvnieo/DPFM
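Functional-map learning pipelines of this kind are typically trained either with a supervised penalty against a ground-truth map or with structural penalties (such as orthogonality and cycle consistency) when no labels are available. The losses sketched below are generic examples of those two regimes, not DPFM's specific objective.

```python
import torch

def supervised_fmap_loss(C_pred, C_gt):
    """Supervised regime: penalize deviation from a ground-truth functional map."""
    return ((C_pred - C_gt) ** 2).sum()

def unsupervised_fmap_loss(C_ab, C_ba):
    """Unsupervised regime: generic structural penalties on a pair of maps.
    Orthogonality (near-isometry) and cycle consistency (bijectivity)."""
    k = C_ab.shape[0]
    eye = torch.eye(k, device=C_ab.device)
    ortho = ((C_ab @ C_ab.T - eye) ** 2).sum()
    cycle = ((C_ba @ C_ab - eye) ** 2).sum()
    return ortho + cycle
```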

1 citation

References
Proceedings ArticleDOI
27 Jun 2016
TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously, winning 1st place in the ILSVRC 2015 classification task.
Abstract: Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth. On the ImageNet dataset we evaluate residual nets with a depth of up to 152 layers—8× deeper than VGG nets [40] but still having lower complexity. An ensemble of these residual nets achieves 3.57% error on the ImageNet test set. This result won the 1st place on the ILSVRC 2015 classification task. We also present analysis on CIFAR-10 with 100 and 1000 layers. The depth of representations is of central importance for many visual recognition tasks. Solely due to our extremely deep representations, we obtain a 28% relative improvement on the COCO object detection dataset. Deep residual nets are foundations of our submissions to ILSVRC & COCO 2015 competitions1, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.
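The reformulation described here amounts to each block learning a residual F(x) that is added back to its input through an identity shortcut. A minimal residual block in that style (an illustrative sketch, not the exact architecture from the paper) looks like this:

```python
import torch.nn as nn

class BasicResidualBlock(nn.Module):
    """y = x + F(x): the block learns a residual F with reference to its input.
    Minimal sketch with an identity shortcut (no downsampling/projection)."""

    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        residual = self.bn2(self.conv2(self.relu(self.bn1(self.conv1(x)))))
        return self.relu(x + residual)   # identity shortcut plus learned residual
```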

123,388 citations

Journal ArticleDOI
TL;DR: This paper describes the goals of the PDB, the systems in place for data deposition and access, how to obtain further information, and plans for the future development of the resource.
Abstract: The Protein Data Bank (PDB; http://www.rcsb.org/pdb/ ) is the single worldwide archive of structural data of biological macromolecules. This paper describes the goals of the PDB, the systems in place for data deposition and access, how to obtain further information, and near-term plans for the future development of the resource.

34,239 citations

Proceedings Article
01 Jan 2019
TL;DR: This paper details the principles that drove the implementation of PyTorch and how they are reflected in its architecture, and explains how the careful and pragmatic implementation of the key components of its runtime enables them to work together to achieve compelling performance.
Abstract: Deep learning frameworks have often focused on either usability or speed, but not both. PyTorch is a machine learning library that shows that these two goals are in fact compatible: it was designed from first principles to support an imperative and Pythonic programming style that supports code as a model, makes debugging easy and is consistent with other popular scientific computing libraries, while remaining efficient and supporting hardware accelerators such as GPUs. In this paper, we detail the principles that drove the implementation of PyTorch and how they are reflected in its architecture. We emphasize that every aspect of PyTorch is a regular Python program under the full control of its user. We also explain how the careful and pragmatic implementation of the key components of its runtime enables them to work together to achieve compelling performance. We demonstrate the efficiency of individual subsystems, as well as the overall speed of PyTorch on several commonly used benchmarks.
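The "code as a model" point is that a model is just an ordinary Python program: control flow, printing, and debugging all work as usual during the forward pass, and execution is eager. A tiny example of that imperative style:

```python
import torch
import torch.nn as nn

class TinyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        # Ordinary Python control flow participates in the model definition.
        if x.norm() > 1.0:
            x = x / x.norm()
        return self.fc(x)

model = TinyModel()
out = model(torch.randn(4))        # eager execution: runs immediately
out.sum().backward()               # gradients via autograd
print(model.fc.weight.grad.shape)  # torch.Size([2, 4])
```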

10,045 citations

Proceedings ArticleDOI
21 Jul 2017
TL;DR: This paper designs a novel type of neural network that directly consumes point clouds, which well respects the permutation invariance of points in the input and provides a unified architecture for applications ranging from object classification, part segmentation, to scene semantic parsing.
Abstract: Point cloud is an important type of geometric data structure. Due to its irregular format, most researchers transform such data to regular 3D voxel grids or collections of images. This, however, renders data unnecessarily voluminous and causes issues. In this paper, we design a novel type of neural network that directly consumes point clouds, which well respects the permutation invariance of points in the input. Our network, named PointNet, provides a unified architecture for applications ranging from object classification, part segmentation, to scene semantic parsing. Though simple, PointNet is highly efficient and effective. Empirically, it shows strong performance on par or even better than state of the art. Theoretically, we provide analysis towards understanding of what the network has learnt and why the network is robust with respect to input perturbation and corruption.
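Permutation invariance in this design comes from applying a shared per-point MLP and then aggregating with a symmetric function (max pooling). The sketch below shows that core classification pattern in simplified form; it omits the input and feature transform (T-Net) modules of the full PointNet.

```python
import torch
import torch.nn as nn

class MiniPointNet(nn.Module):
    """Shared per-point MLP + symmetric max-pool = permutation-invariant
    global feature. Simplified sketch without the T-Net alignment modules."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.point_mlp = nn.Sequential(
            nn.Linear(3, 64), nn.ReLU(),
            nn.Linear(64, 1024), nn.ReLU(),
        )
        self.classifier = nn.Sequential(
            nn.Linear(1024, 256), nn.ReLU(),
            nn.Linear(256, num_classes),
        )

    def forward(self, points):
        # points: (B, N, 3); the same MLP is applied to every point independently.
        per_point = self.point_mlp(points)          # (B, N, 1024)
        global_feat = per_point.max(dim=1).values   # symmetric over point ordering
        return self.classifier(global_feat)         # (B, num_classes)
```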

9,457 citations

Journal ArticleDOI
TL;DR: In this article, the authors proposed a geometrically motivated algorithm for representing high-dimensional data, based on the correspondence between the graph Laplacian, the Laplace Beltrami operator on the manifold and the connections to the heat equation.
Abstract: One of the central problems in machine learning and pattern recognition is to develop appropriate representations for complex data. We consider the problem of constructing a representation for data lying on a low-dimensional manifold embedded in a high-dimensional space. Drawing on the correspondence between the graph Laplacian, the Laplace Beltrami operator on the manifold, and the connections to the heat equation, we propose a geometrically motivated algorithm for representing the high-dimensional data. The algorithm provides a computationally efficient approach to nonlinear dimensionality reduction that has locality-preserving properties and a natural connection to clustering. Some potential applications and illustrative examples are discussed.
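In its simplest discrete form, the algorithm builds a neighborhood graph over the data, forms the graph Laplacian L = D - W, and embeds each point using the eigenvectors of the generalized problem L y = lambda D y with the smallest nonzero eigenvalues. A compact dense version of that recipe is sketched below; the kNN graph and heat-kernel weights are common choices rather than requirements.

```python
import numpy as np
from scipy.linalg import eigh
from sklearn.neighbors import kneighbors_graph

def laplacian_eigenmaps(X, n_components=2, n_neighbors=10, sigma=1.0):
    """Embed points X (n, d) into n_components dimensions using eigenvectors
    of the graph Laplacian. Dense, small-n sketch for illustration."""
    # Symmetric kNN graph with heat-kernel weights exp(-||xi - xj||^2 / sigma).
    dists = kneighbors_graph(X, n_neighbors, mode="distance").toarray()
    W = np.exp(-dists ** 2 / sigma) * (dists > 0)
    W = np.maximum(W, W.T)
    D = np.diag(W.sum(axis=1))
    L = D - W
    # Generalized eigenproblem L y = lambda D y; skip the trivial constant vector.
    evals, evecs = eigh(L, D)
    return evecs[:, 1:n_components + 1]
```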

7,210 citations