Posted Content

PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space

TL;DR: A hierarchical neural network that applies PointNet recursively on a nested partitioning of the input point set, with novel set learning layers that adaptively combine features from multiple scales, learning deep point set features efficiently and robustly.
Abstract: Few prior works study deep learning on point sets. PointNet by Qi et al. is a pioneer in this direction. However, by design PointNet does not capture local structures induced by the metric space the points live in, limiting its ability to recognize fine-grained patterns and to generalize to complex scenes. In this work, we introduce a hierarchical neural network that applies PointNet recursively on a nested partitioning of the input point set. By exploiting metric space distances, our network is able to learn local features with increasing contextual scales. We further observe that point sets are usually sampled with varying densities, which greatly degrades the performance of networks trained on uniform densities, and we therefore propose novel set learning layers to adaptively combine features from multiple scales. Experiments show that our network, called PointNet++, is able to learn deep point set features efficiently and robustly. In particular, results significantly better than the state of the art have been obtained on challenging benchmarks of 3D point clouds.
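The nested partitioning described in the abstract is seeded with iterative farthest point sampling (FPS), the sampling step that several of the citing works below call out. A minimal NumPy sketch of FPS, illustrative only (all names are ours, not the authors' code):

```python
import numpy as np

def farthest_point_sampling(points, n_samples):
    """Greedily pick n_samples points, each maximizing its distance
    to the set already chosen (illustrative sketch)."""
    n = points.shape[0]
    chosen = np.zeros(n_samples, dtype=np.int64)
    dist = np.full(n, np.inf)        # distance to nearest chosen point so far
    chosen[0] = np.random.randint(n)  # arbitrary seed point
    for i in range(1, n_samples):
        diff = points - points[chosen[i - 1]]
        dist = np.minimum(dist, np.einsum('ij,ij->i', diff, diff))
        chosen[i] = np.argmax(dist)   # point farthest from all chosen so far
    return points[chosen]

# Usage: downsample 1024 random 3D points to 128 well-spread centroids.
pts = np.random.rand(1024, 3)
centroids = farthest_point_sampling(pts, 128)
```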
Citations
Posted Content
TL;DR: PyTorch Geometric is introduced, a library for deep learning on irregularly structured input data such as graphs, point clouds and manifolds, built upon PyTorch, and a comprehensive comparative study of the implemented methods in homogeneous evaluation scenarios is performed.
Abstract: We introduce PyTorch Geometric, a library for deep learning on irregularly structured input data such as graphs, point clouds and manifolds, built upon PyTorch. In addition to general graph data structures and processing methods, it contains a variety of recently published methods from the domains of relational learning and 3D data processing. PyTorch Geometric achieves high data throughput by leveraging sparse GPU acceleration, by providing dedicated CUDA kernels and by introducing efficient mini-batch handling for input examples of different size. In this work, we present the library in detail and perform a comprehensive comparative study of the implemented methods in homogeneous evaluation scenarios.
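As the citation snippets below illustrate, the library ships the PointNet++ building blocks directly. A minimal sketch of one set-abstraction-style pooling step using its fps and radius helpers (assuming a recent torch_geometric with torch_cluster installed; shapes and hyperparameters are illustrative):

```python
import torch
from torch_geometric.nn import fps, radius

# Toy point cloud: 1024 points in 3D, one example (batch vector of zeros).
pos = torch.rand(1024, 3)
batch = torch.zeros(1024, dtype=torch.long)

# Farthest point sampling keeps 25% of the points as centroids ...
idx = fps(pos, batch, ratio=0.25)

# ... then a query ball builds edges from each centroid to the input
# points within radius r, as in a PointNet++ set abstraction level.
# row indexes the centroids, col the original points.
row, col = radius(pos, pos[idx], r=0.2,
                  batch_x=batch, batch_y=batch[idx],
                  max_num_neighbors=32)
```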

2,308 citations


Cites background or methods from "PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space"

  • ...As hierarchical pooling layers, we use the iterative farthest point sampling algorithm followed by a new graph generation based on a larger query ball (PointNet++ (Qi et al., 2017), MPNN (Gilmer et al....

  • ...For learning on point clouds, manifolds and graphs with multidimensional edge features, we provide the relational GCN operator from Schlichtkrull et al. (2018), PointNet++ (Qi et al., 2017), PointCNN (Li et al., 2018), and the continuous kernel-based methods MPNN (Gilmer et al., 2017), MoNet (Monti et al., 2017), SplineCNN (Fey et al., 2018) and the edge convolution operator (EdgeCNN) from Wang et al. (2018b)....

  • ...…(Simonovsky & Komodakis, 2017), the iterative farthest point sampling algorithm (Qi et al., 2017) followed by k-NN or query ball graph generation (Qi et al., 2017; Wang et al., 2018b), and differentiable pooling mechanisms such as DiffPool (Ying et al., 2018) and topk pooling (Gao & Ji, 2018;…...

  • ...…al., 2007; Fagginger Auer & Bisseling, 2011) and voxel grid pooling (Simonovsky & Komodakis, 2017), the iterative farthest point sampling algorithm (Qi et al., 2017) followed by k-NN or query ball graph generation (Qi et al., 2017; Wang et al., 2018b), and differentiable pooling mechanisms such as…...

  • ...…pooling layers, we use the iterative farthest point sampling algorithm followed by a new graph generation based on a larger query ball (PointNet++ (Qi et al., 2017), MPNN (Gilmer et al., 2017) and SplineCNN (Fey et al., 2018)) or based on a fixed number of nearest neighbors (EdgeCNN (Wang et al.,…...

Proceedings ArticleDOI
18 Apr 2019
TL;DR: KPConv is a new design of point convolution that operates directly on point clouds without any intermediate representation and outperforms state-of-the-art classification and segmentation approaches on several datasets.
Abstract: We present Kernel Point Convolution (KPConv), a new design of point convolution that operates on point clouds without any intermediate representation. The convolution weights of KPConv are located in Euclidean space by kernel points, and applied to the input points close to them. Its capacity to use any number of kernel points gives KPConv more flexibility than fixed grid convolutions. Furthermore, these locations are continuous in space and can be learned by the network. Therefore, KPConv can be extended to deformable convolutions that learn to adapt kernel points to local geometry. Thanks to a regular subsampling strategy, KPConv is also efficient and robust to varying densities. Whether they use deformable KPConv for complex tasks, or rigid KPConv for simpler tasks, our networks outperform state-of-the-art classification and segmentation approaches on several datasets. We also offer ablation studies and visualizations to provide understanding of what has been learned by KPConv and to validate the descriptive power of deformable KPConv.
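From the description above, a rigid KPConv correlates each neighbor with a set of fixed kernel points through a distance-based influence and applies a per-kernel-point weight matrix. A minimal NumPy sketch under our own simplifications (linear influence with radius sigma; all names are ours, not the authors' code):

```python
import numpy as np

def rigid_kpconv(neighbors, feats, kernel_pts, weights, sigma=0.3):
    """neighbors: (N, 3) offsets of neighbor points from the center point.
    feats: (N, Cin) neighbor features.
    kernel_pts: (K, 3) fixed kernel point locations in Euclidean space.
    weights: (K, Cin, Cout) one weight matrix per kernel point.
    Returns a (Cout,) output feature for the center point."""
    # Influence of each kernel point on each neighbor: linear falloff
    # with distance, clipped at zero (one common choice; a simplification).
    d = np.linalg.norm(neighbors[:, None, :] - kernel_pts[None, :, :], axis=-1)  # (N, K)
    infl = np.maximum(0.0, 1.0 - d / sigma)
    # Weight each neighbor feature by each kernel point's influence,
    # apply that kernel point's weight matrix, and sum over neighbors,
    # kernel points, and input channels.
    return np.einsum('nk,nc,kco->o', infl, feats, weights)
```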

1,742 citations


Cites methods from "PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space"

  • ...Following PointNet, some hierarchical architectures have been developed to aggregate local neighborhood information with MLPs [26, 18, 20]....

  • ...For benchmarking purpose, we use data provided by [26]....

Journal ArticleDOI
06 Oct 2018, Sensors
TL;DR: An improved sparse convolution method for voxel-based 3D convolutional networks that significantly increases the speed of both training and inference, together with a new form of angle loss regression that improves orientation estimation performance.
Abstract: LiDAR-based or RGB-D-based object detection is used in numerous applications, ranging from autonomous driving to robot vision. Voxel-based 3D convolutional networks have been used for some time to enhance the retention of information when processing point cloud LiDAR data. However, problems remain, including a slow inference speed and low orientation estimation performance. We therefore investigate an improved sparse convolution method for such networks, which significantly increases the speed of both training and inference. We also introduce a new form of angle loss regression to improve the orientation estimation performance and a new data augmentation approach that can enhance the convergence speed and performance. The proposed network produces state-of-the-art results on the KITTI 3D object detection benchmarks while maintaining a fast inference speed.
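The "new form of angle loss regression" mentioned above is, to our understanding, built on the sine of the angle difference between prediction and ground truth, which removes the ambiguity of orientations that differ by pi (a separate direction classifier would disambiguate the flip). A minimal PyTorch sketch of that sine-error idea; the function name and exact reduction are our assumptions:

```python
import torch
import torch.nn.functional as F

def sine_angle_loss(pred_yaw, target_yaw):
    """Regress the sine of the angle difference instead of the raw
    difference: boxes rotated by pi relative to each other incur
    near-zero loss. (Sketch of the idea, not the authors' exact code.)"""
    return F.smooth_l1_loss(torch.sin(pred_yaw - target_yaw),
                            torch.zeros_like(pred_yaw))
```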

1,624 citations

Proceedings Article
03 Dec 2018
TL;DR: This work proposes to learn an X-transformation from the input points to simultaneously promote two causes: the first is the weighting of the input features associated with the points, and the second is the permutation of the points into a latent and potentially canonical order.
Abstract: We present a simple and general framework for feature learning from point clouds. The key to the success of CNNs is the convolution operator that is capable of leveraging spatially-local correlation in data represented densely in grids (e.g. images). However, point clouds are irregular and unordered, thus directly convolving kernels against features associated with the points will result in desertion of shape information and variance to point ordering. To address these problems, we propose to learn an X-transformation from the input points to simultaneously promote two causes: the first is the weighting of the input features associated with the points, and the second is the permutation of the points into a latent and potentially canonical order. Element-wise product and sum operations of the typical convolution operator are subsequently applied on the X-transformed features. The proposed method is a generalization of typical CNNs to feature learning from point clouds, thus we call it PointCNN. Experiments show that PointCNN achieves on par or better performance than state-of-the-art methods on multiple challenging benchmark datasets and tasks.
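Read literally, the abstract suggests the following shape for one X-Conv step: an MLP maps the K neighbor coordinates to a K x K matrix, the matrix weights and permutes the neighbor features, and an ordinary convolution (here a single linear layer) follows. The sketch below is our illustrative reading, not the authors' implementation:

```python
import torch
import torch.nn as nn

class XConvSketch(nn.Module):
    """Learn a K x K transform X from K neighbor coordinates, apply it to
    the neighbor features, then apply a conventional convolution."""
    def __init__(self, k, c_in, c_out):
        super().__init__()
        self.k = k
        self.mlp_x = nn.Sequential(nn.Linear(k * 3, k * k), nn.ReLU(),
                                   nn.Linear(k * k, k * k))
        self.conv = nn.Linear(k * c_in, c_out)

    def forward(self, neighbor_xyz, neighbor_feats):
        # neighbor_xyz: (B, K, 3) local coords; neighbor_feats: (B, K, C_in)
        b = neighbor_xyz.size(0)
        x = self.mlp_x(neighbor_xyz.reshape(b, -1)).reshape(b, self.k, self.k)
        transformed = torch.bmm(x, neighbor_feats)      # weight + permute
        return self.conv(transformed.reshape(b, -1))    # (B, C_out)
```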

1,535 citations


Cites methods from "PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space"

  • ...PointNet++ [35] and SO-Net [27] applies PointNet hierarchically for better capturing of local structures....

  • ...Nevertheless, PointCNN built with X-Conv is still significantly better than a direct application of typical convolutions on point clouds, and on par or better than state-of-the-art neural networks designed for point cloud input data, such as PointNet++ [35]....

  • ...Methods PointNet [33] PointNet++ [35] 3DmFV-Net [4] DGCNN [50] SpecGCN [46] PCNN [3] PointCNN Parameters 3....

Proceedings ArticleDOI
15 Jun 2019
TL;DR: The dynamic filter is extended to a new convolution operation, named PointConv, which can be applied on point clouds to build deep convolutional networks and achieves state-of-the-art results on challenging semantic segmentation benchmarks on 3D point clouds.
Abstract: Unlike images which are represented in regular dense grids, 3D point clouds are irregular and unordered, hence applying convolution on them can be difficult. In this paper, we extend the dynamic filter to a new convolution operation, named PointConv. PointConv can be applied on point clouds to build deep convolutional networks. We treat convolution kernels as nonlinear functions of the local coordinates of 3D points comprised of weight and density functions. With respect to a given point, the weight functions are learned with multi-layer perceptron networks and the density functions through kernel density estimation. A novel reformulation is proposed for efficiently computing the weight functions, which allowed us to dramatically scale up the network and significantly improve its performance. The learned convolution kernel can be used to compute translation-invariant and permutation-invariant convolution on any point set in the 3D space. Besides, PointConv can also be used as deconvolution operators to propagate features from a subsampled point cloud back to its original resolution. Experiments on ModelNet40, ShapeNet, and ScanNet show that deep convolutional neural networks built on PointConv are able to achieve state-of-the-art on challenging semantic segmentation benchmarks on 3D point clouds. Besides, our experiments converting CIFAR-10 into a point cloud showed that networks built on PointConv can match the performance of convolutional networks in 2D images of a similar structure.
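The abstract defines the kernel as weight and density functions of local coordinates: an MLP produces per-neighbor weights, density estimates reweight the features, and everything is summed over the neighborhood. The sketch below is a naive reading of that definition, without the paper's efficient reformulation; names and layer widths are ours:

```python
import torch
import torch.nn as nn

class PointConvSketch(nn.Module):
    """Weights from an MLP over local 3D offsets, features rescaled by
    inverse density, aggregated over the K neighbors. Illustrative only."""
    def __init__(self, c_in, c_mid, c_out):
        super().__init__()
        self.weight_mlp = nn.Sequential(nn.Linear(3, 32), nn.ReLU(),
                                        nn.Linear(32, c_mid))
        self.out = nn.Linear(c_in * c_mid, c_out)

    def forward(self, offsets, feats, inv_density):
        # offsets: (B, K, 3), feats: (B, K, C_in), inv_density: (B, K, 1)
        w = self.weight_mlp(offsets)              # (B, K, C_mid) learned weights
        f = feats * inv_density                   # density-reweighted features
        # Sum over neighbors of the outer product of features and weights.
        agg = torch.einsum('bkc,bkm->bcm', f, w)  # (B, C_in, C_mid)
        return self.out(agg.flatten(1))           # (B, C_out)
```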

1,321 citations


Cites background, methods, or results from "PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space"

  • ...We follow the experiment setup in most related work [28, 35, 44, 18]....

  • ...PointNet++ [28] proposes to use distance-based interpolation to propagate features, which is reasonable due to local correlations inside a local region....

  • ...PointNet++ [28] improved the network in PointNet [26] by adding a hierarchical structure....

  • ...The key structure used in both PointNet [26] and PointNet++ [28] to aggregate features from different points is max-pooling....

  • ...The SPLATNet [35] is able to give comparable results as PointNet++ [28]....

References
Proceedings ArticleDOI
27 Jun 2016
TL;DR: In this article, the authors propose a residual learning framework that eases the training of networks substantially deeper than those used previously; it won 1st place on the ILSVRC 2015 classification task.
Abstract: Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth. On the ImageNet dataset we evaluate residual nets with a depth of up to 152 layers—8× deeper than VGG nets [40] but still having lower complexity. An ensemble of these residual nets achieves 3.57% error on the ImageNet test set. This result won the 1st place on the ILSVRC 2015 classification task. We also present analysis on CIFAR-10 with 100 and 1000 layers. The depth of representations is of central importance for many visual recognition tasks. Solely due to our extremely deep representations, we obtain a 28% relative improvement on the COCO object detection dataset. Deep residual nets are foundations of our submissions to ILSVRC & COCO 2015 competitions1, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.
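The reformulation in the abstract, layers learn a residual F(x) and the block outputs F(x) + x, fits in a few lines. The sketch below is a generic basic block in that spirit, not the paper's exact configuration:

```python
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Basic residual block: the stacked layers learn F(x) and the block
    returns F(x) + x via an identity shortcut."""
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.body(x) + x)  # residual plus identity shortcut
```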

123,388 citations

Proceedings Article
01 Jan 2015
TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
Abstract: We introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments. The method is straightforward to implement, is computationally efficient, has little memory requirements, is invariant to diagonal rescaling of the gradients, and is well suited for problems that are large in terms of data and/or parameters. The method is also appropriate for non-stationary objectives and problems with very noisy and/or sparse gradients. The hyper-parameters have intuitive interpretations and typically require little tuning. Some connections to related algorithms, on which Adam was inspired, are discussed. We also analyze the theoretical convergence properties of the algorithm and provide a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework. Empirical results demonstrate that Adam works well in practice and compares favorably to other stochastic optimization methods. Finally, we discuss AdaMax, a variant of Adam based on the infinity norm.
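The description above maps directly to the well-known update rule: exponential moving averages of the gradient and its square, bias-corrected by the step count, then a rescaled step. A compact NumPy sketch of one update:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: m and v are running first/second moment estimates,
    t is the 1-indexed step count used for bias correction."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad**2
    m_hat = m / (1 - b1**t)        # bias-corrected first moment
    v_hat = v / (1 - b2**t)        # bias-corrected second moment
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```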

111,197 citations

Proceedings Article
03 Dec 2012
TL;DR: A large, deep convolutional neural network, consisting of five convolutional layers (some followed by max-pooling layers) and three fully-connected layers with a final 1000-way softmax, achieves state-of-the-art performance on ImageNet classification.
Abstract: We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0% which is considerably better than the previous state-of-the-art. The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax. To make training faster, we used non-saturating neurons and a very efficient GPU implementation of the convolution operation. To reduce overfitting in the fully-connected layers we employed a recently-developed regularization method called "dropout" that proved to be very effective. We also entered a variant of this model in the ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry.
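The stated layer counts can be written down compactly. The sketch below mirrors the description (five conv layers, interleaved max-pooling, three fully-connected layers with dropout, 1000-way output); the channel widths follow the published model, the input is assumed to be 3 x 227 x 227, and the whole block is illustrative:

```python
import torch.nn as nn

alexnet_like = nn.Sequential(
    nn.Conv2d(3, 96, 11, stride=4), nn.ReLU(),
    nn.MaxPool2d(3, stride=2),
    nn.Conv2d(96, 256, 5, padding=2), nn.ReLU(),
    nn.MaxPool2d(3, stride=2),
    nn.Conv2d(256, 384, 3, padding=1), nn.ReLU(),
    nn.Conv2d(384, 384, 3, padding=1), nn.ReLU(),
    nn.Conv2d(384, 256, 3, padding=1), nn.ReLU(),
    nn.MaxPool2d(3, stride=2),
    nn.Flatten(),
    nn.Dropout(0.5), nn.Linear(256 * 6 * 6, 4096), nn.ReLU(),
    nn.Dropout(0.5), nn.Linear(4096, 4096), nn.ReLU(),
    nn.Linear(4096, 1000),  # logits; the 1000-way softmax sits in the loss
)
```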

73,978 citations

Proceedings Article
04 Sep 2014
TL;DR: This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
Abstract: In this work we investigate the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting. Our main contribution is a thorough evaluation of networks of increasing depth using an architecture with very small (3x3) convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers. These findings were the basis of our ImageNet Challenge 2014 submission, where our team secured the first and the second places in the localisation and classification tracks respectively. We also show that our representations generalise well to other datasets, where they achieve state-of-the-art results. We have made our two best-performing ConvNet models publicly available to facilitate further research on the use of deep visual representations in computer vision.
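The key design point, two stacked 3x3 convolutions covering the same receptive field as one 5x5 while using fewer parameters and adding a nonlinearity in between, can be illustrated directly (a sketch; the channel width is arbitrary):

```python
import torch.nn as nn

c = 64  # arbitrary channel width for illustration
# Two stacked 3x3 convolutions: a 5x5 receptive field, 2 * (3*3*c*c)
# weights, and an extra nonlinearity between them ...
stacked_3x3 = nn.Sequential(
    nn.Conv2d(c, c, 3, padding=1), nn.ReLU(),
    nn.Conv2d(c, c, 3, padding=1), nn.ReLU(),
)
# ... versus a single 5x5 convolution with 5*5*c*c weights.
single_5x5 = nn.Sequential(nn.Conv2d(c, c, 5, padding=2), nn.ReLU())
```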

55,235 citations


"PointNet++: Deep Hierarchical Featu..." refers background in this paper

  • ...Among all the learning models, convolutional neural network [10; 25; 8] is one of the most prominent ones....

  • ...[25] shows that using smaller kernels helps to improve the ability of CNNs....

Proceedings Article
01 Jan 2015
TL;DR: In this paper, the authors investigated the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting and showed that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 layers.
Abstract: In this work we investigate the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting. Our main contribution is a thorough evaluation of networks of increasing depth using an architecture with very small (3x3) convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers. These findings were the basis of our ImageNet Challenge 2014 submission, where our team secured the first and the second places in the localisation and classification tracks respectively. We also show that our representations generalise well to other datasets, where they achieve state-of-the-art results. We have made our two best-performing ConvNet models publicly available to facilitate further research on the use of deep visual representations in computer vision.

49,914 citations