scispace - formally typeset
Search or ask a question
Author

Yulan Guo

Bio: Yulan Guo is an academic researcher from National University of Defense Technology. The author has contributed to research in topics: Computer science & Point cloud. The author has an hindex of 30, co-authored 164 publications receiving 5012 citations. Previous affiliations of Yulan Guo include Chinese Academy of Sciences & University of Western Australia.


Papers
More filters
Journal ArticleDOI
TL;DR: This paper presents a comprehensive review of recent progress in deep learning methods for point clouds, covering three major tasks, including 3D shape classification, 3D object detection and tracking, and 3D point cloud segmentation.
Abstract: Point cloud learning has lately attracted increasing attention due to its wide applications in many areas, such as computer vision, autonomous driving, and robotics As a dominating technique in AI, deep learning has been successfully used to solve various 2D vision problems However, deep learning on point clouds is still in its infancy due to the unique challenges faced by the processing of point clouds with deep neural networks Recently, deep learning on point clouds has become even thriving, with numerous methods being proposed to address different problems in this area To stimulate future research, this paper presents a comprehensive review of recent progress in deep learning methods for point clouds It covers three major tasks, including 3D shape classification, 3D object detection and tracking, and 3D point cloud segmentation It also presents comparative results on several publicly available datasets, together with insightful observations and inspiring future research directions

1,021 citations

Proceedings ArticleDOI
14 Jun 2020
TL;DR: This paper introduces RandLA-Net, an efficient and lightweight neural architecture to directly infer per-point semantics for large-scale point clouds, and introduces a novel local feature aggregation module to progressively increase the receptive field for each 3D point, thereby effectively preserving geometric details.
Abstract: We study the problem of efficient semantic segmentation for large-scale 3D point clouds. By relying on expensive sampling techniques or computationally heavy pre/post-processing steps, most existing approaches are only able to be trained and operate over small-scale point clouds. In this paper, we introduce RandLA-Net, an efficient and lightweight neural architecture to directly infer per-point semantics for large-scale point clouds. The key to our approach is to use random point sampling instead of more complex point selection approaches. Although remarkably computation and memory efficient, random sampling can discard key features by chance. To overcome this, we introduce a novel local feature aggregation module to progressively increase the receptive field for each 3D point, thereby effectively preserving geometric details. Extensive experiments show that our RandLA-Net can process 1 million points in a single pass with up to 200x faster than existing approaches. Moreover, our RandLA-Net clearly surpasses state-of-the-art approaches for semantic segmentation on two large-scale benchmarks Semantic3D and SemanticKITTI.

977 citations

Journal ArticleDOI
TL;DR: This paper presents a comprehensive survey of existing local surface feature based 3D object recognition methods and enlists a number of popular and contemporary databases together with their relevant attributes.
Abstract: 3D object recognition in cluttered scenes is a rapidly growing research area. Based on the used types of features, 3D object recognition methods can broadly be divided into two categories-global or local feature based methods. Intensive research has been done on local surface feature based methods as they are more robust to occlusion and clutter which are frequently present in a real-world scene. This paper presents a comprehensive survey of existing local surface feature based 3D object recognition methods. These methods generally comprise three phases: 3D keypoint detection, local surface feature description, and surface matching. This paper covers an extensive literature survey of each phase of the process. It also enlists a number of popular and contemporary databases together with their relevant attributes.

563 citations

Journal ArticleDOI
TL;DR: This paper compares ten popular local feature descriptors in the contexts of 3D object recognition, 3D shape retrieval, and 3D modeling and presents the performance results of these descriptors when combined with different 3D keypoint detection methods.
Abstract: A number of 3D local feature descriptors have been proposed in the literature. It is however, unclear which descriptors are more appropriate for a particular application. A good descriptor should be descriptive, compact, and robust to a set of nuisances. This paper compares ten popular local feature descriptors in the contexts of 3D object recognition, 3D shape retrieval, and 3D modeling. We first evaluate the descriptiveness of these descriptors on eight popular datasets which were acquired using different techniques. We then analyze their compactness using the recall of feature matching per each float value in the descriptor. We also test the robustness of the selected descriptors with respect to support radius variations, Gaussian noise, shot noise, varying mesh resolution, distance to the mesh boundary, keypoint localization error, occlusion, clutter, and dataset size. Moreover, we present the performance results of these descriptors when combined with different 3D keypoint detection methods. We finally analyze the computational efficiency for generating each descriptor.

503 citations

Journal ArticleDOI
TL;DR: Rotational Projection Statistics (RoPS) as discussed by the authors is a feature descriptor that is obtained by rotationally projecting the neighboring points of a feature point onto 2D planes and calculating a set of statistics including low-order central moments and entropy of the distribution of these projected points.
Abstract: Recognizing 3D objects in the presence of noise, varying mesh resolution, occlusion and clutter is a very challenging task. This paper presents a novel method named Rotational Projection Statistics (RoPS). It has three major modules: local reference frame (LRF) definition, RoPS feature description and 3D object recognition. We propose a novel technique to define the LRF by calculating the scatter matrix of all points lying on the local surface. RoPS feature descriptors are obtained by rotationally projecting the neighboring points of a feature point onto 2D planes and calculating a set of statistics (including low-order central moments and entropy) of the distribution of these projected points. Using the proposed LRF and RoPS descriptor, we present a hierarchical 3D object recognition algorithm. The performance of the proposed LRF, RoPS descriptor and object recognition algorithm was rigorously tested on a number of popular and publicly available datasets. Our proposed techniques exhibited superior performance compared to existing techniques. We also showed that our method is robust with respect to noise and varying mesh resolution. Our RoPS based algorithm achieved recognition rates of 100, 98.9, 95.4 and 96.0 % respectively when tested on the Bologna, UWA, Queen’s and Ca’ Foscari Venezia Datasets.

437 citations


Cited by
More filters
Journal ArticleDOI
TL;DR: This work proposes a new neural network module suitable for CNN-based high-level tasks on point clouds, including classification and segmentation called EdgeConv, which acts on graphs dynamically computed in each layer of the network.
Abstract: Point clouds provide a flexible geometric representation suitable for countless applications in computer graphics; they also comprise the raw output of most 3D data acquisition devices. While hand-designed features on point clouds have long been proposed in graphics and vision, however, the recent overwhelming success of convolutional neural networks (CNNs) for image analysis suggests the value of adapting insight from CNN to the point cloud world. Point clouds inherently lack topological information, so designing a model to recover topology can enrich the representation power of point clouds. To this end, we propose a new neural network module dubbed EdgeConv suitable for CNN-based high-level tasks on point clouds, including classification and segmentation. EdgeConv acts on graphs dynamically computed in each layer of the network. It is differentiable and can be plugged into existing architectures. Compared to existing modules operating in extrinsic space or treating each point independently, EdgeConv has several appealing properties: It incorporates local neighborhood information; it can be stacked applied to learn global shape properties; and in multi-layer systems affinity in feature space captures semantic characteristics over potentially long distances in the original embedding. We show the performance of our model on standard benchmarks, including ModelNet40, ShapeNetPart, and S3DIS.

3,727 citations

01 Jan 2004
TL;DR: Comprehensive and up-to-date, this book includes essential topics that either reflect practical significance or are of theoretical importance and describes numerous important application areas such as image based rendering and digital libraries.
Abstract: From the Publisher: The accessible presentation of this book gives both a general view of the entire computer vision enterprise and also offers sufficient detail to be able to build useful applications. Users learn techniques that have proven to be useful by first-hand experience and a wide range of mathematical methods. A CD-ROM with every copy of the text contains source code for programming practice, color images, and illustrative movies. Comprehensive and up-to-date, this book includes essential topics that either reflect practical significance or are of theoretical importance. Topics are discussed in substantial and increasing depth. Application surveys describe numerous important application areas such as image based rendering and digital libraries. Many important algorithms broken down and illustrated in pseudo code. Appropriate for use by engineers as a comprehensive reference to the computer vision enterprise.

3,627 citations

Journal ArticleDOI
TL;DR: A comprehensive survey of the recent achievements in this field brought about by deep learning techniques, covering many aspects of generic object detection: detection frameworks, object feature representation, object proposal generation, context modeling, training strategies, and evaluation metrics.
Abstract: Object detection, one of the most fundamental and challenging problems in computer vision, seeks to locate object instances from a large number of predefined categories in natural images. Deep learning techniques have emerged as a powerful strategy for learning feature representations directly from data and have led to remarkable breakthroughs in the field of generic object detection. Given this period of rapid evolution, the goal of this paper is to provide a comprehensive survey of the recent achievements in this field brought about by deep learning techniques. More than 300 research contributions are included in this survey, covering many aspects of generic object detection: detection frameworks, object feature representation, object proposal generation, context modeling, training strategies, and evaluation metrics. We finish the survey by identifying promising directions for future research.

1,897 citations