Journal ArticleDOI

Performance Evaluation of 3D Keypoint Detectors

01 Mar 2013-International Journal of Computer Vision (Springer US)-Vol. 102, Iss: 1, pp 198-220
TL;DR: Existing 3D keypoint detectors are categorized into two classes that highlight their common traits, abstracting all algorithms to two general structures, and are evaluated experimentally in terms of repeatability, distinctiveness and computational efficiency.
Abstract: In the past few years detection of repeatable and distinctive keypoints on 3D surfaces has been the focus of intense research activity, due on the one hand to the increasing diffusion of low-cost 3D sensors, on the other to the growing importance of applications such as 3D shape retrieval and 3D object recognition. This work aims at contributing to the maturity of this field by a thorough evaluation of several recent 3D keypoint detectors. A categorization of existing methods in two classes, that allows for highlighting their common traits, is proposed, so as to abstract all algorithms to two general structures. Moreover, a comprehensive experimental evaluation is carried out in terms of repeatability, distinctiveness and computational efficiency, based on a vast data corpus characterized by nuisances such as noise, clutter, occlusions and viewpoint changes.
Citations
Book ChapterDOI
08 Oct 2016
TL;DR: An algorithm for fast global registration of partially overlapping 3D surfaces that provides the accuracy achieved by well-initialized local refinement algorithms, without requiring an initialization and at lower computational cost.
Abstract: We present an algorithm for fast global registration of partially overlapping 3D surfaces. The algorithm operates on candidate matches that cover the surfaces. A single objective is optimized to align the surfaces and disable false matches. The objective is defined densely over the surfaces and the optimization achieves tight alignment with no initialization. No correspondence updates or closest-point queries are performed in the inner loop. An extension of the algorithm can perform joint global registration of many partially overlapping surfaces. Extensive experiments demonstrate that the presented approach matches or exceeds the accuracy of state-of-the-art global registration pipelines, while being at least an order of magnitude faster. Remarkably, the presented approach is also faster than local refinement algorithms such as ICP. It provides the accuracy achieved by well-initialized local refinement algorithms, without requiring an initialization and at lower computational cost.

667 citations


Cites background from "Performance Evaluation of 3D Keypoi..."

  • ...Some pipelines use point-to-point matches based on local geometric descriptors [16, 40], others define correspondences on pairs or tuples of points [1, 8, 26, 29]....

    [...]

Journal ArticleDOI
TL;DR: A thorough experimental evaluation vouches that SHOT outperforms state-of-the-art local descriptors in experiments addressing descriptor matching for object recognition, 3D reconstruction and shape retrieval.

602 citations


Additional excerpts

  • ...A performance evaluation of 3D keypoint detection algorithms has been recently proposed in [12]....

    [...]

Journal ArticleDOI
TL;DR: This paper presents a comprehensive survey of existing local surface feature based 3D object recognition methods and enlists a number of popular and contemporary databases together with their relevant attributes.
Abstract: 3D object recognition in cluttered scenes is a rapidly growing research area. Based on the used types of features, 3D object recognition methods can broadly be divided into two categories: global or local feature based methods. Intensive research has been done on local surface feature based methods as they are more robust to occlusion and clutter which are frequently present in a real-world scene. This paper presents a comprehensive survey of existing local surface feature based 3D object recognition methods. These methods generally comprise three phases: 3D keypoint detection, local surface feature description, and surface matching. This paper covers an extensive literature survey of each phase of the process. It also enlists a number of popular and contemporary databases together with their relevant attributes.

563 citations

Journal ArticleDOI
TL;DR: This paper compares ten popular local feature descriptors in the contexts of 3D object recognition, 3D shape retrieval, and 3D modeling and presents the performance results of these descriptors when combined with different 3D keypoint detection methods.
Abstract: A number of 3D local feature descriptors have been proposed in the literature. It is however, unclear which descriptors are more appropriate for a particular application. A good descriptor should be descriptive, compact, and robust to a set of nuisances. This paper compares ten popular local feature descriptors in the contexts of 3D object recognition, 3D shape retrieval, and 3D modeling. We first evaluate the descriptiveness of these descriptors on eight popular datasets which were acquired using different techniques. We then analyze their compactness using the recall of feature matching per each float value in the descriptor. We also test the robustness of the selected descriptors with respect to support radius variations, Gaussian noise, shot noise, varying mesh resolution, distance to the mesh boundary, keypoint localization error, occlusion, clutter, and dataset size. Moreover, we present the performance results of these descriptors when combined with different 3D keypoint detection methods. We finally analyze the computational efficiency for generating each descriptor.

503 citations

Journal ArticleDOI
TL;DR: This survey introduces feature detection, description, and matching techniques from handcrafted methods to trainable ones and provides an analysis of the development of these methods in theory and practice, and briefly introduces several typical image matching-based applications.
Abstract: As a fundamental and critical task in various visual applications, image matching can identify then correspond the same or similar structure/content from two or more images. Over the past decades, growing amount and diversity of methods have been proposed for image matching, particularly with the development of deep learning techniques over the recent years. However, it may leave several open questions about which method would be a suitable choice for specific applications with respect to different scenarios and task requirements and how to design better image matching methods with superior performance in accuracy, robustness and efficiency. This encourages us to conduct a comprehensive and systematic review and analysis for those classical and latest techniques. Following the feature-based image matching pipeline, we first introduce feature detection, description, and matching techniques from handcrafted methods to trainable ones and provide an analysis of the development of these methods in theory and practice. Secondly, we briefly introduce several typical image matching-based applications for a comprehensive understanding of the significance of image matching. In addition, we also provide a comprehensive and objective comparison of these classical and latest techniques through extensive experiments on representative datasets. Finally, we conclude with the current status of image matching technologies and deliver insightful discussions and prospects for future works. This survey can serve as a reference for (but not limited to) researchers and engineers in image matching and related fields.

474 citations


Cites background from "Performance Evaluation of 3D Keypoi..."

  • ...Readers are referred to Tombari et al. (2013) for further discussion on other adaptive-scale detectors....

    [...]

  • ...D case, Tombari et al. (2013) presented a thorough evaluation of several state-of-the-art 3-...

    [...]

  • ...D keypoint detectors, Tombari et al. (2013) provided an excellent survey on the state-of-the-art methods and a detailed evaluation of their performances....

    [...]

References
Journal ArticleDOI
TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.
Abstract: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene. The features are invariant to image scale and rotation, and are shown to provide robust matching across a substantial range of affine distortion, change in 3D viewpoint, addition of noise, and change in illumination. The features are highly distinctive, in the sense that a single feature can be correctly matched with high probability against a large database of features from many images. This paper also describes an approach to using these features for object recognition. The recognition proceeds by matching individual features to a database of features from known objects using a fast nearest-neighbor algorithm, followed by a Hough transform to identify clusters belonging to a single object, and finally performing verification through least-squares solution for consistent pose parameters. This approach to recognition can robustly identify objects among clutter and occlusion while achieving near real-time performance.

46,906 citations

01 Jan 2011
TL;DR: The Scale-Invariant Feature Transform (or SIFT) algorithm is a highly robust method to extract and consequently match distinctive invariant features from images that can then be used to reliably match objects in differing images.
Abstract: The Scale-Invariant Feature Transform (or SIFT) algorithm is a highly robust method to extract and consequently match distinctive invariant features from images. These features can then be used to reliably match objects in differing images. The algorithm was first proposed by Lowe [12] and further developed to increase performance resulting in the classic paper [13] that served as foundation for SIFT which has played an important role in robotic and machine vision in the past decade.

14,708 citations


"Performance Evaluation of 3D Keypoi..." refers background or methods in this paper

  • ...Likewise (Unnikrishnan and Hebert 2008), also MeshDoG (Zaharescu et al. 2009) deploys the 3D mesh as the representation adopted to build the scale-space, which, in turn, is created by applying different normalized Gaussian derivatives through the Difference-of-Gaussians (DoG) operator, a well-known approximation of the normalized Laplacian (Lowe 2004)....

    [...]

  • ...proposed in Lowe (2004), non-corner responses are pruned...

    [...]

  • ...(footnote 6: http://www.caam.rice.edu/software/ARPACK/) to solve the sparse generalized eigenvalues problem and UMFPACK (footnote 7) to perform the LU factorization of the mesh Laplacian operator L....

    [...]
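The excerpt above that describes the Difference-of-Gaussians (DoG) operator as an approximation of the normalized Laplacian can be checked numerically. The following is a minimal sketch in 1D, not code from the paper: the function names and the values of `sigma` and `k` are assumptions chosen for illustration. It relies on the identity dG/dσ = σ · d²G/dx², so that G(x, kσ) − G(x, σ) ≈ (k − 1) · σ² · d²G/dx² when k is close to 1.

```python
import math

def gaussian(x, sigma):
    """1D Gaussian with standard deviation sigma, evaluated at x."""
    return math.exp(-x * x / (2.0 * sigma * sigma)) / (math.sqrt(2.0 * math.pi) * sigma)

def scale_normalized_laplacian(x, sigma):
    """sigma^2 times the second derivative of the 1D Gaussian with respect to x."""
    return gaussian(x, sigma) * (x * x / (sigma * sigma) - 1.0)

def difference_of_gaussians(x, sigma, k):
    """Difference between Gaussians at the neighbouring scales k*sigma and sigma."""
    return gaussian(x, k * sigma) - gaussian(x, sigma)

# Illustrative values (assumptions): k close to 1 keeps the first-order approximation tight.
sigma, k = 1.0, 1.01
for x in (0.0, 1.0, 2.0):
    dog = difference_of_gaussians(x, sigma, k)
    approx = (k - 1.0) * scale_normalized_laplacian(x, sigma)
    print(f"x={x:.1f}  DoG={dog:+.6f}  (k-1)*sigma^2*G''={approx:+.6f}")
```

The two printed columns agree up to a second-order term in (k − 1). The same heat-equation identity holds for Gaussians in higher dimensions, which is why the approximation carries over to images and, with a suitable Gaussian operator on the mesh, to surface data.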

Journal ArticleDOI
TL;DR: A snapshot of the state of the art in affine covariant region detectors, and compares their performance on a set of test images under varying imaging conditions to establish a reference test set of images and performance software so that future detectors can be evaluated in the same framework.
Abstract: The paper gives a snapshot of the state of the art in affine covariant region detectors, and compares their performance on a set of test images under varying imaging conditions. Six types of detectors are included: detectors based on affine normalization around Harris (Mikolajczyk and Schmid, 2002; Schaffalitzky and Zisserman, 2002) and Hessian points (Mikolajczyk and Schmid, 2002), a detector of `maximally stable extremal regions', proposed by Matas et al. (2002); an edge-based region detector (Tuytelaars and Van Gool, 1999) and a detector based on intensity extrema (Tuytelaars and Van Gool, 2000), and a detector of `salient regions', proposed by Kadir, Zisserman and Brady (2004). The performance is measured against changes in viewpoint, scale, illumination, defocus and image compression. The objective of this paper is also to establish a reference test set of images and performance software, so that future detectors can be evaluated in the same framework.

3,359 citations


"Performance Evaluation of 3D Keypoi..." refers background or methods in this paper

  • ...2000; Mikolajczyk et al. 2005) and mainly focused on the 3D object recognition scenario, which is peculiarly characterized by the presence of occlusions and clutter. Such a scenario differs from that addressed by Bronstein et al. (2010), Boyer et al. (2011), as 3D shape retrieval is not required to deal with occlusion, clutter and viewpoint changes, large intraclass shape variations being instead the main nuisance to be dealt with....

    [...]

  • ...For this reason, and as done in Mikolajczyk et al. (2005), we chose to use the default parameters supplied by the authors rather than tuning them....

    [...]

  • ...As discussed in Mikolajczyk et al. (2005), these differences may have an undesired impact on the repeatability scores: if the number of keypoints is large, many of them may be considered repeatable by accident and not because of the design of the detector....

    [...]

  • ...2010; Chen and Bhanu 2007; Zhong 2009; Novatnack and Nishino 2008; Unnikrishnan and Hebert 2008; Akagunduz and Ulusoy 2007; Zaharescu et al. 2009; Fadaifard and Wolberg 2011; Knopp et al. 2010; Castellani et al. 2008; Sun et al. 2009). Compared to its 2D counterpart, and to 3D descriptors as well, research on 3D detectors has started much more recently (mainly in the past 3 or 4 years) and, to date, only limited taxonomic and evaluation work has been carried out, with experimental comparison usually performed separately within (and relatively to) each specific proposal. The only relevant works in this respect are described in Bronstein et al. (2010), Boyer et al. (2011), which propose an experimental evaluation of 3D detectors and descriptors focused on the 3D shape retrieval scenario, and a preliminary version of this paper, which was presented in Salti et al. (2011). A parallel line of research has targeted the evaluation of 2D detectors and descriptors on 3D objects (Moreels and Perona 2007)....

    [...]
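The remark quoted above, that a detector producing many keypoints may score as repeatable by accident, can be illustrated with a toy computation. This is a hedged sketch of one common way to count repeatable keypoints (a model keypoint is repeatable if, once mapped into the scene by the ground-truth transform, it lies within a distance `eps` of some scene keypoint). The exact definition used in the paper may differ, and all names, coordinates and the `eps` value below are hypothetical.

```python
import math

def repeatable_count(model_kps, scene_kps, to_scene, eps):
    """Count model keypoints whose ground-truth image in the scene has a scene keypoint within eps."""
    count = 0
    for p in model_kps:
        q = to_scene(p)
        if any(math.dist(q, s) <= eps for s in scene_kps):
            count += 1
    return count

def relative_repeatability(model_kps, scene_kps, to_scene, eps):
    """Fraction of model keypoints that are repeatable (0.0 for an empty model set)."""
    if not model_kps:
        return 0.0
    return repeatable_count(model_kps, scene_kps, to_scene, eps) / len(model_kps)

def shift(p):
    """Hypothetical ground-truth transform: translation by one unit along x."""
    return (p[0] + 1.0, p[1], p[2])

model = [(0.0, 0.0, 0.0), (0.5, 0.5, 0.0), (0.2, 0.8, 0.0), (0.9, 0.1, 0.0)]

# A sparse scene detection: only two of the four model keypoints are re-detected.
sparse_scene = [(1.0, 0.0, 0.0), (1.5, 0.52, 0.0)]

# A "detector" that fires everywhere on a 0.1-spaced grid covering the same region.
dense_scene = [(1.0 + i / 10, j / 10, 0.0) for i in range(11) for j in range(11)]

eps = 0.1
print(relative_repeatability(model, sparse_scene, shift, eps))
print(relative_repeatability(model, dense_scene, shift, eps))
```

The dense grid reaches a perfect score without carrying any information about the surface, which is the kind of bias the quoted passage warns about when detectors return very different numbers of keypoints.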

Journal ArticleDOI
TL;DR: It is shown how the proposed methodology applies to the problems of blob detection, junction detection, edge detection, ridge detection and local frequency estimation and how it can be used as a major mechanism in algorithms for automatic scale selection, which adapt the local scales of processing to the local image structure.
Abstract: The fact that objects in the world appear in different ways depending on the scale of observation has important implications if one aims at describing them. It shows that the notion of scale is of utmost importance when processing unknown measurement data by automatic methods. In their seminal works, Witkin (1983) and Koenderink (1984) proposed to approach this problem by representing image structures at different scales in a so-called scale-space representation. Traditional scale-space theory building on this work, however, does not address the problem of how to select local appropriate scales for further analysis. This article proposes a systematic methodology for dealing with this problem. A framework is presented for generating hypotheses about interesting scale levels in image data, based on a general principle stating that local extrema over scales of different combinations of γ-normalized derivatives are likely candidates to correspond to interesting structures. Specifically, it is shown how this idea can be used as a major mechanism in algorithms for automatic scale selection, which adapt the local scales of processing to the local image structure. Support for the proposed approach is given in terms of a general theoretical investigation of the behaviour of the scale selection method under rescalings of the input pattern and by integration with different types of early visual modules, including experiments on real-world and synthetic data. Support is also given by a detailed analysis of how different types of feature detectors perform when integrated with a scale selection mechanism and then applied to characteristic model patterns. Specifically, it is described in detail how the proposed methodology applies to the problems of blob detection, junction detection, edge detection, ridge detection and local frequency estimation. In many computer vision applications, the poor performance of the low-level vision modules constitutes a major bottleneck. It is argued that the inclusion of mechanisms for automatic scale selection is essential if we are to construct vision systems to automatically analyse complex unknown environments.
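The principle stated in this abstract, that local extrema over scales of γ-normalized derivatives indicate interesting structures, can be sketched on a 1D toy signal. The example below is an illustration under stated assumptions rather than the article's algorithm: the signal is a 1D Gaussian blob of variance `t0`, γ is taken as 1, and the response is evaluated in closed form at the blob centre. Smoothing the blob with a Gaussian of variance `t` gives a Gaussian of variance `t0 + t`, whose second derivative at the centre has magnitude 1 / (sqrt(2π) · (t0 + t)^1.5).

```python
import math

def normalized_response(t, t0, gamma=1.0):
    """|t^gamma * L_xx| at the centre of a 1D Gaussian blob of variance t0, smoothed to scale t."""
    second_derivative = 1.0 / (math.sqrt(2.0 * math.pi) * (t0 + t) ** 1.5)
    return (t ** gamma) * second_derivative

def select_scale(t0, scales):
    """Return the scale in `scales` that maximises the normalized response."""
    return max(scales, key=lambda t: normalized_response(t, t0))

t0 = 4.0                                   # variance of the blob (assumed value)
scales = [i / 10 for i in range(1, 401)]   # candidate scales t = 0.1 ... 40.0
best_t = select_scale(t0, scales)
print(best_t, best_t / t0)
```

For this 1D case the maximum falls at t = 2 · t0, so the selected scale grows in proportion to the size of the structure; the proportionality constant depends on the dimension and on the derivative operator used.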

2,942 citations


"Performance Evaluation of 3D Keypoi..." refers background in this paper

  • ...4, the common structure of adaptive-scale detectors includes building a scale-space defined on the surface, thus directly extending to the case of 3D data the well-known concept defined for 2D images (Lindeberg 1998)....

    [...]

Journal ArticleDOI
TL;DR: In this paper, a 3D shape-based object recognition system for simultaneous recognition of multiple objects in scenes containing clutter and occlusion is presented, which is based on matching surfaces by matching points using the spin image representation.
Abstract: We present a 3D shape-based object recognition system for simultaneous recognition of multiple objects in scenes containing clutter and occlusion. Recognition is based on matching surfaces by matching points using the spin image representation. The spin image is a data level shape descriptor that is used to match surfaces represented as surface meshes. We present a compression scheme for spin images that results in efficient multiple object recognition which we verify with results showing the simultaneous recognition of multiple objects from a library of 20 models. Furthermore, we demonstrate the robust performance of recognition in the presence of clutter and occlusion through analysis of recognition trials on 100 scenes.

2,798 citations