Journal ArticleDOI

Distinctive Image Features from Scale-Invariant Keypoints

01 Nov 2004-International Journal of Computer Vision (Kluwer Academic Publishers)-Vol. 60, Iss: 2, pp 91-110
TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.
Abstract: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene. The features are invariant to image scale and rotation, and are shown to provide robust matching across a substantial range of affine distortion, change in 3D viewpoint, addition of noise, and change in illumination. The features are highly distinctive, in the sense that a single feature can be correctly matched with high probability against a large database of features from many images. This paper also describes an approach to using these features for object recognition. The recognition proceeds by matching individual features to a database of features from known objects using a fast nearest-neighbor algorithm, followed by a Hough transform to identify clusters belonging to a single object, and finally performing verification through least-squares solution for consistent pose parameters. This approach to recognition can robustly identify objects among clutter and occlusion while achieving near real-time performance.
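For readers who want to try the matching stage described above, here is a minimal sketch using OpenCV's SIFT implementation with a FLANN nearest-neighbor search and Lowe's ratio test. The image file names are placeholders, and the Hough-transform clustering and least-squares pose verification from the abstract are omitted.

```python
import cv2

# Detect SIFT keypoints and compute descriptors in two views (placeholder files).
sift = cv2.SIFT_create()
img1 = cv2.imread("view1.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("view2.png", cv2.IMREAD_GRAYSCALE)
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Fast approximate nearest-neighbor matching with a k-d tree (FLANN), then the
# ratio test: keep a match only if its best neighbor is much closer than the
# second best, which filters out ambiguous features.
flann = cv2.FlannBasedMatcher({"algorithm": 1, "trees": 5}, {"checks": 50})
matches = flann.knnMatch(des1, des2, k=2)
good = [pair[0] for pair in matches
        if len(pair) == 2 and pair[0].distance < 0.75 * pair[1].distance]
print(f"{len(good)} candidate matches survive the ratio test")
```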


Citations
Posted Content
TL;DR: Inspired by the outstanding 2D shape descriptor SIFT, a module called PointSIFT is designed that encodes information from different orientations and adapts to the scale of the shape; it outperforms state-of-the-art methods on standard benchmark datasets.
Abstract: Recently, 3D understanding research has shed light on extracting features directly from point clouds, which requires effective description of the shape patterns in a point cloud. Inspired by the outstanding 2D shape descriptor SIFT, we design a module called PointSIFT that encodes information of different orientations and is adaptive to the scale of the shape. Specifically, an orientation-encoding unit is designed to describe eight crucial orientations, and multi-scale representation is achieved by stacking several orientation-encoding units. The PointSIFT module can be integrated into various PointNet-based architectures to improve their representation ability. Extensive experiments show our PointSIFT-based framework outperforms state-of-the-art methods on standard benchmark datasets. The code and trained model will be published along with this paper.
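As a rough, hypothetical illustration of the orientation-encoding idea (not the authors' implementation, which convolves the features of the eight octant neighbors in an ordered fashion), the sketch below gathers the nearest neighbor in each of the eight spatial octants around a point and concatenates their features:

```python
import numpy as np

def orientation_encode(points, features, center_idx):
    """Collect one nearest neighbor per spatial octant around a center point
    and concatenate their features into an orientation-aware code.
    points: (N, 3) xyz coordinates; features: (N, C) per-point features."""
    offsets = points - points[center_idx]            # (N, 3)
    # Octant index in [0, 8): one bit per coordinate sign.
    octant = (offsets > 0) @ np.array([1, 2, 4])
    dist = np.linalg.norm(offsets, axis=1)
    dist[center_idx] = np.inf                        # ignore the center itself
    code = np.zeros((8, features.shape[1]))
    for o in range(8):
        idx = np.where(octant == o)[0]
        if idx.size:                                 # an octant may be empty
            code[o] = features[idx[np.argmin(dist[idx])]]
    return code.reshape(-1)                          # (8 * C,) encoding
```

Stacking several such units, as PointSIFT does, widens the receptive field and yields the multi-scale description of local shape mentioned in the abstract.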

311 citations

Journal ArticleDOI
TL;DR: This approach is shown to be robust under various types of distortions, such as deformation, noise, outliers, rotation, and occlusion, and it greatly outperforms state-of-the-art methods, especially when the data is badly degraded.
Abstract: In previous work on point registration, the input point sets are often represented using Gaussian mixture models and the registration is then addressed through a probabilistic approach, which aims to exploit global relationships on the point sets. For non-rigid shapes, however, the local structures among neighboring points are also strong and stable and thus helpful in recovering the point correspondence. In this paper, we formulate point registration as the estimation of a mixture of densities, where local features, such as shape context, are used to assign the membership probabilities of the mixture model. This enables us to preserve both global and local structures during matching. The transformation between the two point sets is specified in a reproducing kernel Hilbert space and a sparse approximation is adopted to achieve a fast implementation. Extensive experiments on both synthesized and real data show the robustness of our approach under various types of distortions, such as deformation, noise, outliers, rotation, and occlusion. It greatly outperforms the state-of-the-art methods, especially when the data is badly degraded.
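Schematically, and in commonly used notation rather than the paper's exact formulation, the fixed points x_n are modeled by a mixture centered on the transformed moving points T(y_m), with membership weights pi_{nm} supplied by local-feature (e.g., shape context) similarity instead of being left uniform:

$$ p(x_n) \;=\; \sum_{m=1}^{M} \pi_{nm}\, \mathcal{N}\!\bigl(x_n \,\big|\, \mathcal{T}(y_m),\, \sigma^2 I\bigr), \qquad \sum_{m} \pi_{nm} = 1, $$

where the transformation T lies in a reproducing kernel Hilbert space and is fit, with a sparse approximation for speed, by maximizing the resulting likelihood.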

311 citations

Journal ArticleDOI
TL;DR: UAV-SFM remote sensing was used to produce 3D multispectral point clouds of temperate deciduous forests at different levels of UAV altitude, image overlap, weather, and image processing, yielding accurate estimates of canopy height.
Abstract: Ecological remote sensing is being transformed by three-dimensional (3D), multispectral measurements of forest canopies by unmanned aerial vehicles (UAV) and computer vision structure from motion (SFM) algorithms. Yet applications of this technology have outpaced understanding of the relationship between collection method and data quality. Here, UAV-SFM remote sensing was used to produce 3D multispectral point clouds of temperate deciduous forests at different levels of UAV altitude, image overlap, weather, and image processing. Error in canopy height estimates was explained by the alignment of the canopy height model to the digital terrain model (R2 = 0.81) due to differences in lighting and image overlap. Accounting for this, no significant differences were observed in height error at different levels of lighting, altitude, and side overlap. Overall, accurate estimates of canopy height compared to field measurements (R2 = 0.86, RMSE = 3.6 m) and LIDAR (R2 = 0.99, RMSE = 3.0 m) were obtained under optimal conditions of clear lighting and high image overlap (>80%). Variation in point cloud quality appeared related to the behavior of SFM ‘image features’. Future research should consider the role of image features as the fundamental unit of SFM remote sensing, akin to the pixel of optical imaging and the laser pulse of LIDAR.

310 citations

Proceedings ArticleDOI
12 May 2009
TL;DR: This paper presents an approach for building metric 3D models of objects using local descriptors from several images, optimized to fit a set of calibrated training images, thus obtaining the best possible alignment between the 3D model and the real object.
Abstract: Robust perception is a vital capability for robotic manipulation in unstructured scenes. In this context, full pose estimation of relevant objects in a scene is a critical step towards the introduction of robots into household environments. In this paper, we present an approach for building metric 3D models of objects using local descriptors from several images. Each model is optimized to fit a set of calibrated training images, thus obtaining the best possible alignment between the 3D model and the real object. Given a new test image, we match the local descriptors to our stored models online, using a novel combination of the RANSAC and Mean Shift algorithms to register multiple instances of each object. A robust initialization step allows for arbitrary rotation, translation and scaling of objects in the test images. The resulting system provides markerless 6-DOF pose estimation for complex objects in cluttered scenes. We provide experimental results demonstrating orientation and translation accuracy, as well as a physical implementation of the pose output being used by an autonomous robot to perform grasping in highly cluttered scenes.
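As a toy illustration of the hypothesize-and-score idea under simplified assumptions (a 2D similarity transform instead of the paper's full 6-DOF pose, with all names hypothetical), the sketch below runs RANSAC over putative correspondences; in the paper's system, the surviving pose hypotheses are then grouped with Mean Shift so that multiple instances of the same object can be registered separately.

```python
import numpy as np

def ransac_similarity(src, dst, iters=200, tol=3.0, seed=0):
    """RANSAC for a 2D similarity transform between matched point sets.
    src, dst: (N, 2) arrays of putative correspondences. Returns the inlier
    mask of the best hypothesis found."""
    rng = np.random.default_rng(seed)
    src_c = src[:, 0] + 1j * src[:, 1]      # complex form: scale + rotation is
    dst_c = dst[:, 0] + 1j * dst[:, 1]      # a single complex multiplication
    best = np.zeros(len(src), dtype=bool)
    for _ in range(iters):
        i, j = rng.choice(len(src), size=2, replace=False)
        denom = src_c[i] - src_c[j]
        if abs(denom) < 1e-9:
            continue                        # degenerate minimal sample
        z = (dst_c[i] - dst_c[j]) / denom   # scale and rotation
        t = dst_c[i] - z * src_c[i]         # translation
        err = np.abs(z * src_c + t - dst_c) # residual per correspondence
        inliers = err < tol
        if inliers.sum() > best.sum():
            best = inliers
    return best
```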

310 citations

Journal ArticleDOI
TL;DR: This work proposes a novel subspace clustering approach by introducing a new deep model, the Structured AutoEncoder (StructAE), which learns a set of explicit transformations to progressively map input data points into nonlinear latent spaces while preserving the local and global subspace structure.
Abstract: Existing subspace clustering methods typically employ shallow models to estimate underlying subspaces of unlabeled data points and cluster them into corresponding groups. However, due to the limited representative capacity of the employed shallow models, those methods may fail in handling realistic data without the linear subspace structure. To address this issue, we propose a novel subspace clustering approach by introducing a new deep model, the Structured AutoEncoder (StructAE). The StructAE learns a set of explicit transformations to progressively map input data points into nonlinear latent spaces while preserving the local and global subspace structure. In particular, to preserve local structure, the StructAE learns representations for each data point by minimizing reconstruction error w.r.t. itself. To preserve global structure, the StructAE incorporates prior structured information by encouraging the learned representation to preserve specified reconstruction patterns over the entire data set. To the best of our knowledge, StructAE is one of the first deep subspace clustering approaches. Extensive experiments show that the proposed StructAE significantly outperforms 15 state-of-the-art subspace clustering approaches in terms of five evaluation metrics.
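A minimal, generic sketch of a deep subspace clustering model built around a self-expression constraint on the latent codes is shown below. It is not the authors' StructAE, whose structured prior term and architecture differ, and the layer sizes are illustrative.

```python
import torch
import torch.nn as nn

class SelfExpressiveAE(nn.Module):
    """Autoencoder whose latent codes are pushed to reconstruct one another
    linearly (Z ~ C Z), so that C reveals subspace membership."""
    def __init__(self, dim_in, dim_z, n_samples):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(dim_in, 128), nn.ReLU(),
                                 nn.Linear(128, dim_z))
        self.dec = nn.Sequential(nn.Linear(dim_z, 128), nn.ReLU(),
                                 nn.Linear(128, dim_in))
        self.C = nn.Parameter(1e-4 * torch.randn(n_samples, n_samples))

    def forward(self, x):                   # x: (n_samples, dim_in), full batch
        z = self.enc(x)
        z_se = self.C @ z                   # each code as a mix of the others
        return self.dec(z), z, z_se

def self_expressive_loss(model, x, x_rec, z, z_se, lam1=1.0, lam2=1.0):
    rec = ((x - x_rec) ** 2).sum()          # local: reconstruct each point itself
    se = ((z - z_se) ** 2).sum()            # global: codes obey Z ~ C Z
    reg = (model.C ** 2).sum()              # keep C small; zero its diagonal
    return rec + lam1 * se + lam2 * reg     # during training to avoid z_i -> z_i
```

After training, spectral clustering on |C| + |C|^T recovers the groups; this two-stage pattern is common to deep subspace clustering methods of this family.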

310 citations

References
Proceedings ArticleDOI
20 Sep 1999
TL;DR: Experimental results show that robust object recognition can be achieved in cluttered, partially occluded images with a computation time of under 2 seconds.
Abstract: An object recognition system has been developed that uses a new class of local image features. The features are invariant to image scaling, translation, and rotation, and partially invariant to illumination changes and affine or 3D projection. These features share similar properties with neurons in inferior temporal cortex that are used for object recognition in primate vision. Features are efficiently detected through a staged filtering approach that identifies stable points in scale space. Image keys are created that allow for local geometric deformations by representing blurred image gradients in multiple orientation planes and at multiple scales. The keys are used as input to a nearest neighbor indexing method that identifies candidate object matches. Final verification of each match is achieved by finding a low residual least squares solution for the unknown model parameters. Experimental results show that robust object recognition can be achieved in cluttered, partially occluded images with a computation time of under 2 seconds.
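The staged filtering described here evolved into the difference-of-Gaussian detector of the 2004 paper. The sketch below shows the core idea, a DoG stack with a 26-neighbor extremum test, while omitting the octave pyramid, subpixel refinement, and edge-response rejection; the constants follow commonly used defaults rather than anything specific to this paper.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def dog_extrema(img, sigma=1.6, k=2 ** 0.5, n_scales=5, thresh=0.03):
    """Find candidate keypoints as extrema of a difference-of-Gaussian stack.
    img: 2D float array with values in [0, 1]. Returns (x, y, scale) triples."""
    blurred = [gaussian_filter(img.astype(float), sigma * k ** i)
               for i in range(n_scales)]
    dog = np.stack([b2 - b1 for b1, b2 in zip(blurred, blurred[1:])])
    keypoints = []
    for s in range(1, dog.shape[0] - 1):
        for y in range(1, img.shape[0] - 1):
            for x in range(1, img.shape[1] - 1):
                v = dog[s, y, x]
                cube = dog[s-1:s+2, y-1:y+2, x-1:x+2]  # 26 neighbors + center
                if abs(v) > thresh and (v >= cube.max() or v <= cube.min()):
                    keypoints.append((x, y, s))
    return keypoints
```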

16,989 citations


"Distinctive Image Features from Sca..." refers background or methods in this paper

  • ...The initial implementation of this approach (Lowe, 1999) simply located keypoints at the location and scale of the central sample point....

    [...]

  • ...Earlier work by the author (Lowe, 1999) extended the local feature approach to achieve scale invariance....

    [...]

  • ...More details on applications of these features to recognition are available in other pape rs (Lowe, 1999; Lowe, 2001; Se, Lowe and Little, 2002)....

    [...]

  • ...To efficiently detect stable keypoint locations in scale space, we have proposed (Lowe, 1999) using scalespace extrema in the difference-of-Gaussian function convolved with the image, D(x, y, σ ), which can be computed from the difference of two nearby scales separated by a constant multiplicative…...

    [...]

  • ...More details on applications of these features to recognition are available in other papers (Lowe, 1999, 2001; Se et al., 2002)....

    [...]
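For reference, the difference-of-Gaussian function cut off at the end of the last excerpt is defined in the paper as

$$ D(x, y, \sigma) = \bigl(G(x, y, k\sigma) - G(x, y, \sigma)\bigr) * I(x, y) = L(x, y, k\sigma) - L(x, y, \sigma), $$

where L(x, y, σ) = G(x, y, σ) * I(x, y) is the image smoothed by a Gaussian of width σ, and k is the constant multiplicative factor separating the two nearby scales.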

Book
01 Jan 2000
TL;DR: In this book, the authors provide comprehensive background material and explain how to apply the methods and implement the algorithms directly in a unified framework, covering geometric principles and how to represent objects algebraically so they can be computed and applied.
Abstract: From the Publisher: A basic problem in computer vision is to understand the structure of a real world scene given several images of it. Recent major developments in the theory and practice of scene reconstruction are described in detail in a unified framework. The book covers the geometric principles and how to represent objects algebraically so they can be computed and applied. The authors provide comprehensive background material and explain how to apply the methods and implement the algorithms directly.

15,558 citations

Book
01 Jan 2001
Multiple View Geometry in Computer Vision, by Hartley and Zisserman: a comprehensive treatment of the projective geometry of two and more views, camera models, and the estimation of multi-view relations such as the fundamental matrix.

14,282 citations


"Distinctive Image Features from Sca..." refers background in this paper

  • ...A more general solution would be to solve for the fundamental matrix (Luong and Faugeras, 1996; Hartley and Zisserman, 2000)....

    [...]
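The fundamental matrix mentioned in this excerpt encodes the epipolar constraint between corresponding image points x ↔ x′ in two uncalibrated views:

$$ x'^{\top} F\, x = 0, $$

where F is a 3×3 matrix of rank 2 that can be estimated from point correspondences alone (seven or more in general position); the book covers these estimation algorithms in detail.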

Proceedings ArticleDOI
01 Jan 1988
TL;DR: The problem the authors are addressing in Alvey Project MMI149 is that of using computer vision to understand the unconstrained 3D world, in which the viewed scenes will in general contain too wide a diversity of objects for top-down recognition techniques to work.
Abstract: The problem we are addressing in Alvey Project MMI149 is that of using computer vision to understand the unconstrained 3D world, in which the viewed scenes will in general contain too wide a diversity of objects for top-down recognition techniques to work. For example, we desire to obtain an understanding of natural scenes, containing roads, buildings, trees, bushes, etc., as typified by the two frames from a sequence illustrated in Figure 1. The solution to this problem that we are pursuing is to use a computer vision system based upon motion analysis of a monocular image sequence from a mobile camera. By extraction and tracking of image features, representations of the 3D analogues of these features can be constructed.
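This reference is the Harris corner detector. Its combined corner/edge measure scores each pixel from the local autocorrelation matrix of image gradients over a smoothing window w:

$$ M = \sum_{(x,y)} w(x, y) \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}, \qquad R = \det M - \kappa\,(\operatorname{trace} M)^2, $$

with corners reported where R is large and positive; κ ≈ 0.04 is the commonly used setting.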

13,993 citations

Journal ArticleDOI
TL;DR: The high utility of MSERs, multiple measurement regions and the robust metric is demonstrated in wide-baseline experiments on image pairs from both indoor and outdoor scenes.
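For experimentation with the maximally stable extremal regions evaluated here, OpenCV exposes an MSER detector; a minimal sketch follows, with the file name a placeholder.

```python
import cv2

# Detect maximally stable extremal regions: connected components of
# thresholded images whose area stays nearly constant over a range of
# thresholds, which makes them repeatable across viewpoint and lighting.
gray = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)
mser = cv2.MSER_create()
regions, bboxes = mser.detectRegions(gray)
print(f"{len(regions)} stable regions detected")
```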

3,422 citations
