Distinctive Image Features from Scale-Invariant Keypoints

doi:10.1023/B:VISI.0000029664.99615.94

Journal Article•DOI•

David G. Lowe¹•Institutions (1)

University of British Columbia¹

01 Nov 2004-International Journal of Computer Vision (Kluwer Academic Publishers)-Vol. 60, Iss: 2, pp 91-110

TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.

Abstract: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene. The features are invariant to image scale and rotation, and are shown to provide robust matching across a substantial range of affine distortion, change in 3D viewpoint, addition of noise, and change in illumination. The features are highly distinctive, in the sense that a single feature can be correctly matched with high probability against a large database of features from many images. This paper also describes an approach to using these features for object recognition. The recognition proceeds by matching individual features to a database of features from known objects using a fast nearest-neighbor algorithm, followed by a Hough transform to identify clusters belonging to a single object, and finally performing verification through least-squares solution for consistent pose parameters. This approach to recognition can robustly identify objects among clutter and occlusion while achieving near real-time performance.
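The first recognition step described in the abstract, matching each feature against a database of features from known objects, can be sketched as below. This is only an illustration under stated assumptions: it uses an exhaustive Euclidean nearest-neighbor search rather than the fast nearest-neighbor algorithm the abstract refers to, and the function name `nearest_neighbors` and the toy 2-D descriptors are hypothetical.

```python
import math


def nearest_neighbors(query, database):
    """Return (index, distance) of the closest database descriptor for each query.

    Exhaustive search, used here for clarity only (an assumption of this
    sketch, not the fast algorithm mentioned in the abstract).
    """
    matches = []
    for q in query:
        best_idx, best_dist = -1, math.inf
        for i, d in enumerate(database):
            dist = math.dist(q, d)  # Euclidean distance between descriptors
            if dist < best_dist:
                best_idx, best_dist = i, dist
        matches.append((best_idx, best_dist))
    return matches


# Toy 2-D "descriptors" (hypothetical data for illustration).
database = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
query = [[0.9, 0.1], [0.1, 0.8]]
matches = nearest_neighbors(query, database)
print([idx for idx, _ in matches])  # [1, 2]
```

Real descriptors have many more dimensions and the database is far larger, which is why a fast search structure would replace the double loop in practice.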

Citations

Journal Article•DOI•

Vision-Aided Inertial Navigation for Spacecraft Entry, Descent, and Landing

Anastasios I. Mourikis¹, Nikolas Trawny², Stergios I. Roumeliotis², Andrew E. Johnson³, Adnan Ansar³, Larry Matthies³•Institutions (3)

University of California, Riverside¹, University of Minnesota², California Institute of Technology³

01 Apr 2009-IEEE Transactions on Robotics

TL;DR: The vision-aided inertial navigation (VISINAV) algorithm enables precision planetary landing; validation results from a sounding-rocket test flight vastly improve on the current state of the art for terminal descent navigation without visual updates and meet the requirements of future planetary exploration missions.

Abstract: In this paper, we present the vision-aided inertial navigation (VISINAV) algorithm that enables precision planetary landing. The vision front-end of the VISINAV system extracts 2-D-to-3-D correspondences between descent images and a surface map (mapped landmarks), as well as 2-D-to-2-D feature tracks through a sequence of descent images (opportunistic features). An extended Kalman filter (EKF) tightly integrates both types of visual feature observations with measurements from an inertial measurement unit. The filter computes accurate estimates of the lander's terrain-relative position, attitude, and velocity, in a resource-adaptive and hence real-time capable fashion. In addition to the technical analysis of the algorithm, the paper presents validation results from a sounding-rocket test flight, showing estimation errors of only 0.16 m/s for velocity and 6.4 m for position at touchdown. These results vastly improve current state of the art for terminal descent navigation without visual updates, and meet the requirements of future planetary exploration missions.

356 citations

Cites methods from "Distinctive Image Features from Sca..."

...However, the scale and rotation invariance of the SIFT keys, which increases the processing requirements, is not necessary in the EDL scenario considered here....
[...]
...SIFT keypoints can be reliably matched between images under large scale and in-plane orientation changes....
[...]
...One image-feature detection algorithm that was considered during our system’s design is the scale-invariant feature transform (SIFT) [7]....
[...]

Journal Article•DOI•

Scale Invariant Feature Transform

Tony Lindeberg

22 May 2012-Scholarpedia

TL;DR: The SIFT descriptor has been proven to be very useful in practice for robust image matching and object recognition under real-world conditions and has also been extended from grey-level to colour images and from 2-D spatial images to 2+1-D spatio-temporal video.

Abstract: Scale Invariant Feature Transform (SIFT) is an image descriptor for image-based matching developed by David Lowe (1999, 2004). This descriptor as well as related image descriptors are used for a large number of purposes in computer vision related to point matching between different views of a 3-D scene and view-based object recognition. The SIFT descriptor is invariant to translations, rotations and scaling transformations in the image domain and robust to moderate perspective transformations and illumination variations. Experimentally, the SIFT descriptor has been proven to be very useful in practice for robust image matching and object recognition under real-world conditions. In its original formulation, the SIFT descriptor comprised a method for detecting interest points from a grey-level image at which statistics of local gradient directions of image intensities were accumulated to give a summarizing description of the local image structures in a local neighbourhood around each interest point, with the intention that this descriptor should be used for matching corresponding interest points between different images. Later, the SIFT descriptor has also been applied at dense grids (dense SIFT) which have been shown to lead to better performance for tasks such as object categorization and texture classification. The SIFT descriptor has also been extended from grey-level to colour images and from 2-D spatial images to 2+1-D spatio-temporal video.

356 citations

Proceedings Article•DOI•

GMS: Grid-Based Motion Statistics for Fast, Ultra-Robust Feature Correspondence

Jia-Wang Bian¹, Wen-Yan Lin, Yasuyuki Matsushita², Sai-Kit Yeung¹, Tan-Dat Nguyen, Ming-Ming Cheng³•Institutions (3)

Singapore University of Technology and Design¹, Osaka University², Nankai University³

21 Jul 2017

TL;DR: GMS (Grid-based Motion Statistics), a simple means of encapsulating motion smoothness as the statistical likelihood of a certain number of matches in a region, enables translation of high match numbers into high match quality.

Abstract: Incorporating smoothness constraints into feature matching is known to enable ultra-robust matching. However, such formulations are both complex and slow, making them unsuitable for video applications. This paper proposes GMS (Grid-based Motion Statistics), a simple means of encapsulating motion smoothness as the statistical likelihood of a certain number of matches in a region. GMS enables translation of high match numbers into high match quality. This provides a real-time, ultra-robust correspondence system. Evaluation on videos with low texture, blur, and wide baselines shows that GMS consistently outperforms other real-time matchers and can achieve parity with more sophisticated, much slower techniques.

356 citations

Journal Article•DOI•

A Robust O(n) Solution to the Perspective-n-Point Problem

Shiqi Li¹, Chi Xu¹, Ming Xie²•Institutions (2)

Huazhong University of Science and Technology¹, Nanyang Technological University²

01 Jul 2012-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: A noniterative solution for the Perspective-n-Point problem that can robustly retrieve the optimum by solving a seventh-order polynomial; it is the first noniterative solution that can achieve more accurate results than the iterative algorithms when no redundant reference points can be used.

Abstract: We propose a noniterative solution for the Perspective-n-Point (PnP) problem, which can robustly retrieve the optimum by solving a seventh order polynomial. The central idea consists of three steps: 1) to divide the reference points into 3-point subsets in order to achieve a series of fourth order polynomials, 2) to compute the sum of the square of the polynomials so as to form a cost function, and 3) to find the roots of the derivative of the cost function in order to determine the optimum. The advantages of the proposed method are as follows: First, it can stably deal with the planar case, ordinary 3D case, and quasi-singular case, and it is as accurate as the state-of-the-art iterative algorithms with much less computational time. Second, it is the first noniterative PnP solution that can achieve more accurate results than the iterative algorithms when no redundant reference points can be used (n ≤ 5). Third, large-size point sets can be handled efficiently because its computational complexity is O(n).

355 citations

Journal Article•DOI•

Color Invariants for Person Reidentification

Igor Kviatkovsky¹, Amit Adam¹, Ehud Rivlin¹•Institutions (1)

Technion – Israel Institute of Technology¹

01 Jul 2013-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: This work revisits the problem of specific object recognition using color distributions, and shows that the intradistribution structure of a color distribution is invariant under a wide range of imaging conditions while being discriminative enough to be practical.

Abstract: We revisit the problem of specific object recognition using color distributions. In some applications, such as specific person identification, it is highly likely that the color distributions will be multimodal and hence contain a special structure. Although the color distribution changes under different lighting conditions, some aspects of its structure turn out to be invariants. We refer to this structure as an intradistribution structure, and show that it is invariant under a wide range of imaging conditions while being discriminative enough to be practical. Our signature uses shape context descriptors to represent the intradistribution structure. Assuming the widely used diagonal model, we validate that our signature is invariant under certain illumination changes. Experimentally, we use color information as the only cue to obtain good recognition performance on publicly available databases covering both indoor and outdoor conditions. Combining our approach with the complementary covariance descriptor, we demonstrate results exceeding the state-of-the-art performance on the challenging VIPeR and CAVIAR4REID databases.

355 citations

Additional excerpts

...The SIFT descriptor [38] has emerged as one of the most reliable descriptors....
[...]

References

Proceedings Article•DOI•

Object recognition from local scale-invariant features

David G. Lowe¹•Institutions (1)

University of British Columbia¹

20 Sep 1999

TL;DR: Experimental results show that robust object recognition can be achieved in cluttered partially occluded images with a computation time of under 2 seconds.

Abstract: An object recognition system has been developed that uses a new class of local image features. The features are invariant to image scaling, translation, and rotation, and partially invariant to illumination changes and affine or 3D projection. These features share similar properties with neurons in inferior temporal cortex that are used for object recognition in primate vision. Features are efficiently detected through a staged filtering approach that identifies stable points in scale space. Image keys are created that allow for local geometric deformations by representing blurred image gradients in multiple orientation planes and at multiple scales. The keys are used as input to a nearest neighbor indexing method that identifies candidate object matches. Final verification of each match is achieved by finding a low residual least squares solution for the unknown model parameters. Experimental results show that robust object recognition can be achieved in cluttered partially occluded images with a computation time of under 2 seconds.
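The final verification step, a least-squares solution for the unknown model parameters, can be illustrated with the sketch below. It assumes a 2-D affine model mapping model points to image points; the abstract does not name the model, so this choice is an assumption made for illustration. The helper names `solve3` and `fit_affine` are hypothetical, and the normal equations are solved with plain Gaussian elimination.

```python
def solve3(a, b):
    """Solve the 3x3 system a @ x = b by Gaussian elimination with partial pivoting."""
    n = 3
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_affine(src, dst):
    """Least-squares fit of u = a*x + b*y + tx and v = c*x + d*y + ty.

    src, dst: lists of matched (x, y) and (u, v) points; at least three
    non-collinear matches are needed (assumed model: 2-D affine).
    Returns ([a, b, tx], [c, d, ty]).
    """
    rows = [[x, y, 1.0] for x, y in src]
    # Normal equations (A^T A) p = A^T rhs, shared by both output coordinates.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atu = [sum(r[i] * u for r, (u, _) in zip(rows, dst)) for i in range(3)]
    atv = [sum(r[i] * v for r, (_, v) in zip(rows, dst)) for i in range(3)]
    return solve3(ata, atu), solve3(ata, atv)


# Hypothetical matches generated by u = 2x + 1, v = 3y - 1.
src = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
dst = [(1.0, -1.0), (3.0, -1.0), (1.0, 2.0), (3.0, 2.0)]
row_u, row_v = fit_affine(src, dst)
```

With more matches than unknowns, the residual of the fit gives a simple measure of how consistent a candidate set of matches is with a single object pose.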

16,989 citations

"Distinctive Image Features from Sca..." refers background or methods in this paper

...The initial implementation of this approach (Lowe, 1999) simply located keypoints at the location and scale of the central sample point....
[...]
...Earlier work by the author (Lowe, 1999) extended the local feature approach to achieve scale invariance....
[...]
...To efficiently detect stable keypoint locations in scale space, we have proposed (Lowe, 1999) using scale-space extrema in the difference-of-Gaussian function convolved with the image, D(x, y, σ), which can be computed from the difference of two nearby scales separated by a constant multiplicative…...
[...]
...More details on applications of these features to recognition are available in other papers (Lowe, 1999, 2001; Se et al., 2002)....
[...]
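The difference-of-Gaussian function named in the excerpts above is the difference of two Gaussian-smoothed copies of the image at nearby scales. A minimal sketch on a 1-D signal follows. The 1-D setting, the scale ratio `k = sqrt(2)`, the 3-sigma kernel truncation, the border clamping, and all function names are assumptions chosen for illustration, not details taken from the excerpt.

```python
import math


def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel truncated at about 3 sigma (assumed cutoff)."""
    radius = max(1, math.ceil(3 * sigma))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur(signal, sigma):
    """Convolve a 1-D signal with a Gaussian, clamping indices at the borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = min(max(i + j - radius, 0), n - 1)
            acc += w * signal[idx]
        out.append(acc)
    return out


def difference_of_gaussian(signal, sigma, k=math.sqrt(2)):
    """D(x, sigma) = L(x, k*sigma) - L(x, sigma), where L is the blurred signal."""
    coarse = blur(signal, k * sigma)
    fine = blur(signal, sigma)
    return [c - f for c, f in zip(coarse, fine)]


# A single bright sample: the response is negative at the peak because the
# coarser blur spreads the peak out more than the finer one.
impulse = [0.0] * 10 + [1.0] + [0.0] * 10
response = difference_of_gaussian(impulse, sigma=1.0)
```

Keypoint candidates would then be sought as local extrema of this response across both position and scale; that search is not shown here.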

Book•

Multiple view geometry in computer vision

Richard Hartley¹, Andrew Zisserman²•Institutions (2)

Australian National University¹, University of Oxford²

01 Jan 2000

TL;DR: This book describes recent developments in the theory and practice of scene reconstruction in a unified framework, covering the geometric principles and how to represent objects algebraically so they can be computed and applied; the authors provide comprehensive background material and explain how to apply the methods and implement the algorithms directly.

Abstract: From the Publisher: A basic problem in computer vision is to understand the structure of a real world scene given several images of it. Recent major developments in the theory and practice of scene reconstruction are described in detail in a unified framework. The book covers the geometric principles and how to represent objects algebraically so they can be computed and applied. The authors provide comprehensive background material and explain how to apply the methods and implement the algorithms directly.

15,558 citations

Multiple View Geometry in Computer Vision.

Bernhard P. Wrobel

01 Jan 2001

14,282 citations

"Distinctive Image Features from Sca..." refers background in this paper

...A more general solution would be to solve for the fundamental matrix (Luong and Faugeras, 1996; Hartley and Zisserman, 2000)....
[...]

Proceedings Article•DOI•

A Combined Corner and Edge Detector

Chris Harris, Mike Stephens

01 Jan 1988

TL;DR: The problem the authors are addressing in Alvey Project MMI149 is that of using computer vision to understand the unconstrained 3D world, in which the viewed scenes will in general contain too wide a diversity of objects for top-down recognition techniques to work.

Abstract: The problem we are addressing in Alvey Project MMI149 is that of using computer vision to understand the unconstrained 3D world, in which the viewed scenes will in general contain too wide a diversity of objects for top-down recognition techniques to work. For example, we desire to obtain an understanding of natural scenes, containing roads, buildings, trees, bushes, etc., as typified by the two frames from a sequence illustrated in Figure 1. The solution to this problem that we are pursuing is to use a computer vision system based upon motion analysis of a monocular image sequence from a mobile camera. By extraction and tracking of image features, representations of the 3D analogues of these features can be constructed.

13,993 citations

Journal Article•DOI•

Robust wide-baseline stereo from maximally stable extremal regions

Jiri Matas¹, Ondrej Chum, Martin Urban, Tomas Pajdla•Institutions (1)

University of Surrey¹

01 Sep 2004-Image and Vision Computing

TL;DR: The high utility of MSERs, multiple measurement regions and the robust metric is demonstrated in wide-baseline experiments on image pairs from both indoor and outdoor scenes.

3,422 citations