Distinctive Image Features from Scale-Invariant Keypoints

doi:10.1023/B:VISI.0000029664.99615.94

Home
/
Papers
/
Distinctive Image Features from Scale-Invariant Keypoints

Journal Article•DOI•

Distinctive Image Features from Scale-Invariant Keypoints

David G. Lowe¹•Institutions (1)

University of British Columbia¹

01 Nov 2004-International Journal of Computer Vision (Kluwer Academic Publishers)-Vol. 60, Iss: 2, pp 91-110

TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.

read less

Abstract: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene. The features are invariant to image scale and rotation, and are shown to provide robust matching across a substantial range of affine distortion, change in 3D viewpoint, addition of noise, and change in illumination. The features are highly distinctive, in the sense that a single feature can be correctly matched with high probability against a large database of features from many images. This paper also describes an approach to using these features for object recognition. The recognition proceeds by matching individual features to a database of features from known objects using a fast nearest-neighbor algorithm, followed by a Hough transform to identify clusters belonging to a single object, and finally performing verification through least-squares solution for consistent pose parameters. This approach to recognition can robustly identify objects among clutter and occlusion while achieving near real-time performance.

...read moreread less

Content maybe subject to copyright Report

Citations

PDF

Open Access

More filters

Journal Article•DOI•

Efficient HOG human detection

[...]

Yanwei Pang¹, Yuan Yuan², Xuelong Li², Jing Pan³•Institutions (3)

Tianjin University¹, Chinese Academy of Sciences², Tianjin University of Technology and Education³

01 Apr 2011-Signal Processing

TL;DR: Experimental results on the top-view database and the well-known INRIA data set have demonstrated the effectiveness and efficiency of the proposed HOG+SVM method.

...read moreread less

278 citations

Journal Article•DOI•

Color-to-Grayscale: Does the Method Matter in Image Recognition?

[...]

Christopher Kanan¹, Garrison W. Cottrell¹•Institutions (1)

University of California, San Diego¹

10 Jan 2012-PLOS ONE

TL;DR: A simple method is identified that generally works best for face and object recognition, and two that work well for recognizing textures, which are tested using a modern descriptor-based image recognition framework.

...read moreread less

Abstract: In image recognition it is often assumed the method used to convert color images to grayscale has little impact on recognition performance. We compare thirteen different grayscale algorithms with four types of image descriptors and demonstrate that this assumption is wrong: not all color-to-grayscale algorithms work equally well, even when using descriptors that are robust to changes in illumination. These methods are tested using a modern descriptor-based image recognition framework, on face, object, and texture datasets, with relatively few training instances. We identify a simple method that generally works best for face and object recognition, and two that work well for recognizing textures.

...read moreread less

278 citations

Journal Article•DOI•

A Quantitative Evaluation of Confidence Measures for Stereo Vision

[...]

Xiaoyan Hu¹, Philippos Mordohai¹•Institutions (1)

Stevens Institute of Technology¹

01 Nov 2012-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: An extensive evaluation of 17 confidence measures for stereo matching that compares the most widely used measures as well as several novel techniques proposed here, and finds that such an evaluation is missing from the rapidly maturing stereo literature.

...read moreread less

Abstract: We present an extensive evaluation of 17 confidence measures for stereo matching that compares the most widely used measures as well as several novel techniques proposed here. We begin by categorizing these methods according to which aspects of stereo cost estimation they take into account and then assess their strengths and weaknesses. The evaluation is conducted using a winner-take-all framework on binocular and multibaseline datasets with ground truth. It measures the capability of each confidence method to rank depth estimates according to their likelihood for being correct, to detect occluded pixels, and to generate low-error depth maps by selecting among multiple hypotheses for each pixel. Our work was motivated by the observation that such an evaluation is missing from the rapidly maturing stereo literature and that our findings would be helpful to researchers in binocular and multiview stereo.

...read moreread less

278 citations

Journal Article•DOI•

Content-Based Copy Retrieval Using Distortion-Based Probabilistic Similarity Search

[...]

Alexis Joly¹, Olivier Buisson, Carl Frélicot²•Institutions (2)

French Institute for Research in Computer Science and Automation¹, University of La Rochelle²

01 Feb 2007-IEEE Transactions on Multimedia

TL;DR: This paper proposes a new approximate similarity search technique in which the probabilistic selection of the feature space regions is not based on the distribution in the database but on the Distribution of the features distortion.

...read moreread less

Abstract: Content-based copy retrieval (CBCR) aims at retrieving in a database all the modified versions or the previous versions of a given candidate object. In this paper, we present a copy-retrieval scheme based on local features that can deal with very large databases both in terms of quality and speed. We first propose a new approximate similarity search technique in which the probabilistic selection of the feature space regions is not based on the distribution in the database but on the distribution of the features distortion. Since our CBCR framework is based on local features, the approximation can be strong and reduce drastically the amount of data to explore. Furthermore, we show how the discrimination of the global retrieval can be enhanced during its post-processing step, by considering only the geometrically consistent matches. This framework is applied to robust video copy retrieval and extensive experiments are presented to study the interactions between the approximate search and the retrieval efficiency. Largest used database contains more than 1 billion local features corresponding to 30000 h of video

...read moreread less

277 citations

Proceedings Article•DOI•

Better matching with fewer features: The selection of useful features in large database recognition problems

[...]

Panu Turcot¹, David G. Lowe¹•Institutions (1)

University of British Columbia¹

01 Sep 2009

TL;DR: This paper is able to select “useful” features, which are both robust and distinctive, by an unsupervised preprocessing step that identifies correctly matching features among the training images, and demonstrates adjacent and 2-adjacent augmentation, both of which give a substantial boost in performance.

...read moreread less

Abstract: There has been recent progress on the problem of recognizing specific objects in very large datasets. The most common approach has been based on the bag-of-words (BOW) method, in which local image features are clustered into visual words. This can provide significant savings in memory compared to storing and matching each feature independently. In this paper we take an additional step to reducing memory requirements by selecting only a small subset of the training features to use for recognition. This is based on the observation that many local features are unreliable or represent irrelevant clutter. We are able to select “useful” features, which are both robust and distinctive, by an unsupervised preprocessing step that identifies correctly matching features among the training images. We demonstrate that this selection approach allows an average of 4% of the original features per image to provide matching performance that is as accurate as the full set. In addition, we employ a graph to represent the matching relationships between images. Doing so enables us to effectively augment the feature set for each image through merging of useful features of neighboring images. We demonstrate adjacent and 2-adjacent augmentation, both of which give a substantial boost in performance.

...read moreread less

276 citations

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
…
160
161
162
163
164
165
166
…
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse

References

PDF

Open Access

More filters

Proceedings Article•DOI•

Object recognition from local scale-invariant features

[...]

David G. Lowe¹•Institutions (1)

University of British Columbia¹

20 Sep 1999

TL;DR: Experimental results show that robust object recognition can be achieved in cluttered partially occluded images with a computation time of under 2 seconds.

...read moreread less

Abstract: An object recognition system has been developed that uses a new class of local image features. The features are invariant to image scaling, translation, and rotation, and partially invariant to illumination changes and affine or 3D projection. These features share similar properties with neurons in inferior temporal cortex that are used for object recognition in primate vision. Features are efficiently detected through a staged filtering approach that identifies stable points in scale space. Image keys are created that allow for local geometric deformations by representing blurred image gradients in multiple orientation planes and at multiple scales. The keys are used as input to a nearest neighbor indexing method that identifies candidate object matches. Final verification of each match is achieved by finding a low residual least squares solution for the unknown model parameters. Experimental results show that robust object recognition can be achieved in cluttered partially occluded images with a computation time of under 2 seconds.

...read moreread less

16,989 citations

"Distinctive Image Features from Sca..." refers background or methods in this paper

...The initial implementation of this approach (Lowe, 1999) simply located keypoints at the location and scale of the central sample point....
[...]
...Earlier work by the author (Lowe, 1999) extended the local feature approach to achieve scale invariance....
[...]
...More details on applications of these features to recognition are available in other pape rs (Lowe, 1999; Lowe, 2001; Se, Lowe and Little, 2002)....
[...]
...To efficiently detect stable keypoint locations in scale space, we have proposed (Lowe, 1999) using scalespace extrema in the difference-of-Gaussian function convolved with the image, D(x, y, σ ), which can be computed from the difference of two nearby scales separated by a constant multiplicative…...
[...]
...More details on applications of these features to recognition are available in other papers (Lowe, 1999, 2001; Se et al., 2002)....
[...]

Book•

Multiple view geometry in computer vision

[...]

Richard Hartley¹, Andrew Zisserman²•Institutions (2)

Australian National University¹, University of Oxford²

01 Jan 2000

TL;DR: In this article, the authors provide comprehensive background material and explain how to apply the methods and implement the algorithms directly in a unified framework, including geometric principles and how to represent objects algebraically so they can be computed and applied.

...read moreread less

Abstract: From the Publisher: A basic problem in computer vision is to understand the structure of a real world scene given several images of it. Recent major developments in the theory and practice of scene reconstruction are described in detail in a unified framework. The book covers the geometric principles and how to represent objects algebraically so they can be computed and applied. The authors provide comprehensive background material and explain how to apply the methods and implement the algorithms directly.

...read moreread less

15,558 citations

Multiple View Geometry in Computer Vision.

[...]

Bernhard P. Wrobel

01 Jan 2001

TL;DR: This book is referred to read because it is an inspiring book to give you more chance to get experiences and also thoughts and it will show the best book collections and completed collections.

...read moreread less

Abstract: Downloading the book in this website lists can give you more advantages. It will show you the best book collections and completed collections. So many books can be found in this website. So, this is not only this multiple view geometry in computer vision. However, this book is referred to read because it is an inspiring book to give you more chance to get experiences and also thoughts. This is simple, read the soft file of the book and you get it.

...read moreread less

14,282 citations

"Distinctive Image Features from Sca..." refers background in this paper

...A more general solution would be to solve for the fundamental matrix (Luong and Faugeras, 1996; Hartley and Zisserman, 2000)....
[...]

Proceedings Article•DOI•

A Combined Corner and Edge Detector

[...]

Chris Harris, Mike Stephens

01 Jan 1988

TL;DR: The problem the authors are addressing in Alvey Project MMI149 is that of using computer vision to understand the unconstrained 3D world, in which the viewed scenes will in general contain too wide a diversity of objects for topdown recognition techniques to work.

...read moreread less

Abstract: The problem we are addressing in Alvey Project MMI149 is that of using computer vision to understand the unconstrained 3D world, in which the viewed scenes will in general contain too wide a diversity of objects for topdown recognition techniques to work. For example, we desire to obtain an understanding of natural scenes, containing roads, buildings, trees, bushes, etc., as typified by the two frames from a sequence illustrated in Figure 1. The solution to this problem that we are pursuing is to use a computer vision system based upon motion analysis of a monocular image sequence from a mobile camera. By extraction and tracking of image features, representations of the 3D analogues of these features can be constructed.

...read moreread less

13,993 citations

Journal Article•DOI•

Robust wide-baseline stereo from maximally stable extremal regions

[...]

Jiri Matas¹, Ondrej Chum, Martin Urban, Tomas Pajdla•Institutions (1)

University of Surrey¹

01 Sep 2004-Image and Vision Computing

TL;DR: The high utility of MSERs, multiple measurement regions and the robust metric is demonstrated in wide-baseline experiments on image pairs from both indoor and outdoor scenes.

...read moreread less

3,422 citations