Distinctive Image Features from Scale-Invariant Keypoints

doi:10.1023/B:VISI.0000029664.99615.94

Home
/
Papers
/
Distinctive Image Features from Scale-Invariant Keypoints

Journal Article•DOI•

Distinctive Image Features from Scale-Invariant Keypoints

David G. Lowe¹•Institutions (1)

University of British Columbia¹

01 Nov 2004-International Journal of Computer Vision (Kluwer Academic Publishers)-Vol. 60, Iss: 2, pp 91-110

TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.

read less

Abstract: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene. The features are invariant to image scale and rotation, and are shown to provide robust matching across a substantial range of affine distortion, change in 3D viewpoint, addition of noise, and change in illumination. The features are highly distinctive, in the sense that a single feature can be correctly matched with high probability against a large database of features from many images. This paper also describes an approach to using these features for object recognition. The recognition proceeds by matching individual features to a database of features from known objects using a fast nearest-neighbor algorithm, followed by a Hough transform to identify clusters belonging to a single object, and finally performing verification through least-squares solution for consistent pose parameters. This approach to recognition can robustly identify objects among clutter and occlusion while achieving near real-time performance.

...read moreread less

Content maybe subject to copyright Report

Citations

PDF

Open Access

More filters

Proceedings Article•DOI•

Spatial coding for large scale partial-duplicate web image search

[...]

Wengang Zhou¹, Yijuan Lu², Houqiang Li¹, Yibing Song¹, Qi Tian³ - Show less +1 more•Institutions (3)

University of Science and Technology of China¹, Texas State University², University of Texas at San Antonio³

25 Oct 2010

TL;DR: This paper proposes a novel scheme, spatial coding, to encode the spatial relationships among local features in an image, and achieves a 53% improvement in mean average precision and 46% reduction in time cost over the baseline bag-of-words approach.

...read moreread less

Abstract: The state-of-the-art image retrieval approaches represent images with a high dimensional vector of visual words by quantizing local features, such as SIFT, in the descriptor space. The geometric clues among visual words in an image is usually ignored or exploited for full geometric verification, which is computationally expensive. In this paper, we focus on partial-duplicate web image retrieval, and propose a novel scheme, spatial coding, to encode the spatial relationships among local features in an image. Our spatial coding is both efficient and effective to discover false matches of local features between images, and can greatly improve retrieval performance. Experiments in partial-duplicate web image search, using a database of one million images, reveal that our approach achieves a 53% improvement in mean average precision and 46% reduction in time cost over the baseline bag-of-words approach.

...read moreread less

248 citations

Cites background or methods from "Distinctive Image Features from Sca..."

...OUR APPROACH In our approach, we adopt SIFT features [1] for image representation....
[...]
...Generally, geometric verification [1, 6] can be adopted to refine the matching results by discovering the transformation and filtering false positives....
[...]
...Similar to text words in information retrieval, local SIFT descriptors [1] are quantized to visual words....
[...]
...Geometric verification [1, 4, 6, 8, 10] has become very popular recently as an important post-processing step to improve the retrieval precision....
[...]
...Therefore, unlike full geometric verification with RANSAC [1, 6, 14], the computational cost is very low....
[...]

Journal Article•DOI•

A review of computer vision–based structural health monitoring at local and global levels:

[...]

Chuan-Zhi Dong¹, F. Necati Catbas¹•Institutions (1)

University of Central Florida¹

01 Mar 2021-Structural Health Monitoring-an International Journal

TL;DR: A general overview of the concepts, approaches, and real-life practice of computer vision–structural health monitoring along with some relevant literature that is rapidly accumulating is presented.

...read moreread less

Abstract: Structural health monitoring at local and global levels using computer vision technologies has gained much attention in the structural health monitoring community in research and practice. Due to t...

...read moreread less

248 citations

Cites methods from "Distinctive Image Features from Sca..."

...matching which traverses all the feature points, but takes a long time,(141) or K-Nearest Neighbor method and threshold can be used to reduce the computation expenses.(139) The selection of Euclidean or Hamming distance is based on the type of the descriptor vector....
[...]

Proceedings Article•DOI•

Unbiased Metric Learning: On the Utilization of Multiple Datasets and Web Images for Softening Bias

[...]

Chen Fang¹, Ye Xu¹, Daniel N. Rockmore¹•Institutions (1)

Dartmouth College¹

01 Dec 2013

TL;DR: This work proposes Unbiased Metric Learning (UML), a metric learning approach that learns a set of less biased candidate distance metrics on training examples from multiple biased datasets, based on structural SVM.

...read moreread less

Abstract: Many standard computer vision datasets exhibit biases due to a variety of sources including illumination condition, imaging system, and preference of dataset collectors. Biases like these can have downstream effects in the use of vision datasets in the construction of generalizable techniques, especially for the goal of the creation of a classification system capable of generalizing to unseen and novel datasets. In this work we propose Unbiased Metric Learning (UML), a metric learning approach, to achieve this goal. UML operates in the following two steps: (1) By varying hyper parameters, it learns a set of less biased candidate distance metrics on training examples from multiple biased datasets. The key idea is to learn a neighborhood for each example, which consists of not only examples of the same category from the same dataset, but those from other datasets. The learning framework is based on structural SVM. (2) We do model validation on a set of weakly-labeled web images retrieved by issuing class labels as keywords to search engine. The metric with best validation performance is selected. Although the web images sometimes have noisy labels, they often tend to be less biased, which makes them suitable for the validation set in our task. Cross-dataset image classification experiments are carried out. Results show significant performance improvement on four well-known computer vision datasets.

...read moreread less

248 citations

Proceedings Article•

A Review on Image Feature Extraction and Representation Techniques

[...]

Dong ping Tian

01 Jul 2013

TL;DR: This paper analyzes the effectiveness of the fusion of global and local features in automatic image annotation and content based image retrieval community, including some classic models and their illustrations in the literature.

...read moreread less

Abstract: Feature extraction and representation is a crucial step for multimedia processing. How to extract ideal features that can reflect the intrinsic content of the images as complete as possible is still a challenging problem in computer vision. However, very little research has paid attention to this problem in the last decades. So in this paper, we focus our review on the latest development in image feature extraction and provide a comprehensive survey on image feature representation techniques. In particular, we analyze the effectiveness of the fusion of global and local features in automatic image annotation and content based image retrieval community, including some classic models and their illustrations in the literature. Finally, we summarize this paper with some important conclusions and point out the future potential research directions.

...read moreread less

248 citations

Journal Article•DOI•

State of the Art on 3D Reconstruction with RGB-D Cameras

[...]

Michael Zollhöfer¹, Michael Zollhöfer², Patrick Stotko³, Andreas Gorlitz⁴, Christian Theobalt¹, Matthias Nießner⁵, Reinhard Klein³, Andreas Kolb⁴ - Show less +4 more•Institutions (5)

Max Planck Society¹, Stanford University², University of Bonn³, University of Siegen⁴, Technische Universität München⁵

01 May 2018-Computer Graphics Forum

TL;DR: This report explains, compare, and critically analyze the common underlying algorithmic concepts that enabled recent developments in RGB‐D scene reconstruction in detail, and shows how algorithms are designed to best exploit the benefits ofRGB‐D data while suppressing their often non‐trivial data distortions.

...read moreread less

Abstract: The advent of affordable consumer grade RGB‐D cameras has brought about a profound advancement of visual scene reconstruction methods. Both computer graphics and computer vision researchers spend significant effort to develop entirely new algorithms to capture comprehensive shape models of static and dynamic scenes with RGB‐D cameras. This led to significant advances of the state of the art along several dimensions. Some methods achieve very high reconstruction detail, despite limited sensor resolution. Others even achieve real‐time performance, yet possibly at lower quality. New concepts were developed to capture scenes at larger spatial and temporal extent. Other recent algorithms flank shape reconstruction with concurrent material and lighting estimation, even in general scenes and unconstrained conditions. In this state‐of‐the‐art report, we analyze these recent developments in RGB‐D scene reconstruction in detail and review essential related work. We explain, compare, and critically analyze the common underlying algorithmic concepts that enabled these recent advancements. Furthermore, we show how algorithms are designed to best exploit the benefits of RGB‐D data while suppressing their often non‐trivial data distortions. In addition, this report identifies and discusses important open research questions and suggests relevant directions for future work.

...read moreread less

248 citations

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
…
184
185
186
187
188
189
190
…
191
192
193
194
195
196
197
198
199
200

Collapse

References

PDF

Open Access

More filters

Proceedings Article•DOI•

Object recognition from local scale-invariant features

[...]

David G. Lowe¹•Institutions (1)

University of British Columbia¹

20 Sep 1999

TL;DR: Experimental results show that robust object recognition can be achieved in cluttered partially occluded images with a computation time of under 2 seconds.

...read moreread less

Abstract: An object recognition system has been developed that uses a new class of local image features. The features are invariant to image scaling, translation, and rotation, and partially invariant to illumination changes and affine or 3D projection. These features share similar properties with neurons in inferior temporal cortex that are used for object recognition in primate vision. Features are efficiently detected through a staged filtering approach that identifies stable points in scale space. Image keys are created that allow for local geometric deformations by representing blurred image gradients in multiple orientation planes and at multiple scales. The keys are used as input to a nearest neighbor indexing method that identifies candidate object matches. Final verification of each match is achieved by finding a low residual least squares solution for the unknown model parameters. Experimental results show that robust object recognition can be achieved in cluttered partially occluded images with a computation time of under 2 seconds.

...read moreread less

16,989 citations

"Distinctive Image Features from Sca..." refers background or methods in this paper

...The initial implementation of this approach (Lowe, 1999) simply located keypoints at the location and scale of the central sample point....
[...]
...Earlier work by the author (Lowe, 1999) extended the local feature approach to achieve scale invariance....
[...]
...More details on applications of these features to recognition are available in other pape rs (Lowe, 1999; Lowe, 2001; Se, Lowe and Little, 2002)....
[...]
...To efficiently detect stable keypoint locations in scale space, we have proposed (Lowe, 1999) using scalespace extrema in the difference-of-Gaussian function convolved with the image, D(x, y, σ ), which can be computed from the difference of two nearby scales separated by a constant multiplicative…...
[...]
...More details on applications of these features to recognition are available in other papers (Lowe, 1999, 2001; Se et al., 2002)....
[...]

Book•

Multiple view geometry in computer vision

[...]

Richard Hartley¹, Andrew Zisserman²•Institutions (2)

Australian National University¹, University of Oxford²

01 Jan 2000

TL;DR: In this article, the authors provide comprehensive background material and explain how to apply the methods and implement the algorithms directly in a unified framework, including geometric principles and how to represent objects algebraically so they can be computed and applied.

...read moreread less

Abstract: From the Publisher: A basic problem in computer vision is to understand the structure of a real world scene given several images of it. Recent major developments in the theory and practice of scene reconstruction are described in detail in a unified framework. The book covers the geometric principles and how to represent objects algebraically so they can be computed and applied. The authors provide comprehensive background material and explain how to apply the methods and implement the algorithms directly.

...read moreread less

15,558 citations

Multiple View Geometry in Computer Vision.

[...]

Bernhard P. Wrobel

01 Jan 2001

TL;DR: This book is referred to read because it is an inspiring book to give you more chance to get experiences and also thoughts and it will show the best book collections and completed collections.

...read moreread less

Abstract: Downloading the book in this website lists can give you more advantages. It will show you the best book collections and completed collections. So many books can be found in this website. So, this is not only this multiple view geometry in computer vision. However, this book is referred to read because it is an inspiring book to give you more chance to get experiences and also thoughts. This is simple, read the soft file of the book and you get it.

...read moreread less

14,282 citations

"Distinctive Image Features from Sca..." refers background in this paper

...A more general solution would be to solve for the fundamental matrix (Luong and Faugeras, 1996; Hartley and Zisserman, 2000)....
[...]

Proceedings Article•DOI•

A Combined Corner and Edge Detector

[...]

Chris Harris, Mike Stephens

01 Jan 1988

TL;DR: The problem the authors are addressing in Alvey Project MMI149 is that of using computer vision to understand the unconstrained 3D world, in which the viewed scenes will in general contain too wide a diversity of objects for topdown recognition techniques to work.

...read moreread less

Abstract: The problem we are addressing in Alvey Project MMI149 is that of using computer vision to understand the unconstrained 3D world, in which the viewed scenes will in general contain too wide a diversity of objects for topdown recognition techniques to work. For example, we desire to obtain an understanding of natural scenes, containing roads, buildings, trees, bushes, etc., as typified by the two frames from a sequence illustrated in Figure 1. The solution to this problem that we are pursuing is to use a computer vision system based upon motion analysis of a monocular image sequence from a mobile camera. By extraction and tracking of image features, representations of the 3D analogues of these features can be constructed.

...read moreread less

13,993 citations

Journal Article•DOI•

Robust wide-baseline stereo from maximally stable extremal regions

[...]

Jiri Matas¹, Ondrej Chum, Martin Urban, Tomas Pajdla•Institutions (1)

University of Surrey¹

01 Sep 2004-Image and Vision Computing

TL;DR: The high utility of MSERs, multiple measurement regions and the robust metric is demonstrated in wide-baseline experiments on image pairs from both indoor and outdoor scenes.

...read moreread less

3,422 citations