Distinctive Image Features from Scale-Invariant Keypoints

doi:10.1023/B:VISI.0000029664.99615.94

Journal Article•DOI•

David G. Lowe¹•Institutions (1)

University of British Columbia¹

01 Nov 2004-International Journal of Computer Vision (Kluwer Academic Publishers)-Vol. 60, Iss: 2, pp 91-110

TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.

Abstract: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene. The features are invariant to image scale and rotation, and are shown to provide robust matching across a substantial range of affine distortion, change in 3D viewpoint, addition of noise, and change in illumination. The features are highly distinctive, in the sense that a single feature can be correctly matched with high probability against a large database of features from many images. This paper also describes an approach to using these features for object recognition. The recognition proceeds by matching individual features to a database of features from known objects using a fast nearest-neighbor algorithm, followed by a Hough transform to identify clusters belonging to a single object, and finally performing verification through least-squares solution for consistent pose parameters. This approach to recognition can robustly identify objects among clutter and occlusion while achieving near real-time performance.
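
The matching step described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: exhaustive search stands in for the fast nearest-neighbor algorithm, and the function name `match_descriptors`, the distance-ratio check, and the `max_ratio=0.8` default are assumptions introduced here for illustration.

```python
import math


def match_descriptors(query, database, max_ratio=0.8):
    """Match each query descriptor to its nearest database descriptor.

    Hypothetical helper: a match is kept only when the nearest neighbor is
    clearly closer than the second nearest (ratio below max_ratio).
    """
    matches = []
    for qi, q in enumerate(query):
        # Brute-force Euclidean distances to every database descriptor,
        # sorted so the two closest candidates come first.
        ranked = sorted((math.dist(q, d), di) for di, d in enumerate(database))
        if len(ranked) < 2:
            continue  # the ratio check needs two candidates
        (best, best_idx), (second, _) = ranked[0], ranked[1]
        if second > 0 and best / second < max_ratio:
            matches.append((qi, best_idx))
    return matches


database = [[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]]
query = [[0.1, 0.0], [5.0, 0.0]]
print(match_descriptors(query, database))  # [(0, 0)]
```

The second query point lies exactly between two database entries, so it is rejected as ambiguous; only the first query yields a match.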

Citations

Journal Article•DOI•

Visual Turing test for computer vision systems

Donald Geman¹, Stuart Geman², Neil Hallonquist¹, Laurent Younes¹•Institutions (2)

Johns Hopkins University¹, Brown University²

24 Mar 2015-Proceedings of the National Academy of Sciences of the United States of America

TL;DR: This work presents a different evaluation system, in which a query engine prepares a written test that uses binary questions to probe a system’s ability to identify attributes and relationships in addition to recognizing objects.

Abstract: Today, computer vision systems are tested by their accuracy in detecting and localizing instances of objects. As an alternative, and motivated by the ability of humans to provide far richer descriptions and even tell a story about an image, we construct a “visual Turing test”: an operator-assisted device that produces a stochastic sequence of binary questions from a given test image. The query engine proposes a question; the operator either provides the correct answer or rejects the question as ambiguous; the engine proposes the next question (“just-in-time truthing”). The test is then administered to the computer-vision system, one question at a time. After the system’s answer is recorded, the system is provided the correct answer and the next question. Parsing is trivial and deterministic; the system being tested requires no natural language processing. The query engine employs statistical constraints, learned from a training set, to produce questions with essentially unpredictable answers—the answer to a question, given the history of questions and their correct answers, is nearly equally likely to be positive or negative. In this sense, the test is only about vision. The system is designed to produce streams of questions that follow natural story lines, from the instantiation of a unique object, through an exploration of its properties, and on to its relationships with other uniquely instantiated objects.

331 citations

Proceedings Article•DOI•

A framework for visual saliency detection with applications to image thumbnailing

Luca Marchesotti¹, Claudio Cifarelli¹, Gabriela Csurka¹•Institutions (1)

Xerox¹

01 Sep 2009

TL;DR: A novel framework for visual saliency detection based on a simple principle: images sharing their global visual appearances are likely to share similar salience, which outperforms state-of-the-art approaches.

Abstract: We propose a novel framework for visual saliency detection based on a simple principle: images sharing their global visual appearances are likely to share similar salience. Assuming that an annotated image database is available, we first retrieve the most similar images to the target image; secondly, we build a simple classifier and we use it to generate saliency maps. Finally, we refine the maps and we extract thumbnails. We show that in spite of its simplicity, our framework outperforms state-of-the-art approaches. Another advantage is its ability to deal with visual pop-up and application/task-driven saliency, if appropriately annotated images are available.

331 citations

Proceedings Article•DOI•

Scene Modelling, Recognition and Tracking with Invariant Image Features

Iryna Skrypnyk¹, David G. Lowe¹•Institutions (1)

University of British Columbia¹

02 Nov 2004

TL;DR: A complete system architecture for fully automated markerless augmented reality that constructs a sparse metric model of the real-world environment, provides interactive means for specifying the pose of a virtual object, and performs model-based camera tracking with visually pleasing augmentation results is presented.

Abstract: We present a complete system architecture for fully automated markerless augmented reality (AR). The system constructs a sparse metric model of the real-world environment, provides interactive means for specifying the pose of a virtual object, and performs model-based camera tracking with visually pleasing augmentation results. Our approach does not require camera pre-calibration, prior knowledge of scene geometry, manual initialization of the tracker or placement of special markers. Robust tracking in the presence of occlusions and scene changes is achieved by using highly distinctive natural features to establish image correspondences.

330 citations

Journal Article•DOI•

Conceptual spatial representations for indoor mobile robots

Hendrik Zender¹, O. Martinez Mozos², Patric Jensfelt³, Geert-Jan M. Kruijff¹, Wolfram Burgard², +1 more•Institutions (3)

German Research Centre for Artificial Intelligence¹, University of Freiburg², Royal Institute of Technology³

01 Jun 2008-Robotics and Autonomous Systems

TL;DR: An approach for creating conceptual representations of human-made indoor environments using mobile robots that is composed of layers representing maps at different levels of abstraction and incorporates a linguistic framework that actively supports the map acquisition process.

330 citations

Journal Article•DOI•

ASIFT: An Algorithm for Fully Affine Invariant Comparison

Guoshen Yu¹, Jean-Michel Morel•Institutions (1)

École Polytechnique¹

24 Feb 2011-Image Processing On Line

TL;DR: AffineSIFT (ASIFT) simulates a set of sample views of the initial images, obtainable by varying the two camera axis orientation parameters, namely the latitude and the longitude angles, which are not treated by the SIFT method.

Abstract: If a physical object has a smooth or piecewise smooth boundary, its images obtained by cameras in varying positions undergo smooth apparent deformations. These deformations are locally well approximated by affine transforms of the image plane. In consequence, the solid object recognition problem has often been led back to the computation of affine invariant image local features. The similarity invariance (invariance to translation, rotation, and zoom) is dealt with rigorously by the SIFT method. The method illustrated and demonstrated in this work, AffineSIFT (ASIFT), simulates a set of sample views of the initial images, obtainable by varying the two camera axis orientation parameters, namely the latitude and the longitude angles, which are not treated by the SIFT method. Then it applies the SIFT method itself to all images thus generated. Thus, ASIFT covers effectively all six parameters of the affine transform. Source Code: The source code (ANSI C), its documentation, and the online demo are accessible at the IPOL web page of this article.
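
The view-simulation idea in this abstract can be illustrated with a small sketch that builds one 2x2 linear map per sampled (latitude, longitude) pair. Everything specific here is an assumption rather than something stated in the abstract: the convention that latitude induces a tilt t = 1/cos(latitude) along one axis after an in-plane rotation by the longitude, the function names, and the sampling grid in the usage lines. Warping the images and running SIFT on each warped view is omitted.

```python
import math


def simulated_view_matrix(latitude_deg, longitude_deg):
    """Return a 2x2 linear map for one simulated camera orientation.

    Assumed convention: rotate by the longitude, then stretch the first
    axis by the tilt t = 1 / cos(latitude).
    """
    if not 0.0 <= latitude_deg < 90.0:
        raise ValueError("latitude must lie in [0, 90) degrees")
    theta = math.radians(latitude_deg)
    phi = math.radians(longitude_deg)
    t = 1.0 / math.cos(theta)  # tilt is 1 for a frontal view and grows with latitude
    c, s = math.cos(phi), math.sin(phi)
    # Product diag(t, 1) @ rotation(phi), written out element by element.
    return [[t * c, -t * s],
            [s, c]]


def sample_views(latitudes_deg, longitudes_deg):
    """Enumerate (latitude, longitude, matrix) for every sampled orientation."""
    return [(lat, lon, simulated_view_matrix(lat, lon))
            for lat in latitudes_deg
            for lon in longitudes_deg]


# Hypothetical sampling grid, chosen only for illustration.
for lat, lon, m in sample_views([0.0, 45.0, 60.0], [0.0, 90.0]):
    print(lat, lon, m)
```

Under this convention the determinant of each matrix equals the tilt, so a frontal view (latitude 0) reduces to a pure rotation.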

329 citations

References

Proceedings Article•DOI•

Object recognition from local scale-invariant features

David G. Lowe¹•Institutions (1)

University of British Columbia¹

20 Sep 1999

TL;DR: Experimental results show that robust object recognition can be achieved in cluttered partially occluded images with a computation time of under 2 seconds.

Abstract: An object recognition system has been developed that uses a new class of local image features. The features are invariant to image scaling, translation, and rotation, and partially invariant to illumination changes and affine or 3D projection. These features share similar properties with neurons in inferior temporal cortex that are used for object recognition in primate vision. Features are efficiently detected through a staged filtering approach that identifies stable points in scale space. Image keys are created that allow for local geometric deformations by representing blurred image gradients in multiple orientation planes and at multiple scales. The keys are used as input to a nearest neighbor indexing method that identifies candidate object matches. Final verification of each match is achieved by finding a low residual least squares solution for the unknown model parameters. Experimental results show that robust object recognition can be achieved in cluttered partially occluded images with a computation time of under 2 seconds.
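
The final verification step, a least-squares fit of model parameters to matched feature locations, can be sketched as below. The abstract does not specify the model in detail, so this example swaps in a 2-D similarity transform (rotation, uniform scale, translation) as a simplifying assumption; the function name `fit_similarity` and the returned RMS residual are likewise hypothetical.

```python
import math


def fit_similarity(src, dst):
    """Least-squares similarity transform mapping src points onto dst points.

    Assumed model: x' = a*x - b*y + tx,  y' = b*x + a*y + ty.
    Returns ((a, b, tx, ty), rms_residual).
    """
    n = len(src)
    if n < 2 or n != len(dst):
        raise ValueError("need at least two point pairs of equal count")
    # Centroids of both point sets.
    mx = sum(p[0] for p in src) / n
    my = sum(p[1] for p in src) / n
    ux = sum(p[0] for p in dst) / n
    uy = sum(p[1] for p in dst) / n
    num_a = num_b = den = 0.0
    for (x, y), (u, v) in zip(src, dst):
        dx, dy, du, dv = x - mx, y - my, u - ux, v - uy
        num_a += dx * du + dy * dv
        num_b += dx * dv - dy * du
        den += dx * dx + dy * dy
    if den == 0.0:
        raise ValueError("source points must not all coincide")
    a, b = num_a / den, num_b / den
    # Translation maps the source centroid onto the destination centroid.
    tx = ux - (a * mx - b * my)
    ty = uy - (b * mx + a * my)
    sq = 0.0
    for (x, y), (u, v) in zip(src, dst):
        sq += (a * x - b * y + tx - u) ** 2 + (b * x + a * y + ty - v) ** 2
    return (a, b, tx, ty), math.sqrt(sq / n)


src = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
dst = [(3.0, 4.0), (3.0, 6.0), (1.0, 4.0)]  # 90-degree rotation, scale 2, shift (3, 4)
params, residual = fit_similarity(src, dst)
print(params, residual)
```

A small residual indicates that the matched points agree on a single pose; a candidate match set with a large residual would be rejected.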

16,989 citations

"Distinctive Image Features from Sca..." refers background or methods in this paper

...The initial implementation of this approach (Lowe, 1999) simply located keypoints at the location and scale of the central sample point....
[...]
...Earlier work by the author (Lowe, 1999) extended the local feature approach to achieve scale invariance....
[...]
...More details on applications of these features to recognition are available in other papers (Lowe, 1999; Lowe, 2001; Se, Lowe and Little, 2002)....
[...]
...To efficiently detect stable keypoint locations in scale space, we have proposed (Lowe, 1999) using scale-space extrema in the difference-of-Gaussian function convolved with the image, D(x, y, σ), which can be computed from the difference of two nearby scales separated by a constant multiplicative…...
[...]
...More details on applications of these features to recognition are available in other papers (Lowe, 1999, 2001; Se et al., 2002)....
[...]
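
The difference-of-Gaussian function mentioned in the excerpts above can be sketched on a 1-D signal for brevity (the paper works on 2-D images). This is an illustrative sketch under assumptions: the helper names, the kernel truncation at three standard deviations, the border clamping, and the default factor `k = sqrt(2)` between the two scales are all choices made here, not details taken from the excerpts.

```python
import math


def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur(signal, sigma):
    """Convolve the signal with a Gaussian, clamping indices at the borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = min(max(i + j - radius, 0), n - 1)
            acc += w * signal[idx]
        out.append(acc)
    return out


def difference_of_gaussian(signal, sigma, k=math.sqrt(2)):
    """Blur at scale k*sigma minus blur at scale sigma."""
    coarse = blur(signal, k * sigma)
    fine = blur(signal, sigma)
    return [c - f for c, f in zip(coarse, fine)]


impulse = [0.0] * 10 + [1.0] + [0.0] * 10
dog = difference_of_gaussian(impulse, sigma=1.0)
print(min(dog), dog.index(min(dog)))
```

Candidate keypoints would then be sought at local extrema of this response across both position and scale; that search is not shown here.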

Book•

Multiple view geometry in computer vision

Richard Hartley¹, Andrew Zisserman²•Institutions (2)

Australian National University¹, University of Oxford²

01 Jan 2000

TL;DR: In this book, the authors provide comprehensive background material and explain how to apply the methods and implement the algorithms directly in a unified framework, including geometric principles and how to represent objects algebraically so they can be computed and applied.

Abstract: From the Publisher: A basic problem in computer vision is to understand the structure of a real world scene given several images of it. Recent major developments in the theory and practice of scene reconstruction are described in detail in a unified framework. The book covers the geometric principles and how to represent objects algebraically so they can be computed and applied. The authors provide comprehensive background material and explain how to apply the methods and implement the algorithms directly.

15,558 citations

Multiple View Geometry in Computer Vision.

Bernhard P. Wrobel

01 Jan 2001

TL;DR: This book is referred to read because it is an inspiring book to give you more chance to get experiences and also thoughts and it will show the best book collections and completed collections.

Abstract: Downloading the book in this website lists can give you more advantages. It will show you the best book collections and completed collections. So many books can be found in this website. So, this is not only this multiple view geometry in computer vision. However, this book is referred to read because it is an inspiring book to give you more chance to get experiences and also thoughts. This is simple, read the soft file of the book and you get it.

14,282 citations

"Distinctive Image Features from Sca..." refers background in this paper

...A more general solution would be to solve for the fundamental matrix (Luong and Faugeras, 1996; Hartley and Zisserman, 2000)....
[...]

Proceedings Article•DOI•

A Combined Corner and Edge Detector

Chris Harris, Mike Stephens

01 Jan 1988

TL;DR: The problem the authors are addressing in Alvey Project MMI149 is that of using computer vision to understand the unconstrained 3D world, in which the viewed scenes will in general contain too wide a diversity of objects for top-down recognition techniques to work.

Abstract: The problem we are addressing in Alvey Project MMI149 is that of using computer vision to understand the unconstrained 3D world, in which the viewed scenes will in general contain too wide a diversity of objects for top-down recognition techniques to work. For example, we desire to obtain an understanding of natural scenes, containing roads, buildings, trees, bushes, etc., as typified by the two frames from a sequence illustrated in Figure 1. The solution to this problem that we are pursuing is to use a computer vision system based upon motion analysis of a monocular image sequence from a mobile camera. By extraction and tracking of image features, representations of the 3D analogues of these features can be constructed.

13,993 citations

Journal Article•DOI•

Robust wide-baseline stereo from maximally stable extremal regions

Jiri Matas¹, Ondrej Chum, Martin Urban, Tomas Pajdla•Institutions (1)

University of Surrey¹

01 Sep 2004-Image and Vision Computing

TL;DR: The high utility of MSERs, multiple measurement regions and the robust metric is demonstrated in wide-baseline experiments on image pairs from both indoor and outdoor scenes.

3,422 citations