Distinctive Image Features from Scale-Invariant Keypoints

doi:10.1023/B:VISI.0000029664.99615.94

Home
/
Papers
/
Distinctive Image Features from Scale-Invariant Keypoints

Journal Article•DOI•

Distinctive Image Features from Scale-Invariant Keypoints

David G. Lowe¹•Institutions (1)

University of British Columbia¹

01 Nov 2004-International Journal of Computer Vision (Kluwer Academic Publishers)-Vol. 60, Iss: 2, pp 91-110

TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.

read less

Abstract: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene. The features are invariant to image scale and rotation, and are shown to provide robust matching across a substantial range of affine distortion, change in 3D viewpoint, addition of noise, and change in illumination. The features are highly distinctive, in the sense that a single feature can be correctly matched with high probability against a large database of features from many images. This paper also describes an approach to using these features for object recognition. The recognition proceeds by matching individual features to a database of features from known objects using a fast nearest-neighbor algorithm, followed by a Hough transform to identify clusters belonging to a single object, and finally performing verification through least-squares solution for consistent pose parameters. This approach to recognition can robustly identify objects among clutter and occlusion while achieving near real-time performance.

...read moreread less

Content maybe subject to copyright Report

Citations

PDF

Open Access

More filters

Proceedings Article•DOI•

Maximally Stable Colour Regions for Recognition and Matching

[...]

Per-Erik Forssén¹•Institutions (1)

University of British Columbia¹

17 Jun 2007

TL;DR: A novel colour-based affine co-variant region detector based on a Poisson image noise model that performs better than the commonly used Euclidean distance and extends the state of the art in feature repeatability tests.

...read moreread less

Abstract: This paper introduces a novel colour-based affine co-variant region detector. Our algorithm is an extension of the maximally stable extremal region (MSER) to colour. The extension to colour is done by looking at successive time-steps of an agglomerative clustering of image pixels. The selection of time-steps is stabilised against intensity scalings and image blur by modelling the distribution of edge magnitudes. The algorithm contains a novel edge significance measure based on a Poisson image noise model, which we show performs better than the commonly used Euclidean distance. We compare our algorithm to the original MSER detector and a competing colour-based blob feature detector, and show through a repeatability test that our detector performs better. We also extend the state of the art in feature repeatability tests, by using scenes consisting of two planes where one is piecewise transparent. This new test is able to evaluate how stable a feature is against changing backgrounds.

...read moreread less

294 citations

Journal Article•DOI•

Partial Face Recognition: Alignment-Free Approach

[...]

Shengcai Liao, Anil K. Jain¹, Stan Z. Li•Institutions (1)

Michigan State University¹

01 May 2013-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: It is argued that a probe face image, holistic or partial, can be sparsely represented by a large dictionary of gallery descriptors by adopting a variable-size description which represents each face with a set of keypoint descriptors.

...read moreread less

Abstract: Numerous methods have been developed for holistic face recognition with impressive performance. However, few studies have tackled how to recognize an arbitrary patch of a face image. Partial faces frequently appear in unconstrained scenarios, with images captured by surveillance cameras or handheld devices (e.g., mobile phones) in particular. In this paper, we propose a general partial face recognition approach that does not require face alignment by eye coordinates or any other fiducial points. We develop an alignment-free face representation method based on Multi-Keypoint Descriptors (MKD), where the descriptor size of a face is determined by the actual content of the image. In this way, any probe face image, holistic or partial, can be sparsely represented by a large dictionary of gallery descriptors. A new keypoint descriptor called Gabor Ternary Pattern (GTP) is also developed for robust and discriminative face recognition. Experimental results are reported on four public domain face databases (FRGCv2.0, AR, LFW, and PubFig) under both the open-set identification and verification scenarios. Comparisons with two leading commercial face recognition SDKs (PittPatt and FaceVACS) and two baseline algorithms (PCA+LDA and LBP) show that the proposed method, overall, is superior in recognizing both holistic and partial faces without requiring alignment.

...read moreread less

293 citations

Proceedings Article•DOI•

Data-driven visual similarity for cross-domain image matching

[...]

Abhinav Shrivastava¹, Tomasz Malisiewicz², Abhinav Gupta¹, Alexei A. Efros¹•Institutions (2)

Carnegie Mellon University¹, Massachusetts Institute of Technology²

12 Dec 2011

TL;DR: A surprisingly simple method that estimates the relative importance of different features in a query image based on the notion of "data-driven uniqueness" is proposed, yielding a generic approach that does not depend on a particular image representation or a specific visual domain.

...read moreread less

Abstract: The goal of this work is to find visually similar images even if they appear quite different at the raw pixel level This task is particularly important for matching images across visual domains, such as photos taken over different seasons or lighting conditions, paintings, hand-drawn sketches, etc We propose a surprisingly simple method that estimates the relative importance of different features in a query image based on the notion of "data-driven uniqueness" We employ standard tools from discriminative object detection in a novel way, yielding a generic approach that does not depend on a particular image representation or a specific visual domain Our approach shows good performance on a number of difficult cross-domain visual tasks eg, matching paintings or sketches to real photographs The method also allows us to demonstrate novel applications such as Internet re-photography, and painting2gps While at present the technique is too computationally intensive to be practical for interactive image retrieval, we hope that some of the ideas will eventually become applicable to that domain as well

...read moreread less

293 citations

Journal Article•DOI•

Traffic Sign Recognition With Hinge Loss Trained Convolutional Neural Networks

[...]

Junqi Jin¹, Kun Fu¹, Changshui Zhang¹•Institutions (1)

Tsinghua University¹

12 Mar 2014-IEEE Transactions on Intelligent Transportation Systems

TL;DR: The details of the model's architecture for TSR are described and a hinge loss stochastic gradient descent (HLSGD) method to train convolutional neural networks (CNNs) is suggested.

...read moreread less

Abstract: Traffic sign recognition (TSR) is an important and challenging task for intelligent transportation systems. We describe the details of our model's architecture for TSR and suggest a hinge loss stochastic gradient descent (HLSGD) method to train convolutional neural networks (CNNs). Our CNN consists of three stages (70-110-180) with 1162 284 trainable parameters. The HLSGD is evaluated on the German Traffic Sign Recognition Benchmark, which offers a faster and more stable convergence and a state-of-the-art recognition rate of 99.65%. We write a graphics processing unit package to train several CNNs and establish the final classifier in an ensemble way.

...read moreread less

292 citations

Proceedings Article•DOI•

Understanding image representations by measuring their equivariance and equivalence

[...]

Karel Lenc¹, Andrea Vedaldi¹•Institutions (1)

University of Oxford¹

07 Jun 2015

TL;DR: Three key mathematical properties of representations: equivariance, invariance, and equivalence are investigated and applied to popular representations to reveal insightful aspects of their structure, including clarifying at which layers in a CNN certain geometric invariances are achieved.

...read moreread less

Abstract: Despite the importance of image representations such as histograms of oriented gradients and deep Convolutional Neural Networks (CNN), our theoretical understanding of them remains limited. Aiming at filling this gap, we investigate three key mathematical properties of representations: equivariance, invariance, and equivalence. Equivariance studies how transformations of the input image are encoded by the representation, invariance being a special case where a transformation has no effect. Equivalence studies whether two representations, for example two different parametrisations of a CNN, capture the same visual information or not. A number of methods to establish these properties empirically are proposed, including introducing transformation and stitching layers in CNNs. These methods are then applied to popular representations to reveal insightful aspects of their structure, including clarifying at which layers in a CNN certain geometric invariances are achieved. While the focus of the paper is theoretical, direct applications to structured-output regression are demonstrated too.

...read moreread less

292 citations

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
…
152
153
154
155
156
157
158
…
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse

References

PDF

Open Access

More filters

Proceedings Article•DOI•

Object recognition from local scale-invariant features

[...]

David G. Lowe¹•Institutions (1)

University of British Columbia¹

20 Sep 1999

TL;DR: Experimental results show that robust object recognition can be achieved in cluttered partially occluded images with a computation time of under 2 seconds.

...read moreread less

Abstract: An object recognition system has been developed that uses a new class of local image features. The features are invariant to image scaling, translation, and rotation, and partially invariant to illumination changes and affine or 3D projection. These features share similar properties with neurons in inferior temporal cortex that are used for object recognition in primate vision. Features are efficiently detected through a staged filtering approach that identifies stable points in scale space. Image keys are created that allow for local geometric deformations by representing blurred image gradients in multiple orientation planes and at multiple scales. The keys are used as input to a nearest neighbor indexing method that identifies candidate object matches. Final verification of each match is achieved by finding a low residual least squares solution for the unknown model parameters. Experimental results show that robust object recognition can be achieved in cluttered partially occluded images with a computation time of under 2 seconds.

...read moreread less

16,989 citations

"Distinctive Image Features from Sca..." refers background or methods in this paper

...The initial implementation of this approach (Lowe, 1999) simply located keypoints at the location and scale of the central sample point....
[...]
...Earlier work by the author (Lowe, 1999) extended the local feature approach to achieve scale invariance....
[...]
...More details on applications of these features to recognition are available in other pape rs (Lowe, 1999; Lowe, 2001; Se, Lowe and Little, 2002)....
[...]
...To efficiently detect stable keypoint locations in scale space, we have proposed (Lowe, 1999) using scalespace extrema in the difference-of-Gaussian function convolved with the image, D(x, y, σ ), which can be computed from the difference of two nearby scales separated by a constant multiplicative…...
[...]
...More details on applications of these features to recognition are available in other papers (Lowe, 1999, 2001; Se et al., 2002)....
[...]

Book•

Multiple view geometry in computer vision

[...]

Richard Hartley¹, Andrew Zisserman²•Institutions (2)

Australian National University¹, University of Oxford²

01 Jan 2000

TL;DR: In this article, the authors provide comprehensive background material and explain how to apply the methods and implement the algorithms directly in a unified framework, including geometric principles and how to represent objects algebraically so they can be computed and applied.

...read moreread less

Abstract: From the Publisher: A basic problem in computer vision is to understand the structure of a real world scene given several images of it. Recent major developments in the theory and practice of scene reconstruction are described in detail in a unified framework. The book covers the geometric principles and how to represent objects algebraically so they can be computed and applied. The authors provide comprehensive background material and explain how to apply the methods and implement the algorithms directly.

...read moreread less

15,558 citations

Multiple View Geometry in Computer Vision.

[...]

Bernhard P. Wrobel

01 Jan 2001

TL;DR: This book is referred to read because it is an inspiring book to give you more chance to get experiences and also thoughts and it will show the best book collections and completed collections.

...read moreread less

Abstract: Downloading the book in this website lists can give you more advantages. It will show you the best book collections and completed collections. So many books can be found in this website. So, this is not only this multiple view geometry in computer vision. However, this book is referred to read because it is an inspiring book to give you more chance to get experiences and also thoughts. This is simple, read the soft file of the book and you get it.

...read moreread less

14,282 citations

"Distinctive Image Features from Sca..." refers background in this paper

...A more general solution would be to solve for the fundamental matrix (Luong and Faugeras, 1996; Hartley and Zisserman, 2000)....
[...]

Proceedings Article•DOI•

A Combined Corner and Edge Detector

[...]

Chris Harris, Mike Stephens

01 Jan 1988

TL;DR: The problem the authors are addressing in Alvey Project MMI149 is that of using computer vision to understand the unconstrained 3D world, in which the viewed scenes will in general contain too wide a diversity of objects for topdown recognition techniques to work.

...read moreread less

Abstract: The problem we are addressing in Alvey Project MMI149 is that of using computer vision to understand the unconstrained 3D world, in which the viewed scenes will in general contain too wide a diversity of objects for topdown recognition techniques to work. For example, we desire to obtain an understanding of natural scenes, containing roads, buildings, trees, bushes, etc., as typified by the two frames from a sequence illustrated in Figure 1. The solution to this problem that we are pursuing is to use a computer vision system based upon motion analysis of a monocular image sequence from a mobile camera. By extraction and tracking of image features, representations of the 3D analogues of these features can be constructed.

...read moreread less

13,993 citations

Journal Article•DOI•

Robust wide-baseline stereo from maximally stable extremal regions

[...]

Jiri Matas¹, Ondrej Chum, Martin Urban, Tomas Pajdla•Institutions (1)

University of Surrey¹

01 Sep 2004-Image and Vision Computing

TL;DR: The high utility of MSERs, multiple measurement regions and the robust metric is demonstrated in wide-baseline experiments on image pairs from both indoor and outdoor scenes.

...read moreread less

3,422 citations