Journal ArticleDOI

Distinctive Image Features from Scale-Invariant Keypoints

01 Nov 2004-International Journal of Computer Vision (Kluwer Academic Publishers)-Vol. 60, Iss: 2, pp 91-110
TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.
Abstract: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene. The features are invariant to image scale and rotation, and are shown to provide robust matching across a substantial range of affine distortion, change in 3D viewpoint, addition of noise, and change in illumination. The features are highly distinctive, in the sense that a single feature can be correctly matched with high probability against a large database of features from many images. This paper also describes an approach to using these features for object recognition. The recognition proceeds by matching individual features to a database of features from known objects using a fast nearest-neighbor algorithm, followed by a Hough transform to identify clusters belonging to a single object, and finally performing verification through least-squares solution for consistent pose parameters. This approach to recognition can robustly identify objects among clutter and occlusion while achieving near real-time performance.
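As a concrete illustration of the first stage of this pipeline, here is a minimal sketch that matches SIFT descriptors between two views with OpenCV and keeps only matches passing the paper's distance-ratio test; the image filenames are placeholders.

```python
# Minimal sketch: SIFT matching with Lowe's distance-ratio test, using
# OpenCV (>= 4.4, where SIFT lives in the main module). Filenames are
# placeholders.
import cv2

img1 = cv2.imread("view1.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("view2.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# For each descriptor, find the two nearest neighbors and keep the match
# only if the closest is clearly better than the second closest.
matcher = cv2.BFMatcher(cv2.NORM_L2)
good = []
for pair in matcher.knnMatch(des1, des2, k=2):
    if len(pair) == 2 and pair[0].distance < 0.8 * pair[1].distance:
        good.append(pair[0])
print(f"{len(good)} matches survive the ratio test")
```

The 0.8 threshold is the one reported in the paper as rejecting about 90% of false matches while discarding fewer than 5% of correct ones.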


Citations
Journal ArticleDOI
TL;DR: An algorithm for the detection of highly repeatable keypoints on 3D models and partial views of objects and an automatic scale selection technique for extracting multi-scale and scale invariant features to match objects at different unknown scales are presented.
Abstract: 3D object recognition from local features is robust to occlusions and clutter. However, local features must be extracted from a small set of feature-rich keypoints to avoid computational complexity and ambiguous features. We present an algorithm for the detection of such keypoints on 3D models and partial views of objects. The keypoints are highly repeatable between partial views of an object and its complete 3D model. We also propose a quality measure to rank the keypoints and select the best ones for extracting local features. Keypoints are identified at locations where a unique local 3D coordinate basis can be derived from the underlying surface in order to extract invariant features. We also propose an automatic scale selection technique for extracting multi-scale and scale-invariant features to match objects at different unknown scales. Features are projected to a PCA subspace and matched to find correspondences between a database and query object. Each pair of matching features gives a transformation that aligns the query and database object. These transformations are clustered, and the biggest cluster is used to identify the query object. Experiments on a public database revealed that the proposed quality measure relates correctly to the repeatability of keypoints and that the multi-scale features have a recognition rate of over 95% for up to 80% occluded objects.
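As a rough sketch of the matching stage this abstract describes (not the authors' implementation), the following projects stand-in local features onto a PCA subspace learned from the database and finds nearest-neighbor correspondences; keypoint detection, feature extraction, and the transformation clustering are omitted.

```python
# Hedged sketch: PCA-subspace projection and nearest-neighbor matching of
# local features. `db_feats` and `query_feats` are random stand-ins for
# features extracted at 3D keypoints.
import numpy as np

rng = np.random.default_rng(0)
db_feats = rng.standard_normal((500, 64))     # database features (n, d)
query_feats = rng.standard_normal((50, 64))   # query features (m, d)

# PCA basis from the database features (top-k principal directions).
k = 16
mean = db_feats.mean(axis=0)
_, _, vt = np.linalg.svd(db_feats - mean, full_matrices=False)
proj = vt[:k].T                               # (d, k) projection matrix

db_p = (db_feats - mean) @ proj
q_p = (query_feats - mean) @ proj

# Nearest database feature for each query feature, in the subspace.
d2 = ((q_p[:, None, :] - db_p[None, :, :]) ** 2).sum(-1)
matches = d2.argmin(axis=1)
print(matches[:10])
```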

432 citations


Cites methods from "Distinctive Image Features from Sca..."

  • ...A similar criterion was used by Lowe [24] for reporting the repeatability of the Scale Invariant Feature Transform (SIFT); however, the experiments were performed on synthetic data....


Book ChapterDOI
07 Oct 2012
TL;DR: A fully automated system that is capable of generating a list of nameable attributes for clothes on human body in unconstrained images is proposed, and a novel application of dressing style analysis is introduced that utilizes the semantic attributes produced by the system.
Abstract: Describing clothing appearance with semantic attributes is an appealing technique for many important applications. In this paper, we propose a fully automated system that is capable of generating a list of nameable attributes for clothes on human body in unconstrained images. We extract low-level features in a pose-adaptive manner, and combine complementary features for learning attribute classifiers. Mutual dependencies between the attributes are then explored by a Conditional Random Field to further improve the predictions from independent classifiers. We validate the performance of our system on a challenging clothing attribute dataset, and introduce a novel application of dressing style analysis that utilizes the semantic attributes produced by our system.

432 citations

Journal ArticleDOI
Jing Liao, Yuan Yao, Lu Yuan, Gang Hua, Sing Bing Kang
20 Jul 2017
TL;DR: Deep image analogy, as discussed by the authors, finds semantically-meaningful dense correspondences between two input images by adapting the notion of image analogy with features extracted from a Deep Convolutional Neural Network for matching; a coarse-to-fine strategy is used to compute the nearest-neighbor field for generating the results.
Abstract: We propose a new technique for visual attribute transfer across images that may have very different appearance but have perceptually similar semantic structure. By visual attribute transfer, we mean transfer of visual information (such as color, tone, texture, and style) from one image to another. For example, one image could be that of a painting or a sketch while the other is a photo of a real scene, and both depict the same type of scene. Our technique finds semantically-meaningful dense correspondences between two input images. To accomplish this, it adapts the notion of "image analogy" [Hertzmann et al. 2001] with features extracted from a Deep Convolutional Neural Network for matching; we call our technique deep image analogy. A coarse-to-fine strategy is used to compute the nearest-neighbor field for generating the results. We validate the effectiveness of our proposed method in a variety of cases, including style/texture transfer, color/style swap, sketch/painting to photo, and time lapse.
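To make the nearest-neighbor-field idea concrete, here is an illustrative brute-force sketch over random stand-in feature maps. The paper itself uses a coarse-to-fine, PatchMatch-style search over real CNN activations, which this deliberately does not reproduce.

```python
# Brute-force nearest-neighbor field between two (H, W, C) feature maps,
# matching each location in A to its most cosine-similar location in B.
# The feature maps are random stand-ins for CNN activations.
import numpy as np

rng = np.random.default_rng(0)
H, W, C = 16, 16, 256
feat_a = rng.standard_normal((H, W, C))
feat_b = rng.standard_normal((H, W, C))

a = feat_a.reshape(-1, C)
b = feat_b.reshape(-1, C)
a /= np.linalg.norm(a, axis=1, keepdims=True)
b /= np.linalg.norm(b, axis=1, keepdims=True)

nn = (a @ b.T).argmax(axis=1)           # best match in B for each A location
ys, xs = np.unravel_index(nn, (H, W))
nnf = np.stack([ys, xs], axis=-1).reshape(H, W, 2)   # the NN field
print(nnf[0, 0])
```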

432 citations

Journal ArticleDOI
TL;DR: This paper proposes a flexible and general algorithm, which is called locally linear transforming (LLT), for both rigid and nonrigid feature matching of remote sensing images, which outperforms current state-of-the-art methods, particularly in the case of severe outliers.
Abstract: Feature matching, which refers to establishing reliable correspondence between two sets of features (particularly point features), is a critical prerequisite in feature-based registration. In this paper, we propose a flexible and general algorithm, which is called locally linear transforming (LLT), for both rigid and nonrigid feature matching of remote sensing images. We start by creating a set of putative correspondences based on the feature similarity and then focus on removing outliers from the putative set and estimating the transformation as well. We formulate this as a maximum-likelihood estimation of a Bayesian model with hidden/latent variables indicating whether matches in the putative set are outliers or inliers. To ensure the well-posedness of the problem, we develop a local geometrical constraint that can preserve local structures among neighboring feature points, and it is also robust to a large number of outliers. The problem is solved by using the expectation–maximization (EM) algorithm, and the closed-form solutions of both rigid and nonrigid transformations are derived in the maximization step. In the nonrigid case, we model the transformation between images in a reproducing kernel Hilbert space (RKHS), and a sparse approximation is applied to the transformation, reducing the method's computational complexity to linearithmic. Extensive experiments on real remote sensing images demonstrate accurate results of LLT, which outperforms current state-of-the-art methods, particularly in the case of severe outliers (even up to 80%).
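The latent-variable idea can be illustrated with a toy EM sketch: a 2D affine model with a Gaussian inlier term and a uniform outlier term. This omits the paper's local geometrical constraint and the RKHS nonrigid model; all data below are synthetic stand-ins.

```python
# Toy EM for inlier/outlier labeling of putative 2D correspondences:
# inliers follow an affine map plus Gaussian noise, outliers are uniform.
import numpy as np

rng = np.random.default_rng(0)
n = 200
X = rng.uniform(0, 100, (n, 2))
A_true = np.array([[0.9, -0.2], [0.2, 0.9]])
Y = X @ A_true.T + np.array([5.0, -3.0]) + rng.normal(0, 0.5, (n, 2))
Y[:60] = rng.uniform(0, 100, (60, 2))   # first 60 matches are outliers

gamma = np.full(n, 0.5)                 # posterior inlier probabilities
u = 1.0 / (100 * 100)                   # uniform outlier density

Xh = np.hstack([X, np.ones((n, 1))])
for _ in range(30):
    # M-step: weighted least squares for the affine parameters [A | t].
    w = np.sqrt(gamma)[:, None]
    P = np.linalg.lstsq(Xh * w, Y * w, rcond=None)[0]
    R = Y - Xh @ P                      # residuals
    sigma2 = (gamma * (R ** 2).sum(1)).sum() / (2 * gamma.sum())
    # E-step: posterior that each match is an inlier (Gaussian vs uniform).
    lik = np.exp(-(R ** 2).sum(1) / (2 * sigma2)) / (2 * np.pi * sigma2)
    gamma = lik / (lik + u)

print("outliers kept:", (gamma > 0.5)[:60].mean(),
      "inliers kept:", (gamma > 0.5)[60:].mean())
```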

431 citations

Journal ArticleDOI
TL;DR: This paper proposes a novel, nonparametric approach for object recognition and scene parsing using a new technique the authors name label transfer; the resulting system is easy to implement, has few parameters, and embeds contextual information naturally in the retrieval/alignment procedure.
Abstract: While there has been a lot of recent work on object recognition and image understanding, the focus has been on carefully establishing mathematical models for images, scenes, and objects. In this paper, we propose a novel, nonparametric approach for object recognition and scene parsing using a new technique we name label transfer. For an input image, our system first retrieves its nearest neighbors from a large database containing fully annotated images. Then, the system establishes dense correspondences between the input image and each of the nearest neighbors using the dense SIFT flow algorithm [28], which aligns two images based on local image structures. Finally, based on the dense scene correspondences obtained from SIFT flow, our system warps the existing annotations and integrates multiple cues in a Markov random field framework to segment and recognize the query image. Promising experimental results have been achieved by our nonparametric scene parsing system on challenging databases. Compared to existing object recognition approaches that require training classifiers or appearance models for each object category, our system is easy to implement, has few parameters, and embeds contextual information naturally in the retrieval/alignment procedure.
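Of the three stages (retrieval, dense correspondence, MRF-based label fusion), the intermediate annotation-warping step is easy to sketch in isolation; the flow field below is a random stand-in for the SIFT-flow output.

```python
# Warping a retrieved neighbor's label map onto the query grid, given a
# dense correspondence field (dy, dx per pixel). Inputs are stand-ins.
import numpy as np

rng = np.random.default_rng(0)
H, W = 32, 32
neighbor_labels = rng.integers(0, 5, (H, W))   # annotations of the neighbor
flow = rng.integers(-2, 3, (H, W, 2))          # stand-in dense flow field

ys, xs = np.mgrid[0:H, 0:W]
src_y = np.clip(ys + flow[..., 0], 0, H - 1)
src_x = np.clip(xs + flow[..., 1], 0, W - 1)
warped = neighbor_labels[src_y, src_x]         # candidate labels for the query
print(warped.shape)
```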

431 citations

References
Proceedings ArticleDOI
20 Sep 1999
TL;DR: Experimental results show that robust object recognition can be achieved in cluttered, partially occluded images with a computation time of under 2 seconds.
Abstract: An object recognition system has been developed that uses a new class of local image features. The features are invariant to image scaling, translation, and rotation, and partially invariant to illumination changes and affine or 3D projection. These features share similar properties with neurons in inferior temporal cortex that are used for object recognition in primate vision. Features are efficiently detected through a staged filtering approach that identifies stable points in scale space. Image keys are created that allow for local geometric deformations by representing blurred image gradients in multiple orientation planes and at multiple scales. The keys are used as input to a nearest-neighbor indexing method that identifies candidate object matches. Final verification of each match is achieved by finding a low-residual least-squares solution for the unknown model parameters. Experimental results show that robust object recognition can be achieved in cluttered, partially occluded images with a computation time of under 2 seconds.
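The final least-squares verification step admits a compact sketch: each candidate match contributes two linear equations in the six affine parameters, which are then solved in a least-squares sense. The point sets below are stand-ins.

```python
# Solving for an affine model-to-image transform from matched points.
# Each match (x, y) -> (u, v) adds two rows to the system A p = b with
# p = (m11, m12, m21, m22, tx, ty).
import numpy as np

model = np.array([[0, 0], [10, 0], [0, 10], [10, 10]], float)
image = np.array([[2, 3], [11, 4], [1, 13], [10, 14]], float)

rows, b = [], []
for (x, y), (u, v) in zip(model, image):
    rows.append([x, y, 0, 0, 1, 0]); b.append(u)
    rows.append([0, 0, x, y, 0, 1]); b.append(v)

p, residual, *_ = np.linalg.lstsq(np.array(rows), np.array(b), rcond=None)
print("affine parameters:", p)
print("residual:", residual)   # large residual would reject the match set
```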

16,989 citations


"Distinctive Image Features from Sca..." refers background or methods in this paper

  • ...The initial implementation of this approach (Lowe, 1999) simply located keypoints at the location and scale of the central sample point....


  • ...Earlier work by the author (Lowe, 1999) extended the local feature approach to achieve scale invariance....


  • ...More details on applications of these features to recognition are available in other papers (Lowe, 1999; Lowe, 2001; Se, Lowe and Little, 2002)....


  • ...To efficiently detect stable keypoint locations in scale space, we have proposed (Lowe, 1999) using scale-space extrema in the difference-of-Gaussian function convolved with the image, D(x, y, σ), which can be computed from the difference of two nearby scales separated by a constant multiplicative… [full definition below]

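The excerpt above is cut off at the paper's definition of the difference-of-Gaussian function; the full definition is

```latex
D(x, y, \sigma) = \bigl(G(x, y, k\sigma) - G(x, y, \sigma)\bigr) * I(x, y)
                = L(x, y, k\sigma) - L(x, y, \sigma)
```

where G(x, y, σ) is a variable-scale Gaussian, I(x, y) is the input image, L = G * I is the scale-space representation, and k is the constant multiplicative factor separating two nearby scales.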

Book
01 Jan 2000
TL;DR: In this article, the authors provide comprehensive background material and explain how to apply the methods and implement the algorithms directly in a unified framework, including geometric principles and how to represent objects algebraically so they can be computed and applied.
Abstract: From the Publisher: A basic problem in computer vision is to understand the structure of a real world scene given several images of it. Recent major developments in the theory and practice of scene reconstruction are described in detail in a unified framework. The book covers the geometric principles and how to represent objects algebraically so they can be computed and applied. The authors provide comprehensive background material and explain how to apply the methods and implement the algorithms directly.

15,558 citations

Book
01 Jan 2001
Multiple View Geometry in Computer Vision (Hartley and Zisserman).

14,282 citations


"Distinctive Image Features from Sca..." refers background in this paper

  • ...A more general solution would be to solve for the fundamental matrix (Luong and Faugeras, 1996; Hartley and Zisserman, 2000)....

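A minimal sketch of that more general solution, estimating the fundamental matrix with RANSAC via OpenCV. The correspondences below are synthesized from a toy two-camera setup rather than real matched keypoints.

```python
# Fundamental-matrix estimation from synthetic two-view correspondences.
import numpy as np
import cv2

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, (60, 3)) + np.array([0, 0, 5.0])   # 3D points
K = np.array([[500, 0, 320], [0, 500, 240], [0, 0, 1.0]]) # intrinsics

P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
R, _ = cv2.Rodrigues(np.array([0.0, 0.1, 0.0]))           # small rotation
P2 = K @ np.hstack([R, np.array([[0.5], [0.0], [0.0]])])  # plus baseline

def project(P, X):
    Xh = np.hstack([X, np.ones((len(X), 1))])
    x = (P @ Xh.T).T
    return (x[:, :2] / x[:, 2:]).astype(np.float32)

F, inliers = cv2.findFundamentalMat(project(P1, X), project(P2, X),
                                    cv2.FM_RANSAC, 1.0, 0.99)
print(F)
```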

Proceedings ArticleDOI
01 Jan 1988
TL;DR: The problem the authors are addressing in Alvey Project MMI149 is that of using computer vision to understand the unconstrained 3D world, in which the viewed scenes will in general contain too wide a diversity of objects for top-down recognition techniques to work.
Abstract: The problem we are addressing in Alvey Project MMI149 is that of using computer vision to understand the unconstrained 3D world, in which the viewed scenes will in general contain too wide a diversity of objects for top-down recognition techniques to work. For example, we desire to obtain an understanding of natural scenes, containing roads, buildings, trees, bushes, etc., as typified by the two frames from a sequence illustrated in Figure 1. The solution to this problem that we are pursuing is to use a computer vision system based upon motion analysis of a monocular image sequence from a mobile camera. By extraction and tracking of image features, representations of the 3D analogues of these features can be constructed.
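For reference, the corner detector introduced by this paper is available in OpenCV; a minimal sketch follows (the image path is a placeholder).

```python
# Harris corner response and thresholding via OpenCV. Path is a placeholder.
import cv2
import numpy as np

img = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)

# blockSize: neighborhood of the autocorrelation matrix; ksize: Sobel
# aperture; k: Harris trace weight (typically 0.04-0.06).
response = cv2.cornerHarris(img, blockSize=2, ksize=3, k=0.04)
corners = np.argwhere(response > 0.01 * response.max())
print(len(corners), "corner candidates")
```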

13,993 citations

Journal ArticleDOI
TL;DR: The high utility of MSERs, multiple measurement regions and the robust metric is demonstrated in wide-baseline experiments on image pairs from both indoor and outdoor scenes.

3,422 citations
