
Distinctive Image Features from Scale-Invariant Keypoints

01 Jan 2011
TL;DR: The Scale-Invariant Feature Transform (or SIFT) algorithm is a highly robust method to extract and subsequently match distinctive invariant features from images, which can then be used to reliably match objects in differing images.
Abstract: The Scale-Invariant Feature Transform (or SIFT) algorithm is a highly robust method to extract and subsequently match distinctive invariant features from images. These features can then be used to reliably match objects in differing images. The algorithm was first proposed by Lowe [12] and further developed to improve performance, resulting in the classic paper [13] that served as the foundation for SIFT, which has played an important role in robotic and machine vision in the past decade.
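
As a concrete illustration of this extract-and-match pipeline (not code from the paper itself), the sketch below detects SIFT keypoints in two images with OpenCV and keeps only matches that pass Lowe's ratio test. It assumes opencv-python 4.4 or later, where SIFT sits in the main module; the image paths and the 0.75 ratio threshold are placeholder choices.

```python
# Minimal sketch: SIFT extraction and matching with the ratio test.
# Assumes opencv-python >= 4.4; image paths are placeholders.
import cv2

img1 = cv2.imread("scene_a.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("scene_b.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Keep a match only if it is clearly better than the second-best
# candidate (Lowe's ratio test).
matcher = cv2.BFMatcher(cv2.NORM_L2)
good = [m for m, n in matcher.knnMatch(des1, des2, k=2)
        if m.distance < 0.75 * n.distance]
print(f"{len(good)} putative correspondences")
```
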
Citations
Journal ArticleDOI
06 Jul 2015
TL;DR: Experimental results show that the proposed approach allows learning class-specific shape descriptors that significantly outperform recent state-of-the-art methods on standard benchmarks.
Abstract: In this paper, we propose a generalization of convolutional neural networks (CNN) to non-Euclidean domains for the analysis of deformable shapes. Our construction is based on localized frequency analysis (a generalization of the windowed Fourier transform to manifolds) that is used to extract the local behavior of some dense intrinsic descriptor, roughly acting as an analogy to patches in images. The resulting local frequency representations are then passed through a bank of filters whose coefficients are determined by a learning procedure minimizing a task-specific cost. Our approach generalizes several previous methods such as HKS, WKS, spectral CNN, and GPS embeddings. Experimental results show that the proposed approach allows learning class-specific shape descriptors that significantly outperform recent state-of-the-art methods on standard benchmarks.
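
To make the localized frequency analysis concrete, here is an illustrative NumPy sketch (not the authors' implementation) of a windowed Fourier-style transform in a Laplacian eigenbasis: a heat-kernel-shaped window is translated to each vertex, and the spectral content of a descriptor signal is measured through it. The Laplacian L, the signal f, and the parameters tau and k are all assumptions made for illustration.

```python
# Illustrative sketch (not the authors' code): a windowed Fourier-style
# transform on a graph/manifold, built from the Laplacian eigenbasis.
import numpy as np

def windowed_spectral_coeffs(L, f, tau=10.0, k=50):
    """Localized frequency analysis of signal f (length n) on a graph
    with symmetric Laplacian L (n x n)."""
    lam, phi = np.linalg.eigh(L)      # eigenpairs, ascending frequencies
    lam, phi = lam[:k], phi[:, :k]    # keep the k lowest frequencies
    g_hat = np.exp(-tau * lam)        # heat-kernel-shaped window spectrum
    n = L.shape[0]
    coeffs = np.empty((n, k))
    for i in range(n):
        # Window translated to vertex i: sum_l g_hat(lam_l) phi_l(i) phi_l
        window = phi @ (g_hat * phi[i])
        # Frequency content of f as seen through that window
        coeffs[i] = phi.T @ (window * f)
    return coeffs
```
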

244 citations

Journal ArticleDOI
Wenping Ma, Wen Zelian, Yue Wu, Licheng Jiao, Maoguo Gong, Yafei Zheng, Liang Liu
TL;DR: A new gradient definition is introduced to overcome the intensity differences between remote sensing image pairs, and an enhanced feature matching method combining the position, scale, and orientation of each keypoint is introduced to increase the number of correct correspondences.
Abstract: The scale-invariant feature transform (SIFT) algorithm and its many variants are widely used in feature-based remote sensing image registration. However, it may be difficult to find enough correct correspondences for remote image pairs that exhibit a significant difference in intensity mapping. In this letter, a new gradient definition is introduced to overcome the intensity differences between remote image pairs. Then, an enhanced feature matching method combining the position, scale, and orientation of each keypoint is introduced to increase the number of correct correspondences. The proposed algorithm is tested on multispectral and multisensor remote sensing images. The experimental results show that the proposed method improves matching performance compared with several state-of-the-art methods in terms of both the number of correct correspondences and alignment accuracy.
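
The letter's exact matching scheme is not reproduced here, but the sketch below illustrates the general idea of exploiting keypoint geometry: putative descriptor matches are kept only when they agree with the dominant relative scale and orientation across all matches. The median-based estimate and the thresholds are illustrative assumptions, not the authors' algorithm.

```python
# Hedged sketch: filter putative SIFT matches by geometric consistency.
import numpy as np

def geometry_filtered_matches(kp1, kp2, matches):
    """kp1/kp2: OpenCV KeyPoint lists; matches: list of cv2.DMatch."""
    d_scale = np.array([kp2[m.trainIdx].size / kp1[m.queryIdx].size
                        for m in matches])
    d_angle = np.array([(kp2[m.trainIdx].angle - kp1[m.queryIdx].angle) % 360
                        for m in matches])
    # Dominant relative scale/rotation, robustly estimated by the median.
    s0, a0 = np.median(d_scale), np.median(d_angle)
    return [m for m, s, a in zip(matches, d_scale, d_angle)
            if 0.5 < s / s0 < 2.0
            and min(abs(a - a0), 360 - abs(a - a0)) < 20]
```
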

243 citations


Cites background or methods from "Distinctive Image Features from Scale-Invariant Keypoints"

  • ...More details about SIFT can be found in [7]....

  • ...Among the feature-based methods, the scale-invariant feature transform (SIFT) [7] is the classic algorithm....

Proceedings ArticleDOI
06 Nov 2009
TL;DR: This work proposes a novel classifier-fusion scheme using learning algorithms, i.e., syntactic models, instead of the usual Bayesian or heuristic rules, and evaluates feature extractor and classifier combinations on the DaimlerChrysler Automotive Dataset.
Abstract: This work proposes a novel classifier-fusion scheme using learning algorithms, i.e., syntactic models, instead of the usual Bayesian or heuristic rules. Moreover, this paper complements previous comparative studies on the DaimlerChrysler Automotive Dataset, offering a set of complementary experiments using feature extractor and classifier combinations. The experimental results provide evidence of the effectiveness of our methods regarding false positive rate, AUC, and accuracy, which reached 96.67%.
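
The paper's syntactic models are not shown here; as one concrete example of a learned fusion rule (in contrast to fixed Bayesian or heuristic rules), the sketch below stacks two base classifiers under a learned combiner using scikit-learn. The choice of base learners and the X_train/y_train variables are placeholders.

```python
# Sketch of classifier fusion where the combination rule is itself
# learned (stacking), rather than fixed by hand. Data is assumed given.
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

fusion = StackingClassifier(
    estimators=[("svm", SVC(probability=True)),
                ("mlp", MLPClassifier(max_iter=500))],
    final_estimator=LogisticRegression(),  # learned fusion rule
    cv=5,
)
# fusion.fit(X_train, y_train); fusion.score(X_test, y_test)
```
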

242 citations


Cites methods from "Distinctive Image Features from Scale-Invariant Keypoints"

  • ...Histogram of Oriented Gradients (HOG) [7] is inspired by Scale-Invariant Feature Transform (SIFT) descriptors proposed by [5]....

15 Oct 2015
TL;DR: In this article, Where-CNN is used to learn a feature representation in which matching views are near one another and mismatched views are far apart, achieving significant improvements over traditional hand-crafted features and existing deep features learned from other large-scale databases.
Abstract: The recent availability of geo-tagged images and rich geospatial data has inspired a number of algorithms for image-based geolocalization. Most approaches predict the location of a query image by matching it to ground-level images with known locations (e.g., street-view data). However, most of the Earth does not have ground-level reference photos available. Fortunately, more complete coverage is provided by oblique aerial or bird's-eye imagery. In this work, we localize a ground-level query image by matching it to a reference database of aerial imagery. We use publicly available data to build a dataset of 78K aligned cross-view image pairs. The primary challenge for this task is that traditional computer vision approaches cannot handle the wide baseline and appearance variation of these cross-view pairs. We use our dataset to learn a feature representation in which matching views are near one another and mismatched views are far apart. Our proposed approach, Where-CNN, is inspired by deep learning success in face verification and achieves significant improvements over traditional hand-crafted features and existing deep features learned from other large-scale databases. We show the effectiveness of Where-CNN in finding matches between street-view and aerial-view imagery and demonstrate the ability of our learned features to generalize to novel locations.
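
A common training signal behind this kind of Siamese matching network is a contrastive loss that pulls matched cross-view pairs together and pushes mismatched pairs apart. The PyTorch sketch below shows that loss under assumed embedding tensors; it illustrates the general technique, not the paper's exact objective.

```python
# Sketch of a contrastive loss for cross-view embedding learning.
# The embedding network producing emb_ground/emb_aerial is assumed.
import torch
import torch.nn.functional as F

def contrastive_loss(emb_ground, emb_aerial, is_match, margin=1.0):
    """emb_*: (B, D) embeddings; is_match: (B,) float tensor of 0/1."""
    d = F.pairwise_distance(emb_ground, emb_aerial)
    pull = is_match * d.pow(2)                         # matches: pull together
    push = (1 - is_match) * F.relu(margin - d).pow(2)  # mismatches: push apart
    return (pull + push).mean()
```
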

242 citations

Journal ArticleDOI
TL;DR: A novel approach is presented for copy-move forgery detection and localization based on the JLinkage algorithm, which performs a robust clustering in the space of the geometric transformation, which outperforms other similar state-of-the-art techniques.
Abstract: Determining whether a digital image is authentic is a key purpose of image forensics. There are several different tampering attacks, but one of the most common and immediate is copy-move. A recent and effective approach for detecting copy-move forgeries is to use local visual features such as SIFT. In such methods, SIFT matching is often followed by a clustering procedure to group keypoints that are spatially close. This procedure can be unsatisfactory, in particular when the copied patch contains pixels that are spatially distant from one another, and when the pasted area is near the original source. In such cases, a better estimation of the cloned area is necessary to obtain accurate forgery localization. In this paper, a novel approach is presented for copy-move forgery detection and localization based on the JLinkage algorithm, which performs robust clustering in the space of geometric transformations. Experimental results, carried out on different datasets, show that the proposed method outperforms other similar state-of-the-art techniques both in copy-move forgery detection reliability and in the precision of manipulated patch localization.
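
The JLinkage clustering itself is not reproduced here, but the front end of this kind of pipeline can be sketched as SIFT matched against the same image, discarding the trivial self-match and keypoint pairs that are too close together. The 0.6 ratio and the 10-pixel minimum separation are illustrative assumptions.

```python
# Sketch of the copy-move front end: SIFT self-matching within one image.
# Assumes opencv-python >= 4.4; the image path is a placeholder.
import math
import cv2

img = cv2.imread("suspect.png", cv2.IMREAD_GRAYSCALE)
sift = cv2.SIFT_create()
kp, des = sift.detectAndCompute(img, None)

matcher = cv2.BFMatcher(cv2.NORM_L2)
pairs = []
for ms in matcher.knnMatch(des, des, k=3):
    if len(ms) < 3:
        continue
    _, m, n = ms  # first hit is the keypoint matched to itself
    p, q = kp[m.queryIdx].pt, kp[m.trainIdx].pt
    # Ratio test plus a minimum spatial separation to avoid near-duplicates.
    if m.distance < 0.6 * n.distance and math.dist(p, q) > 10:
        pairs.append((p, q))
print(f"{len(pairs)} suspicious keypoint pairs")
```
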

242 citations


Cites methods from "Distinctive Image Features from Scale-Invariant Keypoints"

  • ...Anyway, the method is used only for copy-move detection and not for accurate tampering localization....

  • ...In this paper a novel approach is presented for copy-move forgery detection and localization based on the JLinkage algorithm, which performs a robust clustering in the space of the geometric transformation....

References
Journal ArticleDOI
TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.
Abstract: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene. The features are invariant to image scale and rotation, and are shown to provide robust matching across a substantial range of affine distortion, change in 3D viewpoint, addition of noise, and change in illumination. The features are highly distinctive, in the sense that a single feature can be correctly matched with high probability against a large database of features from many images. This paper also describes an approach to using these features for object recognition. The recognition proceeds by matching individual features to a database of features from known objects using a fast nearest-neighbor algorithm, followed by a Hough transform to identify clusters belonging to a single object, and finally performing verification through least-squares solution for consistent pose parameters. This approach to recognition can robustly identify objects among clutter and occlusion while achieving near real-time performance.
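
As a hedged sketch of the final verification stage described above, the snippet below fits a least-squares 2D affine model to a cluster of matches and reports the inlier ratio, with cv2.estimateAffine2D standing in for the paper's own pose solver; the error threshold is an assumption.

```python
# Sketch: least-squares pose verification over one cluster of matches.
# src/dst are Nx2 float32 arrays of matched keypoint locations (N >= 3).
import numpy as np
import cv2

def verify_cluster(src, dst, max_err=3.0):
    """Return the fitted affine model and the fraction of inlier matches."""
    A, _ = cv2.estimateAffine2D(src, dst, method=cv2.LMEDS)
    if A is None:
        return None, 0.0
    proj = src @ A[:, :2].T + A[:, 2]          # apply the fitted transform
    err = np.linalg.norm(proj - dst, axis=1)   # residual per correspondence
    return A, float((err < max_err).mean())
```
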

46,906 citations

Proceedings ArticleDOI
20 Sep 1999
TL;DR: Experimental results show that robust object recognition can be achieved in cluttered partially occluded images with a computation time of under 2 seconds.
Abstract: An object recognition system has been developed that uses a new class of local image features. The features are invariant to image scaling, translation, and rotation, and partially invariant to illumination changes and affine or 3D projection. These features share similar properties with neurons in inferior temporal cortex that are used for object recognition in primate vision. Features are efficiently detected through a staged filtering approach that identifies stable points in scale space. Image keys are created that allow for local geometric deformations by representing blurred image gradients in multiple orientation planes and at multiple scales. The keys are used as input to a nearest neighbor indexing method that identifies candidate object matches. Final verification of each match is achieved by finding a low residual least squares solution for the unknown model parameters. Experimental results show that robust object recognition can be achieved in cluttered partially occluded images with a computation time of under 2 seconds.
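
The staged filtering idea can be sketched as a difference-of-Gaussian pyramid whose local 3x3x3 extrema are candidate stable points; the sigma schedule below is illustrative, not Lowe's exact parameters.

```python
# Sketch: difference-of-Gaussian scale space and extremum test.
import numpy as np
from scipy.ndimage import gaussian_filter

def dog_pyramid(img, sigmas=(1.0, 1.6, 2.56, 4.1)):
    """Return difference-of-Gaussian layers for a grayscale float image."""
    blurred = [gaussian_filter(img, s) for s in sigmas]
    return [b2 - b1 for b1, b2 in zip(blurred, blurred[1:])]

def is_extremum(dogs, i, y, x):
    """True if (i, y, x) is a min or max over its 3x3x3 neighbourhood.
    Assumes 1 <= i <= len(dogs) - 2 and (y, x) in the image interior."""
    cube = np.stack([d[y - 1:y + 2, x - 1:x + 2] for d in dogs[i - 1:i + 2]])
    v = dogs[i][y, x]
    return v == cube.max() or v == cube.min()
```
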

16,989 citations

Proceedings ArticleDOI
01 Jan 1988
TL;DR: The problem the authors are addressing in Alvey Project MMI149 is that of using computer vision to understand the unconstrained 3D world, in which the viewed scenes will in general contain too wide a diversity of objects for top-down recognition techniques to work.
Abstract: The problem we are addressing in Alvey Project MMI149 is that of using computer vision to understand the unconstrained 3D world, in which the viewed scenes will in general contain too wide a diversity of objects for top-down recognition techniques to work. For example, we desire to obtain an understanding of natural scenes, containing roads, buildings, trees, bushes, etc., as typified by the two frames from a sequence illustrated in Figure 1. The solution to this problem that we are pursuing is to use a computer vision system based upon motion analysis of a monocular image sequence from a mobile camera. By extraction and tracking of image features, representations of the 3D analogues of these features can be constructed.
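
The feature extraction in this line of work rests on the Harris/Plessey corner response, R = det(M) - k * trace(M)^2 over a smoothed structure tensor M; the sketch below computes it with NumPy/SciPy. The constant k = 0.04 is the conventional choice, and the Sobel/Gaussian combination is one illustrative implementation.

```python
# Sketch: Harris corner response from a smoothed structure tensor.
import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def harris_response(img, sigma=1.5, k=0.04):
    ix = sobel(img, axis=1)            # x-derivative
    iy = sobel(img, axis=0)            # y-derivative
    ixx = gaussian_filter(ix * ix, sigma)
    iyy = gaussian_filter(iy * iy, sigma)
    ixy = gaussian_filter(ix * iy, sigma)
    det = ixx * iyy - ixy ** 2
    trace = ixx + iyy
    return det - k * trace ** 2        # large positive values mark corners
```
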

13,993 citations

Journal ArticleDOI
TL;DR: It is observed that the ranking of the descriptors is mostly independent of the interest region detector and that the SIFT-based descriptors perform best; moments and steerable filters show the best performance among the low-dimensional descriptors.
Abstract: In this paper, we compare the performance of descriptors computed for local interest regions, as, for example, extracted by the Harris-Affine detector [Mikolajczyk, K and Schmid, C, 2004]. Many different descriptors have been proposed in the literature. It is unclear which descriptors are more appropriate and how their performance depends on the interest region detector. The descriptors should be distinctive and at the same time robust to changes in viewing conditions as well as to errors of the detector. Our evaluation uses as criterion recall with respect to precision and is carried out for different image transformations. We compare shape context [Belongie, S, et al., April 2002], steerable filters [Freeman, W and Adelson, E, Sept. 1991], PCA-SIFT [Ke, Y and Sukthankar, R, 2004], differential invariants [Koenderink, J and van Doorn, A, 1987], spin images [Lazebnik, S, et al., 2003], SIFT [Lowe, D. G., 1999], complex filters [Schaffalitzky, F and Zisserman, A, 2002], moment invariants [Van Gool, L, et al., 1996], and cross-correlation for different types of interest regions. We also propose an extension of the SIFT descriptor and show that it outperforms the original method. Furthermore, we observe that the ranking of the descriptors is mostly independent of the interest region detector and that the SIFT-based descriptors perform best. Moments and steerable filters show the best performance among the low-dimensional descriptors.
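
The evaluation criterion described above (recall with respect to precision) can be sketched as a curve swept over the match-distance threshold; the snippet assumes ground-truth correspondence labels are available, e.g., from a known homography between the image pair.

```python
# Sketch: recall vs. 1-precision curve for descriptor matching.
import numpy as np

def recall_precision_curve(distances, is_correct, n_correspondences):
    """distances: match distances; is_correct: bool array per match;
    n_correspondences: number of ground-truth correspondences."""
    order = np.argsort(distances)             # sweep threshold low -> high
    correct = np.cumsum(is_correct[order])    # correct matches accepted
    total = np.arange(1, len(order) + 1)      # all matches accepted
    recall = correct / n_correspondences
    one_minus_precision = (total - correct) / total
    return one_minus_precision, recall
```
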

7,057 citations

Journal ArticleDOI
TL;DR: The high utility of MSERs, multiple measurement regions and the robust metric is demonstrated in wide-baseline experiments on image pairs from both indoor and outdoor scenes.
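
As a quick illustration of the detector evaluated in those experiments, the sketch below extracts MSERs with OpenCV's built-in implementation; the image path is a placeholder and the parameters are left at their defaults.

```python
# Sketch: maximally stable extremal region (MSER) extraction with OpenCV.
import cv2

img = cv2.imread("indoor.png", cv2.IMREAD_GRAYSCALE)  # placeholder path
mser = cv2.MSER_create()
regions, bboxes = mser.detectRegions(img)
print(f"{len(regions)} maximally stable extremal regions")
```
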

3,422 citations
