
Distinctive Image Features from Scale-Invariant Keypoints

01 Jan 2011
TL;DR: The Scale-Invariant Feature Transform (or SIFT) algorithm is a highly robust method to extract and subsequently match distinctive invariant features from images, which can then be used to reliably match objects in differing images.
Abstract: The Scale-Invariant Feature Transform (or SIFT) algorithm is a highly robust method to extract and subsequently match distinctive invariant features from images. These features can then be used to reliably match objects in differing images. The algorithm was first proposed by Lowe [12] and further developed to improve performance, resulting in the classic paper [13] that serves as the foundation for SIFT, which has played an important role in robotic and machine vision over the past decade.
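
At its core, the pipeline the abstract describes reduces to detecting keypoints, computing 128-dimensional descriptors, and matching them with a nearest-neighbour ratio test. A minimal sketch using OpenCV's SIFT implementation (assuming OpenCV >= 4.4, where SIFT ships in the main module; the image paths are placeholders):

```python
import cv2

img1 = cv2.imread("scene_a.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("scene_b.png", cv2.IMREAD_GRAYSCALE)

# Detect keypoints and compute 128-dimensional SIFT descriptors.
sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Lowe's ratio test: keep a match only if the best neighbour is clearly
# closer than the second best, which suppresses ambiguous matches.
matcher = cv2.BFMatcher(cv2.NORM_L2)
good = [m for m, n in matcher.knnMatch(des1, des2, k=2)
        if m.distance < 0.75 * n.distance]
print(f"{len(good)} putative matches")
```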
Citations
Proceedings ArticleDOI
19 Oct 2009
TL;DR: A method to automatically and dynamically balance the quality of detection and tracking to adapt to a variable time budget and ensure a constant frame rate is presented.
Abstract: In this paper we present a novel method for real-time pose estimation and tracking on low-end devices such as mobile phones. The presented system can track multiple known targets in real-time and simultaneously detect new targets for tracking. We present a method to automatically and dynamically balance the quality of detection and tracking to adapt to a variable time budget and ensure a constant frame rate. Results from real data of a mobile phone Augmented Reality system demonstrate the efficiency and robustness of the described approach. The system can track 6 planar targets on a mobile phone simultaneously at frame rates of 23 fps.
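
The budget-balancing idea can be sketched as a fixed per-frame time budget in which tracking of known targets always runs and detection consumes only the time left over. This is a toy illustration with hypothetical update()/detect_step() hooks, not the paper's actual quality-adaptation policy:

```python
import time

FRAME_BUDGET = 1.0 / 25.0  # aim for roughly 25 fps

def process_frame(frame, targets, detector):
    start = time.perf_counter()
    # Tracking is cheap and mandatory: every known target is updated.
    for t in targets:
        t.update(frame)  # hypothetical tracker hook
    # Detection of new targets is interruptible: run incremental slices
    # of work until the frame budget is exhausted.
    while time.perf_counter() - start < FRAME_BUDGET:
        if detector.detect_step(frame):  # hypothetical detector hook
            break
```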

136 citations


Cites methods from "Distinctive Image Features from Sca..."

  • ...Our target detection method is based on a modified SIFT [15] implementation that replaces the slow parts of the original SIFT with simpler variants, yet keeping many of the attractive properties of the original approach....

  • ...For instance, Skrypnyk and Lowe [21] use SIFT descriptors [15] for object localization in AR....

  • ...The work in this paper builds upon our previous publication [24], where we described modified SIFT [15] and Ferns [17] approaches and created the first real-time 6DOF natural feature tracking system running on mobile phones....

Journal ArticleDOI
TL;DR: This work develops a novel hierarchical matching strategy to solve the keypoint matching problems over a massive number of keypoints and proposes a novel iterative localization technique to reduce the false alarm rate and accurately localize the tampered regions.
Abstract: Copy-move forgery is one of the most commonly used manipulations for tampering digital images. Keypoint-based detection methods have been reported to be very effective in revealing copy-move evidence due to their robustness against various attacks, such as large-scale geometric transformations. However, these methods fail to handle the cases when copy-move forgeries only involve small or smooth regions, where the number of keypoints is very limited. To tackle this challenge, we propose a fast and effective copy-move forgery detection algorithm through hierarchical feature point matching. We first show that it is possible to generate a sufficient number of keypoints that exist even in small or smooth regions by lowering the contrast threshold and rescaling the input image. We then develop a novel hierarchical matching strategy to solve the keypoint matching problems over a massive number of keypoints. To reduce the false alarm rate and accurately localize the tampered regions, we further propose a novel iterative localization technique by exploiting the robustness properties (including the dominant orientation and the scale information) and the color information of each keypoint. Extensive experimental results are provided to demonstrate the superior performance of our proposed scheme in terms of both efficiency and accuracy.
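
The keypoint-densification step can be illustrated with OpenCV's SIFT, which exposes the contrast threshold as a constructor parameter (its default is 0.04); the threshold and scale factor below are illustrative rather than the paper's exact settings:

```python
import cv2

img = cv2.imread("suspect.png", cv2.IMREAD_GRAYSCALE)

# Upscale the image so that small or smooth regions cover more pixels.
big = cv2.resize(img, None, fx=2.0, fy=2.0, interpolation=cv2.INTER_CUBIC)

# Lower the contrast threshold so weak-texture regions still yield keypoints.
sift = cv2.SIFT_create(contrastThreshold=0.001)
kp, des = sift.detectAndCompute(big, None)
print(f"{len(kp)} keypoints on the rescaled image")
```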

136 citations


Cites background or methods from "Distinctive Image Features from Sca..."

  • ...the Scale Invariant Feature Transform (SIFT) feature [23], their method was shown to be very robust against...

  • ...For more details about the SIFT, please refer to [23]....

  • ...It is a well-known fact that we cannot fully trust the result of RANSAC especially when the number of inliers is limited [23]....

  • ...As one of the most popular algorithms in computer vision to extract and describe image local features, the SIFT [23] has been shown to be excellently robust against noise distortion and geometric transformations [26], [27]....

  • ...In the original implementation [23], C is set as 0....

Journal ArticleDOI
TL;DR: A Siamese CNN, which combines the identification and verification models of CNNs, is proposed in this letter, and experimental results show that the proposed method outperforms the existing methods.
Abstract: The convolutional neural networks (CNNs) have shown powerful feature representation capability, which provides novel avenues to improve scene classification of remote sensing imagery. Although we can acquire large collections of satellite images, the lack of rich label information is still a major concern in the remote sensing field. In addition, remote sensing data sets have their own limitations, such as the small scale of scene classes and lack of image diversity. To mitigate the impact of the existing problems, a Siamese CNN, which combines the identification and verification models of CNNs, is proposed in this letter. A metric learning regularization term is explicitly imposed on the features learned through CNNs, which forces the Siamese networks to be more robust. We carried out experiments on three widely used remote sensing data sets for performance evaluation. Experimental results show that our proposed method outperforms the existing methods.
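
The verification model in such a Siamese setup is typically a contrastive, metric-learning term on pairs of embeddings, added to the usual identification (cross-entropy) loss. A minimal PyTorch sketch of that term; the margin and the way it is combined with the classification loss are assumptions, not the letter's exact configuration:

```python
import torch
import torch.nn.functional as F

def contrastive_loss(f1, f2, same_class, margin=1.0):
    """f1, f2: (B, D) embeddings; same_class: (B,) floats, 1.0 for same-class pairs."""
    d = F.pairwise_distance(f1, f2)
    # Pull same-class pairs together; push different-class pairs apart
    # until they are at least `margin` away.
    return (same_class * d.pow(2)
            + (1.0 - same_class) * F.relu(margin - d).pow(2)).mean()

# Combined objective (sketch): identification + weighted verification.
# loss = F.cross_entropy(logits1, y1) + F.cross_entropy(logits2, y2) \
#        + lam * contrastive_loss(feat1, feat2, (y1 == y2).float())
```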

136 citations


Cites background from "Distinctive Image Features from Sca..."

  • ...During the past decades, the works for scene classification were mainly based on handcrafted features, such as GIST [1], scale-invariant feature transform [2], and histogram of oriented gradients [3]....

Journal ArticleDOI
TL;DR: It is shown that this mutual reinforcement of object-level and feature-level similarity improves unsupervised image clustering, and the technique is applied to automatically discover categories and foreground regions in images from benchmark datasets.
Abstract: We present a method to automatically discover meaningful features in unlabeled image collections. Each image is decomposed into semi-local features that describe neighborhood appearance and geometry. The goal is to determine for each image which of these parts are most relevant, given the image content in the remainder of the collection. Our method first computes an initial image-level grouping based on feature correspondences, and then iteratively refines cluster assignments based on the evolving intra-cluster pattern of local matches. As a result, the significance attributed to each feature influences an image's cluster membership, while related images in a cluster affect the estimated significance of their features. We show that this mutual reinforcement of object-level and feature-level similarity improves unsupervised image clustering, and apply the technique to automatically discover categories and foreground regions in images from benchmark datasets.
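
The mutual-reinforcement loop can be sketched as alternating between clustering images with feature-weighted affinities and re-estimating feature weights from intra-cluster matches. The similarity and update rules below are deliberate simplifications, not the paper's exact formulation:

```python
import numpy as np
from sklearn.cluster import SpectralClustering

def refine_clusters(match_counts, feat_w, n_clusters, n_iters=5):
    """match_counts[i, j, f]: how often feature f of image i matches image j;
    feat_w[i, f]: current significance of feature f in image i."""
    n = match_counts.shape[0]
    for _ in range(n_iters):
        # Image-level affinity: match strength weighted by feature significance.
        aff = np.einsum("ijf,if->ij", match_counts, feat_w)
        aff = (aff + aff.T) / 2.0 + 1e-9
        labels = SpectralClustering(n_clusters=n_clusters,
                                    affinity="precomputed").fit_predict(aff)
        # Feature-level update: a feature gains weight when it matches
        # images that currently share its image's cluster.
        for i in range(n):
            same = labels == labels[i]
            feat_w[i] = match_counts[i, same].sum(axis=0)
            feat_w[i] /= feat_w[i].sum() + 1e-9
    return labels, feat_w
```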

136 citations


Cites background or methods from "Distinctive Image Features from Sca..."

  • ...Then we construct a standard n-word visual vocabulary by clustering a random pool of descriptors (we use SIFT (Lowe 2004)) extracted from the unlabeled image dataset, U , and record each feature’s word type....

  • ...A strength of the affinity propagation method is that non-metric affinities are allowed, and so the authors compare images with SIFT features and a voting-based match, which is insensitive to clutter (Lowe 2004)....

  • ...…have shown encouraging progress, particularly in terms of generic visual category learning (Weber et al. 2000; Leibe et al. 2004; Winn and Jojic 2005; Chum and Zisserman 2007; Ling and Soatto 2007) and robust local feature representations (Lowe 2004; Agarwal and Triggs 2006; Lazebnik et al. 2004)....

Journal ArticleDOI

TL;DR: The implementation details, an exhaustive evaluation of the system on public datasets, and a comparison of most state-of-the-art feature detectors and descriptors on the presented system are provided.

136 citations


Cites methods from "Distinctive Image Features from Sca..."

  • ...The most commonly used detectors are SIFT [7], SURF [8], STAR [9], GFTT [10], FAST [11], AGAST [12], and the relatively recently proposed ORB [13], while among the most used descriptors we can mention SIFT, SURF, ORB, BRIEF [14], BRISK [15], and LATCH [16]....

  • ...And the third one, a seminal work, estimates a sparse map of SIFT features....

  • ...ORB – Oriented FAST and Rotated BRIEF [13] is another attempt to achieve a scale and rotation invariant BRIEF, as a computationally efficient alternative to SIFT and SURF....

  • ...Given the high computational cost of SIFT and SURF feature extractors, they are not considered here, since the system is expected to run in real time....

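The detector list quoted above maps largely onto OpenCV factory functions; a sketch of instantiating the ones in the main module (STAR and LATCH, and the patented SURF, live in the contrib xfeatures2d module and may be absent from a default opencv-python build):

```python
import cv2

detectors = {
    "SIFT":  cv2.SIFT_create(),
    "GFTT":  cv2.GFTTDetector_create(),
    "FAST":  cv2.FastFeatureDetector_create(),
    "AGAST": cv2.AgastFeatureDetector_create(),
    "ORB":   cv2.ORB_create(),
    "BRISK": cv2.BRISK_create(),
}

img = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)
for name, det in detectors.items():
    # Each detector shares the Feature2D interface, so detect() is uniform.
    print(name, len(det.detect(img, None)))
```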

References
Journal ArticleDOI
TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.
Abstract: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene. The features are invariant to image scale and rotation, and are shown to provide robust matching across a substantial range of affine distortion, change in 3D viewpoint, addition of noise, and change in illumination. The features are highly distinctive, in the sense that a single feature can be correctly matched with high probability against a large database of features from many images. This paper also describes an approach to using these features for object recognition. The recognition proceeds by matching individual features to a database of features from known objects using a fast nearest-neighbor algorithm, followed by a Hough transform to identify clusters belonging to a single object, and finally performing verification through least-squares solution for consistent pose parameters. This approach to recognition can robustly identify objects among clutter and occlusion while achieving near real-time performance.
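
The Hough step the abstract mentions can be sketched as coarse voting: each match predicts a pose (image location, scale, orientation) for the model, and bins that collect several consistent votes become hypotheses for least-squares verification. Bin sizes below are illustrative (the paper uses 30 degree orientation bins, a factor of 2 for scale, and 0.25 of the projected model dimension for location), and the keypoints are assumed to carry pt/size/angle as in cv2.KeyPoint:

```python
import math
from collections import defaultdict

def hough_cluster(matches, loc_bin=64.0, ori_bin=30.0, min_votes=3):
    """matches: list of (model_kp, image_kp) pairs of cv2.KeyPoint-like objects."""
    bins = defaultdict(list)
    for km, ki in matches:
        d_scale = math.log2(ki.size / km.size)  # relative scale, in octaves
        d_ori = (ki.angle - km.angle) % 360.0   # relative orientation
        key = (int(ki.pt[0] // loc_bin), int(ki.pt[1] // loc_bin),
               round(d_scale), int(d_ori // ori_bin))
        bins[key].append((km, ki))
    # Bins with enough consistent votes go on to least-squares pose fitting.
    return [v for v in bins.values() if len(v) >= min_votes]
```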

46,906 citations

Proceedings ArticleDOI
20 Sep 1999
TL;DR: Experimental results show that robust object recognition can be achieved in cluttered partially occluded images with a computation time of under 2 seconds.
Abstract: An object recognition system has been developed that uses a new class of local image features. The features are invariant to image scaling, translation, and rotation, and partially invariant to illumination changes and affine or 3D projection. These features share similar properties with neurons in inferior temporal cortex that are used for object recognition in primate vision. Features are efficiently detected through a staged filtering approach that identifies stable points in scale space. Image keys are created that allow for local geometric deformations by representing blurred image gradients in multiple orientation planes and at multiple scales. The keys are used as input to a nearest neighbor indexing method that identifies candidate object matches. Final verification of each match is achieved by finding a low residual least squares solution for the unknown model parameters. Experimental results show that robust object recognition can be achieved in cluttered partially occluded images with a computation time of under 2 seconds.
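
The staged filtering can be illustrated with a single-octave difference-of-Gaussians (DoG) stack: candidate keypoints are the contrast-filtered extrema of the stack over space and scale. A minimal numpy/scipy sketch with illustrative parameters:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, maximum_filter, minimum_filter

def dog_extrema(img, sigma=1.6, k=2 ** 0.5, n_scales=5, thresh=0.03):
    img = img.astype(np.float64) / 255.0
    # Progressively blurred images; adjacent differences form the DoG stack.
    blurred = [gaussian_filter(img, sigma * k ** i) for i in range(n_scales)]
    dog = np.stack([b2 - b1 for b1, b2 in zip(blurred, blurred[1:])])
    # A candidate is an extremum of its 3x3x3 neighbourhood in (scale, y, x)
    # with contrast above the threshold.
    is_max = dog == maximum_filter(dog, size=3)
    is_min = dog == minimum_filter(dog, size=3)
    return np.argwhere((is_max | is_min) & (np.abs(dog) > thresh))
```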

16,989 citations

Proceedings ArticleDOI
01 Jan 1988
TL;DR: The problem the authors are addressing in Alvey Project MMI149 is that of using computer vision to understand the unconstrained 3D world, in which the viewed scenes will in general contain too wide a diversity of objects for top-down recognition techniques to work.
Abstract: The problem we are addressing in Alvey Project MMI149 is that of using computer vision to understand the unconstrained 3D world, in which the viewed scenes will in general contain too wide a diversity of objects for top-down recognition techniques to work. For example, we desire to obtain an understanding of natural scenes, containing roads, buildings, trees, bushes, etc., as typified by the two frames from a sequence illustrated in Figure 1. The solution to this problem that we are pursuing is to use a computer vision system based upon motion analysis of a monocular image sequence from a mobile camera. By extraction and tracking of image features, representations of the 3D analogues of these features can be constructed.
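
The detector this paper introduced scores each pixel with the corner response R = det(M) - k * trace(M)^2, where M is the Gaussian-smoothed structure tensor of the image gradients. A compact numpy/scipy sketch:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def harris_response(img, sigma=1.5, k=0.04):
    img = img.astype(np.float64)
    ix = sobel(img, axis=1)  # horizontal gradient
    iy = sobel(img, axis=0)  # vertical gradient
    # Gaussian-weighted entries of the local structure tensor M.
    sxx = gaussian_filter(ix * ix, sigma)
    syy = gaussian_filter(iy * iy, sigma)
    sxy = gaussian_filter(ix * iy, sigma)
    det = sxx * syy - sxy ** 2
    trace = sxx + syy
    return det - k * trace ** 2  # large positive responses indicate corners
```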

13,993 citations

Journal ArticleDOI
TL;DR: It is observed that the ranking of the descriptors is mostly independent of the interest region detector and that the SIFT-based descriptors perform best and Moments and steerable filters show the best performance among the low dimensional descriptors.
Abstract: In this paper, we compare the performance of descriptors computed for local interest regions, as, for example, extracted by the Harris-Affine detector [Mikolajczyk, K and Schmid, C, 2004]. Many different descriptors have been proposed in the literature. It is unclear which descriptors are more appropriate and how their performance depends on the interest region detector. The descriptors should be distinctive and at the same time robust to changes in viewing conditions as well as to errors of the detector. Our evaluation uses as criterion recall with respect to precision and is carried out for different image transformations. We compare shape context [Belongie, S, et al., April 2002], steerable filters [Freeman, W and Adelson, E, Sept. 1991], PCA-SIFT [Ke, Y and Sukthankar, R, 2004], differential invariants [Koenderink, J and van Doorn, A, 1987], spin images [Lazebnik, S, et al., 2003], SIFT [Lowe, D. G., 1999], complex filters [Schaffalitzky, F and Zisserman, A, 2002], moment invariants [Van Gool, L, et al., 1996], and cross-correlation for different types of interest regions. We also propose an extension of the SIFT descriptor and show that it outperforms the original method. Furthermore, we observe that the ranking of the descriptors is mostly independent of the interest region detector and that the SIFT-based descriptors perform best. Moments and steerable filters show the best performance among the low dimensional descriptors.
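
The evaluation criterion can be reproduced in a few lines: sweep the match-acceptance threshold and, at each setting, record recall (correct matches over ground-truth correspondences) against 1-precision (false matches over all accepted matches). A sketch assuming precomputed match distances and ground-truth flags:

```python
import numpy as np

def recall_vs_one_minus_precision(dists, is_correct, n_corr, thresholds):
    """dists: (M,) distances of putative matches; is_correct: (M,) bool flags;
    n_corr: number of ground-truth correspondences between the two images."""
    curve = []
    for t in thresholds:
        accepted = dists < t
        tp = np.count_nonzero(accepted & is_correct)
        fp = np.count_nonzero(accepted & ~is_correct)
        curve.append((fp / max(tp + fp, 1), tp / n_corr))
    return curve  # list of (1 - precision, recall) points
```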

7,057 citations

Journal ArticleDOI
TL;DR: The high utility of MSERs, multiple measurement regions and the robust metric is demonstrated in wide-baseline experiments on image pairs from both indoor and outdoor scenes.

3,422 citations
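
For reference, MSER extraction is available directly in OpenCV; a minimal sketch with default parameters and a placeholder file name:

```python
import cv2

img = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)
mser = cv2.MSER_create()
# detectRegions returns each region's pixel list plus its bounding box.
regions, bboxes = mser.detectRegions(img)
print(f"{len(regions)} maximally stable extremal regions")
```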
