
Distinctive Image Features from Scale-Invariant Keypoints

01 Jan 2011
TL;DR: The Scale-Invariant Feature Transform (or SIFT) algorithm is a highly robust method to extract and subsequently match distinctive invariant features from images that can then be used to reliably match objects in differing images.
Abstract: The Scale-Invariant Feature Transform (or SIFT) algorithm is a highly robust method to extract and subsequently match distinctive invariant features from images. These features can then be used to reliably match objects in differing images. The algorithm was first proposed by Lowe [12] and further developed to increase performance, resulting in the classic paper [13] that served as the foundation for SIFT, which has played an important role in robotic and machine vision in the past decade.
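The core of the algorithm summarized above is a search for extrema in a difference-of-Gaussians (DoG) scale space. A minimal NumPy sketch of building such a DoG stack follows; the function names, the base sigma of 1.6, and the scale step of sqrt(2) are illustrative choices, not the paper's exact implementation:

```python
import numpy as np

def gaussian_kernel1d(sigma, radius=None):
    # Normalized 1-D Gaussian kernel, truncated at ~3 sigma.
    radius = radius or int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def blur(img, sigma):
    # Separable Gaussian blur: 1-D convolution along columns, then rows.
    k = gaussian_kernel1d(sigma)
    out = np.apply_along_axis(np.convolve, 0, img, k, mode="same")
    return np.apply_along_axis(np.convolve, 1, out, k, mode="same")

def dog_stack(img, sigma0=1.6, k=2**0.5, n=4):
    """Difference-of-Gaussians stack: the structure SIFT scans for
    scale-space extrema, which become keypoint candidates."""
    blurred = [blur(img, sigma0 * k**i) for i in range(n)]
    return [b1 - b0 for b0, b1 in zip(blurred, blurred[1:])]
```

A bright impulse produces a negative DoG response at its center (the more-blurred image has a lower peak), which is exactly the kind of extremum the detector keeps.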
Citations
Journal ArticleDOI
TL;DR: A novel fusion of different recognition approaches is proposed and described how it can contribute to more reliable noncooperative iris recognition by compensating for degraded images captured in less constrained acquisition setups and protocols under visible wavelengths and varying lighting conditions.

104 citations


Cites methods from "Distinctive Image Features from Sca..."

  • ...To achieve those results, a publicly available SIFT implementation3 was used, and its parameters optimized based on tests performed on the training dataset....


  • ...II - http://www.nice2.di.ubi.pt. d(u,v) = sqrt(sum_{i=1}^{n} (u_i − v_i)^2) (6) As for the features extracted by the SIFT, the distance-ratio-based matching scheme (Lowe, 2004) was applied....



  • ...Differing from the previous method, where features were only extracted from the region closest to the eye, the Scale-Invariant Feature Transform (SIFT) (Lowe, 2004) was applied to all available data, here seeking salient regions (e.g., facial marks)....


  • [Flattened table fragment comparing LBP (0.99, 31.87, 0.76) and SIFT (0.87, 32.09, 0.74); AUC is among the reported metrics]

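The distance-ratio-based matching scheme cited in the snippets above can be sketched in a few lines of NumPy. The 0.8 threshold is the value Lowe suggests; the function name is ours:

```python
import numpy as np

def ratio_test_matches(desc_a, desc_b, ratio=0.8):
    """Lowe's distance-ratio test: keep a match only when the nearest
    neighbour is clearly closer than the second-nearest, which rejects
    ambiguous correspondences."""
    matches = []
    for i, d in enumerate(desc_a):
        # Euclidean distance from descriptor d to every descriptor in desc_b
        dists = np.linalg.norm(desc_b - d, axis=1)
        nearest, second = np.argsort(dists)[:2]
        if dists[nearest] < ratio * dists[second]:
            matches.append((i, int(nearest)))
    return matches
```

A descriptor with two near-equidistant neighbours is dropped rather than matched, which is the point of the test: distinctiveness matters more than raw distance.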

Journal ArticleDOI
TL;DR: This paper proposes to parallelize the near duplicate visual search architecture to index millions of images over multiple servers, including the distribution of both visual vocabulary and the corresponding indexing structure, and validates the distributed vocabulary indexing scheme in a real world location search system over 10 million landmark images.
Abstract: In recent years, there is an ever-increasing research focus on Bag-of-Words based near duplicate visual search paradigm with inverted indexing. One fundamental yet unexploited challenge is how to maintain the large indexing structures within a single server subject to its memory constraint, which is extremely hard to scale up to millions or even billions of images. In this paper, we propose to parallelize the near duplicate visual search architecture to index millions of images over multiple servers, including the distribution of both visual vocabulary and the corresponding indexing structure. We optimize the distribution of vocabulary indexing from a machine learning perspective, which provides a “memory light” search paradigm that leverages the computational power across multiple servers to reduce the search latency. Especially, our solution addresses two essential issues: “What to distribute” and “How to distribute”. “What to distribute” is addressed by a “lossy” vocabulary Boosting, which discards both frequent and indiscriminating words prior to distribution. “How to distribute” is addressed by learning an optimal distribution function, which maximizes the uniformity of assigning the words of a given query to multiple servers. We validate the distributed vocabulary indexing scheme in a real world location search system over 10 million landmark images. Comparing to the state-of-the-art alternatives of single-server search [5], [6], [16] and distributed search [23], our scheme has yielded a significant gain of about 200% speedup at comparable precision by distributing only 5% words. We also report excellent robustness even when partial servers crash.

104 citations


Cites background or methods from "Distinctive Image Features from Sca..."

  • ...In general, state-of-the-art visual search systems are built based upon a visual vocabulary model with an inverted indexing structure [4]–[7], which quantizes local features [1], [2] of query and reference images into visual words....


  • ...C OMING with the popularity of local feature representations [1]–[3], recent years have witnessed an ever-increasing research focus on near duplicate visual search, with numerous applications in mobile location search, mobile product...


  • ...11 Parameter Setting and Storage Cost: We extract SIFT features [1] for each image in each reference dataset....

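The "what to distribute / how to distribute" idea from the abstract above can be illustrated with a toy sharded inverted index. Plain modulo hashing stands in for the paper's learned distribution function, and the vote count is a crude stand-in for Bag-of-Words scoring; all names here are illustrative:

```python
from collections import defaultdict

N_SERVERS = 4

def server_for_word(word_id, n_servers=N_SERVERS):
    # Toy stand-in for the learned distribution function:
    # modulo hashing spreads word ids uniformly across servers.
    return word_id % n_servers

def build_sharded_index(image_words):
    """image_words: {image_id: iterable of visual-word ids}.
    Returns one inverted index (word -> set of image ids) per server."""
    shards = [defaultdict(set) for _ in range(N_SERVERS)]
    for image_id, words in image_words.items():
        for w in set(words):
            shards[server_for_word(w)][w].add(image_id)
    return shards

def query(shards, query_words):
    # Each server only scores the words routed to it; summing votes
    # over posting lists approximates Bag-of-Words similarity.
    votes = defaultdict(int)
    for w in set(query_words):
        for image_id in shards[server_for_word(w)].get(w, ()):
            votes[image_id] += 1
    return sorted(votes, key=votes.get, reverse=True)
```

Because a query's words scatter across servers, each server touches only a fraction of the index, which is the memory-light property the paper optimizes for.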

Proceedings ArticleDOI
01 Jan 2013
TL;DR: This work proposes a seam-driven image stitching strategy where, instead of estimating a geometric transform based on the best fit of feature correspondences, the goodness of a transform is evaluated based upon the resulting visual quality of the seam-cut.
Abstract: Image stitching computes geometric transforms to align images based on the best fit of feature correspondences between overlapping images. Seam-cutting is used afterwards to hide misalignment artifacts. Interestingly, it is often the seam-cutting step that is the most crucial for obtaining a perceptually seamless result. This motivates us to propose a seam-driven image stitching strategy where, instead of estimating a geometric transform based on the best fit of feature correspondences, we evaluate the goodness of a transform based on the resulting visual quality of the seam-cut. We show that this new image stitching strategy can often produce better perceptual results than existing methods, especially for challenging scenes.
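The seam-driven selection idea can be sketched as: score each candidate warp by how visible its seam would be, then keep the least-visible one. This toy version uses summed absolute intensity difference along the seam as the visibility proxy; the paper's actual seam-quality measure is more involved, and these helper names are ours:

```python
import numpy as np

def seam_cost(img_a, img_b, seam_cols):
    """Sum of absolute intensity differences along a candidate seam.
    seam_cols[r] gives the seam's column in row r of the overlap."""
    rows = np.arange(len(seam_cols))
    return float(np.abs(img_a[rows, seam_cols].astype(float)
                        - img_b[rows, seam_cols].astype(float)).sum())

def best_alignment(img_a, warped_candidates, seam_cols):
    # Seam-driven selection: pick the candidate warp whose seam is
    # least visible, rather than the warp with the best feature fit.
    costs = [seam_cost(img_a, w, seam_cols) for w in warped_candidates]
    return int(np.argmin(costs))
```

A perfectly aligned candidate yields zero seam cost, so it wins even if some other candidate fit the feature correspondences better.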

104 citations

Journal ArticleDOI
TL;DR: A novel, custom-designed multi-robot platform for research on AI, robotics, and especially human–robot interaction for service robots designed as a part of the Building-Wide Intelligence project at the University of Texas at Austin is introduced.
Abstract: Recent progress in both AI and robotics have enabled the development of general purpose robot platforms that are capable of executing a wide variety of complex, temporally extended service tasks in open environments. This article introduces a novel, custom-designed multi-robot platform for research on AI, robotics, and especially human–robot interaction for service robots. Called BWIBots, the robots were designed as a part of the Building-Wide Intelligence (BWI) project at the University of Texas at Austin. The article begins with a description of, and justification for, the hardware and software design decisions underlying the BWIBots, with the aim of informing the design of such platforms in the future. It then proceeds to present an overview of various research contributions that have enabled the BWIBots to better (a) execute action sequences to complete user requests, (b) efficiently ask questions to resolve user requests, (c) understand human commands given in natural language, and (d) understand hum...

104 citations


Cites methods from "Distinctive Image Features from Sca..."

  • ...It has primarily been used for detecting objects using SIFT visual features (Lowe, 2004)....



Proceedings ArticleDOI
30 Oct 2009
TL;DR: This paper proposes to exploit SURF features in face recognition, leveraging the advantages of SURF, a scale- and in-plane-rotation-invariant detector and descriptor with performance comparable to or even better than SIFT.
Abstract: The Scale Invariant Feature Transform (SIFT) proposed by David G. Lowe has been used in face recognition and proved to perform well. Recently, a new detector and descriptor, named Speeded-Up Robust Features (SURF), suggested by Herbert Bay, has attracted attention. SURF is a scale and in-plane rotation invariant detector and descriptor with performance comparable to or even better than SIFT. Because each SURF feature has only 64 dimensions in general and an indexing scheme is built using the sign of the Laplacian, SURF is much faster than the 128-dimensional SIFT at the matching step. Based on the above advantages of SURF, we propose to exploit SURF features in face recognition in this paper.
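The Laplacian-sign indexing mentioned in the abstract above can be sketched in NumPy: descriptors are only compared when their interest points share the same sign of the Laplacian (bright blob on dark background vs. dark on bright), which skips roughly half the candidates. The function name is ours, and this omits SURF's actual detection and description stages:

```python
import numpy as np

def match_with_laplacian_sign(desc_a, signs_a, desc_b, signs_b):
    """Nearest-neighbour matching restricted to candidates whose
    Laplacian sign agrees, as in SURF's fast matching scheme."""
    matches = []
    for i, (d, s) in enumerate(zip(desc_a, signs_a)):
        candidates = np.where(signs_b == s)[0]   # same-sign descriptors only
        if candidates.size == 0:
            continue
        dists = np.linalg.norm(desc_b[candidates] - d, axis=1)
        matches.append((i, int(candidates[np.argmin(dists)])))
    return matches
```

Note that a closer descriptor with the opposite sign is never considered: the sign acts as a free one-bit index, not as part of the distance.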

104 citations


Cites methods from "Distinctive Image Features from Sca..."

  • ...In this paper, based on the point matching method suggested in [5][6], we introduce geometric constraints into point-matching based on SURF features to increase the matching speed and robustness....


  • ...Lowe [5][6] has been widely used in object detection and recognition....


  • ...In [4][3] they all used the point matching method mentioned in [5][6] as a part of their evaluation of matching....


References
Journal ArticleDOI
TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.
Abstract: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene. The features are invariant to image scale and rotation, and are shown to provide robust matching across a substantial range of affine distortion, change in 3D viewpoint, addition of noise, and change in illumination. The features are highly distinctive, in the sense that a single feature can be correctly matched with high probability against a large database of features from many images. This paper also describes an approach to using these features for object recognition. The recognition proceeds by matching individual features to a database of features from known objects using a fast nearest-neighbor algorithm, followed by a Hough transform to identify clusters belonging to a single object, and finally performing verification through least-squares solution for consistent pose parameters. This approach to recognition can robustly identify objects among clutter and occlusion while achieving near real-time performance.
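The Hough-transform step in the recognition pipeline described above can be sketched as coarse pose voting: each feature match votes for a quantized pose, and a cluster of consistent votes indicates a single object. This toy version votes only on 2-D translation (the full method also bins scale and orientation), and the names and 30-pixel bin size are illustrative:

```python
from collections import Counter

def hough_pose_clusters(matches, bin_size=30.0):
    """matches: list of ((x_model, y_model), (x_image, y_image)) pairs.
    Returns the most-voted quantized translation and its vote count."""
    votes = Counter()
    for (xa, ya), (xb, yb) in matches:
        tx = round((xb - xa) / bin_size)   # quantized x-translation
        ty = round((yb - ya) / bin_size)   # quantized y-translation
        votes[(tx, ty)] += 1
    return votes.most_common(1)[0] if votes else None
```

Outlier matches scatter across bins while correct ones pile up in one, so the winning bin survives heavy clutter; the paper then verifies each cluster with a least-squares pose fit.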

46,906 citations

Proceedings ArticleDOI
20 Sep 1999
TL;DR: Experimental results show that robust object recognition can be achieved in cluttered partially occluded images with a computation time of under 2 seconds.
Abstract: An object recognition system has been developed that uses a new class of local image features. The features are invariant to image scaling, translation, and rotation, and partially invariant to illumination changes and affine or 3D projection. These features share similar properties with neurons in inferior temporal cortex that are used for object recognition in primate vision. Features are efficiently detected through a staged filtering approach that identifies stable points in scale space. Image keys are created that allow for local geometric deformations by representing blurred image gradients in multiple orientation planes and at multiple scales. The keys are used as input to a nearest neighbor indexing method that identifies candidate object matches. Final verification of each match is achieved by finding a low residual least squares solution for the unknown model parameters. Experimental results show that robust object recognition can be achieved in cluttered partially occluded images with a computation time of under 2 seconds.

16,989 citations

Proceedings ArticleDOI
01 Jan 1988
TL;DR: The problem the authors are addressing in Alvey Project MMI149 is that of using computer vision to understand the unconstrained 3D world, in which the viewed scenes will in general contain too wide a diversity of objects for top-down recognition techniques to work.
Abstract: The problem we are addressing in Alvey Project MMI149 is that of using computer vision to understand the unconstrained 3D world, in which the viewed scenes will in general contain too wide a diversity of objects for top-down recognition techniques to work. For example, we desire to obtain an understanding of natural scenes, containing roads, buildings, trees, bushes, etc., as typified by the two frames from a sequence illustrated in Figure 1. The solution to this problem that we are pursuing is to use a computer vision system based upon motion analysis of a monocular image sequence from a mobile camera. By extraction and tracking of image features, representations of the 3D analogues of these features can be constructed.

13,993 citations

Journal ArticleDOI
TL;DR: It is observed that the ranking of the descriptors is mostly independent of the interest region detector and that the SIFT-based descriptors perform best and Moments and steerable filters show the best performance among the low dimensional descriptors.
Abstract: In this paper, we compare the performance of descriptors computed for local interest regions, as, for example, extracted by the Harris-Affine detector [Mikolajczyk, K and Schmid, C, 2004]. Many different descriptors have been proposed in the literature. It is unclear which descriptors are more appropriate and how their performance depends on the interest region detector. The descriptors should be distinctive and at the same time robust to changes in viewing conditions as well as to errors of the detector. Our evaluation uses as criterion recall with respect to precision and is carried out for different image transformations. We compare shape context [Belongie, S, et al., April 2002], steerable filters [Freeman, W and Adelson, E, Setp. 1991], PCA-SIFT [Ke, Y and Sukthankar, R, 2004], differential invariants [Koenderink, J and van Doorn, A, 1987], spin images [Lazebnik, S, et al., 2003], SIFT [Lowe, D. G., 1999], complex filters [Schaffalitzky, F and Zisserman, A, 2002], moment invariants [Van Gool, L, et al., 1996], and cross-correlation for different types of interest regions. We also propose an extension of the SIFT descriptor and show that it outperforms the original method. Furthermore, we observe that the ranking of the descriptors is mostly independent of the interest region detector and that the SIFT-based descriptors perform best. Moments and steerable filters show the best performance among the low dimensional descriptors.
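The evaluation criterion used in this comparison (recall with respect to 1 − precision) reduces to two ratios over ground-truth correspondences. A small sketch, with a helper name of our choosing:

```python
def recall_and_one_minus_precision(n_correct, n_false, n_correspondences):
    """Descriptor-evaluation criterion as commonly stated:
    recall        = correct matches / ground-truth correspondences
    1 - precision = false matches / all matches returned."""
    recall = n_correct / n_correspondences
    one_minus_precision = n_false / (n_correct + n_false)
    return recall, one_minus_precision
```

Sweeping the matching threshold traces out a recall vs. 1 − precision curve, and descriptors are ranked by how much recall they achieve at a given false-match rate.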

7,057 citations

Journal ArticleDOI
TL;DR: The high utility of MSERs, multiple measurement regions and the robust metric is demonstrated in wide-baseline experiments on image pairs from both indoor and outdoor scenes.

3,422 citations
