Distinctive Image Features from Scale-Invariant Keypoints

Home
/
Papers
/
Distinctive Image Features from Scale-Invariant Keypoints

Distinctive Image Features from Scale-Invariant Keypoints

01 Jan 2011-

TL;DR: The Scale-Invariant Feature Transform (or SIFT) algorithm is a highly robust method to extract and consequently match distinctive invariant features from images that can then be used to reliably match objects in diering images.

read less

Abstract: The Scale-Invariant Feature Transform (or SIFT) algorithm is a highly robust method to extract and consequently match distinctive invariant features from images. These features can then be used to reliably match objects in diering images. The algorithm was rst proposed by Lowe [12] and further developed to increase performance resulting in the classic paper [13] that served as foundation for SIFT which has played an important role in robotic and machine vision in the past decade.

...read moreread less

Citations

PDF

Open Access

More filters

Journal Article•DOI•

Fiji: an open-source platform for biological-image analysis

[...]

Johannes Schindelin¹, Ignacio Arganda-Carreras², Erwin Frise³, Verena Kaynig⁴, Mark Longair⁴, Tobias Pietzsch¹, Stephan Preibisch¹, Curtis Rueden⁵, Stephan Saalfeld¹, Benjamin Schmid¹, Jean-Yves Tinevez¹, Daniel J. White¹, Volker Hartenstein¹, Kevin W. Eliceiri⁵, Pavel Tomancak¹, Albert Cardona¹ - Show less +12 more•Institutions (5)

Max Planck Society¹, Massachusetts Institute of Technology², Lawrence Berkeley National Laboratory³, ETH Zurich⁴, University of Wisconsin-Madison⁵

01 Jul 2012-Nature Methods

TL;DR: Fiji is a distribution of the popular open-source software ImageJ focused on biological-image analysis that facilitates the transformation of new algorithms into ImageJ plugins that can be shared with end users through an integrated update system.

...read moreread less

Abstract: Fiji is a distribution of the popular open-source software ImageJ focused on biological-image analysis. Fiji uses modern software engineering practices to combine powerful software libraries with a broad range of scripting languages to enable rapid prototyping of image-processing algorithms. Fiji facilitates the transformation of new algorithms into ImageJ plugins that can be shared with end users through an integrated update system. We propose Fiji as a platform for productive collaboration between computer science and biology research communities.

...read moreread less

43,540 citations

Journal Article•DOI•

The Pascal Visual Object Classes (VOC) Challenge

[...]

Mark Everingham¹, Luc Van Gool², Christopher Williams³, John Winn⁴, Andrew Zisserman⁵ - Show less +1 more•Institutions (5)

University of Leeds¹, Katholieke Universiteit Leuven², University of Edinburgh³, Microsoft⁴, University of Oxford⁵

01 Jun 2010-International Journal of Computer Vision

TL;DR: The state-of-the-art in evaluated methods for both classification and detection are reviewed, whether the methods are statistically different, what they are learning from the images, and what the methods find easy or confuse.

...read moreread less

Abstract: The Pascal Visual Object Classes (VOC) challenge is a benchmark in visual object category recognition and detection, providing the vision and machine learning communities with a standard dataset of images and annotation, and standard evaluation procedures. Organised annually from 2005 to present, the challenge and its associated dataset has become accepted as the benchmark for object detection. This paper describes the dataset and evaluation procedure. We review the state-of-the-art in evaluated methods for both classification and detection, analyse whether the methods are statistically different, what they are learning from the images (e.g. the object or its context), and what the methods find easy or confuse. The paper concludes with lessons learnt in the three year history of the challenge, and proposes directions for future improvement and extension.

...read moreread less

15,935 citations

Proceedings Article•DOI•

ORB: An efficient alternative to SIFT or SURF

[...]

Ethan Rublee¹, Vincent Rabaud¹, Kurt Konolige¹, Gary Bradski¹•Institutions (1)

Willow Garage¹

06 Nov 2011

TL;DR: This paper proposes a very fast binary descriptor based on BRIEF, called ORB, which is rotation invariant and resistant to noise, and demonstrates through experiments how ORB is at two orders of magnitude faster than SIFT, while performing as well in many situations.

...read moreread less

Abstract: Feature matching is at the base of many computer vision problems, such as object recognition or structure from motion. Current methods rely on costly descriptors for detection and matching. In this paper, we propose a very fast binary descriptor based on BRIEF, called ORB, which is rotation invariant and resistant to noise. We demonstrate through experiments how ORB is at two orders of magnitude faster than SIFT, while performing as well in many situations. The efficiency is tested on several real-world applications, including object detection and patch-tracking on a smart phone.

...read moreread less

8,702 citations

Proceedings Article•DOI•

Scalable Person Re-identification: A Benchmark

[...]

Liang Zheng¹, Liang Zheng², Liyue Shen¹, Lu Tian¹, Shengjin Wang¹, Jingdong Wang³, Qi Tian¹ - Show less +3 more•Institutions (3)

Tsinghua University¹, University of Texas at San Antonio², Microsoft³

07 Dec 2015

TL;DR: A minor contribution, inspired by recent advances in large-scale image search, an unsupervised Bag-of-Words descriptor is proposed that yields competitive accuracy on VIPeR, CUHK03, and Market-1501 datasets, and is scalable on the large- scale 500k dataset.

...read moreread less

Abstract: This paper contributes a new high quality dataset for person re-identification, named "Market-1501". Generally, current datasets: 1) are limited in scale, 2) consist of hand-drawn bboxes, which are unavailable under realistic settings, 3) have only one ground truth and one query image for each identity (close environment). To tackle these problems, the proposed Market-1501 dataset is featured in three aspects. First, it contains over 32,000 annotated bboxes, plus a distractor set of over 500K images, making it the largest person re-id dataset to date. Second, images in Market-1501 dataset are produced using the Deformable Part Model (DPM) as pedestrian detector. Third, our dataset is collected in an open system, where each identity has multiple images under each camera. As a minor contribution, inspired by recent advances in large-scale image search, this paper proposes an unsupervised Bag-of-Words descriptor. We view person re-identification as a special task of image search. In experiment, we show that the proposed descriptor yields competitive accuracy on VIPeR, CUHK03, and Market-1501 datasets, and is scalable on the large-scale 500k dataset.

...read moreread less

3,564 citations

Cites methods from "Distinctive Image Features from Sca..."

...On the other hand, the field of image search has been greatly advanced since the introduction of the SIFT descriptor [24] and the BoW model....
[...]

Proceedings Article•DOI•

Vlfeat: an open and portable library of computer vision algorithms

[...]

Andrea Vedaldi¹, Brian Fulkerson²•Institutions (2)

University of Oxford¹, University of California, Los Angeles²

25 Oct 2010

TL;DR: VLFeat is an open and portable library of computer vision algorithms that includes rigorous implementations of common building blocks such as feature detectors, feature extractors, (hierarchical) k-means clustering, randomized kd-tree matching, and super-pixelization.

...read moreread less

Abstract: VLFeat is an open and portable library of computer vision algorithms. It aims at facilitating fast prototyping and reproducible research for computer vision scientists and students. It includes rigorous implementations of common building blocks such as feature detectors, feature extractors, (hierarchical) k-means clustering, randomized kd-tree matching, and super-pixelization. The source code and interfaces are fully documented. The library integrates directly with MATLAB, a popular language for computer vision research.

...read moreread less

3,417 citations

Cites methods from "Distinctive Image Features from Sca..."

...The Scale Invariant Feature Transform (SIFT) [8, 9] is probably...
[...]

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse

References

PDF

Open Access

More filters

Proceedings Article•DOI•

Invariant Features from Interest Point Groups

[...]

Matthew Brown¹, David G. Lowe¹•Institutions (1)

University of British Columbia¹

01 Sep 2002

TL;DR: This work introduces a family of features which use groups of interest points to form geometrically invariant descriptors of image regions to ensure robust matching between images in which there are large changes in viewpoint, scale and illumi- nation.

...read moreread less

Abstract: This paper approaches the problem of ¯nding correspondences between images in which there are large changes in viewpoint, scale and illumi- nation. Recent work has shown that scale-space `interest points' may be found with good repeatability in spite of such changes. Further- more, the high entropy of the surrounding image regions means that local descriptors are highly discriminative for matching. For descrip- tors at interest points to be robustly matched between images, they must be as far as possible invariant to the imaging process. In this work we introduce a family of features which use groups of interest points to form geometrically invariant descriptors of image regions. Feature descriptors are formed by resampling the image rel- ative to canonical frames de¯ned by the points. In addition to robust matching, a key advantage of this approach is that each match implies a hypothesis of the local 2D (projective) transformation. This allows us to immediately reject most of the false matches using a Hough trans- form. We reject remaining outliers using RANSAC and the epipolar constraint. Results show that dense feature matching can be achieved in a few seconds of computation on 1GHz Pentium III machines.

...read moreread less

723 citations

A Comparison of SIFT, PCA-SIFT and SURF

[...]

S. Govindarajulu, Kumar Reddy

01 Jan 2012

TL;DR: KNN (K-Nearest Neighbor) and Random Sample Consensus (RANSAC) are added to the three robust feature detection methods in order to analyze the results of the methods‟ application in recognition.

...read moreread less

Abstract: This paper summarizes the three robust feature detection methods: Scale Invariant Feature Transform (SIFT), Principal Component Analysis (PCA–SIFT) and Speeded Up Robust Features (SURF). This paper uses KNN (K-Nearest Neighbor) and Random Sample Consensus (RANSAC) to the three methods in order to analyze the results of the methods‟ application in recognition. KNN is used to find the matches, and RANSAC to reject inconsistent matches from which the inliers can take as correct matches. The performance of the robust feature detection methods are compared for scale changes, rotation, and blur. All the experiments use repeatability measurement and the number of correct matches for the evaluation measurements. SIFT presents its stability in most situations although it‟s slow. SURF is the fastest one with good performance as the same as SIFT. PCA-SIFT show its advantages in rotation and illumination changes.

...read moreread less

612 citations

Journal Article•DOI•

Detecting salient blob-like image structures and their scales with a scale-space primal sketch: a method for focus-of-attention

[...]

Tony Lindeberg¹•Institutions (1)

Royal Institute of Technology¹

01 Dec 1993-International Journal of Computer Vision

TL;DR: In this article, a multiscale representation of grey-level shape called the scale-space primal sketch is presented, which makes explicit both features in scale space and the relations between structures at different scales, and a methodology for extracting significant blob-like image structures from this representation.

...read moreread less

Abstract: This article presents: (i) a multiscale representation of grey-level shape called the scale-space primal sketch, which makes explicit both features in scale-space and the relations between structures at different scales, (ii) a methodology for extracting significant blob-like image structures from this representation, and (iii) applications to edge detection, histogram analysis, and junction classification demonstrating how the proposed method can be used for guiding later-stage visual processes. The representation gives a qualitative description of image structure, which allows for detection of stable scales and associated regions of interest in a solely bottom-up data-driven way. In other words, it generates coarse segmentation cues, and can hence be seen as preceding further processing, which can then be properly tuned. It is argued that once such information is available, many other processing tasks can become much simpler. Experiments on real imagery demonstrate that the proposed theory gives intuitive results.

...read moreread less

523 citations

Journal Article•

Detecting salient blob-like image structures and their scales with a scale-space primal sketch: a me

[...]

Tony Lindeberg

01 Jan 1994-International Journal of Computer Vision

TL;DR: A multiscale representation of grey-level shape called the scale-space primal sketch is presented, which gives a qualitative description of image structure, which allows for detection of stable scales and associated regions of interest in a solely bottom-up data-driven way.

...read moreread less

Abstract: This article presents: (i) a multiscale representation of grey-level shape called the scale-space primal sketch, which makes explicit both features in scale-space and the relations between structures at different scales, (ii) a methodology for extracting significant blob-like image structures from this representation, and (iii) applications to edge detection, histogram analysis, and junction classification demonstrating how the proposed method can be used for guiding later-stage visual processes.The representation gives a qualitative description of image structure, which allows for detection of stable scales and associated regions of interest in a solely bottom-up data-driven way. In other words, it generates coarse segmentation cues, and can hence be seen as preceding further processing, which can then be properly tuned. It is argued that once such information is available, many other processing tasks can become much simpler. Experiments on real imagery demonstrate that the proposed theory gives intuitive results.

...read moreread less

449 citations

Proceedings Article•DOI•

Shape recognition with edge-based features

[...]

Krystian Mikolajczyk, Andrew Zisserman, Cordelia Schmid

08 Sep 2003

TL;DR: An approach to recognizing poorly textured objects, that may contain holes and tubular parts, in cluttered scenes under arbitrary viewing conditions is described and a new edge-based local feature detector that is invariant to similarity transformations is introduced.

...read moreread less

Abstract: In this paper we describe an approach to recognizing poorly textured objects, that may contain holes and tubular parts, in cluttered scenes under arbitrary viewing conditions. To this end we develop a number of novel components. First, we introduce a new edge-based local feature detector that is invariant to similarity transformations. The features are localized on edges and a neighbourhood is estimated in a scale invariant manner. Second, the neighbourhood descriptor computed for foreground features is not affected by background clutter, even if the feature is on an object boundary. Third, the descriptor generalizes Lowe's SIFT method to edges. An object model is learnt from a single training image. The object is then recognized in new images in a series of steps which apply progressively tighter geometric restrictions. A final contribution of this work is to allow sufficient flexibility in the geometric representation that objects in the same visual class can be recognized. Results are demonstrated for various object classes including bikes and rackets.

...read moreread less

234 citations