Distinctive Image Features from Scale-Invariant Keypoints

Home
/
Papers
/
Distinctive Image Features from Scale-Invariant Keypoints

Distinctive Image Features from Scale-Invariant Keypoints

01 Jan 2011-

TL;DR: The Scale-Invariant Feature Transform (or SIFT) algorithm is a highly robust method to extract and consequently match distinctive invariant features from images that can then be used to reliably match objects in diering images.

read less

Abstract: The Scale-Invariant Feature Transform (or SIFT) algorithm is a highly robust method to extract and consequently match distinctive invariant features from images. These features can then be used to reliably match objects in diering images. The algorithm was rst proposed by Lowe [12] and further developed to increase performance resulting in the classic paper [13] that served as foundation for SIFT which has played an important role in robotic and machine vision in the past decade.

...read moreread less

Citations

PDF

Open Access

More filters

Proceedings Article•DOI•

Combining Self Training and Active Learning for Video Segmentation

[...]

Alireza Fathi¹, Maria-Florina Balcan¹, Xiaofeng Ren², James M. Rehg¹•Institutions (2)

Georgia Institute of Technology¹, Intel²

01 Sep 2011

TL;DR: It is shown that video object segmentation can be naturally cast as a semi-supervised learning problem and be efficiently solved using harmonic functions and an incremental self-training approach by iteratively labeling the least uncertain frame and updating similarity metrics is proposed.

...read moreread less

Abstract: This work addresses the problem of segmenting an object of interest out of a video. We show that video object segmentation can be naturally cast as a semi-supervised learning problem and be efficiently solved using harmonic functions. We propose an incremental self-training approach by iteratively labeling the least uncertain frame and updating similarity metrics. Our self-training video segmentation produces superior results both qualitatively and quantitatively. Moreover, usage of harmonic functions naturally supports interactive segmentation. We suggest active learning methods for providing guidance to user on what to annotate in order to improve labeling efficiency. We present experimental results using a ground truth data set and a quantitative comparison to a representative object segmentation system.

...read moreread less

109 citations

Cites methods from "Distinctive Image Features from Sca..."

...In addition to optical flow, we also compute sparse SIFT features[16] correspondence....
[...]

Proceedings Article•DOI•

Towards semantic knowledge propagation from text corpus to web images

[...]

Guo-Jun Qi¹, Charu C. Aggarwal², Thomas S. Huang¹•Institutions (2)

University of Illinois at Urbana–Champaign¹, IBM²

28 Mar 2011

TL;DR: A mathematical model for the functional relationships between text and image features is developed so as to indirectly transfer semantic knowledge through feature transformations, which is accomplished by mapping instances from different domains into a common space of unspecific topics.

...read moreread less

Abstract: In this paper, we study the problem of transfer learning from text to images in the context of network data in which link based bridges are available to transfer the knowledge between the different domains. The problem of classification of image data is often much more challenging than text data because of the following two reasons: (a) Labeled text data is very widely available for classification purposes. On the other hand, this is often not the case for image data, in which a lot of images are available from many sources, but many of them are often not labeled. (b) The image features are not directly related to semantic concepts inherent in class labels. On the other hand, since text data tends to have natural semantic interpretability (because of their human origins), they are often more directly related to class labels. Therefore, the relationships between the images and text features also provide additional hints for the classification process in terms of the image feature transformations which provide the most effective results. The semantic challenges of image features are glaringly evident, when we attempt to recognize complex abstract concepts, and the visual features often fail to discriminate such concepts. However, the copious availability of bridging relationships between text and images in the context of web and social network data can be used in order to design for effective classifiers for image data. One of our goals in this paper is to develop a mathematical model for the functional relationships between text and image features, so as indirectly transfer semantic knowledge through feature transformations. This feature transformation is accomplished by mapping instances from different domains into a common space of unspecific topics. This is used as a bridge to semantically connect the two heterogeneous spaces. This is also helpful for the cases where little image data is available for the classification process. We evaluate our knowledge transfer techniques on an image classification task with labeled text corpora and show the effectiveness with respect to competing algorithms.

...read moreread less

109 citations

Cites methods from "Distinctive Image Features from Sca..."

...These include the 500 dimensional bag of words based on SIFT descriptors [10]....
[...]

Journal Article•DOI•

Feature-based morphometry: discovering group-related anatomical patterns.

[...]

Matthew Toews¹, William M. Wells², William M. Wells¹, D. Louis Collins³, Tal Arbel⁴ - Show less +1 more•Institutions (4)

Brigham and Women's Hospital¹, Massachusetts Institute of Technology², Montreal Neurological Institute and Hospital³, McGill University⁴

01 Feb 2010-NeuroImage

TL;DR: In this article, feature-based morphometry (FBM) is proposed to identify distinctive anatomical patterns that may only be present in subsets of subjects, due to disease or anatomical variability.

...read moreread less

109 citations

Cites background or methods from "Distinctive Image Features from Sca..."

...…Hð Þ of the determinant vs. the cubed trace of H is low indicate degenerate patterns whose localization within the image is under-determined, and a threshold can be imposed based on this ratio to discard such Discovering group-related anatomical patterns, NeuroImage (2009), features (Lowe, 2004)....
[...]
...Such patterns are identified and represented as distinctive scaleinvariant features (Lowe, 2004; Mikolajczyk and Schmid, 2004), i.e., generic image patterns that can be automatically extracted in the image by a front-end salient feature detector....
[...]
...The DOG scale-space (Lowe, 2004) is used to identify feature geometries (xi, σi)....
[...]
...Saliency in scale-spaces is commonly formulated in terms of derivative operators (Lowe, 2004; Mikolajczyk and Schmid, 2004), which reflect changes in image content with respect to changes in location and/or scale....
[...]
...…be robustly identified despite image intensity variations due to factors such as scanner non-uniformity, etc. Finally, features can be efficiently extracted in O(N log N) time and space complexity using image pyramid data representations (Lowe, 2004), where N is the size of the image in voxels....
[...]

Journal Article•DOI•

Combined 2D-3D categorization and classification for multimodal perception systems

[...]

Zoltan-Csaba Marton¹, Dejan Pangercic¹, Nico Blodow¹, Michael Beetz¹•Institutions (1)

Technische Universität München¹

01 Sep 2011-The International Journal of Robotics Research

TL;DR: The system employs a library of specialized perception routines that solve different, well-defined perceptual sub-tasks and can be combined into composite perceptual activities including the construction of an object model database, multimodal object classification, and object model reconstruction for grasping.

...read moreread less

Abstract: In this article we describe an object perception system for autonomous robots performing everyday manipulation tasks in kitchen environments. The perception system gains its strengths by exploiting that the robots are to perform the same kinds of tasks with the same objects over and over again. It does so by learning the object representations necessary for the recognition and reconstruction in the context of pick-and-place tasks. The system employs a library of specialized perception routines that solve different, well-defined perceptual sub-tasks and can be combined into composite perceptual activities including the construction of an object model database, multimodal object classification, and object model reconstruction for grasping. We evaluate the effectiveness of our methods, and give examples of application scenarios using our personal robotic assistants acting in a human living environment.

...read moreread less

109 citations

Cites background or methods from "Distinctive Image Features from Sca..."

...Depending on the type of perception data, various different 2D (e.g. Lowe 2004) and 3D (e.g. Rusu et al. 2008a) distinctive local features have been developed....
[...]
...In order to extract the visual SIFT features from the images we use an open-source implementation of the standard SIFT algorithm (Fast SIFT Image Features Library12) as initially described by Lowe (2004). Each SIFT feature is characterized by a 128-dimensional descriptor vector, 2 image coordinates, a scale and an orientation value....
[...]
...As argued by Marton et al. (2010a), the approximated radii of the smallest and biggest fitting curves to a local neighborhood are values with physical meaning, which can be tied directly to the underlying surface without the need for classification....
[...]
...2), Scale-Invariant Feature Transform (SIFT) (Lowe 2004) feature using Vocabulary Trees6 (Section 6....
[...]
...Extracting SIFT features In order to extract the visual SIFT features from the images we use an open-source implementation of the standard SIFT algorithm (Fast SIFT Image Features Library12) as initially described by Lowe (2004)....
[...]

Proceedings Article•DOI•

Automatic Personality and Interaction Style Recognition from Facebook Profile Pictures

[...]

Fabio Celli¹, Elia Bruni², Bruno Lepri•Institutions (2)

University of Trento¹, Free University of Bozen-Bolzano²

03 Nov 2014

TL;DR: This paper recruited volunteers among Facebook users and collected a dataset of profile pictures, labeled with gold standard self-assessed personality and interaction style labels, and exploited a bag-of-visual-words technique to extract features from pictures.

...read moreread less

Abstract: In this paper, we address the issue of personality and interaction style recognition from profile pictures in Facebook. We recruited volunteers among Facebook users and collected a dataset of profile pictures, labeled with gold standard self-assessed personality and interaction style labels. Then, we exploited a bag-of-visual-words technique to extract features from pictures. Finally, different machine learning approaches were used to test the effectiveness of these features in predicting personality and interaction style traits. Our good results show that this task is very promising, because profile pictures convey a lot of information about a user and are directly connected to impression formation and identity management.

...read moreread less

108 citations

Cites methods from "Distinctive Image Features from Sca..."

...one of the most popular and effective feature extraction technique used for object recognition [20]....
[...]

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
…
123
124
125
126
127
128
129
…
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse

References

PDF

Open Access

More filters

Journal Article•DOI•

Distinctive Image Features from Scale-Invariant Keypoints

[...]

David G. Lowe¹•Institutions (1)

University of British Columbia¹

01 Nov 2004-International Journal of Computer Vision

TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.

...read moreread less

Abstract: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene. The features are invariant to image scale and rotation, and are shown to provide robust matching across a substantial range of affine distortion, change in 3D viewpoint, addition of noise, and change in illumination. The features are highly distinctive, in the sense that a single feature can be correctly matched with high probability against a large database of features from many images. This paper also describes an approach to using these features for object recognition. The recognition proceeds by matching individual features to a database of features from known objects using a fast nearest-neighbor algorithm, followed by a Hough transform to identify clusters belonging to a single object, and finally performing verification through least-squares solution for consistent pose parameters. This approach to recognition can robustly identify objects among clutter and occlusion while achieving near real-time performance.

...read moreread less

46,906 citations

Proceedings Article•DOI•

Object recognition from local scale-invariant features

[...]

David G. Lowe¹•Institutions (1)

University of British Columbia¹

20 Sep 1999

TL;DR: Experimental results show that robust object recognition can be achieved in cluttered partially occluded images with a computation time of under 2 seconds.

...read moreread less

Abstract: An object recognition system has been developed that uses a new class of local image features. The features are invariant to image scaling, translation, and rotation, and partially invariant to illumination changes and affine or 3D projection. These features share similar properties with neurons in inferior temporal cortex that are used for object recognition in primate vision. Features are efficiently detected through a staged filtering approach that identifies stable points in scale space. Image keys are created that allow for local geometric deformations by representing blurred image gradients in multiple orientation planes and at multiple scales. The keys are used as input to a nearest neighbor indexing method that identifies candidate object matches. Final verification of each match is achieved by finding a low residual least squares solution for the unknown model parameters. Experimental results show that robust object recognition can be achieved in cluttered partially occluded images with a computation time of under 2 seconds.

...read moreread less

16,989 citations

Proceedings Article•DOI•

A Combined Corner and Edge Detector

[...]

Chris Harris, Mike Stephens

01 Jan 1988

TL;DR: The problem the authors are addressing in Alvey Project MMI149 is that of using computer vision to understand the unconstrained 3D world, in which the viewed scenes will in general contain too wide a diversity of objects for topdown recognition techniques to work.

...read moreread less

Abstract: The problem we are addressing in Alvey Project MMI149 is that of using computer vision to understand the unconstrained 3D world, in which the viewed scenes will in general contain too wide a diversity of objects for topdown recognition techniques to work. For example, we desire to obtain an understanding of natural scenes, containing roads, buildings, trees, bushes, etc., as typified by the two frames from a sequence illustrated in Figure 1. The solution to this problem that we are pursuing is to use a computer vision system based upon motion analysis of a monocular image sequence from a mobile camera. By extraction and tracking of image features, representations of the 3D analogues of these features can be constructed.

...read moreread less

13,993 citations

Journal Article•DOI•

A performance evaluation of local descriptors

[...]

Krystian Mikolajczyk¹, Cordelia Schmid²•Institutions (2)

University of Oxford¹, French Institute for Research in Computer Science and Automation²

01 Oct 2005-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: It is observed that the ranking of the descriptors is mostly independent of the interest region detector and that the SIFT-based descriptors perform best and Moments and steerable filters show the best performance among the low dimensional descriptors.

...read moreread less

Abstract: In this paper, we compare the performance of descriptors computed for local interest regions, as, for example, extracted by the Harris-Affine detector [Mikolajczyk, K and Schmid, C, 2004]. Many different descriptors have been proposed in the literature. It is unclear which descriptors are more appropriate and how their performance depends on the interest region detector. The descriptors should be distinctive and at the same time robust to changes in viewing conditions as well as to errors of the detector. Our evaluation uses as criterion recall with respect to precision and is carried out for different image transformations. We compare shape context [Belongie, S, et al., April 2002], steerable filters [Freeman, W and Adelson, E, Setp. 1991], PCA-SIFT [Ke, Y and Sukthankar, R, 2004], differential invariants [Koenderink, J and van Doorn, A, 1987], spin images [Lazebnik, S, et al., 2003], SIFT [Lowe, D. G., 1999], complex filters [Schaffalitzky, F and Zisserman, A, 2002], moment invariants [Van Gool, L, et al., 1996], and cross-correlation for different types of interest regions. We also propose an extension of the SIFT descriptor and show that it outperforms the original method. Furthermore, we observe that the ranking of the descriptors is mostly independent of the interest region detector and that the SIFT-based descriptors perform best. Moments and steerable filters show the best performance among the low dimensional descriptors.

...read moreread less

7,057 citations

Journal Article•DOI•

Robust wide-baseline stereo from maximally stable extremal regions

[...]

Jiri Matas¹, Ondrej Chum, Martin Urban, Tomas Pajdla•Institutions (1)

University of Surrey¹

01 Sep 2004-Image and Vision Computing

TL;DR: The high utility of MSERs, multiple measurement regions and the robust metric is demonstrated in wide-baseline experiments on image pairs from both indoor and outdoor scenes.

...read moreread less

3,422 citations