Distinctive Image Features from Scale-Invariant Keypoints

doi:10.1023/B:VISI.0000029664.99615.94

Journal Article•DOI•

Distinctive Image Features from Scale-Invariant Keypoints

David G. Lowe¹•Institutions (1)

University of British Columbia¹

01 Nov 2004-International Journal of Computer Vision (Kluwer Academic Publishers)-Vol. 60, Iss: 2, pp 91-110

TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.

Abstract: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene. The features are invariant to image scale and rotation, and are shown to provide robust matching across a substantial range of affine distortion, change in 3D viewpoint, addition of noise, and change in illumination. The features are highly distinctive, in the sense that a single feature can be correctly matched with high probability against a large database of features from many images. This paper also describes an approach to using these features for object recognition. The recognition proceeds by matching individual features to a database of features from known objects using a fast nearest-neighbor algorithm, followed by a Hough transform to identify clusters belonging to a single object, and finally performing verification through least-squares solution for consistent pose parameters. This approach to recognition can robustly identify objects among clutter and occlusion while achieving near real-time performance.
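
The feature-matching step described in the abstract can be illustrated with a minimal sketch. It uses an exhaustive Euclidean nearest-neighbor search as a stand-in for the fast nearest-neighbor algorithm, which the abstract does not specify. The function name `nearest_neighbor_matches` and the toy 3-dimensional descriptors are assumptions made purely for illustration.

```python
import math


def nearest_neighbor_matches(query, database):
    """For each query descriptor, return (index, distance) of the closest
    database descriptor under Euclidean distance.

    Exhaustive search: a simple stand-in for a fast nearest-neighbor method.
    """
    matches = []
    for q in query:
        best_index, best_dist = -1, math.inf
        for i, d in enumerate(database):
            dist = math.dist(q, d)  # Euclidean distance (Python 3.8+)
            if dist < best_dist:
                best_index, best_dist = i, dist
        matches.append((best_index, best_dist))
    return matches


# Hypothetical toy descriptors; real descriptors have many more dimensions.
database = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
query = [(0.9, 0.1, 0.0), (0.1, 0.0, 0.8)]
matches = nearest_neighbor_matches(query, database)
```

In a recognition pipeline of the kind the abstract outlines, matches like these would then be clustered and verified before an object is accepted.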

Citations

Proceedings Article•DOI•

Monocular visual odometry in urban environments using an omnidirectional camera

Jean-Philippe Tardif¹, Y. Pavlidis¹, Kostas Daniilidis¹•Institutions (1)

University of Pennsylvania¹

14 Oct 2008

TL;DR: The key aspect of the system is a fast and simple pose estimation algorithm that uses information not only from the estimated 3D map, but also from the epipolar constraint, which leads to a much more stable estimation of the camera trajectory than the conventional approach.

Abstract: We present a system for monocular simultaneous localization and mapping (mono-SLAM) relying solely on video input. Our algorithm makes it possible to precisely estimate the camera trajectory without relying on any motion model. The estimation is completely incremental: at a given time frame, only the current location is estimated while the previous camera positions are never modified. In particular, we do not perform any simultaneous iterative optimization of the camera positions and estimated 3D structure (local bundle adjustment). The key aspect of the system is a fast and simple pose estimation algorithm that uses information not only from the estimated 3D map, but also from the epipolar constraint. We show that the latter leads to a much more stable estimation of the camera trajectory than the conventional approach. We perform high-precision camera trajectory estimation in urban scenes with a large amount of clutter. Using an omnidirectional camera placed on a vehicle, we cover one of the longest distances ever reported, up to 2.5 kilometers.

259 citations

Journal Article•DOI•

Sparse points matching by combining 3D mesh saliency with statistical descriptors

Umberto Castellani¹, Marco Cristani¹, Simone Fantoni¹, Vittorio Murino¹•Institutions (1)

University of Verona¹

01 Apr 2008-Computer Graphics Forum

TL;DR: A new methodology is proposed for the detection and matching of salient points over several views of an object. Each salient point is modelled by a Hidden Markov Model, which is trained in an unsupervised way using contextual 3D neighborhood information, thus providing a robust and invariant point signature.

Abstract: This paper proposes a new methodology for the detection and matching of salient points over several views of an object. The process is composed of three main phases. In the first step, detection is carried out by adopting a new perceptually-inspired 3D saliency measure. This measure allows the detection of a few sparse salient points that characterize distinctive portions of the surface. In the second step, a statistical learning approach is used to describe salient points across different views. Each salient point is modelled by a Hidden Markov Model (HMM), which is trained in an unsupervised way using contextual 3D neighborhood information, thus providing a robust and invariant point signature. Finally, in the third step, matching among points of different views is performed by evaluating a pairwise similarity measure among HMMs. An extensive comparative experimental session has been carried out on real objects acquired by a 3D scanner from different points of view, where the objects come from standard 3D databases. Results are promising, as the detection of salient points is reliable, and the matching is robust and accurate.

259 citations

Journal Article•DOI•

Real-time traffic sign recognition from video by class-specific discriminative features

Andrzej Ruta¹, Yongmin Li¹, Xiaohui Liu¹•Institutions (1)

Brunel University London¹

01 Jan 2010-Pattern Recognition

TL;DR: An efficient road sign recognition system is built, based on a conventional nearest neighbour classifier and a simple temporal integration scheme, which demonstrates a competitive performance in the experiments involving real traffic video.

259 citations

Journal Article•DOI•

Comparison of SIFT Encoded and Deep Learning Features for the Classification and Detection of Esca Disease in Bordeaux Vineyards

Florian Rançon, Lionel Bombrun, Barna Keresztes, Christian Germain

20 Dec 2018-Remote Sensing

TL;DR: Good correlation between annotated and detected symptomatic surface per plant was obtained, meaning slightly symptomatic plants can be efficiently separated from severely attacked plants, and efficiency of simple transfer learning approaches without the need to design an ad-hoc specific feature extractor is demonstrated.

Abstract: Grapevine wood fungal diseases such as esca are among the biggest threats in vineyards nowadays. The lack of very efficient preventive (best results using commercial products report 20% efficiency) and curative means induces huge economic losses. The study presented in this paper is centered around the in-field detection of foliar esca symptoms during summer, exhibiting a typical “striped” pattern. Indeed, in-field disease detection has shown great potential for commercial applications and has been successfully used for other agricultural needs such as yield estimation. Differentiation with foliar symptoms caused by other diseases or abiotic stresses was also considered. Two vineyards from the Bordeaux region (France, Aquitaine) were chosen as the basis for the experiment. Pictures of diseased and healthy vine plants were acquired during summer 2017 and labeled at the leaf scale, resulting in a patch database of around 6000 images (224 × 224 pixels) divided into red cultivar and white cultivar samples. Then, we tackled the classification part of the problem comparing state-of-the-art SIFT encoding and pre-trained deep learning feature extractors for the classification of database patches. In the best case, 91% overall accuracy was obtained using deep features extracted from MobileNet network trained on ImageNet database, demonstrating the efficiency of simple transfer learning approaches without the need to design an ad-hoc specific feature extractor. The third part aimed at disease detection (using bounding boxes) within full plant images. For this purpose, we integrated the deep learning base network within a “one-step” detection network (RetinaNet), allowing us to perform detection queries in real time (approximately six frames per second on GPU). Recall/Precision (RP) and Average Precision (AP) metrics then allowed us to evaluate the performance of the network on a 91-image (plants) validation database. Overall, 90% precision for a 40% recall was obtained while best esca AP was about 70%. Good correlation between annotated and detected symptomatic surface per plant was also obtained, meaning slightly symptomatic plants can be efficiently separated from severely attacked plants.

259 citations

Cites background or methods from "Distinctive Image Features from Sca..."

...Scale-Invariant Feature Transform (SIFT) [19] is commonly used to describe local regions from an image in a scale and rotational invariant way....
[...]
...SIFT keypoint detection is a powerful method used both for image classification and image correspondence [19]....
[...]

Proceedings Article•DOI•

ReVision: automated classification, analysis and redesign of chart images

Manolis Savva¹, Nicholas Kong², Arti Chhajta¹, Li Fei-Fei¹, Maneesh Agrawala², Jeffrey Heer¹•Institutions (2)

Stanford University¹, University of California, Berkeley²

16 Oct 2011

TL;DR: ReVision is a system that automatically redesigns visualizations to improve graphical perception, and applies perceptually-based design principles to populate an interactive gallery of redesigned charts.

Abstract: Poorly designed charts are prevalent in reports, magazines, books and on the Web. Most of these charts are only available as bitmap images; without access to the underlying data it is prohibitively difficult for viewers to create more effective visual representations. In response we present ReVision, a system that automatically redesigns visualizations to improve graphical perception. Given a bitmap image of a chart as input, ReVision applies computer vision and machine learning techniques to identify the chart type (e.g., pie chart, bar chart, scatterplot, etc.). It then extracts the graphical marks and infers the underlying data. Using a corpus of images drawn from the web, ReVision achieves image classification accuracy of 96% across ten chart categories. It also accurately extracts marks from 79% of bar charts and 62% of pie charts, and from these charts it successfully extracts data from 71% of bar charts and 64% of pie charts. ReVision then applies perceptually-based design principles to populate an interactive gallery of redesigned charts. With this interface, users can view alternative chart designs and retarget content to different visual styles.

258 citations

References

Proceedings Article•DOI•

Object recognition from local scale-invariant features

David G. Lowe¹•Institutions (1)

University of British Columbia¹

20 Sep 1999

TL;DR: Experimental results show that robust object recognition can be achieved in cluttered partially occluded images with a computation time of under 2 seconds.

Abstract: An object recognition system has been developed that uses a new class of local image features. The features are invariant to image scaling, translation, and rotation, and partially invariant to illumination changes and affine or 3D projection. These features share similar properties with neurons in inferior temporal cortex that are used for object recognition in primate vision. Features are efficiently detected through a staged filtering approach that identifies stable points in scale space. Image keys are created that allow for local geometric deformations by representing blurred image gradients in multiple orientation planes and at multiple scales. The keys are used as input to a nearest neighbor indexing method that identifies candidate object matches. Final verification of each match is achieved by finding a low residual least squares solution for the unknown model parameters. Experimental results show that robust object recognition can be achieved in cluttered partially occluded images with a computation time of under 2 seconds.
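
The final verification step, a low-residual least-squares solution for the unknown model parameters, can be sketched as follows. The example assumes a 2D similarity transform (rotation, scale, translation) as the model. This is a simplification chosen for illustration and not necessarily the parameterization used in the paper; `fit_similarity` and the sample points are hypothetical.

```python
def fit_similarity(model_pts, image_pts):
    """Least-squares fit of x' = a*x - b*y + tx, y' = b*x + a*y + ty.

    Returns (a, b, tx, ty). Closed-form solution obtained by centering
    both point sets and solving the normal equations for a and b.
    """
    n = len(model_pts)
    mx = sum(p[0] for p in model_pts) / n
    my = sum(p[1] for p in model_pts) / n
    ix = sum(p[0] for p in image_pts) / n
    iy = sum(p[1] for p in image_pts) / n

    num_a = num_b = denom = 0.0
    for (x, y), (u, v) in zip(model_pts, image_pts):
        xc, yc = x - mx, y - my      # centered model point
        uc, vc = u - ix, v - iy      # centered image point
        num_a += xc * uc + yc * vc
        num_b += xc * vc - yc * uc
        denom += xc * xc + yc * yc

    a = num_a / denom
    b = num_b / denom
    tx = ix - (a * mx - b * my)
    ty = iy - (b * mx + a * my)
    return a, b, tx, ty


def residual(params, model_pts, image_pts):
    """Sum of squared errors of the fitted transform over all matches."""
    a, b, tx, ty = params
    return sum((a * x - b * y + tx - u) ** 2 + (b * x + a * y + ty - v) ** 2
               for (x, y), (u, v) in zip(model_pts, image_pts))


# Hypothetical matches: model points rotated 90 degrees, scaled by 2,
# then translated by (3, -1).
model_pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
image_pts = [(3.0, -1.0), (3.0, 1.0), (1.0, -1.0), (1.0, 1.0)]
params = fit_similarity(model_pts, image_pts)
error = residual(params, model_pts, image_pts)
```

A candidate match would be accepted only when the residual is low, which is the sense in which the abstract speaks of a low residual solution.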

16,989 citations

"Distinctive Image Features from Sca..." refers background or methods in this paper

...The initial implementation of this approach (Lowe, 1999) simply located keypoints at the location and scale of the central sample point....
[...]
...Earlier work by the author (Lowe, 1999) extended the local feature approach to achieve scale invariance....
[...]
...More details on applications of these features to recognition are available in other papers (Lowe, 1999; Lowe, 2001; Se, Lowe and Little, 2002)....
[...]
...To efficiently detect stable keypoint locations in scale space, we have proposed (Lowe, 1999) using scale-space extrema in the difference-of-Gaussian function convolved with the image, D(x, y, σ), which can be computed from the difference of two nearby scales separated by a constant multiplicative…...
[...]
...More details on applications of these features to recognition are available in other papers (Lowe, 1999, 2001; Se et al., 2002)....
[...]
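
The difference-of-Gaussian function mentioned in the excerpts above can be sketched in a few lines. The code below blurs an image at two nearby scales and subtracts the results. The scale ratio is passed in as `k`, a name assumed here for the constant multiplicative factor; the clamped border handling, the kernel radius of three standard deviations, and all function names are likewise illustrative assumptions, not details taken from the paper.

```python
import math


def gaussian_kernel(sigma):
    """Normalized 1D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, math.ceil(3 * sigma))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(image, kernel):
    """Convolve every row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    width = len(image[0])
    out = []
    for row in image:
        new_row = []
        for x in range(width):
            acc = 0.0
            for j, w in enumerate(kernel):
                xi = min(max(x + j - radius, 0), width - 1)
                acc += w * row[xi]
            new_row.append(acc)
        out.append(new_row)
    return out


def transpose(image):
    return [list(col) for col in zip(*image)]


def gaussian_blur(image, sigma):
    """Separable 2D Gaussian blur: rows first, then columns."""
    kernel = gaussian_kernel(sigma)
    return transpose(blur_rows(transpose(blur_rows(image, kernel)), kernel))


def difference_of_gaussian(image, sigma, k):
    """D(x, y, sigma): blur at scale k*sigma minus blur at scale sigma."""
    fine = gaussian_blur(image, sigma)
    coarse = gaussian_blur(image, k * sigma)
    return [[c - f for c, f in zip(row_c, row_f)]
            for row_c, row_f in zip(coarse, fine)]


# Hypothetical 9x9 test image with a single bright pixel in the center.
image = [[0.0] * 9 for _ in range(9)]
image[4][4] = 1.0
dog = difference_of_gaussian(image, sigma=1.0, k=math.sqrt(2))
```

Keypoint candidates would then be sought as local extrema of such responses across both position and scale, which this sketch does not implement.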

Book•

Multiple view geometry in computer vision

Richard Hartley¹, Andrew Zisserman²•Institutions (2)

Australian National University¹, University of Oxford²

01 Jan 2000

TL;DR: The authors provide comprehensive background material and explain how to apply the methods and implement the algorithms directly in a unified framework, covering the geometric principles and how to represent objects algebraically so they can be computed and applied.

Abstract: From the Publisher: A basic problem in computer vision is to understand the structure of a real world scene given several images of it. Recent major developments in the theory and practice of scene reconstruction are described in detail in a unified framework. The book covers the geometric principles and how to represent objects algebraically so they can be computed and applied. The authors provide comprehensive background material and explain how to apply the methods and implement the algorithms directly.

15,558 citations

Multiple View Geometry in Computer Vision.

Bernhard P. Wrobel

01 Jan 2001

14,282 citations

"Distinctive Image Features from Sca..." refers background in this paper

...A more general solution would be to solve for the fundamental matrix (Luong and Faugeras, 1996; Hartley and Zisserman, 2000)....
[...]

Proceedings Article•DOI•

A Combined Corner and Edge Detector

Chris Harris, Mike Stephens

01 Jan 1988

TL;DR: The problem the authors are addressing in Alvey Project MMI149 is that of using computer vision to understand the unconstrained 3D world, in which the viewed scenes will in general contain too wide a diversity of objects for topdown recognition techniques to work.

Abstract: The problem we are addressing in Alvey Project MMI149 is that of using computer vision to understand the unconstrained 3D world, in which the viewed scenes will in general contain too wide a diversity of objects for topdown recognition techniques to work. For example, we desire to obtain an understanding of natural scenes, containing roads, buildings, trees, bushes, etc., as typified by the two frames from a sequence illustrated in Figure 1. The solution to this problem that we are pursuing is to use a computer vision system based upon motion analysis of a monocular image sequence from a mobile camera. By extraction and tracking of image features, representations of the 3D analogues of these features can be constructed.

13,993 citations

Journal Article•DOI•

Robust wide-baseline stereo from maximally stable extremal regions

Jiri Matas¹, Ondrej Chum, Martin Urban, Tomas Pajdla•Institutions (1)

University of Surrey¹

01 Sep 2004-Image and Vision Computing

TL;DR: The high utility of MSERs, multiple measurement regions and the robust metric is demonstrated in wide-baseline experiments on image pairs from both indoor and outdoor scenes.

3,422 citations