Proceedings ArticleDOI

Object recognition from local scale-invariant features

20 Sep 1999 · Vol. 2, pp. 1150–1157
TL;DR: Experimental results show that robust object recognition can be achieved in cluttered, partially occluded images with a computation time of under 2 seconds.
Abstract: An object recognition system has been developed that uses a new class of local image features. The features are invariant to image scaling, translation, and rotation, and partially invariant to illumination changes and affine or 3D projection. These features share similar properties with neurons in inferior temporal cortex that are used for object recognition in primate vision. Features are efficiently detected through a staged filtering approach that identifies stable points in scale space. Image keys are created that allow for local geometric deformations by representing blurred image gradients in multiple orientation planes and at multiple scales. The keys are used as input to a nearest neighbor indexing method that identifies candidate object matches. Final verification of each match is achieved by finding a low residual least squares solution for the unknown model parameters. Experimental results show that robust object recognition can be achieved in cluttered, partially occluded images with a computation time of under 2 seconds.
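The matching-and-verification pipeline described in the abstract can be sketched in a few lines. The snippet below is an illustrative stand-in, not the paper's implementation: brute-force nearest-neighbor search with a distance-ratio check replaces the paper's indexing scheme, and a least-squares affine fit stands in for the final verification stage; all function names are hypothetical.

```python
import numpy as np

def match_keys(desc_img, desc_model, ratio=0.8):
    """Nearest-neighbor matching with a distance-ratio check:
    accept a match only if the best model descriptor is clearly
    closer than the second best. Returns (img_idx, model_idx) pairs."""
    matches = []
    for i, d in enumerate(desc_img):
        dists = np.linalg.norm(desc_model - d, axis=1)
        order = np.argsort(dists)
        best, second = order[0], order[1]
        if dists[best] < ratio * dists[second]:
            matches.append((i, int(best)))
    return matches

def fit_affine(src, dst):
    """Least-squares 2D affine transform mapping src points to dst
    points; a low residual indicates a verified match."""
    A = np.hstack([src, np.ones((len(src), 1))])      # (n, 3)
    params, *_ = np.linalg.lstsq(A, dst, rcond=None)  # (3, 2)
    residual = np.linalg.norm(A @ params - dst)
    return params, residual
```

A model whose matched keypoints fit a low-residual transform is accepted; candidates whose residual stays high are rejected as accidental matches.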


Citations
Journal ArticleDOI
TL;DR: The authors propose an automatic detection and classification method for sewer defects, built on a two-level hierarchical deep convolutional neural network, which achieves high classification accuracy.
Abstract: Video and image sources are frequently applied to defect inspection in industry. For the recognition and classification of sewer defects, a significant number of sewer videos and images are collected. These data are then checked by humans and by traditional methods to recognize and classify the defects, which is inefficient and error-prone. Previously developed features such as SIFT are unable to comprehensively represent such defects, so feature representation is especially important for automatic defect classification. In this paper, we study the automatic extraction of feature representations for sewer defects via deep learning. Moreover, a complete automatic system for classifying sewer defects is proposed, built on a two-level hierarchical deep convolutional neural network, which shows high classification accuracy. The proposed network is trained on a novel data set of over 40,000 sewer images. The system has been successfully applied in practical production, confirming its robustness and feasibility for real-world applications. The source code and trained model are available at the project website. 1

Note to Practitioners: Automatic defect inspection has become a fundamental research topic in engineering applications. Specifically, sewer defect detection is an important measure for the maintenance, renewal, and rehabilitation of sewer infrastructure. In the current operating procedure, all captured videos must be inspected by experts frame by frame to recognize defects, yielding a very low inspection rate at a significant cost in time. Previous work has attempted to employ traditional image-processing methods for automated sewer defect classification; however, these methods generalize poorly because they rely on pre-engineered features. In most cases, sewerage inspection companies have to hire numerous professional inspectors for this job, consuming substantial human and material resources. To address this problem, the authors propose an automatic detection and classification method for sewer defects based on hierarchical deep learning. As demonstrated by various experiments, the designed framework achieves high defect classification accuracy and can easily be integrated into an automatic sewer defect inspection system. 1 https://github.com/NUAAXQ/SewerDefectDetection
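The two-level decision logic described above can be sketched independently of any particular network: a first stage separates defective from normal frames, and a second stage runs only on flagged frames. In this toy sketch the two levels are threshold stubs standing in for the paper's two trained CNNs; all names and thresholds are hypothetical.

```python
import numpy as np

def hierarchical_classify(image, level1, level2):
    """Two-level hierarchical classification: level1 decides
    defect vs. normal; level2 assigns a defect type only when
    level1 flags a defect."""
    if level1(image) == "normal":
        return "normal"
    return level2(image)

# Hypothetical threshold stubs standing in for the two CNNs:
def level1_stub(img):
    return "defect" if img.mean() > 0.5 else "normal"

def level2_stub(img):
    return "crack" if img.std() > 0.2 else "deposit"
```

The hierarchy is what yields the efficiency gain in inspection pipelines: the (cheap) first stage discards the large majority of normal frames before the finer-grained classifier runs.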

104 citations

Journal ArticleDOI
TL;DR: The results reinforce the remarkable diversity of the TcR repertoire, resulting in many diverse private TcRs contributing to the T-cell response even in genetically identical mice responding to the same antigen.
Abstract: Motivation: The clonal theory of adaptive immunity proposes that immunological responses are encoded by increases in the frequency of lymphocytes carrying antigen-specific receptors. In this study, we measure the frequency of different T-cell receptors (TcR) in CD4+ T-cell populations of mice immunized with a complex antigen, killed Mycobacterium tuberculosis, using high-throughput parallel sequencing of the TcRβ chain. Our initial hypothesis that immunization would induce repertoire convergence proved to be incorrect, and therefore an alternative approach was developed that allows accurate stratification of TcR repertoires and provides novel insights into the nature of CD4+ T-cell receptor recognition. Results: To track the changes induced by immunization within this heterogeneous repertoire, the sequence data were classified by counting the frequency of different clusters of short (3 or 4) continuous stretches of amino acids within the antigen-binding complementarity-determining region 3 (CDR3) repertoire of different mice. Both unsupervised (hierarchical clustering) and supervised (support vector machine) analyses of these different distributions of sequence clusters differentiated between immunized and unimmunized mice with 100% efficiency. The CD4+ TcR repertoires of mice 5 and 14 days postimmunization were clearly different from those of unimmunized mice but were not distinguishable from each other. However, the repertoires of mice 60 days postimmunization were distinct from both naive mice and the day-5/14 animals. Our results reinforce the remarkable diversity of the TcR repertoire, with many diverse private TcRs contributing to the T-cell response even in genetically identical mice responding to the same antigen. However, specific motifs defined by short stretches of amino acids within the CDR3 region may determine TcR specificity and define a new approach to TcR sequence classification.
Availability and implementation: The analysis was implemented in R and Python; source code can be found in the Supplementary Data. Contact: b.chain@ucl.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online.
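The feature construction described above — counting short continuous amino-acid stretches within CDR3 sequences — is a bag-of-words over k-mers. A minimal sketch (hypothetical function names; the supervised SVM step of the original analysis is omitted):

```python
from collections import Counter

def kmer_counts(seq, k=3):
    """Count overlapping k-mers (short continuous stretches of
    amino acids) in a single CDR3 sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def repertoire_vector(seqs, vocab, k=3):
    """Bag-of-words frequency vector for a whole repertoire over a
    fixed k-mer vocabulary, suitable as input to a classifier
    (e.g. hierarchical clustering or an SVM)."""
    total = Counter()
    for s in seqs:
        total.update(kmer_counts(s, k))
    n = sum(total.values()) or 1
    return [total[w] / n for w in vocab]
```

Vectors built this way put repertoires from different mice into a common feature space, which is what makes the clustering and SVM comparisons in the abstract possible.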

104 citations


Cites background or methods from "Object recognition from local scale..."

  • ...These can be individual words of text, image features or any other simple descriptive features [see e.g. (Csurka et al., 2004; Joachims, 1998; Lowe, 1999)]....

  • ...In this study, we develop an approach based on the well-studied bag-of-words (BOW) (Csurka et al., 2004; Joachims, 1998; Lowe, 1999) algorithm to categorize and classify sets of TcR sequences from immunized and unimmunized mice at different times postimmunization....

01 Sep 2006
TL;DR: The research begins by rigorously describing the imaging and navigation problem and developing practical models of the sensors, then presenting a transformation technique to detect features within an image, which utilizes inertial measurements to predict vectors in the feature space between images.
Abstract: The motivation of this research is to address the limitations of satellite-based navigation by fusing imaging and inertial systems. The research begins by rigorously describing the imaging and navigation problem and developing practical models of the sensors, then presents a transformation technique to detect features within an image. Given a set of features, a statistical feature projection technique is developed that utilizes inertial measurements to predict vectors in the feature space between images. This deep coupling of the imaging and inertial sensors is then used to aid the statistical feature matching function. The feature matches and inertial measurements are then used to estimate the navigation trajectory using an extended Kalman filter. After a proper calibration, the image-aided inertial navigation algorithm is tested using a combination of simulation and ground tests with both tactical- and consumer-grade inertial sensors. While limitations of the Kalman filter are identified, the experimental results demonstrate a navigation performance improvement of at least two orders of magnitude over the respective inertial-only solutions.
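The estimator in this work is an extended Kalman filter fusing inertial predictions with image feature matches. A generic EKF measurement-update step is sketched below as an illustration only; the thesis's actual state vector, measurement model, and imaging/inertial coupling are specific to that work.

```python
import numpy as np

def ekf_update(x, P, z, h, H, R):
    """One extended-Kalman-filter measurement update.
    x: state estimate, P: state covariance, z: measurement,
    h: measurement function, H: its Jacobian evaluated at x,
    R: measurement noise covariance."""
    y = z - h(x)                      # innovation
    S = H @ P @ H.T + R               # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)    # Kalman gain
    x_new = x + K @ y                 # corrected state
    P_new = (np.eye(len(x)) - K @ H) @ P
    return x_new, P_new
```

In an image-aided configuration, z would be derived from matched image features while the prediction step (not shown) propagates the state with inertial measurements.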

103 citations


Cites methods from "Object recognition from local scale..."

  • ...An example is the scale-invariant feature tracker (SIFT) method developed by Lowe [33]....

  • ...The method presented by Lowe [33] builds a histogram of gradient orientations around the feature, then selects a primary gradient orientation which corresponds to the maximum histogram bin....

  • ...For this work, this is accomplished using a variant of the scale-invariant feature tracking (SIFT) algorithm developed by Lowe [33]....

  • ...For more information of the SIFT feature transformation algorithm, see [33], [34] and [27]....

Proceedings ArticleDOI
01 Jun 2016
TL;DR: A novel, publicly available dataset acquired during actual driving contains drivers' gaze fixations and their temporal integration, providing task-specific saliency maps; it can foster new discussions on better understanding, exploiting, and reproducing the driver's attention process in the autonomous and assisted cars of future generations.
Abstract: Autonomous and assisted driving are undoubtedly hot topics in computer vision. However, the driving task is extremely complex and a deep understanding of drivers' behavior is still lacking. Several researchers are now investigating the attention mechanism in order to define computational models for detecting salient and interesting objects in the scene. Nevertheless, most of these models refer only to bottom-up visual saliency and are focused on still images. During actual driving, however, the temporal nature and peculiarity of the task influence the attention mechanisms, leading to the conclusion that real-life driving data are essential. In this paper we propose a novel and publicly available dataset acquired during actual driving. Our dataset, composed of more than 500,000 frames, contains drivers' gaze fixations and their temporal integration, providing task-specific saliency maps. Geo-referenced locations, driving speed, and course complete the set of released data. To the best of our knowledge, this is the first publicly available dataset of its kind, and it can foster new discussions on better understanding, exploiting, and reproducing the driver's attention process in the autonomous and assisted cars of future generations.

103 citations


Cites methods from "Object recognition from local scale..."

  • ...To estimate this transformation, Scale Invariant Feature Transform (SIFT) keypoints are extracted from the two frames [18] and a first, tentative nearest-neighbor matching is performed....

Journal ArticleDOI
TL;DR: A new method for rapid 3D object indexing is proposed that combines feature-based methods with coarse alignment-based matching techniques, achieving sublinear complexity in the number of models while maintaining a high degree of performance for real 3D sensed data acquired in largely uncontrolled settings.
Abstract: We propose a new method for rapid 3D object indexing that combines feature-based methods with coarse alignment-based matching techniques. Our approach achieves sublinear complexity in the number of models, maintaining at the same time a high degree of performance for real 3D sensed data acquired in largely uncontrolled settings. The key component of our method is to first index surface descriptors, computed at salient locations in the scene, into the whole model database using locality-sensitive hashing (LSH), a probabilistic approximate nearest-neighbor method. Progressively complex geometric constraints are subsequently enforced to further prune the initial candidates and eliminate false correspondences due to inaccuracies in the surface descriptors and errors of the LSH algorithm. The indexed models are selected based on the MAP rule using the posterior probability of the models estimated in the joint 3D-signature space. Experiments with real 3D data, employing a large database of vehicles (most of them very similar in shape) containing 1,000,000 features from more than 365 models, demonstrate a high degree of performance in the presence of occlusion and obscuration, unmodeled vehicle interiors, and part articulations, with an average processing time between 50 and 100 seconds per query.
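The first stage — hashing surface descriptors so that a query touches only a small candidate set rather than the whole model database — can be illustrated with a single random-hyperplane LSH table. This is a generic sketch of locality-sensitive hashing, not the paper's exact scheme; function names are hypothetical.

```python
import numpy as np

def lsh_tables(vectors, n_planes=8, seed=0):
    """Index descriptor vectors into one random-hyperplane LSH table.
    The hash key is the pattern of signs of projections onto random
    hyperplanes, so nearby vectors tend to share a bucket."""
    rng = np.random.default_rng(seed)
    planes = rng.standard_normal((n_planes, vectors.shape[1]))
    table = {}
    for i, v in enumerate(vectors):
        key = tuple((planes @ v > 0).astype(int))
        table.setdefault(key, []).append(i)
    return planes, table

def lsh_query(q, planes, table):
    """Return indices of stored vectors sharing the query's bucket."""
    key = tuple((planes @ q > 0).astype(int))
    return table.get(key, [])
```

In practice several tables are used to raise recall; the geometric-constraint pruning the abstract describes would then run only on the returned candidates.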

103 citations


Cites methods from "Object recognition from local scale..."

  • ...Since, in our case, the number of possible pose candidates can be O(Q²) for each pair of scene features, we employ importance sampling of matches based on a similarity measure in feature space and we progressively enforce geometric constraints between the descriptors' base points in order to retain…

References
Journal ArticleDOI
TL;DR: In this paper, color histograms of multicolored objects provide a robust, efficient cue for indexing into a large database of models, and they can differentiate among a large number of objects.
Abstract: Computer vision is moving into a new era in which the aim is to develop visual skills for robots that allow them to interact with a dynamic, unconstrained environment. To achieve this aim, new kinds of vision algorithms need to be developed which run in real time and subserve the robot's goals. Two fundamental goals are determining the identity of an object with a known location, and determining the location of a known object. Color can be successfully used for both tasks. This dissertation demonstrates that color histograms of multicolored objects provide a robust, efficient cue for indexing into a large database of models. It shows that color histograms are stable object representations in the presence of occlusion and over change in view, and that they can differentiate among a large number of objects. For solving the identification problem, it introduces a technique called Histogram Intersection, which matches model and image histograms and a fast incremental version of Histogram Intersection which allows real-time indexing into a large database of stored models. It demonstrates techniques for dealing with crowded scenes and with models with similar color signatures. For solving the location problem it introduces an algorithm called Histogram Backprojection which performs this task efficiently in crowded scenes.
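Histogram Intersection as described above reduces to a sum of bin-wise minima between the model and image histograms. A minimal sketch (the normalization by the model's total count is one common convention, not necessarily the dissertation's exact one):

```python
import numpy as np

def histogram_intersection(model, image):
    """Histogram intersection: sum over bins of min(model, image),
    normalized here by the model histogram's total count. Scores
    near 1 indicate a strong color match."""
    model = np.asarray(model, dtype=float)
    image = np.asarray(image, dtype=float)
    return np.minimum(model, image).sum() / model.sum()
```

Because the minimum ignores image pixels outside the model's colors, the score degrades gracefully under occlusion and clutter, which is the property the abstract emphasizes.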

5,672 citations

Journal ArticleDOI
TL;DR: It is shown how the boundaries of an arbitrary non-analytic shape can be used to construct a mapping between image space and Hough transform space, which makes the generalized Hough transform a kind of universal transform which can be used to find arbitrarily complex shapes.
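The boundary-to-Hough-space mapping in this TL;DR is the R-table construction: for each quantized gradient angle, store the displacement from the boundary point to a chosen reference point; edge points in a new image then vote for the reference location. A toy sketch (hypothetical names, crude angle quantization, translation only):

```python
import numpy as np
from collections import defaultdict

def build_r_table(boundary_pts, gradient_angles, reference):
    """R-table: map each quantized gradient angle to the displacement
    vectors from boundary points to the shape's reference point."""
    table = defaultdict(list)
    for p, theta in zip(boundary_pts, gradient_angles):
        table[round(theta, 1)].append(reference - np.asarray(p))
    return table

def vote(edge_pts, edge_angles, table, shape):
    """Accumulate votes for the reference point; a peak in the
    accumulator marks a detected instance of the shape."""
    acc = np.zeros(shape, dtype=int)
    for p, theta in zip(edge_pts, edge_angles):
        for d in table.get(round(theta, 1), []):
            r = np.asarray(p) + d
            if 0 <= r[0] < shape[0] and 0 <= r[1] < shape[1]:
                acc[int(r[0]), int(r[1])] += 1
    return acc
```

Extending the accumulator with scale and rotation dimensions yields the full generalized transform.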

4,310 citations

Journal ArticleDOI
TL;DR: A near real-time recognition system with 20 complex objects in the database has been developed and a compact representation of object appearance is proposed that is parametrized by pose and illumination.
Abstract: The problem of automatically learning object models for recognition and pose estimation is addressed. In contrast to the traditional approach, the recognition problem is formulated as one of matching appearance rather than shape. The appearance of an object in a two-dimensional image depends on its shape, reflectance properties, pose in the scene, and the illumination conditions. While shape and reflectance are intrinsic properties and constant for a rigid object, pose and illumination vary from scene to scene. A compact representation of object appearance is proposed that is parametrized by pose and illumination. For each object of interest, a large set of images is obtained by automatically varying pose and illumination. This image set is compressed to obtain a low-dimensional subspace, called the eigenspace, in which the object is represented as a manifold. Given an unknown input image, the recognition system projects the image to eigenspace. The object is recognized based on the manifold it lies on. The exact position of the projection on the manifold determines the object's pose in the image. A variety of experiments are conducted using objects with complex appearance characteristics. The performance of the recognition and pose estimation algorithms is studied using over a thousand input images of sample objects. Sensitivity of recognition to the number of eigenspace dimensions and the number of learning samples is analyzed. For the objects used, appearance representation in eigenspaces with less than 20 dimensions produces accurate recognition results with an average pose estimation error of about 1.0 degree. A near real-time recognition system with 20 complex objects in the database has been developed. The paper is concluded with a discussion on various issues related to the proposed learning and recognition methodology.
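The eigenspace construction and projection described above can be sketched with an SVD. In this sketch, recognition falls back to the nearest projected training sample rather than a true parametric appearance manifold, and all names are hypothetical.

```python
import numpy as np

def build_eigenspace(images, k):
    """Compress a set of training image vectors into a k-dimensional
    eigenspace via PCA (SVD of the mean-centered data matrix)."""
    X = np.asarray(images, dtype=float)
    mean = X.mean(axis=0)
    U, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]          # rows of Vt[:k] span the eigenspace

def project(img, mean, basis):
    """Project a (flattened) image into the eigenspace."""
    return basis @ (np.asarray(img, dtype=float) - mean)

def recognize(img, mean, basis, gallery_coords, labels):
    """Classify by the nearest projected training sample — a
    point-wise stand-in for locating the projection on the
    pose/illumination manifold."""
    q = project(img, mean, basis)
    d = np.linalg.norm(gallery_coords - q, axis=1)
    return labels[int(np.argmin(d))]
```

In the paper's formulation, the gallery points for each object are interpolated into a continuous manifold, so the position of the nearest manifold point also yields the pose estimate.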

2,037 citations

Journal ArticleDOI
TL;DR: This paper addresses the problem of retrieving images from large image databases with a method based on local grayvalue invariants which are computed at automatically detected interest points and allows for efficient retrieval from a database of more than 1,000 images.
Abstract: This paper addresses the problem of retrieving images from large image databases. The method is based on local grayvalue invariants which are computed at automatically detected interest points. A voting algorithm and semilocal constraints make retrieval possible. Indexing allows for efficient retrieval from a database of more than 1,000 images. Experimental results show correct retrieval in the case of partial visibility, similarity transformations, extraneous features, and small perspective deformations.

1,756 citations


"Object recognition from local scale..." refers background or methods in this paper

  • ...This allows for the use of more distinctive image descriptors than the rotation-invariant ones used by Schmid and Mohr, and the descriptor is further modified to improve its stability to changes in affine projection and illumination....

  • ...For the object recognition problem, Schmid & Mohr [19] also used the Harris corner detector to identify interest points, and then created a local image descriptor at each interest point from an orientation-invariant vector of derivative-of-Gaussian image measurements....

  • ...However, recent research on the use of dense local features (e.g., Schmid & Mohr [19]) has shown that efficient recognition can often be achieved by using local image descriptors sampled at a large number of repeatable locations....

Journal ArticleDOI
TL;DR: A robust approach to image matching by exploiting the only available geometric constraint, namely, the epipolar constraint, is proposed and a new strategy for updating matches is developed, which only selects those matches having both high matching support and low matching ambiguity.

1,574 citations


"Object recognition from local scale..." refers methods in this paper

  • ...[23] used the Harris corner detector to identify feature locations for epipolar alignment of images taken from differing viewpoints....
