
Distinctive Image Features from Scale-Invariant Keypoints

01 Jan 2011
TL;DR: The Scale-Invariant Feature Transform (or SIFT) algorithm is a highly robust method to extract and subsequently match distinctive invariant features from images, which can then be used to reliably match objects in differing images.
Abstract: The Scale-Invariant Feature Transform (or SIFT) algorithm is a highly robust method to extract and subsequently match distinctive invariant features from images. These features can then be used to reliably match objects in differing images. The algorithm was first proposed by Lowe [12] and further developed to improve performance, resulting in the classic paper [13] that serves as the foundation for SIFT, which has played an important role in robotic and machine vision over the past decade.
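The first stage of SIFT, detecting candidate keypoints as extrema of a difference-of-Gaussians (DoG) scale space, can be sketched as follows. This is an illustrative NumPy/SciPy sketch (function and parameter names are ours, not Lowe's); it omits the subpixel refinement, edge-response rejection, orientation assignment, and descriptor computation of the full algorithm.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, maximum_filter, minimum_filter

def dog_keypoints(image, sigma0=1.6, k=2 ** 0.5, n_scales=4, thresh=0.01):
    """Candidate keypoints: extrema of a difference-of-Gaussians stack."""
    blurred = [gaussian_filter(image.astype(float), sigma0 * k ** i)
               for i in range(n_scales + 1)]
    dog = np.stack([blurred[i + 1] - blurred[i] for i in range(n_scales)])
    # Extremum test: a point equals the max (or min) of its 3x3x3
    # neighbourhood across space and scale, above a contrast threshold.
    is_ext = ((dog == maximum_filter(dog, size=3)) |
              (dog == minimum_filter(dog, size=3))) & (np.abs(dog) > thresh)
    s, y, x = np.nonzero(is_ext)
    return list(zip(x.tolist(), y.tolist(), s.tolist()))

# Synthetic check: a single bright blob yields an extremum at its centre.
yy, xx = np.mgrid[0:64, 0:64]
img = np.exp(-((xx - 32.0) ** 2 + (yy - 32.0) ** 2) / (2 * 4.0 ** 2))
kps = dog_keypoints(img)
```

Each returned triple is (x, y, scale index); the blob's characteristic scale shows up as the scale level at which the DoG response peaks.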
Citations
Proceedings ArticleDOI
22 Mar 2018
TL;DR: This paper describes a learning approach based on training convolutional neural networks (CNNs) for a traffic sign classification system and presents preliminary results of applying this CNN to learn features from and classify RGB-D images.
Abstract: This paper describes a learning approach based on training convolutional neural networks (CNNs) for a traffic sign classification system. In addition, it presents preliminary results of applying this CNN to the task of learning features from and classifying RGB-D images. To determine an appropriate architecture, we explore the transfer learning technique known as “fine-tuning”: reusing layers trained on the ImageNet dataset to solve a four-class classification task on a new set of data.
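The fine-tuning idea, keeping the early pretrained layers fixed and training only a new classification head for the four-class task, can be illustrated with a toy stand-in. Everything below (the network, sizes, and synthetic data) is invented for illustration in pure NumPy; it is not the paper's CNN or dataset.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a pretrained network: a fixed "feature extractor"
# (frozen, like reused ImageNet layers) plus a trainable head for a
# new four-class task.
W_frozen = rng.normal(size=(8, 16))   # pretrained weights, never updated
W_head = np.zeros((16, 4))            # new head, trained from scratch

def forward(x):
    h = np.maximum(0.0, x @ W_frozen)            # frozen ReLU features
    logits = h @ W_head
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return h, e / e.sum(axis=1, keepdims=True)   # softmax probabilities

# Tiny synthetic four-class dataset, made separable by per-class shifts.
X = rng.normal(size=(200, 8))
y = rng.integers(0, 4, size=200)
X += np.eye(4)[y] @ rng.normal(size=(4, 8)) * 3

onehot = np.eye(4)[y]
for _ in range(300):                  # gradient descent on the head only
    h, p = forward(X)
    W_head -= 0.1 * h.T @ (p - onehot) / len(X)

acc = (forward(X)[1].argmax(axis=1) == y).mean()
```

Only `W_head` receives gradient updates; the frozen layer corresponds to the reused ImageNet-trained layers in the paper's fine-tuning setup.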

143 citations


Cites background from "Distinctive Image Features from Sca..."

  • ...The first step is to fix the architecture by fixing the number of chosen layers, their sizes and matrix operations that connect them, [23]....


Proceedings ArticleDOI
01 Jan 2010
TL;DR: This work looks at the application of a recent extension of the seminal SIFT approach to the 3D volumetric recognition of rigid objects within a complex volumetric environment that includes significant noise artefacts.
Abstract: The automatic detection of objects within complex volumetric imagery is becoming of increased interest due to the use of dual energy Computed Tomography (CT) scanners as an aviation security deterrent. These devices produce a volumetric image akin to that encountered in prior medical CT work but in this case we are dealing with a complex multi-object volumetric environment including significant noise artefacts. In this work we look at the application of the recent extension to the seminal SIFT approach to the 3D volumetric recognition of rigid objects within this complex volumetric environment. A detailed overview of the approach and results when applied to a set of exemplar CT volumetric imagery is presented.

143 citations

Journal ArticleDOI
TL;DR: This paper presents a theoretical analysis of the scale selection properties of a generalized framework for detecting interest points from scale-space features and shows that the scale estimates obtained from the determinant of the Hessian operator are affine covariant for an anisotropic Gaussian blob model.
Abstract: Scale-invariant interest points have found several highly successful applications in computer vision, in particular for image-based matching and recognition. This paper presents a theoretical analysis ...

143 citations


Cites background or methods from "Distinctive Image Features from Sca..."

  • ...Specifically, highly successful applications can be found in image-based recognition (Lowe [48]; Bay et al. [2])....

  • ...Since the commonly used difference-of-Gaussians operator can be seen as a discrete approximation of the Laplacian operator [41], the analysis of the scale selection properties for the Laplacian operator also provides a theoretical model for analyzing the scale selection properties of the difference-of-Gaussian keypoint detector used in the SIFT operator [48]....

  • ...Computing local image descriptors at integration scales proportional to the detection scales of scale-invariant image features, moreover makes it possible to compute scale-invariant image descriptors (Lindeberg [35]; Bretzner and Lindeberg [4]; Mikolajczyk and Schmid [49]; Lowe [48]; Bay et al. [2]; Lindeberg [38, 43])....
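The claim quoted above, that the difference-of-Gaussians approximates the scale-normalized Laplacian via G(kσ) − G(σ) ≈ (k − 1)σ²∇²G, can be checked numerically. This is a SciPy sketch under a test setup of our own choosing (random smoothed image, σ = 3, k = √2):

```python
import numpy as np
from scipy.ndimage import gaussian_filter, gaussian_laplace

rng = np.random.default_rng(1)
img = gaussian_filter(rng.normal(size=(128, 128)), 2.0)  # smooth test image

sigma, k = 3.0, 2 ** 0.5
# Difference-of-Gaussians response at scale sigma.
dog = gaussian_filter(img, k * sigma) - gaussian_filter(img, sigma)
# Scaled Laplacian-of-Gaussian response it approximates.
log = (k - 1) * sigma ** 2 * gaussian_laplace(img, sigma)

# The two responses should be strongly correlated across the image.
corr = float(np.corrcoef(dog.ravel(), log.ravel())[0, 1])
```

The first-order approximation is exact only as k → 1, so with k = √2 there is a residual amplitude mismatch, but the spatial structure of the two responses agrees closely.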

Journal ArticleDOI
TL;DR: Recent 3D shape analysis frameworks that quantify the deformation of one shape into another in terms of the variation of real functions yield a new interpretation of 3D shape similarity assessment; the most promising directions for future research are also discussed.
Abstract: The recent introduction of 3D shape analysis frameworks able to quantify the deformation of a shape into another in terms of the variation of real functions yields a new interpretation of the 3D shape similarity assessment and opens new perspectives. Indeed, while the classical approaches to similarity mainly quantify it as a numerical score, map-based methods also define dense shape correspondences. After presenting in detail the theoretical foundations underlying these approaches, we classify them by looking at their most salient features, including the kind of structure and invariance properties they capture, as well as the distances and the output modalities according to which the similarity between shapes is assessed and returned. We also review the usage of these methods in a number of 3D shape application domains, ranging from matching and retrieval to annotation and segmentation. Finally, the most promising directions for future research developments are discussed.

143 citations

Book ChapterDOI
Shengcai Liao, Dong Yi, Zhen Lei, Rui Qin, Stan Z. Li
04 Jun 2009
TL;DR: MB-LBP, an extension of the LBP operator, is applied to encode the local image structures in the transformed domain, and the most discriminant local features are then learned for recognition in heterogeneous face images.
Abstract: Heterogeneous face images come from different lighting conditions or different imaging devices, such as visible light (VIS) and near infrared (NIR) based. Because heterogeneous face images can have different skin spectra-optical properties, direct appearance based matching is no longer appropriate for solving the problem. Hence we need to find facial features common in heterogeneous images. For this, first we use Difference-of-Gaussian filtering to obtain a normalized appearance for all heterogeneous faces. We then apply MB-LBP, an extension of LBP operator, to encode the local image structures in the transformed domain, and further learn the most discriminant local features for recognition. Experiments show that the proposed method significantly outperforms existing ones in matching between VIS and NIR face images.
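The LBP encoding used above thresholds each pixel's neighbours against its centre value; MB-LBP applies the same comparison to average intensities over blocks rather than single pixels. A minimal NumPy sketch of the basic 8-neighbour operator (our own illustrative code, not the authors' implementation):

```python
import numpy as np

def lbp_3x3(img):
    """Basic 8-neighbour local binary pattern: each pixel's code is the
    byte formed by thresholding its neighbours against the centre value."""
    c = img[1:-1, 1:-1]
    # Neighbour offsets, clockwise from top-left; one bit per neighbour.
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
            (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offs):
        nb = img[1 + dy : img.shape[0] - 1 + dy,
                 1 + dx : img.shape[1] - 1 + dx]
        code |= (nb >= c).astype(np.uint8) << bit
    return code
```

For MB-LBP one would first average the image over s×s blocks (e.g. with `scipy.ndimage.uniform_filter`) and then run the same centre-versus-neighbour comparison on the block means.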

143 citations

References
Journal ArticleDOI
TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.
Abstract: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene. The features are invariant to image scale and rotation, and are shown to provide robust matching across a substantial range of affine distortion, change in 3D viewpoint, addition of noise, and change in illumination. The features are highly distinctive, in the sense that a single feature can be correctly matched with high probability against a large database of features from many images. This paper also describes an approach to using these features for object recognition. The recognition proceeds by matching individual features to a database of features from known objects using a fast nearest-neighbor algorithm, followed by a Hough transform to identify clusters belonging to a single object, and finally performing verification through least-squares solution for consistent pose parameters. This approach to recognition can robustly identify objects among clutter and occlusion while achieving near real-time performance.
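The nearest-neighbour matching step described above is paired in the paper with a ratio test: a match is kept only when the best candidate is clearly closer than the second best, which discards ambiguous matches against a large database. A brute-force NumPy sketch of that step (the 0.8 threshold is a typical choice; the Hough clustering and least-squares verification stages are omitted):

```python
import numpy as np

def ratio_match(desc_a, desc_b, ratio=0.8):
    """Nearest-neighbour descriptor matching with a ratio test:
    accept (i, j) only if the best distance is well below the second best."""
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)
        j1, j2 = np.argsort(dists)[:2]
        if dists[j1] < ratio * dists[j2]:
            matches.append((i, int(j1)))
    return matches

# Demo: descriptors in B are noisy copies of A followed by distractors.
rng = np.random.default_rng(2)
A = rng.normal(size=(5, 128))
B = np.vstack([A + 0.01 * rng.normal(size=A.shape),
               rng.normal(size=(20, 128))])
matches = ratio_match(A, B)
```

Each true copy is matched to its counterpart, while the random distractors fail the ratio test because their nearest and second-nearest distances are similar.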

46,906 citations

Proceedings ArticleDOI
20 Sep 1999
TL;DR: Experimental results show that robust object recognition can be achieved in cluttered partially occluded images with a computation time of under 2 seconds.
Abstract: An object recognition system has been developed that uses a new class of local image features. The features are invariant to image scaling, translation, and rotation, and partially invariant to illumination changes and affine or 3D projection. These features share similar properties with neurons in inferior temporal cortex that are used for object recognition in primate vision. Features are efficiently detected through a staged filtering approach that identifies stable points in scale space. Image keys are created that allow for local geometric deformations by representing blurred image gradients in multiple orientation planes and at multiple scales. The keys are used as input to a nearest neighbor indexing method that identifies candidate object matches. Final verification of each match is achieved by finding a low residual least squares solution for the unknown model parameters. Experimental results show that robust object recognition can be achieved in cluttered partially occluded images with a computation time of under 2 seconds.

16,989 citations

Proceedings ArticleDOI
01 Jan 1988
TL;DR: The problem the authors are addressing in Alvey Project MMI149 is that of using computer vision to understand the unconstrained 3D world, in which the viewed scenes will in general contain too wide a diversity of objects for top-down recognition techniques to work.
Abstract: The problem we are addressing in Alvey Project MMI149 is that of using computer vision to understand the unconstrained 3D world, in which the viewed scenes will in general contain too wide a diversity of objects for top-down recognition techniques to work. For example, we desire to obtain an understanding of natural scenes, containing roads, buildings, trees, bushes, etc., as typified by the two frames from a sequence illustrated in Figure 1. The solution to this problem that we are pursuing is to use a computer vision system based upon motion analysis of a monocular image sequence from a mobile camera. By extraction and tracking of image features, representations of the 3D analogues of these features can be constructed.
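The image features this reference introduces are Harris-Stephens corners. A compact NumPy/SciPy sketch of the corner response (illustrative only; the smoothing scale and kappa are typical values, not taken from the paper):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def harris_response(img, sigma=1.0, kappa=0.04):
    """Harris-Stephens corner response R = det(M) - kappa * trace(M)^2,
    where M is the Gaussian-smoothed second-moment matrix of the gradients."""
    iy, ix = np.gradient(img.astype(float))
    ixx = gaussian_filter(ix * ix, sigma)
    iyy = gaussian_filter(iy * iy, sigma)
    ixy = gaussian_filter(ix * iy, sigma)
    det = ixx * iyy - ixy ** 2
    return det - kappa * (ixx + iyy) ** 2

# A white square on black: corners respond strongly, edges are suppressed.
img = np.zeros((32, 32))
img[8:24, 8:24] = 1.0
R = harris_response(img)
```

The response is large at the square's corners, negative along its edges, and zero in flat regions, which is exactly the selectivity that makes it suitable for feature tracking.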

13,993 citations

Journal ArticleDOI
TL;DR: It is observed that the ranking of the descriptors is mostly independent of the interest region detector and that the SIFT-based descriptors perform best and Moments and steerable filters show the best performance among the low dimensional descriptors.
Abstract: In this paper, we compare the performance of descriptors computed for local interest regions, as, for example, extracted by the Harris-Affine detector [Mikolajczyk, K and Schmid, C, 2004]. Many different descriptors have been proposed in the literature. It is unclear which descriptors are more appropriate and how their performance depends on the interest region detector. The descriptors should be distinctive and at the same time robust to changes in viewing conditions as well as to errors of the detector. Our evaluation uses as criterion recall with respect to precision and is carried out for different image transformations. We compare shape context [Belongie, S, et al., April 2002], steerable filters [Freeman, W and Adelson, E, Sept. 1991], PCA-SIFT [Ke, Y and Sukthankar, R, 2004], differential invariants [Koenderink, J and van Doorn, A, 1987], spin images [Lazebnik, S, et al., 2003], SIFT [Lowe, D. G., 1999], complex filters [Schaffalitzky, F and Zisserman, A, 2002], moment invariants [Van Gool, L, et al., 1996], and cross-correlation for different types of interest regions. We also propose an extension of the SIFT descriptor and show that it outperforms the original method. Furthermore, we observe that the ranking of the descriptors is mostly independent of the interest region detector and that the SIFT-based descriptors perform best. Moments and steerable filters show the best performance among the low dimensional descriptors.
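The evaluation criterion used here, recall versus 1 − precision as a distance threshold on candidate matches is varied, can be sketched directly. The helper below is our own illustrative naming, not the authors' evaluation code:

```python
import numpy as np

def recall_precision_curve(dists, is_correct, n_correspondences):
    """Sweep a distance threshold over candidate matches and report, per
    threshold, recall (# correct matches / # ground-truth correspondences)
    against 1 - precision (# false matches / # matches)."""
    order = np.argsort(dists)                 # accept matches nearest-first
    correct = np.cumsum(is_correct[order])
    total = np.arange(1, len(dists) + 1)
    recall = correct / n_correspondences
    one_minus_precision = (total - correct) / total
    return recall, one_minus_precision

# Tiny worked example: four candidate matches, three true correspondences.
recall, one_minus_precision = recall_precision_curve(
    np.array([0.1, 0.2, 0.3, 0.4]), np.array([1, 1, 0, 1]),
    n_correspondences=3)
```

A better descriptor reaches high recall while 1 − precision stays low, i.e. its curve hugs the left side of the plot.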

7,057 citations

Journal ArticleDOI
TL;DR: The high utility of MSERs, multiple measurement regions and the robust metric is demonstrated in wide-baseline experiments on image pairs from both indoor and outdoor scenes.

3,422 citations
