
Distinctive Image Features from Scale-Invariant Keypoints

01 Jan 2011
TL;DR: The Scale-Invariant Feature Transform (or SIFT) algorithm is a highly robust method to extract and subsequently match distinctive invariant features from images, which can then be used to reliably match objects in differing images.
Abstract: The Scale-Invariant Feature Transform (or SIFT) algorithm is a highly robust method to extract and subsequently match distinctive invariant features from images. These features can then be used to reliably match objects in differing images. The algorithm was first proposed by Lowe [12] and further developed to improve performance, resulting in the classic paper [13] that serves as the foundation for SIFT, which has played an important role in robotic and machine vision over the past decade.
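
At its core, the pipeline the abstract describes reduces to detecting keypoints, computing 128-dimensional descriptors, and matching them with a nearest-neighbour ratio test. A minimal sketch using OpenCV's SIFT implementation (assuming OpenCV >= 4.4, where SIFT ships in the main module; the image paths are placeholders):

```python
import cv2

img1 = cv2.imread("scene_a.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("scene_b.png", cv2.IMREAD_GRAYSCALE)

# Detect keypoints and compute 128-dimensional SIFT descriptors.
sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Lowe's ratio test: keep a match only if the best neighbour is clearly
# closer than the second best, which suppresses ambiguous matches.
matcher = cv2.BFMatcher(cv2.NORM_L2)
good = [m for m, n in matcher.knnMatch(des1, des2, k=2)
        if m.distance < 0.75 * n.distance]
print(f"{len(good)} putative matches")
```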
Citations
Proceedings ArticleDOI
19 Oct 2009
TL;DR: A method to automatically and dynamically balance the quality of detection and tracking to adapt to a variable time budget and ensure a constant frame rate is presented.
Abstract: In this paper we present a novel method for real-time pose estimation and tracking on low-end devices such as mobile phones. The presented system can track multiple known targets in real-time and simultaneously detect new targets for tracking. We present a method to automatically and dynamically balance the quality of detection and tracking to adapt to a variable time budget and ensure a constant frame rate. Results from real data of a mobile phone Augmented Reality system demonstrate the efficiency and robustness of the described approach. The system can track 6 planar targets on a mobile phone simultaneously at frame rates of 23 fps.
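
The budget-balancing idea can be sketched as a fixed per-frame time budget in which tracking of known targets always runs and detection consumes only the time left over. This is a toy illustration with hypothetical update()/detect_step() hooks, not the paper's actual quality-adaptation policy:

```python
import time

FRAME_BUDGET = 1.0 / 25.0  # aim for roughly 25 fps

def process_frame(frame, targets, detector):
    start = time.perf_counter()
    # Tracking is cheap and mandatory: every known target is updated.
    for t in targets:
        t.update(frame)  # hypothetical tracker hook
    # Detection of new targets is interruptible: run incremental slices
    # of work until the frame budget is exhausted.
    while time.perf_counter() - start < FRAME_BUDGET:
        if detector.detect_step(frame):  # hypothetical detector hook
            break
```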

136 citations


Cites methods from "Distinctive Image Features from Sca..."

  • ...Our target detection method is based on a modified SIFT [15] implementation that replaces the slow parts of the original SIFT with simpler variants, yet keeping many of the attractive properties of the original approach....

  • ...For instance, Skrypnyk and Lowe [21] use SIFT descriptors [15] for object localization in AR....

  • ...The work in this paper builds upon our previous publication [24], where we described modified SIFT [15] and Ferns [17] approaches and created the first real-time 6DOF natural feature tracking system running on mobile phones....

Journal ArticleDOI
TL;DR: This work develops a novel hierarchical matching strategy to solve the keypoint matching problems over a massive number of keypoints and proposes a novel iterative localization technique to reduce the false alarm rate and accurately localize the tampered regions.
Abstract: Copy-move forgery is one of the most commonly used manipulations for tampering digital images. Keypoint-based detection methods have been reported to be very effective in revealing copy-move evidence due to their robustness against various attacks, such as large-scale geometric transformations. However, these methods fail to handle the cases when copy-move forgeries only involve small or smooth regions, where the number of keypoints is very limited. To tackle this challenge, we propose a fast and effective copy-move forgery detection algorithm through hierarchical feature point matching. We first show that it is possible to generate a sufficient number of keypoints that exist even in small or smooth regions by lowering the contrast threshold and rescaling the input image. We then develop a novel hierarchical matching strategy to solve the keypoint matching problems over a massive number of keypoints. To reduce the false alarm rate and accurately localize the tampered regions, we further propose a novel iterative localization technique by exploiting the robustness properties (including the dominant orientation and the scale information) and the color information of each keypoint. Extensive experimental results are provided to demonstrate the superior performance of our proposed scheme in terms of both efficiency and accuracy.
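
The keypoint-densification step can be illustrated with OpenCV's SIFT, which exposes the contrast threshold as a constructor parameter (its default is 0.04); the threshold and scale factor below are illustrative rather than the paper's exact settings:

```python
import cv2

img = cv2.imread("suspect.png", cv2.IMREAD_GRAYSCALE)

# Upscale the image so that small or smooth regions cover more pixels.
big = cv2.resize(img, None, fx=2.0, fy=2.0, interpolation=cv2.INTER_CUBIC)

# Lower the contrast threshold so weak-texture regions still yield keypoints.
sift = cv2.SIFT_create(contrastThreshold=0.001)
kp, des = sift.detectAndCompute(big, None)
print(f"{len(kp)} keypoints on the rescaled image")
```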

136 citations


Cites background or methods from "Distinctive Image Features from Sca..."

  • ...the Scale Invariant Feature Transform (SIFT) feature [23], their method was shown to be very robust against...

  • ...For more details about the SIFT, please refer to [23]....

  • ...It is a well-known fact that we cannot fully trust the result of RANSAC especially when the number of inliers is limited [23]....

  • ...As one of the most popular algorithms in computer vision to extract and describe image local features, the SIFT [23] has been shown to be excellently robust against noise distortion and geometric transformations [26], [27]....

  • ...In the original implementation [23], C is set as 0....

Journal ArticleDOI
TL;DR: A Siamese CNN, which combines the identification and verification models of CNNs, is proposed in this letter, and experimental results show that the proposed method outperforms the existing methods.
Abstract: The convolutional neural networks (CNNs) have shown powerful feature representation capability, which provides novel avenues to improve scene classification of remote sensing imagery. Although we can acquire large collections of satellite images, the lack of rich label information is still a major concern in the remote sensing field. In addition, remote sensing data sets have their own limitations, such as the small scale of scene classes and lack of image diversity. To mitigate the impact of the existing problems, a Siamese CNN, which combines the identification and verification models of CNNs, is proposed in this letter. A metric learning regularization term is explicitly imposed on the features learned through CNNs, which forces the Siamese networks to be more robust. We carried out experiments on three widely used remote sensing data sets for performance evaluation. Experimental results show that our proposed method outperforms the existing methods.
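
The verification model in such a Siamese setup is typically a contrastive, metric-learning term on pairs of embeddings, added to the usual identification (cross-entropy) loss. A minimal PyTorch sketch of that term; the margin and the way it is combined with the classification loss are assumptions, not the letter's exact configuration:

```python
import torch
import torch.nn.functional as F

def contrastive_loss(f1, f2, same_class, margin=1.0):
    """f1, f2: (B, D) embeddings; same_class: (B,) floats, 1.0 for same-class pairs."""
    d = F.pairwise_distance(f1, f2)
    # Pull same-class pairs together; push different-class pairs apart
    # until they are at least `margin` away.
    return (same_class * d.pow(2)
            + (1.0 - same_class) * F.relu(margin - d).pow(2)).mean()

# Combined objective (sketch): identification + weighted verification.
# loss = F.cross_entropy(logits1, y1) + F.cross_entropy(logits2, y2) \
#        + lam * contrastive_loss(feat1, feat2, (y1 == y2).float())
```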

136 citations


Cites background from "Distinctive Image Features from Sca..."

  • ...During the past decades, the works for scene classification were mainly based on handcrafted features, such as GIST [1], scale-invariant feature transform [2], and histogram of oriented gradients [3]....

Journal ArticleDOI
TL;DR: It is shown that this mutual reinforcement of object-level and feature-level similarity improves unsupervised image clustering, and the technique is applied to automatically discover categories and foreground regions in images from benchmark datasets.
Abstract: We present a method to automatically discover meaningful features in unlabeled image collections. Each image is decomposed into semi-local features that describe neighborhood appearance and geometry. The goal is to determine for each image which of these parts are most relevant, given the image content in the remainder of the collection. Our method first computes an initial image-level grouping based on feature correspondences, and then iteratively refines cluster assignments based on the evolving intra-cluster pattern of local matches. As a result, the significance attributed to each feature influences an image's cluster membership, while related images in a cluster affect the estimated significance of their features. We show that this mutual reinforcement of object-level and feature-level similarity improves unsupervised image clustering, and apply the technique to automatically discover categories and foreground regions in images from benchmark datasets.
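
The mutual-reinforcement loop can be sketched as alternating between clustering images with feature-weighted affinities and re-estimating feature weights from intra-cluster matches. The similarity and update rules below are deliberate simplifications, not the paper's exact formulation:

```python
import numpy as np
from sklearn.cluster import SpectralClustering

def refine_clusters(match_counts, feat_w, n_clusters, n_iters=5):
    """match_counts[i, j, f]: how often feature f of image i matches image j;
    feat_w[i, f]: current significance of feature f in image i."""
    n = match_counts.shape[0]
    for _ in range(n_iters):
        # Image-level affinity: match strength weighted by feature significance.
        aff = np.einsum("ijf,if->ij", match_counts, feat_w)
        aff = (aff + aff.T) / 2.0 + 1e-9
        labels = SpectralClustering(n_clusters=n_clusters,
                                    affinity="precomputed").fit_predict(aff)
        # Feature-level update: a feature gains weight when it matches
        # images that currently share its image's cluster.
        for i in range(n):
            same = labels == labels[i]
            feat_w[i] = match_counts[i, same].sum(axis=0)
            feat_w[i] /= feat_w[i].sum() + 1e-9
    return labels, feat_w
```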

136 citations


Cites background or methods from "Distinctive Image Features from Sca..."

  • ...Then we construct a standard n-word visual vocabulary by clustering a random pool of descriptors (we use SIFT (Lowe 2004)) extracted from the unlabeled image dataset, U , and record each feature’s word type....

  • ...A strength of the affinity propagation method is that non-metric affinities are allowed, and so the authors compare images with SIFT features and a voting-based match, which is insensitive to clutter (Lowe 2004)....

  • ...…have shown encouraging progress, particularly in terms of generic visual category learning (Weber et al. 2000; Leibe et al. 2004; Winn and Jojic 2005; Chum and Zisserman 2007; Ling and Soatto 2007) and robust local feature representations (Lowe 2004; Agarwal and Triggs 2006; Lazebnik et al. 2004)....

Journal ArticleDOI

TL;DR: The implementation details, an exhaustive evaluation of the system on public datasets, and a comparison of most state-of-the-art feature detectors and descriptors on the presented system are provided.

136 citations


Cites methods from "Distinctive Image Features from Sca..."

  • ...The most commonly used detectors are SIFT [7], SURF [8], STAR [9], GFTT [10], FAST [11], AGAST [12], and the relatively recently proposed ORB [13], while among the most used descriptors we can mention SIFT, SURF, ORB, BRIEF [14], BRISK [15], and LATCH [16]....

  • ...And the third one, a seminal work, estimates a sparse map of SIFT features....

  • ...ORB – Oriented FAST and Rotated BRIEF [13] is another attempt to achieve a scale and rotation invariant BRIEF, as a computationally efficient alternative to SIFT and SURF....

  • ...Given the high computational cost of SIFT and SURF feature extractors, they are not considered here, since the system is expected to run in real time....

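The detector list quoted above maps largely onto OpenCV factory functions; a sketch of instantiating the ones in the main module (STAR and LATCH, and the patented SURF, live in the contrib xfeatures2d module and may be absent from a default opencv-python build):

```python
import cv2

detectors = {
    "SIFT":  cv2.SIFT_create(),
    "GFTT":  cv2.GFTTDetector_create(),
    "FAST":  cv2.FastFeatureDetector_create(),
    "AGAST": cv2.AgastFeatureDetector_create(),
    "ORB":   cv2.ORB_create(),
    "BRISK": cv2.BRISK_create(),
}

img = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)
for name, det in detectors.items():
    # Each detector shares the Feature2D interface, so detect() is uniform.
    print(name, len(det.detect(img, None)))
```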

References
Journal ArticleDOI
TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.
Abstract: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene. The features are invariant to image scale and rotation, and are shown to provide robust matching across a substantial range of affine distortion, change in 3D viewpoint, addition of noise, and change in illumination. The features are highly distinctive, in the sense that a single feature can be correctly matched with high probability against a large database of features from many images. This paper also describes an approach to using these features for object recognition. The recognition proceeds by matching individual features to a database of features from known objects using a fast nearest-neighbor algorithm, followed by a Hough transform to identify clusters belonging to a single object, and finally performing verification through least-squares solution for consistent pose parameters. This approach to recognition can robustly identify objects among clutter and occlusion while achieving near real-time performance.
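
The Hough step the abstract mentions can be sketched as coarse voting: each match predicts a pose (image location, scale, orientation) for the model, and bins that collect several consistent votes become hypotheses for least-squares verification. Bin sizes below are illustrative (the paper uses 30 degree orientation bins, a factor of 2 for scale, and 0.25 of the projected model dimension for location), and the keypoints are assumed to carry pt/size/angle as in cv2.KeyPoint:

```python
import math
from collections import defaultdict

def hough_cluster(matches, loc_bin=64.0, ori_bin=30.0, min_votes=3):
    """matches: list of (model_kp, image_kp) pairs of cv2.KeyPoint-like objects."""
    bins = defaultdict(list)
    for km, ki in matches:
        d_scale = math.log2(ki.size / km.size)  # relative scale, in octaves
        d_ori = (ki.angle - km.angle) % 360.0   # relative orientation
        key = (int(ki.pt[0] // loc_bin), int(ki.pt[1] // loc_bin),
               round(d_scale), int(d_ori // ori_bin))
        bins[key].append((km, ki))
    # Bins with enough consistent votes go on to least-squares pose fitting.
    return [v for v in bins.values() if len(v) >= min_votes]
```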

46,906 citations

Proceedings ArticleDOI
20 Sep 1999
TL;DR: Experimental results show that robust object recognition can be achieved in cluttered partially occluded images with a computation time of under 2 seconds.
Abstract: An object recognition system has been developed that uses a new class of local image features. The features are invariant to image scaling, translation, and rotation, and partially invariant to illumination changes and affine or 3D projection. These features share similar properties with neurons in inferior temporal cortex that are used for object recognition in primate vision. Features are efficiently detected through a staged filtering approach that identifies stable points in scale space. Image keys are created that allow for local geometric deformations by representing blurred image gradients in multiple orientation planes and at multiple scales. The keys are used as input to a nearest neighbor indexing method that identifies candidate object matches. Final verification of each match is achieved by finding a low residual least squares solution for the unknown model parameters. Experimental results show that robust object recognition can be achieved in cluttered partially occluded images with a computation time of under 2 seconds.
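
The staged filtering can be illustrated with a single-octave difference-of-Gaussians (DoG) stack: candidate keypoints are the contrast-filtered extrema of the stack over space and scale. A minimal numpy/scipy sketch with illustrative parameters:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, maximum_filter, minimum_filter

def dog_extrema(img, sigma=1.6, k=2 ** 0.5, n_scales=5, thresh=0.03):
    img = img.astype(np.float64) / 255.0
    # Progressively blurred images; adjacent differences form the DoG stack.
    blurred = [gaussian_filter(img, sigma * k ** i) for i in range(n_scales)]
    dog = np.stack([b2 - b1 for b1, b2 in zip(blurred, blurred[1:])])
    # A candidate is an extremum of its 3x3x3 neighbourhood in (scale, y, x)
    # with contrast above the threshold.
    is_max = dog == maximum_filter(dog, size=3)
    is_min = dog == minimum_filter(dog, size=3)
    return np.argwhere((is_max | is_min) & (np.abs(dog) > thresh))
```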

16,989 citations

Proceedings ArticleDOI
01 Jan 1988
TL;DR: The problem the authors are addressing in Alvey Project MMI149 is that of using computer vision to understand the unconstrained 3D world, in which the viewed scenes will in general contain too wide a diversity of objects for top-down recognition techniques to work.
Abstract: The problem we are addressing in Alvey Project MMI149 is that of using computer vision to understand the unconstrained 3D world, in which the viewed scenes will in general contain too wide a diversity of objects for top-down recognition techniques to work. For example, we desire to obtain an understanding of natural scenes, containing roads, buildings, trees, bushes, etc., as typified by the two frames from a sequence illustrated in Figure 1. The solution to this problem that we are pursuing is to use a computer vision system based upon motion analysis of a monocular image sequence from a mobile camera. By extraction and tracking of image features, representations of the 3D analogues of these features can be constructed.
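
The detector this paper introduced scores each pixel with the corner response R = det(M) - k * trace(M)^2, where M is the Gaussian-smoothed structure tensor of the image gradients. A compact numpy/scipy sketch:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def harris_response(img, sigma=1.5, k=0.04):
    img = img.astype(np.float64)
    ix = sobel(img, axis=1)  # horizontal gradient
    iy = sobel(img, axis=0)  # vertical gradient
    # Gaussian-weighted entries of the local structure tensor M.
    sxx = gaussian_filter(ix * ix, sigma)
    syy = gaussian_filter(iy * iy, sigma)
    sxy = gaussian_filter(ix * iy, sigma)
    det = sxx * syy - sxy ** 2
    trace = sxx + syy
    return det - k * trace ** 2  # large positive responses indicate corners
```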

13,993 citations

Journal ArticleDOI
TL;DR: It is observed that the ranking of the descriptors is mostly independent of the interest region detector and that the SIFT-based descriptors perform best and Moments and steerable filters show the best performance among the low dimensional descriptors.
Abstract: In this paper, we compare the performance of descriptors computed for local interest regions, as, for example, extracted by the Harris-Affine detector [Mikolajczyk, K and Schmid, C, 2004]. Many different descriptors have been proposed in the literature. It is unclear which descriptors are more appropriate and how their performance depends on the interest region detector. The descriptors should be distinctive and at the same time robust to changes in viewing conditions as well as to errors of the detector. Our evaluation uses as criterion recall with respect to precision and is carried out for different image transformations. We compare shape context [Belongie, S, et al., April 2002], steerable filters [Freeman, W and Adelson, E, Sept. 1991], PCA-SIFT [Ke, Y and Sukthankar, R, 2004], differential invariants [Koenderink, J and van Doorn, A, 1987], spin images [Lazebnik, S, et al., 2003], SIFT [Lowe, D. G., 1999], complex filters [Schaffalitzky, F and Zisserman, A, 2002], moment invariants [Van Gool, L, et al., 1996], and cross-correlation for different types of interest regions. We also propose an extension of the SIFT descriptor and show that it outperforms the original method. Furthermore, we observe that the ranking of the descriptors is mostly independent of the interest region detector and that the SIFT-based descriptors perform best. Moments and steerable filters show the best performance among the low dimensional descriptors.
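
The evaluation criterion can be reproduced in a few lines: sweep the match-acceptance threshold and, at each setting, record recall (correct matches over ground-truth correspondences) against 1-precision (false matches over all accepted matches). A sketch assuming precomputed match distances and ground-truth flags:

```python
import numpy as np

def recall_vs_one_minus_precision(dists, is_correct, n_corr, thresholds):
    """dists: (M,) distances of putative matches; is_correct: (M,) bool flags;
    n_corr: number of ground-truth correspondences between the two images."""
    curve = []
    for t in thresholds:
        accepted = dists < t
        tp = np.count_nonzero(accepted & is_correct)
        fp = np.count_nonzero(accepted & ~is_correct)
        curve.append((fp / max(tp + fp, 1), tp / n_corr))
    return curve  # list of (1 - precision, recall) points
```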

7,057 citations

Journal ArticleDOI
TL;DR: The high utility of MSERs, multiple measurement regions and the robust metric is demonstrated in wide-baseline experiments on image pairs from both indoor and outdoor scenes.

3,422 citations
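
For reference, MSER extraction is available directly in OpenCV; a minimal sketch with default parameters and a placeholder file name:

```python
import cv2

img = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)
mser = cv2.MSER_create()
# detectRegions returns each region's pixel list plus its bounding box.
regions, bboxes = mser.detectRegions(img)
print(f"{len(regions)} maximally stable extremal regions")
```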
