Distinctive Image Features from Scale-Invariant Keypoints

01 Jan 2011
TL;DR: The Scale-Invariant Feature Transform (or SIFT) algorithm is a highly robust method to extract and subsequently match distinctive invariant features from images, which can then be used to reliably match objects across differing images.
Abstract: The Scale-Invariant Feature Transform (or SIFT) algorithm is a highly robust method to extract and subsequently match distinctive invariant features from images. These features can then be used to reliably match objects across differing images. The algorithm was first proposed by Lowe [12] and further developed to improve performance, resulting in the classic paper [13] that serves as the foundation for SIFT, which has played an important role in robotic and machine vision over the past decade.
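For orientation, here is a minimal sketch of this extract-and-match workflow using OpenCV (assuming opencv-python >= 4.4, where SIFT ships in the main module; the image filenames and the 0.75 ratio threshold are illustrative, not prescribed by the paper):

```python
# Detect SIFT keypoints and 128-D descriptors in two images, then keep
# unambiguous correspondences using Lowe's ratio test.
import cv2

img1 = cv2.imread("scene_a.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("scene_b.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)  # keypoints + descriptors
kp2, des2 = sift.detectAndCompute(img2, None)

# k-nearest-neighbour matching; a match is kept only if it is clearly
# better than the second-best candidate (ratio test).
matcher = cv2.BFMatcher(cv2.NORM_L2)
good = [m for m, n in matcher.knnMatch(des1, des2, k=2)
        if m.distance < 0.75 * n.distance]
print(f"{len(good)} putative correspondences")
```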
Citations
Proceedings ArticleDOI
30 Aug 2010
TL;DR: A violence detector built on visual codebooks and linear support vector machines is presented, confirming that motion patterns are crucial for distinguishing violence from regular activities, in contrast to visual descriptors that rely solely on the spatial domain.
Abstract: In this paper we present a violence detector built on the concept of visual codebooks using linear support vector machines. It differs from existing work on violence detection in its data representation, as none has considered local spatio-temporal features with bags of visual words. The importance of local spatio-temporal features for characterizing multimedia content is evaluated through cross-validation. The results confirm that motion patterns are crucial for distinguishing violence from regular activities, in contrast to visual descriptors that rely solely on the spatial domain.
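As a rough illustration of the pipeline this abstract describes (local descriptors, a k-means codebook, per-clip histograms, then a linear SVM evaluated by cross-validation), a scikit-learn sketch follows; the descriptor extraction step, the vocabulary size, and all names here are assumptions, not the authors' implementation:

```python
# Bag-of-visual-words sketch: quantize each clip's local spatio-temporal
# descriptors against a learned codebook and classify the histograms.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score

def encode(descriptors, codebook):
    """Quantize one clip's (n, d) descriptors into a normalized histogram."""
    words = codebook.predict(descriptors)
    hist = np.bincount(words, minlength=codebook.n_clusters).astype(float)
    return hist / max(hist.sum(), 1.0)

def train_detector(clips, labels, vocab_size=500):
    """clips: list of (n_i, d) descriptor arrays; labels: 1 = violence."""
    codebook = KMeans(n_clusters=vocab_size, n_init=4).fit(np.vstack(clips))
    X = np.array([encode(c, codebook) for c in clips])
    clf = LinearSVC()
    print("CV accuracy:", cross_val_score(clf, X, labels, cv=5).mean())
    return codebook, clf.fit(X, labels)
```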

118 citations


Cites background or methods from "Distinctive Image Features from Sca..."

  • ...performance we also applied SIFT [11])....

    [...]

  • ...ison of the violence detector using SIFT [11] and STIP [3] was carried out....

    [...]

  • ...In the literature, many interest point detectors have been proposed including detectors of spatio-temporal features [10] [3], scale-invariant features [11], and features invariant to scale and affine transformations [12]....

    [...]

  • ...The results suggest how relevant it is to work in the space-time domain to capture unique characteristics of the behaviour of the interest structures, in contrast to a visual descriptor that relies solely on the space domain [11]....

    [...]

Journal ArticleDOI
TL;DR: This paper proposes a novel method for object detection based on structural feature description and query expansion that is evaluated on high-resolution satellite images and demonstrates its clear advantages over several other object detection methods.
Abstract: Object detection is an important task in very high-resolution remote sensing image analysis. Traditional detection approaches are often not sufficiently robust in dealing with the variations of targets and sometimes suffer from limited training samples. In this paper, we tackle these two problems by proposing a novel method for object detection based on structural feature description and query expansion. The feature description combines both local and global information of objects. After initial feature extraction from a query image and representative samples, these descriptors are updated through an augmentation process to better describe the object of interest. The object detection step is implemented using a ranking support vector machine (SVM), which converts the detection task into a ranking query task. The ranking SVM is first trained on a small subset of the training data, with samples automatically ranked by similarity to the query image. Then, a novel query expansion method is introduced to update the initial object model by active learning with human input on the ranking of image pairs. Once the query expansion process is completed, which is determined by measuring entropy changes, the model is applied to the whole target data set, in which objects of different classes are to be detected. We evaluate the proposed method on high-resolution satellite images and demonstrate its clear advantages over several other object detection methods.
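The ranking-SVM step can be approximated with the classic pairwise transform, in which pairs of samples with different ranks become classification examples on their feature differences. The sketch below is a generic stand-in with placeholder names, not the paper's implementation:

```python
# Pairwise ranking SVM sketch: learn a linear scoring function whose
# decision values order candidates by similarity to the query.
import numpy as np
from sklearn.svm import LinearSVC

def pairwise_transform(X, ranks):
    """Turn ranked samples into labeled difference vectors x_i - x_j."""
    diffs, labels = [], []
    for i in range(len(X)):
        for j in range(i + 1, len(X)):
            if ranks[i] == ranks[j]:
                continue  # ties carry no ordering information
            diffs.append(X[i] - X[j])
            labels.append(1 if ranks[i] > ranks[j] else -1)
    return np.array(diffs), np.array(labels)

def train_ranker(X, ranks):
    diffs, labels = pairwise_transform(np.asarray(X), ranks)
    return LinearSVC().fit(diffs, labels)

# Higher decision_function scores mean "ranked closer to the query":
# scores = train_ranker(train_X, train_ranks).decision_function(test_X)
```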

118 citations

Journal ArticleDOI
08 Oct 2015 - Cell
TL;DR: A volumetric super-resolution reconstruction platform for large-volume imaging and automated segmentation of neurons and synapses with molecular identity information is developed and used to map inhibitory synaptic input fields of On-Off direction-selective ganglion cells (On-Off DSGCs), which are important for computing visual motion direction in the mouse retina.

117 citations


Cites methods from "Distinctive Image Features from Sca..."

  • ...Corresponding SIFT features between adjacent sections were used to determine a rigid linear transformation between sections, which was applied to all sections in the dataset to achieve a coarse, 3D rigid alignment of the data....

    [...]

  • ...We developed an automated image analysis pipeline for processing STORM and conventional images, which included corrections of chromatic aberration and lens distortions using bead fiducials, as well as montage and serial-section alignment using scale-invariant feature transformation (SIFT) followed by elastic registration (Saalfeld et al., 2012) to generate large-volume reconstructions (Figure 1A) (see the Experimental Procedures for details)....

    [...]

  • ...On average, the residual offset in alignment between SIFT points of similarity in two adjacent image tiles was <40 nm....

    [...]

  • ...For mosaic imaging, Scale-Invariant Feature Transformation (SIFT) (Lowe, 2004) was used to find points of similarity between overlapping regions in adjacent image tiles in the WGA channel and generate a rigid alignment transformation that was applied to the conventional and STORM images to stitch overlapping image tiles....

    [...]
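The alignment procedure quoted in these snippets (SIFT matches in the overlap of adjacent tiles, followed by a robustly fitted rigid transformation) might be sketched as follows; note that cv2.estimateAffinePartial2D actually fits a similarity transform (rotation, uniform scale, translation), so a strictly rigid alignment would additionally normalize out the scale. Names and thresholds are illustrative:

```python
# Estimate a tile-to-tile transform from SIFT correspondences with RANSAC.
import cv2
import numpy as np

def align_tiles(tile_a, tile_b):
    sift = cv2.SIFT_create()
    kp1, d1 = sift.detectAndCompute(tile_a, None)
    kp2, d2 = sift.detectAndCompute(tile_b, None)
    good = [m for m, n in cv2.BFMatcher(cv2.NORM_L2).knnMatch(d1, d2, k=2)
            if m.distance < 0.75 * n.distance]  # ratio test
    src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    M, inliers = cv2.estimateAffinePartial2D(src, dst, method=cv2.RANSAC)
    return M  # 2x3 matrix mapping tile_a coordinates into tile_b
```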

01 Jan 2011
TL;DR: The usage of inertial sensors has traditionally been confined primarily to the aviation and marine industries due to their associated cost and bulkiness; during the last decade, however, inertial sensing has been used in the medical field.
Abstract: The usage of inertial sensors has traditionally been confined primarily to the aviation and marine industry due to their associated cost and bulkiness. During the last decade, however, inertial sen ...

117 citations


Cites methods from "Distinctive Image Features from Sca..."

  • ...Common examples are SIFT (Lowe, 2004) and more recently SURF (Bay et al....

    [...]

Proceedings ArticleDOI
12 Aug 2012
TL;DR: This work learns privacy classifiers trained on a large set of manually assessed Flickr photos, combining textual metadata of images with a variety of visual features, and employs the resulting classification models for specifically searching for private photos, and for diversifying query results to provide users with a better coverage of private and public content.
Abstract: Modern content sharing environments such as Flickr or YouTube contain a large amount of private resources such as photos showing weddings, family holidays, and private parties. These resources can be of a highly sensitive nature, disclosing many details of the users' private sphere. In order to support users in making privacy decisions in the context of image sharing and to provide them with a better overview on privacy related visual content available on the Web, we propose techniques to automatically detect private images, and to enable privacy-oriented image search. To this end, we learn privacy classifiers trained on a large set of manually assessed Flickr photos, combining textual metadata of images with a variety of visual features. We employ the resulting classification models for specifically searching for private photos, and for diversifying query results to provide users with a better coverage of private and public content. Large-scale classification experiments reveal insights into the predictive performance of different visual and textual features, and a user evaluation of query result rankings demonstrates the viability of our approach.
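A minimal sketch of the early-fusion idea behind this abstract (textual metadata concatenated with visual features before training a classifier); the feature extractors, the logistic-regression stand-in for the paper's classifiers, and all names are assumptions:

```python
# Combine TF-IDF over image tags with precomputed visual features
# and train a linear privacy classifier on the fused representation.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

def train_privacy_classifier(tag_strings, visual_feats, labels):
    """tag_strings: one whitespace-joined tag string per image;
    visual_feats: (n, d) array of visual descriptors; labels: 1 = private."""
    tfidf = TfidfVectorizer()
    X_text = tfidf.fit_transform(tag_strings).toarray()
    X = np.hstack([X_text, visual_feats])  # concatenate both modalities
    clf = LogisticRegression(max_iter=1000).fit(X, labels)
    return tfidf, clf
```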

117 citations

References
Journal ArticleDOI
TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.
Abstract: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene. The features are invariant to image scale and rotation, and are shown to provide robust matching across a substantial range of affine distortion, change in 3D viewpoint, addition of noise, and change in illumination. The features are highly distinctive, in the sense that a single feature can be correctly matched with high probability against a large database of features from many images. This paper also describes an approach to using these features for object recognition. The recognition proceeds by matching individual features to a database of features from known objects using a fast nearest-neighbor algorithm, followed by a Hough transform to identify clusters belonging to a single object, and finally performing verification through least-squares solution for consistent pose parameters. This approach to recognition can robustly identify objects among clutter and occlusion while achieving near real-time performance.
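The verification stage described here can be approximated with RANSAC standing in for the paper's Hough-transform clustering followed by a least-squares pose fit. In this sketch, `good_matches` and the keypoint lists are assumed to come from a ratio-test matching step like the one sketched near the top of the page:

```python
# Geometric verification: fit an affine pose to the putative matches and
# accept the object only if enough matches are consistent inliers.
import cv2
import numpy as np

def verify_object(good_matches, kp_model, kp_scene, min_inliers=10):
    src = np.float32([kp_model[m.queryIdx].pt
                      for m in good_matches]).reshape(-1, 1, 2)
    dst = np.float32([kp_scene[m.trainIdx].pt
                      for m in good_matches]).reshape(-1, 1, 2)
    M, mask = cv2.estimateAffine2D(src, dst, method=cv2.RANSAC)
    inliers = int(mask.sum()) if mask is not None else 0
    return (M, inliers) if inliers >= min_inliers else (None, inliers)
```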

46,906 citations

Proceedings ArticleDOI
20 Sep 1999
TL;DR: Experimental results show that robust object recognition can be achieved in cluttered partially occluded images with a computation time of under 2 seconds.
Abstract: An object recognition system has been developed that uses a new class of local image features. The features are invariant to image scaling, translation, and rotation, and partially invariant to illumination changes and affine or 3D projection. These features share similar properties with neurons in inferior temporal cortex that are used for object recognition in primate vision. Features are efficiently detected through a staged filtering approach that identifies stable points in scale space. Image keys are created that allow for local geometric deformations by representing blurred image gradients in multiple orientation planes and at multiple scales. The keys are used as input to a nearest neighbor indexing method that identifies candidate object matches. Final verification of each match is achieved by finding a low residual least squares solution for the unknown model parameters. Experimental results show that robust object recognition can be achieved in cluttered partially occluded images with a computation time of under 2 seconds.
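A toy version of the staged filtering idea, detecting stable points as extrema in a difference-of-Gaussians (DoG) stack; the sigma schedule and threshold below are illustrative, not Lowe's exact parameters:

```python
# Build a DoG scale-space stack and keep pixels that are extrema among
# their neighbours in space and scale.
import numpy as np
from scipy.ndimage import gaussian_filter, maximum_filter, minimum_filter

def dog_extrema(img, sigmas=(1.0, 1.6, 2.56, 4.1), thresh=0.02):
    img = img.astype(float) / img.max()
    blurred = [gaussian_filter(img, s) for s in sigmas]
    dog = np.stack([b - a for a, b in zip(blurred, blurred[1:])])
    maxima = (dog == maximum_filter(dog, size=3)) & (dog > thresh)
    minima = (dog == minimum_filter(dog, size=3)) & (dog < -thresh)
    return np.argwhere(maxima | minima)  # (scale_index, row, col) triples
```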

16,989 citations

Proceedings ArticleDOI
01 Jan 1988
TL;DR: The problem the authors are addressing in Alvey Project MMI149 is that of using computer vision to understand the unconstrained 3D world, in which the viewed scenes will in general contain too wide a diversity of objects for top-down recognition techniques to work.
Abstract: The problem we are addressing in Alvey Project MMI149 is that of using computer vision to understand the unconstrained 3D world, in which the viewed scenes will in general contain too wide a diversity of objects for top-down recognition techniques to work. For example, we desire to obtain an understanding of natural scenes, containing roads, buildings, trees, bushes, etc., as typified by the two frames from a sequence illustrated in Figure 1. The solution to this problem that we are pursuing is to use a computer vision system based upon motion analysis of a monocular image sequence from a mobile camera. By extraction and tracking of image features, representations of the 3D analogues of these features can be constructed.

13,993 citations

Journal ArticleDOI
TL;DR: It is observed that the ranking of the descriptors is mostly independent of the interest region detector and that the SIFT-based descriptors perform best and Moments and steerable filters show the best performance among the low dimensional descriptors.
Abstract: In this paper, we compare the performance of descriptors computed for local interest regions, as, for example, extracted by the Harris-Affine detector [Mikolajczyk, K and Schmid, C, 2004]. Many different descriptors have been proposed in the literature. It is unclear which descriptors are more appropriate and how their performance depends on the interest region detector. The descriptors should be distinctive and at the same time robust to changes in viewing conditions as well as to errors of the detector. Our evaluation uses as criterion recall with respect to precision and is carried out for different image transformations. We compare shape context [Belongie, S, et al., April 2002], steerable filters [Freeman, W and Adelson, E, Sept. 1991], PCA-SIFT [Ke, Y and Sukthankar, R, 2004], differential invariants [Koenderink, J and van Doorn, A, 1987], spin images [Lazebnik, S, et al., 2003], SIFT [Lowe, D. G., 1999], complex filters [Schaffalitzky, F and Zisserman, A, 2002], moment invariants [Van Gool, L, et al., 1996], and cross-correlation for different types of interest regions. We also propose an extension of the SIFT descriptor and show that it outperforms the original method. Furthermore, we observe that the ranking of the descriptors is mostly independent of the interest region detector and that the SIFT-based descriptors perform best. Moments and steerable filters show the best performance among the low dimensional descriptors.
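The evaluation criterion used here (recall versus 1-precision as the match-acceptance threshold is swept) is straightforward to reproduce; the inputs in this sketch are placeholders supplied by some matching step and ground-truth correspondence check:

```python
# Compute recall vs. 1-precision curves for a descriptor's matches.
import numpy as np

def recall_vs_one_minus_precision(distances, is_correct, n_ground_truth):
    """distances: match distances; is_correct: bool per match;
    n_ground_truth: number of true correspondences in the image pair."""
    order = np.argsort(distances)                # accept matches best-first
    correct = np.cumsum(np.asarray(is_correct)[order])
    accepted = np.arange(1, len(order) + 1)
    recall = correct / n_ground_truth
    one_minus_precision = (accepted - correct) / accepted
    return recall, one_minus_precision
```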

7,057 citations

Journal ArticleDOI
TL;DR: The high utility of MSERs, multiple measurement regions and the robust metric is demonstrated in wide-baseline experiments on image pairs from both indoor and outdoor scenes.

3,422 citations
