
Distinctive Image Features from Scale-Invariant Keypoints

01 Jan 2011
TL;DR: The Scale-Invariant Feature Transform (or SIFT) algorithm is a highly robust method to extract and subsequently match distinctive invariant features from images, which can then be used to reliably match objects in differing images.
Abstract: The Scale-Invariant Feature Transform (or SIFT) algorithm is a highly robust method to extract and subsequently match distinctive invariant features from images. These features can then be used to reliably match objects in differing images. The algorithm was first proposed by Lowe [12] and further developed to improve performance, resulting in the classic paper [13] that serves as the foundation for SIFT, which has played an important role in robotic and machine vision over the past decade.
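Downstream of feature extraction, SIFT descriptors are matched by nearest-neighbour search, and Lowe's paper keeps a match only when the nearest neighbour is markedly closer than the second nearest (the distance-ratio test, with a threshold of 0.8 in the paper). A minimal NumPy sketch of that test — illustrative only, with brute-force search standing in for the paper's fast approximate nearest-neighbour indexing:

```python
import numpy as np

def match_descriptors(desc_a, desc_b, ratio=0.8):
    """Match each descriptor in desc_a to its nearest neighbour in desc_b,
    keeping a match only if the nearest distance is clearly smaller than
    the second-nearest distance (Lowe's ratio test)."""
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)  # Euclidean distances
        order = np.argsort(dists)
        nearest, second = order[0], order[1]
        if dists[nearest] < ratio * dists[second]:
            matches.append((i, nearest))
    return matches
```

The ratio test discards ambiguous matches cheaply, which is what makes reliable matching against large descriptor databases practical.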
Citations
Posted Content
TL;DR: This paper proposes its own algorithm for Alzheimer's Disease diagnostics based on a convolutional neural network and fusion of the sMRI and DTI modalities on a hippocampal ROI, using data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database, and shows promising results.
Abstract: Computer-aided early diagnosis of Alzheimer's Disease (AD) and its prodromal form, Mild Cognitive Impairment (MCI), has been the subject of extensive research in recent years. Some recent studies have shown promising results in AD and MCI determination using structural and functional Magnetic Resonance Imaging (sMRI, fMRI), Positron Emission Tomography (PET) and Diffusion Tensor Imaging (DTI) modalities. Furthermore, fusion of imaging modalities in a supervised machine learning framework has proved a promising direction of research. In this paper we first review major trends in automatic classification methods, such as feature extraction based methods as well as deep learning approaches, in medical image analysis applied to the field of Alzheimer's Disease diagnostics. Then we propose our own algorithm for Alzheimer's Disease diagnostics based on a convolutional neural network and fusion of the sMRI and DTI modalities on a hippocampal ROI, using data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (this http URL). Comparison with a single-modality approach shows promising results. We also propose our own method of data augmentation for balancing classes of different sizes and analyze the impact of the ROI size on the classification results.

86 citations


Cites methods from "Distinctive Image Features from Scale-Invariant Keypoints"

  • ...The originality of the work consisted in the usage of Gauss-Laguerre Harmonic Functions (GL-CHFs) instead of traditional SIFT[12] and SURF[13] descriptors....


  • ...The authors construct the united data representation for each patient: F = [X1; X2; Z1; Z2] ∈ R4d×n and calculate SIFT descriptors....



  • ...As an alternative to heavy volumetric methods, feature-based approaches were applied in the problem of AD detection using domain knowledge both on the ROI biomarkers and on the nature of the signal in sMRI and DTI modalities which is blurry and cannot be sufficiently well described by conventional differential descriptors such as SIFT[12] and SURF[13]....



Journal ArticleDOI
TL;DR: A novel graph-based learning method for effectively retrieving remote sensing images that integrates the strengths of query expansion with the fusion of holistic and local features, significantly enhancing retrieval precision without sacrificing scalability.
Abstract: With the emergence of huge volumes of high-resolution remote sensing images produced by all sorts of satellites and airborne sensors, processing and analysis of these images require effective retrieval techniques. To alleviate the dramatic variation in retrieval accuracy among queries caused by single-image-feature algorithms, we developed a novel graph-based learning method for effectively retrieving remote sensing images. The method utilizes a three-layer framework that integrates the strengths of query expansion and fusion of holistic and local features. In the first layer, two retrieval image sets are obtained using, respectively, retrieval methods based on holistic and local features, and the top-ranked images common to both candidate lists subsequently form graph anchors. In the second layer, the graph anchors serve as an expanded query to retrieve six image sets from the image database using each individual feature. In the third layer, the images in the six image sets are evaluated to generate positive and negative data, and SimpleMKL is applied to learn suitable query-dependent fusion weights for achieving the final image retrieval result. Extensive experiments were performed on the UC Merced Land Use-Land Cover data set. The source code is available on our website. Compared with other related methods, the retrieval precision is significantly enhanced without sacrificing the scalability of our approach.
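The first and third layers of the framework can be sketched schematically: anchors are images top-ranked by both feature types, and the final score is a weighted (late) fusion of per-feature similarities. This is a minimal illustration with names of my own choosing; in the paper the fusion weights are query-dependent and learned with SimpleMKL, whereas fixed weights stand in here:

```python
def graph_anchors(holistic_rank, local_rank, top_k=5):
    """Layer 1 (sketch): anchors are the images appearing in the top-k
    of both the holistic-feature and local-feature rankings."""
    return [img for img in holistic_rank[:top_k] if img in local_rank[:top_k]]

def fused_score(img, score_fns, weights):
    """Layer 3 (sketch): late fusion of per-feature similarity scores.
    Fixed weights are a stand-in for the SimpleMKL-learned,
    query-dependent weights of the paper."""
    return sum(w * f(img) for f, w in zip(score_fns, weights))
```

Query expansion via the anchors then lets each feature channel retrieve with a more robust query than the single original image.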

85 citations

Journal ArticleDOI
TL;DR: This research focuses on replacing a manufacturer-provided, menu-based interface with a vision-based system while adding autonomy to reduce the cognitive load, and presents the complete system which can autonomously retrieve a desired object from a shelf.
Abstract: Wheelchair-mounted robotic arms have been commercially available for a decade. In order to operate these robotic arms, a user must have a high level of cognitive function. Our research focuses on replacing a manufacturer-provided, menu-based interface with a vision-based system while adding autonomy to reduce the cognitive load. Instead of manual task decomposition and execution, the user explicitly designates the end goal, and the system autonomously retrieves the object. In this paper, we present the complete system which can autonomously retrieve a desired object from a shelf. We also present the results of a 15-week study in which 12 participants from our target population used our system, totaling 198 trials.

85 citations


Cites methods from "Distinctive Image Features from Scale-Invariant Keypoints"

  • ...We use the Scale Invariant Feature Tracking (SIFT) descriptor to estimate the 3D position for the target object from the gripper stereo camera [25]....


Proceedings ArticleDOI
13 Jun 2010
TL;DR: An efficient min-Hash based algorithm for the discovery of dependencies in sparse high-dimensional data is presented, and it is shown experimentally that co-ocsets are not rare and that they may ruin retrieval performance when present in the query image.
Abstract: An efficient min-Hash based algorithm for the discovery of dependencies in sparse high-dimensional data is presented. The dependencies are represented by sets of features co-occurring with high probability, called co-ocsets. Sparse high-dimensional descriptors, such as bag of words, have proven very effective in the domain of image retrieval. To maintain high efficiency even for very large data collections, features are assumed independent. We show experimentally that co-ocsets are not rare, i.e. the independence assumption is often violated, and that they may ruin retrieval performance when present in the query image. Two methods for managing co-ocsets in such cases are proposed. Both methods significantly outperform the state of the art in image retrieval, and one is also significantly faster.
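The primitive underlying the paper's discovery algorithm is min-hashing: the probability that two sets share a min-hash value equals their Jaccard similarity, so repeated co-occurrence of min-hash values signals dependent features. A minimal sketch of that primitive (not the paper's co-ocset discovery algorithm itself; the seeded-hash construction here is one common choice):

```python
def minhash_signature(s, hash_seeds):
    """One min-hash per seed: the minimum of a seeded hash over the set's
    elements. Two sets agree on a given min-hash with probability equal
    to their Jaccard similarity."""
    return [min(hash((seed, x)) for x in s) for seed in hash_seeds]

def estimated_jaccard(sig_a, sig_b):
    """Fraction of agreeing min-hashes estimates |A ∩ B| / |A ∪ B|."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)
```

Because signatures are short, candidate dependent feature pairs can be screened in time independent of the (very large) number of images.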

85 citations


Cites methods from "Distinctive Image Features from Scale-Invariant Keypoints"

  • ...In particular, we use hessian affine features and the SIFT descriptor [12]....


Journal ArticleDOI
TL;DR: In this article, a parallel processing strategy is introduced to reduce the computation time of the photogrammetric process, enabling the acquisition of spatial information at large mapping scales with rapid response and precise modelling in three dimensions.
Abstract: Low-altitude images acquired by unpiloted aerial vehicles have the advantages of high overlap, multiple viewing angles and very high ground resolution. These kinds of images can be used in various applications that need high accuracy or fine texture. A novel approach is proposed for parallel processing of low-altitude images acquired by unpiloted aerial vehicles, which can automatically fly according to predefined flight routes under the control of an autopilot system. The general overlap and relative rotation angles between two adjacent images are estimated by overall matching with an improved scale-invariant feature transform (SIFT) operator. Precise conjugate points and relative orientation parameters are determined by a pyramid-based least squares image matching strategy and the relative orientation process. Bundle adjustment is performed with automatically matched conjugate points and interactively measured ground control points. After this aerial triangulation process the high-resolution images can be used to advantage in obtaining precise spatial information products such as digital surface models, digital orthophotomaps and 3D city models. A parallel processing strategy is introduced in this paper to improve the computational time of the photogrammetric process. Experimental results show that the proposed approaches are effective for processing low-altitude images, and have high potential for the acquisition of spatial information at large mapping scales, with rapid response and precise modelling in three dimensions.
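The parallelism exploited here comes from the fact that SIFT matching and relative orientation of each adjacent image pair are independent of the other pairs, so the pairs of a flight strip can be dispatched to workers concurrently. A schematic sketch under that assumption, with a hypothetical `match_pair` standing in for the actual per-pair photogrammetric work:

```python
from concurrent.futures import ThreadPoolExecutor

def match_pair(pair):
    """Hypothetical stand-in for the per-pair work: SIFT matching and
    relative orientation of two adjacent images."""
    left, right = pair
    return (left, right, f"matched {left}-{right}")

def match_strip(images, workers=4):
    """Process all adjacent image pairs of a flight strip in parallel;
    each pair is independent, so they map cleanly onto a worker pool."""
    pairs = list(zip(images, images[1:]))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(match_pair, pairs))
```

In a real pipeline the per-pair work is CPU-bound, so a process pool (or distributed workers, as in the paper) would replace the thread pool used in this sketch.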

85 citations


Cites background or methods from "Distinctive Image Features from Scale-Invariant Keypoints"

  • ...The scale-invariant feature transform (SIFT) operator (Lowe, 1999, 2004) can transform image data into scale-invariant coordinates relative to local features....


  • ...According to Lowe (1999, 2004), the major stages of computation used to generate a set of image features are: scale-space extrema [sic] detection, keypoint localisation, orientation assignment and keypoint descriptor....


References
Journal ArticleDOI
TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.
Abstract: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene. The features are invariant to image scale and rotation, and are shown to provide robust matching across a substantial range of affine distortion, change in 3D viewpoint, addition of noise, and change in illumination. The features are highly distinctive, in the sense that a single feature can be correctly matched with high probability against a large database of features from many images. This paper also describes an approach to using these features for object recognition. The recognition proceeds by matching individual features to a database of features from known objects using a fast nearest-neighbor algorithm, followed by a Hough transform to identify clusters belonging to a single object, and finally performing verification through least-squares solution for consistent pose parameters. This approach to recognition can robustly identify objects among clutter and occlusion while achieving near real-time performance.

46,906 citations

Proceedings ArticleDOI
20 Sep 1999
TL;DR: Experimental results show that robust object recognition can be achieved in cluttered partially occluded images with a computation time of under 2 seconds.
Abstract: An object recognition system has been developed that uses a new class of local image features. The features are invariant to image scaling, translation, and rotation, and partially invariant to illumination changes and affine or 3D projection. These features share similar properties with neurons in inferior temporal cortex that are used for object recognition in primate vision. Features are efficiently detected through a staged filtering approach that identifies stable points in scale space. Image keys are created that allow for local geometric deformations by representing blurred image gradients in multiple orientation planes and at multiple scales. The keys are used as input to a nearest neighbor indexing method that identifies candidate object matches. Final verification of each match is achieved by finding a low residual least squares solution for the unknown model parameters. Experimental results show that robust object recognition can be achieved in cluttered partially occluded images with a computation time of under 2 seconds.

16,989 citations

Proceedings ArticleDOI
01 Jan 1988
TL;DR: The problem the authors are addressing in Alvey Project MMI149 is that of using computer vision to understand the unconstrained 3D world, in which the viewed scenes will in general contain too wide a diversity of objects for topdown recognition techniques to work.
Abstract: The problem we are addressing in Alvey Project MMI149 is that of using computer vision to understand the unconstrained 3D world, in which the viewed scenes will in general contain too wide a diversity of objects for top-down recognition techniques to work. For example, we desire to obtain an understanding of natural scenes, containing roads, buildings, trees, bushes, etc., as typified by the two frames from a sequence illustrated in Figure 1. The solution to this problem that we are pursuing is to use a computer vision system based upon motion analysis of a monocular image sequence from a mobile camera. By extraction and tracking of image features, representations of the 3D analogues of these features can be constructed.

13,993 citations

Journal ArticleDOI
TL;DR: It is observed that the ranking of the descriptors is mostly independent of the interest region detector and that the SIFT-based descriptors perform best and Moments and steerable filters show the best performance among the low dimensional descriptors.
Abstract: In this paper, we compare the performance of descriptors computed for local interest regions, as, for example, extracted by the Harris-Affine detector [Mikolajczyk, K and Schmid, C, 2004]. Many different descriptors have been proposed in the literature. It is unclear which descriptors are more appropriate and how their performance depends on the interest region detector. The descriptors should be distinctive and at the same time robust to changes in viewing conditions as well as to errors of the detector. Our evaluation uses recall with respect to precision as its criterion and is carried out for different image transformations. We compare shape context [Belongie, S, et al., April 2002], steerable filters [Freeman, W and Adelson, E, Sept. 1991], PCA-SIFT [Ke, Y and Sukthankar, R, 2004], differential invariants [Koenderink, J and van Doorn, A, 1987], spin images [Lazebnik, S, et al., 2003], SIFT [Lowe, D. G., 1999], complex filters [Schaffalitzky, F and Zisserman, A, 2002], moment invariants [Van Gool, L, et al., 1996], and cross-correlation for different types of interest regions. We also propose an extension of the SIFT descriptor and show that it outperforms the original method. Furthermore, we observe that the ranking of the descriptors is mostly independent of the interest region detector and that the SIFT-based descriptors perform best. Moments and steerable filters show the best performance among the low dimensional descriptors.
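The evaluation criterion used in this comparison is commonly stated as recall versus 1−precision: recall is the fraction of ground-truth correspondences recovered, and 1−precision is the fraction of returned matches that are false. A small illustrative helper (names are my own choosing, not the authors' code):

```python
def recall_one_minus_precision(matches, ground_truth, n_correspondences):
    """Descriptor-evaluation criterion (sketch):
    recall          = correct matches / total ground-truth correspondences
    1 - precision   = false matches   / all returned matches
    `matches` and `ground_truth` hold (index_a, index_b) pairs."""
    correct = sum(1 for m in matches if m in ground_truth)
    false = len(matches) - correct
    recall = correct / n_correspondences
    one_minus_precision = false / len(matches) if matches else 0.0
    return recall, one_minus_precision
```

Sweeping the matching threshold traces out the recall vs. 1−precision curve on which the descriptors are ranked.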

7,057 citations

Journal ArticleDOI
TL;DR: The high utility of MSERs, multiple measurement regions and the robust metric is demonstrated in wide-baseline experiments on image pairs from both indoor and outdoor scenes.

3,422 citations
