Distinctive Image Features from Scale-Invariant Keypoints

01 Jan 2011
TL;DR: The Scale-Invariant Feature Transform (or SIFT) algorithm is a highly robust method to extract and consequently match distinctive invariant features from images that can then be used to reliably match objects in differing images.
Abstract: The Scale-Invariant Feature Transform (or SIFT) algorithm is a highly robust method to extract and consequently match distinctive invariant features from images. These features can then be used to reliably match objects in differing images. The algorithm was first proposed by Lowe [12] and further developed to increase performance, resulting in the classic paper [13] that served as the foundation for SIFT, which has played an important role in robotic and machine vision in the past decade.
Citations
Proceedings ArticleDOI
07 Jul 2010
TL;DR: A panorama image stitching system that combines an image matching algorithm (modified SURF) with an image blending algorithm (multi-band blending); it can make the stitching seam invisible, get a perfect panorama for large image data, and run faster than the previous method.
Abstract: SURF (Speeded Up Robust Features) is one of the famous feature-detection algorithms. This paper proposes a panorama image stitching system which combines an image matching algorithm, modified SURF, and an image blending algorithm, multi-band blending. The process is divided into the following steps: first, get the feature descriptors of the image using modified SURF; secondly, find matching pairs, check the neighbors by K-NN (K-nearest neighbor), and remove the mismatched pairs by RANSAC (Random Sample Consensus); then, adjust the images by bundle adjustment and estimate the accurate homography matrix; lastly, blend the images by multi-band blending. A comparison of SIFT (Scale Invariant Feature Transform) and modified SURF is also shown as a basis for selecting the image matching algorithm. According to the experiments, the present system can make the stitching seam invisible and get a perfect panorama for large image data, and it is faster than the previous method.
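
The K-NN matching step named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `knn_match` and `ratio_filter`, the brute-force search, and the 0.8 distance-ratio threshold are all assumptions made for illustration, and the RANSAC, bundle adjustment and blending stages are left out.

```python
import math

def knn_match(query, train, k=2):
    """For each query descriptor, return its k nearest train descriptors
    as (train_index, distance) pairs, closest first (brute-force search)."""
    matches = []
    for q in query:
        # Euclidean distance from this query descriptor to every train descriptor.
        dists = sorted((math.dist(q, t), j) for j, t in enumerate(train))
        matches.append([(j, d) for d, j in dists[:k]])
    return matches

def ratio_filter(matches, ratio=0.8):
    """Keep (query_index, train_index) only when the best match is clearly
    closer than the second best. The ratio value is an assumed default."""
    kept = []
    for i, neighbours in enumerate(matches):
        if len(neighbours) >= 2 and neighbours[0][1] < ratio * neighbours[1][1]:
            kept.append((i, neighbours[0][0]))
    return kept

# Toy 2-D "descriptors"; real SIFT descriptors are 128-dimensional.
query = [(0.0, 0.0), (10.0, 10.0)]
train = [(0.0, 1.0), (10.0, 10.5), (5.0, 5.0)]
print(ratio_filter(knn_match(query, train)))
```

In a full pipeline the surviving pairs would typically be passed to a robust estimator such as RANSAC, which is the mismatch-removal step the abstract describes next.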

92 citations


Cites background or methods from "Distinctive Image Features from Sca..."

  • ...I. INTRODUCTION Stitching multiple images together can create beautiful high-resolution panoramas....

    [...]

  • ...After extracting invariant scale features, we got potential feature matches by using k-nearest neighbor method, and then remove the mismatches with RANSAC algorithm....

    [...]

Journal ArticleDOI
TL;DR: A novel proposed model, which the authors call multinomial Beta-Liouville mixture, is optimized by deterministic annealing expectation-maximization and minimum description length, and strives to achieve a high accuracy of count data clustering and model selection.
Abstract: In this paper, we consider the problem of constructing accurate and flexible statistical representations for count data, which we often confront in many areas such as data mining, computer vision, and information retrieval. In particular, we analyze and compare several generative approaches widely used for count data clustering, namely multinomial, multinomial Dirichlet, and multinomial generalized Dirichlet mixture models. Moreover, we propose a clustering approach via a mixture model based on a composition of the Liouville family of distributions, from which we select the Beta-Liouville distribution, and the multinomial. The novel proposed model, which we call multinomial Beta-Liouville mixture, is optimized by deterministic annealing expectation-maximization and minimum description length, and strives to achieve a high accuracy of count data clustering and model selection. An important feature of the multinomial Beta-Liouville mixture is that it has fewer parameters than the recently proposed multinomial generalized Dirichlet mixture. The performance evaluation is conducted through a set of extensive empirical experiments, which concern text and image texture modeling and classification and shape modeling, and highlights the merits of the proposed models and approaches.

92 citations


Cites methods from "Distinctive Image Features from Sca..."

  • ...The next step is computing their SIFT descriptors [52], giving a 128-dimensional vector for each local region, which have been shown to outperform other descriptors such as RIFT and SPIN [47]....

    [...]

Proceedings ArticleDOI
03 Jun 2009
TL;DR: A robust system architecture for the reliable recognition of circular traffic signs by employing complementing approaches for the different stages of current TSR systems and adding a technique called contracting curve density (CCD) to refine the localization of the detected traffic sign candidates and therefore increase the performance of the subsequent classification module.
Abstract: The demand for reliable traffic sign recognition (TSR) increases with the development of safety driven advanced driver assistance systems (ADAS). Emerging technologies like brake-by-wire or steer-by-wire pave the way for collision avoidance and threat identification systems. Obviously, decision making in such critical situations requires high reliability of the information base. Especially for comfort systems, we need to take into account that the user tends to trust the information provided by the ADAS [1]. In this paper, we present a robust system architecture for the reliable recognition of circular traffic signs. Our system employs complementing approaches for the different stages of current TSR systems. This introduces the application of local SIFT features for content-based traffic sign detection along with widely applied shape-based approaches. We further add a technique called contracting curve density (CCD) to refine the localization of the detected traffic sign candidates and therefore increase the performance of the subsequent classification module. Finally, the recognition stage based on SIFT and SURF descriptions of the candidates executed by a neural net provides a robust classification of structured image content like traffic signs. By applying these steps we compensate the weaknesses of the utilized approaches, and thus, improve the system's performance.

92 citations


Cites background from "Distinctive Image Features from Sca..."

  • ...[25] (b) Scheme of the SURF descriptor....

    [...]

  • ...The first one is to a large extent similar to the SIFT descriptor [25]....

    [...]

  • ...different traffic scenes, SIFT (Scale-invariant feature transform [25]) turned out to be the most suitable one for reliable TSR....

    [...]

Journal ArticleDOI
01 May 2012
TL;DR: A hierarchical filtered motion (HFM) method to recognize actions in crowded videos by the use of motion history image (MHI) as basic representations of motion because of its robustness and efficiency is proposed.
Abstract: Action recognition with cluttered and moving background is a challenging problem. One main difficulty lies in the fact that the motion field in an action region is contaminated by the background motions. We propose a hierarchical filtered motion (HFM) method to recognize actions in crowded videos by the use of motion history image (MHI) as basic representations of motion because of its robustness and efficiency. First, we detect interest points as the two-dimensional Harris corners with recent motion, e.g., locations with high intensities in the MHI. Then, a global spatial motion smoothing filter is applied to the gradients of the MHI to eliminate isolated unreliable or noisy motions. At each interest point, a local motion field filter is applied to the smoothed gradients of the MHI by computing structure proximity between any pixel in the local region and the interest point. Thus, the motion at a pixel is enhanced or weakened based on its structure proximity with the interest point. To validate its effectiveness, we characterize the spatial and temporal features by histograms of oriented gradient in the intensity image and the MHI, respectively, and use a Gaussian-mixture-model-based classifier for action recognition. The proposed approach achieves state-of-the-art results on the KTH dataset, which has a clean background. More importantly, we perform cross-dataset action classification and detection experiments, where the KTH dataset is used for training, while the Microsoft Research (MSR) action dataset II, which consists of crowded videos with people moving in the background, is used for testing. Our experiments show that the proposed HFM method significantly outperforms existing techniques.

92 citations

Journal ArticleDOI
TL;DR: The significant challenges currently facing ISPRS and its communities are examined, such as providing high-quality information, enabling advanced geospatial computing, and supporting collaborative problem solving.
Abstract: With the increased availability of very high-resolution satellite imagery, terrain based imaging and participatory sensing, inexpensive platforms, and advanced information and communication technologies, the application of imagery is now ubiquitous, playing an important role in many aspects of life and work today. As a leading organisation in this field, the International Society for Photogrammetry and Remote Sensing (ISPRS) has been devoted to effectively and efficiently obtaining and utilising information from imagery since its foundation in the year 1910. This paper examines the significant challenges currently facing ISPRS and its communities, such as providing high-quality information, enabling advanced geospatial computing, and supporting collaborative problem solving. The state-of-the-art in ISPRS related research and development is reviewed and the trends and topics for future work are identified. By providing an overarching scientific vision and research agenda, we hope to call on and mobilise all ISPRS scientists, practitioners and other stakeholders to continue improving our understanding and capacity on information from imagery and to deliver advanced geospatial knowledge that enables humankind to better deal with the challenges ahead, posed for example by global change, ubiquitous sensing, and a demand for real-time information generation.

92 citations


Cites background from "Distinctive Image Features from Sca..."

  • ...…one is the development of invariant feature detectors, also called interest point operators, such as the Förstner (Förstner and Gülch 1987) and Harris (Harris and Stephens 1988) operators, scale-invariant feature transform (SIFT, Lowe 2004) and speeded up robust features (SURF, Bay et al. 2008)....

    [...]

References
Journal ArticleDOI
TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.
Abstract: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene. The features are invariant to image scale and rotation, and are shown to provide robust matching across a substantial range of affine distortion, change in 3D viewpoint, addition of noise, and change in illumination. The features are highly distinctive, in the sense that a single feature can be correctly matched with high probability against a large database of features from many images. This paper also describes an approach to using these features for object recognition. The recognition proceeds by matching individual features to a database of features from known objects using a fast nearest-neighbor algorithm, followed by a Hough transform to identify clusters belonging to a single object, and finally performing verification through least-squares solution for consistent pose parameters. This approach to recognition can robustly identify objects among clutter and occlusion while achieving near real-time performance.

46,906 citations

Proceedings ArticleDOI
20 Sep 1999
TL;DR: Experimental results show that robust object recognition can be achieved in cluttered partially occluded images with a computation time of under 2 seconds.
Abstract: An object recognition system has been developed that uses a new class of local image features. The features are invariant to image scaling, translation, and rotation, and partially invariant to illumination changes and affine or 3D projection. These features share similar properties with neurons in inferior temporal cortex that are used for object recognition in primate vision. Features are efficiently detected through a staged filtering approach that identifies stable points in scale space. Image keys are created that allow for local geometric deformations by representing blurred image gradients in multiple orientation planes and at multiple scales. The keys are used as input to a nearest neighbor indexing method that identifies candidate object matches. Final verification of each match is achieved by finding a low residual least squares solution for the unknown model parameters. Experimental results show that robust object recognition can be achieved in cluttered partially occluded images with a computation time of under 2 seconds.

16,989 citations

Proceedings ArticleDOI
01 Jan 1988
TL;DR: The problem the authors are addressing in Alvey Project MMI149 is that of using computer vision to understand the unconstrained 3D world, in which the viewed scenes will in general contain too wide a diversity of objects for topdown recognition techniques to work.
Abstract: The problem we are addressing in Alvey Project MMI149 is that of using computer vision to understand the unconstrained 3D world, in which the viewed scenes will in general contain too wide a diversity of objects for topdown recognition techniques to work. For example, we desire to obtain an understanding of natural scenes, containing roads, buildings, trees, bushes, etc., as typified by the two frames from a sequence illustrated in Figure 1. The solution to this problem that we are pursuing is to use a computer vision system based upon motion analysis of a monocular image sequence from a mobile camera. By extraction and tracking of image features, representations of the 3D analogues of these features can be constructed.

13,993 citations

Journal ArticleDOI
TL;DR: It is observed that the ranking of the descriptors is mostly independent of the interest region detector and that the SIFT-based descriptors perform best and Moments and steerable filters show the best performance among the low dimensional descriptors.
Abstract: In this paper, we compare the performance of descriptors computed for local interest regions, as, for example, extracted by the Harris-Affine detector [Mikolajczyk, K and Schmid, C, 2004]. Many different descriptors have been proposed in the literature. It is unclear which descriptors are more appropriate and how their performance depends on the interest region detector. The descriptors should be distinctive and at the same time robust to changes in viewing conditions as well as to errors of the detector. Our evaluation uses as criterion recall with respect to precision and is carried out for different image transformations. We compare shape context [Belongie, S, et al., April 2002], steerable filters [Freeman, W and Adelson, E, Sept. 1991], PCA-SIFT [Ke, Y and Sukthankar, R, 2004], differential invariants [Koenderink, J and van Doorn, A, 1987], spin images [Lazebnik, S, et al., 2003], SIFT [Lowe, D. G., 1999], complex filters [Schaffalitzky, F and Zisserman, A, 2002], moment invariants [Van Gool, L, et al., 1996], and cross-correlation for different types of interest regions. We also propose an extension of the SIFT descriptor and show that it outperforms the original method. Furthermore, we observe that the ranking of the descriptors is mostly independent of the interest region detector and that the SIFT-based descriptors perform best. Moments and steerable filters show the best performance among the low dimensional descriptors.

7,057 citations

Journal ArticleDOI
TL;DR: The high utility of MSERs, multiple measurement regions and the robust metric is demonstrated in wide-baseline experiments on image pairs from both indoor and outdoor scenes.

3,422 citations
