Showing papers on "Scale-invariant feature transform published in 2004"


Journal ArticleDOI
TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.
Abstract: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene. The features are invariant to image scale and rotation, and are shown to provide robust matching across a substantial range of affine distortion, change in 3D viewpoint, addition of noise, and change in illumination. The features are highly distinctive, in the sense that a single feature can be correctly matched with high probability against a large database of features from many images. This paper also describes an approach to using these features for object recognition. The recognition proceeds by matching individual features to a database of features from known objects using a fast nearest-neighbor algorithm, followed by a Hough transform to identify clusters belonging to a single object, and finally performing verification through least-squares solution for consistent pose parameters. This approach to recognition can robustly identify objects among clutter and occlusion while achieving near real-time performance.
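
As a rough illustration of the matching pipeline described above, the sketch below uses OpenCV's SIFT implementation with a nearest-neighbor ratio test followed by a RANSAC geometric check; the homography fit stands in for the paper's Hough-transform clustering and least-squares pose verification, and the image file names are placeholders.

```python
# Minimal sketch of SIFT matching between two views (OpenCV >= 4.4).
# "query.png" and "scene.png" are placeholder image paths.
import cv2
import numpy as np

img1 = cv2.imread("query.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Fast nearest-neighbour matching with a ratio test to keep distinctive matches.
matcher = cv2.BFMatcher(cv2.NORM_L2)
matches = matcher.knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]

# Geometric verification: a RANSAC homography stands in here for the
# paper's Hough clustering plus least-squares pose solution.
if len(good) >= 4:
    src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    print("consistent matches:", int(mask.sum()))
```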

46,906 citations


Proceedings ArticleDOI
27 Jun 2004
TL;DR: This paper examines (and improves upon) the local image descriptor used by SIFT, and demonstrates that the PCA-based local descriptors are more distinctive, more robust to image deformations, and more compact than the standard SIFT representation.
Abstract: Stable local feature detection and representation is a fundamental component of many image registration and object recognition algorithms. Mikolajczyk and Schmid (June 2003) recently evaluated a variety of approaches and identified the SIFT [D. G. Lowe, 1999] algorithm as being the most resistant to common image deformations. This paper examines (and improves upon) the local image descriptor used by SIFT. Like SIFT, our descriptors encode the salient aspects of the image gradient in the feature point's neighborhood; however, instead of using SIFT's smoothed weighted histograms, we apply principal components analysis (PCA) to the normalized gradient patch. Our experiments demonstrate that the PCA-based local descriptors are more distinctive, more robust to image deformations, and more compact than the standard SIFT representation. We also present results showing that using these descriptors in an image retrieval application results in increased accuracy and faster matching.
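
A hedged sketch of the PCA step follows: gradient patches around keypoints are flattened, normalized, and projected onto a basis learned offline. The 41×41 patch size and 36-dimensional output are typical choices reported for PCA-SIFT, but the patch extraction and training data here are placeholders.

```python
# Sketch of a PCA-based local descriptor (assumes patches already extracted,
# rotated to the keypoint orientation, and resized to 41x41).
import numpy as np
from sklearn.decomposition import PCA

def gradient_vector(patch):
    """Flatten the x/y gradients of a patch and normalize to unit length."""
    gy, gx = np.gradient(patch.astype(np.float64))
    v = np.concatenate([gx.ravel(), gy.ravel()])
    return v / (np.linalg.norm(v) + 1e-12)

# training_patches: patches collected offline (random placeholder data here).
training_patches = [np.random.rand(41, 41) for _ in range(500)]
X = np.stack([gradient_vector(p) for p in training_patches])

pca = PCA(n_components=36).fit(X)          # learn the projection offline

def pca_descriptor(patch):
    return pca.transform(gradient_vector(patch)[None, :])[0]   # compact 36-d descriptor
```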

3,325 citations


Journal ArticleDOI
TL;DR: An incoherent method to search for continuous gravitational waves based on the Hough transform, a well-known technique used for detecting patterns in digital images, is described.
Abstract: This paper describes an incoherent method to search for continuous gravitational waves based on the Hough transform, a well-known technique used for detecting patterns in digital images. We apply the Hough transform to detect patterns in the time-frequency plane of the data produced by an earth-based gravitational wave detector. Two different flavors of searches will be considered, depending on the type of input to the Hough transform: either Fourier transforms of the detector data or the output of a coherent matched-filtering type search. We present the technical details for implementing the Hough transform algorithm for both kinds of searches, their statistical properties, and their sensitivities.
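
The core statistic of such a search is a number count: each pixel in a thresholded time-frequency map votes for the signal parameters whose expected frequency track passes through it. The sketch below accumulates these counts over a binary spectrogram; the linear track model, threshold, and parameter grids are illustrative placeholders, not the detector-specific implementation.

```python
# Illustrative Hough number-count over a thresholded time-frequency map.
# `power` is a (num_times, num_freq_bins) spectrogram; the linear "track"
# model below is a stand-in for the true frequency evolution.
import numpy as np

def hough_number_count(power, threshold, f0_bins, fdot_bins, dt=1.0):
    binary = power > threshold                      # 1 where power crosses threshold
    num_times, num_bins = binary.shape
    counts = np.zeros((len(f0_bins), len(fdot_bins)), dtype=int)
    times = np.arange(num_times) * dt
    for i, f0 in enumerate(f0_bins):
        for j, fdot in enumerate(fdot_bins):
            track = np.round(f0 + fdot * times).astype(int)   # expected bin per time
            valid = (track >= 0) & (track < num_bins)
            counts[i, j] = binary[np.arange(num_times)[valid], track[valid]].sum()
    return counts   # large counts flag candidate signals
```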

131 citations


01 Jan 2004
TL;DR: This paper presents a method to reduce the size, complexity and matching time of SIFT feature sets for use in indoor image retrieval and robot localisation, and outlines how the scale information of the SIFT features can be used to improve the accuracy of a localisation filter.
Abstract: SIFT features are distinctive invariant features used to robustly describe and match digital image content between different views of a scene. While invariant to scale and rotation, and robust to other image transforms, the SIFT feature description of an image is typically large and slow to compute. This paper presents a method to reduce the size, complexity and matching time of SIFT feature sets for use in indoor image retrieval and robot localisation. Our method takes advantage of the structure of typical indoor environments to reduce the complexity of each SIFT feature and the number of SIFT features required to describe a scene. Our results show that there is a minimal loss of accuracy in feature retrieval while achieving a significant reduction in image descriptor size and matching time. We also outline how the scale information of the SIFT features can be used to improve the accuracy of a localisation filter. The results were obtained using digital images from interior home and office environments.

131 citations


Journal ArticleDOI
TL;DR: In this article, a statistical study of the Hough transform was conducted, where the authors established strong consistency, rates of convergence, and characterisation of the limiting distribution of the estimator.
Abstract: This paper pursues a statistical study of the Hough transform, the celebrated computer vision algorithm used to detect the presence of lines in a noisy image. We first study asymptotic properties of the Hough transform estimator, whose objective is to find the line that “best” fits a set of planar points. In particular, we establish strong consistency, rates of convergence and characterize the limiting distribution of the Hough transform estimator. While the convergence rates are seen to be slower than those found in some standard regression methods, the Hough transform estimator is shown to be more robust as measured by its breakdown point. We next study the Hough transform in the context of the problem of detecting multiple lines. This is addressed via the framework of excess mass functionals and modality testing. Throughout, several numerical examples help illustrate various properties of the estimator. Relations between the Hough transform and more mainstream statistical paradigms and methods are discussed as well.
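
For reference, the estimator votes in the standard (ρ, θ) parameterization, ρ = x cos θ + y sin θ. The minimal numpy sketch below accumulates votes from noisy planar points and returns the best-fitting line; it illustrates only the estimator itself, not the paper's asymptotic analysis, and the grid sizes are arbitrary.

```python
# Minimal Hough-transform line estimator over noisy planar points.
import numpy as np

def hough_line_estimate(points, n_theta=180, n_rho=200):
    points = np.asarray(points, dtype=float)
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    rho_max = np.abs(points).max() * np.sqrt(2)
    rhos = np.linspace(-rho_max, rho_max, n_rho)
    acc = np.zeros((n_theta, n_rho), dtype=int)
    for x, y in points:
        rho = x * np.cos(thetas) + y * np.sin(thetas)      # rho for every theta
        idx = np.clip(np.digitize(rho, rhos) - 1, 0, n_rho - 1)
        acc[np.arange(n_theta), idx] += 1                  # one vote per (theta, rho) cell
    i, j = np.unravel_index(acc.argmax(), acc.shape)
    return thetas[i], rhos[j]   # line: x*cos(theta) + y*sin(theta) = rho
```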

80 citations


Proceedings ArticleDOI
26 Aug 2004
TL;DR: The Hough transform algorithm is introduced and adopted for iris segmentation, and a modified fast algorithm is presented to address the low speed of the Hough transform.

Abstract: The Hough transform is a common algorithm used to detect the geometric shapes of objects in computer image processing; points lying on lines, circles and other curves can be detected easily by using the Hough transform in an image. In an iris recognition system, both the inner boundary and the outer boundary of a typical iris can approximately be taken as circles, so the two circles of the iris can be obtained by using the Hough transform. In this paper, the Hough transform algorithm is introduced and adopted for iris segmentation. A modified fast algorithm is presented to address the low speed of the Hough transform. Simulation and research results show that the Hough transform performs satisfactorily in iris segmentation.
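
A minimal OpenCV sketch of the circle-detection step is shown below; the radius ranges and Canny/accumulator thresholds are illustrative guesses, and the speed-up the paper proposes is not reproduced here.

```python
# Circular Hough transform for the inner (pupil) and outer (limbus) iris
# boundaries. "eye.png" and all radius/threshold values are placeholders.
import cv2
import numpy as np

gray = cv2.imread("eye.png", cv2.IMREAD_GRAYSCALE)
blurred = cv2.medianBlur(gray, 5)

pupil = cv2.HoughCircles(blurred, cv2.HOUGH_GRADIENT, dp=1, minDist=100,
                         param1=100, param2=30, minRadius=20, maxRadius=60)
limbus = cv2.HoughCircles(blurred, cv2.HOUGH_GRADIENT, dp=1, minDist=100,
                          param1=100, param2=30, minRadius=60, maxRadius=140)

for circles in (pupil, limbus):
    if circles is not None:
        x, y, r = np.round(circles[0, 0]).astype(int)   # strongest circle found
        print("circle at", (x, y), "radius", r)
```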

80 citations


Book ChapterDOI
11 May 2004
TL;DR: The technique uses the relative simplicity of small baseline tracking in image sequences to develop descriptors suitable for the more challenging task of wide baseline matching across significant viewpoint changes, motivated by the problems of mobile robot navigation and localization.
Abstract: We present a method for learning feature descriptors using multiple images, motivated by the problems of mobile robot navigation and localization. The technique uses the relative simplicity of small baseline tracking in image sequences to develop descriptors suitable for the more challenging task of wide baseline matching across significant viewpoint changes. The variations in the appearance of each feature are learned using kernel principal component analysis (KPCA) over the course of image sequences. An approximate version of KPCA is applied to reduce the computational complexity of the algorithms and yield a compact representation. Our experiments demonstrate robustness to wide appearance variations on non-planar surfaces, including changes in illumination, viewpoint, scale, and geometry of the scene.
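
As a loose illustration of the learning step, the sketch below runs kernel PCA over the appearance samples of one tracked feature; scikit-learn's exact KernelPCA is used in place of the paper's approximate variant, and the patch data, kernel, and dimensions are placeholders.

```python
# Kernel PCA over the appearance variations of a tracked feature.
# `track_patches` stands in for patches of one feature collected while
# tracking it across a small-baseline image sequence.
import numpy as np
from sklearn.decomposition import KernelPCA

track_patches = [np.random.rand(16, 16) for _ in range(40)]   # placeholder data
X = np.stack([p.ravel() for p in track_patches])

kpca = KernelPCA(n_components=8, kernel="rbf", gamma=0.01).fit(X)

def describe(patch):
    """Project a new appearance sample into the learned low-dimensional space."""
    return kpca.transform(patch.ravel()[None, :])[0]
```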

52 citations


Journal ArticleDOI
TL;DR: A two-step adaptive generalized Hough transform (GHT) for the detection of non-analytic objects undergoing weak affine transformations in images, which makes the computation more tractable and ensures high accuracy while keeping the size of the accumulator array small.

42 citations


Journal ArticleDOI
TL;DR: An adaptive approach is proposed to determine the policies for selecting the point pairs such that the resultant voting process is able to reduce spurious peaks and the point selection process is made more efficient.

31 citations


Proceedings ArticleDOI
25 Oct 2004
TL;DR: This paper shows how to select a set of robust features that are appropriate for the task of visual servoing and investigates the application of the scale-invariant feature transform (SIFT) in robotic visual servoing (RVS).

Abstract: In this paper, we focus on robust feature selection and investigate the application of the scale-invariant feature transform (SIFT) in robotic visual servoing (RVS). We consider a camera mounted on the endpoint of an anthropomorphic manipulator (eye-in-hand configuration). The objective of such an RVS system is to control the pose of the camera so that a desired relative pose between the camera and the object of interest is maintained. Since SIFT feature point correspondences are not always unique, feature points without a unique match are disregarded. When the endpoint moves along a trajectory, the robust SIFT feature points are found, and then for a similar trajectory the same selected feature points are used to keep track of the object in the current view. The point correspondences of the remaining robust feature points provide the epipolar geometry of the two scenes, from which, knowing the camera calibration, the motion of the camera is retrieved. The robot joint angle vector is then determined by solving the inverse kinematics of the manipulator. We show how to select a set of robust features that are appropriate for the task of visual servoing. Robust SIFT feature points are scale and rotation invariant and remain effective when the current position of the endpoint is far from, and rotated with respect to, the desired position.
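
A rough sketch of the match filtering and motion recovery described above: ambiguous SIFT matches are discarded with a ratio-style uniqueness test, and the relative camera motion is recovered from the remaining correspondences using a known calibration matrix. The threshold and function names are placeholders, not the paper's exact procedure.

```python
# Discard ambiguous SIFT matches, then recover camera motion from the rest.
import cv2
import numpy as np

def relative_motion(img_desired, img_current, K):
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(img_desired, None)
    kp2, des2 = sift.detectAndCompute(img_current, None)

    # Keep only feature points whose best match is clearly unique.
    matches = cv2.BFMatcher(cv2.NORM_L2).knnMatch(des1, des2, k=2)
    robust = [m for m, n in matches if m.distance < 0.7 * n.distance]

    pts1 = np.float32([kp1[m.queryIdx].pt for m in robust])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in robust])

    # Epipolar geometry of the two scenes; with calibration K the camera
    # rotation R and translation direction t follow.
    E, _ = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC)
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K)
    return R, t
```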

18 citations


Proceedings ArticleDOI
01 Jan 2004
TL;DR: In this article, a novel circle searching algorithm, Chord Reconstruction, was developed to identify and locate tomatoes in a robotic harvesting system; the algorithm was shown to be less accurate than the circular Hough transform but substantially faster.

Abstract: A novel circle searching algorithm, Chord Reconstruction, was developed to identify and locate tomatoes in a robotic harvesting system. This algorithm searches for circles by first locating the centers and then determining the radii. The results showed that Chord Reconstruction was less accurate than the circular Hough transform. However, the mean processing time for each 640×480 image was 1 second, which was 42.3 times faster than the circular Hough transform. The results further revealed that both algorithms were capable of identifying and locating tomatoes that were partially occluded. The linear Hough transform and a curvature criterion were implemented to detect obstacles in front of the tomatoes. The curvature criterion was more accurate than the linear Hough transform because the linear Hough transform was not able to detect tomato stems, which are not perfectly straight.

Book ChapterDOI
TL;DR: A new hierarchical Hough transform algorithm is proposed, which is faster and more accurate compared to the conventional generalized Hough transform, for biometric identification applications.
Abstract: This paper addresses the improvement on the matching accuracy and speed of generalized Hough transform for the biometric identification applications. The difficulties encountered in generalized Hough transform are investigated and a new hierarchical Hough transform algorithm is proposed, which is faster and more accurate compared to conventional generalized Hough transform.

10 May 2004
TL;DR: This paper proposes to use statistical dimension reduction techniques to obtain a more discriminant representation of the data, in order to improve recognition results.

Abstract: This paper addresses the challenging task of recognizing and locating objects in natural images. In computer vision, many successful approaches to object recognition use local image descriptors. Such descriptors do not require segmentation; in addition, they are robust to partial occlusion and invariant to image transformations (particularly scale changes). Among the existing descriptors, a recent comparison [4] showed that the SIFT descriptor [2] was particularly robust. However, the SIFT descriptor is high-dimensional (typically 128-dimensional) and this penalizes classification. In this paper, we propose to use statistical dimension reduction techniques to obtain a more discriminant representation of the data, in order to improve recognition results. We first describe the two stages of the recognition process (see Fig. 1), learning and recognition, then we present experimental results obtained on motorbike images.
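
The abstract does not fix a particular reduction technique; as one plausible illustration, the sketch below projects 128-dimensional SIFT descriptors onto a lower-dimensional, more discriminant space with linear discriminant analysis, using class labels from a training set. All data here is synthetic placeholder data.

```python
# Dimension reduction of 128-d SIFT descriptors with LDA, one possible
# "statistical dimension reduction technique" (synthetic data below).
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
descriptors = rng.normal(size=(1000, 128))     # placeholder SIFT descriptors
labels = rng.integers(0, 5, size=1000)         # object class of each descriptor

lda = LinearDiscriminantAnalysis(n_components=4).fit(descriptors, labels)
reduced = lda.transform(descriptors)           # 4-d discriminant representation
print(reduced.shape)
```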

Proceedings ArticleDOI
28 Sep 2004
TL;DR: The model learning framework is presented, and experimental results illustrating the success of the method for learning models that are useful for robot localization and visualization tasks are presented.
Abstract: We present a method for learning a set of models of visual features which are invariant to scale and translation in the image domain. The models are constructed by first applying the scale-invariant feature transform (SIFT) to a set of training images, and matching the extracted features across the images, followed by learning the pose-dependent behavior of the features. The modeling process avoids assumptions with respect to scene and imaging geometry, but rather learns the direct mapping from camera pose to feature observation. Such models are useful for applications to robotic tasks, such as localization, as well as visualization tasks. We present the model learning framework, and experimental results illustrating the success of the method for learning models that are useful for robot localization.

Journal Article
TL;DR: A new fast Hough transform algorithm is discussed which can effectively decrease the time complexity of the algorithm and improve its efficiency and speed.
Abstract: The Hough transform is a widely used detection algorithm in image processing. It can effectively detect the given information in an image even when there is much noise. However, the standard algorithm must process a large amount of data and is slow. This paper discusses a new fast Hough transform algorithm which can effectively decrease the time complexity of the algorithm and improve its efficiency and speed. This algorithm is useful for increasing the speed of image processing.

Proceedings Article
01 Sep 2004
TL;DR: An approach to apply the Hough Transform to the recognition of scale-variant patterns is introduced, based on the Euclidean distance between the image and the pattern, and the concept of Graduated Non-Convexity is applied to the problem of evaluating the parameter space.
Abstract: The Hough Transform is a histogram method for pattern recognition. In this paper, an approach to apply the Hough Transform to the recognition of scale-variant patterns is introduced. The approach is based on the Euclidean distance between the image and the pattern. Further, the concept of Graduated Non-Convexity (GNC) is applied to the problem of evaluating the parameter space. The result is a new, fast Hough Transform algorithm which can be generalized to high-dimensional parameter spaces.

Proceedings Article
01 Jan 2004
TL;DR: A method is proposed that carries the idea of the Hough Transform, as implemented for grey-scale images, over to color images for region extraction, and it is used to extract homogeneous regions in images taken from the Indian Remote Sensing Satellites.
Abstract: This article proposes a method that carries the idea of the Hough Transform (HT), as implemented for grey-scale images, over to color images for region extraction. A region in an image is seen as a union of pixels on several line segments having the homogeneity property. A line segment in an image is seen as a collection of pixels lying on a straight line in the Euclidean plane and sharing the homogeneity property. The property 'homogeneity' in a color image is based on the trace of the variance-covariance matrix of the colors of the pixels on the straight line. As a possible application of the method, it is used to extract the homogeneous regions in images taken from the Indian Remote Sensing Satellites.
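
The homogeneity test described above reduces to a simple statistic; a minimal sketch follows, assuming an RGB image and a user-chosen threshold (both the function name and threshold value are placeholders).

```python
# Homogeneity of the pixels along a line segment, measured as the trace of
# the covariance matrix of their colors (threshold value is a placeholder).
import numpy as np

def is_homogeneous(image, line_pixels, threshold=100.0):
    """image: HxWx3 array; line_pixels: list of (row, col) on the segment."""
    colors = np.array([image[r, c] for r, c in line_pixels], dtype=np.float64)
    cov = np.cov(colors, rowvar=False)        # 3x3 color covariance matrix
    return np.trace(cov) <= threshold
```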

01 Jan 2004
TL;DR: An adaptive method for accurate and robust grouping of local features belonging to planes of interior scenes and planar object surfaces is presented to fill the gap between low-level vision (the front end) and high-level vision, i.e., domain-specific reasoning about geometric structures.
Abstract: In this work an adaptive method for accurate and robust grouping of local features belonging to planes of interior scenes and planar object surfaces is presented. For an arbitrary set of images acquired from different views, the method organizes a huge number of local SIFT features to fill the gap between low-level vision (the front end) and high-level vision, i.e., domain-specific reasoning about geometric structures. The proposed method consists of three steps: exploration, selection, and merging with verification. The exploration is a data-driven technique that proposes a set of hypothesis clusters. To select the final hypotheses, a matrix of preferences is introduced. It evaluates each of the hypotheses in terms of the number of features, the error of the transformation, and feature duplications, and is applied in quadratic form in the process of maximization. Then, the merging process combines the information from multiple views to reduce redundancy and to enrich the selected representations. As demonstrated by experimental results, the proposed method is an example of unsupervised learning of planar parts of the scene and of objects with planar surfaces.

Book ChapterDOI
23 Aug 2004
TL;DR: This paper addresses the problem of content-based synchronization for robust watermarking with a new synchronization method based on the scale invariant feature transform that is invariant to noise, spatial filtering, geometric distortions, and illumination changes of the image.
Abstract: This paper addresses the problem of content-based synchronization for robust watermarking. Synchronization is the process of extracting the location at which to embed and detect the signature (the copyright information), so it is crucial for the robustness of the watermarking system. In this paper, we review representative content-based approaches and propose a new synchronization method based on the scale invariant feature transform. In content-based synchronization approaches, it is important to extract robust features even under image distortions, and we expect that the consideration of local image characteristics will be helpful. The scale invariant feature transform takes these characteristics into account and is invariant to noise, spatial filtering, geometric distortions, and illumination changes of the image. Through experiments, we compare the proposed method with representative content-based approaches and show its appropriateness for robust watermarking.

Proceedings ArticleDOI
23 Aug 2004
TL;DR: The Hough transform with projection (HTP) is extended to 3D lines in a particular case, and the robustness of this method, compared to the classical HT, is demonstrated.
Abstract: We propose to extend the Hough transform (HT) to 3D lines in a particular case: for each 3D point, we search for the line passing through this point that fits the other points. The method uses projections of the data onto different planes and combines these results with a least-squares algorithm. We call this algorithm the "Hough transform with projection" (HTP). The robustness of this method, compared to the classical HT, is demonstrated. We later use this algorithm to compute the speed of moving regions in image sequences, without any tracking.
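
A loose numpy sketch of the projection idea: the 3D points are shifted so the chosen point is the origin, a 2D line through that point is fit in the xy and xz projections, and the two results are combined into a 3D direction by least squares. A plain least-squares fit replaces the Hough step in each projection, and lines nearly parallel to the yz-plane are not handled; this is only an illustration of the combination step.

```python
# Sketch: line through a chosen 3D point p0, fit to the remaining points by
# combining two coordinate-plane projections with least squares.
import numpy as np

def line_through_point(points, p0):
    d = np.asarray(points, dtype=float) - p0     # shift so p0 is the origin
    dx, dy, dz = d[:, 0], d[:, 1], d[:, 2]
    a = np.linalg.lstsq(dx[:, None], dy, rcond=None)[0][0]   # slope in the xy projection
    b = np.linalg.lstsq(dx[:, None], dz, rcond=None)[0][0]   # slope in the xz projection
    direction = np.array([1.0, a, b])
    return p0, direction / np.linalg.norm(direction)
```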


01 Jan 2004
TL;DR: This paper describes the approach to the reconstruction of parameters of the three uncalibrated cameras using information from the three projections of the static scene, and builds the optimal trifocal tensor which is used to reconstruct projective cameras.
Abstract: One of the most important parts of 3D computer vision systems is the reconstruction of cameras. In this paper we describe our approach to reconstructing the parameters of three uncalibrated cameras using information from three projections of a static scene. We match distinct features of the scene (such as corner points and straight line segments) and robustly sift out outlier matches using RANSAC techniques. Then the optimal trifocal tensor is built using an iterative algorithm which uses the inlier matches. This trifocal tensor is used to reconstruct projective cameras. Finally, these cameras may be upgraded to metric cameras if certain assumptions hold. The algorithm pipeline is fully automatic.
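
The match-sifting step mentioned above can be illustrated between one pair of the three views; the sketch below rejects outlier matches with a RANSAC fundamental-matrix fit and keeps the inliers that would later feed the trifocal-tensor estimation (which is not shown here).

```python
# RANSAC sifting of putative matches between two of the three views.
import cv2
import numpy as np

def sift_out_outliers(pts1, pts2):
    """pts1, pts2: Nx2 float32 arrays of corresponding points."""
    F, mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, 1.0, 0.99)
    inliers = mask.ravel().astype(bool)
    return pts1[inliers], pts2[inliers]          # inlier matches for the tensor fit
```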

Proceedings ArticleDOI
23 Aug 2004
TL;DR: This paper presents a method of applying the Hough transform to a log-polar mapped edge image that includes both foveal and peripheral visual information, and shows that this method achieves good performance for line detection in the log-polar space.
Abstract: A line equation in Cartesian space can be mapped conformally to log-polar space, but many applications of log-polar mapping use only the peripheral region. This paper presents a method of applying the Hough transform to a log-polar mapped edge image that includes both foveal and peripheral visual information. It is shown that this method achieves good performance for line detection in the log-polar space.