scispace - formally typeset

Distinctive Image Features from Scale-Invariant Keypoints

01 Jan 2011
TL;DR: The Scale-Invariant Feature Transform (SIFT) algorithm is a highly robust method to extract and subsequently match distinctive invariant features from images, which can then be used to reliably match objects in differing images.
Abstract: The Scale-Invariant Feature Transform (or SIFT) algorithm is a highly robust method to extract and subsequently match distinctive invariant features from images. These features can then be used to reliably match objects in differing images. The algorithm was first proposed by Lowe [12] and further developed to increase performance, resulting in the classic paper [13] that served as the foundation for SIFT, which has played an important role in robotic and machine vision over the past decade.
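As a minimal illustration of SIFT's first stage, the sketch below builds a small difference-of-Gaussians (DoG) stack and scans it for local scale-space extrema. It is a simplification for exposition (single octave, hand-picked sigmas, a crude contrast threshold, no orientation assignment or descriptor computation), not Lowe's implementation.

```python
# Illustrative sketch of DoG scale-space extrema detection, the first
# stage of SIFT. Single octave, hand-picked sigmas; not Lowe's full method.
import numpy as np
from scipy.ndimage import gaussian_filter

def dog_extrema(image, sigmas=(1.0, 1.6, 2.56, 4.1), threshold=0.01):
    """Return (row, col, scale_index) of local extrema in the DoG stack."""
    blurred = [gaussian_filter(image.astype(float), s) for s in sigmas]
    dogs = np.stack([b2 - b1 for b1, b2 in zip(blurred, blurred[1:])])
    keypoints = []
    for s in range(1, dogs.shape[0] - 1):          # interior scales only
        for r in range(1, image.shape[0] - 1):
            for c in range(1, image.shape[1] - 1):
                v = dogs[s, r, c]
                patch = dogs[s-1:s+2, r-1:r+2, c-1:c+2]
                # keep low-contrast flat regions out, then test 26 neighbors
                if abs(v) > threshold and (v == patch.max() or v == patch.min()):
                    keypoints.append((r, c, s))
    return keypoints
```

A blob whose intrinsic scale falls between the sampled sigmas produces an extremum at the middle DoG level, which is the behavior the detector exploits.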
Citations
Journal ArticleDOI
TL;DR: Machine learning techniques such as the Bayesian approach, support vector machine (SVM) kernels (polynomial, radial basis function (RBF), and Gaussian), and decision trees are employed to detect prostate cancer, and different feature-extraction strategies are proposed to improve detection performance.
Abstract: Prostate cancer is the second leading cause of cancer deaths among men. Early detection can effectively reduce the mortality rate caused by prostate cancer. The high, multiresolution nature of prostate MRI requires proper diagnostic systems and tools. In the past, researchers developed computer-aided diagnosis (CAD) systems that help the radiologist to detect abnormalities. In this research paper, we have employed machine learning techniques such as the Bayesian approach, support vector machine (SVM) kernels (polynomial, radial basis function (RBF), and Gaussian), and decision trees for detecting prostate cancer. Moreover, different feature-extraction strategies are proposed to improve the detection performance. The feature-extraction strategies are based on texture, morphological, scale-invariant feature transform (SIFT), and elliptic Fourier descriptor (EFD) features. The performance was evaluated based on single features as well as combinations of features using machine learning classification techniques. Cross-validation (jack-knife k-fold) was performed, and performance was evaluated in terms of the receiver operating characteristic (ROC) curve, specificity, sensitivity, positive predictive value (PPV), negative predictive value (NPV), and false positive rate (FPR). Based on single feature-extraction strategies, the SVM Gaussian kernel gives the highest accuracy of 98.34% with an AUC of 0.999, while, using combinations of feature-extraction strategies, the SVM Gaussian kernel with texture + morphological and EFDs + morphological features gives the highest accuracy of 99.71% and an AUC of 1.00.
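The kernel-comparison protocol described above can be sketched as follows. This is not the paper's code or data: it uses synthetic two-class feature vectors as a stand-in for the extracted texture/SIFT/EFD features, and scikit-learn's `SVC` with k-fold cross-validation.

```python
# Sketch of comparing SVM kernels under k-fold cross-validation.
# Synthetic data stands in for the paper's extracted feature vectors.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Two synthetic classes of 8-dimensional "feature vectors".
X = np.vstack([rng.normal(0.0, 1.0, (100, 8)),
               rng.normal(1.5, 1.0, (100, 8))])
y = np.array([0] * 100 + [1] * 100)

for kernel in ("linear", "poly", "rbf"):
    scores = cross_val_score(SVC(kernel=kernel), X, y, cv=5)
    print(f"{kernel}: mean accuracy {scores.mean():.3f}")
```

The same loop extends to comparing single feature sets against concatenated ("combined") feature sets, which is the axis along which the paper reports its accuracy gains.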

82 citations


Cites background from "Distinctive Image Features from Sca..."

  • ...Lowe [42] proposed SIFT features that have been...


Journal ArticleDOI
TL;DR: A novel algorithm to perform 3-D modeling, object detection, and pose estimation from unordered point-clouds is presented; it is automatic, model-free, and does not rely on any prior information about the objects in the scene.
Abstract: 3-D modeling, object detection, and pose estimation are three of the most challenging tasks in the area of 3-D computer vision. This paper presents a novel algorithm to perform these tasks simultaneously from unordered point-clouds. Given a set of input point-clouds in the presence of clutter and occlusion, an initial model is first constructed by performing pair-wise registration between any two point-clouds. The resulting model is then updated from the remaining point-clouds using a novel model growing technique. Once the final model is reconstructed, the instances of the object are detected and the poses of its instances in the scenes are estimated. This algorithm is automatic, model free, and does not rely on any prior information about the objects in the scene. The algorithm was comprehensively tested on the University of Western Australia data set. Experimental results show that our algorithm achieved accurate modeling, detection, and pose estimation performance.
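For a given set of point correspondences, the pair-wise registration step mentioned above reduces to estimating a rigid transform; the standard SVD-based (Kabsch) solution is sketched below. The paper's surrounding machinery (correspondence search, model growing, hypothesis verification) is not shown.

```python
# Sketch of rigid registration between two corresponding 3-D point sets
# via the SVD-based Kabsch solution. Correspondences are assumed given.
import numpy as np

def rigid_align(src, dst):
    """Least-squares R, t such that dst ~ src @ R.T + t; inputs are (N, 3)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)        # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))     # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t
```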

82 citations


Cites background or result from "Distinctive Image Features from Sca..."

  • ..., scale invariant feature transform [38], speeded up robust features [48], and histogram of gradients [49]) can be combined to further improve the performance of the proposed algorithm....


  • ...It is demonstrated that the correct feature correspondences have a much lower ratio compared with those which correspond to incorrect feature correspondences [38]....

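The distance-ratio criterion referenced in the excerpt above can be sketched as follows: a putative match is accepted only when its nearest-neighbor distance is well below its second-nearest distance. Descriptors here are random stand-ins rather than real SIFT output, and 0.8 is Lowe's commonly quoted default threshold.

```python
# Sketch of Lowe's nearest/second-nearest distance-ratio test for
# accepting feature matches. Brute-force distances, for illustration only.
import numpy as np

def ratio_test_matches(desc_a, desc_b, ratio=0.8):
    """Return pairs (i, j) where desc_a[i] matches desc_b[j] under the test."""
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)
        j1, j2 = np.argsort(dists)[:2]         # nearest and second nearest
        if dists[j1] < ratio * dists[j2]:      # distinctive enough to keep
            matches.append((i, int(j1)))
    return matches
```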

Proceedings ArticleDOI
01 May 2017
TL;DR: The proposed Fast-SeqSLAM has a much reduced time complexity without degrading accuracy; this is achieved by using an approximate nearest neighbor (ANN) algorithm to match the current image with those in the robot map and by extending the idea of SeqSLAM to greedily search for the sequence of images that best matches the current sequence.
Abstract: Loop closure detection or place recognition is a fundamental problem in robot simultaneous localization and mapping (SLAM). SeqSLAM is considered to be one of the most successful algorithms for loop closure detection as it has been demonstrated to be able to handle significant environmental condition changes including those due to illumination, weather, and time of the day. However, SeqSLAM relies heavily on exhaustive sequence matching, a computationally expensive process that prevents the algorithm from being used in dealing with large maps. In this paper, we propose Fast-SeqSLAM, an efficient version of SeqSLAM. Fast-SeqSLAM has a much reduced time complexity without degrading the accuracy, and this is achieved by using an approximate nearest neighbor (ANN) algorithm to match the current image with those in the robot map and extending the idea of SeqSLAM to greedily search a sequence of images that best match with the current sequence. We demonstrate the effectiveness of our Fast-SeqSLAM algorithm in appearance based loop closure detection.
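The sequence-matching idea can be sketched in toy form: score each candidate map position by the summed descriptor distance over an aligned window, and keep the best. Note that this toy version is the exhaustive search that Fast-SeqSLAM avoids; the paper replaces it with ANN lookups plus greedy sequence extension. Descriptors below are random stand-ins for per-image descriptors.

```python
# Toy sketch of exhaustive SeqSLAM-style sequence matching: slide the
# query sequence over the map and pick the lowest total descriptor distance.
import numpy as np

def best_sequence_match(map_desc, query_desc):
    """Index in map_desc where the query sequence aligns best."""
    n, m = len(map_desc), len(query_desc)
    scores = [np.sum(np.linalg.norm(map_desc[s:s + m] - query_desc, axis=1))
              for s in range(n - m + 1)]
    return int(np.argmin(scores))
```

Matching whole sequences rather than single frames is what makes this family of methods robust to condition changes: one badly matched frame cannot dominate the summed score.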

82 citations


Cites methods from "Distinctive Image Features from Sca..."

  • ...In a typical place recognition system that uses the BoW approach, scale-invariant local image descriptors, such as SIFT [12] and SURF [13] keypoints, are extracted from the images and these keypoints are vector-quantized to serve as words in the BoW technique....

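The vector-quantization step described in this excerpt can be sketched as assigning each descriptor to its nearest codebook word and histogramming the counts. The codebook and descriptors below are toy stand-ins rather than real SIFT/SURF output or a learned vocabulary.

```python
# Sketch of bag-of-words (BoW) quantization: nearest-codeword assignment
# followed by a word-count histogram summarizing one image.
import numpy as np

def bow_histogram(descriptors, codebook):
    """Histogram of descriptor counts per nearest codebook word."""
    # (N, 1, D) - (1, K, D) -> (N, K) pairwise distances
    dists = np.linalg.norm(descriptors[:, None, :] - codebook[None, :, :], axis=2)
    words = np.argmin(dists, axis=1)
    return np.bincount(words, minlength=len(codebook))
```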

Journal ArticleDOI
TL;DR: An overview of the state-of-the-art in drone cinematography is presented, along with a brief review of current commercial UAV technologies and legal restrictions on their deployment, and a novel taxonomy of UAV cinematography visual building blocks is proposed.
Abstract: Camera-equipped unmanned aerial vehicles (UAVs), or “drones,” are a recent addition to standard audiovisual shooting technologies. As drone cinematography is expected to further revolutionize media production, this paper presents an overview of the state-of-the-art in this area, along with a brief review of current commercial UAV technologies and legal restrictions on their deployment. A novel taxonomy of UAV cinematography visual building blocks, in the context of filming outdoor events where targets (e.g., athletes) must be actively followed, is additionally proposed. Such a taxonomy is necessary for progress in intelligent/autonomous UAV shooting, which has the potential of addressing current technology challenges. Subsequently, the concepts and advantages inherent in multiple-UAV cinematography are introduced. The core of multiple-UAV cinematography consists in identifying different combinations of multiple single-UAV camera motion types, assembled in meaningful sequences. Finally, based on the defined UAV/camera motion types, tools for managing a partially autonomous, multiple-UAV fleet from the director’s point of view are presented. Although the overall focus is on cinematic coverage of sports events, the majority of our contributions also apply in different scenarios, such as movies/TV production, newsgathering, or advertising.

82 citations


Cites background from "Distinctive Image Features from Sca..."

  • ...In general, they traditionally rely on image stitching methods, composed of successive keypoint detection (SIFT [33] keypoints are typically employed), image alignment (single or double homography estimation is common), calibration and blending stages....

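The homography-estimation stage of stitching mentioned above is commonly solved with the direct linear transform (DLT) on four or more point correspondences, sketched below; real pipelines wrap this in RANSAC over the detected keypoint matches, which is omitted here.

```python
# Sketch of homography estimation via the direct linear transform (DLT):
# stack two linear constraints per correspondence and take the SVD
# null-space vector as the 9 entries of H.
import numpy as np

def estimate_homography(src, dst):
    """3x3 H with dst ~ H @ src in homogeneous coords; src, dst are (N, 2), N >= 4."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.array(rows, dtype=float))
    H = Vt[-1].reshape(3, 3)      # right singular vector of smallest value
    return H / H[2, 2]
```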

Journal ArticleDOI
TL;DR: A method is presented for reconstructing the surface of the whole bladder from endoscopic video using structure from motion, creating a 3-D surface model of the bladder that can be expediently reviewed.
Abstract: Flexible cystoscopy is frequently performed for recurrent bladder cancer surveillance, making it the most expensive cancer to treat over the patient's lifetime. An automated bladder surveillance system is being developed to robotically scan the bladder surface using an ultrathin and highly flexible endoscope. Such a system would allow cystoscopic procedures to be overseen by technical staff while urologists could review cystoscopic video postoperatively. In this paper, we demonstrate a method for reconstructing the surface of the whole bladder from endoscopic video using structure from motion. Video is acquired from a custom ultrathin and highly flexible endoscope that can retroflex to image the entire internal surface of the bladder. Selected frames are subsequently stitched into a mosaic and mapped to a reconstructed surface, creating a 3-D surface model of the bladder that can be expediently reviewed. Our software was tested on endoscopic video of an excised pig bladder. The resulting reconstruction possessed a projection error of 1.66 pixels on average and covered 99.6% of the bladder surface area.

82 citations

References
Journal ArticleDOI
TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.
Abstract: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene. The features are invariant to image scale and rotation, and are shown to provide robust matching across a substantial range of affine distortion, change in 3D viewpoint, addition of noise, and change in illumination. The features are highly distinctive, in the sense that a single feature can be correctly matched with high probability against a large database of features from many images. This paper also describes an approach to using these features for object recognition. The recognition proceeds by matching individual features to a database of features from known objects using a fast nearest-neighbor algorithm, followed by a Hough transform to identify clusters belonging to a single object, and finally performing verification through least-squares solution for consistent pose parameters. This approach to recognition can robustly identify objects among clutter and occlusion while achieving near real-time performance.
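The final least-squares verification described in this abstract can be sketched as fitting an affine pose to the clustered matches. Lowe's fuller procedure also checks residuals and iteratively discards outliers, which is omitted in this minimal sketch.

```python
# Sketch of least-squares affine pose fitting for match verification:
# solve image_pts ~ [x, y, 1] @ A.T for a 2x3 affine matrix A.
import numpy as np

def fit_affine(model_pts, image_pts):
    """2x3 affine A mapping model points to image points; inputs are (N, 2)."""
    X = np.hstack([model_pts, np.ones((len(model_pts), 1))])
    A_T, *_ = np.linalg.lstsq(X, image_pts, rcond=None)
    return A_T.T
```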

46,906 citations

Proceedings ArticleDOI
20 Sep 1999
TL;DR: Experimental results show that robust object recognition can be achieved in cluttered partially occluded images with a computation time of under 2 seconds.
Abstract: An object recognition system has been developed that uses a new class of local image features. The features are invariant to image scaling, translation, and rotation, and partially invariant to illumination changes and affine or 3D projection. These features share similar properties with neurons in inferior temporal cortex that are used for object recognition in primate vision. Features are efficiently detected through a staged filtering approach that identifies stable points in scale space. Image keys are created that allow for local geometric deformations by representing blurred image gradients in multiple orientation planes and at multiple scales. The keys are used as input to a nearest neighbor indexing method that identifies candidate object matches. Final verification of each match is achieved by finding a low residual least squares solution for the unknown model parameters. Experimental results show that robust object recognition can be achieved in cluttered partially occluded images with a computation time of under 2 seconds.

16,989 citations

Proceedings ArticleDOI
01 Jan 1988
TL;DR: The problem the authors are addressing in Alvey Project MMI149 is that of using computer vision to understand the unconstrained 3D world, in which the viewed scenes will in general contain too wide a diversity of objects for top-down recognition techniques to work.
Abstract: The problem we are addressing in Alvey Project MMI149 is that of using computer vision to understand the unconstrained 3D world, in which the viewed scenes will in general contain too wide a diversity of objects for top-down recognition techniques to work. For example, we desire to obtain an understanding of natural scenes, containing roads, buildings, trees, bushes, etc., as typified by the two frames from a sequence illustrated in Figure 1. The solution to this problem that we are pursuing is to use a computer vision system based upon motion analysis of a monocular image sequence from a mobile camera. By extraction and tracking of image features, representations of the 3D analogues of these features can be constructed.
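The corner measure this paper introduces (the Harris/Plessey detector) can be sketched as the response R = det(M) - k · trace(M)² of a smoothed structure tensor M built from image gradients; k = 0.04 is a conventional choice, not a value fixed by the paper.

```python
# Sketch of the Harris corner response: build the smoothed structure
# tensor from image gradients and score R = det(M) - k * trace(M)^2.
import numpy as np
from scipy.ndimage import gaussian_filter

def harris_response(image, sigma=1.0, k=0.04):
    img = image.astype(float)
    Iy, Ix = np.gradient(img)                 # axis 0 = rows, axis 1 = cols
    Sxx = gaussian_filter(Ix * Ix, sigma)     # smoothed structure tensor
    Syy = gaussian_filter(Iy * Iy, sigma)
    Sxy = gaussian_filter(Ix * Iy, sigma)
    det = Sxx * Syy - Sxy ** 2
    trace = Sxx + Syy
    return det - k * trace ** 2
```

High R marks corners (both eigenvalues of M large), strongly negative R marks edges (one large eigenvalue), and near-zero R marks flat regions.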

13,993 citations

Journal ArticleDOI
TL;DR: It is observed that the ranking of the descriptors is mostly independent of the interest region detector and that the SIFT-based descriptors perform best and Moments and steerable filters show the best performance among the low dimensional descriptors.
Abstract: In this paper, we compare the performance of descriptors computed for local interest regions, as, for example, extracted by the Harris-Affine detector [Mikolajczyk, K and Schmid, C, 2004]. Many different descriptors have been proposed in the literature. It is unclear which descriptors are more appropriate and how their performance depends on the interest region detector. The descriptors should be distinctive and at the same time robust to changes in viewing conditions as well as to errors of the detector. Our evaluation uses as criterion recall with respect to precision and is carried out for different image transformations. We compare shape context [Belongie, S, et al., April 2002], steerable filters [Freeman, W and Adelson, E, Sept. 1991], PCA-SIFT [Ke, Y and Sukthankar, R, 2004], differential invariants [Koenderink, J and van Doorn, A, 1987], spin images [Lazebnik, S, et al., 2003], SIFT [Lowe, D. G., 1999], complex filters [Schaffalitzky, F and Zisserman, A, 2002], moment invariants [Van Gool, L, et al., 1996], and cross-correlation for different types of interest regions. We also propose an extension of the SIFT descriptor and show that it outperforms the original method. Furthermore, we observe that the ranking of the descriptors is mostly independent of the interest region detector and that the SIFT-based descriptors perform best. Moments and steerable filters show the best performance among the low dimensional descriptors.
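The evaluation criterion used in this paper, recall plotted against 1-precision, can be sketched as follows for one operating point, given a set of returned matches labeled correct/incorrect against ground truth; the variable names are illustrative.

```python
# Sketch of the recall vs. 1-precision criterion for descriptor matching:
# recall = correct matches / ground-truth correspondences,
# 1-precision = false matches / all returned matches.
import numpy as np

def recall_one_minus_precision(labels, n_ground_truth):
    """labels: booleans, True where a returned match is correct."""
    labels = np.asarray(labels, dtype=bool)
    n_correct = int(labels.sum())
    recall = n_correct / n_ground_truth
    one_minus_precision = (labels.size - n_correct) / labels.size
    return recall, one_minus_precision
```

Sweeping the matching threshold and recomputing this pair at each setting traces out the curves the paper compares across descriptors.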

7,057 citations

Journal ArticleDOI
TL;DR: The high utility of MSERs, multiple measurement regions and the robust metric is demonstrated in wide-baseline experiments on image pairs from both indoor and outdoor scenes.

3,422 citations
