
Showing papers on "Scale-invariant feature transform published in 2013"


Proceedings ArticleDOI
01 Dec 2013
TL;DR: This paper introduces a new dataset called StreetViewText-Perspective, which contains texts in street images with a great variety of viewpoints, along with a recognition method that significantly outperforms the state-of-the-art on perspective texts of arbitrary orientations.
Abstract: This paper presents an approach to text recognition in natural scene images. Unlike most existing works, which assume that texts are horizontal and frontal parallel to the image plane, our method is able to recognize perspective texts of arbitrary orientations. For individual character recognition, we adopt a bag-of-keypoints approach, in which Scale Invariant Feature Transform (SIFT) descriptors are extracted densely and quantized using a pre-trained vocabulary. Following [1, 2], context information is utilized through lexicons. We formulate word recognition as finding the optimal alignment between the set of characters and the list of lexicon words. Furthermore, we introduce a new dataset called StreetViewText-Perspective, which contains texts in street images with a great variety of viewpoints. Experimental results on public datasets and the proposed dataset show that our method significantly outperforms the state-of-the-art on perspective texts of arbitrary orientations.

378 citations
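As a rough illustration of the bag-of-keypoints pipeline described above, the following Python sketch extracts SIFT descriptors on a dense grid and quantizes them against a k-means visual vocabulary. File names, grid step, and vocabulary size are illustrative assumptions, not the paper's settings:

```python
# Sketch of dense SIFT + visual vocabulary quantization (illustrative
# parameters and file names; not the authors' exact pipeline).
import cv2
import numpy as np
from sklearn.cluster import MiniBatchKMeans

def dense_keypoints(img, step=8, size=8):
    """Keypoints on a regular grid, as in dense SIFT extraction."""
    h, w = img.shape[:2]
    return [cv2.KeyPoint(float(x), float(y), size)
            for y in range(step, h - step, step)
            for x in range(step, w - step, step)]

sift = cv2.SIFT_create()

def dense_sift(img):
    _, desc = sift.compute(img, dense_keypoints(img))
    return desc

# Train a visual vocabulary on descriptors pooled from training images.
train_imgs = [cv2.imread(p, cv2.IMREAD_GRAYSCALE) for p in ["a.png", "b.png"]]
all_desc = np.vstack([dense_sift(im) for im in train_imgs])
vocab = MiniBatchKMeans(n_clusters=256, random_state=0).fit(all_desc)

def bow_histogram(img):
    """Quantize each descriptor to its nearest visual word and histogram."""
    words = vocab.predict(dense_sift(img))
    hist = np.bincount(words, minlength=vocab.n_clusters).astype(float)
    return hist / max(hist.sum(), 1.0)  # L1-normalize
```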


Proceedings ArticleDOI
23 Jun 2013
TL;DR: This work proposes a joint Bayesian adaptation algorithm to adapt the universally trained GMM to better model the pose variations between the target pair of faces/face tracks, which consistently improves face verification accuracy.
Abstract: Pose variation remains a major challenge for real-world face recognition. We approach this problem through a probabilistic elastic matching method. We take a part-based representation by extracting local features (e.g., LBP or SIFT) from densely sampled multi-scale image patches. By augmenting each feature with its location, a Gaussian mixture model (GMM) is trained to capture the spatial-appearance distribution of all face images in the training corpus. Each mixture component of the GMM is confined to be a spherical Gaussian to balance the influence of the appearance and the location terms. Each Gaussian component builds correspondence of a pair of features to be matched between two faces/face tracks. For face verification, we train an SVM on the vector concatenating the difference vectors of all the feature pairs to decide if a pair of faces/face tracks is matched or not. We further propose a joint Bayesian adaptation algorithm to adapt the universally trained GMM to better model the pose variations between the target pair of faces/face tracks, which consistently improves face verification accuracy. Our experiments show that our method outperforms the state-of-the-art in the most restricted protocol on Labeled Faces in the Wild (LFW) and the YouTube video face database by a significant margin.

232 citations
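A minimal sketch of the spatial-appearance GMM idea follows: each local descriptor is augmented with its normalized (x, y) location, and a GMM with spherical components is fit over the pooled training features. The random stand-in data, component count, and location weight are illustrative assumptions:

```python
# Sketch of the spatial-appearance GMM: descriptors augmented with location,
# modeled by spherical Gaussians (illustrative parameters, not the paper's).
import numpy as np
from sklearn.mixture import GaussianMixture

def augment_with_location(descriptors, locations, img_w, img_h, loc_weight=1.0):
    """Append normalized (x, y) to each descriptor so a single Gaussian can
    capture both appearance and position."""
    xy = locations / np.array([img_w, img_h], dtype=float)
    return np.hstack([descriptors, loc_weight * xy])

# descriptors: (N, 128) SIFT or LBP features; locations: (N, 2) pixel coords.
rng = np.random.default_rng(0)
descriptors = rng.random((5000, 128))   # stand-in for real local features
locations = rng.random((5000, 2)) * 100

X = augment_with_location(descriptors, locations, img_w=100, img_h=100)

# Spherical covariances balance the appearance and location terms,
# as the abstract describes.
gmm = GaussianMixture(n_components=64, covariance_type="spherical",
                      random_state=0).fit(X)

# For matching, the component with the highest responsibility for a feature
# defines its correspondence slot between two faces.
component_of = gmm.predict(X)
```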


Journal ArticleDOI
TL;DR: An automated species identification method for wildlife pictures captured by remote camera traps that uses improved sparse coding spatial pyramid matching (ScSPM), which extracts dense SIFT descriptors and cell-structured LBP as local features and generates a global feature via weighted sparse coding and max pooling using a multi-scale pyramid kernel.
Abstract: Image sensors are increasingly being used in biodiversity monitoring, with each study generating many thousands or millions of pictures. Efficiently identifying the species captured by each image is a critical challenge for the advancement of this field. Here, we present an automated species identification method for wildlife pictures captured by remote camera traps. Our process starts with images that are cropped out of the background. We then use improved sparse coding spatial pyramid matching (ScSPM), which extracts dense SIFT descriptors and cell-structured LBP (cLBP) as the local features, generates a global feature via weighted sparse coding and max pooling using a multi-scale pyramid kernel, and classifies the images with a linear support vector machine algorithm. Weighted sparse coding is used to enforce both sparsity and locality of encoding in feature space. We tested the method on a dataset with over 7,000 camera trap images of 18 species from two different field sites, and achieved an average classification accuracy of 82%. Our analysis demonstrates that the combination of SIFT and cLBP can serve as a useful technique for animal species recognition in real, complex scenarios.

184 citations


Journal Article
TL;DR: Two different methods for scale- and rotation-invariant interest point/feature detection and description are presented: the Scale Invariant Feature Transform (SIFT) and Speeded Up Robust Features (SURF).
Abstract: Accurate, robust and automatic image registration is a critical task in many applications. Image registration/alignment requires the following steps: feature detection, feature matching, derivation of a transformation function based on corresponding features in the images, and reconstruction of the images based on the derived transformation function. The accuracy of the registered image depends on accurate feature detection and matching, so these two intermediate steps are very important in many image applications: image registration, computer vision, image mosaicking, etc. This paper presents two different methods for scale- and rotation-invariant interest point/feature detection and description: the Scale Invariant Feature Transform (SIFT) and Speeded Up Robust Features (SURF). It also presents a way to extract distinctive invariant features from images that can be used to perform reliable matching between different views of an object/scene.

166 citations
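The detection-and-matching steps the abstract lists map directly onto OpenCV. The sketch below uses SIFT with Lowe's ratio test; SURF is patented and only available in opencv-contrib builds (cv2.xfeatures2d.SURF_create), so it is omitted here. File names are hypothetical:

```python
# Minimal SIFT feature detection and ratio-test matching with OpenCV.
import cv2

img1 = cv2.imread("view1.png", cv2.IMREAD_GRAYSCALE)  # hypothetical files
img2 = cv2.imread("view2.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Lowe's ratio test: keep a match only if the best candidate is clearly
# closer than the second best.
matcher = cv2.BFMatcher(cv2.NORM_L2)
good = [m for m, n in matcher.knnMatch(des1, des2, k=2)
        if m.distance < 0.75 * n.distance]
print(f"{len(good)} tentative correspondences")
```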


Journal ArticleDOI
TL;DR: An improved version of the scale-invariant feature transform is first proposed to obtain initial matching features from optical and SAR images, and the initial matching features are then refined by exploring their spatial relationship.
Abstract: Although feature-based methods have been successfully developed in the past decades for the registration of optical images, the registration of optical and synthetic aperture radar (SAR) images is still a challenging problem in remote sensing. In this letter, an improved version of the scale-invariant feature transform is first proposed to obtain initial matching features from optical and SAR images. Then, the initial matching features are refined by exploring their spatial relationship. The refined feature matches are finally used for estimating registration parameters. Experimental results have shown the effectiveness of the proposed method.

163 citations
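The letter does not spell out its spatial-relationship refinement here, so the following is only a simplified stand-in for the idea: a match is kept when the distances it forms with the other matched points scale consistently between the two images. The tolerance and voting threshold are assumptions:

```python
# Simplified stand-in for refining matches by spatial relationship
# (an illustrative heuristic, not the letter's exact algorithm).
import numpy as np

def refine_by_spatial_consistency(pts1, pts2, tol=0.2, min_votes=0.5):
    """pts1, pts2: (N, 2) arrays of initially matched coordinates."""
    n = len(pts1)
    d1 = np.linalg.norm(pts1[:, None] - pts1[None, :], axis=2)
    d2 = np.linalg.norm(pts2[:, None] - pts2[None, :], axis=2)
    ratio = d2 / np.where(d1 > 0, d1, np.inf)
    scale = np.median(ratio[ratio > 0])      # robust global scale estimate
    consistent = np.abs(ratio - scale) < tol * scale
    votes = consistent.sum(axis=1) / float(n)
    return votes > min_votes                 # boolean mask of kept matches
```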


Journal ArticleDOI
TL;DR: This paper systematically analyzes SIFT and its variants and evaluates their performance in different situations: scale change, rotation change, blur change, illumination change, and affine change, showing that each has its own advantages.
Abstract: SIFT is an image local feature description algorithm based on scale-space. Due to its strong matching ability, SIFT has many applications in different fields, such as image retrieval, image stitching, and machine vision. Since SIFT was proposed, researchers have never stopped tuning it. The improved algorithms that have drawn a lot of attention are PCA-SIFT, GSIFT, CSIFT, SURF and ASIFT. In this paper, we first systematically analyze SIFT and its variants. Then, we evaluate their performance in different situations: scale change, rotation change, blur change, illumination change, and affine change. The experimental results show that each has its own advantages. SIFT and CSIFT perform the best under scale and rotation change. CSIFT improves SIFT under blur change and affine change, but not illumination change. GSIFT performs the best under blur change and illumination change. ASIFT performs the best under affine change. PCA-SIFT consistently ranks second across situations. SURF performs the worst overall, but runs the fastest.

159 citations


Proceedings ArticleDOI
01 Dec 2013
TL;DR: A method that produces tentative object segmentation masks to suppress background clutter in the features, significantly improving object detection, complemented by contextual features in the form of a full-image FV descriptor and an inter-category rescoring mechanism.
Abstract: We present an object detection system based on the Fisher vector (FV) image representation computed over SIFT and color descriptors. For computational and storage efficiency, we use a recent segmentation-based method to generate class-independent object detection hypotheses, in combination with data compression techniques. Our main contribution is a method to produce tentative object segmentation masks to suppress background clutter in the features. Re-weighting the local image features based on these masks is shown to improve object detection significantly. We also exploit contextual features in the form of a full-image FV descriptor, and an inter-category rescoring mechanism. Our experiments on the VOC 2007 and 2010 datasets show that our detector improves over the current state-of-the-art detection results.

140 citations
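For readers unfamiliar with the Fisher vector representation the detector builds on, here is a minimal sketch of FV encoding with respect to a diagonal-covariance GMM. Only the first-order (mean) gradients are shown for brevity; a full FV also includes second-order terms. The random stand-in descriptors and GMM size are assumptions:

```python
# Sketch of Fisher vector encoding (first-order statistics only).
import numpy as np
from sklearn.mixture import GaussianMixture

def fisher_vector_mu(descriptors, gmm):
    """FV gradient w.r.t. the GMM means for one image's descriptors."""
    X = np.asarray(descriptors)              # (N, D)
    gamma = gmm.predict_proba(X)             # (N, K) soft assignments
    N = X.shape[0]
    sigma = np.sqrt(gmm.covariances_)        # (K, D) for 'diag' covariance
    parts = []
    for k in range(gmm.n_components):
        diff = (X - gmm.means_[k]) / sigma[k]
        g = (gamma[:, k:k + 1] * diff).sum(axis=0)
        parts.append(g / (N * np.sqrt(gmm.weights_[k])))
    fv = np.concatenate(parts)
    # Power- and L2-normalization, as is standard for FVs.
    fv = np.sign(fv) * np.sqrt(np.abs(fv))
    return fv / max(np.linalg.norm(fv), 1e-12)

# Fit the GMM on pooled training descriptors (stand-in random data here).
train_desc = np.random.default_rng(0).random((10000, 64))
gmm = GaussianMixture(n_components=32, covariance_type="diag",
                      random_state=0).fit(train_desc)
fv = fisher_vector_mu(train_desc[:500], gmm)   # one image's encoding
```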


Journal ArticleDOI
TL;DR: A new operator called the orthogonal combination of local binary patterns (OC-LBP), along with six new OC-LBP-based local descriptors enhanced with color information for image region description, is proposed to increase both the discriminative power and the photometric invariance of the original LBP operator while keeping its computational efficiency.

128 citations


Journal ArticleDOI
TL;DR: A new descriptor named flip-invariant SIFT (F-SIFT) is proposed, which preserves the original properties of SIFT while being tolerant to flips; in video copy detection it not only improves detection accuracy but also yields more than 50% savings in computational cost.
Abstract: The scale-invariant feature transform (SIFT) feature has been widely accepted as an effective local keypoint descriptor for its invariance to rotation, scale, and lighting changes in images. However, it is also well known that SIFT, which is derived from directionally sensitive gradient fields, is not flip invariant. In real-world applications, flip or flip-like transformations are commonly observed in images due to artificial flipping, opposite capturing viewpoint, or symmetric patterns of objects. This paper proposes a new descriptor, named flip-invariant SIFT (or F-SIFT), that preserves the original properties of SIFT while being tolerant to flips. F-SIFT starts by estimating the dominant curl of a local patch and then geometrically normalizes the patch by flipping before the computation of SIFT. We demonstrate the power of F-SIFT on three tasks: large-scale video copy detection, object recognition, and detection. In copy detection, we propose a framework that smartly indexes the flip properties of F-SIFT for rapid filtering and weak geometric checking. F-SIFT not only significantly improves the detection accuracy of SIFT, but also leads to more than 50% savings in computational cost. In object recognition, we demonstrate the superiority of F-SIFT in dealing with flip transformation by comparing it to seven other descriptors. In object detection, we further show the ability of F-SIFT in describing symmetric objects. Consistent improvement across different kinds of keypoint detectors is observed for F-SIFT over the original SIFT.

111 citations


Journal ArticleDOI
TL;DR: A layer-parallel SIFT (LPSIFT) with integral image, and its parallel hardware design with an on-the-fly feature extraction flow for real-time applications, reducing the computational amount by 90% and memory usage by 95%.
Abstract: Visual feature extraction with scale invariant feature transform (SIFT) is widely used for object recognition. However, its real-time implementation suffers from long latency, heavy computation, and high memory storage because of its frame level computation with iterated Gaussian blur operations. Thus, this paper proposes a layer parallel SIFT (LPSIFT) with integral image, and its parallel hardware design with an on-the-fly feature extraction flow for real-time application needs. Compared with the original SIFT algorithm, the proposed approach reduces the computational amount by 90% and memory usage by 95%. The final implementation uses 580-K gate count with 90-nm CMOS technology, and offers 6000 feature points/frame for VGA images at 30 frames/s and ~ 2000 feature points/frame for 1920 × 1080 images at 30 frames/s at the clock rate of 100 MHz.

108 citations


Journal ArticleDOI
TL;DR: A novel integrated approach which exploits features of uniform robust scale invariant feature transform (UR-SIFT) and PIIFD and is robust against low content contrast of color images and large content, appearance, and scale changes between color and other retinal image modalities like the fluorescein angiography.
Abstract: Existing algorithms based on the scale invariant feature transform (SIFT) and Harris corners, such as edge-driven dual-bootstrap iterative closest point and Harris-partial intensity invariant feature descriptor (PIIFD) respectively, have been shown to be robust in registering multimodal retinal images. However, they fail to register color retinal images with other modalities in the presence of large content or scale changes. Moreover, these approaches need preprocessing operations such as image resizing to do well. This restricts the application of image registration for further analysis such as change detection and image fusion. Motivated by the need for efficient registration of multimodal retinal image pairs, this paper introduces a novel integrated approach which exploits features of the uniform robust scale invariant feature transform (UR-SIFT) and PIIFD. The approach is robust against the low content contrast of color images and large content, appearance, and scale changes between color and other retinal image modalities such as fluorescein angiography. Because the standard SIFT detector is inefficient for multimodal images, the UR-SIFT algorithm extracts highly stable and distinctive features across the full distribution of location and scale in the images, so that the feature points are adequate in number and repeatable. Moreover, the PIIFD descriptor is symmetric to contrast, which makes it suitable for robust multimodal image registration. After UR-SIFT feature extraction and PIIFD descriptor generation, an initial cross-matching process is performed, followed by a mismatch elimination algorithm. Our dataset consists of 120 pairs of multimodal retinal images. Experimental results show that UR-SIFT-PIIFD outperforms Harris-PIIFD and similar algorithms in terms of efficiency and positional accuracy.

Journal ArticleDOI
TL;DR: This study addresses the limitations of the existing comparative tools and delivers a generalized criterion to determine beforehand the level of efficiency expected from a matching algorithm given the type of images evaluated.

Journal ArticleDOI
TL;DR: It is shown that, in the complex CT imagery domain containing a high degree of noise and imaging artefacts, a specific instance object recognition system using simpler descriptors appears to outperform a more complex RIFT/SIFT solution.

Proceedings ArticleDOI
01 Dec 2013
TL;DR: A probabilistic parametric model that assigns confidence values to each matching correspondence and thereby accelerates the generation of hypothesis models for RANSAC at low inlier ratios, estimating accurate hypotheses significantly faster than previous state-of-the-art approaches.
Abstract: Algorithms based on RANSAC that estimate models using feature correspondences between images can slow down tremendously when the percentage of correct correspondences (inliers) is small. In this paper, we present a probabilistic parametric model that allows us to assign confidence values for each matching correspondence and therefore accelerates the generation of hypothesis models for RANSAC under these conditions. Our framework leverages Extreme Value Theory to accurately model the statistics of matching scores produced by a nearest-neighbor feature matcher. Using a new algorithm based on this model, we are able to estimate accurate hypotheses with RANSAC at low inlier ratios significantly faster than previous state-of-the-art approaches, while still performing comparably when the number of inliers is large. We present results of homography and fundamental matrix estimation experiments for both SIFT and SURF matches that demonstrate that our method leads to accurate and fast model estimations.
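A much-simplified sketch of confidence-guided hypothesis generation follows: correspondences with higher confidence are sampled more often, so good hypotheses tend to appear earlier when inliers are scarce. The paper derives the confidences from an Extreme Value Theory fit to matching scores; here they are simply taken as input, and the iteration count and threshold are assumptions:

```python
# Confidence-guided homography RANSAC (illustrative; the EVT confidence
# model from the paper is assumed to be computed elsewhere).
import numpy as np
import cv2

def guided_homography_ransac(pts1, pts2, confidence, iters=500, thresh=3.0):
    rng = np.random.default_rng(0)
    p = confidence / confidence.sum()
    best_H, best_inliers = None, 0
    for _ in range(iters):
        # Draw a minimal sample, biased toward confident correspondences.
        idx = rng.choice(len(pts1), size=4, replace=False, p=p)
        H = cv2.getPerspectiveTransform(
            pts1[idx].astype(np.float32), pts2[idx].astype(np.float32))
        proj = cv2.perspectiveTransform(
            pts1.reshape(-1, 1, 2).astype(np.float32), H).reshape(-1, 2)
        inliers = (np.linalg.norm(proj - pts2, axis=1) < thresh).sum()
        if inliers > best_inliers:
            best_H, best_inliers = H, inliers
    return best_H, best_inliers
```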

Proceedings ArticleDOI
23 Jun 2013
TL;DR: This paper uses a single-shot light field image as an input and proposes a new feature, called the light field distortion (LFD) feature, for identifying a transparent object, which is incorporated into the bag-of-features approach for recognizing transparent objects.
Abstract: Current object-recognition algorithms use local features, such as the scale-invariant feature transform (SIFT) and speeded-up robust features (SURF), for visually learning to recognize objects. These approaches, however, cannot be applied to transparent objects made of glass or plastic, as such objects take on the visual features of background objects, and their appearance varies dramatically with changes in scene background. Indeed, in transmitting light, transparent objects have the unique characteristic of distorting the background by refraction. In this paper, we use a single-shot light field image as input and model the distortion of the light field caused by the refractive property of a transparent object. We propose a new feature, called the light field distortion (LFD) feature, for identifying a transparent object, and incorporate it into the bag-of-features approach for recognizing transparent objects. We evaluated its performance in laboratory and real settings.

Journal ArticleDOI
TL;DR: This article proposes a novel geometric coding algorithm, to encode the spatial context among local features for large-scale partial-duplicate Web image retrieval, which achieves comparable performance to other state-of-the-art global geometric verification methods, but is more computationally efficient.
Abstract: Most large-scale image retrieval systems are based on the bag-of-visual-words model. However, the traditional bag-of-visual-words model does not capture the geometric context among local features in images well, which plays an important role in image retrieval. In order to fully explore the geometric context of all visual words in images, efficient global geometric verification methods have been attracting lots of attention. Unfortunately, currently existing methods for global geometric verification are either too computationally expensive to ensure real-time response, or cannot handle rotation well. To solve the preceding problems, in this article, we propose a novel geometric coding algorithm to encode the spatial context among local features for large-scale partial-duplicate Web image retrieval. Our geometric coding consists of geometric square coding and geometric fan coding, which describe the spatial relationships of SIFT features in three geo-maps used for global verification to remove geometrically inconsistent SIFT matches. Our approach is not only computationally efficient, but also effective in detecting partial-duplicate images with rotation, scale changes, partial occlusion, and background clutter. Experiments in partial-duplicate Web image search, using two datasets with one million Web images as distractors, reveal that our approach outperforms, in mean average precision, the baseline bag-of-visual-words approach even when the baseline is followed by RANSAC verification. Moreover, our approach achieves comparable performance to other state-of-the-art global geometric verification methods, for example, the spatial coding scheme, while being more computationally efficient.
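To convey the flavor of spatial verification, here is a much-simplified stand-in: for every pair of matches, check whether their relative left/right and above/below ordering is preserved across the two images, and drop matches that disagree with the majority. The article's geo-maps additionally handle rotation, which this sketch does not; the agreement threshold is an assumption:

```python
# Simplified spatial-ordering verification in the spirit of geometric coding
# (illustrative only; not the article's square/fan coding).
import numpy as np

def spatial_ordering_filter(pts1, pts2, min_agree=0.6):
    """pts1, pts2: (N, 2) matched SIFT locations in the two images."""
    sx1 = np.sign(pts1[:, None, 0] - pts1[None, :, 0])  # left/right in img1
    sx2 = np.sign(pts2[:, None, 0] - pts2[None, :, 0])  # left/right in img2
    sy1 = np.sign(pts1[:, None, 1] - pts1[None, :, 1])  # above/below in img1
    sy2 = np.sign(pts2[:, None, 1] - pts2[None, :, 1])  # above/below in img2
    agree = ((sx1 == sx2) & (sy1 == sy2)).mean(axis=1)
    return agree > min_agree        # mask of spatially consistent matches
```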

Book ChapterDOI
02 Jun 2013
TL;DR: It is shown how a significant increase in matching performance can be obtained in relation to the underlying interest point detectors in the SIFT and the SURF operators.
Abstract: The performance of matching and object recognition methods based on interest points depends on both the properties of the underlying interest points and the associated image descriptors. This paper demonstrates the advantages of using generalized scale-space interest point detectors when computing image descriptors for image-based matching. These generalized scale-space interest points are based on linking of image features over scale and scale selection by weighted averaging along feature trajectories over scale and allow for a higher ratio of correct matches and a lower ratio of false matches compared to previously known interest point detectors within the same class. Specifically, it is shown how a significant increase in matching performance can be obtained in relation to the underlying interest point detectors in the SIFT and the SURF operators. We propose that these generalized scale-space interest points when accompanied by associated scale-invariant image descriptors should allow for better performance of interest point based methods for image-based matching, object recognition and related vision tasks.

Proceedings ArticleDOI
01 Jul 2013
TL;DR: An up-to-date, detailed, clear, and complete evaluation of local feature detectors and descriptors, focusing on methods designed with complexity constraints, providing a much-needed reference for researchers in this field.
Abstract: Several visual feature extraction algorithms have recently appeared in the literature, with the goal of reducing the computational complexity of state-of-the-art solutions (e.g., SIFT and SURF). Therefore, it is necessary to evaluate the performance of these emerging visual descriptors in terms of processing time, repeatability and matching accuracy, and whether they can obtain competitive performance in applications such as image retrieval. This paper aims to provide an up-to-date, detailed, clear, and complete evaluation of local feature detectors and descriptors, focusing on the methods that were designed with complexity constraints, providing a much-needed reference for researchers in this field. Our results demonstrate that recent feature extraction algorithms, e.g., BRISK and ORB, have competitive performance while requiring much lower complexity and can be efficiently used in low-power devices.
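A minimal timing harness in the spirit of this complexity-focused evaluation can be written with stock opencv-python, since SIFT, ORB, and BRISK all ship in the main module. The test image path is hypothetical:

```python
# Compare detector/descriptor extraction cost across feature types.
import time
import cv2

img = cv2.imread("test.png", cv2.IMREAD_GRAYSCALE)  # hypothetical file
features = {
    "SIFT": cv2.SIFT_create(),
    "ORB": cv2.ORB_create(),
    "BRISK": cv2.BRISK_create(),
}
for name, f in features.items():
    t0 = time.perf_counter()
    kp, desc = f.detectAndCompute(img, None)
    dt = (time.perf_counter() - t0) * 1000
    print(f"{name}: {len(kp)} keypoints in {dt:.1f} ms")
```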

Journal ArticleDOI
TL;DR: Experimental results show that the proposed Stereo Color Histogram Equalization (SCHE) method produces both accurate depth maps and color-consistent stereo images, even for stereo images with severe radiometric differences.
Abstract: In this paper, we propose a method that infers both accurate depth maps and color-consistent stereo images for radiometrically varying stereo images. In general, stereo matching and enforcing color consistency between stereo images is a chicken-and-egg problem, since it is not trivial to achieve both goals simultaneously. Hence, we have developed an iterative framework in which these two processes can boost each other. First, we transform the input color images to log-chromaticity color space, in which a linear relationship can be established when constructing a joint pdf of the transformed left and right color images. From this joint pdf, we can estimate a linear function that relates the corresponding pixels in the stereo images. Based on this linear property, we present a new stereo matching cost by combining Mutual Information (MI), the SIFT descriptor, and segment-based plane-fitting to robustly find correspondences for stereo image pairs that undergo radiometric variations. Meanwhile, we devise a Stereo Color Histogram Equalization (SCHE) method to produce color-consistent stereo image pairs, which conversely boosts the disparity map estimation. Experimental results show that our method produces both accurate depth maps and color-consistent stereo images, even for stereo images with severe radiometric differences.
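The log-chromaticity transform at the start of this pipeline is straightforward: taking logs of channel ratios turns multiplicative radiometric differences into additive (linear) ones. A minimal version, assuming the green channel as reference and a small epsilon to guard against log(0):

```python
# Log-chromaticity transform (G as the reference channel is an assumption).
import numpy as np

def log_chromaticity(img_bgr, eps=1.0):
    """Map a BGR image to 2-channel log-chromaticity space."""
    b, g, r = [img_bgr[..., i].astype(np.float64) + eps for i in range(3)]
    return np.dstack([np.log(r / g), np.log(b / g)])
```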

Book ChapterDOI
09 Sep 2013
TL;DR: A novel framework for the real-time detection and tracking of an unknown object in a video stream, using multiple keypoint-based methods inside a fallback model to correctly localize the object frame by frame, exploiting the strengths of each method.
Abstract: In this paper we propose a novel framework for the real-time detection and tracking of an unknown object in a video stream. We decompose the problem into two separate modules: detection and learning. The detection module can use multiple keypoint-based methods (ORB, FREAK, BRISK, SIFT, SURF and more) inside a fallback model, to correctly localize the object frame by frame, exploiting the strengths of each method. The learning module updates the object model with a growing and pruning approach, to account for changes in its appearance, and extracts negative samples to further improve the detector performance. To show the effectiveness of the proposed tracking-by-detection algorithm, we present quantitative results on a number of challenging sequences where the target object goes through changes of pose, scale and illumination.
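A sketch of the fallback idea: try fast keypoint methods first and fall back to slower, more robust ones when too few matches survive a ratio test. The method order, thresholds, and detector set are illustrative assumptions, not the paper's configuration:

```python
# Keypoint detection with a fallback cascade (illustrative configuration).
import cv2

def detect_with_fallback(img_obj, img_frame, min_matches=15):
    methods = [("ORB", cv2.ORB_create(), cv2.NORM_HAMMING),
               ("BRISK", cv2.BRISK_create(), cv2.NORM_HAMMING),
               ("SIFT", cv2.SIFT_create(), cv2.NORM_L2)]
    for name, feat, norm in methods:
        k1, d1 = feat.detectAndCompute(img_obj, None)
        k2, d2 = feat.detectAndCompute(img_frame, None)
        if d1 is None or d2 is None or len(d2) < 2:
            continue
        matcher = cv2.BFMatcher(norm)
        good = [m for m, n in matcher.knnMatch(d1, d2, k=2)
                if m.distance < 0.75 * n.distance]
        if len(good) >= min_matches:
            return name, k1, k2, good   # first method that succeeds wins
    return None, [], [], []
```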

Journal ArticleDOI
TL;DR: Experimental results show that PSIFT outperforms significantly the state-of-the-art ASIFT, SIFT, Random Ferns, Harris-Affine, MSER and Hessian Affine, especially when images suffer severe perspective distortion.

Proceedings ArticleDOI
01 Dec 2013
TL;DR: An automatic classification approach for the Nile Tilapia fish using support vector machines (SVMs) algorithm in conjunction with feature extraction techniques based on Scale Invariant Feature Transform (SIFT) and Speeded Up Robust Features (SURF) algorithms is introduced.
Abstract: Commonly, aquatic experts use traditional methods such as casting nets or underwater human monitoring for detecting existence and quantities of different species of fish. However, the recent breakthrough in digital cameras and storage abilities, with consequent cost reduction, can be utilized for automatically observing different underwater species. This article introduces an automatic classification approach for the Nile Tilapia fish using support vector machines (SVMs) algorithm in conjunction with feature extraction techniques based on Scale Invariant Feature Transform (SIFT) and Speeded Up Robust Features (SURF) algorithms. The core of this approach is to apply the feature extraction algorithms in order to describe local features extracted from a set of fish images. Then, the proposed approach classifies the fish images using a number of support vector machines classifiers to differentiate between fish species. Experimental results obtained show that the support vector machines algorithm outperformed other machine learning techniques, such as artificial neural networks (ANN) and k-nearest neighbor (k-NN) algorithms, in terms of the overall classification accuracy.

Journal ArticleDOI
TL;DR: A robust segmentation method and an adaptive SURF descriptor are proposed for iris recognition; the proposed approach achieves improved accuracy at reduced computational cost.

Journal ArticleDOI
TL;DR: The framework of the proposed descriptor consists of the following steps: normalizing the elliptical neighboring region, transforming it to affine scale-space, improving the SIFT descriptor with polar-histogram orientation bins, and integrating the mirror reflection invariant.

Proceedings ArticleDOI
25 Aug 2013
TL;DR: Experiments show that the Co-HOG based technique clearly outperforms state-of-the-art techniques that use HOG, Scale Invariant Feature Transform (SIFT), and Maximally Stable Extremal Regions (MSER).
Abstract: Scene text recognition is a fundamental step in End-to-End applications where traditional optical character recognition (OCR) systems often fail to produce satisfactory results. This paper proposes a technique that uses co-occurrence histogram of oriented gradients (Co-HOG) to recognize the text in scenes. Compared with histogram of oriented gradients (HOG), Co-HOG is a more powerful tool that captures spatial distribution of neighboring orientation pairs instead of just a single gradient orientation. At the same time, it is more efficient compared with HOG and therefore more suitable for real-time applications. The proposed scene text recognition technique is evaluated on ICDAR2003 character dataset and Street View Text (SVT) dataset. Experiments show that the Co-HOG based technique clearly outperforms state-of-the-art techniques that use HOG, Scale Invariant Feature Transform (SIFT), and Maximally Stable Extremal Regions (MSER).
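The core of Co-HOG is to histogram co-occurring pairs of quantized gradient orientations at a fixed pixel offset, rather than single orientations as in HOG. A minimal sketch over a whole image follows; the paper works on character-level blocks with multiple offsets, so the single offset and bin count here are illustrative:

```python
# Sketch of a co-occurrence histogram of oriented gradients (Co-HOG).
import numpy as np
import cv2

def cohog(img_gray, n_bins=8, offset=(0, 1)):
    gx = cv2.Sobel(img_gray, cv2.CV_64F, 1, 0)
    gy = cv2.Sobel(img_gray, cv2.CV_64F, 0, 1)
    ang = np.arctan2(gy, gx)                          # range [-pi, pi]
    bins = ((ang + np.pi) / (2 * np.pi) * n_bins).astype(int) % n_bins
    dy, dx = offset
    # Pair each pixel's orientation bin with its neighbor's at the offset.
    a = bins[max(0, -dy):bins.shape[0] - max(0, dy),
             max(0, -dx):bins.shape[1] - max(0, dx)]
    b = bins[max(0, dy):, max(0, dx):][:a.shape[0], :a.shape[1]]
    hist = np.bincount((a * n_bins + b).ravel(), minlength=n_bins * n_bins)
    return hist.astype(float) / max(hist.sum(), 1.0)  # L1-normalize
```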

Proceedings ArticleDOI
01 Sep 2013
TL;DR: This paper compares the most popular feature descriptors in a typical graph-based VSLAM algorithm using two publicly available datasets, to determine the impact of the choice of feature descriptor on accuracy and speed in a realistic scenario.
Abstract: Feature detection and feature description play an important part in Visual Simultaneous Localization and Mapping (VSLAM). Visual features are commonly used to efficiently estimate the motion of the camera (visual odometry) and link the current image to previously visited parts of the environment (place recognition, loop closure). Gradient histogram-based feature descriptors, like SIFT and SURF, are frequently used for this task. Recently introduced binary descriptors, such as BRIEF or BRISK, claim to offer similar capabilities at lower computational cost. In this paper, we compare the most popular feature descriptors in a typical graph-based VSLAM algorithm using two publicly available datasets to determine the impact of the choice of feature descriptor in terms of accuracy and speed in a realistic scenario.

Journal ArticleDOI
05 Nov 2013-Sensors
TL;DR: A vein pattern extraction method to extract the finger-vein shape and orientation features is proposed and a region-based matching scheme is investigated by employing the Scale Invariant Feature Transform (SIFT) matching method to accommodate the potential local and global variations at the same time.
Abstract: This paper presents a new scheme to improve the performance of finger-vein identification systems. Firstly, a vein pattern extraction method to extract the finger-vein shape and orientation features is proposed. Secondly, to accommodate the potential local and global variations at the same time, a region-based matching scheme is investigated by employing the Scale Invariant Feature Transform (SIFT) matching method. Finally, the finger-vein shape, orientation and SIFT features are combined to further enhance the performance. The experimental results on databases of 426 and 170 fingers demonstrate the consistent superiority of the proposed approach.

Proceedings ArticleDOI
09 Jul 2013
TL;DR: This work presents an online terrain classification system which uses only a monocular camera with a feature-based terrain classification algorithm which is robust to changes in illumination and view points and is successfully applied to the small hexapod robot AMOS II.
Abstract: Legged robots need to be able to classify and recognize different terrains to adapt their gait accordingly. Recent works in terrain classification use different types of sensors (like stereovision, 3D laser range, and tactile sensors) and their combination. However, such sensor systems require more computing power, add extra load to legged robots, and/or might be difficult to install on a small legged robot. In this work, we present an online terrain classification system. It uses only a monocular camera with a feature-based terrain classification algorithm which is robust to changes in illumination and viewpoint. For this algorithm, we extract local features of terrains using either the Scale Invariant Feature Transform (SIFT) or Speeded Up Robust Features (SURF). We encode the features using the Bag of Words (BoW) technique, and then classify the words using Support Vector Machines (SVMs) with a radial basis function kernel. We compare this feature-based approach with a color-based approach on the Caltech-256 benchmark as well as eight different terrain image sets (grass, gravel, pavement, sand, asphalt, floor, mud, and fine gravel). For terrain images, we observe up to 90% accuracy with the feature-based approach. Finally, this online terrain classification system is successfully applied to our small hexapod robot AMOS II. The output of the system providing terrain information is used as an input to its neural locomotion control to trigger an energy-efficient gait while traversing different terrains.

Posted Content
TL;DR: The goal of this survey is to give an overview of this model and introduce different strategies when building the system based on this model.
Abstract: This article gives a survey of the bag-of-words (BoW), or bag-of-features, model in image retrieval systems. In recent years, large-scale image retrieval has shown significant potential in both industry applications and research problems. As local descriptors like SIFT demonstrate great discriminative power in solving vision problems like object recognition, image classification and annotation, more and more state-of-the-art large-scale image retrieval systems rely on them. A common way to achieve this is to first quantize local descriptors into visual words and then apply scalable textual indexing and retrieval schemes. We call this the bag-of-words or bag-of-features model. The goal of this survey is to give an overview of this model and to introduce different strategies for building systems based on it.
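The "scalable textual indexing" step the survey refers to is typically an inverted file over visual words: each word maps to the images containing it, so a query only touches images sharing at least one word. A minimal sketch, with word IDs standing in for quantized descriptors:

```python
# Toy inverted index over visual words, as used in BoW image retrieval.
from collections import defaultdict

class InvertedIndex:
    def __init__(self):
        self.postings = defaultdict(set)     # visual word -> image ids

    def add(self, image_id, words):
        for w in set(words):                 # words: quantized descriptors
            self.postings[w].add(image_id)

    def query(self, words):
        """Rank candidate images by how many query words they share."""
        scores = defaultdict(int)
        for w in set(words):
            for image_id in self.postings[w]:
                scores[image_id] += 1
        return sorted(scores.items(), key=lambda kv: -kv[1])

index = InvertedIndex()
index.add("img1", [3, 17, 17, 42])
index.add("img2", [3, 99])
print(index.query([3, 42]))   # [('img1', 2), ('img2', 1)]
```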

Patent
03 Apr 2013
TL;DR: In this paper, the authors proposed a remote sensing image registration method for a multi-source sensor, relating to image processing technology, which consists of the following steps: respectively carrying out scale-invariant feature transform (SIFT) on a reference image and a registration image and extracting feature points; calculating the nearest and second-nearest Euclidean distances of the feature points between the image to be registered and the reference image, and screening optimal matching point pairs according to their ratio; and rejecting erroneous registration points through a random sample consensus algorithm to screen the original registration point pairs.
Abstract: The invention provides a remote sensing image registration method for a multi-source sensor, relating to image processing technology. The method comprises the following steps: respectively carrying out scale-invariant feature transform (SIFT) on a reference image and a registration image and extracting feature points; calculating the nearest and second-nearest Euclidean distances of the feature points between the image to be registered and the reference image, and screening optimal matching point pairs according to their ratio; rejecting erroneous registration points through a random sample consensus algorithm and screening the original registration point pairs; calculating distribution quality parameters of the feature point pairs and selecting uniformly distributed, effective control points according to a feature point weight coefficient; searching for the optimal registration point among the control points of the image to be registered according to a mutual information similarity criterion, thus obtaining optimal registration point pairs of the control points; and acquiring the geometric deformation parameters of the image to be registered by polynomial parameter transformation, thus realizing accurate registration of the image to be registered with the reference image. The method offers high calculation speed and high registration precision, and can meet the registration requirements of multi-sensor, multi-temporal and multi-view remote sensing images.