
Showing papers on "Feature (computer vision) published in 2011"


Proceedings Article
12 Dec 2011
TL;DR: This paper considers fully connected CRF models defined on the complete set of pixels in an image and proposes a highly efficient approximate inference algorithm in which the pairwise edge potentials are defined by a linear combination of Gaussian kernels.
Abstract: Most state-of-the-art techniques for multi-class image segmentation and labeling use conditional random fields defined over pixels or image regions. While region-level models often feature dense pairwise connectivity, pixel-level models are considerably larger and have only permitted sparse graph structures. In this paper, we consider fully connected CRF models defined on the complete set of pixels in an image. The resulting graphs have billions of edges, making traditional inference algorithms impractical. Our main contribution is a highly efficient approximate inference algorithm for fully connected CRF models in which the pairwise edge potentials are defined by a linear combination of Gaussian kernels. Our experiments demonstrate that dense connectivity at the pixel level substantially improves segmentation and labeling accuracy.

3,233 citations
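
For intuition, here is a minimal numpy sketch of the mean-field update this paper builds on, with a brute-force O(N^2) Gaussian pairwise term and Potts compatibility on a toy image. The paper's actual contribution is evaluating this message passing in near-linear time via high-dimensional filtering, which this naive version does not attempt; all names and parameter values are illustrative.

    import numpy as np

    def mean_field_dense_crf(unary, pos, col, n_iters=5,
                             theta_p=3.0, theta_c=10.0, weight=3.0):
        # unary: (N, K) unary potentials (negative log-probabilities)
        # pos: (N, 2) pixel coordinates; col: (N, 3) colors (float arrays)
        d_pos = ((pos[:, None, :] - pos[None, :, :]) ** 2).sum(-1)
        d_col = ((col[:, None, :] - col[None, :, :]) ** 2).sum(-1)
        kernel = weight * np.exp(-d_pos / (2 * theta_p ** 2)
                                 - d_col / (2 * theta_c ** 2))
        np.fill_diagonal(kernel, 0.0)           # no message to self
        row_sum = kernel.sum(1, keepdims=True)  # total incoming mass
        Q = np.exp(-unary)
        Q /= Q.sum(1, keepdims=True)            # init: softmax of the unary
        for _ in range(n_iters):
            msg = kernel @ Q                    # naive O(N^2) message passing
            pairwise = row_sum - msg            # Potts: mass on labels != l
            Q = np.exp(-unary - pairwise)
            Q /= Q.sum(1, keepdims=True)
        return Q.argmax(1)                      # MAP label per pixel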


Proceedings ArticleDOI
20 Jun 2011
TL;DR: This work introduces a novel descriptor based on motion boundary histograms, which is robust to camera motion and consistently outperforms other state-of-the-art descriptors, in particular in uncontrolled realistic videos.
Abstract: Feature trajectories have been shown to be efficient for representing videos. Typically, they are extracted using the KLT tracker or by matching SIFT descriptors between frames. However, the quality as well as quantity of these trajectories is often insufficient. Inspired by the recent success of dense sampling in image classification, we propose an approach to describe videos by dense trajectories. We sample dense points from each frame and track them based on displacement information from a dense optical flow field. Given a state-of-the-art optical flow algorithm, our trajectories are robust to fast irregular motions as well as shot boundaries. Additionally, dense trajectories cover the motion information in videos well. We also investigate how to design descriptors to encode the trajectory information. We introduce a novel descriptor based on motion boundary histograms, which is robust to camera motion. This descriptor consistently outperforms other state-of-the-art descriptors, in particular in uncontrolled realistic videos. We evaluate our video description in the context of action classification with a bag-of-features approach. Experimental results show a significant improvement over the state of the art on four datasets of varying difficulty, i.e., KTH, YouTube, Hollywood2 and UCF sports.

2,383 citations
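
A hedged sketch of the sampling-and-tracking step using OpenCV's Farneback flow. The paper uses a median-filtered dense flow field and prunes static or erratic tracks, and the MBH descriptor (which histograms the spatial derivatives of the flow) is omitted here; function and parameter names are illustrative.

    import cv2
    import numpy as np

    def track_dense_points(frames, step=5, max_len=15):
        # Sample a dense grid in the first frame, then follow each point
        # through the dense optical flow field.
        h, w = frames[0].shape[:2]
        ys, xs = np.mgrid[step // 2:h:step, step // 2:w:step]
        tracks = [[(float(x), float(y))]
                  for x, y in zip(xs.ravel(), ys.ravel())]
        prev = cv2.cvtColor(frames[0], cv2.COLOR_BGR2GRAY)
        for frame in frames[1:]:
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            flow = cv2.calcOpticalFlowFarneback(prev, gray, None,
                                                0.5, 3, 15, 3, 5, 1.2, 0)
            for t in tracks:
                x, y = t[-1]
                if len(t) < max_len and 0 <= x < w and 0 <= y < h:
                    dx, dy = flow[int(y), int(x)]   # displacement at point
                    t.append((x + float(dx), y + float(dy)))
            prev = gray
        return tracks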


Journal ArticleDOI
TL;DR: A systematic, comprehensive and up-to-date review of perceptual visual quality metrics (PVQMs) to predict picture quality according to human perception.

895 citations


Proceedings ArticleDOI
16 Jul 2011
TL;DR: In this paper, a joint framework for unsupervised feature selection is proposed to select the most discriminative feature subset from the whole feature set in batch mode, where the class label of input data can be predicted by a linear classifier.
Abstract: Compared with supervised learning for feature selection, it is much more difficult to select the discriminative features in unsupervised learning due to the lack of label information. Traditional unsupervised feature selection algorithms usually select the features which best preserve the data distribution, e.g., the manifold structure, of the whole feature set. Under the assumption that the class label of input data can be predicted by a linear classifier, we incorporate discriminative analysis and l2,1-norm minimization into a joint framework for unsupervised feature selection. Different from existing unsupervised feature selection algorithms, our algorithm selects the most discriminative feature subset from the whole feature set in batch mode. Extensive experiments on different data types demonstrate the effectiveness of our algorithm.

613 citations
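
A sketch of the l2,1-norm machinery under explicit assumptions: k-means pseudo-labels stand in for the paper's learned linear classifier, and a standard iteratively reweighted solver minimizes ||XW - Y||_F^2 + gamma * ||W||_{2,1}. Features are then ranked by the row norms of W, which the l2,1 penalty drives to zero for discarded features; this is a simplified stand-in, not the authors' exact formulation.

    import numpy as np
    from sklearn.cluster import KMeans

    def l21_feature_scores(X, n_clusters=3, gamma=1.0, n_iters=20):
        # Pseudo-label matrix Y (assumption: k-means surrogate; the
        # paper jointly learns the classifier instead).
        labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(X)
        Y = np.eye(n_clusters)[labels]          # (n, c) cluster indicators
        d = X.shape[1]
        D = np.eye(d)
        for _ in range(n_iters):   # IRLS for ||XW-Y||^2 + gamma*||W||_{2,1}
            W = np.linalg.solve(X.T @ X + gamma * D, X.T @ Y)
            row_norms = np.maximum(np.linalg.norm(W, axis=1), 1e-8)
            D = np.diag(1.0 / (2.0 * row_norms))
        return np.linalg.norm(W, axis=1)   # larger = more useful feature

Features with the largest scores form the selected subset, e.g., scores.argsort()[::-1][:k].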


Patent
04 Nov 2011
TL;DR: In this paper, a surgical instrument includes a handpiece having a user input feature and a user feedback feature, with an end effector disposed at the distal end of a shaft assembly extending from the handpiece.
Abstract: A surgical instrument includes a handpiece having a user input feature and a user feedback feature. A shaft assembly extends distally from the handpiece. An end effector is disposed at a distal end of the shaft assembly. The end effector includes an active feature responsive to actuation of the user input feature. The active feature is operable to operate on tissue in response to actuation of the user input feature. The user feedback feature is operable to provide feedback to the user that indicates information relating to operation of the end effector. The feedback may include haptic, visual, and/or auditory feedback.

527 citations


Proceedings Article
14 Jul 2011
TL;DR: In this paper, a generalized Fisher score was proposed to jointly select features, which maximizes the lower bound of traditional Fisher score by solving a quadratically constrained linear programming (QCLP) problem.
Abstract: Fisher score is one of the most widely used supervised feature selection methods. However, it selects each feature independently according to its score under the Fisher criterion, which leads to a suboptimal subset of features. In this paper, we present a generalized Fisher score to jointly select features. It aims at finding a subset of features that maximizes the lower bound of the traditional Fisher score. The resulting feature selection problem is a mixed integer program, which can be reformulated as a quadratically constrained linear program (QCLP). It is solved by a cutting plane algorithm, in each iteration of which a multiple kernel learning problem is solved alternately by multivariate ridge regression and projected gradient descent. Experiments on benchmark data sets indicate that the proposed method outperforms Fisher score as well as many other state-of-the-art feature selection methods.

472 citations
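
For reference, a small numpy implementation of the classical per-feature Fisher score that this paper generalizes; the joint, subset-level criterion and the QCLP solver are beyond a few lines, and variable names here are illustrative.

    import numpy as np

    def fisher_scores(X, y):
        # Classical Fisher score, computed independently per feature:
        # between-class scatter divided by within-class scatter.
        classes, counts = np.unique(y, return_counts=True)
        mean_all = X.mean(0)
        num = np.zeros(X.shape[1])
        den = np.zeros(X.shape[1])
        for c, n_c in zip(classes, counts):
            Xc = X[y == c]
            num += n_c * (Xc.mean(0) - mean_all) ** 2
            den += n_c * Xc.var(0)
        return num / np.maximum(den, 1e-12)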


Journal ArticleDOI
TL;DR: This work presents a carefully designed dataset of video sequences of planar textures with ground truth, which includes various geometric changes, lighting conditions, and levels of motion blur, and presents a comprehensive quantitative evaluation of detector-descriptor-based visual camera tracking based on this testbed.
Abstract: Applications for real-time visual tracking can be found in many areas, including visual odometry and augmented reality. Interest point detection and feature description form the basis of feature-based tracking, and a variety of algorithms for these tasks have been proposed. In this work, we present (1) a carefully designed dataset of video sequences of planar textures with ground truth, which includes various geometric changes, lighting conditions, and levels of motion blur, and which may serve as a testbed for a variety of tracking-related problems, and (2) a comprehensive quantitative evaluation of detector-descriptor-based visual camera tracking based on this testbed. We evaluate the impact of individual algorithm parameters, compare algorithms for both detection and description in isolation, as well as all detector-descriptor combinations as a tracking solution. In contrast to existing evaluations, which aim at different tasks such as object recognition and have limited validity for visual tracking, our evaluation is geared towards this application in all relevant factors (performance measures, testbed, candidate algorithms). To our knowledge, this is the first work that comprehensively compares these algorithms in this context, and in particular, on video streams.

441 citations


Journal ArticleDOI
TL;DR: A general-purpose deformable registration algorithm referred to as "DRAMMS" is presented, which extracts Gabor attributes at each voxel and selects the optimal components, so that they form a highly distinctive morphological signature reflecting the anatomical context around each voxel in a multi-scale and multi-resolution fashion.

420 citations


Patent
09 May 2011
TL;DR: In this article, the authors describe an interactive user interface for capturing a frame of image data having a representation of a feature and provide user-perceptible hints for guiding a user to alter positioning of the device to enhance a capability for identifying the linear features defining a candidate quadrilateral form in the image data.
Abstract: Devices, methods, and software are disclosed for an interactive user interface for capturing a frame of image data having a representation of a feature. In an illustrative embodiment, a device includes an imaging subsystem, one or more memory components, and one or more processors. The imaging subsystem is capable of providing image data representative of light incident on said imaging subsystem. The one or more memory components include at least a first memory component operatively capable of storing an input frame of the image data. The one or more processors may be enabled for performing various steps. One step may include receiving the image data from the first memory component. Another step may include attempting to identify linear features defining a candidate quadrilateral form in the image data. Another step may include providing user-perceptible hints for guiding a user to alter positioning of the device to enhance a capability for identifying the linear features defining a candidate quadrilateral form in the image data.

407 citations
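
Not the patented method, but the "identify a candidate quadrilateral" step is close in spirit to a standard OpenCV contour pipeline; a hedged sketch, with thresholds chosen arbitrarily:

    import cv2
    import numpy as np

    def find_candidate_quad(image_bgr):
        # Edge map -> contours -> largest convex 4-vertex polygon, a
        # common way to look for a document-like quadrilateral.
        gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
        edges = cv2.Canny(cv2.GaussianBlur(gray, (5, 5), 0), 50, 150)
        contours, _ = cv2.findContours(edges, cv2.RETR_LIST,
                                       cv2.CHAIN_APPROX_SIMPLE)
        for c in sorted(contours, key=cv2.contourArea, reverse=True):
            approx = cv2.approxPolyDP(c, 0.02 * cv2.arcLength(c, True), True)
            if len(approx) == 4 and cv2.isContourConvex(approx):
                return approx.reshape(4, 2)
        return None   # no quad found: hint the user to reposition the device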


Proceedings ArticleDOI
18 Sep 2011
TL;DR: This paper applies large-scale algorithms for learning the features automatically from unlabeled data to construct highly effective classifiers for both detection and recognition to be used in a high accuracy end-to-end system.
Abstract: Reading text from photographs is a challenging problem that has received a significant amount of attention. Two key components of most systems are (i) text detection from images and (ii) character recognition, and many recent methods have been proposed to design better feature representations and models for both. In this paper, we apply methods recently developed in machine learning -- specifically, large-scale algorithms for learning the features automatically from unlabeled data -- and show that they allow us to construct highly effective classifiers for both detection and recognition to be used in a high accuracy end-to-end system.

402 citations
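
A compressed sketch of the unsupervised feature-learning recipe this line of work popularized: normalize and whiten random patches, then cluster them into a filter bank. The paper's full pipeline, including the convolutional encoding and the end-to-end system, is omitted; names and constants are assumptions.

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.feature_extraction.image import extract_patches_2d

    def learn_patch_dictionary(images, patch=8, n_atoms=64, per_image=100):
        # images: list of 2D grayscale arrays
        patches = np.vstack([
            extract_patches_2d(img, (patch, patch), max_patches=per_image)
              .reshape(per_image, -1)
            for img in images]).astype(np.float64)
        patches -= patches.mean(1, keepdims=True)        # brightness norm
        patches /= patches.std(1, keepdims=True) + 1e-8  # contrast norm
        cov = np.cov(patches, rowvar=False)
        vals, vecs = np.linalg.eigh(cov)
        zca = vecs @ np.diag(1.0 / np.sqrt(vals + 0.1)) @ vecs.T  # ZCA
        km = KMeans(n_clusters=n_atoms, n_init=4).fit(patches @ zca)
        return km.cluster_centers_, zca   # filter bank + whitening matrix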



Journal ArticleDOI
TL;DR: This work proposes two feature selection algorithms that use both a linear and a non-linear measure (linear correlation coefficient and mutual information) for the feature selection, and investigates their performance compared to a mutual information-based feature selection method.

Proceedings ArticleDOI
20 Jun 2011
TL;DR: It is shown that memorability is a stable property of an image that is shared across different viewers, and a database is introduced for which the probability that each picture will be remembered after a single view has been measured.
Abstract: When glancing at a magazine or browsing the Internet, we are continuously being exposed to photographs. Despite this overflow of visual information, humans are extremely good at remembering thousands of pictures along with some of their visual details. But not all images are equal in memory. Some stick in our minds, and others are forgotten. In this paper we focus on the problem of predicting how memorable an image will be. We show that memorability is a stable property of an image that is shared across different viewers. We introduce a database for which we have measured the probability that each picture will be remembered after a single view. We analyze image features and labels that contribute to making an image memorable, and we train a predictor based on global image descriptors. We find that predicting image memorability is a task that can be addressed with current computer vision techniques. Whereas making memorable images is a challenging task in visualization and photography, this work is a first attempt to quantify this useful quality of images.
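
The predictor itself is conceptually simple; a hedged sketch of the "global descriptors to memorability score" regression (the paper uses features such as GIST and spatial pyramids with support vector regression; generic descriptors and default hyperparameters are assumed here):

    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVR

    # X: (n_images, d) global image descriptors; y: (n_images,) measured
    # probabilities of being remembered after a single view.
    def train_memorability_predictor(X, y):
        model = make_pipeline(StandardScaler(), SVR(kernel='rbf', C=1.0))
        return model.fit(X, y)   # model.predict(X_new) scores unseen images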

Journal ArticleDOI
TL;DR: It is demonstrated that Lasso logistic regression, fused support vector machine, group Lasso and random forest models suffer from correlation bias, and two related methods for group selection based on feature clustering can be used for correcting the correlation bias.
Abstract: Motivation: Classification and feature selection of genomics or transcriptomics data is often hampered by the large number of features as compared with the small number of samples available. Moreover, features represented by probes that either have similar molecular functions (gene expression analysis) or genomic locations (DNA copy number analysis) are highly correlated. Classical model selection methods such as penalized logistic regression or random forest become unstable in the presence of high feature correlations. Sophisticated penalties such as group Lasso or fused Lasso can force the models to assign similar weights to correlated features and thus improve model stability and interpretability. In this article, we show that the measures of feature relevance corresponding to the above-mentioned methods are biased such that the weights of the features belonging to groups of correlated features decrease as the sizes of the groups increase, which leads to incorrect model interpretation and misleading feature ranking. Results: With simulation experiments, we demonstrate that Lasso logistic regression, fused support vector machine, group Lasso and random forest models suffer from correlation bias. Using simulations, we show that two related methods for group selection based on feature clustering can be used for correcting the correlation bias. These techniques also improve the stability and the accuracy of the baseline models. We apply all methods investigated to a breast cancer and a bladder cancer arrayCGH dataset in order to identify copy number aberrations predictive of tumor phenotype. Availability: R code can be found at: http://www.mpi-inf.mpg.de/~laura/Clustering.r. Contact: laura.tolosi@mpi-inf.mpg.de Supplementary information: Supplementary data are available at Bioinformatics online.
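
A sketch of the cluster-then-select idea for correcting correlation bias, under stated assumptions (average-linkage clustering on feature correlations and an L1 logistic model on cluster means; the authors' actual R code is linked in the abstract):

    import numpy as np
    from scipy.cluster.hierarchy import fcluster, linkage
    from sklearn.linear_model import LogisticRegression

    def cluster_then_select(X, y, corr_threshold=0.8):
        # Group features whose |correlation| exceeds the threshold,
        # average each group, then fit an L1 model on the group
        # representatives so correlated features share one weight
        # instead of splitting it.
        dist = 1.0 - np.abs(np.corrcoef(X, rowvar=False))
        cond = dist[np.triu_indices_from(dist, k=1)]   # condensed form
        Z = linkage(cond, method='average')
        groups = fcluster(Z, t=1.0 - corr_threshold, criterion='distance')
        reps = np.column_stack([X[:, groups == g].mean(1)
                                for g in np.unique(groups)])
        model = LogisticRegression(penalty='l1',
                                   solver='liblinear').fit(reps, y)
        return model, groups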

Journal ArticleDOI
TL;DR: In a series of speeded classification tasks, spontaneous mappings are found between the auditory feature of pitch and the visual features of vertical location, size, and spatial frequency, but not contrast.
Abstract: The brain may combine information from different sense modalities to enhance the speed and accuracy of detection of objects and events, and the choice of appropriate responses. There is mounting evidence that perceptual experiences that appear to be modality-specific are also influenced by activity from other sensory modalities, even in the absence of awareness of this interaction. In a series of speeded classification tasks, we found spontaneous mappings between the auditory feature of pitch and the visual features of vertical location, size, and spatial frequency but not contrast. By dissociating the task variables from the features that were cross-modally related, we find that the interactions happen in an automatic fashion and are possibly located at the perceptual level.

Proceedings ArticleDOI
28 Nov 2011
TL;DR: This paper presents a novel approach - Multiple Feature Hashing (MFH) to tackle both the accuracy and the scalability issues of NDVR and shows that the proposed method outperforms the state-of-the-art techniques in both accuracy and efficiency.
Abstract: Near-duplicate video retrieval (NDVR) has recently attracted a lot of research attention due to the exponential growth of online videos. It helps in many areas, such as copyright protection, video tagging, online video usage monitoring, etc. Most existing approaches use only a single feature to represent a video for NDVR. However, a single feature is often insufficient to characterize the video content. Besides, while accuracy is the main concern in the previous literature, the scalability of NDVR algorithms for large scale video datasets has rarely been addressed. In this paper, we present a novel approach - Multiple Feature Hashing (MFH) - to tackle both the accuracy and the scalability issues of NDVR. MFH preserves the local structure information of each individual feature and also globally considers the local structures of all the features to learn a group of hash functions which map the video keyframes into the Hamming space and generate a series of binary codes to represent the video dataset. We evaluate our approach on a public video dataset and a large scale video dataset consisting of 132,647 videos, which we collected from YouTube. The experiment results show that the proposed method outperforms the state-of-the-art techniques in both accuracy and efficiency.
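
A hedged stand-in for the retrieval side: the learned MFH hash functions are replaced here with random hyperplane LSH, but the produce-binary-codes-then-search-in-Hamming-space flow is the same. Names are illustrative.

    import numpy as np

    def hamming_codes(X, n_bits=64, seed=0):
        # Stand-in for learned hash functions: random hyperplane LSH
        # maps real-valued keyframe features to binary codes.
        rng = np.random.default_rng(seed)
        planes = rng.standard_normal((X.shape[1], n_bits))
        return (X @ planes > 0).astype(np.uint8)

    def nearest_by_hamming(query_code, db_codes):
        dists = (query_code ^ db_codes).sum(1)   # differing bits per code
        return np.argsort(dists)                 # most similar first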

Patent
22 Nov 2011
TL;DR: In this article, a bi-dimensional coded light pattern is projected on the object such that each of the identifiable feature types appears at most once on predefined sections of distinguishable epipolar lines.
Abstract: A method and apparatus for obtaining an image to determine a three-dimensional shape of a stationary or moving object using a bi-dimensional coded light pattern having a plurality of distinct identifiable feature types. The coded light pattern is projected on the object such that each of the identifiable feature types appears at most once on predefined sections of distinguishable epipolar lines. An image of the object is captured and the reflected feature types are extracted along with their locations on known epipolar lines in the captured image. Displacements of the reflected feature types along their epipolar lines from reference coordinates thereupon determine corresponding three-dimensional coordinates in space and thus a 3D mapping or model of the shape of the object at any point in time.

Journal ArticleDOI
TL;DR: Experimental results on a known database, achieving more than 94% accuracy in about 50 s for blood vessel detection, show that blood vessels can be effectively detected by applying this method to retinal images.

Abstract: Retinal images can be used in several applications, such as ocular fundus operations as well as human recognition. They also play important roles in the detection of some diseases in early stages, such as diabetes, which can be performed by comparing the states of retinal blood vessels. Intrinsic characteristics of retinal images make the blood vessel detection process difficult. Here, we propose a new algorithm to detect the retinal blood vessels effectively. Because of the high ability of the curvelet transform to represent edges, modifying the curvelet transform coefficients to enhance the retinal image edges better prepares the image for the segmentation stage. The directionality feature of the multistructure elements method makes it an effective tool for edge detection. Hence, morphology operators using multistructure elements are applied to the enhanced image in order to find the retinal image ridges. Afterward, morphological operators by reconstruction eliminate the ridges not belonging to the vessel tree while trying to preserve the thin vessels unchanged. In order to increase the efficiency of the morphological operators by reconstruction, they were applied using multistructure elements. A simple thresholding method along with connected components analysis (CCA) indicates the remaining ridges belonging to vessels. In order to utilize CCA more efficiently, we applied CCA and length filtering locally instead of considering the whole image. Experimental results on a known database, DRIVE, achieving more than 94% accuracy in about 50 s for blood vessel detection, show that blood vessels can be effectively detected by applying our method to retinal images.
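
A sketch of the multistructure-element morphology at the core of the method: open the (curvelet-enhanced) image with line elements at several orientations and keep the per-pixel maximum, so ridges survive whichever element aligns with them. Curvelet enhancement, the reconstruction operators and the local CCA stage are omitted; sizes and angles are assumptions.

    import numpy as np
    import cv2

    def line_kernel(length, angle_deg):
        # Linear structuring element at the given orientation.
        k = np.zeros((length, length), np.uint8)
        c = length // 2
        t = np.deg2rad(angle_deg)
        dx, dy = np.cos(t), np.sin(t)
        for r in range(-c, c + 1):
            k[int(round(c + r * dy)), int(round(c + r * dx))] = 1
        return k

    def multistructure_opening(enhanced, length=9, n_angles=12):
        # Max over openings with rotated line elements.
        outs = [cv2.morphologyEx(enhanced, cv2.MORPH_OPEN,
                                 line_kernel(length, a))
                for a in np.linspace(0, 180, n_angles, endpoint=False)]
        return np.max(outs, axis=0)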

Proceedings ArticleDOI
21 May 2011
TL;DR: In this paper, the authors present procedures for reverse engineering feature models based on a crucial heuristic for identifying parents, which can reduce the information a modeler has to consider from thousands of choices to typically five or less.
Abstract: Feature models describe the common and variable characteristics of a product line. Their advantages are well recognized in product line methods. Unfortunately, creating a feature model for an existing project is time-consuming and requires substantial effort from a modeler. We present procedures for reverse engineering feature models based on a crucial heuristic for identifying parents - the major challenge of this task. We also automatically recover constructs such as feature groups, mandatory features, and implies/excludes edges. We evaluate the technique on two large-scale software product lines with existing reference feature models--the Linux and eCos kernels--and FreeBSD, a project without a feature model. Our heuristic is effective across all three projects by ranking the correct parent among the top results for a vast majority of features. The procedures effectively reduce the information a modeler has to consider from thousands of choices to typically five or less.

Journal ArticleDOI
TL;DR: A context-sensitive technique for unsupervised change detection in multitemporal remote sensing images is presented, based on a fuzzy clustering approach that takes into account the spatial correlation between neighboring pixels of the difference image produced by comparing two images acquired over the same geographical area at different times.

Proceedings ArticleDOI
01 Nov 2011
TL;DR: The Clustered Viewpoint Feature Histogram (CVFH) is described and it is shown that it can be effectively used to recognize objects and their 6DOF pose in real environments, dealing with partial occlusion, noise and different sensor attributes for training and recognition data.
Abstract: This paper focuses on developing a fast and accurate 3D feature for use in object recognition and pose estimation for rigid objects. More specifically, given a set of CAD models of different objects representing our knowledge of the world - obtained using high-precision scanners that deliver accurate and noiseless data - our goal is to identify and estimate their pose in a real scene obtained by a depth sensor like the Microsoft Kinect. Borrowing ideas from the Viewpoint Feature Histogram (VFH) due to its computational efficiency and recognition performance, we describe the Clustered Viewpoint Feature Histogram (CVFH) and the camera's roll histogram, together with our recognition framework, to show that they can be effectively used to recognize objects and 6DOF pose in real environments, dealing with partial occlusion, noise and different sensor attributes for training and recognition data. We show that CVFH outperforms VFH and present recognition results using the Microsoft Kinect sensor on a set of 44 objects.
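
The reference implementation lives in the Point Cloud Library (C++); as a rough illustration, here is the viewpoint component of a VFH-style histogram in numpy, which CVFH computes per stable surface cluster rather than globally. A simplified sketch, not the full descriptor:

    import numpy as np

    def viewpoint_component(points, normals, viewpoint, n_bins=128):
        # Histogram of the angle between each surface normal and the
        # direction from the point to the sensor viewpoint.
        v = viewpoint - points
        v /= np.linalg.norm(v, axis=1, keepdims=True)
        cos_a = np.clip((v * normals).sum(1), -1.0, 1.0)
        hist, _ = np.histogram(cos_a, bins=n_bins, range=(-1.0, 1.0))
        return hist / max(hist.sum(), 1)   # normalized histogram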

Journal ArticleDOI
TL;DR: This paper presents a method to quickly extract color and texture features of an image for content-based image retrieval (CBIR), and shows that retrieval with the fused features produces perceptually better results than retrieval with any single feature.

Journal ArticleDOI
TL;DR: Simulated and experimental results demonstrate the merits of the proposed approach, particularly in situations of high clutter and data association ambiguity.
Abstract: This paper proposes an integrated Bayesian framework for feature-based simultaneous localization and map building (SLAM) in the general case of uncertain feature number and data association. By modeling the measurements and feature map as random finite sets (RFSs), a formulation of the feature-based SLAM problem is presented that jointly estimates the number and location of the features, as well as the vehicle trajectory. More concisely, the joint posterior distribution of the set-valued map and vehicle trajectory is propagated forward in time as measurements arrive, thereby incorporating both data association and feature management into a single recursion. Furthermore, the Bayes optimality of the proposed approach is established. A first-order solution, which is coined as the probability hypothesis density (PHD) SLAM filter, is derived, which jointly propagates the posterior PHD of the map and the posterior distribution of the vehicle trajectory. A Rao-Blackwellized (RB) implementation of the PHD-SLAM filter is proposed based on the Gaussian-mixture PHD filter (for the map) and a particle filter (for the vehicle trajectory). Simulated and experimental results demonstrate the merits of the proposed approach, particularly in situations of high clutter and data association ambiguity.
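
For orientation, the generic PHD recursion that the paper's filter instantiates (conditioned on the vehicle trajectory), in standard notation: b_k is the birth intensity, p_S and p_D the survival and detection probabilities, g_k the measurement likelihood and kappa_k the clutter intensity. This is the textbook form, not the paper's trajectory-conditioned derivation:

    v_{k|k-1}(x) = b_k(x) + \int p_S(\zeta)\, f_{k|k-1}(x \mid \zeta)\, v_{k-1}(\zeta)\, d\zeta

    v_k(x) = v_{k|k-1}(x) \Big[ 1 - p_D(x) + \sum_{z \in Z_k} \frac{p_D(x)\, g_k(z \mid x)}{\kappa_k(z) + \int p_D(\xi)\, g_k(z \mid \xi)\, v_{k|k-1}(\xi)\, d\xi} \Big]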

Patent
27 May 2011
TL;DR: In this paper, the design consists of the feature of ornamentation of a display screen shown in solid lines in the drawing.
Abstract: The design consists of the feature of ornamentation of a Display Screen shown in solid lines in the drawing.

Journal ArticleDOI
TL;DR: A least square technique is used to learn the weights associated with these maps from subjects freely fixating natural scenes drawn from four recent eye-tracking data sets, and this model outperforms several state-of-the-art saliency algorithms.
Abstract: Inspired by the primate visual system, computational saliency models decompose visual input into a set of feature maps across spatial scales in a number of pre-specified channels. The outputs of these feature maps are summed to yield the final saliency map. Here we use a least square technique to learn the weights associated with these maps from subjects freely fixating natural scenes drawn from four recent eye-tracking data sets. Depending on the data set, the weights can be quite different, with the face and orientation channels usually more important than color and intensity channels. Inter-subject differences are negligible. We also model a bias toward fixating at the center of images and consider both time-varying and constant factors that contribute to this bias. To compensate for the inadequacy of the standard method to judge performance (area under the ROC curve), we use two other metrics to comprehensively assess performance. Although our model retains the basic structure of the standard saliency model, it outperforms several state-of-the-art saliency algorithms. Furthermore, the simple structure makes the results applicable to numerous studies in psychophysics and physiology and leads to an extremely easy implementation for real-world applications.
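
The learning step reduces to ordinary least squares; a minimal sketch, assuming the channel responses have been sampled at fixated and control locations (names illustrative):

    import numpy as np

    # F: (n_samples, n_channels) responses of each feature map (color,
    # intensity, orientation, face, ...) at sampled image locations;
    # y: 1.0 for fixated locations, 0.0 for control locations.
    def learn_channel_weights(F, y):
        w, *_ = np.linalg.lstsq(F, y, rcond=None)
        return w   # saliency map = sum_i w[i] * feature_map_i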

Journal ArticleDOI
TL;DR: A hybrid feature selection method which combines two families of feature selection methods - the filters and the wrappers - is introduced, showing that equal or better prediction accuracy can be achieved with a smaller feature set.
Abstract: Feature selection aims at finding the most relevant features of a problem domain. It is very helpful in improving computational speed and prediction accuracy. However, identification of useful features from hundreds or even thousands of related features is a nontrivial task. In this paper, we introduce a hybrid feature selection method which combines two feature selection methods - the filters and the wrappers. Candidate features are first selected from the original feature set via computationally-efficient filters. The candidate feature set is further refined by more accurate wrappers. This hybrid mechanism takes advantage of both the filters and the wrappers. The mechanism is examined by two bioinformatics problems, namely, protein disordered region prediction and gene selection in microarray cancer data. Experimental results show that equal or better prediction accuracy can be achieved with a smaller feature set. These feature subsets can be obtained in a reasonable time period.
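
A compact sklearn sketch of the filter-then-wrapper pattern (univariate F-scores for the filter stage, with recursive feature elimination standing in for the wrapper; the paper's bioinformatics-specific choices differ):

    from sklearn.feature_selection import SelectKBest, f_classif, RFE
    from sklearn.linear_model import LogisticRegression

    def hybrid_select(X, y, n_filter=200, n_final=20):
        # Stage 1 (filter): keep the n_filter features with the best
        # univariate scores; cheap, trims the search space.
        filt = SelectKBest(f_classif, k=min(n_filter, X.shape[1])).fit(X, y)
        X_f = filt.transform(X)
        # Stage 2 (wrapper-style): eliminate features recursively with
        # an actual classifier to refine the candidate set.
        wrap = RFE(LogisticRegression(max_iter=1000),
                   n_features_to_select=n_final).fit(X_f, y)
        return filt, wrap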

Book ChapterDOI
18 Sep 2011
TL;DR: A new similarity metric for multi-modal registration, the non-local shape descriptor, that is robust against the most considerable differences between modalities, such as non-functional intensity relations, different amounts of noise and non-uniform bias fields is proposed.
Abstract: Deformable registration of images obtained from different modalities remains a challenging task in medical image analysis. This paper addresses this problem and proposes a new similarity metric for multi-modal registration, the non-local shape descriptor. It aims to extract the shape of anatomical features in a non-local region. By utilizing the dense evaluation of shape descriptors, this new measure bridges the gap between intensity-based and geometric feature-based similarity criteria. Our new metric allows for accurate and reliable registration of clinical multi-modal datasets and is robust against the most considerable differences between modalities, such as non-functional intensity relations, different amounts of noise and non-uniform bias fields. The measure has been implemented in a non-rigid diffusion-regularized registration framework. It has been applied to synthetic test images and challenging clinical MRI and CT chest scans. Experimental results demonstrate its advantages over the most commonly used similarity metric - mutual information, and show improved alignment of anatomical landmarks.
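
In the spirit of the non-local shape descriptor, a minimal 2D self-similarity sketch: patchwise SSD between the image and shifted copies of itself, mapped to similarities and normalized, yielding a representation comparable across modalities. Offsets, patch size and the variance term are assumptions; the paper's descriptor and its non-local search region are richer.

    import numpy as np
    from scipy.ndimage import shift, uniform_filter

    def self_similarity_descriptor(img, offsets=((0, 1), (1, 0),
                                                 (0, -1), (-1, 0)),
                                   patch=3, sigma2=0.1):
        # img: 2D float array (e.g., intensities scaled to [0, 1]).
        chans = []
        for dy, dx in offsets:
            diff2 = (img - shift(img, (dy, dx),
                                 order=1, mode='nearest')) ** 2
            ssd = uniform_filter(diff2, size=patch)  # mean SSD per patch
            chans.append(np.exp(-ssd / sigma2))      # distance -> similarity
        d = np.stack(chans, axis=-1)
        return d / (d.sum(-1, keepdims=True) + 1e-8)  # normalize per pixel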

Proceedings ArticleDOI
20 Jun 2011
TL;DR: This paper presents a “blind” image quality measure, where potentially neither the groundtruth image nor the degradation process are known, and uses a set of novel low-level image features in a machine learning framework to learn a mapping from these features to subjective image quality scores.
Abstract: It is often desirable to evaluate an image based on its quality. For many computer vision applications, a perceptually meaningful measure is the most relevant for evaluation; however, most commonly used measures do not map well to human judgements of image quality. A further complication of many existing image measures is that they require a reference image, which is often not available in practice. In this paper, we present a “blind” image quality measure, where potentially neither the groundtruth image nor the degradation process are known. Our method uses a set of novel low-level image features in a machine learning framework to learn a mapping from these features to subjective image quality scores. The image quality features stem from natural image measures and texture statistics. Experiments on a standard image quality benchmark dataset show that our method outperforms the current state of the art.

Journal ArticleDOI
TL;DR: Genetic programming is applied to perform automatic feature extraction from original feature database with the aim of improving the discriminatory performance of a classifier and reducing the input feature dimensionality at the same time.
Abstract: This paper applies genetic programming (GP) to perform automatic feature extraction from an original feature database, with the aim of improving the discriminatory performance of a classifier and reducing the input feature dimensionality at the same time. The tree structure of GP naturally represents the features, and a new function generated in this work automatically decides the number of features extracted. In experiments on two common epileptic EEG detection problems, the classification accuracy on the GP-based features is significantly higher than on the original features. Simultaneously, the dimension of the input features for the classifier is much smaller than that of the original features.

Journal ArticleDOI
TL;DR: While some feature ranking techniques performed similarly, the automatic hybrid search algorithm performed the best among the feature subset selection methods, and the performance of the defect prediction models either improved or remained unchanged when over 85% of the metrics were eliminated.
Abstract: The selection of software metrics for building software quality prediction models is a search-based software engineering problem. An exhaustive search for such metrics is usually not feasible due to limited project resources, especially if the number of available metrics is large. Defect prediction models are necessary in aiding project managers for better utilizing valuable project resources for software quality improvement. The efficacy and usefulness of a fault-proneness prediction model is only as good as the quality of the software measurement data. This study focuses on the problem of attribute selection in the context of software quality estimation. A comparative investigation is presented for evaluating our proposed hybrid attribute selection approach, in which feature ranking is first used to reduce the search space, followed by a feature subset selection. A total of seven different feature ranking techniques are evaluated, while four different feature subset selection approaches are considered. The models are trained using five commonly used classification algorithms. The case study is based on software metrics and defect data collected from multiple releases of a large real-world software system. The results demonstrate that while some feature ranking techniques performed similarly, the automatic hybrid search algorithm performed the best among the feature subset selection methods. Moreover, performances of the defect prediction models either improved or remained unchanged when over 85% of the software metrics were eliminated. Copyright © 2011 John Wiley & Sons, Ltd.