
Showing papers on "Feature extraction" published in 2006


Journal ArticleDOI
TL;DR: A novel algorithm for adapting dictionaries in order to achieve sparse signal representations, the K-SVD algorithm, an iterative method that alternates between sparse coding of the examples based on the current dictionary and a process of updating the dictionary atoms to better fit the data.
Abstract: In recent years there has been a growing interest in the study of sparse representation of signals. Using an overcomplete dictionary that contains prototype signal-atoms, signals are described by sparse linear combinations of these atoms. Applications that use sparse representation are many and include compression, regularization in inverse problems, feature extraction, and more. Recent activity in this field has concentrated mainly on the study of pursuit algorithms that decompose signals with respect to a given dictionary. Designing dictionaries to better fit the above model can be done by either selecting one from a prespecified set of linear transforms or adapting the dictionary to a set of training signals. Both of these techniques have been considered, but this topic is largely still open. In this paper we propose a novel algorithm for adapting dictionaries in order to achieve sparse signal representations. Given a set of training signals, we seek the dictionary that leads to the best representation for each member in this set, under strict sparsity constraints. We present a new method, the K-SVD algorithm, generalizing the K-means clustering process. K-SVD is an iterative method that alternates between sparse coding of the examples based on the current dictionary and a process of updating the dictionary atoms to better fit the data. The update of the dictionary columns is combined with an update of the sparse representations, thereby accelerating convergence. The K-SVD algorithm is flexible and can work with any pursuit method (e.g., basis pursuit, FOCUSS, or matching pursuit). We analyze this algorithm and demonstrate its results both on synthetic tests and in applications on real image data.

8,905 citations
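For orientation, the dictionary-learning problem described in the abstract above can be written roughly as follows. This is a sketch in assumed notation that does not appear in the abstract: the columns of Y are the training signals, D is the dictionary, the columns x_i of X are the sparse coefficient vectors, and T_0 is the sparsity limit.

```latex
\min_{D,\,X} \; \lVert Y - D X \rVert_F^2
\quad \text{subject to} \quad
\lVert x_i \rVert_0 \le T_0 \;\; \text{for all } i
```

Read this way, the alternation in the abstract amounts to fixing D and solving for each x_i with a pursuit method, then fixing the sparsity pattern and updating one dictionary column at a time together with the coefficients that use it.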


Journal ArticleDOI
TL;DR: This paper presents a novel and efficient facial image representation based on local binary pattern (LBP) texture features that is assessed in the face recognition problem under different challenges.
Abstract: This paper presents a novel and efficient facial image representation based on local binary pattern (LBP) texture features. The face image is divided into several regions from which the LBP feature distributions are extracted and concatenated into an enhanced feature vector to be used as a face descriptor. The performance of the proposed method is assessed in the face recognition problem under different challenges. Other applications and several extensions are also discussed.

5,563 citations
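As a rough illustration of the descriptor described in the abstract above, the sketch below computes basic LBP codes, builds one histogram per image region, and concatenates the histograms. The 3x3, 8-neighbour operator, the 2x2 grid, and the function names `lbp_code` and `lbp_descriptor` are assumptions made for illustration; the abstract does not specify the operator or the region layout.

```python
def lbp_code(img, r, c):
    """8-neighbour LBP code of pixel (r, c): a bit is 1 where neighbour >= centre."""
    center = img[r][c]
    # Neighbours are visited clockwise, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_descriptor(img, grid=(2, 2)):
    """Concatenate per-region 256-bin LBP histograms into one feature vector."""
    rows, cols = len(img), len(img[0])
    grid_r, grid_c = grid
    hists = [[0] * 256 for _ in range(grid_r * grid_c)]
    # Border pixels are skipped because they lack a full 3x3 neighbourhood.
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            region = (r * grid_r // rows) * grid_c + (c * grid_c // cols)
            hists[region][lbp_code(img, r, c)] += 1
    return [count for hist in hists for count in hist]
```

Here `img` is assumed to be a grayscale image given as a list of rows of intensities. Two descriptors produced this way could then be compared with any histogram distance.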


Proceedings ArticleDOI
17 Jun 2006
TL;DR: This work presents a method - called Dimensionality Reduction by Learning an Invariant Mapping (DrLIM) - for learning a globally coherent nonlinear function that maps the data evenly to the output manifold.
Abstract: Dimensionality reduction involves mapping a set of high dimensional input points onto a low dimensional manifold so that "similar" points in input space are mapped to nearby points on the manifold. We present a method - called Dimensionality Reduction by Learning an Invariant Mapping (DrLIM) - for learning a globally coherent nonlinear function that maps the data evenly to the output manifold. The learning relies solely on neighborhood relationships and does not require any distance measure in the input space. The method can learn mappings that are invariant to certain transformations of the inputs, as is demonstrated with a number of experiments. Comparisons are made to other techniques, in particular LLE.

4,524 citations
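The abstract above does not state the training objective. The sketch below shows the contrastive loss commonly associated with this kind of neighborhood-based learning: pairs labelled similar are pulled together in the output space, and pairs labelled dissimilar are pushed apart until they are at least a margin away. The label convention (0 for similar, 1 for dissimilar), the `margin` default, and the function name are assumptions for illustration.

```python
import math


def contrastive_loss(z1, z2, y, margin=1.0):
    """Loss for one pair of mapped points z1, z2.

    y == 0: the pair is similar, so the loss grows with their distance.
    y == 1: the pair is dissimilar, so the loss is non-zero only while
            the points are closer than `margin`.
    """
    d = math.dist(z1, z2)  # Euclidean distance in the output space
    if y == 0:
        return 0.5 * d * d
    return 0.5 * max(0.0, margin - d) ** 2
```

Note that only the similar/dissimilar label enters the loss, which matches the abstract's point that no distance measure in the input space is needed.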


Journal ArticleDOI
17 Jun 2006
TL;DR: A large-scale evaluation of an approach that represents images as distributions of features extracted from a sparse set of keypoint locations and learns a Support Vector Machine classifier with kernels based on two effective measures for comparing distributions, the Earth Mover’s Distance and the χ2 distance.
Abstract: Recently, methods based on local image features have shown promise for texture and object recognition tasks. This paper presents a large-scale evaluation of an approach that represents images as distributions (signatures or histograms) of features extracted from a sparse set of keypoint locations and learns a Support Vector Machine classifier with kernels based on two effective measures for comparing distributions, the Earth Mover’s Distance and the χ2 distance. We first evaluate the performance of our approach with different keypoint detectors and descriptors, as well as different kernels and classifiers. We then conduct a comparative evaluation with several state-of-the-art recognition methods on 4 texture and 5 object databases. On most of these databases, our implementation exceeds the best reported results and achieves comparable performance on the rest. Finally, we investigate the influence of background correlations on recognition performance.

1,863 citations
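To make the χ2-based kernel mentioned in the abstract above concrete, here is a minimal sketch of the χ2 distance between two histograms and an exponential kernel built from it. The factor of one half, the exponential form, and the `scale` parameter are assumptions for illustration; the abstract does not give the exact kernel definition or how its scale is chosen.

```python
import math


def chi2_distance(h1, h2):
    """Chi-square distance between two histograms of equal length."""
    total = 0.0
    for a, b in zip(h1, h2):
        if a + b > 0:  # bins that are empty in both histograms contribute nothing
            total += (a - b) ** 2 / (a + b)
    return 0.5 * total


def chi2_kernel(h1, h2, scale=1.0):
    """Exponential kernel on the chi-square distance (scale is a free parameter)."""
    return math.exp(-chi2_distance(h1, h2) / scale)
```

A kernel matrix filled with `chi2_kernel` values for all pairs of training histograms could be passed to any SVM implementation that accepts precomputed kernels.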


Journal ArticleDOI
TL;DR: This work examines the rotation forest ensemble on a random selection of 33 benchmark data sets from the UCI repository, compares it with bagging, AdaBoost, and random forest, and investigates the diversity-accuracy landscape of the ensemble models.
Abstract: We propose a method for generating classifier ensembles based on feature extraction. To create the training data for a base classifier, the feature set is randomly split into K subsets (K is a parameter of the algorithm) and principal component analysis (PCA) is applied to each subset. All principal components are retained in order to preserve the variability information in the data. Thus, K axis rotations take place to form the new features for a base classifier. The idea of the rotation approach is to encourage simultaneously individual accuracy and diversity within the ensemble. Diversity is promoted through the feature extraction for each base classifier. Decision trees were chosen here because they are sensitive to rotation of the feature axes, hence the name "forest". Accuracy is sought by keeping all principal components and also using the whole data set to train each base classifier. Using WEKA, we examined the rotation forest ensemble on a random selection of 33 benchmark data sets from the UCI repository and compared it with bagging, AdaBoost, and random forest. The results were favorable to rotation forest and prompted an investigation into diversity-accuracy landscape of the ensemble models. Diversity-error diagrams revealed that rotation forest ensembles construct individual classifiers which are more accurate than those in AdaBoost and random forest, and more diverse than those in bagging, sometimes more accurate as well.

1,708 citations


Book
01 Jan 2006
TL;DR: This book discusses Feature Extraction for Classification of Proteomic Mass Spectra, Sequence Motifs: Highly Predictive Features of Protein Function, and Combining a Filter Method with SVMs.
Abstract: An Introduction to Feature Extraction.- An Introduction to Feature Extraction.- Feature Extraction Fundamentals.- Learning Machines.- Assessment Methods.- Filter Methods.- Search Strategies.- Embedded Methods.- Information-Theoretic Methods.- Ensemble Learning.- Fuzzy Neural Networks.- Feature Selection Challenge.- Design and Analysis of the NIPS2003 Challenge.- High Dimensional Classification with Bayesian Neural Networks and Dirichlet Diffusion Trees.- Ensembles of Regularized Least Squares Classifiers for High-Dimensional Problems.- Combining SVMs with Various Feature Selection Strategies.- Feature Selection with Transductive Support Vector Machines.- Variable Selection using Correlation and Single Variable Classifier Methods: Applications.- Tree-Based Ensembles with Dynamic Soft Feature Selection.- Sparse, Flexible and Efficient Modeling using L1 Regularization.- Margin Based Feature Selection and Infogain with Standard Classifiers.- Bayesian Support Vector Machines for Feature Ranking and Selection.- Nonlinear Feature Selection with the Potential Support Vector Machine.- Combining a Filter Method with SVMs.- Feature Selection via Sensitivity Analysis with Direct Kernel PLS.- Information Gain, Correlation and Support Vector Machines.- Mining for Complex Models Comprising Feature Selection and Classification.- Combining Information-Based Supervised and Unsupervised Feature Selection.- An Enhanced Selective Naive Bayes Method with Optimal Discretization.- An Input Variable Importance Definition based on Empirical Data Probability Distribution.- New Perspectives in Feature Extraction.- Spectral Dimensionality Reduction.- Constructing Orthogonal Latent Features for Arbitrary Loss.- Large Margin Principles for Feature Selection.- Feature Extraction for Classification of Proteomic Mass Spectra: A Comparative Study.- Sequence Motifs: Highly Predictive Features of Protein Function.

1,593 citations


Proceedings ArticleDOI
17 Jun 2006
TL;DR: This paper proposes a novel on-line AdaBoost feature selection method and demonstrates the multifariousness of the method on such diverse tasks as learning complex background models, visual tracking and object detection.
Abstract: Boosting has become very popular in computer vision, showing impressive performance in detection and recognition tasks. Mainly off-line training methods have been used, which implies that all training data has to be a priori given; training and usage of the classifier are separate steps. Training the classifier on-line and incrementally as new data becomes available has several advantages and opens new areas of application for boosting in computer vision. In this paper we propose a novel on-line AdaBoost feature selection method. In conjunction with efficient feature extraction methods the method is real time capable. We demonstrate the multifariousness of the method on such diverse tasks as learning complex background models, visual tracking and object detection. All approaches benefit significantly by the on-line training.

1,159 citations


Book ChapterDOI
07 May 2006
TL;DR: In this article, the authors show experimentally that for a representative selection of commonly used test databases and for moderate to large numbers of samples, random sampling gives equal or better classifiers than the sophisticated multiscale interest operators that are in common use.
Abstract: Bag-of-features representations have recently become popular for content based image classification owing to their simplicity and good performance. They evolved from texton methods in texture analysis. The basic idea is to treat images as loose collections of independent patches, sampling a representative set of patches from the image, evaluating a visual descriptor vector for each patch independently, and using the resulting distribution of samples in descriptor space as a characterization of the image. The four main implementation choices are thus how to sample patches, how to describe them, how to characterize the resulting distributions and how to classify images based on the result. We concentrate on the first issue, showing experimentally that for a representative selection of commonly used test databases and for moderate to large numbers of samples, random sampling gives equal or better classifiers than the sophisticated multiscale interest operators that are in common use. Although interest operators work well for small numbers of samples, the single most important factor governing performance is the number of patches sampled from the test image and ultimately interest operators can not provide enough patches to compete. We also study the influence of other factors including codebook size and creation method, histogram normalization method and minimum scale for feature extraction.

1,099 citations


Journal ArticleDOI
TL;DR: A keypoint-based approach is developed that is effective in this context by formulating wide-baseline matching of keypoints extracted from the input images to those found in the model images as a classification problem, which shifts much of the computational burden to a training phase, without sacrificing recognition performance.
Abstract: In many 3D object-detection and pose-estimation problems, runtime performance is of critical importance. However, there usually is time to train the system, which we will show to be very useful. Assuming that several registered images of the target object are available, we developed a keypoint-based approach that is effective in this context by formulating wide-baseline matching of keypoints extracted from the input images to those found in the model images as a classification problem. This shifts much of the computational burden to a training phase, without sacrificing recognition performance. As a result, the algorithm is robust, accurate, and fast enough for frame-rate performance. This reduction in runtime computational complexity is our first contribution. Our second contribution is to show that, in this context, a simple and fast keypoint detector suffices to support detection and tracking even under large perspective and scale variations. While earlier methods require a detector that can be expected to produce very repeatable results in general, which usually is very time-consuming, we simply find the most repeatable object keypoints for the specific target object during the training phase. We have incorporated these ideas into a real-time system that detects planar, nonplanar, and deformable objects. It then estimates the pose of the rigid ones and the deformations of the others.

843 citations


Journal ArticleDOI
TL;DR: This paper proposes some new feature extractors based on maximum margin criterion (MMC) and establishes a new linear feature extractor that does not suffer from the small sample size problem, which is known to cause serious stability problems for LDA.
Abstract: In pattern recognition, feature extraction techniques are widely employed to reduce the dimensionality of data and to enhance the discriminatory information. Principal component analysis (PCA) and linear discriminant analysis (LDA) are the two most popular linear dimensionality reduction methods. However, PCA is not very effective for the extraction of the most discriminant features, and LDA is not stable due to the small sample size problem. In this paper, we propose some new (linear and nonlinear) feature extractors based on maximum margin criterion (MMC). Geometrically, feature extractors based on MMC maximize the (average) margin between classes after dimensionality reduction. It is shown that MMC can represent class separability better than PCA. As a connection to LDA, we may also derive LDA from MMC by incorporating some constraints. By using some other constraints, we establish a new linear feature extractor that does not suffer from the small sample size problem, which is known to cause serious stability problems for LDA. The kernelized (nonlinear) counterpart of this linear feature extractor is also established in the paper. Our extensive experiments demonstrate that the new feature extractors are effective, stable, and efficient.

838 citations
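One way to write the criterion described in the abstract above is sketched here. The notation is assumed rather than taken from the abstract: S_b and S_w denote the between-class and within-class scatter matrices, and W is the linear projection whose columns are the extracted feature directions.

```latex
J(W) = \operatorname{tr}\!\left( W^{\top} (S_b - S_w)\, W \right),
\qquad \text{maximized subject to } W^{\top} W = I
```

Under this orthonormality constraint, the maximizer is spanned by the eigenvectors of S_b - S_w with the largest eigenvalues. Because no inverse of S_w is needed, this form is consistent with the abstract's claim that the extractor avoids the small sample size problem that affects LDA.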


Journal ArticleDOI
TL;DR: This work proposes a learning method, MILES (multiple-instance learning via embedded instance selection), which converts the multiple-instance learning problem to a standard supervised learning problem that does not impose the assumption relating instance labels to bag labels.
Abstract: Multiple-instance problems arise from the situations where training class labels are attached to sets of samples (named bags), instead of individual samples within each bag (called instances). Most previous multiple-instance learning (MIL) algorithms are developed based on the assumption that a bag is positive if and only if at least one of its instances is positive. Although the assumption works well in a drug activity prediction problem, it is rather restrictive for other applications, especially those in the computer vision area. We propose a learning method, MILES (multiple-instance learning via embedded instance selection), which converts the multiple-instance learning problem to a standard supervised learning problem that does not impose the assumption relating instance labels to bag labels. MILES maps each bag into a feature space defined by the instances in the training bags via an instance similarity measure. This feature mapping often provides a large number of redundant or irrelevant features. Hence, 1-norm SVM is applied to select important features as well as construct classifiers simultaneously. We have performed extensive experiments. In comparison with other methods, MILES demonstrates competitive classification accuracy, high computation efficiency, and robustness to labeling uncertainty
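The bag-to-feature mapping described in the abstract above can be sketched as follows: every instance in the training bags acts as a candidate feature, and a bag's value for that feature is its closest match under an instance similarity measure. The Gaussian similarity, the `sigma` parameter, and the function names are assumptions for illustration, and the subsequent 1-norm SVM feature selection step is not shown.

```python
import math


def bag_embedding(bag, concepts, sigma=1.0):
    """Map one bag to a vector with one similarity value per candidate instance."""
    features = []
    for concept in concepts:
        # Similarity of the bag to this candidate = best match among its instances.
        best = max(
            math.exp(-sum((a - b) ** 2 for a, b in zip(inst, concept)) / sigma ** 2)
            for inst in bag
        )
        features.append(best)
    return features


def embed_bags(bags, sigma=1.0):
    """Embed every bag using all instances of all training bags as candidates."""
    concepts = [inst for bag in bags for inst in bag]
    return [bag_embedding(bag, concepts, sigma) for bag in bags]
```

Each embedded bag has as many features as there are training instances, which illustrates why the abstract notes that the mapping produces many redundant or irrelevant features and why a sparsity-inducing classifier is applied afterwards.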

Proceedings ArticleDOI
17 Jun 2006
TL;DR: The built Colored SIFT (CSIFT) is more robust than the conventional SIFT with respect to color and photometrical variations and the evaluation results support the potential of the proposed approach.
Abstract: SIFT has been proven to be the most robust local invariant feature descriptor. SIFT is designed mainly for gray images. However, color provides valuable information in object description and matching tasks. Many objects can be misclassified if their color contents are ignored. This paper addresses this problem and proposes a novel colored local invariant feature descriptor. Instead of using the gray space to represent the input image, the proposed approach builds the SIFT descriptors in a color invariant space. The built Colored SIFT (CSIFT) is more robust than the conventional SIFT with respect to color and photometrical variations. The evaluation results support the potential of the proposed approach.


Book ChapterDOI
07 May 2006
TL;DR: The results show that color descriptors remain reliable under photometric and geometrical changes, and with decreasing image quality, and for all experiments a combination of color and shape outperforms a pure shape-based approach.
Abstract: Although color is commonly experienced as an indispensable quality in describing the world around us, state-of-the-art local feature-based representations are mostly based on shape description, and ignore color information. The description of color is hampered by the large amount of variations which causes the measured color values to vary significantly. In this paper we aim to extend the description of local features with color information. To accomplish a wide applicability of the color descriptor, it should be robust to: (1) photometric changes commonly encountered in the real world, and (2) varying image quality, from high quality images to snap-shot photo quality and compressed internet images. Based on these requirements we derive a set of color descriptors. The set of proposed descriptors is compared by extensive testing on multiple application areas, namely, matching, retrieval and classification, and on a wide variety of image qualities. The results show that color descriptors remain reliable under photometric and geometrical changes, and with decreasing image quality. For all experiments a combination of color and shape outperforms a pure shape-based approach.

Journal ArticleDOI
TL;DR: The proposed simplex growing algorithm (SGA) improves one commonly used EEA, the N-finder algorithm (N-FINDR) developed by Winter, by including a process of growing simplexes one vertex at a time until it reaches a desired number of vertices estimated by the VD, which results in a tremendous reduction of computational complexity.
Abstract: A new growing method for simplex-based endmember extraction algorithms (EEAs), called simplex growing algorithm (SGA), is presented in this paper. It is a sequential algorithm to find a simplex with the maximum volume every time a new vertex is added. In order to terminate this algorithm a recently developed concept, virtual dimensionality (VD), is implemented as a stopping rule to determine the number of vertices required for the algorithm to generate. The SGA improves one commonly used EEA, the N-finder algorithm (N-FINDR) developed by Winter, by including a process of growing simplexes one vertex at a time until it reaches a desired number of vertices estimated by the VD, which results in a tremendous reduction of computational complexity. Additionally, it also judiciously selects an appropriate initial vector to avoid a dilemma caused by the use of random vectors as its initial condition in the N-FINDR where the N-FINDR generally produces different sets of final endmembers if different sets of randomly generated initial endmembers are used. In order to demonstrate the performance of the proposed SGA, the N-FINDR and two other EEAs, pixel purity index, and vertex component analysis are used for comparison
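The vertex-by-vertex growth described in the abstract above can be sketched as a greedy search: at each step, add the pixel that maximizes the volume of the simplex formed with the vertices chosen so far. Several choices here are assumptions and not the paper's procedure: the volume is scored with a Gram determinant of the edge vectors (which avoids a separate dimensionality reduction step), the first vertex is simply the pixel with the largest norm, and the number of vertices is passed in directly where the paper estimates it with virtual dimensionality.

```python
def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in m]
    n = len(m)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result


def simplex_score(vertices):
    """Gram determinant of the edge vectors; proportional to squared volume."""
    base = vertices[0]
    edges = [[a - b for a, b in zip(v, base)] for v in vertices[1:]]
    gram = [[sum(x * y for x, y in zip(e1, e2)) for e2 in edges] for e1 in edges]
    return det(gram)


def grow_simplex(pixels, n_vertices):
    """Greedily add the pixel that maximizes simplex volume, one vertex at a time."""
    # Assumed initialization: the pixel with the largest norm.
    first = max(range(len(pixels)), key=lambda i: sum(x * x for x in pixels[i]))
    chosen = [first]
    while len(chosen) < n_vertices:
        best = max(
            (i for i in range(len(pixels)) if i not in chosen),
            key=lambda i: simplex_score([pixels[j] for j in chosen] + [pixels[i]]),
        )
        chosen.append(best)
    return chosen
```

`grow_simplex` returns the indices of the selected pixels in the order they were added. Because the initialization is deterministic, repeated runs select the same vertices, which reflects the motivation given in the abstract for avoiding random initial endmembers.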

Book ChapterDOI
Isabelle Guyon, André Elisseeff
01 Jan 2006
TL;DR: This chapter introduces the reader to the various aspects of feature extraction covered in this book and proposes a unified view of the feature extraction problem.
Abstract: This chapter introduces the reader to the various aspects of feature extraction covered in this book. Section 1 reviews definitions and notations and proposes a unified view of the feature extraction problem. Section 2 is an overview of the methods and results presented in the book, emphasizing novel contributions. Section 3 provides the reader with an entry point in the field of feature extraction by showing small revealing examples and describing simple but effective algorithms. Finally, Section 4 introduces a more theoretical formalism and points to directions of research and open problems.

Journal ArticleDOI
TL;DR: A detailed survey of state of the art 2D face recognition algorithms using Gabor wavelets for feature extraction and existing problems are covered and possible solutions are suggested.
Abstract: Due to the robustness of Gabor features against local distortions caused by variance of illumination, expression and pose, they have been successfully applied for face recognition. The Facial Recognition Technology (FERET) evaluation and the recent Face Verification Competition (FVC2004) have seen the top performance of Gabor feature based methods. This paper aims to give a detailed survey of state of the art 2D face recognition algorithms using Gabor wavelets for feature extraction. Existing problems are covered and possible solutions are suggested.

Book ChapterDOI
07 May 2006
TL;DR: This paper proposes Probabilistic LDA, a generative probability model with which it can both extract the features and combine them for recognition, and shows applications to classification, hypothesis testing, class inference, and clustering.
Abstract: Linear dimensionality reduction methods, such as LDA, are often used in object recognition for feature extraction, but do not address the problem of how to use these features for recognition. In this paper, we propose Probabilistic LDA, a generative probability model with which we can both extract the features and combine them for recognition. The latent variables of PLDA represent both the class of the object and the view of the object within a class. By making examples of the same class share the class variable, we show how to train PLDA and use it for recognition on previously unseen classes. The usual LDA features are derived as a result of training PLDA, but in addition have a probability model attached to them, which automatically gives more weight to the more discriminative features. With PLDA, we can build a model of a previously unseen class from a single example, and can combine multiple examples for a better representation of the class. We show applications to classification, hypothesis testing, class inference, and clustering, on classes not observed during training.

Proceedings Article
01 Jan 2006
TL;DR: A practical procedure for applying WCCN to an SVM-based speaker recognition system where the input feature vectors reside in a high-dimensional space and achieves improvements of up to 22% in EER and 28% in minimum decision cost function (DCF) over the previous baseline.
Abstract: This paper extends the within-class covariance normalization (WCCN) technique described in [1, 2] for training generalized linear kernels. We describe a practical procedure for applying WCCN to an SVM-based speaker recognition system where the input feature vectors reside in a high-dimensional space. Our approach involves using principal component analysis (PCA) to split the original feature space into two subspaces: a low-dimensional “PCA space” and a high-dimensional “PCA-complement space.” After performing WCCN in the PCA space, we concatenate the resulting feature vectors with a weighted version of their PCA-complements. When applied to a state-of-the-art MLLR-SVM speaker recognition system, this approach achieves improvements of up to 22% in EER and 28% in minimum decision cost function (DCF) over our previous baseline. We also achieve substantial improvements over an MLLR-SVM system that performs WCCN in the PCA space but discards the PCA-complement.

Journal ArticleDOI
TL;DR: Experimental results show that the proposed automatic traffic surveillance system is more robust, accurate, and powerful than other traditional methods, which utilize only the vehicle size and a single frame for vehicle classification.
Abstract: This paper presents an automatic traffic surveillance system to estimate important traffic parameters from video sequences using only one camera. Unlike traditional methods that can classify vehicles only as cars and noncars, the proposed method can categorize vehicles into more specific classes by introducing a new "linearity" feature in vehicle representation. In addition, the proposed system handles the problem of vehicle occlusions caused by shadows, which often lead to the failure of further vehicle counting and classification. This problem is solved by a novel line-based shadow algorithm that uses a set of lines to eliminate all unwanted shadows. These lines are derived from the information of lane-dividing lines. Therefore, an automatic scheme to detect lane-dividing lines is also proposed. The detected lane-dividing lines also provide important information for feature normalization, which makes the vehicle size more invariant and thus greatly improves the accuracy of vehicle classification. Once all features are extracted, an optimal classifier is designed to robustly categorize vehicles into different classes. When recognizing a vehicle, the designed classifier collects evidence from its trajectories and the database to make an optimal decision for vehicle classification. Since more evidence is used, classification becomes more robust. Experimental results show that the proposed method is more robust, accurate, and powerful than other traditional methods, which utilize only the vehicle size and a single frame for vehicle classification.

Journal ArticleDOI
TL;DR: This paper investigates the feasibility of an audio-based context recognition system, comparing its accuracy to that of human listeners in the same task, with particular emphasis on the computational complexity of the methods.
Abstract: The aim of this paper is to investigate the feasibility of an audio-based context recognition system. Here, context recognition refers to the automatic classification of the context or an environment around a device. A system is developed and compared to the accuracy of human listeners in the same task. Particular emphasis is placed on the computational complexity of the methods, since the application is of particular interest in resource-constrained portable devices. Simplistic low-dimensional feature vectors are evaluated against more standard spectral features. Using discriminative training, competitive recognition accuracies are achieved with very low-order hidden Markov models (1-3 Gaussian components). Slight improvement in recognition accuracy is observed when linear data-driven feature transformations are applied to mel-cepstral features. The recognition rate of the system as a function of the test sequence length appears to converge only after about 30 to 60 s. Some degree of accuracy can be achieved even with less than 1-s test sequence lengths. The average reaction time of the human listeners was 14 s, i.e., somewhat smaller, but of the same order as that of the system. The average recognition accuracy of the system was 58% against 69%, obtained in the listening tests in recognizing between 24 everyday contexts. The accuracies in recognizing six high-level classes were 82% for the system and 88% for the subjects.

Journal ArticleDOI
TL;DR: This paper presents a complete framework that starts with the extraction of various local regions of either discontinuity or homogeneity, and uses Boosting to learn a subset of feature vectors (weak hypotheses) and to combine them into one final hypothesis for each visual category.
Abstract: This paper explores the power and the limitations of weakly supervised categorization. We present a complete framework that starts with the extraction of various local regions of either discontinuity or homogeneity. A variety of local descriptors can be applied to form a set of feature vectors for each local region. Boosting is used to learn a subset of such feature vectors (weak hypotheses) and to combine them into one final hypothesis for each visual category. This combination of individual extractors and descriptors leads to recognition rates that are superior to other approaches which use only one specific extractor/descriptor setting. To explore the limitation of our system, we had to set up new, highly complex image databases that show the objects of interest at varying scales and poses, in cluttered background, and under considerable occlusion. We obtain classification results up to 81 percent ROC-equal error rate on the most complex of our databases. Our approach outperforms all comparable solutions on common databases.

Journal ArticleDOI
TL;DR: This paper proposes a classification system based on a genetic optimization framework formulated in such a way as to detect the best discriminative features without requiring the a priori setting of their number by the user and to estimate the best SVM parameters in a completely automatic way.
Abstract: Recent remote sensing literature has shown that support vector machine (SVM) methods generally outperform traditional statistical and neural methods in classification problems involving hyperspectral images. However, there are still open issues that, if suitably addressed, could allow further improvement of their performances in terms of classification accuracy. Two especially critical issues are: 1) the determination of the most appropriate feature subspace where to carry out the classification task and 2) model selection. In this paper, these two issues are addressed through a classification system that optimizes the SVM classifier accuracy for this kind of imagery. This system is based on a genetic optimization framework formulated in such a way as to detect the best discriminative features without requiring the a priori setting of their number by the user and to estimate the best SVM parameters (i.e., regularization and kernel parameters) in a completely automatic way. For these purposes, it exploits fitness criteria intrinsically related to the generalization capabilities of SVM classifiers. In particular, two criteria are explored, namely: 1) the simple support vector count and 2) the radius margin bound. The effectiveness of the proposed classification system in general and of these two criteria in particular is assessed both by simulated and real experiments. In addition, a comparison with classification approaches based on three different feature selection methods is reported, i.e., the steepest ascent (SA) algorithm and two other methods explicitly developed for SVM classifiers, namely: 1) the recursive feature elimination technique and 2) the radius margin bound minimization method

Journal ArticleDOI
TL;DR: A generic model for unsupervised extraction of viewer's attention objects from color images, which formulates the attention objects as a Markov random field (MRF) by integrating computational visual attention mechanisms with attention object growing techniques, and describes the MRF by a Gibbs random field with an energy function.
Abstract: This paper proposes a generic model for unsupervised extraction of viewer's attention objects from color images. Without the full semantic understanding of image content, the model formulates the attention objects as a Markov random field (MRF) by integrating computational visual attention mechanisms with attention object growing techniques. Furthermore, we describe the MRF by a Gibbs random field with an energy function. The minimization of the energy function provides a practical way to obtain attention objects. Experimental results on 880 real images and user subjective evaluations by 16 subjects demonstrate the effectiveness of the proposed approach.
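The abstract does not give the form of the energy function, so the following is only a minimal sketch of the general idea of obtaining a labeling by minimizing a Gibbs energy. Everything in it is an assumption for illustration: the per-pixel saliency values in [0, 1], the Potts-style smoothness term weighted by `beta`, and the use of iterated conditional modes (ICM) as the minimizer are not taken from the paper.

```python
def unary_cost(label, s):
    # Assumed data term: labeling a salient pixel as background
    # (or a non-salient pixel as attention object) is penalized.
    return (1.0 - s) if label == 1 else s


def neighbors(y, x, h, w):
    # 4-connected neighborhood, clipped at the image border.
    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        ny, nx = y + dy, x + dx
        if 0 <= ny < h and 0 <= nx < w:
            yield ny, nx


def gibbs_energy(labels, saliency, beta):
    # Total energy: data terms plus beta for every pair of
    # neighboring pixels that carry different labels.
    h, w = len(labels), len(labels[0])
    total = 0.0
    for y in range(h):
        for x in range(w):
            total += unary_cost(labels[y][x], saliency[y][x])
            # Count each neighboring pair once (right and down).
            if x + 1 < w and labels[y][x] != labels[y][x + 1]:
                total += beta
            if y + 1 < h and labels[y][x] != labels[y + 1][x]:
                total += beta
    return total


def icm(saliency, beta=0.3, max_sweeps=10):
    # Iterated conditional modes: start from thresholded saliency and
    # flip a pixel whenever that strictly lowers its local energy.
    h, w = len(saliency), len(saliency[0])
    labels = [[1 if s > 0.5 else 0 for s in row] for row in saliency]
    for _ in range(max_sweeps):
        changed = False
        for y in range(h):
            for x in range(w):
                def local(label):
                    disagree = sum(
                        1 for ny, nx in neighbors(y, x, h, w)
                        if labels[ny][nx] != label
                    )
                    return unary_cost(label, saliency[y][x]) + beta * disagree

                current = labels[y][x]
                other = 1 - current
                if local(other) < local(current):
                    labels[y][x] = other
                    changed = True
        if not changed:
            break
    return labels
```

In this toy setting, an isolated pixel with weak saliency is absorbed into the background while a compact salient block survives, because each flip lowers the total energy. ICM only finds a local minimum; it is used here for brevity, not because the paper uses it.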

Journal ArticleDOI
TL;DR: Experimental results and comparisons with a standard technique developed for the analysis of very high spatial resolution images confirm the effectiveness of the proposed pixel-based classification system.
Abstract: This paper proposes a novel pixel-based system for the supervised classification of very high geometrical (spatial) resolution images. This system is aimed at obtaining accurate and reliable maps both by preserving the geometrical details in the images and by properly considering the spatial-context information. It is made up of two main blocks: 1) a novel feature-extraction block that, extending and developing some concepts previously presented in the literature, adaptively models the spatial context of each pixel according to a complete hierarchical multilevel representation of the scene and 2) a classifier, based on support vector machines (SVMs), capable of analyzing hyperdimensional feature spaces. The choice of adopting an SVM-based classification architecture is motivated by the potentially large number of parameters derived from the contextual feature-extraction stage. Experimental results and comparisons with a standard technique developed for the analysis of very high spatial resolution images confirm the effectiveness of the proposed system.

Proceedings ArticleDOI
17 Jun 2006
TL;DR: It is shown that architectures such as convolutional networks are good at learning invariant features, but not always optimal for classification, while Support Vector Machines are good at producing decision surfaces from well-behaved feature vectors, but cannot learn complicated invariances.
Abstract: The detection and recognition of generic object categories with invariance to viewpoint, illumination, and clutter requires the combination of a feature extractor and a classifier. We show that architectures such as convolutional networks are good at learning invariant features, but not always optimal for classification, while Support Vector Machines are good at producing decision surfaces from well-behaved feature vectors, but cannot learn complicated invariances. We present a hybrid system where a convolutional network is trained to detect and recognize generic objects, and a Gaussian-kernel SVM is trained from the features learned by the convolutional network. Results are given on a large generic object recognition task with six categories (human figures, four-legged animals, airplanes, trucks, cars, and "none of the above"), with multiple instances of each object category under various poses, illuminations, and backgrounds. On the test set, which contains different object instances than the training set, an SVM alone yields a 43.3% error rate, a convolutional net alone yields 7.2% and an SVM on top of features produced by the convolutional net yields 5.9%.

Journal ArticleDOI
TL;DR: An image hashing paradigm using visually significant feature points is proposed; the resulting hash algorithm withstands standard benchmark attacks, including compression, geometric distortions of scaling and small-angle rotation, and common signal-processing operations.
Abstract: We propose an image hashing paradigm using visually significant feature points. The feature points should be largely invariant under perceptually insignificant distortions. To satisfy this, we propose an iterative feature detector to extract significant geometry-preserving feature points. We apply probabilistic quantization on the derived features to introduce randomness, which, in turn, reduces vulnerability to adversarial attacks. The proposed hash algorithm withstands standard benchmark (e.g., Stirmark) attacks, including compression, geometric distortions of scaling and small-angle rotation, and common signal-processing operations. Content-changing (malicious) manipulations of image data are also accurately detected. Detailed statistical analysis in the form of receiver operating characteristic (ROC) curves is presented and reveals the success of the proposed scheme in achieving perceptual robustness while avoiding misclassification.

Journal ArticleDOI
TL;DR: A new likelihood ratio test that combines matched-filter responses, confidence measures and vessel boundary measures is presented, embedded into a vessel tracing framework, resulting in an efficient and effective vessel centerline extraction algorithm.
Abstract: Motivated by the goals of improving detection of low-contrast and narrow vessels and eliminating false detections at nonvascular structures, a new technique is presented for extracting vessels in retinal images. The core of the technique is a new likelihood ratio test that combines matched-filter responses, confidence measures and vessel boundary measures. Matched filter responses are derived in scale-space to extract vessels of widely varying widths. A vessel confidence measure is defined as a projection of a vector formed from a normalized pixel neighborhood onto a normalized ideal vessel profile. Vessel boundary measures and associated confidences are computed at potential vessel boundaries. Combined, these responses form a six-dimensional measurement vector at each pixel. A training technique is used to develop a mapping of this vector to a likelihood ratio that measures the "vesselness" at each pixel. Results comparing this vesselness measure to matched filters alone and to measures based on the Hessian of intensities show substantial improvements, both qualitatively and quantitatively. The Hessian can be used in place of the matched filter to obtain similar but less-substantial improvements or to steer the matched filter by preselecting kernel orientations. Finally, the new vesselness likelihood ratio is embedded into a vessel tracing framework, resulting in an efficient and effective vessel centerline extraction algorithm.
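The confidence measure described in this abstract, a projection of a normalized pixel neighborhood onto a normalized ideal vessel profile, can be illustrated with a small one-dimensional sketch. The abstract does not state how either vector is normalized or what the ideal profile looks like, so two assumptions are made here: normalization means subtracting the mean and scaling to unit length, and the ideal profile is a dark Gaussian-shaped dip. The function names are hypothetical.

```python
import math


def normalize(v):
    # Assumed normalization: zero mean, unit Euclidean norm.
    mean = sum(v) / len(v)
    centered = [x - mean for x in v]
    norm = math.sqrt(sum(x * x for x in centered))
    if norm == 0.0:
        # A constant neighborhood carries no shape information.
        return [0.0] * len(v)
    return [x / norm for x in centered]


def ideal_vessel_profile(width, sigma):
    # Hypothetical ideal cross-section: a dark Gaussian dip,
    # sampled at `width` positions centered on the vessel.
    half = width // 2
    return [
        -math.exp(-(i * i) / (2.0 * sigma * sigma))
        for i in range(-half, half + 1)
    ]


def vessel_confidence(neighborhood, profile):
    # Projection (dot product) of the normalized neighborhood
    # onto the normalized ideal profile; the result lies in [-1, 1].
    a = normalize(neighborhood)
    b = normalize(profile)
    return sum(x * y for x, y in zip(a, b))
```

Under these assumptions the measure is insensitive to brightness offset and positive contrast scaling: a neighborhood that is an offset, scaled copy of the profile scores close to 1, a flat neighborhood scores 0, and a bright ridge where a dark vessel is expected scores close to -1.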

Journal ArticleDOI
TL;DR: This paper addresses two specific issues related to the implementation of the FSV method, namely "how well does it produce results that agree with visual assessment?" and "what benefit can it provide in a practical validation environment?"
Abstract: The feature selective validation (FSV) method has been proposed as a technique to allow the objective, quantified comparison of data for, inter alia, validation of computational electromagnetics. In the companion paper "Feature selective validation for validation of computational electromagnetics. Part I-The FSV method," the method was outlined in some detail. This paper addresses two specific issues related to the implementation of the FSV method, namely "how well does it produce results that agree with visual assessment?" and "what benefit can it provide in a practical validation environment?" The first of these questions is addressed by comparing the FSV output to the results of an extensive survey of EMC engineers from several countries. The second is approached via a case study analysis.

Proceedings ArticleDOI
17 Jun 2006
TL;DR: This paper proposes a novel approach to extract primitive 3D facial expression features, and then applies the feature distribution to classify the prototypic facial expressions, and demonstrates the advantages of the 3D geometric based approach over 2D texture based approaches in terms of various head poses.
Abstract: The creation of facial range models by 3D imaging systems has led to extensive work on 3D face recognition [19]. However, little work has been done to study the usefulness of such data for recognizing and understanding facial expressions. Psychological research shows that the shape of a human face, a highly mobile facial surface, is critical to facial expression perception. In this paper, we investigate the importance and usefulness of 3D facial geometric shapes to represent and recognize facial expressions using 3D facial expression range data. We propose a novel approach to extract primitive 3D facial expression features, and then apply the feature distribution to classify the prototypic facial expressions. In order to validate our proposed approach, we have conducted experiments for person-independent facial expression recognition using our newly created 3D facial expression database. We also demonstrate the advantages of our 3D geometric based approach over 2D texture based approaches in terms of various head poses.