Showing papers on "Feature extraction published in 2003"


Proceedings ArticleDOI
18 Jun 2003
TL;DR: The flexible nature of the model is demonstrated by excellent results over a range of datasets including geometrically constrained classes (e.g. faces, cars) and flexible objects (such as animals).
Abstract: We present a method to learn and recognize object class models from unlabeled and unsegmented cluttered scenes in a scale invariant manner. Objects are modeled as flexible constellations of parts. A probabilistic representation is used for all aspects of the object: shape, appearance, occlusion and relative scale. An entropy-based feature detector is used to select regions and their scale within the image. In learning, the parameters of the scale-invariant object model are estimated. This is done using expectation-maximization in a maximum-likelihood setting. In recognition, this model is used in a Bayesian manner to classify images. The flexible nature of the model is demonstrated by excellent results over a range of datasets including geometrically constrained classes (e.g. faces, cars) and flexible objects (such as animals).

2,411 citations


Proceedings Article
21 Aug 2003
TL;DR: A novel concept, predominant correlation, is introduced, and a fast filter method is proposed which can identify relevant features as well as redundancy among relevant features without pairwise correlation analysis.
Abstract: Feature selection, as a preprocessing step to machine learning, is effective in reducing dimensionality, removing irrelevant data, increasing learning accuracy, and improving result comprehensibility. However, the recent increase of dimensionality of data poses a severe challenge to many existing feature selection methods with respect to efficiency and effectiveness. In this work, we introduce a novel concept, predominant correlation, and propose a fast filter method which can identify relevant features as well as redundancy among relevant features without pairwise correlation analysis. The efficiency and effectiveness of our method are demonstrated through extensive comparisons with other methods using real-world data of high dimensionality.

2,251 citations
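
As a rough illustration of a correlation-based filter like the one described in the feature selection abstract above, the sketch below ranks discrete features by their symmetrical uncertainty with the class label. The choice of symmetrical uncertainty as the correlation measure, the function names, and the toy data are all assumptions made for illustration; the abstract does not specify them, and the redundancy-removal step is not reproduced here.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0  # both variables are constant
    hxy = entropy(list(zip(x, y)))  # joint entropy H(X, Y)
    mutual_info = hx + hy - hxy
    return 2.0 * mutual_info / (hx + hy)

def rank_features(columns, labels):
    """Return (name, SU) pairs sorted from most to least relevant."""
    scores = {name: symmetrical_uncertainty(col, labels)
              for name, col in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical toy data: one perfect feature, one noisy, one uninformative.
labels = [0, 0, 1, 1, 0, 1, 0, 1]
columns = {
    "copy_of_label": [0, 0, 1, 1, 0, 1, 0, 1],
    "noisy": [0, 1, 1, 1, 0, 1, 0, 0],
    "constant": [5, 5, 5, 5, 5, 5, 5, 5],
}
ranking = rank_features(columns, labels)
```

A filter of this kind scores each feature against the label once, so its cost grows linearly with the number of features rather than with the number of feature pairs.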


Proceedings ArticleDOI
20 May 2003
TL;DR: This work develops a method for automatically distinguishing between positive and negative reviews and draws on information retrieval techniques for feature extraction and scoring, and the results for various metrics and heuristics vary depending on the testing situation.
Abstract: The web contains a wealth of product reviews, but sifting through them is a daunting task. Ideally, an opinion mining tool would process a set of search results for a given item, generating a list of product attributes (quality, features, etc.) and aggregating opinions about each of them (poor, mixed, good). We begin by identifying the unique properties of this problem and develop a method for automatically distinguishing between positive and negative reviews. Our classifier draws on information retrieval techniques for feature extraction and scoring, and the results for various metrics and heuristics vary depending on the testing situation. The best methods work as well as or better than traditional machine learning. When operating on individual sentences collected from web searches, performance is limited due to noise and ambiguity. But in the context of a complete web-based tool and aided by a simple method for grouping sentences into attributes, the results are qualitatively quite useful.

2,238 citations


Proceedings ArticleDOI
13 Oct 2003
TL;DR: This work builds on the idea of the Harris and Forstner interest point operators and detects local structures in space-time where the image values have significant local variations in both space and time to detect spatio-temporal events.
Abstract: Local image features or interest points provide compact and abstract representations of patterns in an image. We propose to extend the notion of spatial interest points into the spatio-temporal domain and show how the resulting features often reflect interesting events that can be used for a compact representation of video data as well as for its interpretation. To detect spatio-temporal events, we build on the idea of the Harris and Forstner interest point operators and detect local structures in space-time where the image values have significant local variations in both space and time. We then estimate the spatio-temporal extents of the detected events and compute their scale-invariant spatio-temporal descriptors. Using such descriptors, we classify events and construct video representation in terms of labeled space-time points. For the problem of human motion analysis, we illustrate how the proposed method allows for detection of walking people in scenes with occlusions and dynamic backgrounds.

2,232 citations


Journal ArticleDOI
TL;DR: The system consists of a novel device for online palmprint image acquisition and an efficient algorithm for fast palmprint recognition, and a robust image coordinate system is defined to facilitate image alignment for feature extraction.
Abstract: Biometrics-based personal identification is regarded as an effective method for automatically recognizing, with a high confidence, a person's identity. This paper presents a new biometric approach to online personal identification using palmprint technology. In contrast to the existing methods, our online palmprint identification system employs low-resolution palmprint images to achieve effective personal identification. The system consists of two parts: a novel device for online palmprint image acquisition and an efficient algorithm for fast palmprint recognition. A robust image coordinate system is defined to facilitate image alignment for feature extraction. In addition, a 2D Gabor phase encoding scheme is proposed for palmprint feature extraction and representation. The experimental results demonstrate the feasibility of the proposed system.

1,416 citations
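
The idea of phase encoding mentioned in the palmprint abstract above can be sketched as follows: filter a patch with a complex Gabor kernel, keep only the quadrant of the response phase as two bits, and compare codes by Hamming distance. This is a minimal sketch under assumptions; the filter parameters, the two-bit quantization, the function names, and the absence of any DC correction or alignment step are illustrative choices, not details confirmed by the abstract.

```python
import cmath
import math

def gabor_response(patch, wavelength, theta, sigma):
    """Complex response of one Gabor filter centered on a square patch."""
    size = len(patch)
    half = size // 2
    total = 0j
    for row in range(size):
        for col in range(size):
            x, y = col - half, row - half
            envelope = math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
            # Coordinate along the filter's orientation.
            along = x * math.cos(theta) + y * math.sin(theta)
            carrier = cmath.exp(2j * math.pi * along / wavelength)
            total += patch[row][col] * envelope * carrier
    return total

def phase_bits(response):
    """Quantize the phase to two bits: signs of the real and imaginary parts."""
    return (int(response.real >= 0), int(response.imag >= 0))

def hamming_distance(code_a, code_b):
    """Fraction of differing bits between two equal-length bit sequences."""
    differing = sum(a != b for a, b in zip(code_a, code_b))
    return differing / len(code_a)

def grating(size, wavelength, phase):
    """Hypothetical test patch: a cosine grating varying along x."""
    half = size // 2
    return [[math.cos(2.0 * math.pi * (col - half) / wavelength - phase)
             for col in range(size)] for _ in range(size)]

patch = grating(9, 4.0, math.pi / 4.0)
inverted = [[-v for v in row] for row in patch]
code = phase_bits(gabor_response(patch, 4.0, 0.0, 2.0))
code_inverted = phase_bits(gabor_response(inverted, 4.0, 0.0, 2.0))
```

Because only the sign pattern of the response is kept, the code is insensitive to a uniform scaling of image contrast, which is one reason phase codes are attractive for low-resolution images.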


Journal ArticleDOI
TL;DR: A simple but efficient gait recognition algorithm using spatial-temporal silhouette analysis is proposed that implicitly captures the structural and transitional characteristics of gait.
Abstract: Human identification at a distance has recently gained growing interest from computer vision researchers. Gait recognition aims essentially to address this problem by identifying people based on the way they walk. In this paper, a simple but efficient gait recognition algorithm using spatial-temporal silhouette analysis is proposed. For each image sequence, a background subtraction algorithm and a simple correspondence procedure are first used to segment and track the moving silhouettes of a walking figure. Then, eigenspace transformation based on principal component analysis (PCA) is applied to time-varying distance signals derived from a sequence of silhouette images to reduce the dimensionality of the input feature space. Supervised pattern classification techniques are finally performed in the lower-dimensional eigenspace for recognition. This method implicitly captures the structural and transitional characteristics of gait. Extensive experimental results on outdoor image sequences demonstrate that the proposed algorithm has an encouraging recognition performance with relatively low computational cost.

1,183 citations
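
The eigenspace transformation in the gait abstract above is a PCA projection. The sketch below finds only the leading principal component, using power iteration on the covariance matrix, and projects the centered data onto it. The toy `signals` data and all function names are hypothetical; a real system would keep several components and would work on the distance signals described in the abstract.

```python
import math

def center(rows):
    """Subtract the per-column mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]

def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def top_component(cov, iterations=200):
    """Leading eigenvector of a symmetric matrix via power iteration."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    return v

def project(centered, component):
    """One-dimensional coordinates of each row along the component."""
    return [sum(a * b for a, b in zip(row, component)) for row in centered]

# Hypothetical 3-D samples that vary mostly along the direction (1, 1, 0).
signals = [[t, t, 0.1 * ((i % 2) * 2 - 1)]
           for i, t in enumerate([-2.0, -1.0, 0.0, 1.0, 2.0])]
centered = center(signals)
component = top_component(covariance(centered))
coords = project(centered, component)
```

Classification in the reduced space then operates on `coords` (or on several such coordinates) instead of the original high-dimensional signals.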


Journal ArticleDOI
TL;DR: A bank of spatial filters, whose kernels are suitable for iris recognition, is used to capture local characteristics of the iris so as to produce discriminating texture features; experimental results show that the proposed method has an encouraging performance.
Abstract: With an increasing emphasis on security, automated personal identification based on biometrics has been receiving extensive attention over the past decade. Iris recognition, as an emerging biometric recognition approach, is becoming a very active topic in both research and practical applications. In general, a typical iris recognition system includes iris imaging, iris liveness detection, and recognition. This paper focuses on the last issue and describes a new scheme for iris recognition from an image sequence. We first assess the quality of each image in the input sequence and select a clear iris image from such a sequence for subsequent recognition. A bank of spatial filters, whose kernels are suitable for iris recognition, is then used to capture local characteristics of the iris so as to produce discriminating texture features. Experimental results show that the proposed method has an encouraging performance. In particular, a comparative study of existing methods for iris recognition is conducted on an iris image database including 2,255 sequences from 213 subjects. Conclusions based on such a comparison using a nonparametric statistical method (the bootstrap) provide useful information for further research.

1,052 citations


Proceedings ArticleDOI
18 Jun 2003
TL;DR: A best-first algorithm is proposed in which the confidence in the synthesized pixel values is propagated in a manner similar to the propagation of information in inpainting; examples on real and synthetic images demonstrate its effectiveness in removing large occluding objects as well as thin scratches.
Abstract: A new algorithm is proposed for removing large objects from digital images. The challenge is to fill in the hole that is left behind in a visually plausible way. In the past, this problem has been addressed by two classes of algorithms: (i) "texture synthesis" algorithms for generating large image regions from sample textures, and (ii) "inpainting" techniques for filling in small image gaps. The former work well for "textures" - repeating two dimensional patterns with some stochasticity; the latter focus on linear "structures" which can be thought of as one dimensional patterns, such as lines and object contours. This paper presents a novel and efficient algorithm that combines the advantages of these two approaches. We first note that exemplar-based texture synthesis contains the essential process required to replicate both texture and structure; the success of structure propagation, however, is highly dependent on the order in which the filling proceeds. We propose a best-first algorithm in which the confidence in the synthesized pixel values is propagated in a manner similar to the propagation of information in inpainting. The actual color values are computed using exemplar-based synthesis. Computational efficiency is achieved by a block-based sampling process. A number of examples on real and synthetic images demonstrate the effectiveness of our algorithm in removing large occluding objects as well as thin scratches. Robustness with respect to the shape of the manually selected target region is also demonstrated. Our results compare favorably to those obtained by existing techniques.

997 citations


Journal ArticleDOI
TL;DR: The proposed framework includes some novel low-level processing algorithms, such as dominant color region detection, robust shot boundary detection, and shot classification, as well as some higher-level algorithms for goal detection, referee detection, and penalty-box detection.
Abstract: We propose a fully automatic and computationally efficient framework for analysis and summarization of soccer videos using cinematic and object-based features. The proposed framework includes some novel low-level processing algorithms, such as dominant color region detection, robust shot boundary detection, and shot classification, as well as some higher-level algorithms for goal detection, referee detection, and penalty-box detection. The system can output three types of summaries: i) all slow-motion segments in a game; ii) all goals in a game; iii) slow-motion segments classified according to object-based features. The first two types of summaries are based on cinematic features only for speedy processing, while the summaries of the last type contain higher-level semantics. The proposed framework is efficient, effective, and robust. It is efficient in the sense that there is no need to compute object-based features when cinematic features are sufficient for the detection of certain events, e.g., goals in soccer. It is effective in the sense that the framework can also employ object-based features when needed to increase accuracy (at the expense of more computation). The efficiency, effectiveness, and robustness of the proposed framework are demonstrated over a large data set, consisting of more than 13 hours of soccer video, captured in different countries and under different conditions.

943 citations


Journal ArticleDOI
TL;DR: A new biometric approach to online personal identification using palmprint technology is presented; the system consists of two parts: a novel device for online palmprint image acquisition and an efficient algorithm for fast palmprint recognition.
Abstract: Biometrics-based personal identification is regarded as an effective method for automatically recognizing, with a high confidence, a person's identity. This paper presents a new biometric approach to online personal identification using palmprint technology. In contrast to the existing methods, our online palmprint identification system employs low-resolution palmprint images to achieve effective personal identification. The system consists of two parts: a novel device for online palmprint image acquisition and an efficient algorithm for fast palmprint recognition. A robust image coordinate system is defined to facilitate image alignment for feature extraction. In addition, a 2D Gabor phase encoding scheme is proposed for palmprint feature extraction and representation. The experimental results demonstrate the feasibility of the proposed system.

908 citations


Journal ArticleDOI
TL;DR: A new algorithm is proposed that deals, in an efficient and cost-effective manner, with both shortcomings of traditional linear discriminant analysis methods for face recognition systems.
Abstract: Low-dimensional feature representation with enhanced discriminatory power is of paramount importance to face recognition (FR) systems. Most of traditional linear discriminant analysis (LDA)-based methods suffer from the disadvantage that their optimality criteria are not directly related to the classification ability of the obtained feature representation. Moreover, their classification accuracy is affected by the "small sample size" (SSS) problem which is often encountered in FR tasks. In this paper, we propose a new algorithm that deals with both of the shortcomings in an efficient and cost effective manner. The proposed method is compared, in terms of classification accuracy, to other commonly used FR methods on two face databases. Results indicate that the performance of the proposed method is overall superior to those of traditional FR approaches, such as the eigenfaces, fisherfaces, and D-LDA methods.

Journal ArticleDOI
08 Sep 2003
TL;DR: The main components of audiovisual automatic speech recognition (ASR) are reviewed and novel contributions in two main areas are presented: first, the visual front-end design, based on a cascade of linear image transforms of an appropriate video region of interest, and subsequently, audiovisual speech integration.
Abstract: Visual speech information from the speaker's mouth region has been successfully shown to improve noise robustness of automatic speech recognizers, thus promising to extend their usability in the human computer interface. In this paper, we review the main components of audiovisual automatic speech recognition (ASR) and present novel contributions in two main areas: first, the visual front-end design, based on a cascade of linear image transforms of an appropriate video region of interest, and subsequently, audiovisual speech integration. On the latter topic, we discuss new work on feature and decision fusion combination, the modeling of audiovisual speech asynchrony, and incorporating modality reliability estimates to the bimodal recognition process. We also briefly touch upon the issue of audiovisual adaptation. We apply our algorithms to three multisubject bimodal databases, ranging from small- to large-vocabulary recognition tasks, recorded in both visually controlled and challenging environments. Our experiments demonstrate that the visual modality improves ASR over all conditions and data considered, though less so for visually challenging environments and large vocabulary tasks.

Journal ArticleDOI
TL;DR: It is seen that relatively few features are needed to achieve the same classification accuracies as in the original feature space when classifying panchromatic high-resolution data from urban areas using morphological and neural approaches.
Abstract: Classification of panchromatic high-resolution data from urban areas using morphological and neural approaches is investigated. The proposed approach is based on three steps. First, the composition of geodesic opening and closing operations of different sizes is used in order to build a differential morphological profile that records image structural information. Although the original panchromatic image only has one data channel, the use of the composition operations will give many additional channels, which may contain redundancies. Therefore, feature extraction or feature selection is applied in the second step. Both discriminant analysis feature extraction and decision boundary feature extraction are investigated in the second step along with a simple feature selection based on picking the largest indexes of the differential morphological profiles. Third, a neural network is used to classify the features from the second step. The proposed approach is applied in experiments on high-resolution Indian Remote Sensing 1C (IRS-1C) and IKONOS remote sensing data from urban areas. In experiments, the proposed method performs well in terms of classification accuracies. It is seen that relatively few features are needed to achieve the same classification accuracies as in the original feature space.

Journal Article
Kari Torkkola
TL;DR: A quadratic divergence measure is used instead of a commonly used mutual information measure based on Kullback-Leibler divergence, which allows for an efficient non-parametric implementation and requires no prior assumptions about class densities.
Abstract: We present a method for learning discriminative feature transforms using as criterion the mutual information between class labels and transformed features. Instead of a commonly used mutual information measure based on Kullback-Leibler divergence, we use a quadratic divergence measure, which allows us to make an efficient non-parametric implementation and requires no prior assumptions about class densities. In addition to linear transforms, we also discuss nonlinear transforms that are implemented as radial basis function networks. Extensions to reduce the computational complexity are also presented, and a comparison to greedy feature selection is made.

Proceedings ArticleDOI
01 Jan 2003
TL;DR: In this article, a pedestrian detection system that integrates image intensity information with motion information is presented, which uses a detection style algorithm that scans a detector over two consecutive frames of a video sequence.
Abstract: This paper describes a pedestrian detection system that integrates image intensity information with motion information. We use a detection style algorithm that scans a detector over two consecutive frames of a video sequence. The detector is trained (using AdaBoost) to take advantage of both motion and appearance information to detect a walking person. Past approaches have built detectors based on appearance information, but ours is the first to combine both sources of information in a single detector. The implementation described runs at about 4 frames/second, detects pedestrians at very small scales (as small as 20×15 pixels), and has a very low false positive rate. Our approach builds on the detection work of Viola and Jones. Novel contributions of this paper include: i) development of a representation of image motion which is extremely efficient, and ii) implementation of a state of the art pedestrian detection system which operates on low resolution images under difficult conditions (such as rain and snow).

Proceedings ArticleDOI
18 Jun 2003
TL;DR: Two appearance-based methods are introduced for clustering a set of images of 3D (three-dimensional) objects into disjoint subsets corresponding to individual objects: one based on the concept of illumination cones, the other on an affinity measure based on image gradient comparisons.
Abstract: We introduce two appearance-based methods for clustering a set of images of 3D (three-dimensional) objects, acquired under varying illumination conditions, into disjoint subsets corresponding to individual objects. The first algorithm is based on the concept of illumination cones. According to the theory, the clustering problem is equivalent to finding convex polyhedral cones in the high-dimensional image space. To efficiently determine the conic structures hidden in the image data, we introduce the concept of conic affinity, which measures the likelihood of a pair of images belonging to the same underlying polyhedral cone. For the second method, we introduce another affinity measure based on image gradient comparisons. The algorithm operates directly on the image gradients by comparing the magnitudes and orientations of the image gradient at each pixel. Both methods have clear geometric motivations, and they operate directly on the images without the need for feature extraction or computation of pixel statistics. We demonstrate experimentally that both algorithms are surprisingly effective in clustering images acquired under varying illumination conditions with two large, well-known image data sets.

Proceedings ArticleDOI
Collins, Liu
13 Oct 2003
TL;DR: This paper presents an online feature selection mechanism for evaluating multiple features while tracking and adjusting the set of features used to improve tracking performance, and notes susceptibility of the variance ratio feature selection method to distraction by spatially correlated background clutter.
Abstract: We present a method for evaluating multiple feature spaces while tracking, and for adjusting the set of features used to improve tracking performance. Our hypothesis is that the features that best discriminate between object and background are also best for tracking the object. We develop an online feature selection mechanism based on the two-class variance ratio measure, applied to log likelihood distributions computed with respect to a given feature from samples of object and background pixels. This feature selection mechanism is embedded in a tracking system that adaptively selects the top-ranked discriminative features for tracking. Examples are presented to illustrate how the method adapts to changing appearances of both tracked object and scene background.
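
A two-class variance ratio of the kind named in the abstract above can be sketched as follows. The formula used here (the variance of the log likelihood ratio under the pooled distribution, divided by the sum of its variances under the object and background distributions) is one common formulation and is an assumption, as are the clamp value `delta`, the function names, and the toy histograms; the abstract itself does not spell these out.

```python
import math

def normalize(hist):
    """Turn raw bin counts into a discrete probability distribution."""
    total = float(sum(hist))
    return [h / total for h in hist]

def log_likelihood_ratio(p, q, delta=1e-3):
    """Per-bin log ratio of object to background probability, clamped at delta."""
    return [math.log(max(pi, delta) / max(qi, delta)) for pi, qi in zip(p, q)]

def weighted_variance(values, weights):
    """Variance of `values` under the distribution `weights`."""
    mean = sum(w * v for w, v in zip(weights, values))
    return sum(w * v * v for w, v in zip(weights, values)) - mean * mean

def variance_ratio(obj_hist, bg_hist, delta=1e-3, eps=1e-12):
    """Higher values mean the feature separates object from background better."""
    p, q = normalize(obj_hist), normalize(bg_hist)
    llr = log_likelihood_ratio(p, q, delta)
    pooled = [(pi + qi) / 2.0 for pi, qi in zip(p, q)]
    between = weighted_variance(llr, pooled)
    within = weighted_variance(llr, p) + weighted_variance(llr, q)
    if within <= eps:
        # Degenerate case: no spread within either class.
        return 0.0 if between <= eps else float("inf")
    return between / within

# Hypothetical 3-bin histograms for two candidate features.
strong = variance_ratio([8, 1, 1], [1, 1, 8])
weak = variance_ratio([4, 3, 3], [3, 3, 4])
```

Ranking candidate features by this score on every frame, and tracking with the top-ranked ones, gives the online selection loop described in the abstract.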

Journal ArticleDOI
TL;DR: The results of handwritten digit recognition on well-known image databases using state-of-the-art feature extraction and classification techniques are competitive to the best ones previously reported on the same databases.

Journal ArticleDOI
TL;DR: The experiment shows that SVM with feature extraction using PCA, KPCA or ICA can perform better than SVM without feature extraction; among the three methods, KPCA feature extraction performs best, followed by ICA feature extraction.

Journal ArticleDOI
TL;DR: This paper describes texture classification using (i) wavelet statistical features, (ii) wavelet co-occurrence features and (iii) a combination of wavelet statistical features and co-occurrence features of one-level wavelet transformed images with different feature databases.

Journal ArticleDOI
TL;DR: Central to the method is a multi-scale classification operator that allows feature analysis at multiple scales, using the size of the local neighborhoods as a discrete scale parameter, which significantly improves the reliability of the detection phase and makes the method more robust in the presence of noise.
Abstract: We present a new technique for extracting line-type features on point-sampled geometry. Given an unstructured point cloud as input, our method first applies principal component analysis on local neighborhoods to classify points according to the likelihood that they belong to a feature. Using hysteresis thresholding, we then compute a minimum spanning graph as an initial approximation of the feature lines. To smooth out the features while maintaining a close connection to the underlying surface, we use an adaptation of active contour models. Central to our method is a multi-scale classification operator that allows feature analysis at multiple scales, using the size of the local neighborhoods as a discrete scale parameter. This significantly improves the reliability of the detection phase and makes our method more robust in the presence of noise. To illustrate the usefulness of our method, we have implemented a non-photorealistic point renderer to visualize point-sampled surfaces as line drawings of their extracted feature curves.
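
The first step in the abstract above, principal component analysis on a local neighborhood, can be turned into a per-point feature score in several ways. The sketch below uses surface variation, the smallest covariance eigenvalue divided by the sum of all three, which is close to zero on flat regions and grows near creases and corners. Using this particular score, the closed-form eigenvalue routine, and the function names are assumptions for illustration; the abstract does not state which quantity the authors threshold.

```python
import math

def covariance_3d(points):
    """3x3 covariance matrix of a neighborhood of 3-D points."""
    n = len(points)
    centroid = [sum(p[k] for p in points) / n for k in range(3)]
    cov = [[0.0] * 3 for _ in range(3)]
    for p in points:
        d = [p[k] - centroid[k] for k in range(3)]
        for i in range(3):
            for j in range(3):
                cov[i][j] += d[i] * d[j] / n
    return cov

def symmetric_eigenvalues_3x3(a):
    """Eigenvalues of a symmetric 3x3 matrix, largest first (closed form)."""
    p1 = a[0][1] ** 2 + a[0][2] ** 2 + a[1][2] ** 2
    if p1 == 0.0:
        return sorted((a[0][0], a[1][1], a[2][2]), reverse=True)
    q = (a[0][0] + a[1][1] + a[2][2]) / 3.0
    p2 = sum((a[i][i] - q) ** 2 for i in range(3)) + 2.0 * p1
    p = math.sqrt(p2 / 6.0)
    b = [[(a[i][j] - (q if i == j else 0.0)) / p for j in range(3)]
         for i in range(3)]
    det_b = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
             - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
             + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
    r = max(-1.0, min(1.0, det_b / 2.0))  # guard acos against rounding
    phi = math.acos(r) / 3.0
    e1 = q + 2.0 * p * math.cos(phi)
    e3 = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    e2 = 3.0 * q - e1 - e3
    return [e1, e2, e3]

def surface_variation(points):
    """Smallest eigenvalue over the eigenvalue sum; 0 for a perfect plane."""
    eig = symmetric_eigenvalues_3x3(covariance_3d(points))
    total = sum(eig)
    return max(eig[2], 0.0) / total if total > 0 else 0.0

# Hypothetical neighborhoods: a tilted plane and a fully isotropic cluster.
plane = [(x, y, x) for x in range(3) for y in range(3)]
isotropic = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
flat_score = surface_variation(plane)
round_score = surface_variation(isotropic)
```

Evaluating the score for the same point with neighborhoods of increasing size gives one value per scale, which is the sense in which the neighborhood size acts as a discrete scale parameter.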

Journal ArticleDOI
TL;DR: A hand gesture recognition system is presented that recognizes continuous gestures against a stationary background; it consists of real-time hand tracking and extraction, feature extraction, hidden Markov model (HMM) training, and gesture recognition.

Journal ArticleDOI
TL;DR: Experimental results verify the validity of the proposed approaches in personal authentication using the template-matching and the backpropagation neural network to measure the similarity in the verification stage.

Journal ArticleDOI
TL;DR: An independent Gabor features (IGFs) method and its application to face recognition is presented, which achieves 98.5% correct face recognition accuracy when using 180 features for the FERET dataset, and 100% accuracy for the ORL dataset using 88 features.
Abstract: We present an independent Gabor features (IGFs) method and its application to face recognition. The novelty of the IGF method comes from 1) the derivation of independent Gabor features in the feature extraction stage and 2) the development of an IGF features-based probabilistic reasoning model (PRM) classification method in the pattern recognition stage. In particular, the IGF method first derives a Gabor feature vector from a set of downsampled Gabor wavelet representations of face images, then reduces the dimensionality of the vector by means of principal component analysis, and finally defines the independent Gabor features based on the independent component analysis (ICA). The independence property of these Gabor features facilitates the application of the PRM method for classification. The rationale behind integrating the Gabor wavelets and the ICA is twofold. On the one hand, the Gabor transformed face images exhibit strong characteristics of spatial locality, scale, and orientation selectivity. These images can, thus, produce salient local features that are most suitable for face recognition. On the other hand, ICA would further reduce redundancy and represent independent features explicitly. These independent features are most useful for subsequent pattern discrimination and associative recall. Experiments on face recognition using the FacE REcognition Technology (FERET) and the ORL datasets, where the images vary in illumination, expression, pose, and scale, show the feasibility of the IGF method. In particular, the IGF method achieves 98.5% correct face recognition accuracy when using 180 features for the FERET dataset, and 100% accuracy for the ORL dataset using 88 features.

Journal ArticleDOI
01 Sep 2003
TL;DR: An approach to the detection of tumors in colonoscopic video based on a new color feature extraction scheme to represent the different regions in the frame sequence based on the wavelet decomposition, reaching 97% specificity and 90% sensitivity.
Abstract: We present an approach to the detection of tumors in colonoscopic video. It is based on a new color feature extraction scheme to represent the different regions in the frame sequence. This scheme is built on the wavelet decomposition. The features named as color wavelet covariance (CWC) are based on the covariances of second-order textural measures and an optimum subset of them is proposed after the application of a selection algorithm. The proposed approach is supported by a linear discriminant analysis (LDA) procedure for the characterization of the image regions along the video frames. The whole methodology has been applied on real data sets of color colonoscopic videos. The performance in the detection of abnormal colonic regions corresponding to adenomatous polyps has been estimated high, reaching 97% specificity and 90% sensitivity.

Journal ArticleDOI
TL;DR: This work has shown that the steadily increasing performance of computers again has become a driving force for new advances in flow visualisation, especially in techniques based on texturing, feature extraction, vector field clustering, and topology extraction.
Abstract: Flow visualisation is an attractive topic in data visualisation, offering great challenges for research. Very large data sets must be processed, consisting of multivariate data at large numbers of grid points, often arranged in many time steps. Recently, the steadily increasing performance of computers again has become a driving force for new advances in flow visualisation, especially in techniques based on texturing, feature extraction, vector field clustering, and topology extraction. In this article we present the state of the art in feature-based flow visualisation techniques. We will present numerous feature extraction techniques, categorised according to the type of feature. Next, feature tracking and event detection algorithms are discussed, for studying the evolution of features in time-dependent data sets. Finally, various visualisation techniques are demonstrated.

Journal ArticleDOI
TL;DR: The combination of CAMSHIFT and SVMs produces both robust and efficient text detection, as time-consuming texture analyses for less relevant pixels are restricted, leaving only a small part of the input image to be texture-analyzed.
Abstract: The current paper presents a novel texture-based method for detecting texts in images. A support vector machine (SVM) is used to analyze the textural properties of texts. No external texture feature extraction module is used, but rather the intensities of the raw pixels that make up the textural pattern are fed directly to the SVM, which works well even in high-dimensional spaces. Next, text regions are identified by applying a continuously adaptive mean shift algorithm (CAMSHIFT) to the results of the texture analysis. The combination of CAMSHIFT and SVMs produces both robust and efficient text detection, as time-consuming texture analyses for less relevant pixels are restricted, leaving only a small part of the input image to be texture-analyzed.

Proceedings ArticleDOI
13 Oct 2003
TL;DR: This work exploits a recently proposed approximation technique, locality-sensitive hashing (LSH), to reduce the computational complexity of adaptive mean shift; in its implementation of LSH, the optimal parameters of the data structure are determined by a pilot learning procedure, and the partitions are data driven.
Abstract: Feature space analysis is the main module in many computer vision tasks. The most popular technique, k-means clustering, however, has two inherent limitations: the clusters are constrained to be spherically symmetric and their number has to be known a priori. In nonparametric clustering methods, like the one based on mean shift, these limitations are eliminated but the amount of computation becomes prohibitively large as the dimension of the space increases. We exploit a recently proposed approximation technique, locality-sensitive hashing (LSH), to reduce the computational complexity of adaptive mean shift. In our implementation of LSH the optimal parameters of the data structure are determined by a pilot learning procedure, and the partitions are data driven. As an application, the performance of mode and k-means based textons is compared in a texture classification study.
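
For orientation, the sketch below shows plain mean shift with a fixed bandwidth and a flat kernel. It is not the adaptive, LSH-accelerated variant from the abstract above: the bandwidth, the brute-force neighbor search, the mode-merging rule, and the function names are all simplifying assumptions. The brute-force query is marked in the code because it is the step whose cost grows with dimension and data size.

```python
import math

def mean_shift_mode(start, points, bandwidth, max_iter=100, tol=1e-6):
    """Follow the mean-shift vector from `start` to a nearby density mode."""
    current = list(start)
    for _ in range(max_iter):
        # Brute-force neighbor query: the step an approximate
        # nearest-neighbor structure would be used to speed up.
        neighbors = [p for p in points if math.dist(p, current) <= bandwidth]
        if not neighbors:
            break
        mean = [sum(p[k] for p in neighbors) / len(neighbors)
                for k in range(len(current))]
        shift = math.dist(mean, current)
        current = mean
        if shift < tol:
            break
    return current

def cluster(points, bandwidth):
    """Group points whose mean-shift trajectories end at the same mode."""
    modes, labels = [], []
    for p in points:
        mode = mean_shift_mode(p, points, bandwidth)
        for idx, known in enumerate(modes):
            if math.dist(known, mode) < bandwidth / 2.0:
                labels.append(idx)
                break
        else:
            modes.append(mode)
            labels.append(len(modes) - 1)
    return modes, labels

# Hypothetical 2-D data with two well-separated groups.
data = [(0.0, 0.0), (0.2, 0.1), (-0.1, 0.2), (5.0, 5.0), (5.1, 4.9), (4.8, 5.2)]
modes, labels = cluster(data, bandwidth=1.0)
```

Note that the number of clusters is never passed in; it falls out of how many distinct modes the trajectories reach, which is the property the abstract contrasts with k-means.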

Journal ArticleDOI
TL;DR: A study compares the performance of bearing fault detection using two different classifiers, namely artificial neural networks and support vector machines (SVMs), on time-domain vibration signals of a rotating machine with normal and defective bearings.

Journal ArticleDOI
TL;DR: A new model-based moving feature extraction analysis is presented that automatically extracts and describes human gait for recognition and is shown to be able to handle high levels of occlusion, which is of especial importance in gait as the human body is self-occluding.