Showing papers on "Feature vector" published in 2006

Journal Article•DOI•

Face Description with Local Binary Patterns: Application to Face Recognition

Timo Ahonen¹, Abdenour Hadid¹, Matti Pietikäinen¹

01 Dec 2006-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: This paper presents a novel and efficient facial image representation based on local binary pattern (LBP) texture features that is assessed in the face recognition problem under different challenges.

Abstract: This paper presents a novel and efficient facial image representation based on local binary pattern (LBP) texture features. The face image is divided into several regions from which the LBP feature distributions are extracted and concatenated into an enhanced feature vector to be used as a face descriptor. The performance of the proposed method is assessed in the face recognition problem under different challenges. Other applications and several extensions are also discussed.

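As a rough illustration of the pipeline this abstract describes (regional LBP histograms concatenated into one descriptor), here is a minimal Python sketch. It assumes the basic 3x3, 8-neighbour LBP operator and a regular grid of regions. The function names `lbp_code`, `region_histogram`, and `face_descriptor`, and the grid parameters, are illustrative choices and are not taken from the paper.

```python
from typing import List

Image = List[List[int]]  # grayscale image as rows of pixel intensities

# Clockwise 8-neighbour offsets (row, col), starting at the top-left pixel.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img: Image, r: int, c: int) -> int:
    """Basic 3x3 LBP: threshold the 8 neighbours against the centre pixel."""
    centre = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBOURS):
        if img[r + dr][c + dc] >= centre:
            code |= 1 << bit
    return code


def region_histogram(img: Image, r0: int, r1: int, c0: int, c1: int) -> List[float]:
    """Normalised 256-bin histogram of LBP codes over rows [r0, r1) and cols [c0, c1).

    Pixels on the image border are skipped because they lack a full neighbourhood.
    """
    hist = [0] * 256
    h, w = len(img), len(img[0])
    for r in range(max(r0, 1), min(r1, h - 1)):
        for c in range(max(c0, 1), min(c1, w - 1)):
            hist[lbp_code(img, r, c)] += 1
    total = sum(hist)
    return [v / total for v in hist] if total else [0.0] * 256


def face_descriptor(img: Image, grid_rows: int, grid_cols: int) -> List[float]:
    """Concatenate the regional LBP histograms into a single feature vector."""
    h, w = len(img), len(img[0])
    descriptor: List[float] = []
    for i in range(grid_rows):
        for j in range(grid_cols):
            r0, r1 = i * h // grid_rows, (i + 1) * h // grid_rows
            c0, c1 = j * w // grid_cols, (j + 1) * w // grid_cols
            descriptor.extend(region_histogram(img, r0, r1, c0, c1))
    return descriptor
```

With a 2x2 grid the descriptor has 4 x 256 entries. Keeping one histogram per region is what lets the vector retain coarse spatial layout in addition to local texture.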

5,563 citations

Journal Article•DOI•

Retinal vessel segmentation using the 2-D Gabor wavelet and supervised classification

João V. B. Soares¹, J.J.G. Leandro¹, Roberto M. Cesar¹, Herbert F. Jelinek², Michael J. Cree³

University of São Paulo¹, Charles Sturt University², University of Waikato³

21 Aug 2006-IEEE Transactions on Medical Imaging

TL;DR: In this paper, a method for automated segmentation of the vasculature in retinal images is presented, which produces segmentations by classifying each image pixel as vessel or non-vessel, based on the pixel's feature vector.

Abstract: We present a method for automated segmentation of the vasculature in retinal images. The method produces segmentations by classifying each image pixel as vessel or nonvessel, based on the pixel's feature vector. Feature vectors are composed of the pixel's intensity and two-dimensional Gabor wavelet transform responses taken at multiple scales. The Gabor wavelet is capable of tuning to specific frequencies, thus allowing noise filtering and vessel enhancement in a single step. We use a Bayesian classifier with class-conditional probability density functions (likelihoods) described as Gaussian mixtures, yielding a fast classification, while being able to model complex decision surfaces. The probability distributions are estimated based on a training set of labeled pixels obtained from manual segmentations. The method's performance is evaluated on the publicly available DRIVE (Staal et al., 2004) and STARE (Hoover et al., 2000) databases of manually labeled images. On the DRIVE database, it achieves an area under the receiver operating characteristic curve of 0.9614, slightly superior to that presented by state-of-the-art approaches. We are making our implementation available as open source MATLAB scripts for researchers interested in implementation details, evaluation, or development of methods.

1,435 citations

Journal Article•DOI•

MILES: Multiple-Instance Learning via Embedded Instance Selection

Yixin Chen, Jinbo Bi, James Z. Wang

01 Dec 2006-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: This work proposes a learning method, MILES (multiple-instance learning via embedded instance selection), which converts the multiple-instance learning problem to a standard supervised learning problem that does not impose the assumption relating instance labels to bag labels.

Abstract: Multiple-instance problems arise from situations where training class labels are attached to sets of samples (named bags), instead of individual samples within each bag (called instances). Most previous multiple-instance learning (MIL) algorithms are developed based on the assumption that a bag is positive if and only if at least one of its instances is positive. Although the assumption works well in a drug activity prediction problem, it is rather restrictive for other applications, especially those in the computer vision area. We propose a learning method, MILES (multiple-instance learning via embedded instance selection), which converts the multiple-instance learning problem to a standard supervised learning problem that does not impose the assumption relating instance labels to bag labels. MILES maps each bag into a feature space defined by the instances in the training bags via an instance similarity measure. This feature mapping often provides a large number of redundant or irrelevant features. Hence, 1-norm SVM is applied to select important features as well as construct classifiers simultaneously. We have performed extensive experiments. In comparison with other methods, MILES demonstrates competitive classification accuracy, high computation efficiency, and robustness to labeling uncertainty.

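The bag-to-feature-space mapping can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not spell out the instance similarity measure, so the sketch assumes a Gaussian-style similarity in which the closest instance in the bag determines the score. The names `similarity`, `embed_bag`, and the scale parameter `sigma` are hypothetical, and the subsequent 1-norm SVM feature selection step is not shown.

```python
import math
from typing import List, Sequence

Instance = Sequence[float]
Bag = List[Instance]


def squared_distance(a: Instance, b: Instance) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def similarity(concept: Instance, bag: Bag, sigma: float) -> float:
    """Assumed similarity between one candidate instance and a bag.

    The instance in the bag that lies closest to the candidate decides the score.
    """
    return max(math.exp(-squared_distance(x, concept) / sigma ** 2) for x in bag)


def embed_bag(bag: Bag, training_bags: List[Bag], sigma: float) -> List[float]:
    """Map a bag to one feature per instance found in the training bags."""
    concepts = [x for b in training_bags for x in b]
    return [similarity(c, bag, sigma) for c in concepts]
```

Each bag becomes a fixed-length vector whose length equals the total number of training instances, which is why the abstract notes that many of the resulting features are redundant or irrelevant and need to be pruned.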

766 citations

Journal Article•DOI•

Support vector machines for speaker and language recognition

William M. Campbell¹, Joseph P. Campbell¹, Douglas A. Reynolds¹, Elliot Singer¹, Pedro A. Torres-Carrasquillo¹

Massachusetts Institute of Technology¹

01 Apr 2006-Computer Speech & Language

TL;DR: This work considers the application of SVMs to speaker and language recognition. It uses a sequence kernel that compares sequences of feature vectors and produces a measure of similarity, building upon a simpler mean-squared error classifier to produce a more accurate system.

542 citations

Proceedings Article•

Within-class covariance normalization for SVM-based speaker recognition.

Andrew O. Hatch¹, Sachin S. Kajarekar², Andreas Stolcke²

University of California, Berkeley¹, SRI International²

01 Jan 2006

TL;DR: This paper describes a practical procedure for applying WCCN to an SVM-based speaker recognition system whose input feature vectors reside in a high-dimensional space, achieving improvements of up to 22% in EER and 28% in minimum decision cost function (DCF) over the previous baseline.

Abstract: This paper extends the within-class covariance normalization (WCCN) technique described in [1, 2] for training generalized linear kernels. We describe a practical procedure for applying WCCN to an SVM-based speaker recognition system where the input feature vectors reside in a high-dimensional space. Our approach involves using principal component analysis (PCA) to split the original feature space into two subspaces: a low-dimensional “PCA space” and a high-dimensional “PCA-complement space.” After performing WCCN in the PCA space, we concatenate the resulting feature vectors with a weighted version of their PCA-complements. When applied to a state-of-the-art MLLR-SVM speaker recognition system, this approach achieves improvements of up to 22% in EER and 28% in minimum decision cost function (DCF) over our previous baseline. We also achieve substantial improvements over an MLLR-SVM system that performs WCCN in the PCA space but discards the PCA-complement.

461 citations

Proceedings Article•

Multi-Instance Multi-Label Learning with Application to Scene Classification

Zhi-Hua Zhou¹, Min-Ling Zhang¹

Nanjing University¹

04 Dec 2006

TL;DR: This paper formalizes multi-instance multi-label learning, where each training example is associated with not only multiple instances but also multiple class labels, and proposes the MIMLBOOST and MIMLSVM algorithms which achieve good performance in an application to scene classification.

Abstract: In this paper, we formalize multi-instance multi-label learning, where each training example is associated with not only multiple instances but also multiple class labels. Such a problem can occur in many real-world tasks, e.g., an image usually contains multiple patches each of which can be described by a feature vector, and the image can belong to multiple categories since its semantics can be recognized in different ways. We analyze the relationship between multi-instance multi-label learning and the learning frameworks of traditional supervised learning, multi-instance learning and multi-label learning. Then, we propose the MIMLBOOST and MIMLSVM algorithms which achieve good performance in an application to scene classification.

455 citations

Journal Article•DOI•

Audio-based context recognition

Antti Eronen¹, V.T. Peltonen¹, J.T. Tuomi², Anssi Klapuri², Seppo Fagerlund³, Timo Sorsa¹, Gaetan Lorho¹, Jyri Huopaniemi¹

Nokia¹, Tampere University of Technology², Helsinki University of Technology³

01 Dec 2006-IEEE Transactions on Audio, Speech, and Language Processing

TL;DR: This paper investigates the feasibility of an audio-based context recognition system. A system is developed and its accuracy is compared to that of human listeners in the same task, with particular emphasis on the computational complexity of the methods.

Abstract: The aim of this paper is to investigate the feasibility of an audio-based context recognition system. Here, context recognition refers to the automatic classification of the context or an environment around a device. A system is developed and compared to the accuracy of human listeners in the same task. Particular emphasis is placed on the computational complexity of the methods, since the application is of particular interest in resource-constrained portable devices. Simplistic low-dimensional feature vectors are evaluated against more standard spectral features. Using discriminative training, competitive recognition accuracies are achieved with very low-order hidden Markov models (1-3 Gaussian components). Slight improvement in recognition accuracy is observed when linear data-driven feature transformations are applied to mel-cepstral features. The recognition rate of the system as a function of the test sequence length appears to converge only after about 30 to 60 s. Some degree of accuracy can be achieved even with less than 1-s test sequence lengths. The average reaction time of the human listeners was 14 s, i.e., somewhat smaller, but of the same order as that of the system. The average recognition accuracy of the system was 58% against 69%, obtained in the listening tests in recognizing between 24 everyday contexts. The accuracies in recognizing six high-level classes were 82% for the system and 88% for the subjects.

436 citations

Journal Article•DOI•

Generic object recognition with boosting

Andreas Opelt¹, Axel Pinz¹, Michael Fussenegger¹, Peter Auer²

Graz University of Technology¹, University of Leoben²

01 Mar 2006-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: This paper presents a complete framework that starts with the extraction of various local regions of either discontinuity or homogeneity, and uses Boosting to learn a subset of feature vectors (weak hypotheses) and to combine them into one final hypothesis for each visual category.

Abstract: This paper explores the power and the limitations of weakly supervised categorization. We present a complete framework that starts with the extraction of various local regions of either discontinuity or homogeneity. A variety of local descriptors can be applied to form a set of feature vectors for each local region. Boosting is used to learn a subset of such feature vectors (weak hypotheses) and to combine them into one final hypothesis for each visual category. This combination of individual extractors and descriptors leads to recognition rates that are superior to other approaches which use only one specific extractor/descriptor setting. To explore the limitation of our system, we had to set up new, highly complex image databases that show the objects of interest at varying scales and poses, in cluttered background, and under considerable occlusion. We obtain classification results up to 81 percent ROC-equal error rate on the most complex of our databases. Our approach outperforms all comparable solutions on common databases.

422 citations

Journal Article•DOI•

Comprehensible Credit Scoring Models Using Rule Extraction from Support Vector Machines

David Martens¹, Bart Baesens¹, Tony Van Gestel¹, Jan Vanthienen¹

Katholieke Universiteit Leuven¹

27 Jan 2006-Social Science Research Network

TL;DR: This paper provides an overview of the recently proposed rule extraction techniques for SVMs and introduces two others taken from the artificial neural networks domain, being Trepan and G-REX, which rank at the top of comprehensible classification techniques.

Abstract: In recent years, Support Vector Machines (SVMs) were successfully applied to a wide range of applications. Their good performance is achieved by an implicit non-linear transformation of the original problem to a high-dimensional (possibly infinite) feature space in which a linear decision hyperplane is constructed that yields a nonlinear classifier in the input space. However, since the classifier is described as a complex mathematical function, it is rather incomprehensible for humans. This opacity property prevents them from being used in many real-life applications where both accuracy and comprehensibility are required, such as medical diagnosis and credit risk evaluation. To overcome this limitation, rules can be extracted from the trained SVM that are interpretable by humans and keep as much of the accuracy of the SVM as possible. In this paper, we will provide an overview of the recently proposed rule extraction techniques for SVMs and introduce two others taken from the artificial neural networks domain, being Trepan and G-REX. The described techniques are compared using publicly available datasets, such as Ripley's synthetic dataset and the multi-class iris dataset. We will also look at medical diagnosis and credit scoring where comprehensibility is a key requirement and even a regulatory recommendation. Our experiments show that the SVM rule extraction techniques lose only a small percentage in performance compared to SVMs and therefore rank at the top of comprehensible classification techniques.

392 citations

Proceedings Article•DOI•

Large-scale Learning with SVM and Convolutional Nets for Generic Object Categorization

Fu Jie Huang¹, Yann LeCun¹

New York University¹

17 Jun 2006

TL;DR: It is shown that architectures such as convolutional networks are good at learning invariant features, but not always optimal for classification, while Support Vector Machines are good at producing decision surfaces from well-behaved feature vectors, but cannot learn complicated invariances.

Abstract: The detection and recognition of generic object categories with invariance to viewpoint, illumination, and clutter requires the combination of a feature extractor and a classifier. We show that architectures such as convolutional networks are good at learning invariant features, but not always optimal for classification, while Support Vector Machines are good at producing decision surfaces from well-behaved feature vectors, but cannot learn complicated invariances. We present a hybrid system where a convolutional network is trained to detect and recognize generic objects, and a Gaussian-kernel SVM is trained from the features learned by the convolutional network. Results are given on a large generic object recognition task with six categories (human figures, four-legged animals, airplanes, trucks, cars, and "none of the above"), with multiple instances of each object category under various poses, illuminations, and backgrounds. On the test set, which contains different object instances than the training set, an SVM alone yields a 43.3% error rate, a convolutional net alone yields 7.2% and an SVM on top of features produced by the convolutional net yields 5.9%.

373 citations

Dissertation•

Finding People in Images and Videos

Navneet Dalal

17 Jul 2006

TL;DR: This thesis introduces grids of locally normalised Histograms of Oriented Gradients (HOG) as descriptors for object detection in static images and proposes descriptors based on oriented histograms of differential optical flow to detect moving humans in videos.

Abstract: This thesis targets the detection of humans and other object classes in images and videos. Our focus is on developing robust feature extraction algorithms that encode image regions as high-dimensional feature vectors that support high accuracy object/non-object decisions. To test our feature sets we adopt a relatively simple learning framework that uses linear Support Vector Machines to classify each possible image region as an object or as a non-object. The approach is data-driven and purely bottom-up using low-level appearance and motion vectors to detect objects. As a test case we focus on person detection as people are one of the most challenging object classes with many applications, for example in film and video analysis, pedestrian detection for smart cars and video surveillance. Nevertheless we do not make any strong class specific assumptions and the resulting object detection framework also gives state-of-the-art performance for many other classes including cars, motorbikes, cows and sheep. This thesis makes four main contributions. Firstly, we introduce grids of locally normalised Histograms of Oriented Gradients (HOG) as descriptors for object detection in static images. The HOG descriptors are computed over dense and overlapping grids of spatial blocks, with image gradient orientation features extracted at fixed resolution and gathered into a high-dimensional feature vector. They are designed to be robust to small changes in image contour locations and directions, and significant changes in image illumination and colour, while remaining highly discriminative for overall visual form. We show that unsmoothed gradients, fine orientation voting, moderately coarse spatial binning, strong normalisation and overlapping blocks are all needed for good performance. Secondly, to detect moving humans in videos, we propose descriptors based on oriented histograms of differential optical flow. These are similar to static HOG descriptors, but instead of image gradients, they are based on local differentials of dense optical flow. They encode the noisy optical flow estimates into robust feature vectors in a manner that is robust to the overall camera motion. Several variants are proposed, some capturing motion boundaries while others encode the relative motions of adjacent image regions. Thirdly, we propose a general method based on kernel density estimation for fusing multiple overlapping detections, that takes into account the number of detections, their confidence scores and the scales of the detections. Lastly, we present work in progress on a parts based approach to person detection that first detects local body parts like heads, torso, and legs and then fuses them to create a global overall person detector.

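To make the gradient-orientation histogram idea concrete, the following Python sketch computes one cell histogram and normalises it. It is a simplified illustration and not the thesis implementation: it assumes centred `[-1, 0, 1]` gradients without smoothing, unsigned orientations over 0 to 180 degrees, 9 bins, and hard assignment of each pixel's vote to a single bin. The names `cell_histogram` and `l2_normalise` and the default `n_bins` are hypothetical choices.

```python
import math
from typing import List

Image = List[List[float]]


def cell_histogram(img: Image, r0: int, r1: int, c0: int, c1: int,
                   n_bins: int = 9) -> List[float]:
    """Magnitude-weighted histogram of unsigned gradient orientations for one cell."""
    hist = [0.0] * n_bins
    h, w = len(img), len(img[0])
    bin_width = 180.0 / n_bins
    # Border pixels are skipped because the centred difference needs both neighbours.
    for r in range(max(r0, 1), min(r1, h - 1)):
        for c in range(max(c0, 1), min(c1, w - 1)):
            gx = img[r][c + 1] - img[r][c - 1]
            gy = img[r + 1][c] - img[r - 1][c]
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle / bin_width) % n_bins] += magnitude
    return hist


def l2_normalise(vector: List[float], eps: float = 1e-6) -> List[float]:
    """L2 normalisation; eps guards against division by zero in flat regions."""
    norm = math.sqrt(sum(x * x for x in vector) + eps * eps)
    return [x / norm for x in vector]
```

In a fuller descriptor, the histograms of several neighbouring cells would be concatenated into a block, normalised together, and the overlapping blocks gathered into the final feature vector.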

Journal Article•DOI•

Information gain and divergence-based feature selection for machine learning-based text categorization

Changki Lee¹, Gary Geunbae Lee¹

Pohang University of Science and Technology¹

01 Jan 2006

TL;DR: This paper introduces a new information gain and divergence-based feature selection method for statistical machine learning-based text categorization without relying on more complex dependence models.

Abstract: Most previous work on feature selection emphasized only the reduction of high dimensionality of the feature space. But in cases where many features are highly redundant with each other, we must utilize other means, for example, more complex dependence models such as Bayesian network classifiers. In this paper, we introduce a new information gain and divergence-based feature selection method for statistical machine learning-based text categorization without relying on more complex dependence models. Our feature selection method strives to reduce redundancy between features while maintaining information gain in selecting appropriate features for text categorization. Empirical results are given on a number of datasets, showing that our feature selection method is more effective than Koller and Sahami's method [Koller, D., & Sahami, M. (1996). Toward optimal feature selection. In Proceedings of ICML-96, 13th international conference on machine learning], which is one of the greedy feature selection methods, and conventional information gain, which is commonly used in feature selection for text categorization. Moreover, our feature selection method sometimes produces more improvements of conventional machine learning algorithms over support vector machines which are known to give the best classification accuracy.

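For reference, the conventional information gain criterion that the abstract uses as a baseline can be sketched as follows. This Python example covers only that baseline, under the assumption that each document is a set of terms with a single class label. The paper's divergence-based redundancy reduction is not reproduced here, and the names `entropy` and `information_gain` are illustrative.

```python
import math
from collections import Counter
from typing import Hashable, List, Sequence, Set, Tuple

Document = Tuple[Set[str], Hashable]  # (terms in the document, class label)


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((k / n) * math.log2(k / n) for k in Counter(labels).values())


def information_gain(docs: List[Document], term: str) -> float:
    """Reduction in class entropy from knowing whether `term` occurs in a document."""
    all_labels = [label for _, label in docs]
    present = [label for terms, label in docs if term in terms]
    absent = [label for terms, label in docs if term not in terms]
    n = len(docs)
    return (entropy(all_labels)
            - len(present) / n * entropy(present)
            - len(absent) / n * entropy(absent))
```

Ranking terms by this score alone ignores redundancy: two terms that always co-occur receive the same score and would both be selected, which is the weakness the abstract's method is meant to address.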

Journal Article•DOI•

A Real-Time EMG Pattern Recognition System Based on Linear-Nonlinear Feature Projection for a Multifunction Myoelectric Hand

Jun-Uk Chu, Inhyuk Moon¹, Mu-Seong Mun

Dong-eui University¹

16 Oct 2006-IEEE Transactions on Biomedical Engineering

TL;DR: This paper proposes a novel real-time electromyogram (EMG) pattern recognition method for the control of a multifunction myoelectric hand from four-channel EMG signals, using a wavelet packet transform and a linear-nonlinear feature projection composed of principal components analysis (PCA) and a self-organizing feature map (SOFM).

Abstract: This paper proposes a novel real-time electromyogram (EMG) pattern recognition method for the control of a multifunction myoelectric hand from four-channel EMG signals. To extract a feature vector from the EMG signal, we use a wavelet packet transform that is a generalized version of wavelet transform. For dimensionality reduction and nonlinear mapping of the features, we also propose a linear-nonlinear feature projection composed of principal components analysis (PCA) and a self-organizing feature map (SOFM). The dimensionality reduction by PCA simplifies the structure of the classifier and reduces processing time for the pattern recognition. The nonlinear mapping by SOFM transforms the PCA-reduced features into a new feature space with high class separability. Finally, a multilayer perceptron (MLP) is used as the classifier. Using an analysis of class separability by feature projections, we show that the recognition accuracy depends more on the class separability of the projected features than on the MLP's class separation ability. Consequently, the proposed linear-nonlinear projection method improves class separability and recognition accuracy. We implement a real-time control system for a multifunction virtual hand. Our experimental results show that all processes, including virtual hand control, are completed within 125 ms, and the proposed method is applicable to real-time myoelectric hand control without an operational time delay.

Journal Article•DOI•

Capitalize on dimensionality increasing techniques for improving face recognition grand challenge performance

Chengjun Liu¹

New Jersey Institute of Technology¹

01 May 2006-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: A novel pattern recognition framework that integrates Gabor image representation, a novel multiclass kernel Fisher analysis (KFA) method, and fractional power polynomial models for improving pattern recognition performance is presented.

Abstract: This paper presents a novel pattern recognition framework by capitalizing on dimensionality increasing techniques. In particular, the framework integrates Gabor image representation, a novel multiclass kernel Fisher analysis (KFA) method, and fractional power polynomial models for improving pattern recognition performance. Gabor image representation, which increases dimensionality by incorporating Gabor filters with different scales and orientations, is characterized by spatial frequency, spatial locality, and orientational selectivity for coping with image variabilities such as illumination variations. The KFA method first performs nonlinear mapping from the input space to a high-dimensional feature space, and then implements the multiclass Fisher discriminant analysis in the feature space. The significance of the nonlinear mapping is that it increases the discriminating power of the KFA method, which is linear in the feature space but nonlinear in the input space. The novelty of the KFA method comes from the fact that 1) it extends the two-class kernel Fisher methods by addressing multiclass pattern classification problems and 2) it improves upon the traditional generalized discriminant analysis (GDA) method by deriving a unique solution (compared to the GDA solution, which is not unique). The fractional power polynomial models further improve performance of the proposed pattern recognition framework. Experiments on face recognition using both the FERET database and the FRGC (face recognition grand challenge) databases show the feasibility of the proposed framework. In particular, experimental results using the FERET database show that the KFA method performs better than the GDA method and the fractional power polynomial models help both the KFA method and the GDA method improve their face recognition performance. Experimental results using the FRGC databases show that the proposed pattern recognition framework improves face recognition performance upon the BEE baseline algorithm and the LDA-based baseline algorithm by large margins.

Journal Article•DOI•

A self-organizing learning array system for power quality classification based on wavelet transform

Haibo He¹, Janusz A. Starzyk¹

Ohio University¹

01 Jan 2006-IEEE Transactions on Power Delivery

TL;DR: It is shown that there is no statistically significant difference in performance of the proposed method for PQ classification when different wavelets are chosen, which means one can choose the wavelet with short wavelet filter length to achieve good classification results as well as small computational cost.

Abstract: This paper proposes a novel approach to Power Quality (PQ) disturbance classification based on the wavelet transform and a self-organizing learning array (SOLAR) system. The wavelet transform is utilized to extract feature vectors for various PQ disturbances based on multiresolution analysis (MRA). These feature vectors are then applied to a SOLAR system for training and testing. SOLAR has three advantages over a typical neural network: data-driven learning, local interconnections and entropy-based self-organization. Several typical PQ disturbances are taken into consideration in this paper. Comparisons between the proposed method, the support vector machine (SVM) method and results reported in the existing literature show that the proposed method can provide accurate classification results. By the hypothesis test of the averages, it is shown that there is no statistically significant difference in performance of the proposed method for PQ classification when different wavelets are chosen. This means one can choose the wavelet with short wavelet filter length to achieve good classification results as well as small computational cost. Gaussian white noise is considered and the Monte Carlo method is used to simulate the performance of the proposed method in different noise conditions.

Proceedings Article•DOI•

Using an Ensemble of One-Class SVM Classifiers to Harden Payload-based Anomaly Detection Systems

Roberto Perdisci¹, Guofei Gu¹, Wenke Lee¹

Georgia Institute of Technology¹

18 Dec 2006

TL;DR: This paper proposes a new approach to construct high speed payload-based anomaly IDS intended to be accurate and hard to evade, and uses a feature clustering algorithm originally proposed for text classification problems to reduce the dimensionality of the feature space.

Abstract: Unsupervised or unlabeled learning approaches for network anomaly detection have been recently proposed. In particular, recent work on unlabeled anomaly detection focused on high speed classification based on simple payload statistics. For example, PAYL, an anomaly IDS, measures the occurrence frequency in the payload of n-grams. A simple model of normal traffic is then constructed according to this description of the packets' content. It has been demonstrated that anomaly detectors based on payload statistics can be "evaded" by mimicry attacks using byte substitution and padding techniques. In this paper we propose a new approach to construct high speed payload-based anomaly IDS intended to be accurate and hard to evade. We propose a new technique to extract the features from the payload. We use a feature clustering algorithm originally proposed for text classification problems to reduce the dimensionality of the feature space. Accuracy and hardness of evasion are obtained by constructing our anomaly-based IDS using an ensemble of one-class SVM classifiers that work on different feature spaces.

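The simple payload statistic that the abstract attributes to PAYL, the occurrence frequency of n-grams in a payload, can be sketched as below. This Python example illustrates only that baseline statistic. The new feature extraction technique, the feature clustering, and the one-class SVM ensemble proposed in the paper are not shown, and the function name `ngram_frequencies` is hypothetical.

```python
from collections import Counter
from typing import Dict


def ngram_frequencies(payload: bytes, n: int = 2) -> Dict[bytes, float]:
    """Relative frequency of each byte n-gram found in a packet payload."""
    total = len(payload) - n + 1  # number of sliding windows of length n
    if total <= 0:
        return {}
    counts = Counter(payload[i:i + n] for i in range(total))
    return {gram: count / total for gram, count in counts.items()}
```

For example, `ngram_frequencies(b"abab", 2)` yields a frequency of 2/3 for `b"ab"` and 1/3 for `b"ba"`. With n = 2 the full feature space already has 65,536 possible dimensions, which is one reason dimensionality reduction matters for this kind of detector.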

Book Chapter•DOI•

Kernel-Based Reinforcement Learning

Guanghua Hu¹, Yuqin Qiu¹, Liming Xiang²

Yunnan University¹, City University of Hong Kong²

16 Aug 2006

TL;DR: Two kernel-based reinforcement learning algorithms, ε-insensitive kernel-based reinforcement learning (ε-KRL) and least squares kernel-based reinforcement learning (LS-KRL), are proposed, and an example shows that the proposed methods can deal effectively with the reinforcement learning problem without having to explore many states.

Abstract: We consider the problem of approximating the cost-to-go functions in reinforcement learning. By mapping the state implicitly into a feature space, we perform a simple algorithm in the feature space, which corresponds to a complex algorithm in the original state space. Two kernel-based reinforcement learning algorithms, the ε-insensitive kernel-based reinforcement learning (ε-KRL) and the least squares kernel-based reinforcement learning (LS-KRL), are proposed. An example shows that the proposed methods can deal effectively with the reinforcement learning problem without having to explore many states.

Journal Article•DOI•

Multi-time scale stream flow predictions: The support vector machines approach

Tirusew Asefa¹, Mariush Kemblowski¹, Mac McKee¹, Abedalrazq F. Khalil¹

Utah State University¹

01 Mar 2006-Journal of Hydrology

TL;DR: New data-driven models based on Statistical Learning Theory that were used to forecast flows at two time scales: seasonal flow volumes and hourly stream flows showed a promising performance in solving site-specific, real-time water resources management problems.

Journal Article•DOI•

Real-time speaker identification and verification

Tomi Kinnunen¹, Evgeny Karpov¹, Pasi Fränti¹

University of Eastern Finland¹

01 Dec 2006-IEEE Transactions on Audio, Speech, and Language Processing

TL;DR: This paper focuses on optimizing vector quantization (VQ) based speaker identification, reducing the number of test vectors by pre-quantizing the test sequence prior to matching, and the number of speakers by pruning out unlikely speakers during the identification process.

Abstract: In speaker identification, most of the computation originates from the distance or likelihood computations between the feature vectors of the unknown speaker and the models in the database. The identification time depends on the number of feature vectors, their dimensionality, the complexity of the speaker models and the number of speakers. In this paper, we concentrate on optimizing vector quantization (VQ) based speaker identification. We reduce the number of test vectors by pre-quantizing the test sequence prior to matching, and the number of speakers by pruning out unlikely speakers during the identification process. The best variants are then generalized to Gaussian mixture model (GMM) based modeling. We apply the algorithms also to efficient cohort set search for score normalization in speaker verification. We obtain a speed-up factor of 16:1 in the case of VQ-based modeling with minor degradation in the identification accuracy, and 34:1 in the case of GMM-based modeling. An equal error rate of 7% can be reached in 0.84 s on average when the length of test utterance is 30.4 s.

Patent•

Trust propagation through both explicit and implicit social networks

Pavel Berkhim¹, Zhichen Xu¹, Jianchang Mao¹, Daniel E. Rose¹, Abe Taha¹, Farzin Maghoul¹•Institutions (1)

Yahoo!¹

02 Aug 2006

TL;DR: A method for trust propagation that calculates a first feature vector for a first user and a second feature vector for a second user, then compares the two vectors to calculate a similarity value.

Abstract: The present invention is directed towards systems and methods for trust propagation. The method according to one embodiment comprises calculating a first feature vector for a first user, calculating a second feature for a second user and comparing the first feature vector with the second feature vector to calculate a similarity value. A determination is made as to whether the similarity value falls within a threshold. If the similarity value falls within the threshold, a relationship is recorded between the first user and the second user in a first user profile and a second user profile.
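
The compare-and-record step described in this abstract might look like the following sketch. The abstract does not say which similarity measure is used or how the threshold test is defined, so cosine similarity, the `>=` comparison, the default threshold of 0.8, and all names here are assumptions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def link_if_similar(profiles, features, user_a, user_b, threshold=0.8):
    """Record a mutual relationship when the two users are similar enough.

    profiles maps a user to the set of related users; features maps a user
    to a feature vector. Both structures are hypothetical.
    """
    sim = cosine_similarity(features[user_a], features[user_b])
    if sim >= threshold:
        # The relationship is written into both user profiles.
        profiles.setdefault(user_a, set()).add(user_b)
        profiles.setdefault(user_b, set()).add(user_a)
    return sim
```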

Journal Article•DOI•

Random Sampling for Subspace Face Recognition

Xiaogang Wang¹, Xiaoou Tang¹•Institutions (1)

The Chinese University of Hong Kong¹

01 Oct 2006-International Journal of Computer Vision

TL;DR: An ensemble learning framework based on random sampling on all three key components of a classification system: the feature space, training samples, and subspace parameters is developed, and a robust random sampling face recognition system integrating shape, texture, and Gabor responses is constructed.

Abstract: Subspace face recognition often suffers from two problems: (1) the training sample set is small compared with the high dimensional feature vector; (2) the performance is sensitive to the subspace dimension. Instead of pursuing a single optimal subspace, we develop an ensemble learning framework based on random sampling on all three key components of a classification system: the feature space, training samples, and subspace parameters. Fisherface and Null Space LDA (N-LDA) are two conventional approaches to address the small sample size problem. But in many cases, these LDA classifiers are overfitted to the training set and discard some useful discriminative information. By analyzing different overfitting problems for the two kinds of LDA classifiers, we use random subspace and bagging to improve them respectively. By random sampling on feature vectors and training samples, multiple stabilized Fisherface and N-LDA classifiers are constructed and the two groups of complementary classifiers are integrated using a fusion rule, so nearly all the discriminative information is preserved. In addition, we further apply random sampling on parameter selection in order to overcome the difficulty of selecting optimal parameters in our algorithms. Then, we use the developed random sampling framework for the integration of multiple features. A robust random sampling face recognition system integrating shape, texture, and Gabor responses is finally constructed.
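
Random sampling on the feature space, one of the three components named in this abstract, can be sketched as follows. The sketch swaps the paper's Fisherface and N-LDA classifiers for a plain nearest-centroid classifier and uses majority voting as the fusion rule; both substitutions, and every name and parameter, are assumptions made to keep the example short.

```python
import random
from collections import Counter


def centroid_classifier(samples, labels, dims):
    """Fit class centroids using only the feature indices in dims."""
    counts = Counter(labels)
    sums = {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(dims))
        for i, d in enumerate(dims):
            acc[i] += x[d]
    centroids = {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

    def predict(x):
        # Assign x to the class with the nearest centroid in the subspace.
        return min(
            centroids,
            key=lambda y: sum((x[d] - c) ** 2 for d, c in zip(dims, centroids[y])),
        )

    return predict


def random_subspace_ensemble(samples, labels, n_models=5, subspace_size=2, seed=0):
    """Train one classifier per random feature subset and fuse by majority vote."""
    rng = random.Random(seed)
    n_features = len(samples[0])
    models = [
        centroid_classifier(samples, labels, rng.sample(range(n_features), subspace_size))
        for _ in range(n_models)
    ]

    def predict(x):
        votes = Counter(model(x) for model in models)
        return votes.most_common(1)[0][0]

    return predict
```

Each base classifier sees fewer dimensions than the full feature vector, which is the motivation the abstract gives for the small-sample-size setting.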

The Power of Word Clusters for Text Classification

Noam Slonim¹, Naftali Tishby¹•Institutions (1)

Hebrew University of Jerusalem¹

01 Jan 2006

TL;DR: This work applies the information bottleneck method to find word-clusters that preserve the information about document categories and use these clusters as features for classification, and shows that when the training sample is small word clusters can yield significant improvement in classification accuracy.

Abstract: The recently introduced Information Bottleneck method [21] provides an information theoretic framework, for extracting features of one variable, that are relevant for the values of another variable. Several previous works already suggested applying this method for document clustering, gene expression data analysis, spectral analysis and more. In this work we present a novel implementation of this method for supervised text classification. Specifically, we apply the information bottleneck method to find word-clusters that preserve the information about document categories and use these clusters as features for classification. Previous work [1] used a similar clustering procedure to show that word-clusters can significantly reduce the feature space dimensionality, with only a minor change in classification accuracy. In this work we reproduce these results and go further to show that when the training sample is small word clusters can yield significant improvement in classification accuracy (up to 18%) over the performance using the words directly.

Journal Article•DOI•

Random subspace method for multivariate feature selection

Carmen Lai¹, Marcel J. T. Reinders¹, Lodewyk F. A. Wessels¹•Institutions (1)

Delft University of Technology¹

15 Jul 2006-Pattern Recognition Letters

TL;DR: A new multivariate search technique is introduced that is less sensitive to noise in the data and computationally feasible, and the robustness and reliability of the novel multivariate feature selection method are compared.

Book Chapter•DOI•

Inter-modality face recognition

Dahua Lin¹, Xiaoou Tang¹•Institutions (1)

The Chinese University of Hong Kong¹

07 May 2006

TL;DR: A novel algorithm called Common Discriminant Feature Extraction specially tailored to the inter-modality face recognition problem is proposed and two nonlinear extensions of the algorithm are developed: one is based on kernelization, while the other is a multi-mode framework.

Abstract: Recently, the wide deployment of practical face recognition systems gives rise to the emergence of the inter-modality face recognition problem. In this problem, the face images in the database and the query images captured on spot are acquired under quite different conditions or even using different equipments. Conventional approaches either treat the samples in a uniform model or introduce an intermediate conversion stage, both of which would lead to severe performance degradation due to the great discrepancies between different modalities. In this paper, we propose a novel algorithm called Common Discriminant Feature Extraction specially tailored to the inter-modality problem. In the algorithm, two transforms are simultaneously learned to transform the samples in both modalities respectively to the common feature space. We formulate the learning objective by incorporating both the empirical discriminative power and the local smoothness of the feature transformation. By explicitly controlling the model complexity through the smoothness constraint, we can effectively reduce the risk of overfitting and enhance the generalization capability. Furthermore, to cope with the nongaussian distribution and diverse variations in the sample space, we develop two nonlinear extensions of the algorithm: one is based on kernelization, while the other is a multi-mode framework. These extensions substantially improve the recognition performance in complex situation. Extensive experiments are conducted to test our algorithms in two application scenarios: optical image-infrared image recognition and photo-sketch recognition. Our algorithms show excellent performance in the experiments.

Journal Article•DOI•

FS_SFS: A novel feature selection method for support vector machines

Yi Liu¹, Yuan F. Zheng¹•Institutions (1)

Ohio State University¹

01 Jul 2006-Pattern Recognition

TL;DR: A novel feature selection method named filtered and supported sequential forward search (FS_SFS) in the context of support vector machines (SVM) is presented, which has two important properties to reduce the time of computation.

Journal Article•

Inter-modality Face Recognition

Dahua Lin¹, Xiaoou Tang¹•Institutions (1)

The Chinese University of Hong Kong¹

01 Jan 2006-Lecture Notes in Computer Science

TL;DR: This paper proposes a Common Discriminant Feature Extraction (CDFE) algorithm for inter-modality face recognition, in which two transforms are simultaneously learned to map the samples of both modalities into a common feature space.

Abstract: Recently, the wide deployment of practical face recognition systems gives rise to the emergence of the inter-modality face recognition problem. In this problem, the face images in the database and the query images captured on spot are acquired under quite different conditions or even using different equipments. Conventional approaches either treat the samples in a uniform model or introduce an intermediate conversion stage, both of which would lead to severe performance degradation due to the great discrepancies between different modalities. In this paper, we propose a novel algorithm called Common Discriminant Feature Extraction specially tailored to the inter-modality problem. In the algorithm, two transforms are simultaneously learned to transform the samples in both modalities respectively to the common feature space. We formulate the learning objective by incorporating both the empirical discriminative power and the local smoothness of the feature transformation. By explicitly controlling the model complexity through the smoothness constraint, we can effectively reduce the risk of overfitting and enhance the generalization capability. Furthermore, to cope with the nongaussian distribution and diverse variations in the sample space, we develop two non-linear extensions of the algorithm: one is based on kernelization, while the other is a multi-mode framework. These extensions substantially improve the recognition performance in complex situation. Extensive experiments are conducted to test our algorithms in two application scenarios: optical image-infrared image recognition and photo-sketch recognition. Our algorithms show excellent performance in the experiments.

Journal Article•DOI•

A general method for human activity recognition in video

Neil Robertson¹, Ian Reid¹•Institutions (1)

University of Oxford¹

01 Nov 2006-Computer Vision and Image Understanding

TL;DR: A system for human behaviour recognition in video sequences that combines Bayesian networks and belief propagation with non-parametric sampling from a previously learned database of actions; Hidden Markov Models that encode scene rules are used to smooth sequences of actions.

Patent•

Identifying language of origin for words using estimates of normalized appearance frequency

Yi Ning Chen¹, Min Chu¹, Jiali You¹, Frank K. Soong¹•Institutions (1)

Microsoft¹

01 Sep 2006

TL;DR: The language of origin of a word or named entity is predicted using estimates of its normalized frequency of occurrence in a variety of different languages.

Abstract: The language of origin of a word or named entity is predicted using estimates of frequency of occurrence of the word or named entity in different languages. In one embodiment, the normalized frequency of occurrence of the word or named entity in a variety of different languages is estimated and the values are used as features in a feature vector which is scored and used to identify language of origin.
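
A feature vector of normalized per-language frequencies, as described in this abstract, could be built roughly as below. The abstract does not specify how frequencies are estimated or how the vector is scored, so the corpus-count inputs, the normalization to unit sum, the argmax scoring, and all names are assumptions.

```python
def normalized_frequency_features(word, counts, totals, languages):
    """Build one feature per language from the word's relative frequency.

    counts[lang][word] is a hypothetical occurrence count and totals[lang]
    the hypothetical size of that language's corpus.
    """
    relative = [counts.get(lang, {}).get(word, 0) / totals[lang] for lang in languages]
    total = sum(relative)
    # Normalize so the features sum to 1; an unseen word maps to all zeros.
    return [r / total for r in relative] if total else [0.0] * len(languages)


def predict_origin(word, counts, totals, languages):
    """Score the feature vector by picking the language with the largest feature."""
    features = normalized_frequency_features(word, counts, totals, languages)
    best = max(range(len(languages)), key=lambda i: features[i])
    return languages[best]
```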

Book Chapter•DOI•

Comparison of SVM and Some Older Classification Algorithms in Text Classification Tasks

Fabrice Colas¹, Pavel Brazdil²•Institutions (2)

Leiden University¹, University of Porto²

21 Aug 2006

TL;DR: Following the rising interest in the Support Vector Machine, various studies showed that SVM outperforms other classification algorithms; the authors ask whether other classification algorithms should simply be set aside in favour of always using SVM.

Abstract: Document classification has already been widely studied. In fact, some studies compared feature selection techniques or feature space transformation whereas some others compared the performance of different algorithms. Recently, following the rising interest towards the Support Vector Machine, various studies showed that SVM outperforms other classification algorithms. So should we just not bother about other classification algorithms and opt always for SVM?

Proceedings Article•DOI•

Human action recognition using star skeleton

Hsuan-Sheng Chen¹, Hua-Tsung Chen¹, Yi-Wen Chen¹, Suh-Yin Lee¹•Institutions (1)

National Chiao Tung University¹

27 Oct 2006

TL;DR: This paper presents an HMM-based methodology for action recognition using star skeleton as a representative descriptor of human posture, and implements a system to automatically recognize ten different types of actions.

Abstract: This paper presents a HMM-based methodology for action recognition using star skeleton as a representative descriptor of human posture. Star skeleton is a fast skeletonization technique by connecting from centroid of target object to contour extremes. To use star skeleton as feature for action recognition, we clearly define the feature as a five-dimensional vector in star fashion because the head and four limbs are usually local extremes of human shape. In our proposed method, an action is composed of a series of star skeletons over time. Therefore, time-sequential images expressing human action are transformed into a feature vector sequence. Then the feature vector sequence must be transformed into symbol sequence so that HMM can model the action. We design a posture codebook, which contains representative star skeletons of each action type and define a star distance to measure the similarity between feature vectors. Each feature vector of the sequence is matched against the codebook and is assigned to the symbol that is most similar. Consequently, the time-sequential images are converted to a symbol posture sequence. We use HMMs to model each action types to be recognized. In the training phase, the model parameters of the HMM of each category are optimized so as to best describe the training symbol sequences. For human action recognition, the model which best matches the observed symbol sequence is selected as the recognized category. We implement a system to automatically recognize ten different types of actions, and the system has been tested on real human action videos in two cases. One case is the classification of 100 video clips, each containing a single action type. A 98% recognition rate is obtained. The other case is a more realistic situation in which human takes a series of actions combined. An action-series recognition is achieved by referring a period of posture history using a sliding window scheme. The experimental results show promising performance.
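
The codebook-matching step in this abstract, which turns a feature vector sequence into a symbol sequence for the HMM, can be sketched as a nearest-neighbour lookup. The paper defines its own star distance, which is not given here, so squared Euclidean distance stands in for it; the function name and the toy codebook are also assumptions.

```python
def quantize_sequence(feature_sequence, codebook):
    """Map each feature vector to the index of its most similar codebook entry.

    Squared Euclidean distance is used as a stand-in for the paper's star
    distance, which this sketch does not reproduce.
    """
    symbols = []
    for vec in feature_sequence:
        best = min(
            range(len(codebook)),
            key=lambda k: sum((a - b) ** 2 for a, b in zip(vec, codebook[k])),
        )
        symbols.append(best)
    return symbols


# Usage: three hypothetical five-dimensional posture prototypes.
codebook = [(0, 0, 0, 0, 0), (1, 1, 1, 1, 1), (2, 2, 2, 2, 2)]
sequence = [(0.1, 0, 0, 0, 0.1), (0.9, 1, 1.1, 1, 1), (2, 2, 1.9, 2, 2.2)]
symbols = quantize_sequence(sequence, codebook)
```

The resulting integer sequence is the kind of discrete observation sequence a discrete-output HMM would be trained on and scored against.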
