
Showing papers on "Linear discriminant analysis published in 2005"


Journal ArticleDOI
TL;DR: Experimental results suggest that the proposed Laplacianface approach provides a better representation and achieves lower error rates in face recognition.
Abstract: We propose an appearance-based face recognition method called the Laplacianface approach. By using locality preserving projections (LPP), the face images are mapped into a face subspace for analysis. Different from principal component analysis (PCA) and linear discriminant analysis (LDA), which effectively see only the Euclidean structure of face space, LPP finds an embedding that preserves local information, and obtains a face subspace that best detects the essential face manifold structure. The Laplacianfaces are the optimal linear approximations to the eigenfunctions of the Laplace-Beltrami operator on the face manifold. In this way, the unwanted variations resulting from changes in lighting, facial expression, and pose may be eliminated or reduced. Theoretical analysis shows that PCA, LDA, and LPP can be obtained from different graph models. We compare the proposed Laplacianface approach with Eigenface and Fisherface methods on three different face data sets. Experimental results suggest that the proposed Laplacianface approach provides a better representation and achieves lower error rates in face recognition.
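
A minimal numerical sketch of the LPP step described above, assuming synthetic data and NumPy/SciPy (the function and parameter values below are illustrative, not the authors' implementation): build a nearest-neighbor graph, form its Laplacian, and solve the generalized eigenproblem.

```python
import numpy as np
from scipy.linalg import eigh

def lpp(X, n_neighbors=5, t=1.0, n_components=2):
    """Project rows of X onto directions that preserve local neighborhoods."""
    n = X.shape[0]
    # Pairwise squared distances and k-NN adjacency with heat-kernel weights.
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.zeros((n, n))
    for i in range(n):
        idx = np.argsort(sq[i])[1:n_neighbors + 1]  # skip self
        W[i, idx] = np.exp(-sq[i, idx] / t)
    W = np.maximum(W, W.T)            # symmetrize the graph
    D = np.diag(W.sum(axis=1))
    L = D - W                         # graph Laplacian
    # Generalized eigenproblem X^T L X a = lambda X^T D X a;
    # the smallest eigenvalues give the embedding directions.
    A = X.T @ L @ X
    B = X.T @ D @ X + 1e-6 * np.eye(X.shape[1])  # regularize for stability
    vals, vecs = eigh(A, B)
    return vecs[:, :n_components]

X = np.random.RandomState(0).randn(100, 10)
P = lpp(X)                 # 10 x 2 projection matrix
embedding = X @ P          # 100 x 2 low-dimensional coordinates
print(embedding.shape)
```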

3,314 citations


Journal ArticleDOI
TL;DR: This paper assesses performance of regularized radial basis function neural networks (Reg-RBFNN), standard support vector machines (SVMs), kernel Fisher discriminant (KFD) analysis, and regularized AdaBoost (reg-AB) in the context of hyperspectral image classification.
Abstract: This paper presents the framework of kernel-based methods in the context of hyperspectral image classification, illustrating from a general viewpoint the main characteristics of different kernel-based approaches and analyzing their properties in the hyperspectral domain. In particular, we assess the performance of regularized radial basis function neural networks (Reg-RBFNN), standard support vector machines (SVMs), kernel Fisher discriminant (KFD) analysis, and regularized AdaBoost (Reg-AB). The novelty of this work consists in: 1) introducing Reg-RBFNN and Reg-AB for hyperspectral image classification; 2) comparing kernel-based methods by taking into account the peculiarities of hyperspectral images; and 3) clarifying their theoretical relationships. For these purposes, we focus on the accuracy of methods when working in noisy environments, high input dimension, and limited training sets. In addition, some other important issues are discussed, such as the sparsity of the solutions, the computational burden, and the capability of the methods to provide outputs that can be directly interpreted as probabilities.

1,428 citations


Journal ArticleDOI
TL;DR: This work compares several methods for estimating the 'true' prediction error of a prediction model in the presence of feature selection, and finds that LOOCV and 10-fold CV have the smallest bias for linear discriminant analysis and the .632+ bootstrap has the lowest mean square error.
Abstract: Motivation: In genomic studies, thousands of features are collected on relatively few samples. One of the goals of these studies is to build classifiers to predict the outcome of future observations. There are three inherent steps to this process: feature selection, model selection and prediction assessment. With a focus on prediction assessment, we compare several methods for estimating the 'true' prediction error of a prediction model in the presence of feature selection. Results: For small studies where features are selected from thousands of candidates, the resubstitution and simple split-sample estimates are seriously biased. In these small samples, leave-one-out cross-validation (LOOCV), 10-fold cross-validation (CV) and the .632+ bootstrap have the smallest bias for diagonal discriminant analysis, nearest neighbor and classification trees. LOOCV and 10-fold CV have the smallest bias for linear discriminant analysis. Additionally, LOOCV, 5- and 10-fold CV, and the .632+ bootstrap have the lowest mean square error. The .632+ bootstrap is quite biased in small sample sizes with strong signal-to-noise ratios. Differences in performance among resampling methods are reduced as the number of specimens available increases. Contact: annette.molinaro@yale.edu Supplementary Information: A complete compilation of results and R code for simulations and analyses are available in Molinaro et al. (2005) (http://linus.nci.nih.gov/brb/TechReport.htm).
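
The bias the authors measure comes from where feature selection sits relative to resampling. Below is a hedged scikit-learn sketch of the two estimators they favor for LDA (LOOCV and 10-fold CV), with selection refit inside every fold, against the biased variant that selects on all the data first; the data and parameter choices are synthetic assumptions.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import KFold, LeaveOneOut, cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.RandomState(0)
X = rng.randn(40, 2000)              # few samples, many candidate features
y = rng.randint(0, 2, 40)

# Selection inside the pipeline => it is refit in every fold (small bias).
clf = make_pipeline(SelectKBest(f_classif, k=10), LinearDiscriminantAnalysis())
cv10 = KFold(10, shuffle=True, random_state=0)
print("LOOCV  :", cross_val_score(clf, X, y, cv=LeaveOneOut()).mean())
print("10-fold:", cross_val_score(clf, X, y, cv=cv10).mean())

# Selecting on the full data first (wrong) inflates the estimate.
X_sel = SelectKBest(f_classif, k=10).fit_transform(X, y)
print("biased :", cross_val_score(LinearDiscriminantAnalysis(),
                                  X_sel, y, cv=cv10).mean())
```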

1,128 citations


Proceedings ArticleDOI
17 Oct 2005
TL;DR: A novel non-statistics-based face representation approach, local Gabor binary pattern histogram sequence (LGBPHS), in which no training procedure is needed to construct the face model, so that the generalizability problem is naturally avoided.
Abstract: For years, researchers in the face recognition area have been representing and recognizing faces based on subspace discriminant analysis or statistical learning. Nevertheless, these approaches always suffer from the generalizability problem. This paper proposes a novel non-statistics-based face representation approach, local Gabor binary pattern histogram sequence (LGBPHS), in which no training procedure is needed to construct the face model, so that the generalizability problem is naturally avoided. In this approach, a face image is modeled as a "histogram sequence" by concatenating the histograms of all the local regions of all the local Gabor magnitude binary pattern maps. For recognition, histogram intersection is used to measure the similarity of different LGBPHSs, and nearest-neighbor matching is used for the final classification. Additionally, we further propose to assign different weights to each histogram piece when comparing two LGBPHSs. Our experimental results on the AR and FERET face databases show the validity of the proposed approach, especially for partially occluded face images; more impressively, we have achieved the best result on the FERET face database.
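
A small sketch of the matching step the abstract describes: histogram intersection over a sequence of regional histograms, with optional per-region weights. The array shapes and weight values are illustrative assumptions, not values from the paper.

```python
import numpy as np

def histogram_intersection(h1, h2, weights=None):
    """Similarity of two histogram sequences of shape (n_regions, n_bins)."""
    inter = np.minimum(h1, h2).sum(axis=1)   # per-region intersection
    if weights is None:
        weights = np.ones(len(inter))
    return float((weights * inter).sum())

h_probe = np.random.RandomState(0).rand(16, 59)
h_gallery = np.random.RandomState(1).rand(16, 59)
w = np.ones(16)
w[4:8] = 2.0                                 # e.g., emphasize some regions
print(histogram_intersection(h_probe, h_gallery, w))
```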

1,093 citations


Proceedings ArticleDOI
27 Dec 2005
TL;DR: It is shown that existing SVM software can be used to solve the SVM/LDA formulation and empirical comparisons of the proposed algorithm with SVM and LDA using both synthetic and real world benchmark data are presented.
Abstract: This paper describes a new large margin classifier, named SVM/LDA. This classifier can be viewed as an extension of support vector machine (SVM) by incorporating some global information about the data. The SVM/LDA classifier can also be seen as a generalization of linear discriminant analysis (LDA) by incorporating the idea of (local) margin maximization into the standard LDA formulation. We show that existing SVM software can be used to solve the SVM/LDA formulation. We also present empirical comparisons of the proposed algorithm with SVM and LDA using both synthetic and real world benchmark data.

1,030 citations


Book
01 Jan 2005
TL;DR: This book reviews basic statistics with SPSS and covers measures of reliability, exploratory factor analysis, multiple regression, logistic regression and discriminant analysis, ANOVA designs, multivariate analysis of variance, and multilevel modeling.
Abstract: Introduction and Review of Basic Statistics With SPSS. Data Coding and Exploratory Analysis (EDA). Several Measures of Reliability. Exploratory Factor Analysis and Principal Components Analysis. Selecting and Interpreting Inferential Statistics. Multiple Regression. Logistic Regression and Discriminant Analysis. Factorial ANOVA and ANCOVA. Repeated Measures and Mixed ANOVAs. Multivariate Analysis of Variance (MANOVA) and Canonical Correlation. Multilevel Linear Modeling/Hierarchical Linear Modeling. Appendices.

986 citations


Journal ArticleDOI
TL;DR: This paper investigates the predictability of financial movement direction with SVM by forecasting the weekly movement direction of the NIKKEI 225 index and proposes a combining model by integrating SVM with other classification methods.

984 citations



Journal ArticleDOI
TL;DR: A two-phase KFD framework is developed, i.e., kernel principal component analysis (KPCA) plus Fisher linear discriminant analysis (LDA), which provides novel insights into the nature of KFD.
Abstract: This paper examines the theory of kernel Fisher discriminant analysis (KFD) in a Hilbert space and develops a two-phase KFD framework, i.e., kernel principal component analysis (KPCA) plus Fisher linear discriminant analysis (LDA). This framework provides novel insights into the nature of KFD. Based on this framework, the authors propose a complete kernel Fisher discriminant analysis (CKFD) algorithm. CKFD can be used to carry out discriminant analysis in "double discriminant subspaces." Because it can make full use of two kinds of discriminant information, regular and irregular, CKFD is a more powerful discriminator. The proposed algorithm was tested and evaluated using the FERET face database and the CENPARMI handwritten numeral database. The experimental results show that CKFD outperforms other KFD algorithms.
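
A hedged sketch of the two-phase framework (KPCA followed by LDA) using scikit-learn components as stand-ins; this illustrates the pipeline only, and omits the regular/irregular double-subspace step that distinguishes the full CKFD algorithm.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import KernelPCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

X, y = load_digits(return_X_y=True)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

# Phase 1: kernel PCA maps inputs into a kernel subspace;
# Phase 2: Fisher LDA finds discriminant directions in that subspace.
kfd = make_pipeline(KernelPCA(n_components=100, kernel="rbf", gamma=1e-3),
                    LinearDiscriminantAnalysis())
print(kfd.fit(Xtr, ytr).score(Xte, yte))
```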

856 citations


Journal ArticleDOI
TL;DR: The implementation of the statistical total correlation spectroscopy (STOCSY) analysis method with supervised pattern recognition and particularly orthogonal projection on latent structure-discriminant analysis (O-PLS-DA) offers a new powerful framework for analysis of metabonomic data.
Abstract: We describe here the implementation of the statistical total correlation spectroscopy (STOCSY) analysis method for aiding the identification of potential biomarker molecules in metabonomic studies based on NMR spectroscopic data. STOCSY takes advantage of the multicollinearity of the intensity variables in a set of spectra (in this case 1H NMR spectra) to generate a pseudo-two-dimensional NMR spectrum that displays the correlation among the intensities of the various peaks across the whole sample. This method is not limited to the usual connectivities that are deducible from more standard two-dimensional NMR spectroscopic methods, such as TOCSY. Moreover, two or more molecules involved in the same pathway can also present high intermolecular correlations because of biological covariance or can even be anticorrelated. This combination of STOCSY with supervised pattern recognition and particularly orthogonal projection on latent structure-discriminant analysis (O-PLS-DA) offers a new powerful framework for analysis of metabonomic data.
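
A minimal numerical illustration of the STOCSY idea under assumed toy data: correlate the intensity of one "driver" peak with every other spectral variable across the set of spectra, so structurally related peaks show up as high correlations.

```python
import numpy as np

rng = np.random.RandomState(0)
spectra = rng.rand(50, 1000)          # 50 samples x 1000 spectral variables
# Plant a coupled pair of peaks to stand in for one molecule's resonances.
spectra[:, 400] = spectra[:, 120] * 0.9 + 0.1 * rng.rand(50)

driver = 120                          # index of the peak of interest
Xc = spectra - spectra.mean(axis=0)   # center each variable
r = (Xc[:, driver] @ Xc) / (np.linalg.norm(Xc[:, driver]) *
                            np.linalg.norm(Xc, axis=0))
print(r[400])   # high correlation flags variable 400 as related to the driver
```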

823 citations


Journal ArticleDOI
TL;DR: It is demonstrated that SVM outperforms FLD in classification performance as well as in robustness of the spatial maps obtained (i.e., the SVM discrimination maps had greater overlap with the general linear model (GLM) analysis compared to the FLD).

Journal ArticleDOI
TL;DR: A novel document clustering method that clusters documents into different semantic classes by using locality preserving indexing (LPI), which can be viewed as an unsupervised approximation of the supervised linear discriminant analysis (LDA) method; this connection gives the intuitive motivation for the method.
Abstract: We propose a novel document clustering method which aims to cluster the documents into different semantic classes. The document space is generally of high dimensionality and clustering in such a high dimensional space is often infeasible due to the curse of dimensionality. By using locality preserving indexing (LPI), the documents can be projected into a lower-dimensional semantic space in which the documents related to the same semantics are close to each other. Different from previous document clustering methods based on latent semantic indexing (LSI) or nonnegative matrix factorization (NMF), our method tries to discover both the geometric and discriminating structures of the document space. Theoretical analysis of our method shows that LPI is an unsupervised approximation of the supervised linear discriminant analysis (LDA) method, which gives the intuitive motivation of our method. Extensive experimental evaluations are performed on the Reuters-21578 and TDT2 data sets.
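
An illustrative pipeline under stated assumptions: TF-IDF vectors, a Laplacian-eigenmaps embedding (the nonlinear cousin of LPI) as a stand-in for the paper's projection, then k-means in the reduced semantic space. This is a sketch of the idea, not the paper's exact algorithm, and the tiny corpus is invented.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.manifold import SpectralEmbedding

docs = ["the rocket launch was delayed", "nasa tested the new rocket engine",
        "the spacecraft entered orbit", "the car engine needs new oil",
        "he repaired the car brakes", "the sedan has great fuel economy"]

# Documents as TF-IDF vectors in a high-dimensional term space.
X = TfidfVectorizer().fit_transform(docs).toarray()

# Project into a low-dimensional "semantic" space that preserves locality,
# then cluster there instead of in the raw term space.
Z = SpectralEmbedding(n_components=2, n_neighbors=3).fit_transform(X)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(Z)
print(labels)   # space-themed vs. car-themed documents
```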

Journal ArticleDOI
TL;DR: An innovative algorithm named 2D-LDA is proposed, which directly extracts the proper features from image matrices based on Fisher's Linear Discriminant Analysis, and achieves the best performance.

Proceedings ArticleDOI
20 Jun 2005
TL;DR: The system operates in real time, obtained 93% correct generalization to novel subjects for a 7-way forced choice on the Cohn-Kanade expression dataset, and classifies 17 action units with a mean accuracy of 94.8%.
Abstract: We present a systematic comparison of machine learning methods applied to the problem of fully automatic recognition of facial expressions. We report results on a series of experiments comparing recognition engines, including AdaBoost, support vector machines, and linear discriminant analysis. We also explored feature selection techniques, including the use of AdaBoost for feature selection prior to classification by SVM or LDA. Best results were obtained by selecting a subset of Gabor filters using AdaBoost followed by classification with support vector machines. The system operates in real time, and obtained 93% correct generalization to novel subjects for a 7-way forced choice on the Cohn-Kanade expression dataset. The outputs of the classifiers change smoothly as a function of time and thus can be used to measure facial expression dynamics. We applied the system to fully automated recognition of facial actions from the Facial Action Coding System (FACS). The present system classifies 17 action units, whether they occur singly or in combination with other actions, with a mean accuracy of 94.8%. We present preliminary results for applying this system to spontaneous facial expressions.
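
A hedged sketch of the winning recipe reported above, AdaBoost for feature selection followed by an SVM; synthetic features stand in for the Gabor filter outputs, and all parameter values are assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, n_features=200, n_informative=15,
                           random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

# AdaBoost on decision stumps; feature_importances_ ranks the features by
# how often (and how usefully) the boosted stumps selected them.
ada = AdaBoostClassifier(n_estimators=100, random_state=0).fit(Xtr, ytr)
top = np.argsort(ada.feature_importances_)[::-1][:30]

# Classify with an SVM on the AdaBoost-selected feature subset.
svm = SVC(kernel="linear").fit(Xtr[:, top], ytr)
print("accuracy:", svm.score(Xte[:, top], yte))
```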

Journal ArticleDOI
TL;DR: The GMM-based limb motion classification system demonstrates exceptional classification accuracy and results in a robust method of motion classification with low computational load.
Abstract: This paper introduces and evaluates the use of Gaussian mixture models (GMMs) for multiple limb motion classification using continuous myoelectric signals. The focus of this work is to optimize the configuration of this classification scheme. To that end, a complete experimental evaluation of this system is conducted on a 12 subject database. The experiments examine the GMM's algorithmic issues, including model order selection and variance limiting, the segmentation of the data, and various feature sets including time-domain features and autoregressive features. The benefits of postprocessing the results using a majority vote rule are demonstrated. The performance of the GMM is compared to three commonly used classifiers: a linear discriminant analysis, a linear perceptron network, and a multilayer perceptron neural network. The GMM-based limb motion classification system demonstrates exceptional classification accuracy and results in a robust method of motion classification with low computational load.
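
A minimal sketch of the classification scheme under assumed synthetic features: one Gaussian mixture per motion class, maximum-likelihood assignment, and a majority vote over a trailing window to smooth the decision stream (here reg_covar plays the role of variance limiting). Illustrative only.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.RandomState(0)
# Two fake motion classes in a 4-D time-domain feature space.
Xtr = np.vstack([rng.randn(200, 4), rng.randn(200, 4) + 2.0])
ytr = np.repeat([0, 1], 200)

# Fit one GMM per class; classify by the larger log-likelihood.
gmms = [GaussianMixture(n_components=3, reg_covar=1e-4,
                        random_state=0).fit(Xtr[ytr == c]) for c in (0, 1)]

def classify(X):
    ll = np.column_stack([g.score_samples(X) for g in gmms])
    return ll.argmax(axis=1)

def majority_vote(labels, window=5):
    # Each output is the most frequent label within the trailing window.
    out = labels.copy()
    for i in range(window, len(labels)):
        out[i] = np.bincount(labels[i - window:i + 1]).argmax()
    return out

stream = rng.randn(100, 4) + 2.0               # a burst of class-1 data
print(majority_vote(classify(stream)).mean())  # fraction labeled class 1
```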

Journal Article
TL;DR: This work presents the Relevant Component Analysis algorithm, which is a simple and efficient algorithm for learning a Mahalanobis metric, and shows that RCA is the solution of an interesting optimization problem, founded on an information theoretic basis.
Abstract: Many learning algorithms use a metric defined over the input space as a principal tool, and their performance critically depends on the quality of this metric. We address the problem of learning metrics using side-information in the form of equivalence constraints. Unlike labels, we demonstrate that this type of side-information can sometimes be automatically obtained without the need for human intervention. We show how such side-information can be used to modify the representation of the data, leading to improved clustering and classification. Specifically, we present the Relevant Component Analysis (RCA) algorithm, which is a simple and efficient algorithm for learning a Mahalanobis metric. We show that RCA is the solution of an interesting optimization problem, founded on an information theoretic basis. If dimensionality reduction is allowed within RCA, we show that it is optimally accomplished by a version of Fisher's linear discriminant that uses constraints. Moreover, under certain Gaussian assumptions, RCA can be viewed as a Maximum Likelihood estimation of the within class covariance matrix. We conclude with extensive empirical evaluations of RCA, showing its advantage over alternative methods.
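
A compact sketch of the core RCA computation, assuming toy chunklets (small sets of points known to share a class): pool the within-chunklet covariance and whiten the data with its inverse square root to obtain the learned Mahalanobis metric.

```python
import numpy as np

rng = np.random.RandomState(0)
X = rng.randn(90, 3) @ np.diag([3.0, 1.0, 0.2])    # anisotropic data
chunklets = [X[i:i + 3] for i in range(0, 90, 3)]  # points known to co-belong

# Within-chunklet covariance: scatter of chunklet-centered points.
centered = np.vstack([c - c.mean(axis=0) for c in chunklets])
C = centered.T @ centered / len(centered)

# Whitening transform C^(-1/2) via the eigendecomposition of C.
vals, vecs = np.linalg.eigh(C)
W = vecs @ np.diag(vals ** -0.5) @ vecs.T

X_rca = X @ W.T
print(np.cov(X_rca.T).round(2))   # within-chunklet directions equalized
```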

Journal ArticleDOI
TL;DR: This paper describes a simple set of "recipes" for the analysis of high spatial density EEG, and demonstrates how corresponding algorithms can be used to remove eye-motion artifacts, extract strong evoked responses, and decompose temporally overlapping components.


Journal ArticleDOI
TL;DR: The proposed discriminative common vector method based on a variation of Fisher's linear discriminant analysis for the small sample size case is superior to other methods in terms of recognition accuracy, efficiency, and numerical stability.
Abstract: In face recognition tasks, the dimension of the sample space is typically larger than the number of the samples in the training set. As a consequence, the within-class scatter matrix is singular and the linear discriminant analysis (LDA) method cannot be applied directly. This problem is known as the "small sample size" problem. In this paper, we propose a new face recognition method called the discriminative common vector method based on a variation of Fisher's linear discriminant analysis for the small sample size case. Two different algorithms are given to extract the discriminative common vectors representing each person in the training set of the face database. One algorithm uses the within-class scatter matrix of the samples in the training set while the other uses the subspace methods and the Gram-Schmidt orthogonalization procedure to obtain the discriminative common vectors. Then, the discriminative common vectors are used for classification of new faces. The proposed method yields an optimal solution for maximizing the modified Fisher's linear discriminant criterion given in the paper. Our test results show that the discriminative common vector method is superior to other methods in terms of recognition accuracy, efficiency, and numerical stability.
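
A numerical sketch of the null-space idea, with assumed toy data in the small-sample-size regime: projecting onto the null space of the within-class scatter collapses every sample of a class onto a single common vector, which can then be used for classification.

```python
import numpy as np

rng = np.random.RandomState(0)
d, per_class = 50, 5                      # dimension >> samples (SSS case)
classes = [rng.randn(per_class, d) + mu for mu in (0.0, 3.0, -3.0)]

# Within-class scatter is built from class-centered samples; an orthonormal
# basis of its null space comes from the SVD of those centered samples.
centered = np.vstack([c - c.mean(axis=0) for c in classes])
_, s, Vt = np.linalg.svd(centered, full_matrices=True)
rank = int((s > 1e-10).sum())
N = Vt[rank:].T                           # d x (d - rank) null-space basis

for c in classes:
    proj = c @ N                          # every sample of the class maps to
    print(np.ptp(proj, axis=0).max())     # (numerically) the same vector
```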

Proceedings ArticleDOI
06 Jul 2005
TL;DR: This paper discusses the most important stages of a fully implemented emotion recognition system, including data analysis and classification, using a music induction method that elicits natural emotional reactions from the subject.
Abstract: Little attention has been paid so far to physiological signals for emotion recognition compared to audio-visual emotion channels, such as facial expressions or speech. In this paper, we discuss the most important stages of a fully implemented emotion recognition system, including data analysis and classification. For collecting physiological signals in different affective states, we used a music induction method which elicits natural emotional reactions from the subject. Four-channel biosensors are used to obtain electromyogram, electrocardiogram, skin conductivity and respiration changes. After calculating a sufficient amount of features from the raw signals, several feature selection/reduction methods are tested to extract a new feature set consisting of the most significant features for improving classification performance. Three well-known classifiers (linear discriminant function, k-nearest neighbour and multilayer perceptron) are then used to perform supervised classification.

01 Jan 2005
TL;DR: A probabilistic interpretation of canonical correlation analysis (CCA) as a latent variable model for two Gaussian random vectors, with Fisher linear discriminant analysis cast within the CCA framework.
Abstract: We give a probabilistic interpretation of canonical correlation analysis (CCA) as a latent variable model for two Gaussian random vectors. Our interpretation is similar to the probabilistic interpretation of principal component analysis (Tipping and Bishop, 1999; Roweis, 1998). In addition, we cast Fisher linear discriminant analysis (LDA) within the CCA framework.
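
For reference, a sketch of the latent variable model the abstract refers to, in the standard probabilistic CCA formulation (the symbols W_i, Psi_i, mu_i are conventional notation assumed here, not quoted from the paper):

```latex
% Latent variable model for probabilistic CCA (standard formulation).
\begin{align*}
  z &\sim \mathcal{N}(0, I_d), \qquad 1 \le d \le \min(d_1, d_2),\\
  x_1 \mid z &\sim \mathcal{N}(W_1 z + \mu_1, \Psi_1),
      \qquad W_1 \in \mathbb{R}^{d_1 \times d},\ \Psi_1 \succeq 0,\\
  x_2 \mid z &\sim \mathcal{N}(W_2 z + \mu_2, \Psi_2),
      \qquad W_2 \in \mathbb{R}^{d_2 \times d},\ \Psi_2 \succeq 0.
\end{align*}
```

In this formulation, the maximum likelihood estimates of W_1 and W_2 span the same subspaces as the first d canonical directions, which is what makes the model a probabilistic interpretation of CCA.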

Journal ArticleDOI
TL;DR: This paper compares SVM to canonical variates analysis (CVA) by examining the relative sensitivity of each method to ten combinations of preprocessing choices consisting of spatial smoothing, temporal detrending, and motion correction, and proposes four methods for extracting activation maps from SVM models.

Proceedings ArticleDOI
20 Jun 2005
TL;DR: A new supervised algorithm, Marginal Fisher Analysis (MFA), is proposed, for dimensionality reduction by designing two graphs that characterize the intra-class compactness and inter-class separability, respectively.
Abstract: In recent decades, a large family of algorithms, supervised or unsupervised, stemming from statistics or geometry, has been proposed to provide different solutions to the problem of dimensionality reduction. In this paper, beyond the different motivations of these algorithms, we propose a general framework, graph embedding along with its linearization and kernelization, which in theory reveals the underlying objective shared by most previous algorithms. It presents a unified perspective for understanding these algorithms; that is, each algorithm can be considered as the direct graph embedding or its linear/kernel extension of some specific graph characterizing certain statistical or geometric properties of a data set. Furthermore, this framework is a general platform for developing new algorithms for dimensionality reduction. To this end, we propose a new supervised algorithm, Marginal Fisher Analysis (MFA), for dimensionality reduction by designing two graphs that characterize the intra-class compactness and inter-class separability, respectively. MFA measures the intra-class compactness with the distance between each data point and its neighboring points of the same class, and measures the inter-class separability with the class margins; thus it overcomes the limitations of the traditional Linear Discriminant Analysis algorithm in terms of data distribution assumptions and available projection directions. The toy problem on artificial data and the real face recognition experiments both show the superiority of our proposed MFA in comparison to LDA.
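
A hedged sketch of MFA under assumed toy data: an intrinsic graph connects each point to its k1 same-class neighbors, a penalty graph connects it to its k2 closest other-class points, and the projection minimizes intra-class compactness against inter-class separability via a generalized eigenproblem. Illustrative, not the paper's code.

```python
import numpy as np
from scipy.linalg import eigh

def mfa(X, y, k1=3, k2=5, n_components=2):
    """Marginal Fisher Analysis: returns a d x n_components projection."""
    n = len(X)
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    Wi = np.zeros((n, n))   # intrinsic graph: same-class neighbors
    Wp = np.zeros((n, n))   # penalty graph: closest other-class pairs
    for i in range(n):
        same = np.where(y == y[i])[0]
        same = same[same != i]
        diff = np.where(y != y[i])[0]
        Wi[i, same[np.argsort(sq[i, same])[:k1]]] = 1
        Wp[i, diff[np.argsort(sq[i, diff])[:k2]]] = 1
    Wi = np.maximum(Wi, Wi.T)
    Wp = np.maximum(Wp, Wp.T)
    Li = np.diag(Wi.sum(1)) - Wi    # graph Laplacians
    Lp = np.diag(Wp.sum(1)) - Wp
    # Minimize compactness over separability: the smallest generalized
    # eigenvectors of (X^T Li X) v = lambda (X^T Lp X) v.
    A = X.T @ Li @ X + 1e-6 * np.eye(X.shape[1])
    B = X.T @ Lp @ X + 1e-6 * np.eye(X.shape[1])
    vals, vecs = eigh(A, B)
    return vecs[:, :n_components]

rng = np.random.RandomState(0)
X = np.vstack([rng.randn(30, 6), rng.randn(30, 6) + 1.5])
y = np.repeat([0, 1], 30)
print(mfa(X, y).shape)   # (6, 2) projection matrix
```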

Proceedings Article
05 Dec 2005
TL;DR: A new algorithm called Tensor Subspace Analysis (TSA) is proposed that detects the intrinsic local geometrical structure of the tensor space by learning a lower dimensional tensor subspace and achieves better recognition rate, while being much more efficient.
Abstract: Previous work has demonstrated that the image variations of many objects (human faces in particular) under variable lighting can be effectively modeled by low dimensional linear spaces. The typical linear subspace learning algorithms include Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), and Locality Preserving Projection (LPP). All of these methods consider an n1 × n2 image as a high-dimensional vector in ℝ^(n1 × n2), while an image represented in the plane is intrinsically a matrix. In this paper, we propose a new algorithm called Tensor Subspace Analysis (TSA). TSA considers an image as a second-order tensor in ℝ^n1 ⊗ ℝ^n2, where ℝ^n1 and ℝ^n2 are two vector spaces. The relationship between the column vectors of the image matrix and that between the row vectors can be naturally characterized by TSA. TSA detects the intrinsic local geometrical structure of the tensor space by learning a lower dimensional tensor subspace. We compare our proposed approach with PCA, LDA and LPP methods on two standard databases. Experimental results demonstrate that TSA achieves a better recognition rate, while being much more efficient.

Proceedings ArticleDOI
27 Nov 2005
TL;DR: This paper generalizes MPM to its STL version, which is named the tensor MPM (TMPM), and develops a method for tensor-based feature extraction, named the tensor rank-one discriminant analysis (TR1DA).
Abstract: This paper aims to take general tensors as inputs for supervised learning. A supervised tensor learning (STL) framework is established for convex optimization based learning techniques such as support vector machines (SVM) and minimax probability machines (MPM). Within the STL framework, many conventional learning machines can be generalized to take nth-order tensors as inputs. We also study the applications of tensors to learning machine design and feature extraction by linear discriminant analysis (LDA). Our method for tensor-based feature extraction is named the tensor rank-one discriminant analysis (TR1DA). These generalized algorithms have several advantages: 1) they reduce the curse-of-dimensionality problem in machine learning and data mining; 2) they avoid the failure to converge; and 3) they achieve better separation between the different categories of samples. As an example, we generalize MPM to its STL version, which is named the tensor MPM (TMPM). TMPM learns a series of tensor projections iteratively. It is then evaluated against the original MPM. Our experiments on a binary classification problem show that TMPM significantly outperforms the original MPM.

Journal ArticleDOI
TL;DR: The results of the multi-channel analysis indicate SVM as the most successful classifier, whereas kNN performed worst, while the single-channel results gave rise to topographic maps that revealed the channels with the highest level of separability between classes for each subject.
Abstract: To determine and compare the performance of different classifiers applied to four-class EEG data is the goal of this communication. The EEG data were recorded with 60 electrodes from five subjects performing four different motor-imagery tasks. The EEG signal was modeled by an adaptive autoregressive (AAR) process whose parameters were extracted by Kalman filtering. From these AAR parameters, four classifiers were obtained: minimum distance analysis (MDA) for single-channel analysis, and linear discriminant analysis (LDA), k-nearest-neighbor (kNN) and support vector machine (SVM) classifiers for multi-channel analysis. The performance of all four classifiers was quantified and evaluated by Cohen's kappa coefficient, an advantageous measure we introduce here to BCI research for the first time. The single-channel results gave rise to topographic maps that revealed the channels with the highest level of separability between classes for each subject. Our results of the multi-channel analysis indicate SVM as the most successful classifier, whereas kNN performed worst.
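
A quick sketch of the evaluation measure highlighted above: Cohen's kappa corrects raw accuracy for the agreement expected by chance, which matters for four-class problems. The labels below are invented for illustration, and scikit-learn's implementation is used as a cross-check.

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score, confusion_matrix

y_true = np.array([0, 0, 1, 1, 2, 2, 3, 3, 0, 1, 2, 3])
y_pred = np.array([0, 0, 1, 2, 2, 2, 3, 1, 0, 1, 2, 3])

C = confusion_matrix(y_true, y_pred)
po = np.trace(C) / C.sum()                 # observed agreement (accuracy)
pe = (C.sum(0) @ C.sum(1)) / C.sum() ** 2  # agreement expected by chance
kappa = (po - pe) / (1 - pe)
print(kappa, cohen_kappa_score(y_true, y_pred))  # the two values agree
```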

Journal ArticleDOI
TL;DR: The proposed hybrid approach outperforms the results using discriminant analysis, logistic regression, artificial neural networks and MARS and hence provides an alternative in handling credit scoring tasks.
Abstract: The objective of this study is to explore the performance of credit scoring using a two-stage hybrid modeling procedure with artificial neural networks and multivariate adaptive regression splines (MARS). The rationale underlying the analysis is to first use MARS to build the credit scoring model; the significant variables obtained are then used as the input nodes of the neural network model. To demonstrate the effectiveness and feasibility of the proposed modeling procedure, credit scoring tasks are performed on one bank housing loan dataset using a cross-validation approach. As the results reveal, the proposed hybrid approach outperforms discriminant analysis, logistic regression, artificial neural networks and MARS, and hence provides an alternative for handling credit scoring tasks.

Book ChapterDOI
TL;DR: This chapter discusses tree-based classification and regression, as well as bagging and boosting, introducing the methods and describing how they work.
Abstract: This chapter discusses tree-based classification and regression, as well as bagging and boosting. It introduces the methods in general terms and describes how they work. Tree-structured classification and regression are alternative approaches to classification and regression that are not based on assumptions of normality and user-specified model statements, as are some older methods such as discriminant analysis and ordinary least squares regression. Tree-structured classification and regression are nonparametric, computationally intensive methods that have greatly increased in popularity during the past several years. They can be applied to data sets having both a large number of cases and a large number of variables, and they are extremely resistant to outliers. Bagging and boosting are general techniques for improving prediction rules. They can be applied to tree-based methods to increase the accuracy of the resulting predictions, although it should be emphasized that they can be used with methods other than tree-based methods, such as neural networks.

Journal Article
TL;DR: A generalized discriminant analysis based on a new optimization criterion that extends the optimization criteria of the classical Linear Discriminant Analysis (LDA) when the scatter matrices are singular is presented.
Abstract: A generalized discriminant analysis based on a new optimization criterion is presented. The criterion extends the optimization criteria of the classical Linear Discriminant Analysis (LDA) when the scatter matrices are singular. An efficient algorithm for the new optimization problem is presented. The solutions to the proposed criterion form a family of algorithms for generalized LDA, which can be characterized in a closed form. We study two specific algorithms, namely Uncorrelated LDA (ULDA) and Orthogonal LDA (OLDA). ULDA was previously proposed for feature extraction and dimension reduction, whereas OLDA is a novel algorithm proposed in this paper. The features in the reduced space of ULDA are uncorrelated, while the discriminant vectors of OLDA are orthogonal to each other. We have conducted a comparative study on a variety of real-world data sets to evaluate ULDA and OLDA in terms of classification accuracy.

Journal ArticleDOI
01 Oct 2005
TL;DR: The results show that the proposed ILDA can effectively evolve a discriminant eigenspace over a fast and large data stream, and extract features with superior discriminability in classification, when compared with other methods.
Abstract: This paper presents a constructive method for deriving an updated discriminant eigenspace for classification when bursts of data that contain new classes are added to an initial discriminant eigenspace in the form of random chunks. Specifically, we propose an incremental linear discriminant analysis (ILDA) in two forms: a sequential ILDA and a chunk ILDA. In experiments, we have tested ILDA using datasets with a small number of classes and low-dimensional features, as well as datasets with a large number of classes and high-dimensional features. We have compared the proposed ILDA against the traditional batch LDA in terms of discriminability, execution time and memory usage as the volume of added data increases. The results show that the proposed ILDA can effectively evolve a discriminant eigenspace over a fast and large data stream, and extract features with superior discriminability in classification, when compared with other methods.
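
A hedged sketch of the chunk-incremental idea (not the paper's ILDA algorithm): maintain per-class counts, means, and a pooled within-class scatter, update them as each chunk arrives, and re-solve the LDA eigenproblem on the updated statistics. All names and parameters are illustrative.

```python
import numpy as np
from scipy.linalg import eigh

class IncrementalLDA:
    def __init__(self, d):
        self.d = d
        self.n = {}                          # per-class sample counts
        self.mean = {}                       # per-class means
        self.Sw = np.zeros((d, d))           # pooled within-class scatter

    def partial_fit(self, X, y):
        for c in np.unique(y):
            Xc = X[y == c]
            if c not in self.n:
                self.n[c], self.mean[c] = 0, np.zeros(self.d)
            n0, m0 = self.n[c], self.mean[c]
            n1 = n0 + len(Xc)
            mc = Xc.mean(0)
            # Scatter update: the chunk's scatter about its own mean plus a
            # mean-shift correction term for merging the two groups.
            self.Sw += ((Xc - mc).T @ (Xc - mc)
                        + n0 * len(Xc) / n1 * np.outer(m0 - mc, m0 - mc))
            self.n[c], self.mean[c] = n1, (n0 * m0 + Xc.sum(0)) / n1

    def directions(self, k=1):
        N = sum(self.n.values())
        gm = sum(self.n[c] * self.mean[c] for c in self.n) / N
        Sb = sum(self.n[c] * np.outer(self.mean[c] - gm, self.mean[c] - gm)
                 for c in self.n)
        # Largest generalized eigenvectors of Sb v = lambda Sw v.
        vals, vecs = eigh(Sb, self.Sw + 1e-6 * np.eye(self.d))
        return vecs[:, ::-1][:, :k]

rng = np.random.RandomState(0)
ilda = IncrementalLDA(d=4)
for _ in range(5):                           # chunks arriving over time
    X = np.vstack([rng.randn(20, 4), rng.randn(20, 4) + 2.0])
    ilda.partial_fit(X, np.repeat([0, 1], 20))
print(ilda.directions().ravel())
```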