Showing papers on "Linear discriminant analysis published in 2011"


Journal ArticleDOI
TL;DR: An extension of previous work on a new speaker representation for speaker verification, in which a low-dimensional speaker- and channel-dependent space is defined using a simple factor analysis and named the total variability space because it models both speaker and channel variabilities.
Abstract: This paper presents an extension of our previous work which proposes a new speaker representation for speaker verification. In this modeling, a new low-dimensional speaker- and channel-dependent space is defined using a simple factor analysis. This space is named the total variability space because it models both speaker and channel variabilities. Two speaker verification systems are proposed which use this new representation. The first system is a support vector machine-based system that uses the cosine kernel to estimate the similarity between the input data. The second system directly uses the cosine similarity as the final decision score. We tested three channel compensation techniques in the total variability space, which are within-class covariance normalization (WCCN), linear discriminant analysis (LDA), and nuisance attribute projection (NAP). We found that the best results are obtained when LDA is followed by WCCN. We achieved an equal error rate (EER) of 1.12% and MinDCF of 0.0094 using the cosine distance scoring on the male English trials of the core condition of the NIST 2008 Speaker Recognition Evaluation dataset. We also obtained 4% absolute EER improvement for both-gender trials on the 10 s-10 s condition compared to the classical joint factor analysis scoring.
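As a concrete illustration of the scoring step described above, here is a minimal numpy sketch of cosine scoring after LDA projection and WCCN whitening. The projection matrix `A` is assumed to come from a separately trained LDA, and `wccn` estimates the whitening transform from already-projected training i-vectors; names and shapes are illustrative, not the paper's code.

```python
import numpy as np

def wccn(ivectors, labels):
    """Within-class covariance normalization: return B with B @ B.T = inv(W),
    where W is the average within-class covariance of the training i-vectors."""
    classes = np.unique(labels)
    W = np.mean([np.cov(ivectors[labels == c].T, bias=True) for c in classes], axis=0)
    return np.linalg.cholesky(np.linalg.inv(W))

def cosine_score(w_enroll, w_test, A, B):
    """Cosine similarity, used directly as the verification score,
    after LDA projection A and WCCN whitening B (both pre-trained)."""
    t, s = B.T @ (A @ w_enroll), B.T @ (A @ w_test)
    return float(t @ s / (np.linalg.norm(t) * np.linalg.norm(s)))
```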

3,526 citations


Journal ArticleDOI
TL;DR: This tutorial proposes to use shrinkage estimators and shows that appropriate regularization of linear discriminant analysis (LDA) by shrinkage yields excellent results for single-trial ERP classification that are far superior to classical LDA classification.
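Shrinkage-regularized LDA of this kind is readily available; for instance, scikit-learn's LDA supports automatic Ledoit-Wolf shrinkage of the covariance estimate. A minimal sketch, with random stand-ins for single-trial ERP features:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Random stand-in for single-trial ERP features with p close to n,
# the regime where plain LDA's covariance estimate degrades.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 80))
y = rng.integers(0, 2, size=100)

# shrinkage='auto' selects the Ledoit-Wolf shrinkage intensity analytically;
# it requires the 'lsqr' or 'eigen' solver.
clf = LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto").fit(X, y)
print(clf.score(X, y))
```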

1,046 citations


Journal ArticleDOI
TL;DR: A simple extension of a sparse PLS exploratory approach is proposed to perform variable selection in a multiclass classification framework and has a classification performance similar to other wrapper or sparse discriminant analysis approaches on public microarray and SNP data sets.
Abstract: Variable selection on high-throughput biological data, such as gene expression or single nucleotide polymorphisms (SNPs), is essential to select relevant information and, therefore, to better characterize diseases or assess genetic structure. There are different ways to perform variable selection in large data sets. Statistical tests are commonly used to identify differentially expressed features for explanatory purposes, whereas machine learning wrapper approaches can be used for predictive purposes. In the case of multiple highly correlated variables, another option is to use multivariate exploratory approaches to give more insight into cell biology, biological pathways or complex traits. A simple extension of a sparse PLS exploratory approach is proposed to perform variable selection in a multiclass classification framework. sPLS-DA has a classification performance similar to other wrapper or sparse discriminant analysis approaches on public microarray and SNP data sets. More importantly, sPLS-DA is clearly competitive in terms of computational efficiency and superior in terms of interpretability of the results via valuable graphical outputs. sPLS-DA is available in the R package mixOmics, which is dedicated to the analysis of large biological data sets.
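The sparse variant (sPLS-DA) itself lives in the R package mixOmics; the Python sketch below shows only the plain PLS-DA backbone it extends: regress a class-indicator matrix on X by PLS, then assign each sample to the class with the largest predicted indicator. The data and component count are illustrative.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 500))      # stand-in expression matrix, p >> n
y = rng.integers(0, 3, size=60)     # three classes

Y = np.eye(3)[y]                    # one-hot class indicator matrix
pls = PLSRegression(n_components=2).fit(X, Y)
pred = pls.predict(X).argmax(axis=1)   # PLS-DA decision rule
print((pred == y).mean())
```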

672 citations


Journal ArticleDOI
TL;DR: This work proposes sparse discriminant analysis, a method for performing linear discriminant analysis with a sparseness criterion imposed such that classification and feature selection are performed simultaneously in the high-dimensional setting.
Abstract: We consider the problem of performing interpretable classification in the high-dimensional setting, in which the number of features is very large and the number of observations is limited. This setting has been studied extensively in the chemometrics literature, and more recently has become commonplace in biological and medical applications. In this setting, a traditional approach involves performing feature selection before classification. We propose sparse discriminant analysis, a method for performing linear discriminant analysis with a sparseness criterion imposed such that classification and feature selection are performed simultaneously. Sparse discriminant analysis is based on the optimal scoring interpretation of linear discriminant analysis, and can be extended to perform sparse discrimination via mixtures of Gaussians if boundaries between classes are nonlinear or if subgroups are present within each class. Our proposal also provides low-dimensional views of the discriminative directions.
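For orientation, the penalized optimal scoring problem behind sparse discriminant analysis can be written schematically as below (my notation, not copied from the paper): Y is the n x K class-indicator matrix, X the n x p data matrix, Omega a positive-definite penalty matrix, and gamma, lambda tuning parameters.

```latex
% k-th sparse discriminant direction via penalized optimal scoring (schematic)
\min_{\beta_k,\ \theta_k}\ \|Y\theta_k - X\beta_k\|_2^2
   + \gamma\,\beta_k^{\top}\Omega\beta_k + \lambda\,\|\beta_k\|_1
\qquad \text{s.t.}\quad \tfrac{1}{n}\,\theta_k^{\top}Y^{\top}Y\theta_k = 1,
\quad \theta_k^{\top}Y^{\top}Y\theta_l = 0 \ \ (l<k).
```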

565 citations


Journal ArticleDOI
TL;DR: A set of building blocks is described for constructing descriptors which can be combined together and jointly optimized so as to minimize the error of a nearest-neighbor classifier.
Abstract: In this paper, we explore methods for learning local image descriptors from training data. We describe a set of building blocks for constructing descriptors which can be combined together and jointly optimized so as to minimize the error of a nearest-neighbor classifier. We consider both linear and nonlinear transforms with dimensionality reduction, and make use of discriminant learning techniques such as Linear Discriminant Analysis (LDA) and Powell minimization to solve for the parameters. Using these techniques, we obtain descriptors that exceed state-of-the-art performance with low dimensionality. In addition to new experiments and recommendations for descriptor learning, we are also making available a new and realistic ground truth data set based on multiview stereo data.

520 citations


Journal ArticleDOI
TL;DR: Canonical variate analysis (CVA), the generalization of LDA for multiple groups, is often used in the exploratory style of an ordination technique (a low-dimensional representation of the data); between-group PCA can be a simple alternative that does not require full-rank covariance matrices.
Abstract: Linear discriminant analysis (LDA) is a multivariate classification technique frequently applied to morphometric data in various biomedical disciplines. Canonical variate analysis (CVA), the generalization of LDA for multiple groups, is often used in the exploratory style of an ordination technique (a low-dimensional representation of the data). In the rare case when all groups have the same covariance matrix, maximum likelihood classification can be based on these linear functions. Both LDA and CVA require full-rank covariance matrices, which is usually not the case in modern morphometrics. When the number of variables is close to the number of individuals, groups appear separated in a CVA plot even if they are samples from the same population. Hence, reliable classification and assessment of group separation require many more organisms than variables. A simple alternative to CVA is the projection of the data onto the principal components of the group averages (between-group PCA). In contrast to CVA, these axes are orthogonal and can be computed even when the data are not of full rank, such as for Procrustes shape coordinates arising in samples of any size, and when covariance matrices are heterogeneous. In evolutionary quantitative genetics, the selection gradient is identical to the coefficient vector of a linear discriminant function between the populations before vs. after selection. When the measured variables are Procrustes shape coordinates, discriminant functions and selection gradients are vectors in shape space and can be visualized as shape deformations. Except for applications in quantitative genetics and in classification, however, discriminant functions typically offer no interpretation as biological factors.
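A minimal numpy sketch of the between-group PCA alternative described above: compute the principal components of the centered group mean configuration and project the observations onto them. Weighting the means by group size is a common variant; function and variable names are mine.

```python
import numpy as np

def between_group_pca(X, labels, n_axes=2):
    """Project data onto principal components of the group means
    (between-group PCA), usable even when covariance matrices are
    rank-deficient or heterogeneous."""
    grand = X.mean(axis=0)
    means = np.stack([X[labels == g].mean(axis=0) for g in np.unique(labels)])
    _, _, Vt = np.linalg.svd(means - grand, full_matrices=False)
    return (X - grand) @ Vt[:n_axes].T
```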

419 citations


Journal ArticleDOI
TL;DR: This work proposes penalized LDA, which is a general approach for penalizing the discriminant vectors in Fisher's discriminant problem in a way that leads to greater interpretability, and uses a minorization-maximization approach to optimize it efficiently when convex penalties are applied to the discriminant vectors.
Abstract: We consider the supervised classification setting, in which the data consist of p features measured on n observations, each of which belongs to one of K classes. Linear discriminant analysis (LDA) is a classical method for this problem. However, in the high-dimensional setting where p ≫ n, LDA is not appropriate for two reasons. First, the standard estimate for the within-class covariance matrix is singular, and so the usual discriminant rule cannot be applied. Second, when p is large, it is difficult to interpret the classification rule obtained from LDA, since it involves all p features. We propose penalized LDA, a general approach for penalizing the discriminant vectors in Fisher's discriminant problem in a way that leads to greater interpretability. The discriminant problem is not convex, so we use a minorization-maximization approach in order to optimize it efficiently when convex penalties are applied to the discriminant vectors. In particular, we consider the use of L1 and fused lasso penalties. Our proposal is equivalent to recasting Fisher's discriminant problem as a biconvex problem. We evaluate the performance of the resulting methods in a simulation study and on three gene expression data sets. We also survey past methods for extending LDA to the high-dimensional setting, and explore their relationships with our proposal.
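Schematically (again in my notation, not the paper's), each penalized discriminant vector solves a problem of the following form, where Sigma_b-hat is the between-class covariance estimate, Sigma_w-tilde a positive-definite (e.g., diagonal) within-class estimate, and P the L1 or fused lasso penalty; the objective is nonconcave, hence the minorization-maximization scheme.

```latex
\max_{\beta}\ \ \beta^{\top}\hat{\Sigma}_b\,\beta \;-\; \lambda\,P(\beta)
\qquad \text{s.t.}\quad \beta^{\top}\tilde{\Sigma}_w\,\beta \le 1 .
```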

405 citations


Journal ArticleDOI
TL;DR: The influence of the algorithm used to enforce independence and of the number of ICs retained for the classification of hyperspectral images is studied, and an effective method to estimate the most suitable number is proposed.
Abstract: In this paper, the use of Independent Component (IC) Discriminant Analysis (ICDA) for remote sensing classification is proposed. ICDA is a nonparametric method for discriminant analysis based on the application of a Bayesian classification rule on a signal composed of ICs. The method uses IC Analysis (ICA) to choose a transform matrix so that the transformed components are as independent as possible. When the data are projected in an independent space, the estimates of their multivariate density function can be computed in a much easier way as the product of univariate densities. A nonparametric kernel density estimator is used to compute the density function of each IC. Finally, the Bayes rule is applied for the classification assignment. In this paper, we investigate the possibility of using ICDA for the classification of hyperspectral images. We study the influence of the algorithm used to enforce independence and of the number of ICs retained for the classification, proposing an effective method to estimate the most suitable number. The proposed method is applied to several hyperspectral images, in order to test different data set conditions (urban/agricultural area, size of the training set, and type of sensor). The obtained results are compared with one of the most commonly used classifiers for hyperspectral images (support vector machines) and show the comparative effectiveness of the proposed method in terms of accuracy.
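The core of ICDA is easy to sketch: unmix with ICA so the joint class-conditional density approximately factorizes, fit a univariate kernel density estimate per component and class, and classify with Bayes' rule. A minimal sketch follows, assuming scikit-learn and SciPy; the fixed number of components is illustrative (the paper estimates it).

```python
import numpy as np
from scipy.stats import gaussian_kde
from sklearn.decomposition import FastICA

class ICDASketch:
    """Minimal ICDA: ICA unmixing + per-class univariate KDEs + Bayes rule."""

    def __init__(self, n_components=10):   # number of ICs; the paper estimates it
        self.n_components = n_components

    def fit(self, X, y):
        self.ica = FastICA(n_components=self.n_components, random_state=0).fit(X)
        S = self.ica.transform(X)
        self.classes = np.unique(y)
        self.log_priors = np.log([np.mean(y == c) for c in self.classes])
        self.kdes = [[gaussian_kde(S[y == c][:, j]) for j in range(S.shape[1])]
                     for c in self.classes]
        return self

    def predict(self, X):
        S = self.ica.transform(X)
        # Independence: the multivariate density is a product of univariate ones.
        logp = np.stack([lp + sum(np.log(k(S[:, j]) + 1e-300)
                                  for j, k in enumerate(kde_c))
                         for lp, kde_c in zip(self.log_priors, self.kdes)])
        return self.classes[np.argmax(logp, axis=0)]
```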

342 citations


Journal ArticleDOI
TL;DR: When taking into account sensitivity, specificity and overall classification accuracy, Random Forests and Linear Discriminant Analysis rank first among all the classifiers tested in the prediction of dementia using several neuropsychological tests.
Abstract: Dementia and cognitive impairment associated with aging are a major medical and social concern. Neuropsychological testing is a key element in the diagnostic procedures of Mild Cognitive Impairment (MCI), but has presently a limited value in the prediction of progression to dementia. We advance the hypothesis that newer statistical classification methods derived from data mining and machine learning, like Neural Networks, Support Vector Machines and Random Forests, can improve the accuracy, sensitivity and specificity of predictions obtained from neuropsychological testing. Seven nonparametric classifiers derived from data mining methods (Multilayer Perceptron Neural Networks, Radial Basis Function Neural Networks, Support Vector Machines, CART, CHAID and QUEST Classification Trees, and Random Forests) were compared to three traditional classifiers (Linear Discriminant Analysis, Quadratic Discriminant Analysis and Logistic Regression) in terms of overall classification accuracy, specificity, sensitivity, area under the ROC curve and Press' Q. Model predictors were 10 neuropsychological tests currently used in the diagnosis of dementia. Statistical distributions of classification parameters obtained from a 5-fold cross-validation were compared using Friedman's nonparametric test. Press' Q test showed that all classifiers performed better than chance alone (p < 0.05). Support Vector Machines showed the largest overall classification accuracy (median (Me) = 0.76) and area under the ROC curve (Me = 0.90); however, this method showed high specificity (Me = 1.0) but low sensitivity (Me = 0.3). Random Forests ranked second in overall accuracy (Me = 0.73), with high area under the ROC curve (Me = 0.73), specificity (Me = 0.73) and sensitivity (Me = 0.64). Linear Discriminant Analysis also showed acceptable overall accuracy (Me = 0.66), with acceptable area under the ROC curve (Me = 0.72), specificity (Me = 0.66) and sensitivity (Me = 0.64). The remaining classifiers showed overall classification accuracy above a median value of 0.63, but for most, sensitivity was around or even lower than a median value of 0.5. When taking into account sensitivity, specificity and overall classification accuracy, Random Forests and Linear Discriminant Analysis rank first among all the classifiers tested in the prediction of dementia using several neuropsychological tests. These methods may be used to improve the accuracy, sensitivity and specificity of dementia predictions from neuropsychological testing.
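A skeleton of this kind of comparison in scikit-learn, with random stand-ins for the 10 neuropsychological test scores (the study's data are not reproduced here):

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))      # stand-in for 10 test scores
y = rng.integers(0, 2, size=200)    # progression to dementia: yes/no

for name, clf in [("LDA", LinearDiscriminantAnalysis()),
                  ("Random Forest", RandomForestClassifier(random_state=0))]:
    acc = cross_val_score(clf, X, y, cv=5)          # 5-fold CV, as in the study
    print(f"{name}: median accuracy {np.median(acc):.2f}")
```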

331 citations


Journal ArticleDOI
TL;DR: A new human face recognition algorithm is proposed based on bidirectional two-dimensional principal component analysis (B2DPCA) and extreme learning machine (ELM), in which the subband exhibiting the maximum standard deviation is dimensionally reduced using an improved dimensionality reduction technique.

308 citations


Proceedings ArticleDOI
20 Jun 2011
TL;DR: It is shown that by introducing within-class and between-class similarity graphs to characterise intra-class compactness and inter-class separability, the geometrical structure of data can be exploited.
Abstract: A convenient way of dealing with image sets is to represent them as points on Grassmannian manifolds. While several recent studies explored the applicability of discriminant analysis on such manifolds, the conventional formalism of discriminant analysis suffers from not considering the local structure of the data. We propose a discriminant analysis approach on Grassmannian manifolds, based on a graph-embedding framework. We show that by introducing within-class and between-class similarity graphs to characterise intra-class compactness and inter-class separability, the geometrical structure of data can be exploited. Experiments on several image datasets (PIE, BANCA, MoBo, ETH-80) show that the proposed algorithm obtains considerable improvements in discrimination accuracy, in comparison to three recent methods: Grassmann Discriminant Analysis (GDA), Kernel GDA, and the kernel version of Affine Hull Image Set Distance. We further propose a Grassmannian kernel, based on canonical correlation between subspaces, which can increase discrimination accuracy when used in combination with previous Grassmannian kernels.
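The subspace similarity underlying such Grassmannian kernels is compact to compute: the singular values of Y1ᵀY2 are the cosines of the principal angles between the two subspaces, i.e. their canonical correlations. The sketch below shows one common choice, the projection kernel; the paper's canonical-correlation kernel is built from the same quantities.

```python
import numpy as np

def projection_kernel(Y1, Y2):
    """Similarity of two subspaces given by orthonormal d x m basis matrices.
    Singular values of Y1.T @ Y2 = cosines of the principal angles."""
    cosines = np.linalg.svd(Y1.T @ Y2, compute_uv=False)
    return float(np.sum(cosines ** 2))   # equals ||Y1.T @ Y2||_F^2
```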

Journal ArticleDOI
TL;DR: There was a statistically significant difference between the k-NN and LDA algorithms for the classification of wrist-motion directions such as up, down, right, left, and the rest state.

Journal ArticleDOI
TL;DR: A simple unsupervised adaptation method of the linear discriminant analysis (LDA) classifier is suggested that effectively solves this problem by counteracting the harmful effect of nonclass-related nonstationarities in electroencephalography (EEG) during BCI sessions performed with motor imagery tasks.
Abstract: Brain-computer interface (BCI) users experience a step of significant difficulty when going from the calibration recording to the feedback application. This effect has been previously studied and a supervised adaptation solution has been proposed. In this paper, we suggest a simple unsupervised adaptation method for the linear discriminant analysis (LDA) classifier that effectively solves this problem by counteracting the harmful effect of nonclass-related nonstationarities in electroencephalography (EEG) during BCI sessions performed with motor imagery tasks. For this, we first introduce three types of adaptation procedures and investigate them in an offline study with 19 datasets. Then, we select one of the proposed methods and analyze it further. The chosen classifier is tested offline on data from 80 healthy users and four high spinal cord injury patients. Finally, for the first time in the BCI literature, we apply this unsupervised classifier in online experiments. Additionally, we show that its performance is significantly better than the state-of-the-art supervised approach.
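One simple unsupervised adaptation of this flavor (a sketch consistent with the description above, not necessarily the paper's exact update rule) keeps the LDA weight vector fixed and tracks only the pooled, class-unspecific feature mean, so the decision threshold follows nonclass-related drifts without needing labels:

```python
import numpy as np

class AdaptiveLDABias:
    """Sketch: keep the LDA weight vector fixed and track the pooled
    (class-unspecific) feature mean with an exponential moving average,
    so the bias follows nonclass-related drifts in the EEG features
    without using labels. The update constant eta is illustrative."""

    def __init__(self, w, mu0, eta=0.05):
        self.w = np.asarray(w, float)
        self.mu = np.asarray(mu0, float)
        self.eta = eta

    def classify(self, x):
        x = np.asarray(x, float)
        self.mu = (1 - self.eta) * self.mu + self.eta * x   # unsupervised update
        return int(np.sign(self.w @ (x - self.mu)))         # bias b = -w @ mu
```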

Journal ArticleDOI
TL;DR: The results of the current study suggest that nonlinear HRV analysis using short-term ECG recordings could be effective in automatically detecting real-life stress conditions, such as a university examination.
Abstract: This study investigates the variations of Heart Rate Variability (HRV) due to a real-life stressor and proposes a classifier based on nonlinear features of HRV for automatic stress detection. Forty-two students volunteered to participate in the study about HRV and stress. For each student, two recordings were performed: one during an on-going university examination, assumed as a real-life stressor, and one after holidays. Nonlinear analysis of HRV was performed using the Poincaré plot, Approximate Entropy, Correlation Dimension, Detrended Fluctuation Analysis, and Recurrence Plot. For statistical comparison, we adopted the Wilcoxon Signed Rank test, and for development of a classifier we adopted Linear Discriminant Analysis (LDA). Almost all HRV features measuring heart rate complexity were significantly decreased in the stress session. LDA generated a simple classifier based on the two Poincaré plot parameters and Approximate Entropy, which enables stress detection with a total classification accuracy, sensitivity and specificity of 90%, 86%, and 95%, respectively. The results of the current study suggest that nonlinear HRV analysis using short-term ECG recordings could be effective in automatically detecting real-life stress conditions, such as a university examination.

Journal ArticleDOI
TL;DR: This paper proposes a discriminative model to address face matching in the presence of age variation and shows that this approach outperforms a state-of-the-art commercial face recognition engine on two public domain face aging data sets: MORPH and FG-NET.
Abstract: Aging variation poses a serious problem to automatic face recognition systems. Most of the face recognition studies that have addressed the aging problem are focused on age estimation or aging simulation. Designing an appropriate feature representation and an effective matching framework for age invariant face recognition remains an open problem. In this paper, we propose a discriminative model to address face matching in the presence of age variation. In this framework, we first represent each face by designing a densely sampled local feature description scheme, in which scale invariant feature transform (SIFT) and multi-scale local binary patterns (MLBP) serve as the local descriptors. By densely sampling the two kinds of local descriptors from the entire facial image, sufficient discriminatory information, including the distribution of the edge direction in the face image (which is expected to be age invariant), can be extracted for further analysis. Since both SIFT-based local features and MLBP-based local features span a high-dimensional feature space, to avoid the overfitting problem, we develop an algorithm, called multi-feature discriminant analysis (MFDA), to process these two local feature spaces in a unified framework. The MFDA is an extension and improvement of LDA, using multiple features combined with two different random sampling methods in feature and sample space. By randomly sampling the training set as well as the feature space, multiple LDA-based classifiers are constructed and then combined to generate a robust decision via a fusion rule. Experimental results show that our approach outperforms a state-of-the-art commercial face recognition engine on two public domain face aging data sets: MORPH and FG-NET. We also compare the performance of the proposed discriminative model with a generative aging model. A fusion of discriminative and generative models further improves the face matching accuracy in the presence of aging.

Journal ArticleDOI
TL;DR: In this article, the linear programming discriminant (LPD) rule is proposed for sparse linear discriminant analysis of high-dimensional data; the classifier estimates the product of the precision matrix and the mean-difference vector directly and can be implemented efficiently using linear programming.
Abstract: This article considers sparse linear discriminant analysis of high-dimensional data. In contrast to the existing methods which are based on separate estimation of the precision matrix Ω and the difference δ of the mean vectors, we introduce a simple and effective classifier by estimating the product Ωδ directly through constrained l1 minimization. The estimator can be implemented efficiently using linear programming and the resulting classifier is called the linear programming discriminant (LPD) rule. The LPD rule is shown to have desirable theoretical and numerical properties. It exploits the approximate sparsity of Ωδ and as a consequence allows cases where it can still perform well even when Ω and/or δ cannot be estimated consistently. Asymptotic properties of the LPD rule are investigated and consistency and rate of convergence results are given. The LPD classifier has superior finite sample performance and significant computational advantages over the existing methods that require separate estimation...
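Up to notation, the LPD rule takes the following form, with sample covariance Sigma_n-hat, class means x̄1, x̄2, and tuning parameter lambda_n:

```latex
\hat{\beta} \;=\; \arg\min_{\beta \in \mathbb{R}^p} \|\beta\|_1
\quad \text{s.t.} \quad
\big\| \hat{\Sigma}_n \beta - (\bar{x}_1 - \bar{x}_2) \big\|_\infty \le \lambda_n ,
\qquad
\text{classify } x \text{ to class 1 iff }
\Big( x - \tfrac{\bar{x}_1 + \bar{x}_2}{2} \Big)^{\!\top} \hat{\beta} > 0 .
```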

Journal ArticleDOI
TL;DR: In this paper, a sparse linear discriminant analysis (LDA) is proposed that is asymptotically optimal under sparsity conditions, illustrated by classifying human cancer into two classes of leukemia based on a set of 7,129 genes and a training sample of size 72.
Abstract: In many social, economic, biological and medical studies, one objective is to classify a subject into one of several classes based on a set of variables observed from the subject. Because the probability distribution of the variables is usually unknown, the rule of classification is constructed using a training sample. The well-known linear discriminant analysis (LDA) works well for the situation where the number of variables used for classification is much smaller than the training sample size. Because of advances in technology, modern statistical studies often face classification problems with the number of variables much larger than the sample size, and the LDA may perform poorly. We explore when and why the LDA has poor performance and propose a sparse LDA that is asymptotically optimal under some sparsity conditions on the unknown parameters. For illustration of application, we discuss an example of classifying human cancer into two classes of leukemia based on a set of 7,129 genes and a training sample of size 72. A simulation is also conducted to check the performance of the proposed method.

Journal ArticleDOI
01 Feb 2011
TL;DR: Spectral Regression Kernel Discriminant Analysis is presented, which casts discriminant analysis into a regression framework that facilitates both efficient computation and the use of regularization techniques.
Abstract: Linear discriminant analysis (LDA) has been a popular method for dimensionality reduction, which preserves class separability. The projection vectors are commonly obtained by maximizing the between-class covariance and simultaneously minimizing the within-class covariance. LDA can be performed either in the original input space or in the reproducing kernel Hilbert space (RKHS) into which data points are mapped, which leads to kernel discriminant analysis (KDA). When the data are highly nonlinearly distributed, KDA can achieve better performance than LDA. However, computing the projective functions in KDA involves eigen-decomposition of the kernel matrix, which is very expensive when a large number of training samples exist. In this paper, we present a new algorithm for kernel discriminant analysis, called Spectral Regression Kernel Discriminant Analysis (SRKDA). By using spectral graph analysis, SRKDA casts discriminant analysis into a regression framework, which facilitates both efficient computation and the use of regularization techniques. Specifically, SRKDA only needs to solve a set of regularized regression problems, and no eigenvector computation is involved, which is a huge saving in computational cost. The new formulation makes it very easy to develop an incremental version of the algorithm, which can fully utilize the computational results of the existing training samples. Moreover, it is easy to produce sparse projections (Sparse KDA) with an L1-norm regularizer. Extensive experiments on spoken letter, handwritten digit image and face image data demonstrate the effectiveness and efficiency of the proposed algorithm.
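A compressed sketch of the spectral-regression idea for the supervised case: with a class-block similarity graph the embedding responses are centered class indicators, so each projective function comes from a regularized kernel regression rather than an eigen-decomposition. The kernel choice and the value of `delta` below are illustrative.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

def srkda_fit(X, y, delta=0.01):
    """Spectral-regression KDA sketch: with a supervised (class-block) graph,
    the embedding responses are centered class-indicator vectors, so each
    projective function is obtained by solving (K + delta*I) a = t
    instead of an eigen-decomposition of K."""
    K = rbf_kernel(X)
    T = np.stack([(y == c).astype(float) - np.mean(y == c)
                  for c in np.unique(y)], axis=1)
    A = np.linalg.solve(K + delta * np.eye(len(X)), T)   # ridge regression in RKHS
    return A   # column k gives f(x) = sum_i A[i, k] * k(x, x_i)
```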

Journal ArticleDOI
TL;DR: A novel face representation and recognition approach is proposed that explores information jointly in image space, scale and orientation domains by convolving the face image with multiscale and multi-orientation Gabor filters.
Abstract: Information jointly contained in image space, scale and orientation domains can provide rich important clues not seen in any single one of these domains. The position, spatial frequency and orientation selectivity properties are believed to have an important role in visual perception. This paper proposes a novel face representation and recognition approach by exploring information jointly in image space, scale and orientation domains. Specifically, the face image is first decomposed into different scale and orientation responses by convolving it with multiscale and multi-orientation Gabor filters. Second, local binary pattern analysis is used to describe the neighboring relationship not only in image space, but also in the different scale and orientation responses. This way, information from different domains is explored to give a good face representation for recognition. Discriminant classification is then performed based upon weighted histogram intersection or conditional mutual information with linear discriminant analysis techniques. Extensive experimental results on FERET, AR, and FRGC ver 2.0 databases show the significant advantages of the proposed method over the existing ones.
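A compact sketch of such a joint space-scale-orientation descriptor using scikit-image; the filter frequencies, orientations, and LBP settings below are illustrative, not the paper's exact configuration.

```python
import numpy as np
from skimage.feature import local_binary_pattern
from skimage.filters import gabor

def space_scale_orientation_lbp(face, frequencies=(0.1, 0.2),
                                thetas=(0, np.pi / 4, np.pi / 2, 3 * np.pi / 4)):
    """Decompose a face into Gabor magnitude responses over scales
    (frequencies) and orientations, then describe each response with a
    uniform-LBP histogram; the concatenation is the joint descriptor."""
    feats = []
    for f in frequencies:
        for t in thetas:
            real, imag = gabor(face, frequency=f, theta=t)
            mag = np.hypot(real, imag)
            mag8 = (255 * mag / (mag.max() + 1e-12)).astype(np.uint8)
            codes = local_binary_pattern(mag8, P=8, R=1, method="uniform")
            hist, _ = np.histogram(codes, bins=10, range=(0, 10), density=True)
            feats.append(hist)
    return np.concatenate(feats)
```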

Journal ArticleDOI
TL;DR: The paper reviews the state-of-the-art in gender classification, giving special attention to linear techniques and their relations, and shows that Linear Discriminant Analysis on a linearly selected set of features achieves accuracies similar to those of SVMs and boosting.
Abstract: Emerging applications of computer vision and pattern recognition in mobile devices and networked computing require the development of resource-limited algorithms. Linear classification techniques have an important role to play in this context, given their simplicity and low computational requirements. The paper reviews the state-of-the-art in gender classification, giving special attention to linear techniques and their relations. It discusses why linear techniques are not achieving competitive results and shows how to obtain state-of-the-art performance. Our work confirms previous results reporting very close classification accuracies for Support Vector Machines (SVMs) and boosting algorithms on single-database experiments. We show that Linear Discriminant Analysis on a linearly selected set of features also achieves similar accuracies. We perform cross-database experiments and show that single-database experiments were optimistically biased. If enough training data and computational resources are available, SVM gender classifiers are superior to the rest. When computational resources are scarce but there is enough data, boosting or linear approaches are adequate. Finally, if training data and computational resources are very scarce, then the linear approach is the best choice.

Journal ArticleDOI
TL;DR: An improved LDA framework is proposed, the local LDA (LLDA), which can perform well without needing to satisfy the two assumptions on which classical LDA relies, and can effectively capture the local structure of samples.
Abstract: The linear discriminant analysis (LDA) is a very popular linear feature extraction approach. The algorithms of LDA usually perform well under the following two assumptions. The first assumption is that the global data structure is consistent with the local data structure. The second assumption is that each input data class follows a Gaussian distribution. However, in real-world applications, these assumptions are not always satisfied. In this paper, we propose an improved LDA framework, the local LDA (LLDA), which can perform well without needing to satisfy the above two assumptions. Our LLDA framework can effectively capture the local structure of samples. According to different types of local data structure, our LLDA framework incorporates several different forms of linear feature extraction approaches, such as the classical LDA and principal component analysis. The proposed framework includes two LLDA algorithms: a vector-based LLDA algorithm and a matrix-based LLDA (MLLDA) algorithm. MLLDA is directly applicable to image recognition, such as face recognition. Our algorithms need to train only a small portion of the whole training set before testing a sample. They are suitable for learning large-scale databases, especially when the input data dimensions are very high, and can achieve high classification accuracy. Extensive experiments show that the proposed algorithms can obtain good classification results.


Book
09 Feb 2011
Abstract: Introduction; A/B Split Testing; Analytic Hierarchy Process; Association and Causation in Quantitative Research; Association Rule Mining; Bass Model; Canonical Correlation; Chaos Theory; Classification and Ranking Belief Simplex; Clustering Algorithms; Cluster Analysis; Coding Conventions for Representing Information; Conceptual Equivalence; Confidence Intervals; Conjoint Analysis; Construct Operationalization Review (CORE); Correlation Analysis; Correspondence Analysis; Cross-National/Cultural Comparisons; Cross-Sectional Designs; Data Envelopment Analysis in Management; Data Mining; Data Set Structure; Data Transformation; Dempster-Shafer Theory; Designing Experiments; Discriminant Analysis; Dominance Analysis; DS/AHP; Dummy Variable Coding; Event Studies Methodology; Evaluation Research; Experimental Design; Exploratory or Confirmatory Factor Analysis; Factor Analysis; Fuzzy Decision Trees; Fuzzy Sets; Fuzzy Time Series Models; Generalized Linear Mixed Models; Generalized Linear Models; Geodemographics; Geographical Information Systems; Growth Models; Intraclass Correlation Coefficient; Internal and External Validity; Item Response Theory; Item Response Theory for Management; Latent Segments Analysis; Latent Variable Models; Logical Discriminant Models; Logistic Growth Model; Logistic Regression; Logistic Spline Model; Measurement Reliability; Measurement Invariance in Multigroup Research; Measurement Scales; Moderator-Mediator Variable Distinction; Mixture Models; Multi-Attribute Utility/Value Theory; Multidimensional Scaling; Multilevel Models; Multi-Logistic Growth Model; Multinomial Logistic Regression; Multi-State Modelling; Non-Parametric Measures of Association; Optimal Control Models in Management; Ordinary Least-Squares Regression; Panel Design; Paper Versus Electronic Surveys; Parametric Tests; Partial Correlation; Principal Components Analysis; Programme Evaluation; PROMETHEE Method of Ranking Alternatives; Proportional Odds Model; Quantile Estimators; Quantile Estimators - Bootstrap Confidence Intervals; Rasch Model for Measurement; Receiver Operating Characteristic; Response Styles in Cross-National Research; Retail Site Selection; Sampling Equivalence in Cross-National Research; Sample Size for Proportions; Sample Sizes Versus Usable Observations; Self-Organizing Maps; Single-Case Research Designs; Single-Case Single-Baseline Designs; Single-Case Multiple-Baseline Designs; Simulation - Discrete Event; Simulation - Methodology; Structural Equation Modelling in Business Management; Structural Equation Modelling in Marketing - Part 1: Introduction and Basic Concepts; Structural Equation Modelling in Marketing - Part 2: Model Calibration; Structural Equation Modelling in Marketing - Part 3: An Example; Study Design; Survey Design; Tabu Search; Testing a Simple Hypothesis; Validity; Variable Precision Rough Sets; Voronoi Diagrams; Web Surveys; Index.

Posted Content
TL;DR: A simple and effective classifier is introduced by estimating the product Ωδ directly through constrained ℓ1 minimization and it has superior finite sample performance and significant computational advantages over the existing methods that require separate estimation of Ω and δ.
Abstract: This paper considers sparse linear discriminant analysis of high-dimensional data. In contrast to the existing methods which are based on separate estimation of the precision matrix Ω and the difference δ of the mean vectors, we introduce a simple and effective classifier by estimating the product Ωδ directly through constrained ℓ1 minimization. The estimator can be implemented efficiently using linear programming and the resulting classifier is called the linear programming discriminant (LPD) rule. The LPD rule is shown to have desirable theoretical and numerical properties. It exploits the approximate sparsity of Ωδ and as a consequence allows cases where it can still perform well even when Ω and/or δ cannot be estimated consistently. Asymptotic properties of the LPD rule are investigated and consistency and rate of convergence results are given. The LPD classifier has superior finite sample performance and significant computational advantages over the existing methods that require separate estimation of Ω and δ. The LPD rule is also applied to analyze real datasets from lung cancer and leukemia studies. The classifier performs favorably in comparison to existing methods.

Journal ArticleDOI
TL;DR: A Multi-Manifold Discriminant Analysis method, based on graph-embedded learning under the Fisher discriminant analysis framework, is proposed and leads to promising image recognition performance.

Journal ArticleDOI
TL;DR: A novel face representation based on locally adaptive regression kernel (LARK) descriptors which achieves state-of-the-art face verification performance on the challenging benchmark “Labeled Faces in the Wild” (LFW) dataset is presented.
Abstract: We present a novel face representation based on locally adaptive regression kernel (LARK) descriptors. Our LARK descriptor measures a self-similarity based on “signal-induced distance” between a center pixel and surrounding pixels in a local neighborhood. By applying principal component analysis (PCA) and a logistic function to LARK consecutively, we develop a new binary-like face representation which achieves state-of-the-art face verification performance on the challenging benchmark “Labeled Faces in the Wild” (LFW) dataset. In the case where training data are available, we employ one-shot similarity (OSS) based on linear discriminant analysis (LDA). The proposed approach achieves state-of-the-art performance in both the unsupervised setting and the image-restricted training setting (72.23% and 78.90% verification rates, respectively) as a single descriptor representation, with no preprocessing step. As opposed to a combination of 30 distances, which achieves 85.13%, we achieve comparable performance (85.1%) with only 14 distances, while significantly reducing computational complexity.

Journal ArticleDOI
TL;DR: A new criterion for discriminative dimension reduction, max-min distance analysis (MMDA), is proposed, which maximizes the minimum pairwise distance of the C classes in the selected low-dimensional subspace; MMDA is also extended to kernel MMDA (KMMDA).
Abstract: We propose a new criterion for discriminative dimension reduction, max-min distance analysis (MMDA). Given a data set with C classes, represented by homoscedastic Gaussians, MMDA maximizes the minimum pairwise distance of these C classes in the selected low-dimensional subspace. Thus, unlike Fisher's linear discriminant analysis (FLDA) and other popular discriminative dimension reduction criteria, MMDA duly considers the separation of all class pairs. To deal with the general case of data distributions, we also extend MMDA to kernel MMDA (KMMDA). Dimension reduction via MMDA/KMMDA leads to a nonsmooth max-min optimization problem with orthonormal constraints. We develop a sequential convex relaxation algorithm to solve it approximately. To evaluate the effectiveness of the proposed criterion and the corresponding algorithm, we conduct classification and data visualization experiments on both synthetic data and real data sets. Experimental results demonstrate the effectiveness of MMDA/KMMDA associated with the proposed optimization algorithm.
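In symbols (my notation), with class means mu_i expressed after whitening by the common within-class covariance, per the homoscedastic Gaussian assumption, and W an orthonormal basis of the target subspace, the max-min criterion reads:

```latex
\max_{W^{\top}W = I}\ \ \min_{1 \le i < j \le C}\
(\mu_i - \mu_j)^{\top} W W^{\top} (\mu_i - \mu_j) .
```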

Journal ArticleDOI
TL;DR: This work suggests a new direction for accelerating microarray technologies into clinical routine through building a high-performance classifier that attains clinical-level sensitivities and specificities by treating an input profile as a ‘profile-biomarker’.
Abstract: Although high-throughput microarray-based molecular diagnostic technologies show great promise in cancer diagnosis, they are still far from clinical application due to low and unstable sensitivities and specificities in cancer molecular pattern recognition. In fact, high-dimensional and heterogeneous tumor profiles challenge current machine learning methodologies with their small number of samples and large, or even huge, number of variables (genes). This naturally calls for the use of effective feature selection in microarray data classification. We propose a novel feature selection method, multi-resolution independent component analysis (MICA), for large-scale gene expression data. This method overcomes the weak points of the widely used transform-based feature selection methods such as principal component analysis (PCA), independent component analysis (ICA), and nonnegative matrix factorization (NMF) by avoiding their global feature-selection mechanism. In addition to demonstrating the effectiveness of the multi-resolution independent component analysis in meaningful biomarker discovery, we present multi-resolution independent component analysis based support vector machine (MICA-SVM) and linear discriminant analysis (MICA-LDA) classifiers to attain high-performance classification in low-dimensional spaces. We have demonstrated the superiority and stability of our algorithms by performing comprehensive experimental comparisons with nine state-of-the-art algorithms on six high-dimensional heterogeneous profiles under cross validation. Our classification algorithms, especially MICA-SVM, not only accomplish clinical or near-clinical level sensitivities and specificities, but also show strong performance stability over their peers in classification. Software that implements the major algorithm and the data sets on which this paper focuses are freely available at https://sites.google.com/site/heyaumapbc2011/ . This work suggests a new direction for accelerating microarray technologies into clinical routine through building a high-performance classifier to attain clinical-level sensitivities and specificities by treating an input profile as a ‘profile-biomarker’. The multi-resolution data analysis based redundant global feature suppression and effective local feature extraction also have a positive impact on large-scale ‘omics’ data mining.

Journal ArticleDOI
TL;DR: The use of discriminant function analysis (DFA) in archaeological and related research is on the increase; however, many of the assumptions of this method receive mixed treatment in the literature.

Journal Article
TL;DR: The proposed GLCM-based face recognition system not only outperforms well-known techniques such as principal component analysis and linear discriminant analysis, but also has performance comparable to local binary patterns and Gabor wavelets.
Abstract: In this paper, a new face recognition technique is introduced based on the gray-level co-occurrence matrix (GLCM). The GLCM represents the distributions of the intensities and the information about relative positions of neighboring pixels of an image. We propose two methods to extract feature vectors using the GLCM for face classification. The first method extracts the well-known Haralick features from the GLCM, and the second method directly uses the GLCM by converting the matrix into a vector that can be used in the classification process. The results demonstrate that the second method, which uses the GLCM directly, is superior to the first method, which uses the feature vector containing the statistical Haralick features, with both nearest neighbor and neural network classifiers. The proposed GLCM-based face recognition system not only outperforms well-known techniques such as principal component analysis and linear discriminant analysis, but also has performance comparable to local binary patterns and Gabor wavelets.
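For reference, both feature options are a few lines with scikit-image; the quantization level, distances, and angles below are illustrative, and the paper's exact settings may differ.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_features(face, levels=32):
    """Option 1: Haralick-style statistics of the GLCM.
    Option 2: the flattened GLCM itself as the feature vector
    (the variant the paper found superior). `face` is 8-bit grayscale."""
    q = (face // (256 // levels)).astype(np.uint8)       # quantize gray levels
    glcm = graycomatrix(q, distances=[1], angles=[0, np.pi / 2],
                        levels=levels, symmetric=True, normed=True)
    haralick = np.hstack([graycoprops(glcm, p).ravel()
                          for p in ("contrast", "homogeneity",
                                    "energy", "correlation")])
    return haralick, glcm.ravel()
```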