
Showing papers on "Linear discriminant analysis published in 2003"


Journal ArticleDOI
TL;DR: It is found that both the LS-SVM and neural network classifiers yield very good performance, but simple classifiers such as logistic regression and linear discriminant analysis also perform very well for credit scoring.
Abstract: In this paper, we study the performance of various state-of-the-art classification algorithms applied to eight real-life credit scoring data sets. Some of the data sets originate from major Benelux and UK financial institutions. Different types of classifiers are evaluated and compared. Besides the well-known classification algorithms (e.g. logistic regression, discriminant analysis, k-nearest neighbour, neural networks and decision trees), this study also investigates the suitability and performance of some recently proposed, advanced kernel-based classification algorithms such as support vector machines and least-squares support vector machines (LS-SVMs). The performance is assessed using the classification accuracy and the area under the receiver operating characteristic curve. Statistically significant performance differences are identified using the appropriate test statistics. It is found that both the LS-SVM and neural network classifiers yield very good performance, but simple classifiers such as logistic regression and linear discriminant analysis also perform very well for credit scoring.
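As a rough illustration of the kind of comparison described above — several classifiers scored by accuracy and area under the ROC curve — the following Python sketch runs logistic regression, LDA, and an RBF-kernel SVM on a synthetic stand-in for a credit-scoring data set (the study's own data sets are proprietary and are not used here).

```python
# Hedged sketch: comparing simple and kernel-based classifiers by accuracy and
# ROC AUC on synthetic data standing in for a credit-scoring data set.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, roc_auc_score

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "LDA": LinearDiscriminantAnalysis(),
    "SVM (RBF kernel)": SVC(probability=True),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    proba = model.predict_proba(X_te)[:, 1]
    print(name,
          "accuracy=%.3f" % accuracy_score(y_te, model.predict(X_te)),
          "AUC=%.3f" % roc_auc_score(y_te, proba))
```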

906 citations


Journal ArticleDOI
TL;DR: A new algorithm is proposed that deals with both shortcomings of traditional linear discriminant analysis methods for face recognition systems in an efficient and cost-effective manner.
Abstract: Low-dimensional feature representation with enhanced discriminatory power is of paramount importance to face recognition (FR) systems. Most traditional linear discriminant analysis (LDA)-based methods suffer from the disadvantage that their optimality criteria are not directly related to the classification ability of the obtained feature representation. Moreover, their classification accuracy is affected by the "small sample size" (SSS) problem which is often encountered in FR tasks. In this paper, we propose a new algorithm that deals with both of the shortcomings in an efficient and cost-effective manner. The proposed method is compared, in terms of classification accuracy, to other commonly used FR methods on two face databases. Results indicate that the performance of the proposed method is overall superior to those of traditional FR approaches, such as the eigenfaces, fisherfaces, and D-LDA methods.

811 citations


Journal ArticleDOI
TL;DR: In this paper, a method of generalized discriminant analysis based on a dissimilarity matrix is described to test for differences in a priori groups of multivariate observations. The permutation distributions of the resulting statistics are invariant to changes in the distributions of the original variables, unlike the multi-response permutation test statistics that other workers have considered for testing differences among groups.
Abstract: This paper describes a method of generalized discriminant analysis based on a dissimilarity matrix to test for differences in a priori groups of multivariate observations. Use of classical multidimensional scaling produces a low-dimensional representation of the data for which Euclidean distances approximate the original dissimilarities. The resulting scores are then analysed using discriminant analysis, giving tests based on the canonical correlations. The asymptotic distributions of these statistics under permutations of the observations are shown to be invariant to changes in the distributions of the original variables, unlike the distributions of the multi-response permutation test statistics which have been considered by other workers for testing differences among groups. This canonical method is applied to multivariate fish assemblage data, with Monte Carlo simulations to make power comparisons and to compare theoretical results and empirical distributions. The paper also proposes a classification procedure based on distances, with error rates estimated using cross-validation.
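A minimal sketch of the two-step construction described above — classical multidimensional scaling of a dissimilarity matrix, followed by linear discriminant analysis on the resulting scores — assuming a toy Euclidean dissimilarity matrix in place of the fish assemblage data.

```python
# Classical MDS (principal coordinates) of a dissimilarity matrix, then LDA
# on the low-dimensional scores. Toy data only; any dissimilarity matrix works.
import numpy as np
from scipy.spatial.distance import pdist, squareform
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def classical_mds(D, k):
    """Principal coordinate scores whose Euclidean distances approximate D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centring matrix
    B = -0.5 * J @ (D ** 2) @ J                  # double-centred squared dissimilarities
    w, V = np.linalg.eigh(B)
    idx = np.argsort(w)[::-1][:k]                # keep the k largest eigenvalues
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (30, 5)), rng.normal(1, 1, (30, 5))])
groups = np.repeat([0, 1], 30)

D = squareform(pdist(X))                         # stand-in dissimilarity matrix
scores = classical_mds(D, k=3)
lda = LinearDiscriminantAnalysis().fit(scores, groups)
print("training accuracy:", lda.score(scores, groups))
```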

689 citations


Journal ArticleDOI
28 Jul 2003
TL;DR: The results of a linear classifier (linear discriminant analysis) and two nonlinear classifiers applied to the classification of spontaneous EEG during five mental tasks are reported, showing that nonlinear classifiers produce only slightly better classification results.
Abstract: The reliable operation of brain-computer interfaces (BCIs) based on spontaneous electroencephalogram (EEG) signals requires accurate classification of multichannel EEG. The design of EEG representations and classifiers for BCI are open research questions whose difficulty stems from the need to extract complex spatial and temporal patterns from noisy multidimensional time series obtained from EEG measurements. The high-dimensional and noisy nature of EEG may limit the advantage of nonlinear classification methods over linear ones. This paper reports the results of a linear (linear discriminant analysis) and two nonlinear classifiers (neural networks and support vector machines) applied to the classification of spontaneous EEG during five mental tasks, showing that nonlinear classifiers produce only slightly better classification results. An approach to feature selection based on genetic algorithms is also presented with preliminary results of application to EEG during finger movement.

686 citations


Journal ArticleDOI
TL;DR: This paper proposes a kernel machine-based discriminant analysis method, which deals with the nonlinearity of the face patterns' distribution and effectively solves the so-called "small sample size" (SSS) problem, which exists in most FR tasks.
Abstract: Techniques that can introduce low-dimensional feature representation with enhanced discriminatory power are of paramount importance in face recognition (FR) systems. It is well known that the distribution of face images, under a perceivable variation in viewpoint, illumination or facial expression, is highly nonlinear and complex. It is, therefore, not surprising that linear techniques, such as those based on principal component analysis (PCA) or linear discriminant analysis (LDA), cannot provide reliable and robust solutions to those FR problems with complex face variations. In this paper, we propose a kernel machine-based discriminant analysis method, which deals with the nonlinearity of the face patterns' distribution. The proposed method also effectively solves the so-called "small sample size" (SSS) problem, which exists in most FR tasks. The new algorithm has been tested, in terms of classification error rate performance, on the multiview UMIST face database. Results indicate that the proposed methodology is able to achieve excellent performance with only a very small set of features being used, and its error rate is approximately 34% and 48% of those of two other commonly used kernel FR approaches, the kernel-PCA (KPCA) and the generalized discriminant analysis (GDA), respectively.
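The paper's kernel discriminant algorithm is not reproduced here, but the general recipe it builds on — a nonlinear kernel mapping followed by a linear discriminant in the induced feature space — can be approximated by chaining KernelPCA with LDA in scikit-learn. The digits data set below is only a stand-in for a face database.

```python
# Rough approximation of kernel discriminant analysis: kernel mapping (KPCA)
# followed by a linear discriminant in the mapped space.
from sklearn.datasets import load_digits
from sklearn.decomposition import KernelPCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

X, y = load_digits(return_X_y=True)              # stand-in for a face database
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

kda_like = make_pipeline(
    KernelPCA(n_components=50, kernel="rbf", gamma=1e-3),  # nonlinear mapping
    LinearDiscriminantAnalysis(),                           # discriminant in feature space
)
kda_like.fit(X_tr, y_tr)
print("test error rate: %.3f" % (1 - kda_like.score(X_te, y_te)))
```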

651 citations


Journal ArticleDOI
TL;DR: This paper points out a weakness of previous LDA-based methods and proposes a complete PCA plus LDA algorithm; experimental results indicate that the proposed method is more effective than the previous ones.

583 citations


Proceedings Article
21 Aug 2003
TL;DR: It is empirically demonstrated that learning a distance metric using the RCA algorithm significantly improves clustering performance, similarly to the alternative algorithm.
Abstract: We address the problem of learning distance metrics using side-information in the form of groups of "similar" points. We propose to use the RCA algorithm, which is a simple and efficient algorithm for learning a full-rank Mahalanobis metric (Shental et al., 2002). We first show that RCA obtains the solution to an interesting optimization problem, founded on an information theoretic basis. If the Mahalanobis matrix is allowed to be singular, we show that Fisher's linear discriminant followed by RCA is the optimal dimensionality reduction algorithm under the same criterion. We then show how this optimization problem is related to the criterion optimized by another recent algorithm for metric learning (Xing et al., 2002), which uses the same kind of side information. We empirically demonstrate that learning a distance metric using the RCA algorithm significantly improves clustering performance, similarly to the alternative algorithm. Since the RCA algorithm is much more efficient and cost-effective than the alternative, as it only uses closed-form expressions of the data, it seems a preferable choice for the learning of full-rank Mahalanobis distances.
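A short sketch of the core RCA computation as it is usually described: centre each chunklet of known-similar points, pool the within-chunklet covariance, and whiten the data with its inverse square root. The chunklet data below are synthetic.

```python
# RCA-style metric learning from chunklets of "similar" points.
import numpy as np

def rca_transform(chunklets):
    """chunklets: list of (n_i, d) arrays of points known to be similar."""
    centred = [c - c.mean(axis=0) for c in chunklets]
    Z = np.vstack(centred)
    C = Z.T @ Z / Z.shape[0]                     # pooled within-chunklet covariance
    w, V = np.linalg.eigh(C)
    # C^{-1/2}: the whitening matrix defining the learned Mahalanobis metric
    return V @ np.diag(1.0 / np.sqrt(np.maximum(w, 1e-12))) @ V.T

rng = np.random.default_rng(1)
chunklets = [rng.normal(m, [1.0, 0.1], size=(20, 2)) for m in ([0, 0], [5, 0], [0, 5])]
W = rca_transform(chunklets)
X_new = np.vstack(chunklets) @ W.T               # data expressed in the learned metric
```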

481 citations


Journal ArticleDOI
01 Sep 2003
TL;DR: An approach to the detection of tumors in colonoscopic video is presented, based on a new color feature extraction scheme built on the wavelet decomposition to represent the different regions in the frame sequence; it reaches 97% specificity and 90% sensitivity.
Abstract: We present an approach to the detection of tumors in colonoscopic video. It is based on a new color feature extraction scheme to represent the different regions in the frame sequence. This scheme is built on the wavelet decomposition. The features, named color wavelet covariance (CWC), are based on the covariances of second-order textural measures, and an optimum subset of them is proposed after the application of a selection algorithm. The proposed approach is supported by a linear discriminant analysis (LDA) procedure for the characterization of the image regions along the video frames. The whole methodology has been applied to real data sets of color colonoscopic videos. The performance in the detection of abnormal colonic regions corresponding to adenomatous polyps has been estimated to be high, reaching 97% specificity and 90% sensitivity.

480 citations


Journal ArticleDOI
TL;DR: The experimental results indicate that the classification accuracy is increased significantly under parallel feature fusion, and also demonstrate that the developed parallel fusion is more effective than the classical serial feature fusion.

418 citations


Journal ArticleDOI
TL;DR: The experimental results show that, in the proposed method, palmprint images with resolution 32 × 32 are optimal for medium-security biometric systems, while those with resolution 64 × 64 are optimal for high-security biometric systems.

416 citations


Journal ArticleDOI
TL;DR: It is shown that leave-one-out cross-validation of kernel Fisher discriminant classifiers can be implemented with a computational complexity of only O(l^3) operations rather than the O(l^4) of a naive implementation, where l is the number of training patterns.

Journal ArticleDOI
TL;DR: The performance of PLS-DA with published data from breast cancer is found to be extremely satisfactory in all cases, and the discriminant cDNA clones often had a sound biological interpretation.
Abstract: Partial least squares discriminant analysis (PLS-DA) is a partial least squares regression of a set Y of binary variables describing the categories of a categorical variable on a set X of predictor variables. It is a compromise between the usual discriminant analysis and a discriminant analysis on the significant principal components of the predictor variables. This technique is especially suited to deal with a much larger number of predictors than observations and with multicollinearity, two of the main problems encountered when analysing microarray expression data. We explore the performance of PLS-DA with published data from breast cancer (Perou et al. 2000). Several such analyses were carried out: (1) before vs after chemotherapy treatment, (2) estrogen receptor positive vs negative tumours, and (3) tumour classification. We found that the performance of PLS-DA was extremely satisfactory in all cases and that the discriminant cDNA clones often had a sound biological interpretation. We conclude that PLS-DA is a powerful yet simple tool for analysing microarray data.
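A minimal PLS-DA sketch following the description above: dummy-code the class labels, regress them on the predictors with partial least squares, and assign each sample to the class with the larger predicted indicator. The "expression" matrix below is synthetic.

```python
# PLS-DA on a wide, synthetic predictor matrix (many more variables than samples).
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.preprocessing import LabelBinarizer

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 500))                   # 40 samples, 500 predictors
y = np.repeat(["before", "after"], 20)

Y = LabelBinarizer().fit_transform(y)            # binary indicator column (1 = "before")
pls = PLSRegression(n_components=3).fit(X, Y)

pred = pls.predict(X).ravel()
labels = np.where(pred > 0.5, "before", "after")
print("training accuracy:", (labels == y).mean())
```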

Book ChapterDOI
TL;DR: This work uses two different strategies for fusing iris and face classifiers; one strategy treats the matching distances of the face and iris classifiers as a two-dimensional feature vector and uses a classifier such as Fisher's discriminant analysis or a radial basis function neural network to classify the vector as genuine or impostor.
Abstract: Face and iris identification have been employed in various biometric applications. Besides improving verification performance, the fusion of these two biometrics has several other advantages. We use two different strategies for fusing iris and face classifiers. The first strategy is to compute either an unweighted or weighted sum and to compare the result to a threshold. The second strategy is to treat the matching distances of the face and iris classifiers as a two-dimensional feature vector and to use a classifier such as Fisher's discriminant analysis or a neural network with radial basis functions (RBFNN) to classify the vector as being genuine or an impostor. We compare the results of the combined classifier with the results of the individual face and iris classifiers.
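A toy sketch of the two fusion strategies described above, using synthetic matching distances: (a) a weighted sum of the face and iris distances compared to a threshold, and (b) Fisher's linear discriminant applied to the two distances treated as a 2-D feature vector. The weights and threshold are illustrative only.

```python
# Score-level fusion of two matchers on synthetic genuine/impostor distances.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
genuine = np.column_stack([rng.normal(0.3, 0.1, 200), rng.normal(0.2, 0.1, 200)])
impostor = np.column_stack([rng.normal(0.7, 0.1, 200), rng.normal(0.8, 0.1, 200)])
scores = np.vstack([genuine, impostor])           # columns: face distance, iris distance
labels = np.repeat([1, 0], 200)                   # 1 = genuine, 0 = impostor

# (a) weighted sum of distances, accept if below an illustrative threshold
w = np.array([0.5, 0.5])
accept = scores @ w < 0.5

# (b) Fisher's discriminant on the 2-D distance vector
fld = LinearDiscriminantAnalysis().fit(scores, labels)
print("weighted-sum accuracy:", (accept == labels.astype(bool)).mean())
print("FLD fused accuracy:   ", fld.score(scores, labels))
```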

Book ChapterDOI
01 Apr 2003
TL;DR: The CSU Face Identification Evaluation System provides standard face recognition algorithms and standard statistical methods for comparing face recognition algorithm performance; it is hoped that it will be used by others to rigorously compare novel face identification algorithms to standard algorithms using a common implementation and known comparison techniques.
Abstract: The CSU Face Identification Evaluation System provides standard face recognition algorithms and standard statistical methods for comparing face recognition algorithms. The system includes standardized image pre-processing software, three distinct face recognition algorithms, analysis software to study algorithm performance, and Unix shell scripts to run standard experiments. All code is written in ANSI C. The preprocessing code replicates the preprocessing used in the FERET evaluations. The three algorithms provided are Principal Components Analysis (PCA), a.k.a. Eigenfaces, a combined Principal Components Analysis and Linear Discriminant Analysis algorithm (PCA+LDA), and a Bayesian Intrapersonal/Extrapersonal Classifier (BIC). The PCA+LDA and BIC algorithms are based upon algorithms used in the FERET study contributed by the University of Maryland and MIT, respectively. There are two analysis tools. The first takes as input a set of probe images, a set of gallery images, and a similarity matrix produced by one of the three algorithms. It generates a Cumulative Match Curve of recognition rate versus recognition rank. The second analysis tool generates a sample probability distribution for recognition rate at recognition rank 1, 2, etc. It takes as input multiple images per subject, and uses Monte Carlo sampling in the space of possible probe and gallery choices. This procedure will, among other things, add standard error bars to a Cumulative Match Curve. The system is available through our website, and we hope it will be used by others to rigorously compare novel face identification algorithms to standard algorithms using a common implementation and known comparison techniques.
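A hypothetical sketch of the first analysis tool described above: given a probe-by-gallery similarity matrix and the subject identities on each side, compute a Cumulative Match Curve of recognition rate versus rank. It assumes every probe identity appears in the gallery.

```python
# Cumulative Match Curve from a similarity matrix (higher score = better match).
import numpy as np

def cumulative_match_curve(similarity, probe_ids, gallery_ids):
    """similarity[i, j] = score of probe i against gallery image j."""
    order = np.argsort(-similarity, axis=1)            # best match first per probe
    ranked_ids = np.asarray(gallery_ids)[order]
    hits = ranked_ids == np.asarray(probe_ids)[:, None]
    first_rank = hits.argmax(axis=1) + 1               # rank of first correct match
    ranks = np.arange(1, similarity.shape[1] + 1)
    rates = np.array([(first_rank <= r).mean() for r in ranks])
    return ranks, rates
```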

Journal ArticleDOI
TL;DR: MCLUST is a software package for model-based clustering, density estimation and discriminant analysis interfaced to the S-PLUS commercial software and the R language; it implements parameterized Gaussian hierarchical clustering algorithms and the EM algorithm for parameterized Gaussian mixture models with the possible addition of a Poisson noise term.
Abstract: MCLUST is a software package for model-based clustering, density estimation and discriminant analysis interfaced to the S-PLUS commercial software and the R language. It implements parameterized Gaussian hierarchical clustering algorithms and the EM algorithm for parameterized Gaussian mixture models with the possible addition of a Poisson noise term. Also included are functions that combine hierarchical clustering, EM and the Bayesian Information Criterion (BIC) in comprehensive strategies for clustering, density estimation, and discriminant analysis. MCLUST provides functionality for displaying and visualizing clustering and classification results. A web page with related links can be found at http://www.stat.washington.edu/mclust.
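MCLUST itself is S-PLUS/R software and is not reproduced here; as a loose Python analogue of its core loop — fit parameterized Gaussian mixtures by EM over a range of model sizes and covariance parameterizations, then select the model by BIC — one might write:

```python
# BIC-driven selection over Gaussian mixture parameterizations (MCLUST-like idea).
from sklearn.datasets import load_iris
from sklearn.mixture import GaussianMixture

X = load_iris().data
candidates = (
    GaussianMixture(n_components=k, covariance_type=cov, random_state=0).fit(X)
    for k in range(1, 7)
    for cov in ("spherical", "diag", "tied", "full")
)
best = min(candidates, key=lambda gm: gm.bic(X))   # lower BIC = preferred model
print(best.n_components, best.covariance_type, best.bic(X))
```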

Book ChapterDOI
26 Jun 2003
TL;DR: Two extensions of LLE to supervised feature extraction were independently proposed by the authors of this paper and are unified in a common framework and applied to a number of benchmark data sets.
Abstract: Locally linear embedding (LLE) is a recently proposed method for unsupervised nonlinear dimensionality reduction. It has a number of attractive features: it does not require an iterative algorithm, and just a few parameters need to be set. Two extensions of LLE to supervised feature extraction were independently proposed by the authors of this paper. Here, both methods are unified in a common framework and applied to a number of benchmark data sets. Results show that they perform very well on high-dimensional data which exhibits a manifold structure.

Journal ArticleDOI
TL;DR: The minimum classification error (MCE) training algorithm (which was originally proposed for optimizing classifiers) is investigated for feature extraction, and a generalized MCE (GMCE) training algorithm is proposed to remedy the shortcomings of the MCE training algorithm.

Proceedings ArticleDOI
13 Oct 2003
TL;DR: This work proposes a new approach to mapping face images into a subspace obtained by locality preserving projections (LPP) for face analysis, which provides a better representation and achieves lower error rates in face recognition.
Abstract: We have demonstrated that the face recognition performance can be improved significantly in low dimensional linear subspaces. Conventionally, principal component analysis (PCA) and linear discriminant analysis (LDA) are considered effective in deriving such a face subspace. However, both of them effectively see only the Euclidean structure of face space. We propose a new approach to mapping face images into a subspace obtained by locality preserving projections (LPP) for face analysis. We call this Laplacianface approach. Different from PCA and LDA, LPP finds an embedding that preserves local information, and obtains a face space that best detects the essential manifold structure. In this way, the unwanted variations resulting from changes in lighting, facial expression, and pose may be eliminated or reduced. We compare the proposed Laplacianface approach with eigenface and fisherface methods on three test datasets. Experimental results show that the proposed Laplacianface approach provides a better representation and achieves lower error rates in face recognition.
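The Laplacianface projection is defined in the paper; the sketch below follows the standard LPP recipe (heat-kernel affinity graph, graph Laplacian, generalized eigenproblem) and adds a small ridge term for numerical stability, which is an implementation choice rather than part of the method.

```python
# Locality preserving projections: solve X L X^T a = lambda X D X^T a and keep
# the eigenvectors with the smallest eigenvalues.
import numpy as np
from scipy.linalg import eigh
from sklearn.neighbors import kneighbors_graph

def lpp(X, n_components=10, n_neighbors=5, t=1.0):
    dist = kneighbors_graph(X, n_neighbors, mode="distance").toarray()
    W = np.exp(-dist ** 2 / t) * (dist > 0)        # heat-kernel weights on the k-NN graph
    W = np.maximum(W, W.T)                         # symmetrize the affinity matrix
    D = np.diag(W.sum(axis=1))
    L = D - W                                      # graph Laplacian
    A = X.T @ L @ X
    B = X.T @ D @ X + 1e-6 * np.eye(X.shape[1])    # ridge keeps B positive definite
    evals, evecs = eigh(A, B)                      # ascending eigenvalues
    return evecs[:, :n_components]

# Usage: embedding = X @ lpp(X), with face images as the rows of X.
```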

Journal ArticleDOI
TL;DR: By integrating the information from multiple images and capturing the textural and anatomical features in tumor areas, summary statistical maps can potentially aid in image-guided prostate biopsy and assist in guiding and controlling delivery of localized therapy under image guidance.
Abstract: A multichannel statistical classifier for detecting prostate cancer was developed and validated by combining information from three different magnetic resonance (MR) methodologies: T2-weighted, T2-mapping, and line scan diffusion imaging (LSDI). From these MR sequences, four different sets of image intensities were obtained: T2-weighted (T2W) from T2-weighted imaging, Apparent Diffusion Coefficient (ADC) from LSDI, and proton density (PD) and T2 (T2 Map) from T2-mapping imaging. Manually segmented tumor labels from a radiologist, which were validated by biopsy results, served as tumor "ground truth." Textural features were extracted from the images using co-occurrence matrix (CM) and discrete cosine transform (DCT). Anatomical location of voxels was described by a cylindrical coordinate system. A statistical jack-knife approach was used to evaluate our classifiers. Single-channel maximum likelihood (ML) classifiers were based on 1 of the 4 basic image intensities. Our multichannel classifiers: support vector machine (SVM) and Fisher linear discriminant (FLD), utilized five different sets of derived features. Each classifier generated a summary statistical map that indicated tumor likelihood in the peripheral zone (PZ) of the prostate gland. To assess classifier accuracy, the average areas under the receiver operator characteristic (ROC) curves over all subjects were compared. Our best FLD classifier achieved an average ROC area of 0.839(+/-0.064), and our best SVM classifier achieved an average ROC area of 0.761(+/-0.043). The T2W ML classifier, our best single-channel classifier, only achieved an average ROC area of 0.599(+/-0.146). Compared to the best single-channel ML classifier, our best multichannel FLD and SVM classifiers have statistically superior ROC performance (P=0.0003 and 0.0017, respectively) from pairwise two-sided t-test. By integrating the information from multiple images and capturing the textural and anatomical features in tumor areas, summary statistical maps can potentially aid in image-guided prostate biopsy and assist in guiding and controlling delivery of localized therapy under image guidance.

Journal ArticleDOI
TL;DR: This paper applies statistical tests based on discriminant analysis to the wide range of photospheric magnetic parameters described in a companion paper by Leka & Barnes, with the goal of identifying those properties that are important for the production of energetic events such as solar flares.
Abstract: We apply statistical tests based on discriminant analysis to the wide range of photospheric magnetic parameters described in a companion paper by Leka & Barnes, with the goal of identifying those properties that are important for the production of energetic events such as solar flares. The photospheric vector magnetic field data from the University of Hawai'i Imaging Vector Magnetograph are well sampled both temporally and spatially, and we include here data covering 24 flare-event and flare-quiet epochs taken from seven active regions. The mean value and rate of change of each magnetic parameter are treated as separate variables, thus evaluating both the parameter's state and its evolution, to determine which properties are associated with flaring. Considering single variables first, Hotelling's T2-tests show small statistical differences between flare-producing and flare-quiet epochs. Even pairs of variables considered simultaneously, which do show a statistical difference for a number of properties, have high error rates, implying a large degree of overlap of the samples. To better distinguish between flare-producing and flare-quiet populations, larger numbers of variables are simultaneously considered; lower error rates result, but no unique combination of variables is clearly the best discriminator. The sample size is too small to directly compare the predictive power of large numbers of variables simultaneously. Instead, we rank all possible four-variable permutations based on Hotelling's T2-test and look for the most frequently appearing variables in the best permutations, with the interpretation that they are most likely to be associated with flaring. These variables include an increasing kurtosis of the twist parameter and a larger standard deviation of the twist parameter, but a smaller standard deviation of the distribution of the horizontal shear angle and a horizontal field that has a smaller standard deviation but a larger kurtosis. To support the "sorting all permutations" method of selecting the most frequently occurring variables, we show that the results of a single 10-variable discriminant analysis are consistent with the ranking. We demonstrate that individually, the variables considered here have little ability to differentiate between flaring and flare-quiet populations, but with multivariable combinations, the populations may be distinguished.
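For reference, the two-sample Hotelling's T2 statistic used in these tests can be computed directly; the generic sketch below does not reproduce the paper's variable-ranking procedure, and the inputs are simply two samples of parameter vectors.

```python
# Two-sample Hotelling's T^2 test with an F-distribution p-value.
import numpy as np
from scipy import stats

def hotelling_t2(X1, X2):
    n1, p = X1.shape
    n2, _ = X2.shape
    d = X1.mean(axis=0) - X2.mean(axis=0)
    S_pooled = ((n1 - 1) * np.cov(X1, rowvar=False) +
                (n2 - 1) * np.cov(X2, rowvar=False)) / (n1 + n2 - 2)
    T2 = (n1 * n2) / (n1 + n2) * d @ np.linalg.solve(S_pooled, d)
    F = T2 * (n1 + n2 - p - 1) / (p * (n1 + n2 - 2))   # convert to an F statistic
    p_value = stats.f.sf(F, p, n1 + n2 - p - 1)
    return T2, p_value
```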

Journal ArticleDOI
TL;DR: This review presents an overview of another class of models, cluster analysis, which will likely be less familiar to pharmacogenetics researchers, and demonstrates the use of distance-based methods of hierarchical clustering to analyze gene expression data.
Abstract: As pharmacogenetics researchers gather more detailed and complex data on gene polymorphisms that affect drug-metabolizing enzymes, drug target receptors and drug transporters, they will need access to advanced statistical tools to mine that data. These tools include approaches from classical biostatistics, such as logistic regression or linear discriminant analysis, and supervised learning methods from computer science, such as support vector machines and artificial neural networks. In this review, we present an overview of another class of models, cluster analysis, which will likely be less familiar to pharmacogenetics researchers. Cluster analysis is used to analyze data that is not a priori known to contain any specific subgroups. The goal is to use the data itself to identify meaningful or informative subgroups. Specifically, we will focus on demonstrating the use of distance-based methods of hierarchical clustering to analyze gene expression data.
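A small sketch of the distance-based hierarchical clustering the review demonstrates, using scipy on a synthetic expression matrix (genes as rows, correlation distance, average linkage).

```python
# Distance-based hierarchical clustering of synthetic expression profiles.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
expression = rng.normal(size=(50, 12))                # 50 genes x 12 samples (synthetic)

distances = pdist(expression, metric="correlation")   # 1 - Pearson correlation per gene pair
tree = linkage(distances, method="average")           # build the dendrogram
clusters = fcluster(tree, t=3, criterion="maxclust")  # cut the tree into 3 subgroups
```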

Journal ArticleDOI
TL;DR: The Q5 method outperforms previous full-spectrum complex sample spectral classification techniques and can provide clues as to the molecular identities of differentially expressed proteins and peptides.
Abstract: We have developed an algorithm called Q5 for probabilistic classification of healthy vs. disease whole serum samples using mass spectrometry. The algorithm employs Principal Components Analysis (PCA) followed by Linear Discriminant Analysis (LDA) on whole-spectrum Surface-Enhanced Laser Desorption/Ionization Time of Flight (SELDI-TOF) Mass Spectrometry (MS) data, and is demonstrated on four real datasets from complete, complex SELDI spectra of human blood serum. Q5 is a closed-form, exact solution to the problem of classification of complete mass spectra of a complex protein mixture. Q5 employs a probabilistic classification algorithm built upon a dimension-reduced linear discriminant analysis. Our solution is computationally efficient; it is non-iterative and computes the optimal linear discriminant using closed-form equations. The optimal discriminant is computed and verified for datasets of complete, complex SELDI spectra of human blood serum. Replicate experiments of different training/testing splits of each dataset are employed to verify robustness of the algorithm. The probabilistic classification method achieves excellent performance. We achieve sensitivity, specificity, and positive predictive values above 97% on three ovarian cancer datasets and one prostate cancer dataset. The Q5 method outperforms previous full-spectrum complex sample spectral classification techniques, and can provide clues as to the molecular identities of differentially expressed proteins and peptides.
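The exact Q5 equations are given in the paper; the following is a hedged sketch of the same PCA-then-LDA recipe on synthetic stand-ins for SELDI-TOF spectra, with LDA's posterior probabilities serving as the probabilistic classification.

```python
# PCA for dimension reduction, then LDA with posterior class probabilities.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
spectra = rng.normal(size=(100, 5000))           # 100 samples, 5000 m/z bins (synthetic)
labels = np.repeat([0, 1], 50)                   # 0 = healthy, 1 = disease

q5_like = make_pipeline(PCA(n_components=20), LinearDiscriminantAnalysis())
q5_like.fit(spectra, labels)
posterior = q5_like.predict_proba(spectra)       # probabilistic class assignments
```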

Journal ArticleDOI
TL;DR: This work adapts and extends the discriminant analysis projection used in pattern recognition, and shows that by using the generalized singular value decomposition (GSVD) the same goal can be achieved regardless of the relative dimensions of the term-document matrix.
Abstract: In today's vector space information retrieval systems, dimension reduction is imperative for efficiently manipulating the massive quantity of data. To be useful, this lower-dimensional representation must be a good approximation of the full document set. To that end, we adapt and extend the discriminant analysis projection used in pattern recognition. This projection preserves cluster structure by maximizing the scatter between clusters while minimizing the scatter within clusters. A common limitation of trace optimization in discriminant analysis is that one of the scatter matrices must be nonsingular, which restricts its application to document sets in which the number of terms does not exceed the number of documents. We show that by using the generalized singular value decomposition (GSVD), we can achieve the same goal regardless of the relative dimensions of the term-document matrix. In addition, applying the GSVD allows us to avoid the explicit formation of the scatter matrices in favor of working directly with the data matrix, thus improving the numerical properties of the approach. Finally, we present experimental results that confirm the effectiveness of our approach.
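The GSVD formulation is the paper's contribution and is not reproduced here; the sketch below shows the standard trace-optimization form it generalizes — scatter matrices built from the cluster structure and a generalized eigenproblem — with a small ridge standing in for the nonsingularity requirement that the GSVD approach removes.

```python
# Cluster-preserving projection via between/within scatter matrices.
import numpy as np
from scipy.linalg import eigh

def cluster_preserving_projection(A, labels, k):
    """A: documents-by-terms matrix; labels: cluster index per document."""
    mean_all = A.mean(axis=0)
    d = A.shape[1]
    Sw = np.zeros((d, d))
    Sb = np.zeros((d, d))
    for c in np.unique(labels):
        Ac = A[labels == c]
        diff = Ac - Ac.mean(axis=0)
        Sw += diff.T @ diff                       # within-cluster scatter
        m = (Ac.mean(axis=0) - mean_all)[:, None]
        Sb += Ac.shape[0] * (m @ m.T)             # between-cluster scatter
    # Generalized eigenproblem Sb g = lambda Sw g; the ridge enforces the
    # nonsingularity that the paper's GSVD formulation avoids needing.
    evals, evecs = eigh(Sb, Sw + 1e-6 * np.eye(d))
    return evecs[:, np.argsort(evals)[::-1][:k]]  # directions with largest ratio first
```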

Journal ArticleDOI
TL;DR: In this paper, distortion discriminant analysis (DDA) is proposed for mapping audio data to feature vectors for classification, retrieval or identification tasks while keeping the feature extraction operation computationally efficient.
Abstract: Mapping audio data to feature vectors for the classification, retrieval or identification tasks presents four principal challenges. The dimensionality of the input must be significantly reduced; the resulting features must be robust to likely distortions of the input; the features must be informative for the task at hand; and the feature extraction operation must be computationally efficient. We propose distortion discriminant analysis (DDA), which fulfills all four of these requirements. DDA constructs a linear, convolutional neural network out of layers, each of which performs an oriented PCA dimensional reduction. We demonstrate the effectiveness of DDA on two audio fingerprinting tasks: searching for 500 audio clips in 36 h of audio test data; and playing over 10 days of audio against a database with approximately 240 000 fingerprints. We show that the system is robust to kinds of noise that are not present in the training procedure. In the large test, the system gives a false positive rate of 1.5 × 10^-8 per audio clip, per fingerprint, at a false negative rate of 0.2% per clip.

Proceedings ArticleDOI
06 Jul 2003
TL;DR: This work uses the sum rule and RBF-based integration strategies to combine three commonly used face classifiers based on PCA, ICA and LDA representations and shows that the proposed classifier combination approaches outperform individual classifiers.
Abstract: Current two-dimensional face recognition approaches can obtain a good performance only under constrained environments. However, in the real applications, face appearance changes significantly due to different illumination, pose, and expression. Face recognizers based on different representations of the input face images have different sensitivity to these variations. Therefore, a combination of different face classifiers which can integrate the complementary information should lead to improved classification accuracy. We use the sum rule and RBF-based integration strategies to combine three commonly used face classifiers based on PCA, ICA and LDA representations. Experiments conducted on a face database containing 206 subjects (2,060 face images) show that the proposed classifier combination approaches outperform individual classifiers.

Journal ArticleDOI
TL;DR: The concept of the generalized Fisher discriminant criterion is presented and the generalized Foley-Sammon transform (GFST) is proposed; experimental results show that the present method is superior to existing methods in terms of correct classification rate.

Journal ArticleDOI
TL;DR: A new QDA-like method is proposed that effectively addresses the SSS problem using a regularization technique and outperforms traditional methods such as Eigenfaces, direct QDA and direct LDA in a number of SSS scenarios.

Journal ArticleDOI
TL;DR: A new clustering-based feature extraction method for facial expression recognition is described that is effective when used in conjunction with the commonly used principal component analysis and linear discriminant analysis methods.

Journal ArticleDOI
TL;DR: The combination of classifiers leads to a substantial reduction of misclassification error in a wide range of applications and benchmark problems, and the procedure performs comparably to the best classifiers in a number of artificial examples and applications.

Journal ArticleDOI
TL;DR: A number of methods have been proposed in the last decade to overcome the small-sample-size limitation of LDA; as applied to face recognition, these methods can be roughly grouped into three categories.