Topic

Linear discriminant analysis

About: Linear discriminant analysis (LDA) is a research topic. Over its lifetime, 18,361 publications have appeared within this topic, receiving 603,195 citations.


Papers
Journal Article
TL;DR: Generalized discriminant analysis (GDA) is a new method for nonlinear discriminant analysis using a kernel function operator; like support vector machines, it maps the input vectors into a high-dimensional feature space.
Abstract: We present a new method that we call generalized discriminant analysis (GDA) to deal with nonlinear discriminant analysis using kernel function operator. The underlying theory is close to the support vector machines (SVM) insofar as the GDA method provides a mapping of the input vectors into high-dimensional feature space. In the transformed space, linear properties make it easy to extend and generalize the classical linear discriminant analysis (LDA) to nonlinear discriminant analysis. The formulation is expressed as an eigenvalue problem resolution. Using a different kernel, one can cover a wide class of nonlinearities. For both simulated data and alternate kernels, we give classification results, as well as the shape of the decision function. The results are confirmed using real data to perform seed classification.

1,743 citations
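
The kernel idea in the abstract above can be sketched in a few lines. What follows is not the GDA algorithm of the paper (which is multi-class and posed as an eigenvalue problem) but a simplified two-class kernel Fisher discriminant, swapped in for brevity. The function names, the Gaussian kernel, and the values of `gamma` and `reg` are all assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian kernel matrix with entries exp(-gamma * ||a - b||^2)."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def fit_kernel_fisher(X, y, gamma=1.0, reg=1e-3):
    """Expansion coefficients alpha of a two-class kernel discriminant.

    Assumes labels are 0 and 1. `reg` is a hypothetical ridge term that
    keeps the within-class matrix invertible.
    """
    K = rbf_kernel(X, X, gamma)
    n = len(y)
    N = np.zeros((n, n))          # within-class scatter in feature space
    means = []
    for c in (0, 1):
        Kc = K[:, y == c]         # kernel columns belonging to class c
        nc = Kc.shape[1]
        means.append(Kc.mean(axis=1))
        center = np.eye(nc) - np.full((nc, nc), 1.0 / nc)
        N += Kc @ center @ Kc.T
    return np.linalg.solve(N + reg * np.eye(n), means[1] - means[0])

def project(X_train, alpha, X_new, gamma=1.0):
    """Project new points onto the discriminant direction."""
    return rbf_kernel(X_new, X_train, gamma) @ alpha

# Toy data: two concentric rings, which no linear discriminant can separate.
angles = np.linspace(0.0, 2.0 * np.pi, 20, endpoint=False)
ring = np.column_stack([np.cos(angles), np.sin(angles)])
X = np.vstack([0.5 * ring, 2.0 * ring])
y = np.array([0] * 20 + [1] * 20)

alpha = fit_kernel_fisher(X, y)
z = project(X, alpha, X)
print("rings separated:", z[y == 0].max() < z[y == 1].min())
```

Changing the kernel function changes the class of nonlinearities the discriminant can represent, which is the point the abstract makes about using different kernels.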

Journal Article
TL;DR: After pointing out the key assumptions underlying CCA, the paper focuses on the interpretation of CCA ordination diagrams; some advanced uses, such as ranking environmental variables in importance and the statistical testing of effects, are illustrated on a typical macroinvertebrate data-set.
Abstract: Canonical correspondence analysis (CCA) is a multivariate method to elucidate the relationships between biological assemblages of species and their environment. The method is designed to extract synthetic environmental gradients from ecological data-sets. The gradients are the basis for succinctly describing and visualizing the differential habitat preferences (niches) of taxa via an ordination diagram. Linear multivariate methods for relating two sets of variables, such as two-block Partial Least Squares (PLS2), canonical correlation analysis and redundancy analysis, are less suited for this purpose because habitat preferences are often unimodal functions of habitat variables. After pointing out the key assumptions underlying CCA, the paper focuses on the interpretation of CCA ordination diagrams. Subsequently, some advanced uses, such as ranking environmental variables in importance and the statistical testing of effects, are illustrated on a typical macroinvertebrate data-set. The paper closes with comparisons with correspondence analysis, discriminant analysis, PLS2 and co-inertia analysis. In an appendix a new method, named CCA-PLS, is proposed that combines the strong features of CCA and PLS2.

1,715 citations

Journal Article
TL;DR: This paper describes the automatic selection of features from an image training set using the theories of multidimensional discriminant analysis and the associated optimal linear projection, and demonstrates the effectiveness of these most discriminating features for view-based class retrieval from a large database of widely varying real-world objects.
Abstract: This paper describes the automatic selection of features from an image training set using the theories of multidimensional discriminant analysis and the associated optimal linear projection. We demonstrate the effectiveness of these most discriminating features for view-based class retrieval from a large database of widely varying real-world objects presented as "well-framed" views, and compare it with that of the principal component analysis.

1,713 citations
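
To illustrate the comparison the abstract draws between discriminant projections and principal component analysis, here is a minimal two-class sketch. It is not the paper's multi-class feature-selection procedure. The function names and the toy data are assumptions chosen so that the high-variance axis carries no class information.

```python
import numpy as np

def fisher_direction(X, y):
    """Unit vector w proportional to Sw^{-1} (m1 - m0) for labels 0 and 1."""
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-class scatter matrix.
    Sw = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    w = np.linalg.solve(Sw, m1 - m0)
    return w / np.linalg.norm(w)

def first_principal_component(X):
    """Unit eigenvector of the covariance matrix with the largest eigenvalue."""
    Xc = X - X.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(Xc.T @ Xc / len(X))
    return eigvecs[:, -1]         # eigh returns eigenvalues in ascending order

# Toy data: wide spread along the first axis, classes differ along the second.
x = np.arange(-5.0, 6.0)
jitter = 0.1 * (-1.0) ** np.arange(len(x))
class0 = np.column_stack([x, -0.5 + jitter])
class1 = np.column_stack([x, 0.5 + jitter])
X = np.vstack([class0, class1])
y = np.array([0] * len(x) + [1] * len(x))

print("discriminant direction:", fisher_direction(X, y))
print("principal component:   ", first_principal_component(X))
```

On this data the principal component follows the axis of largest variance, while the discriminant direction follows the axis along which the classes differ.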

Journal Article
TL;DR: The user interface is simple and homogeneous among all the programs; this contributes to making the use of ADE-4 very easy for non-specialists in statistics, data analysis or computer science.
Abstract: We present ADE-4, a multivariate analysis and graphical display software. Multivariate analysis methods available in ADE-4 include usual one-table methods like principal component analysis and correspondence analysis, spatial data analysis methods (using a total variance decomposition into local and global components, analogous to Moran and Geary indices), discriminant analysis and within/between groups analyses, many linear regression methods including lowess and polynomial regression, multiple and PLS (partial least squares) regression and orthogonal regression (principal component regression), projection methods like principal component analysis on instrumental variables, canonical correspondence analysis and many other variants, coinertia analysis and the RLQ method, and several three-way table (k-table) analysis methods. Graphical display techniques include an automatic collection of elementary graphics corresponding to groups of rows or to columns in the data table, thus providing a very efficient way for automatic k-table graphics and geographical mapping options. A dynamic graphic module allows interactive operations like searching, zooming, selection of points, and display of data values on factor maps. The user interface is simple and homogeneous among all the programs; this contributes to making the use of ADE-4 very easy for non-specialists in statistics, data analysis or computer science.

1,651 citations

Journal Article
TL;DR: The paper shows that a particular bootstrap method, the .632+ rule, substantially outperforms cross-validation in a catalog of 24 simulation experiments, and also considers estimating the variability of an error rate estimate.
Abstract: A training set of data has been used to construct a rule for predicting future responses. What is the error rate of this rule? This is an important question both for comparing models and for assessing a final selected model. The traditional answer to this question is given by cross-validation. The cross-validation estimate of prediction error is nearly unbiased but can be highly variable. Here we discuss bootstrap estimates of prediction error, which can be thought of as smoothed versions of cross-validation. We show that a particular bootstrap method, the .632+ rule, substantially outperforms cross-validation in a catalog of 24 simulation experiments. Besides providing point estimates, we also consider estimating the variability of an error rate estimate. All of the results here are nonparametric and apply to any possible prediction rule; however, we study only classification problems with 0–1 loss in detail. Our simulations include “smooth” prediction rules like Fisher's linear discriminant fun...

1,602 citations
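
As a rough illustration of bootstrap error estimation, the sketch below implements the plain .632 estimator, which weights the training error by 0.368 and the leave-one-out bootstrap error by 0.632. It is not the .632+ rule studied in the paper, which adds a further correction. The function names, the nearest-mean prediction rule, and the toy data are hypothetical choices for illustration.

```python
import numpy as np

def nearest_mean_rule(X, y):
    """Hypothetical stand-in rule: assign each point to the closest class mean."""
    classes = np.unique(y)
    means = np.array([X[y == c].mean(axis=0) for c in classes])

    def predict(Z):
        d = ((Z[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        return classes[d.argmin(axis=1)]

    return predict

def bootstrap_632(X, y, fit, n_boot=50, seed=0):
    """Plain .632 estimate of the 0-1 prediction error of `fit`.

    `fit(X, y)` must return a function mapping inputs to predicted labels.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    train_err = np.mean(fit(X, y)(X) != y)
    miss = np.zeros(n)            # errors on each point while left out
    count = np.zeros(n)           # times each point was left out
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)          # sample with replacement
        oob = np.setdiff1d(np.arange(n), idx)     # points not drawn
        if oob.size == 0:
            continue
        predict = fit(X[idx], y[idx])
        miss[oob] += predict(X[oob]) != y[oob]
        count[oob] += 1
    seen = count > 0
    loo_err = np.mean(miss[seen] / count[seen])
    return 0.368 * train_err + 0.632 * loo_err

# Toy data: two overlapping Gaussian classes in two dimensions.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (30, 2)), rng.normal(1.5, 1.0, (30, 2))])
y = np.array([0] * 30 + [1] * 30)
print("estimated error:", bootstrap_632(X, y, nearest_mean_rule))
```

Any rule with the same `fit` interface can be passed in, which mirrors the abstract's remark that the approach is nonparametric and applies to any prediction rule.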


Network Information
Related Topics (5)
Regression analysis: 31K papers, 1.7M citations, 85% related
Artificial neural network: 207K papers, 4.5M citations, 80% related
Feature extraction: 111.8K papers, 2.1M citations, 80% related
Cluster analysis: 146.5K papers, 2.9M citations, 79% related
Image segmentation: 79.6K papers, 1.8M citations, 79% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2025    1
2024    2
2023    756
2022    1,711
2021    678
2020    815