scispace - formally typeset

Linear discriminant analysis

About: Linear discriminant analysis is a research topic. Over its lifetime, 18,361 publications have been published within this topic, receiving 603,195 citations. The topic is also known as LDA.


Papers
Journal ArticleDOI
01 May 2020, Catena
TL;DR: A comparative analysis using the Wilcoxon signed-rank tests revealed a significant improvement of landslide prediction using the spatially explicit DL model over the quadratic discriminant analysis, Fisher's linear discriminant analysis, and multi-layer perceptron neural network.
Abstract: With the increasing threat of recurring landslides, susceptibility maps are expected to play a bigger role in promoting our understanding of future landslides and their magnitude. This study describes the development and validation of a spatially explicit deep learning (DL) neural network model for the prediction of landslide susceptibility. A geospatial database was generated based on 217 landslide events from the Muong Lay district (Vietnam), for which a suite of nine landslide conditioning factors was derived. The Relief-F feature selection method was employed to quantify the utility of the conditioning factors for developing the landslide predictive model. Several performance metrics demonstrated that the DL model performed well both in terms of the goodness-of-fit with the training dataset (AUC = 0.90; accuracy = 82%; RMSE = 0.36) and the ability to predict future landslides (AUC = 0.89; accuracy = 82%; RMSE = 0.38). The efficiency of the model was compared to the quadratic discriminant analysis, Fisher's linear discriminant analysis, and multi-layer perceptron neural network. A comparative analysis using the Wilcoxon signed-rank tests revealed a significant improvement of landslide prediction using the spatially explicit DL model over these other models. The insights provided from this study will be valuable for further development of landslide predictive models and spatially explicit assessment of landslide-prone regions around the world.
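The study's model comparison rests on a paired, non-parametric test: per-fold performance of two classifiers is compared with the Wilcoxon signed-rank test. A minimal sketch of that comparison step, using synthetic stand-in data and generic scikit-learn models (a small MLP vs. quadratic discriminant analysis) rather than the paper's geospatial database or its specific DL architecture:

```python
# Hedged sketch: Wilcoxon signed-rank comparison of two classifiers'
# per-fold accuracies. Data and models are synthetic stand-ins, not the
# paper's landslide database or deep learning model.
import numpy as np
from scipy.stats import wilcoxon
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.model_selection import StratifiedKFold
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in: 400 samples, 9 features (mirroring the paper's
# nine conditioning factors), binary outcome.
X, y = make_classification(n_samples=400, n_features=9, random_state=0)

def fold_accuracies(model):
    """Accuracy of `model` on each fold of a 10-fold stratified split."""
    accs = []
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
    for train, test in cv.split(X, y):
        model.fit(X[train], y[train])
        accs.append(model.score(X[test], y[test]))
    return np.array(accs)

acc_mlp = fold_accuracies(
    MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0))
acc_qda = fold_accuracies(QuadraticDiscriminantAnalysis())

# Paired, non-parametric test on the per-fold accuracy differences.
stat, p = wilcoxon(acc_mlp, acc_qda)
print(f"Wilcoxon statistic = {stat:.1f}, p = {p:.3f}")
```

The signed-rank test is appropriate here because the two models are evaluated on the same folds, making the accuracy differences paired, and no normality assumption is placed on those differences.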

187 citations

Journal ArticleDOI
TL;DR: The Q5 method outperforms previous full-spectrum complex sample spectral classification techniques and can provide clues as to the molecular identities of differentially expressed proteins and peptides.
Abstract: We have developed an algorithm called Q5 for probabilistic classification of healthy vs. disease whole serum samples using mass spectrometry. The algorithm employs Principal Components Analysis (PCA) followed by Linear Discriminant Analysis (LDA) on whole spectrum Surface-Enhanced Laser Desorption/Ionization Time of Flight (SELDI-TOF) Mass Spectrometry (MS) data, and is demonstrated on four real datasets from complete, complex SELDI spectra of human blood serum. Q5 is a closed-form, exact solution to the problem of classification of complete mass spectra of a complex protein mixture. Q5 employs a probabilistic classification algorithm built upon a dimension-reduced linear discriminant analysis. Our solution is computationally efficient; it is non-iterative and computes the optimal linear discriminant using closed-form equations. The optimal discriminant is computed and verified for datasets of complete, complex SELDI spectra of human blood serum. Replicate experiments of different training/testing splits of each dataset are employed to verify robustness of the algorithm. The probabilistic classification method achieves excellent performance. We achieve sensitivity, specificity, and positive predictive values above 97% on three ovarian cancer datasets and one prostate cancer dataset. The Q5 method outperforms previous full-spectrum complex sample spectral classification techniques, and can provide clues as to the molecular identities of differentially expressed proteins and peptides.
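The core pipeline described above, dimension reduction by PCA followed by a probabilistic linear discriminant in the reduced space, can be sketched generically with scikit-learn. This is a stand-in on synthetic "spectra", not the authors' closed-form Q5 implementation or their SELDI-TOF data:

```python
# Hedged sketch of the dimension-reduced discriminant idea: PCA to shrink
# the spectrum dimension, then LDA with probabilistic output in the
# reduced space. Synthetic data; not the Q5 algorithm itself.
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Synthetic stand-in: 200 "spectra" with 1000 intensity bins, two classes.
X, y = make_classification(n_samples=200, n_features=1000,
                           n_informative=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

clf = make_pipeline(PCA(n_components=30), LinearDiscriminantAnalysis())
clf.fit(X_tr, y_tr)
print(f"held-out accuracy: {clf.score(X_te, y_te):.2f}")

# LDA is a generative Gaussian model, so class posteriors come for free;
# this probabilistic output is analogous in spirit to Q5's classification.
proba = clf.predict_proba(X_te[:1])
```

The PCA step is what makes the discriminant well-posed: with far more spectral bins than samples, the within-class covariance is singular, and reducing to a few dozen components restores an exact closed-form solution.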

187 citations

Journal ArticleDOI
TL;DR: The paper reviews the state-of-the-art in gender classification, giving special attention to linear techniques and their relations, and proves that Linear Discriminant Analysis on a linearly selected set of features also achieves similar accuracies.
Abstract: Emerging applications of computer vision and pattern recognition in mobile devices and networked computing require the development of resource-limited algorithms. Linear classification techniques have an important role to play in this context, given their simplicity and low computational requirements. The paper reviews the state-of-the-art in gender classification, giving special attention to linear techniques and their relations. It discusses why linear techniques have not achieved competitive results and shows how to obtain state-of-the-art performance. Our work confirms previous results reporting very close classification accuracies for Support Vector Machines (SVMs) and boosting algorithms on single-database experiments. We have shown that Linear Discriminant Analysis on a linearly selected set of features also achieves similar accuracies. We perform cross-database experiments and show that single-database experiments were optimistically biased. If enough training data and computational resources are available, SVM gender classifiers are superior to the rest. When computational resources are scarce but there is enough data, boosting or linear approaches are adequate. Finally, if training data and computational resources are very scarce, then the linear approach is the best choice.

187 citations

Journal ArticleDOI
TL;DR: An improved LDA framework is proposed, the local LDA (LLDA), which can perform well without needing to satisfy the above two assumptions, and can effectively capture the local structure of samples.
Abstract: The linear discriminant analysis (LDA) is a very popular linear feature extraction approach. The algorithms of LDA usually perform well under the following two assumptions. The first assumption is that the global data structure is consistent with the local data structure. The second assumption is that the input data classes are Gaussian distributions. However, in real-world applications, these assumptions are not always satisfied. In this paper, we propose an improved LDA framework, the local LDA (LLDA), which can perform well without needing to satisfy the above two assumptions. Our LLDA framework can effectively capture the local structure of samples. According to different types of local data structure, our LLDA framework incorporates several different forms of linear feature extraction approaches, such as the classical LDA and principal component analysis. The proposed framework includes two LLDA algorithms: a vector-based LLDA algorithm and a matrix-based LLDA (MLLDA) algorithm. MLLDA is directly applicable to image recognition, such as face recognition. Our algorithms need to train only a small portion of the whole training set before testing a sample. They are suitable for learning large-scale databases especially when the input data dimensions are very high and can achieve high classification accuracy. Extensive experiments show that the proposed algorithms can obtain good classification results.
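The second assumption above (Gaussian, unimodal classes) is easy to violate and illustrates the failure mode that motivates LLDA: when each class is multimodal, no single global linear discriminant can separate them. A hedged, generic demonstration on an XOR-style layout, not the authors' LLDA algorithm:

```python
# Hedged illustration: global LDA on multimodal, XOR-style classes.
# Each class consists of two diagonal clusters, so the class means nearly
# coincide and a single hyperplane performs near chance level. This is the
# local-structure failure mode that LLDA is designed to address.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
centers = {0: [(-2, -2), (2, 2)],   # class 0: two diagonal clusters
           1: [(-2, 2), (2, -2)]}   # class 1: the opposite diagonal
X_parts, y = [], []
for label, cluster_centers in centers.items():
    for c in cluster_centers:
        X_parts.append(rng.normal(c, 0.3, size=(100, 2)))
        y.extend([label] * 100)
X, y = np.vstack(X_parts), np.array(y)

lda = LinearDiscriminantAnalysis().fit(X, y)
print(f"global LDA accuracy on XOR-style classes: {lda.score(X, y):.2f}")
```

A locality-aware method, such as fitting discriminants on neighborhoods of each test sample as LLDA does, can recover this structure, because within any small region the two classes are linearly separable.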

187 citations

Journal ArticleDOI
TL;DR: When and why the linear discriminant analysis (LDA) has poor performance is explored and a sparse LDA is proposed that is asymptotically optimal under some sparsity conditions on the unknown parameters.
Abstract: In many social, economic, biological and medical studies, one objective is to classify a subject into one of several classes based on a set of variables observed from the subject. Because the probability distribution of the variables is usually unknown, the rule of classification is constructed using a training sample. The well-known linear discriminant analysis (LDA) works well for the situation where the number of variables used for classification is much smaller than the training sample size. Because of the advance in technologies, modern statistical studies often face classification problems with the number of variables much larger than the sample size, and the LDA may perform poorly. We explore when and why the LDA has poor performance and propose a sparse LDA that is asymptotically optimal under some sparsity conditions on the unknown parameters. For illustration of application, we discuss an example of classifying human cancer into two classes of leukemia based on a set of 7,129 genes and a training sample of size 72. A simulation is also conducted to check the performance of the proposed method.
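The regime the paper studies can be reproduced in a few lines: with far more variables than training samples, the pooled covariance estimate is singular and plain LDA degrades. A hedged sketch using shrinkage-regularized LDA as a comparison point; shrinkage is a standard remedy in scikit-learn, not the paper's sparse LDA, and all data here are synthetic:

```python
# Hedged sketch of the p >> n setting: 72 training samples, 2000 variables
# (mimicking in spirit the 7,129-gene, n = 72 leukemia example). Plain LDA
# vs. a shrinkage-regularized LDA, a common remedy for the singular
# covariance estimate; this is not the paper's proposed sparse LDA.
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=144, n_features=2000,
                           n_informative=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, train_size=72,
                                          stratify=y, random_state=0)

# lsqr solver tolerates the singular pooled covariance; shrinkage="auto"
# applies Ledoit-Wolf shrinkage to regularize it.
plain = LinearDiscriminantAnalysis(solver="lsqr").fit(X_tr, y_tr)
shrunk = LinearDiscriminantAnalysis(solver="lsqr",
                                    shrinkage="auto").fit(X_tr, y_tr)
print(f"plain LDA accuracy:     {plain.score(X_te, y_te):.2f}")
print(f"shrinkage LDA accuracy: {shrunk.score(X_te, y_te):.2f}")
```

The paper's sparse LDA goes further than shrinkage: it exploits sparsity of the unknown discriminant direction itself, which is what yields asymptotic optimality when only a few of the thousands of variables carry signal.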

187 citations


Network Information
Related Topics (5)
Regression analysis: 31K papers, 1.7M citations (85% related)
Artificial neural network: 207K papers, 4.5M citations (80% related)
Feature extraction: 111.8K papers, 2.1M citations (80% related)
Cluster analysis: 146.5K papers, 2.9M citations (79% related)
Image segmentation: 79.6K papers, 1.8M citations (79% related)
Performance Metrics
No. of papers in the topic in previous years:
Year	Papers
2025	1
2024	2
2023	756
2022	1,711
2021	678
2020	815