Topic

Linear discriminant analysis

About: Linear discriminant analysis is a research topic. Over its lifetime, 18,361 publications have been published within this topic, receiving 603,195 citations. The topic is also known as LDA.

Papers

Journal Article

Feature selection and classifier performance in computer-aided diagnosis: the effect of finite sample size.

Berkman Sahiner¹, Heang Ping Chan¹, Nicholas Petrick¹, Robert F. Wagner¹, Lubomir M. Hadjiiski¹

University of Michigan¹

01 Jul 2000-Medical Physics

TL;DR: The results indicated that the resubstitution estimate was always optimistically biased, except in cases where the parameters of stepwise feature selection were chosen such that too few features were selected by the stepwise procedure.

Abstract: In computer-aided diagnosis (CAD), a frequently used approach for distinguishing normal and abnormal cases is first to extract potentially useful features for the classification task. Effective features are then selected from this entire pool of available features. Finally, a classifier is designed using the selected features. In this study, we investigated the effect of finite sample size on classification accuracy when classifier design involves stepwise feature selection in linear discriminant analysis, which is the most commonly used feature selection algorithm for linear classifiers. The feature selection and the classifier coefficient estimation steps were considered to be cascading stages in the classifier design process. We compared the performance of the classifier when feature selection was performed on the design samples alone and on the entire set of available samples, which consisted of design and test samples. The area Az under the receiver operating characteristic curve was used as our performance measure. After linear classifier coefficient estimation using the design samples, we studied the hold-out and resubstitution performance estimates. The two classes were assumed to have multidimensional Gaussian distributions, with a large number of features available for feature selection. We investigated the dependence of feature selection performance on the covariance matrices and means for the two classes, and examined the effects of sample size, number of available features, and parameters of stepwise feature selection on classifier bias. Our results indicated that the resubstitution estimate was always optimistically biased, except in cases where the parameters of stepwise feature selection were chosen such that too few features were selected by the stepwise procedure. When feature selection was performed using only the design samples, the hold-out estimate was always pessimistically biased. 
When feature selection was performed using the entire finite sample space, the hold-out estimates could be pessimistically or optimistically biased, depending on the number of features available for selection, the number of available samples, and their statistical distribution. For our simulation conditions, these estimates were always pessimistically (conservatively) biased if the ratio of the total number of available samples per class to the number of available features was greater than five.
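
The optimistic bias of the resubstitution estimate described in this abstract can be illustrated with a small simulation. What follows is a minimal sketch under stated assumptions, not the authors' experiment: it assumes NumPy is available, uses classification accuracy instead of the area Az, omits stepwise feature selection entirely, and the function names (`fit_lda`, `accuracy`, `simulate`) and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def fit_lda(X0, X1):
    """Two-class Fisher linear discriminant: returns (weights, threshold)."""
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-class covariance estimated from the design samples.
    scatter = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    pooled_cov = scatter / (len(X0) + len(X1) - 2)
    w = np.linalg.solve(pooled_cov, m1 - m0)
    # Threshold at the midpoint of the projected class means (equal priors).
    threshold = w @ (m0 + m1) / 2.0
    return w, threshold


def accuracy(w, threshold, X0, X1):
    """Fraction of samples assigned to their true class."""
    correct = np.sum(X0 @ w <= threshold) + np.sum(X1 @ w > threshold)
    return correct / (len(X0) + len(X1))


def simulate(n_per_class=20, n_features=10, shift=0.5, n_trials=200, seed=0):
    """Average resubstitution and hold-out accuracy over repeated trials.

    Both classes are Gaussian with identity covariance (an assumption of
    this sketch); class 1 is shifted by `shift` along the first feature only.
    """
    rng = np.random.default_rng(seed)
    mean1 = np.zeros(n_features)
    mean1[0] = shift
    resub, holdout = [], []
    for _ in range(n_trials):
        # Design samples: used to estimate the classifier coefficients.
        D0 = rng.normal(size=(n_per_class, n_features))
        D1 = rng.normal(size=(n_per_class, n_features)) + mean1
        # Independent test samples: used only for the hold-out estimate.
        T0 = rng.normal(size=(n_per_class, n_features))
        T1 = rng.normal(size=(n_per_class, n_features)) + mean1
        w, t = fit_lda(D0, D1)
        resub.append(accuracy(w, t, D0, D1))    # tested on the design set
        holdout.append(accuracy(w, t, T0, T1))  # tested on unseen samples
    return float(np.mean(resub)), float(np.mean(holdout))


if __name__ == "__main__":
    resub_acc, holdout_acc = simulate()
    print(f"mean resubstitution accuracy: {resub_acc:.3f}")
    print(f"mean hold-out accuracy:       {holdout_acc:.3f}")
```

With few samples per class relative to the number of features, the average resubstitution accuracy in this sketch should sit above the average hold-out accuracy, because the classifier is evaluated on the same samples that were used to fit it.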

129 citations

Journal Article

Multivariate analysis of hybrid fishes

Nancy A. Neff¹, Gerald R. Smith

American Museum of Natural History¹

01 Jun 1979-Systematic Biology

TL;DR: Principal components analysis and linear discriminant function analysis were applied to two data sets comprised of a sample of laboratory-reared hybrid fish, and wild-caught parental samples, evaluating the usefulness of each method for hybrid identification, quantification of hybrid variability, and general determination of morphological distance from the suspected parents.

Abstract: Principal components analysis and linear discriminant function analysis were applied to two data sets comprised of a sample of laboratory-reared hybrid fish, and wild-caught parental samples. For each method, the assumptions required for making statistical inferences and the biological assumptions employed in hybrid studies are reviewed. The degree to which we can expect biological data sets to conform to both types of assumptions is assessed by examination of the two data sets discussed here. The usefulness of each method for hybrid identification, quantification of hybrid variability, and general determination of morphological distance from the suspected parents is evaluated by considering the results of the methods when applied to known hybrids. Evidence is presented for decreased developmental integration in the hybrids. Principal components analysis makes apparent the difference in the branchial baskets of the very similar Notropis spilopterus and N. whipplei, suggesting an ecological separation related to this morphology. The hybrids of both the Notropis and Lepomis cyanellus × L. macrochirus crosses had generally intermediate scores in both analyses, but were not uniformly intermediate, instead graded into the parental phenotypes. In the results of principal components analysis, F1 variability precludes the confident identification of all hybrid individuals as well as any specific identification of F2 and backcross individuals; the majority of hybrids should be identifiable as being of mixed genetic origin. Principal components analysis is demonstrated to be of use in the examination of variation in hybrid fishes. Linear discriminant function analysis as it is presently employed does not appear useful for hybrid analysis, for both practical and theoretical reasons. Discriminant function analysis of samples of known hybrid origin may permit subsequent analysis of suspected hybrids.
[Multivariate analysis; principal components analysis; discriminant function analysis; multivariate analytic assumptions; hybrid identification; variation; hybrid variability.]

129 citations

Journal Article

Gifi Methods for Optimal Scaling in R: The Package homals

Jan de Leeuw, Patrick Mair

04 Aug 2009-Journal of Statistical Software

TL;DR: In this article, the authors present methodological and practical issues of the R package homals which performs homogeneity analysis and various extensions, such as nonlinear principal component analysis, nonlinear canonical correlation analysis, and predictive models which emulate discriminant analysis and regression models.

Abstract: Homogeneity analysis combines the idea of maximizing the correlations between variables of a multivariate data set with that of optimal scaling. In this article we present methodological and practical issues of the R package homals which performs homogeneity analysis and various extensions. By setting rank constraints nonlinear principal component analysis can be performed. The variables can be partitioned into sets such that homogeneity analysis is extended to nonlinear canonical correlation analysis or to predictive models which emulate discriminant analysis and regression models. For each model the scale level of the variables can be taken into account by setting level constraints. All algorithms allow for missing values.

129 citations

Journal Article

Optimizing the fMRI data-processing pipeline using prediction and reproducibility performance metrics: I. A preliminary group analysis.

Stephen C. Strother¹, Stephen La Conte¹, Lars Kai Hansen², Jon R. Anderson, Jin Zhang¹, Sujit K. Pulapura¹, David A. Rottenberg¹

University of Minnesota¹, Technical University of Denmark²

01 Jan 2004-NeuroImage

TL;DR: It is found that both prediction and reproducibility metrics were required for optimizing the pipeline and give somewhat different results, and the parameter settings of components in the pipeline interact so that the current practice of reporting the optimization of components tested in relative isolation is unlikely to lead to fully optimized processing pipelines.

129 citations

Journal Article

Bolstered error estimation

Ulisses Braga-Neto¹,², Edward R. Dougherty¹,²

Texas A&M University¹, University of Texas MD Anderson Cancer Center²

01 Jun 2004-Pattern Recognition

TL;DR: The results indicate the proposed method vastly improves on resubstitution and cross-validation, especially for small samples, in terms of bias and variance, while being tens to hundreds of times faster.

128 citations

Network Information

Performance Metrics

Papers: 20,826
Citations: 671,342

No. of papers in the topic in previous years
Year	Papers
2025	1
2024	2
2023	756
2022	1,711
2021	678
2020	815
