Showing papers on "Linear discriminant analysis" published in 1979



Journal Article
TL;DR: Principal components analysis and linear discriminant function analysis were applied to two data sets comprised of a sample of laboratory-reared hybrid fish, and wild-caught parental samples, evaluating the usefulness of each method for hybrid identification, quantification of hybrid variability, and general determination of morphological distance from the suspected parents.
Abstract: Principal components analysis and linear discriminant function analysis were applied to two data sets comprised of a sample of laboratory-reared hybrid fish, and wild-caught parental samples. For each method, the assumptions required for making statistical inferences and the biological assumptions employed in hybrid studies are reviewed. The degree to which we can expect biological data sets to conform to both types of assumptions is assessed by examination of the two data sets discussed here. The usefulness of each method for hybrid identification, quantification of hybrid variability, and general determination of morphological distance from the suspected parents is evaluated by considering the results of the methods when applied to known hybrids. Evidence is presented for decreased developmental integration in the hybrids. Principal components analysis makes apparent the difference in the branchial baskets of the very similar Notropis spilopterus and N. whipplei, suggesting an ecological separation related to this morphology. The hybrids of both the Notropis and Lepomis cyanellus x L. macrochirus crosses had generally intermediate scores in both analyses, but were not uniformly intermediate, instead graded into the parental phenotypes. In the results of principal components analysis, F1 variability precludes the confident identification of all hybrid individuals as well as any specific identification of F2 and backcross individuals; the majority of hybrids should be identifiable as being of mixed genetic origin. Principal components analysis is demonstrated to be of use in the examination of variation in hybrid fishes. Linear discriminant function analysis as it is presently employed does not appear useful for hybrid analysis, for both practical and theoretical reasons. Discriminant function analysis of samples of known hybrid origin may permit subsequent analysis of suspected hybrids.
[Multivariate analysis; principal components analysis; discriminant function analysis; multivariate analytic assumptions; hybrid identification; variation; hybrid variability.]

129 citations


Journal Article
TL;DR: A number of new and useful approaches for interpretation are suggested and illustrated in an innovator segment analysis of discriminant analysis for market segments.

78 citations


Journal Article
TL;DR: The application of statistical linear discriminant analysis in analytical chemistry is discussed: the theory of the method is outlined and illustrated with examples, and its suitability for problem solving in analytical chemistry is demonstrated by a review of published applications.

74 citations


Journal Article
TL;DR: A review of the available results on the performance of the linear discriminant function in situations involving its assumptions, including multivariate normality.
Abstract: This article is a review of the results, as are available, on the performance of the linear discriminant function in situations where the assumptions of multivariate normality and equal group dispe...

62 citations


Journal Article
TL;DR: For particular populations, the change in probability of correct classification caused by adding dimensions is given, offering insight into how many variables one should use for fixed training data sizes, especially when dealing with the populations of these studies.
Abstract: This paper is a continuation of earlier work (Van Ness and Simpson [9]) studying the high dimensionality problem in discriminant analysis. Frequently one has potentially many possible variables (dimensions) to be measured on each object but is limited to a fixed training data size. For particular populations, we give here the change in probability of correct classification caused by adding dimensions. This gives insight into how many variables one should use for fixed training data sizes, especially when dealing with the populations of these studies. We consider six basic discriminant analysis algorithms. Graphs are provided which compare the relative performance of the algorithms in high dimensions.

58 citations


Journal Article
TL;DR: Several pattern recognition methods are compared, including the Bayesian classification rule, linear discriminant analysis, the K-nearest neighbour rule, the linear learning machine for multicategory data, and soft independent modelling of class analogy.

47 citations


Journal Article
TL;DR: Individual structured interviews were conducted with 96 sixteen-year-olds, divided into equal social class/sex groups. The 28 linguistic variables, in combination, distinctly separated the social classes on Discriminant Function I, and Discriminant Function II separated the groups in terms of the sex dimension.
Abstract: It was hypothesized that specific styles of linguistic coding would be evident along the dimensions of both social class and sex. Individual structured interviews were undertaken with 96 sixteen-year-olds, divided into equal social class/sex groups. The verbatim transcripts were examined along selected aspects of linguistic coding: structure, elaboration, prepositional and pronominal usage, and speech disruptions. Discriminant function analysis, extended and supplemented by analysis of variance, was used to test the hypothesis of differential coding patterns for the social class/sex groups. The discriminant analysis showed that the 28 linguistic variables, in combination, distinctly separated the social classes on Discriminant Function I and that Discriminant Function II separated the groups in terms of the sex dimension.

44 citations


Journal Article
TL;DR: Predictive abilities as high as 100% were obtained for some of the partitions of the data set and a statistical algorithm was used to find the best subsets of descriptors to use with Bayesian discriminant analysis.
Abstract: Pattern recognition methods have been employed to classify crude oils based on their gas chromatograms. Four oil types were represented by gas chromatograms taken before and after artificial weathering. The chromatograms were hand digitized and coded with 13 descriptors each: peak areas for the normal alkanes C16 through C25, plus pristane and phytane, and also one descriptor characterizing the unresolved background. A statistical algorithm was used to find the best subsets of descriptors to use with Bayesian discriminant analysis. A variety of different partitions of the data set showed the similarities of some classes of oils and some dissimilarities for others. The nonparametric linear learning machine method of discriminant training was also applied to various partitions of the data. Predictive abilities as high as 100% were obtained for some of the partitions of the data set. 4 figures, 9 tables.

42 citations



Journal Article
TL;DR: In this article, the performance of the linear discriminant function estimated from a mixture of two multivariate normal populations with a common covariance matrix when the total number of observations available is small is investigated.
Abstract: An investigation is undertaken of the performance of the linear discriminant function estimated from a mixture of two multivariate normal populations with a common covariance matrix when the total number of observations available is small. It is concluded from a series of simulation experiments that although the individual estimates of the discriminant function coefficients so obtained may not be very reliable the resulting discriminant function still provides adequate separation between the populations.

Journal Article
TL;DR: In time-warping pattern comparison, the summed squared distance between two patterns is assumed to be proportional to the log probability of the two patterns having the same identity. This assumption can be made substantially true by analyzing the variability between different examples of the same syllable and adjusting the metric accordingly.
Abstract: Time‐warping pattern‐comparison algorithms are widely used in speech recognition. Two words or syllables being compared are described by a series of time frames each containing values of a set of acoustic parameters. After time alignment, the squared distance between the patterns is summed over the parameters within a frame and then across frames. The sum obtained is assumed to be proportional to the log probability of the two patterns having the same identity. This assumption is generally invalid, but it may be made substantially true by analyzing the variability between different examples of the same syllable and adjusting the metric accordingly. Variability is estimated both as a function of frame position within the syllable and as a function of the acoustic parameters. In the latter case, within‐ and between‐class covariance matrices can be estimated and standard linear discriminant analysis methods applied. This permits the combination of disparate acoustic parameters into a single distance measure. In particular, combining frame and frame‐difference parameters allows one to use time development information and to take inter‐frame correlations into account.
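
The idea of adjusting the distance metric to the observed variability can be sketched as follows. This is a minimal illustration, not the paper's method: it assumes the patterns are already time-aligned, uses only per-parameter (diagonal) pooled within-class variances rather than full within- and between-class covariance matrices, and the function names and toy data are hypothetical.

```python
from statistics import mean


def within_class_variance(examples_by_class):
    """Pooled per-parameter variance across classes (diagonal assumption).

    examples_by_class maps a class label to a list of parameter vectors,
    one vector per example, all of equal length.
    """
    n_params = len(next(iter(examples_by_class.values()))[0])
    squared_dev = [0.0] * n_params
    dof = 0
    for vectors in examples_by_class.values():
        centroid = [mean(v[k] for v in vectors) for k in range(n_params)]
        for v in vectors:
            for k in range(n_params):
                squared_dev[k] += (v[k] - centroid[k]) ** 2
        dof += len(vectors) - 1
    return [s / dof for s in squared_dev]


def weighted_distance(pattern_a, pattern_b, variances):
    """Squared differences scaled by variance, summed over parameters and frames."""
    return sum(
        (a - b) ** 2 / var
        for frame_a, frame_b in zip(pattern_a, pattern_b)
        for a, b, var in zip(frame_a, frame_b, variances)
    )


# Toy data: two syllable classes, two acoustic parameters on very different scales.
examples = {
    "ba": [[1.0, 10.0], [3.0, 30.0]],
    "da": [[5.0, 10.0], [7.0, 30.0]],
}
variances = within_class_variance(examples)
distance = weighted_distance([[1.0, 10.0]], [[3.0, 30.0]], variances)
print(variances, distance)
```

Scaling by the within-class variance keeps a parameter with a large numeric range from dominating the sum, which is what allows disparate parameters to be combined in one distance.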

Journal Article
TL;DR: In this paper, the use of Linear Discriminant Analysis (LDA) in clinical neuropsychology research is reviewed and recommendations regarding employment of commonly available LDA computer programs are made for the researcher.
Abstract: This paper reviews the use of Linear Discriminant Analysis (LDA) in clinical neuropsychology research. The basic neuropsychological questions addressed by the method, the special problems and requirements for its use, and likely outcomes are described in detail. Recommendations regarding employment of commonly available LDA computer programs are made for the researcher. Careful attention to the assumptions and decisions inherent in LDA programs could enhance its value in clinical research.

Journal Article
TL;DR: The accuracy of mortality predictions based upon four different statistical methods, including Baux's rule (which adds the patient's age in years to the percentage of his body surface area burned), is compared, and some observations are made concerning the selection of an appropriate model for predicting burn mortality.
Abstract: Recent suggestions that patients “hopelessly burned” be permitted to die peacefully have refocused attention on the accuracy of different methods of predicting whether an individual burn patient will survive. The purpose of this presentation is to compare the accuracy of mortality predictions based upon four different statistical methods: (1) Baux's rule, which adds the patient's age in years to the percentage of his body surface area burned (the original assertion was that values over 75 meant a very poor prognosis), (2) probit analysis, (3) discriminant analysis, and (4) logistic risk function analysis. Each of these methods was applied to data for over three thousand consecutive admissions to St. Mary's Hospital Burn Center in Milwaukee, Wisconsin. This data base and the four statistical models are described. Mortality predictions derived from the four models are compared, and some observations are made concerning the selection of an appropriate model for predicting burn mortality.
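
Baux's rule as described in the abstract is simple arithmetic and can be written down directly. The sketch below is for illustration only; the function names are hypothetical, and the threshold of 75 is the original assertion quoted above, not a value evaluated by the paper's other three models.

```python
def baux_score(age_years, percent_body_surface_burned):
    """Baux's rule: age in years plus percentage of body surface area burned."""
    return age_years + percent_body_surface_burned


def very_poor_prognosis(age_years, percent_body_surface_burned, threshold=75):
    """True when the score exceeds the threshold from the original assertion."""
    return baux_score(age_years, percent_body_surface_burned) > threshold


print(baux_score(40, 30))           # 40 + 30
print(very_poor_prognosis(40, 30))  # score 70 does not exceed 75
print(very_poor_prognosis(50, 30))  # score 80 exceeds 75
```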

Journal Article
TL;DR: In this article, the asymptotic distribution of the errors of misclassification in using the Linear Discriminant Function (LDF) is investigated, and the effects of nonnormality on these errors are studied.

01 Jan 1979
TL;DR: Two commonly used procedures for estimating the parameters of a logistic regression function, the maximum likelihood estimators and the discriminant function estimators, are considered for the case where values of the independent variables are missing at random.
Abstract: Two commonly used procedures for estimating the parameters of a logistic regression function are the maximum likelihood estimators and the discriminant function estimators. Comparisons of these procedures for fitting logistic regression models based on the experience of many researchers can be found in the literature. The comparisons become more complicated when one or more values of the independent variables of certain observations are missing at random. When data are missing, researchers may not be willing to base their estimates only on the subset of complete cases, particularly if the size of this subset is relatively small.

Journal Article
01 Jul 1979
TL;DR: As part of an investigation of mass movement on outslopes of contour surface-mines in West Virginia, discriminant analysis was used to determine whether selected spoil properties could statistically distinguish failed and unfailed embankments.
Abstract: As part of an investigation of mass movement on outslopes of contour surface-mines in West Virginia, discriminant analysis was used to determine whether selected spoil properties could statistically distinguish failed and unfailed embankments. The analysis utilized the variables degree of saturation, liquid limit, and shrinkage limit, and the a priori assignment of samples into either actively failing, unfailed, or regraded categories. Results were encouraging, with seventy-four percent of the samples being correctly classified. All misclassifications involved samples from active or regraded landslides, suggesting the method was conservative. Performance of discriminant analysis could probably be improved by modifying the sampling plan.

Journal Article
01 Jun 1979
TL;DR: Basic measurement problems in psychophysiological behavioral assessment are discussed, including the use of multivariate approaches such as discriminant analysis, multiple regression, and multivariate analysis of variance, and methods for describing idiosyncratic psychophysiological patterns.
Abstract: Psychophysiological behavioral assessment generates basic measurement problems. Reliability must be estimated and recognition must be given to possible attenuation of correlation. Change scores may be correlated with baselines, requiring the use of regression methods. Corrections have also been used to compensate for different individuals having different ranges of scores. Multivariate approaches are often used, including discriminant analysis, multiple regression, and multivariate analysis of variance. Sample-specific covariances may require cross-validation techniques. Factor and cluster analysis and path analysis may be appropriately applied to correlational psychophysiological data. Because repeated measures are often taken, time-series analysis has also been recommended. Finally, attention is given to idiosyncratic psychophysiological patterns and methods for their description.

Journal Article
TL;DR: An investigation was made into the use of linear and quadratic discriminant analysis, along with K nearest-neighbor analysis, in the classification of a set of 51 compounds which were divided into five therapeutic categories, and suggests that molecular connectivity indices should prove useful in structural classification procedures.
Abstract: An investigation was made into the use of linear and quadratic discriminant analysis, along with K nearest-neighbor analysis, in the classification of a set of 51 compounds which were divided into five therapeutic categories. By superimposing each compound on a pattern structure, as first proposed by Cammarata, eight positions were assigned on the molecule. Each position was coded with the numerical value of a descriptor index. Relative molar refraction, which was the index used by Cammarata, was compared with a number of molecular connectivity indices. For each of the indices studied, it was found that only four of the eight positions contributed significantly to between-class differences. It was also found that first-order molecular connectivity, calculated as the sum of the contributions of each of the bonds joining a given position, resulted in consistently fewer misclassifications as compared with the other indices. Using first-order molecular connectivity, validation procedures were performed on the original set of compounds, on random samples drawn from this set, and on a set of ten compounds not included in the analysis. The results obtained were highly data dependent, but they, nevertheless, suggest that molecular connectivity indices should prove useful in structural classification procedures.


Journal Article
TL;DR: The result suggests that methodologies used must contain an accommodation for correlation between bilateral traits and that pairwise classification procedures are often more applicable than multiple classification procedures considering a large number of groups.
Abstract: In recent years a number of papers have been presented on the usefulness of non-metric traits of the cranial and infracranial skeleton in order that one, some few, or a subsample of crania may be allocated to a pair, family, or larger group. These papers have explored (1) which traits should be used, (2) the theoretical implications of assignment, and (3) the methodology for making these assignments. This paper addresses itself to the theoretical implications of assignment and the methodology for making these assignments. Classification techniques based on the Bayes' theorem, weight of evidence procedures, linear discriminant functions, tally method and the Rubison procedure, were utilized in the first level of analysis. The result suggests that methodologies used must contain an accommodation for correlation between bilateral traits and that pairwise classification procedures are often more applicable than multiple classification procedures considering a large number of groups. The accuracy of various methods starts at slightly better than 50%, while the better methods produce results above the 90% level. Results further show that when acceptable assignments are made the theoretical implications of these assignments do not necessarily suggest the use of a particular methodology, and for ease of analysis the simplest methodology should be used.


Journal Article
TL;DR: Linear transformations are proposed for preliminary analysis of data consisting of mixtures of continuous and binary variables, with the aim of simplifying structure so that a simple linear discriminant function can be used; a test for adequacy of such a function is given, together with some examples.
Abstract: Suitable linear transformations for preliminary data analysis are available when data consist solely of continuous variables or solely of binary variables. When mixtures of variables are observed, neither technique provides a satisfactory procedure. Some linear transformations for such data are therefore proposed in this paper. The aim of the transformations is to simplify structure. An additional motivation comes from the field of discriminant analysis, where it is hoped that dimensionality can be reduced and the transformed data will be amenable to the use of a simple linear discriminant function in place of more complicated procedures. A test for adequacy of such a function is given, together with some examples.


Proceedings Article
06 Nov 1979
TL;DR: An optimal linear discriminant function algorithm which minimizes the error rate of internal samples and generates the least number of misclassifications is introduced.
Abstract: In this paper, we introduce an optimal linear discriminant function algorithm which minimizes the error rate of internal samples. We apply it to two sets of data. One data set is imaginary and the other is actual data drawn from the medical field. We make many experiments and obtain the following information: 1) We compare O.L.D.F. to other discriminant functions such as the Fisher, Anderson & Bahadur, multiple logistic model and quadratic model. Naturally, O.L.D.F. generates the least number of misclassifications. 2) We investigate the influence on O.L.D.F. of sample size, number of variables and changing Mahalanobis' distance. 3) We obtain reinforcing results about the relationship between misclassified samples and optimal solutions.

Journal Article
TL;DR: In this paper, the precise functional relation between discriminant functions and classification functions is derived, and the interpretation of the two functions is discussed, and an illustrative example is provided.
Abstract: Fisher’s two-group discriminant function has been generalized in two different ways for the case of three or more groups. Both generalizations have been called discriminant functions, leading to confusion in the literature. A reasonable nomenclature is canonical discriminant functions and classification functions. The precise functional relation between the two functions is derived, and the interpretation of the two functions is discussed. An illustrative example is provided.
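
The distinction between the two kinds of function is easiest to see in the two-group case, where the difference between the two groups' classification functions equals Fisher's discriminant function. The sketch below checks that standard identity numerically; it is not the paper's derivation for three or more groups, it assumes equal priors and a known common 2x2 covariance matrix, and all names and numbers are hypothetical.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matvec(m, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def classification_function(x, group_mean, cov_inv):
    """Linear classification score for one group (equal priors assumed)."""
    w = matvec(cov_inv, group_mean)
    return dot(w, x) - 0.5 * dot(w, group_mean)


def fisher_discriminant(x, mean1, mean2, cov_inv):
    """Fisher's two-group function, centered at the midpoint of the group means."""
    w = matvec(cov_inv, [a - b for a, b in zip(mean1, mean2)])
    midpoint = [(a + b) / 2 for a, b in zip(mean1, mean2)]
    return dot(w, [xi - mi for xi, mi in zip(x, midpoint)])


cov_inv = inv2([[2.0, 0.5], [0.5, 1.0]])  # hypothetical common covariance matrix
mean1, mean2 = [1.0, 2.0], [3.0, 1.0]     # hypothetical group means
x = [2.5, 0.5]                            # observation to classify

score_difference = (classification_function(x, mean1, cov_inv)
                    - classification_function(x, mean2, cov_inv))
fisher_value = fisher_discriminant(x, mean1, mean2, cov_inv)
print(score_difference, fisher_value)  # equal up to floating-point rounding
```

Assigning x to the group with the larger classification score is therefore the same decision as checking the sign of the discriminant function.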

Journal Article
TL;DR: The paper investigates the use of discriminant analysis as an analytical technique for classifying industrial location patterns and a series of models are calibrated in order to predict the probability that an individual industrial project will select a particular regional, Designated/Non-Designated Area, or town size location.
Abstract: O'Farrell P N and Crouchley R (1979) The locational pattern of new manufacturing establishments: an application of discriminant analysis, Reg Studies 13, 39–59. The paper investigates the use of discriminant analysis as an analytical technique for classifying industrial location patterns. A number of statistical problems are explored and solutions proposed. A series of models are calibrated in order to predict the probability that an individual industrial project will select a particular regional, Designated/Non-Designated Area, or town size location. The profiles of projects with high probabilities of locating in certain regions and urban size groups in Ireland are identified.

Journal Article
TL;DR: In this article, asymptotic expansions for the multivariate noncentral F distribution and for the distribution of latent roots in MANOVA and discriminant analysis are given for large error degrees of freedom.

Report
01 Nov 1979
TL;DR: Extensions of linear discriminant analysis for classifying surface features from satellite reflectance data are examined: integration of spatial autocorrelation into the discriminant model, resolution of nonhomogeneous pixels, data-based prior probability estimates of class membership, and the use of unclassified pixels in the training set.
Abstract: Linear discriminant analysis is a commonly used statistical tool for classification of surface features using satellite surface reflectance data. Extensions of this basic tool promise substantial improvements. In particular, the added effectiveness of integration of spatial autocorrelation into the discriminant model, resolution of nonhomogeneous pixels, data-based prior probability estimates of class membership, and the use of unclassified pixels as part of the discriminant function training set are examined.

Report
01 Oct 1979
TL;DR: In this article, a Monte Carlo investigation of the robustness of Fisher's linear discriminant function to departures from the normal distribution is presented, and the Johnson system of distributions is used.
Abstract: A Monte Carlo investigation of the robustness of Fisher's linear discriminant function to departures from the normal distribution is presented. The Johnson system of distributions is used. 37 tables.
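
A Monte Carlo robustness check of this kind can be sketched in a few lines. The example below is a deliberately reduced, one-dimensional illustration and does not reproduce the report's design: it assumes equal priors and equal sample sizes, and it compares normal data with a standardized lognormal (the lognormal is one member of the Johnson system). Sample sizes, seeds, and function names are hypothetical.

```python
import math
import random
from statistics import mean

# Mean and standard deviation of exp(Z) for Z ~ N(0, 1).
_LOGNORMAL_MEAN = math.exp(0.5)
_LOGNORMAL_SD = math.sqrt((math.e - 1.0) * math.e)


def normal_sampler(shift):
    return lambda rng: rng.gauss(0.0, 1.0) + shift


def lognormal_sampler(shift):
    # Standardized to mean 0 and variance 1, then shifted, so that only the
    # shape of the distribution differs from the normal case.
    return lambda rng: (
        (rng.lognormvariate(0.0, 1.0) - _LOGNORMAL_MEAN) / _LOGNORMAL_SD + shift
    )


def ldf_error_rate(sampler1, sampler2, n_train=50, n_test=2000, rng=None):
    """Estimate the misclassification rate of the one-dimensional linear rule."""
    if rng is None:
        rng = random.Random(0)
    m1 = mean(sampler1(rng) for _ in range(n_train))
    m2 = mean(sampler2(rng) for _ in range(n_train))
    # With one variable and equal priors, the linear discriminant rule assigns
    # x to the group whose sample mean is nearer: a cut at the midpoint.
    cut = (m1 + m2) / 2
    sign = 1 if m1 < m2 else -1
    errors = 0
    for _ in range(n_test):
        if sign * (sampler1(rng) - cut) > 0:   # group-1 point on group-2 side
            errors += 1
        if sign * (sampler2(rng) - cut) <= 0:  # group-2 point on group-1 side
            errors += 1
    return errors / (2 * n_test)


normal_rate = ldf_error_rate(normal_sampler(0.0), normal_sampler(2.0),
                             rng=random.Random(1))
skewed_rate = ldf_error_rate(lognormal_sampler(0.0), lognormal_sampler(2.0),
                             rng=random.Random(1))
print(normal_rate, skewed_rate)
```

Comparing the two printed rates shows how far the error of the linear rule moves when the data keep the same means and variances but lose normality.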