
Showing papers on "Linear discriminant analysis" published in 1981


Journal ArticleDOI
TL;DR: The model presented here calculates a linear combination of variables that quantifies shape differences among populations, independent of size, and construes size and shape not as measured variables but as general factors: linear combinations that most parsimoniously account for the associations among the distance measures.
Abstract: Humphries, J. M., F. L. Bookstein, B. Chernoff, G. R. Smith, R. L. Elder, and S. G. Poss (Museum of Zoology, Center for Human Growth and Development, Museum of Paleontology, and Division of Biological Sciences, The University of Michigan, Ann Arbor, Michigan 48109) 1981. Multivariate discrimination by shape in relation to size. Syst. Zool., 30:291-308. The diverse methods for analyzing size-free shape differences tend to be guided by computational expediency rather than geometric principles. We question the use of ratios and ad hoc combinations of spatially unrelated measures. Neither are linear discriminant functions or series of independent regressions helpful to the visualization of shape differences. A bridge is needed between traditional quantitative methods and the geometrical analysis of shape. In principle any measured transects between landmarks of a form can serve as characters in a morphometric analysis. Systematic studies use a highly non-random sample of these, particularly biased regarding geometrical information. We suggest defining size and shape in terms of factors: estimates of information common to a universe of measured distances. The model presented here calculates a linear combination of variables that quantifies shape differences among populations, independent of size. In analyses in which the first two principal components confound size and shape, size is removed from one axis with shear coefficients derived from regression of general size on principal components centered by group. The general size factor is estimated by the principal axis of the within-group covariance matrix of the log-transformed data. Residuals from the regression of general size on the transformed axes approximate a shape-discriminating factor that is uncorrelated with size within group and displays the interpopulation shape differences borne by the first two principal components.
The results bear a direct and interpretable correspondence to biorthogonal analysis of shape difference. [Multivariate analysis; principal components; discriminant functions; morphometrics; size-free shape; allometry; fishes.] Systematists need procedures that allow them to discriminate among groups of organisms that vary in size. The groups included in a study can be chosen a priori (e.g., several species or geographic populations within a species) or a posteriori (as a conclusion resulting from some method of analysis). However the groups are chosen, it has long been considered desirable to discriminate among them on the basis of size-free shape derived from distance measures. The terms shape and size have been used in various and sometimes conflicting ways (Huxley, 1932; Thompson, 1942; Simpson, Roe and Lewontin, 1960; Gould, 1966; Mosimann, 1970; Sprent, 1972; Bookstein, 1978). We construe size and shape not as measured variables, but as general factors, linear combinations most parsimoniously accounting for the associations among the distance measures. Size, in particular, is not a single variable such as biomass or a standard length, but a factor which, when called upon to predict all the distance measures within a population, leaves the smallest mean squared residual. We prefer a factor whose algebraic form acknowledges the allometric relationship (Jolicoeur, 1963). Our shape discriminators need to be independent of size (Flessa and Bray, 1977; Mosimann and James, 1979) in order to partition out the effects of growth (e.g., individuals of differing age and size). In general, shape can be defined as the geometry of the organism after "information about position, scale, and orientation" has been removed (Bookstein, 1978:8). There is then an endless variety of shape information remaining. While the quantification of size as a general factor de-

492 citations


Journal Article
TL;DR: In this paper, several techniques for discriminant analysis are applied to a set of data from patients with severe head injuries, for the purpose of prognosis; the data require coping with multidimensionality, continuous, binary and ordered categorical variables, and missing data.
Abstract: Several techniques for discriminant analysis are applied to a set of data from patients with severe head injuries, for the purpose of prognosis. The data are such that multidimensionality, continuous, binary and ordered categorical variables and missing data must be coped with. The various methods are compared using criteria of prognostic success and reliability. In general, performance varies more with choice of the set of predictor variables than with that of the discriminant rule.

281 citations


Journal ArticleDOI
TL;DR: In this article, the geometry of canonical variate analysis is described as a two-stage orthogonal rotation, where the first stage involves a principal component analysis of the original variables.
Abstract: The geometry of canonical variate analysis is described as a two-stage orthogonal rotation. The first stage involves a principal component analysis of the original variables. The second stage involves a principal component analysis of the group means for the orthonormal variables from the first-stage eigenanalysis. The geometry of principal component analysis is also outlined. Algebraic aspects of canonical variate analysis are discussed and these are related to the geometrical description. Some practical implications of the geometrical approach for stability of the canonical vectors and variable selection are presented. [Multivariate analysis; canonical variate analysis; discriminant analysis; principal component analysis.]
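The two-stage description above can be sketched numerically. This is a minimal illustration and not the authors' code: it assumes the first-stage eigenanalysis is carried out on the pooled within-group covariance matrix and that group means are weighted by group size in the second stage. The function name `canonical_variates` and its return values are hypothetical.

```python
import numpy as np

def canonical_variates(X, labels):
    """Two-stage canonical variate analysis (illustrative sketch)."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    groups = np.unique(labels)
    n, p = X.shape

    # Pooled within-group covariance matrix (assumed stage-one matrix).
    W = np.zeros((p, p))
    means, sizes = [], []
    for g in groups:
        Xg = X[labels == g]
        mg = Xg.mean(axis=0)
        means.append(mg)
        sizes.append(len(Xg))
        W += (Xg - mg).T @ (Xg - mg)
    W /= n - len(groups)
    means = np.array(means)
    sizes = np.array(sizes)

    # Stage 1: eigenanalysis of W, then rescale each component to unit
    # within-group variance, giving orthonormal variables.
    evals, evecs = np.linalg.eigh(W)
    whiten = evecs / np.sqrt(evals)  # column j divided by sqrt(evals[j])

    # Stage 2: eigenanalysis of the group means expressed in the
    # orthonormal variables (size-weighted between-group matrix).
    grand = X.mean(axis=0)
    M = (means - grand) @ whiten
    B = (M * sizes[:, None]).T @ M / (len(groups) - 1)
    roots, rotation = np.linalg.eigh(B)
    order = np.argsort(roots)[::-1]
    roots, rotation = roots[order], rotation[:, order]

    vectors = whiten @ rotation      # canonical vectors, original variables
    scores = (X - grand) @ vectors   # canonical variate scores
    return roots, vectors, scores
```

Under these assumptions the scores have an identity pooled within-group covariance matrix, which is one way to check that the two rotations have been composed correctly.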

277 citations


Journal ArticleDOI
01 Mar 1981
TL;DR: In this work, several techniques for discriminant analysis are applied to a set of data from patients with severe head injuries, for the purpose of prognosis.
Abstract: Several techniques for discriminant analysis are applied to a set of data from patients with severe head injuries, for the purpose of prognosis. The data are such that multidimensionality, continuous, binary and ordered categorical variables and missing data must be coped with. The various methods are compared using criteria of prognostic success and reliability. In general, performance varies more with choice of the set of predictor variables than with that of the discriminant rule.

214 citations


Book ChapterDOI
01 Jan 1981
TL;DR: Clustering analysis is a newly developed computer-oriented data analysis technique that is a product of many research fields: statistics, computer science, operations research, and pattern recognition.
Abstract: Clustering analysis(1–4) is a newly developed computer-oriented data analysis technique. It is a product of many research fields: statistics, computer science, operations research, and pattern recognition. Because of the diverse backgrounds of researchers, clustering analysis has many different names. In biology, clustering analysis is called “taxonomy”.(5,6) In pattern recognition(7–15) it is called “unsupervised learning.” Perhaps the most confusing name of all, the term “classification” sometimes also denotes clustering analysis. Since classification may denote discriminant analysis, which is totally different from clustering analysis, it is perhaps important to distinguish these two terms.

77 citations


Journal ArticleDOI
TL;DR: A discriminant function obtained in 1978 to separate patients with glaucomatous visual field loss from those without visual field loss was shown to have a predictive value in ocular hypertensive persons as to the subsequent development of visual field loss in five years.
Abstract: A discriminant function obtained in 1978 to separate patients with glaucomatous visual field loss from those without visual field loss was shown to have a predictive value in ocular hypertensive persons as to the subsequent development of visual field loss in five years. A prospective discriminant analysis also was carried out to identify those factors that best separate those in whom visual field defects developed from those in whom they did not.

66 citations


Journal ArticleDOI
TL;DR: A comparative study of generalized cooccurrence texture analysis tools is presented and three experiments are discussed: the first based on a nearest neighbor classifier, the second on a linear discriminant classifier, and the third on the Bhattacharyya distance figure of merit.
Abstract: A comparative study of generalized cooccurrence texture analysis tools is presented. A generalized cooccurrence matrix (GCM) reflects the shape, size, and spatial arrangement of texture features. The particular texture features considered in this paper are 1) pixel-intensity, for which generalized cooccurrence reduces to traditional cooccurrence; 2) edge-pixel; and 3) extended-edges. Three experiments are discussed: the first based on a nearest neighbor classifier, the second on a linear discriminant classifier, and the third on the Bhattacharyya distance figure of merit.
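For the pixel-intensity case, where the abstract says generalized cooccurrence reduces to traditional cooccurrence, a minimal sketch of a traditional cooccurrence matrix is given below. The function name `cooccurrence_matrix`, the `offset` convention (row step, column step), and the tiny example image are assumptions for illustration; the edge-based generalizations studied in the paper are not shown.

```python
import numpy as np

def cooccurrence_matrix(image, offset=(0, 1), levels=None):
    """Count ordered pairs of gray levels separated by `offset`."""
    image = np.asarray(image, dtype=int)
    if levels is None:
        levels = int(image.max()) + 1
    d_row, d_col = offset
    rows, cols = image.shape
    counts = np.zeros((levels, levels), dtype=int)
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            # Only count pairs whose second pixel lies inside the image.
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r, c], image[r2, c2]] += 1
    return counts

# Hypothetical 3x3 image with three gray levels.
example = [[0, 0, 1],
           [1, 2, 2],
           [0, 1, 2]]
horizontal = cooccurrence_matrix(example, offset=(0, 1))
```

Entry `horizontal[i, j]` counts how often a pixel of level `i` has a pixel of level `j` immediately to its right; texture descriptors are then typically computed from the normalized matrix.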

61 citations


ReportDOI
01 Dec 1981
TL;DR: The main lines of research undertaken during the period are: Probability Theory: Major advances were made in obtaining Edgeworth expansions in a variety of situations, e.g., involving discrete variables, and errors in variables models.
Abstract: The main lines of research undertaken during the period are: Probability Theory: Major advances were made in obtaining Edgeworth expansions in a variety of situations, e.g., involving discrete variables, and errors in variables models. New limit theorems were established and their applications were discussed. Several contributions have been made to characterization theory. Linear Models and Time Series: New methods of forecasting were developed using dynamic linear models and multiple bilinear time series models. Multivariate Analysis: Topics of research in this area included inference on interclass and intraclass correlations and principal component analysis. M-estimation: A unified theory of robust inference (estimation and tests of hypotheses) was developed using a convex discrepancy function for minimization.

52 citations


Journal ArticleDOI
TL;DR: In this article, a diagram is used to aid discussion of how several of the frequently used multivariate statistical techniques are interrelated, and all of those discussed can be regarded as special cases of canonical correlation.
Abstract: A diagram is used to aid discussion of how several of the frequently used multivariate statistical techniques are interrelated. All of those discussed can be regarded as special cases of canonical correlation.
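One familiar instance of this relationship can be checked numerically: with a single binary group indicator, canonical correlation reduces to multiple regression, and the regression coefficients are proportional to Fisher's two-group linear discriminant direction. The sketch below is not taken from the paper; the function name `fisher_vs_regression` and the simulated data are assumptions for illustration.

```python
import numpy as np

def fisher_vs_regression(X0, X1):
    """Return the cosine between Fisher's direction and OLS coefficients."""
    X0 = np.asarray(X0, dtype=float)
    X1 = np.asarray(X1, dtype=float)
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)

    # Fisher's direction: within-group scatter inverse times mean difference.
    Sw = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    fisher = np.linalg.solve(Sw, m1 - m0)

    # Least-squares regression of the 0/1 group indicator on centred X.
    X = np.vstack([X0, X1])
    y = np.r_[np.zeros(len(X0)), np.ones(len(X1))]
    beta, *_ = np.linalg.lstsq(X - X.mean(axis=0), y - y.mean(), rcond=None)

    return fisher @ beta / (np.linalg.norm(fisher) * np.linalg.norm(beta))

# Simulated two-group data (hypothetical, for demonstration only).
rng = np.random.default_rng(1)
group0 = rng.normal(size=(40, 3))
group1 = rng.normal(size=(60, 3)) + np.array([1.0, -0.5, 2.0])
cosine = fisher_vs_regression(group0, group1)
```

A cosine of 1 up to rounding error indicates that the two coefficient vectors point in the same direction and differ only by a positive scale factor.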

49 citations


Journal ArticleDOI
TL;DR: Criteria indicate that principal component analysis performs best, but inaccurately, with variable standardization and the correlation matrix, and that nonlinear mapping consistently performs poorly, while performance of methods is consistent with the nonlinearity of Abronia data.
Abstract: Pimentel, R. A. (Department of Biological Sciences, California Polytechnic State Univ., San Luis Obispo, California) 1981. A comparative study of data and ordination techniques based on a hybrid swarm of sand verbenas (Abronia Juss.). Syst. Zool., 30:250-267. The influence of kinds of variables, data errors, standardizations, similarity coefficients, and ordination techniques is judged in reference to a model involving hybridization and introgression in three species of sand verbenas, Abronia (Juss.). Various criteria indicate that principal component analysis performs best, but inaccurately, with variable standardization and the correlation matrix, and that nonlinear mapping consistently performs poorly. Excellent results were obtained from principal coordinate analysis with Gower's general similarity coefficient based upon quantitative and multistate variables; a 'diagnostic' character measured with known error; and detailed color evaluation. This and other principal coordinate analysis results were further improved by nonmetric multidimensional scaling. Performance of methods is consistent with the nonlinearity of Abronia data. It is assumed that nonlinearity is common in taxonomic data. [Multivariate analysis; multivariate analysis assumptions; discriminant analysis; classification methods; principal component analysis; principal coordinate analysis; nonlinear mapping; nonmetric multidimensional scaling; ordination comparisons; data properties; nonlinearity; sand verbenas, Abronia.]

47 citations



Journal ArticleDOI
TL;DR: In a drug study the application of the coefficients of the AR model as input parameters in the discriminant analysis, instead of arbitrarily chosen frequency bands, brought a significant improvement in distinguishing the effects of the medication.
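As a rough illustration of how autoregressive (AR) coefficients might be extracted as features before a discriminant analysis, the sketch below fits an AR model by ordinary least squares. The entry does not say which estimator the authors used, so the least-squares fit, the function name `ar_coefficients`, and the model order are all assumptions.

```python
import numpy as np

def ar_coefficients(signal, order):
    """Least-squares estimate of AR(order) coefficients for one signal."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()
    # Each row of the design matrix holds x[t-1], ..., x[t-order].
    lagged = np.column_stack(
        [x[order - k: len(x) - k] for k in range(1, order + 1)]
    )
    target = x[order:]
    coeffs, *_ = np.linalg.lstsq(lagged, target, rcond=None)
    return coeffs
```

Each recording would then contribute one coefficient vector, and those vectors, rather than band powers, would serve as the input variables of the discriminant analysis.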

Journal ArticleDOI
TL;DR: This paper illustrates that a combination of these two methods may reinforce the discriminating power of a system for the recognition of characters by making use of Fourier shape descriptors and contour approximations.

Journal ArticleDOI
TL;DR: In this paper, the robustness of Fisher's linear discriminant function is evaluated when the distributions of the two populations are characterized by two-component mixed normal distributions with known parameters.
Abstract: Robustness of Fisher's linear discriminant function is evaluated when the distributions of the two populations are characterized by two-component mixed normal distributions with known parameters. The results suggest that the linear discriminant function is rather robust when the two distributions do not markedly deviate from normality and are moderately distant, particularly if they are similar in shape.

Journal ArticleDOI
TL;DR: In this paper, an estimation system for the more common separate sampling which is applicable to continuous and/or discrete predictor variables is given for the probit discriminant function for distinguishing between two ordered groups.
Abstract: Most discriminant functions refer to qualitatively distinct groups. Tallis et al. (1975) introduced the probit discriminant function for distinguishing between two ordered groups. They showed how to estimate this function for mixture sampling and continuous predictor variables. Here an estimation system is given for the more common separate sampling which is applicable to continuous and/or discrete predictor variables. When used solely with continuous variables, this method of estimation is more robust than Tallis's. The relationship of probit and logistic discrimination is discussed.

Journal ArticleDOI
TL;DR: The computer program INDEP-SELECT has been developed for selection of an optimal subset from a set of possibly informative diagnostic or prognostic variables, and is equally useful for other discriminant analysis or pattern recognition problems involving variable selection.


Journal ArticleDOI
TL;DR: In this paper, a study was undertaken to discern differences between high, moderate, and low museum attendees, and one-way analysis of variance and stepwise discriminant analysis were used for market segmentation purposes to differentiate the characteristics of the three groups.
Abstract: This study was undertaken to discern differences between high, moderate, and low museum attendees. One-way analysis of variance and stepwise discriminant analysis were used for market segmentation purposes to differentiate the characteristics of the three groups. The discriminant analysis yielded a model which was found to predict better and was significantly different from the proportional chance prediction, and it is thus felt that a viable method for segmenting museum-goers has resulted.
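The comparison against proportional chance mentioned above can be sketched as follows. The proportional chance criterion is commonly defined as the sum of squared group proportions; the confusion matrix below is hypothetical and is not data from the study.

```python
def proportional_chance(group_sizes):
    """Sum of squared group proportions (proportional chance criterion)."""
    total = sum(group_sizes)
    return sum((n / total) ** 2 for n in group_sizes)

def hit_rate(confusion):
    """Fraction of cases on the diagonal of a confusion matrix."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total

# Hypothetical classification table: rows are actual groups
# (high, moderate, low attendees), columns are predicted groups.
confusion = [[30, 5, 5],
             [6, 20, 4],
             [4, 6, 20]]
sizes = [sum(row) for row in confusion]
chance = proportional_chance(sizes)   # 0.4**2 + 0.3**2 + 0.3**2
observed = hit_rate(confusion)        # 70 of 100 cases correct
```

A discriminant model is usually considered useful only when the observed hit rate clearly exceeds the chance value; whether the difference is statistically significant needs a separate test.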


Journal ArticleDOI
TL;DR: In this paper, a rank procedure developed by Broffitt, Randles, and Hogg was modified to control the conditional probability of misclassification given that classification has been attempted, which leads to a useful solution to the two-population partial discriminant analysis problem for even moderately sized training sets.
Abstract: A rank procedure developed by Broffitt, Randles, and Hogg (1976) is modified to control the conditional probability of misclassification given that classification has been attempted. This modification leads to a useful solution to the two-population partial discriminant analysis problem for even moderately sized training sets.

Journal ArticleDOI
TL;DR: It is shown that ALLOC when associated with descriptive statistical linear discriminant analysis (“display” SLDA) is in some situations a better alternative than ALLOC and SLDA for classification purposes (“classification” SLDA).

01 Jan 1981
TL;DR: In this article, the development of some methods for ordinally measurable data and its implications for regional statistical and econometric analyses are discussed, such as multiple regression analysis, clustering and classification, principal component analysis, and partial least squares.
Abstract: The essential feature of multivariate methods is that they aim at reducing the complexity of phenomena in which many variables or attributes are involved. Given this general feature, it is no surprise to see that these methods have been applied in various fields of research, such as economics, geography, medicine, biology, etc. This paper will be devoted to the development of some methods for ordinally measurable data and its implications for regional statistical and econometric analyses. We will deal with the following methods: multiple regression analysis (and related subjects such as interdependence analysis and discriminant analysis); clustering and classification; principal component analysis (and related subjects such as canonical correlations and partial least squares).

Journal ArticleDOI
TL;DR: A non-parametric discriminant function for categorical or discrete-valued data belonging to at least two a priori classes is constructed by a computer program from the base sample data.
Abstract: A non-parametric discriminant function for categorical or discrete-valued data belonging to at least two a priori classes is constructed by a computer program from the base sample data. There is no limit to the number of data variables measured on members of the base sample, but each variable can have no more than 10 possible values. The program prints out the discriminant function in the form of a branching key for use by hand; also in the form of a Fortran function which can be punched out on cards by the computer. A simple program to read the same variables for each individual, and then call the function, will classify a prospective series from the same population. The key itself may have a descriptive use, especially when the number of variables is enormous. In this case it may be used as a preliminary analysis which may suggest testable hypotheses about the interactions between some of the variables. The algorithm itself is presented in the algorithms section of this issue of Applied Statistics (Sturt, 1980).

Journal ArticleDOI
TL;DR: The main finding is that the syntactic approach outperforms the discriminant analysis method when each technique is compared to visual scoring.

Journal ArticleDOI
TL;DR: In this paper, two estimates of the probability of correct classification, called the apparent and plug-in correct classification rates, are considered and an asymptotic expansion for the conditional joint density of two observations given the sample mean and pooled covariance matrix is found.
Abstract: When classifying an observation into one of $k$ multivariate normal distributions based on samples of correctly classified observations, two estimates of the probability of correct classification, called the apparent and plug-in correct classification rates, are considered. Asymptotic expansions are found for the means and variances of these estimates. It is shown that these expansions can be used to help reduce the bias of the estimates. In the course of finding the expansions, an asymptotic expansion for the conditional joint density of two observations given the sample mean and pooled covariance matrix is found.
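The two estimates can be illustrated in the simplest setting of two groups with equal priors and a common covariance matrix; the paper treats the general $k$-group case and its asymptotic expansions, which are not reproduced here. In this sketch the apparent rate is the resubstitution hit rate of the sample linear discriminant rule, and the plug-in rate is $\Phi(D/2)$ with $D$ the sample Mahalanobis distance. The function name and the simulated data are assumptions.

```python
import math
import numpy as np

def correct_classification_estimates(X0, X1):
    """Apparent and plug-in correct classification rates, two groups."""
    X0 = np.asarray(X0, dtype=float)
    X1 = np.asarray(X1, dtype=float)
    n0, n1 = len(X0), len(X1)
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)

    # Pooled sample covariance matrix and linear discriminant direction.
    S = ((X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)) / (n0 + n1 - 2)
    w = np.linalg.solve(S, m1 - m0)
    cut = w @ (m0 + m1) / 2.0

    # Apparent rate: reclassify the training observations themselves.
    correct = np.sum(X0 @ w < cut) + np.sum(X1 @ w >= cut)
    apparent = correct / (n0 + n1)

    # Plug-in rate: normal-theory formula with estimated parameters.
    D = math.sqrt((m1 - m0) @ w)  # sample Mahalanobis distance
    plug_in = 0.5 * (1.0 + math.erf(D / (2.0 * math.sqrt(2.0))))
    return apparent, plug_in

# Simulated training samples (hypothetical, for demonstration only).
rng = np.random.default_rng(3)
sample0 = rng.normal(size=(50, 2))
sample1 = rng.normal(size=(50, 2)) + np.array([2.0, 1.0])
apparent, plug_in = correct_classification_estimates(sample0, sample1)
```

Both estimates tend to be optimistic in small samples, which is the bias the asymptotic expansions in the paper are meant to help reduce.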

Journal ArticleDOI
TL;DR: Two important problems in the analysis of categorical questionnaire data are considered: assessment of question worth and variable selection, and discrete discriminant analysis when the data is nonordinal with many states and few respondents.
Abstract: Two important problems in the analysis of categorical questionnaire data are considered: assessment of question worth and variable selection, and discrete discriminant analysis when the data is nonordinal with many states and few respondents. The unifying approach used throughout is the concept of information theoretic distance measures. Simulations and applications to real data are presented.

Journal ArticleDOI
TL;DR: An increase in the predictive value for finding usable metaphases from 28% to 68% was achieved and a quality-parameter based on a linear combination of cluster projections, areas and perimeters was found to account for 64% of the variation between visual and measured quality indicators.
Abstract: The performance of metaphase-finding systems could be improved if they were able to determine the quality of the cells detected. This paper discusses the extent to which this may be realized by the introduction of a metaphase-quality parameter. Data obtained from 300 cells were statistically analyzed. Seventeen features were measured and nine visual properties were determined for each cell. Discriminant analysis and regression analysis were used to extract those features and visual properties which contribute to assessment of metaphase quality. Rather low correlations were found between the selected measured features and visual properties. A quality-parameter based on a linear combination of cluster projections, areas and perimeters was found to account for 64% of the variation between visual and measured quality indicators. In addition, an increase in the predictive value for finding usable metaphases from 28% to 68% was achieved.

Journal ArticleDOI
TL;DR: In this paper, the authors discuss considerations in the unwary use of packaged discriminant analysis procedures including: the differences between the "group classification function" and the textbook classification function in both form and use, classification table confusions and their alleviation, and the hazards of stepping procedures.
Abstract: Discriminant-classification analysis is a multivariate statistical technique which will increasingly be used by psychologists as research situations become more varied. Use of discriminant analysis will be facilitated by statistical packages such as SPSS (Nie et al., 1975) and BMDP (Dixon and Brown, 1977). This paper discusses considerations in the unwary use of packaged discriminant analysis procedures including: the differences between the "group classification function" and the textbook classification function in both form and use, classification table confusions and their alleviation, and the hazards of stepping procedures. Recommendations concerning how to conduct an exploratory discriminant analysis are made and an example is presented.

Journal ArticleDOI
TL;DR: The proposed algorithm is a kind of error-correction procedure, and the learning procedure of the usual committee machine and the perceptron are clearly explained as special cases of the proposed algorithm.

Journal ArticleDOI
TL;DR: A computer assisted procedure for the diagnosis of thyroid diseases, based on seven clinical chemical parameters, is proposed, which seems to have some advantages over other diagnostic models published hitherto.