
Showing papers on "Linear discriminant analysis" published in 1977


Journal ArticleDOI
TL;DR: The purpose of this paper is to discuss problems of application of discriminant analysis techniques and the prospects for statistical research on the application of the techniques.
Abstract: Of the applied discriminant analysis papers that have appeared in the business, finance, and economics literature to date, most have suffered from methodological or statistical problems that have limited the practical usefulness of their results. While it is not true that the statistical problems are unique to economics or finance, it does seem that the nature of the subject matter and data are such that one can expect to encounter statistical difficulties more frequently than in many other application areas. The problems are of several different types, among which are difficulties with (1) the distributions of the variables, (2) the group dispersions, (3) the interpretation of the significance of individual variables, (4) the reduction of dimensionality, (5) the definitions of the groups, (6) the choice of the appropriate a priori probabilities and/or costs of misclassification, and (7) the estimation of classification error rates. The purpose of this paper is to discuss these problems of application of discriminant analysis techniques. Ample references are made to the literature for examples to illustrate the pitfalls. Finally, a brief discussion of future problems and prospects for statistical research on the application of the techniques is provided.

686 citations


Journal ArticleDOI
TL;DR: The role that segmentation can play in the formulation of marketing strategy for either products or services, consumer or industrial is discussed and a new approach to segmentation is described nontechnically.

207 citations


Journal ArticleDOI
TL;DR: The validation problems inherent in small-sample discriminant analysis are examined and two recently developed alternatives to the more traditional methods are explained and illustrated in the context of a salesman-selection problem.
Abstract: The validation problems inherent in small-sample discriminant analysis are examined. Two recently developed alternatives to the more traditional methods are explained and illustrated in the context...

167 citations


Journal ArticleDOI
TL;DR: In this article, the authors compared the performance of three discriminant functions, the quadratic, best linear, and Fisher's linear discriminant function, to classify individuals into two normally distributed populations with unequal covariance matrices.
Abstract: A Monte Carlo study (Wahl 1971) is compared to the study of Marks and Dunn (1974) which investigated the ability of three discriminant functions, the quadratic, best linear, and Fisher's linear discriminant function, to classify individuals into two multivariate normally distributed populations with unequal covariance matrices. Parameters that were varied in all of the studies include the distance between populations, covariance matrices, number of variables, sample size and population proportion. Our results, when related to those of Marks and Dunn, indicate sample size to be a critical factor in choosing between the quadratic and linear functions.

129 citations
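
The sample-size effect described in this abstract can be illustrated with a small simulation. The sketch below is not a reproduction of the Wahl or Marks and Dunn studies: it assumes a univariate case, two hypothetical populations N(0, 1) and N(1, 4²) with equal prior probabilities, and it omits the "best linear" function. The function name `simulate_error_rates` and all parameter values are illustrative assumptions.

```python
import math
import random
import statistics


def simulate_error_rates(n_train, n_test=5000, seed=0):
    """Return (linear_error, quadratic_error) estimated on fresh test data."""
    rng = random.Random(seed)

    # Hypothetical populations (assumption): N(0, 1) and N(1, 4^2).
    def draw1():
        return rng.gauss(0.0, 1.0)

    def draw2():
        return rng.gauss(1.0, 4.0)

    # Estimate means and variances from training samples of size n_train each.
    train1 = [draw1() for _ in range(n_train)]
    train2 = [draw2() for _ in range(n_train)]
    m1, m2 = statistics.fmean(train1), statistics.fmean(train2)
    v1, v2 = statistics.variance(train1), statistics.variance(train2)
    pooled = (v1 + v2) / 2.0  # equal sample sizes, so a simple average

    def log_density(x, mean, var):
        # Normal log density up to an additive constant.
        return -0.5 * math.log(var) - (x - mean) ** 2 / (2.0 * var)

    def linear_rule(x):
        # Common (pooled) variance: the rule reduces to a single cut-off point.
        return 1 if log_density(x, m1, pooled) >= log_density(x, m2, pooled) else 2

    def quadratic_rule(x):
        # Separate variances: the boundary is quadratic in x.
        return 1 if log_density(x, m1, v1) >= log_density(x, m2, v2) else 2

    test = [(draw1(), 1) for _ in range(n_test)]
    test += [(draw2(), 2) for _ in range(n_test)]
    lin_err = sum(linear_rule(x) != g for x, g in test) / len(test)
    quad_err = sum(quadratic_rule(x) != g for x, g in test) / len(test)
    return lin_err, quad_err


for n in (10, 50, 500):
    lin, quad = simulate_error_rates(n_train=n)
    print(f"n_train={n}: linear={lin:.3f} quadratic={quad:.3f}")
```

Varying `n_train` and the variance ratio gives a rough feel for when the extra parameters of the quadratic rule are worth estimating; a single seed is only one Monte Carlo replicate, so results for small samples should be averaged over many seeds before drawing conclusions.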


Journal ArticleDOI
TL;DR: Five programs for selection of variables in discriminant analysis are compared: the program DISCRIM of McCabe, the BMD07M program, see Dixon, the program ALLOC-I of Habhema, Hermana and van den Broek, and two more recent programs: SPSS and BMDP7M.
Abstract: Five programs for selection of variables in discriminant analysis are compared: the program DISCRIM of McCabe [8], the BMD07M program, see Dixon [1], the program ALLOC-I of Habhema, Hermana and van den Broek [3], and two more recent programs: SPSS and BMDP7M, see Nie et al. [10] and Dixon [2]. Emphasis is on the criteria for selection and on the distributional assumptions involved. The programs are compared experimentally using two examples: one with real data, also used by McCabe [8], and one with simulated data.

127 citations


Journal ArticleDOI
TL;DR: In this paper, a review of the published work on the performance of Fisher's linear discriminant function when underlying assumptions are violated is given, and new results are presented for the case of classification using both binary and continuous variables.
Abstract: A review is given of the published work on the performance of Fisher's linear discriminant function when underlying assumptions are violated. Some new results are presented for the case of classification using both binary and continuous variables, and conditions for success or failure of the linear discriminant function are investigated.

111 citations


Journal ArticleDOI
TL;DR: In this article, the importance of infrared vs. visual features, textural vs. spectral features, hierarchical vs. single-stage decision logic, and quadratic vs. linear discriminant functions for classification of NOAA-1 visible and infrared tropical cloud data was determined.

70 citations



Journal ArticleDOI
TL;DR: In this article, asymptotic results for multivariate two-group classification, derived assuming multivariate normality and equal covariance matrices, are combined with results from the literature to form the basis of detailed proposals for dealing with problems from actual statistical practice.
Abstract: In Part I exact results for univariate (“p = 1”) two-group (“k = 2”) classification problems were derived assuming normality and equality of the variances. In Part IIa asymptotic results for multivariate (“p > 1”) two-group classification and discrimination problems are based on the corresponding assumptions of multivariate normality and equality of the covariance matrices. The results (4.6.5), (4.6.6) and (4.6.7) are believed to be new. The asymptotic results in Section 4.6, together with results presented elsewhere in the literature, constitute the basis of various detailed proposals to deal with problems from actual statistical practice. Most of these proposals are modifications or specifications of existing ones. We shall pay some attention to (I) testing whether differences exist. But we are mainly interested in: (II) constructing a discriminant function, (III) assigning the individual under classification, and in (IV) constructing a confidence interval for “the” posterior probability that the individual under classification belongs to Population 2. An important part in our theory is played by various techniques for selecting variables in discriminant analysis. The need for such techniques follows from Section 4.10. The consequences of building-in a selection technique are discussed in Section 4.12. One of our proposals motivates the theory presented in Chapter 3 and is mentioned here for that reason: employ a large part of the data, say 70%, in order to construct a discriminant function (via a selection of variables); by applying this function to the rest of the data, the exact univariate theory of Part I becomes applicable. Part IIb will contain a chapter on applications.

54 citations


Journal ArticleDOI
TL;DR: Two measures of multivariate niche overlap defined on p resource variables are presented and an illustration of the multivariate approach to actual field data is demonstrated.

42 citations


Journal ArticleDOI
TL;DR: The authors examined the impact of three separate factors influencing classification results obtained from discriminant analysis: multivariate normality, equality of the variance/covariance matrices, and misclassification error rates.
Abstract: Several recent insurance studies [3, 8, 21, 24] have employed multiple discriminant analysis in an attempt to identify important predictor variables and to reclassify the original observations into one of two known groups. The purpose of this paper is to examine the impact of three separate factors influencing classification results obtained from discriminant analysis: multivariate normality, equality of the variance/covariance matrices, and misclassification error rates. In order to focus on the potential impact of these three factors, the authors assume that the a priori probabilities of group membership are equal as are the costs of misclassification, and that the specific predictor variables have already been determined. Data from the previous Trieschmann-Pinches (T-P) study [24] are analyzed to illustrate the impact of these three factors on the classification results. The important objectives of this paper are twofold: first, to illustrate that it is possible to get many different classification results from the same discriminant model, and second, to provide a review of the present knowledge available on the impact of non-multivariate normality, unequal covariance matrices, and different misclassification error rates on classification results obtained when multiple discriminant analysis is employed.

Journal ArticleDOI
TL;DR: In this paper, a discriminant analysis approach was used to distinguish three groups of naval personnel: those eligible to reenlist who do, those eligible who do not, and those not eligible.
Abstract: Variables from five domains—demography, social background, service history, satisfaction, and performance—were used in a discriminant analysis approach to distinguishing three groups of naval personnel: Those eligible to reenlist who do, those eligible who do not, and those not eligible. Discriminant weights were derived from a sample of 642 first-term enlisted men and cross-validated on a sample of 628. The results indicated that both pre-service characteristics (demography and social background) and in-service experiences (service history, satisfaction, and performance) contributed importantly to prediction of attrition/retention. Potential usefulness of this method, including implications for better understanding and control of manpower turnover were discussed.

Book ChapterDOI
01 Jan 1977
TL;DR: In this article, the usefulness of R- and Q-type orthogonal factor analysis, linear typal analysis, and other classification typologies (e.g., multidimensional scaling, principal component analysis, cluster analysis) in uncovering homogeneous subgroups from naturally-selected, heterogeneous samples of animals or behaviors was described.
Abstract: Multivariate analyses organize and reduce data composed of numerous variables into fewer biologically interpretable dimensions. This paper describes the usefulness of R- and Q-type orthogonal factor analysis, linear typal analysis, and other classification typologies (e.g., multidimensional scaling, principal-components analysis, cluster analysis) in uncovering homogeneous subgroups from naturally-selected, heterogeneous samples of animals or behaviors. Once animal groups are identified according to some criteria, multiple step-wise discriminant analysis can be used to determine which behavioral variables best differentiate between all group combinations. R-type factor analysis applied to 20 agonistic behaviors exhibited during adult male-male agonistic interactions in the wolf spider Schizocosa crassipes yielded four behavior-related factors: (I) Approach/Signal, (II) Vigorous Pursuit, (III) Run/Retreat, and (IV) Non-Linking Behaviors. If the same matrix is rotated and a Q-type factor analysis is applied (this time to the 40 subjects as variables), two subject-related factors were extracted and interpreted as (I) Dominance and (II) Subordinance. Discriminant analysis of spider groups pre-identified on the basis of density parameters of dominance rankings characterized the groups in terms of the original 20 behaviors and indicated which behaviors most optimally discriminate between all pairwise group combinations. Q-type orthogonal powered-vector factor analysis and linear typal analysis of 10 burrowing parameters from the marine gastropod mollusc Aplysia brasiliana yielded three factors interpreted as Efficient, Inefficient, and Intermediate burrowers. Used as a diagnostic tool in proper perspective, multivariate analyses structure complex data, provide insight into underlying dimensions and sources of individual variation, and facilitate the formulation of testable hypotheses for studying mechanisms controlling behavior.

Journal ArticleDOI
TL;DR: The advantages of multivariate analysis of variance (MANOVA) and discriminant analysis (DSA) for data analysis in behavioral research are discussed in this paper, where the meanings of some multivariate statistics are also discussed.

Journal ArticleDOI
TL;DR: A method for recognizing both the three-dimensional pattern and the size of objects by grasping them with multijointed fingers equipped with tactile sensors, which shows that the most useful discriminant function is a linear one.

Journal ArticleDOI
TL;DR: In this paper some of the common misuses of discriminant analysis are discussed and identified, which leads to the appropriate corrective action.
Abstract: In this paper some of the common misuses of discriminant analysis are discussed. The problems fall into four groups: (1) study goals are unfocused; (2) sampling procedure is improper; (3) assumptions are violated; (4) variables are poorly defined or selected. Identification of these problems leads to the appropriate corrective action.

Journal ArticleDOI
TL;DR: In this paper, the authors investigated the nature of sex differences in drinking motives and behavior within a stratified random sample from the Champaign-Urbana community (N=385).
Abstract: This paper investigates the nature of sex differences in drinking motives and behavior within a stratified random sample from the Champaign-Urbana community (N=385). Discriminant analysis is used to pinpoint which independent drinking variables differentiated most powerfully between the sexes, as well as to measure the importance of sex differences in drinking relative to age, student status, and marital status group differences. The findings indicate that despite some sex differences in drinking patterns, the drinking variables displayed their greatest discriminatory power on each of the other three demographic group variables. Compared to age, student status, or marital status differences, drinking differences between women and men appeared relatively minor. Throughout the paper the authors stress the utility of multivariate discriminant analysis for the investigation of group differences in sociological research.

Journal ArticleDOI
TL;DR: In this article, the relative importance of variable subsets in the analysis of multivariate data has been discussed and a set of procedures have been proposed to isolate subsets of variables which provide essentially as much separation among the groups in each subset of groups as the original set of variables.
Abstract: SUMMARY The purpose of the procedures proposed in this paper is to provide information about the relative importance of variable subsets in the analysis of multivariate data. In multiple discriminant analysis the procedures isolate subsets of variables which provide essentially as much separation among the groups in each subset of groups as the original set of variables. Each procedure is a simultaneous test procedure in that the type I probability error rate for the family of hypotheses tested cannot exceed a specified value.

Journal ArticleDOI
TL;DR: In this article, the authors investigated the use of discriminant analysis as an empirical technique for assisting the urban planner in predicting patterns of neighborhood change and found that a discriminant model estimated for ninety low-income census tracts within the city of Pittsburgh predicts 97% of upgrading income paths and 92% of downgrading paths over the period 1960 to 1970.
Abstract: This paper investigates the use of discriminant analysis as an empirical technique for assisting the urban planner in predicting patterns of neighborhood change. A discriminant model estimated for ninety low-income census tracts within the city of Pittsburgh predicts 97% of upgrading income paths and 92% of downgrading paths over the period 1960 to 1970. Some form of the discriminant model would appear to be a useful guide to policymakers and a reasonable technique for limiting the areas of immediate policy concern.


Journal ArticleDOI
TL;DR: The present method is shown to be efficacious compared to other classification procedures such as the simple least-squares method, the Rao-type discriminant analysis, and the K-nearest neighbor method.
Abstract: An adaptive least-squares (ALS) classification which is capable of relating structure to activity rating of chemical compounds has been developed. The ALS method makes decisions for multicategory pattern classification by a single discriminant function. For the set of 16 mitomycin derivatives belonging to five activity classes used in this study, the present method is shown to be efficacious compared to other classification procedures such as the simple least-squares method, the Rao-type discriminant analysis, and the K-nearest neighbor method.

Journal ArticleDOI
TL;DR: The main concepts of several multivariate statistical methods used for analyzing and classifying stored-grain infestation data observed during a decade's ecological studies are presented briefly in non-mathematical language with simple diagrams to encourage their use by stored-product entomologists.
Abstract: The main concepts of several multivariate statistical methods used for analyzing and classifying stored-grain infestation data observed during a decade's ecological studies are presented briefly in non-mathematical language with simple diagrams to encourage their use by stored-product entomologists. The mathematical assumptions and limitations of cluster analysis, multiple regression analysis, principal component analysis, factor analysis, canonical correlation analysis and discriminant analysis are given. Original examples of application and interpretation of principal component analyses to insect- and mite-infested wheat and rapeseed bulks on western Canadian farms are given, as this method was found to be the most useful hypothesis-formulating tool.

Journal ArticleDOI
TL;DR: Four widely used statistical program packages—BMDP, SPSS, DATATEXT, and OSIRIS—were compared for computational accuracy on sample means, standard deviations, and correlations.
Abstract: Four widely used statistical program packages—BMDP, SPSS, DATATEXT, and OSIRIS—were compared for computational accuracy on sample means, standard deviations, and correlations. Only one, BMDP, was not seriously inaccurate in calculations on a data set of three observations. Further, SPSS computed inaccurate statistics in a discriminant analysis on a real data set of 848 observations. It is recommended that the desk calculator algorithm, found in most of these programs, not be used in packages which may run on short word length machines.
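
The "desk calculator algorithm" mentioned in this abstract is usually understood as the one-pass formula that accumulates the sum and the sum of squares and then subtracts. The sketch below, which assumes that reading, contrasts it with a two-pass computation on three hypothetical observations with a large mean. It runs in double precision rather than on a short word length machine, so the data values were chosen (as an assumption) to be large enough to trigger the same cancellation.

```python
def desk_calculator_variance(xs):
    """One-pass formula: (sum(x^2) - (sum(x))^2 / n) / (n - 1)."""
    n = len(xs)
    total = 0.0
    total_sq = 0.0
    for x in xs:
        total += x
        total_sq += x * x  # squares are rounded, losing the low-order digits
    # Subtracting two nearly equal large numbers cancels the leading digits.
    return (total_sq - total * total / n) / (n - 1)


def two_pass_variance(xs):
    """Two-pass formula: compute the mean first, then sum squared deviations."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


# Hypothetical data set of three observations; the true sample variance is 1.
data = [1e9 + 1.0, 1e9 + 2.0, 1e9 + 3.0]
print("one-pass:", desk_calculator_variance(data))
print("two-pass:", two_pass_variance(data))  # 1.0
```

The two-pass version works on deviations from the mean, which are small numbers, so no precision is lost here; the one-pass version works on quantities near 3e18, where adjacent double-precision values are hundreds of units apart.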

Journal ArticleDOI
TL;DR: In this paper, the authors used admission data from readily available admission applications for first-time entering freshmen and transfer students who initially had enrolled at the University of Northern Colorado during the fall quarter 1970 to predict membership into a class of students who were graduated by the end of the traditional 4-year college career, were still enrolled after the four-year period, were not enrolled the quarter following academic probation or suspension, or left the university while in good academic standing.
Abstract: Data from readily available admission applications were obtained for first-time entering freshmen and transfer students who initially had enrolled at the University of Northern Colorado during the fall quarter 1970. These data were used to predict membership into a class of students who (a) were graduated by the end of the traditional 4-year college career, (b) were still enrolled after the 4-year period, (c) were not enrolled the quarter following academic probation or suspension, or (d) left the university while in good academic standing. This study attempted to answer the following questions: (a) Could discriminant functions be developed which would allow for the correct classification of a student into one of the four categories of interest? (b) Which ones of the variables were the best discriminators between the groups? and (c) How efficient were the discriminant functions in this classification procedure? Results indicated that discriminant functions could be developed which accurately place 33% to ...

Journal ArticleDOI
01 Mar 1977
TL;DR: A computer algorithm employing fading-memory system identification and linear discriminant analysis is proposed for real-time detection of human shifts of attention in a control and monitoring situation and application of the method to computer-aided decisionmaking in multitask situations is discussed.
Abstract: A computer algorithm employing fading-memory system identification and linear discriminant analysis is proposed for real-time detection of human shifts of attention in a control and monitoring situation. Experimental results are presented that validate the usefulness of the method. Application of the method to computer-aided decisionmaking in multitask situations is discussed.

Journal ArticleDOI
TL;DR: In this article, a method for studying relationships among groups in terms of categorical data patterns is described, which yields a dimensional representation of configural relationships among multiple groups and a quantitative scaling of categorical data patterns for use in subsequent assignment of new individuals to the groups.
Abstract: A method for studying relationships among groups in terms of categorical data patterns is described. The procedure yields a dimensional representation of configural relationships among multiple groups and a quantitative scaling of categorical data patterns for use in subsequent assignment of new individuals to the groups. Two examples are used to illustrate potential of the method. In the first, profile data that were previously analyzed by metric multiple discriminant function analysis are reanalyzed by the nonmetric categorical data pattern technique with highly similar results. The second example examines relationships among psychiatric syndrome groups in terms of similarities in patterns of categorical background variables. Results appear consistent with other available information concerning the epidemiology of psychiatric disorders.

Book ChapterDOI
01 Jan 1977
TL;DR: In this paper, the robustness of the linear discriminant function against contamination of the initial samples is discussed, where the authors show that the amount of loss of discriminating power is a function of how non-normal the distributions are.
Abstract: Publisher Summary This chapter discusses one particular aspect of the robustness of the linear discriminant function, that is, robustness against contamination of the initial samples. Very little research has been done on the robustness of the linear discriminant function. The linear discriminant function is used to assign observations to one of two populations. If the underlying distributions of the observations are multivariate normal with the same covariance matrices, the sample discriminant has certain desirable asymptotic properties. If the underlying populations are not normal, the discriminant still has some desirable properties, but is not, in general, the optimal rule in any sense. The amount of loss of discriminating power is a function of how non-normal the distributions are. It is known, however, that even slight non-normality can hurt location estimators considerably. The overall probability of misclassification can be greatly affected by location contamination, depending upon the direction of the contamination. The greatest effect of location contamination occurs when the contaminating mean is on the opposite side of the uncontaminated population mean.

Journal ArticleDOI
TL;DR: In this paper, the statistical technique of discriminant analysis is used to define target areas for detailed general exploration given only general geological information and aeromagnetic anomaly values, and the area was divided into two subareas based on major differences in each area's geology.
Abstract: The statistical technique of discriminant analysis is used to define target areas for detailed general exploration given only general geological information and aeromagnetic anomaly values. In the test area, located in Central Norway, on-going exploration surveys have revealed the presence of mineralization; however, it still has not been determined if any of the sites will be economically feasible. The area was divided into 1400 1-km × 1-km cells by superimposing a square grid on 1:50,000-scale geological and geophysical maps. Later the area was divided into two subareas based on major differences in each area's geology. A number of geological features and the aeromagnetic anomaly values were coded systematically in each cell. The cells representing an advanced degree of exploration were chosen as control cells in each of the subareas. The geological and geophysical parameters were transformed, by means of relatively simple transformations, to produce near-normal frequency distributions. A discriminant function was then obtained by discriminant analysis to divide the control data into two groups, cells with presence of mineral occurrence and cells without mineral occurrence. The discriminant function obtained for the control area proved to be relevant both geologically and statistically. Consequently, the discriminant equation was applied to cells outside the control area. The cells were assigned to one of the two groups by entering the geologic factors measured from the maps into the discriminant model. The exploration potential of a large number of cells was evaluated by this procedure.
To test the results, field work including geochemical sampling was carried out in the cells with highest probability of mineral occurrence. The field work results have shown that the application of discriminant analysis to geological information at 1:50,000 scale with 1-km × 1-km cells combined with a careful selection of techniques for transforming the variables is a feasible method for predicting mineralization, and as such could become a valuable tool for mining exploration.

Journal ArticleDOI
TL;DR: In this paper, a model of the dynamics of cumulative achievement developed by Atkinson (5) forms the basis for developing an instrument to be used in predicting academic achievement, and a discriminant analysis is used in lieu of the more typical but technically less correct MLR.
Abstract: Typical approaches to predicting scholastic achievement apply a multiple linear regression (MLR) approach to several predictors, the basis for whose selection is seldom apparently systematic or theoretically rational. In this study, a model of the dynamics of cumulative achievement developed by Atkinson (5) forms the basis for developing an instrument to be used in predicting academic achievement, and a discriminant analysis is used in lieu of the more typical but technically less correct MLR. The instrument was administered to 804 students in four undergraduate classes to provide estimates of reliability and validity. Average subscale reliability was .70, and hold-out cross-validation showed good nomological validity. It was concluded that the model-based approach to instrument development warrants further work, and that the discriminant analytic technique provided a viable approach to the question of prediction of academic achievement.

Journal ArticleDOI
TL;DR: This paper revisits the use of covariates in discriminant functions, in which the discriminators are covariance-adjusted and a standard linear discriminant analysis is run on the adjusted variables, and considers why the gain from this technique is trivial for assigning new observations.
Abstract: In 1948, Cochran and Bliss [1] introduced the notion of using covariates in discriminant functions. These were variables that in themselves had no discriminating power, but because they were correlated with other variables, they could be useful in combination with those other variables. They showed that the Mahalanobis distance between the two populations always increased, and thus the power of tests would be increased. The use of covariates could never hurt. The method was simple; one computed the usual covariance adjustment for the discriminators and did a standard linear discriminant analysis on the adjusted variables. Somewhat later Cochran [2] compared the performance of this procedure with doing a discriminant analysis on the complete set of discriminators and covariates. In this study, he found that the covariance technique produced more powerful significance tests, but the gain was trivial for assigning new observations. This is not surprising, for let the densities be $f_1(x, z) = f_1(x \mid z)\,g(z)$ in $\Pi_1$ and $f_2(x, z) = f_2(x \mid z)\,g(z)$ in $\Pi_2$. ($\Pi_1$ and $\Pi_2$ denote which population we are sampling from.) For the discriminant function, we assume $f_i(x, z)$ are multivariate normal. The optimal rule is to assign the unknown observation to $\Pi_1$ if