
Showing papers on "Linear discriminant analysis published in 1991"


Journal ArticleDOI
TL;DR: The formal equivalence between discriminant analysis and multilayer perceptrons used for classification tasks is proved, and evidence of generic properties of MLP classifiers is shown.

139 citations


Journal ArticleDOI
TL;DR: In this paper, a probabilistic and distribution-free class-modelling technique is developed from potential function discriminant analysis, where the class boundary is built either by the sample percentile of the probability density estimated by means of potential functions, or by the estimate of the equivalent determinant of the variance–covariance matrix.
Abstract: A probabilistic and distribution-free class-modelling technique is developed from potential function discriminant analysis. In the multidimensional space of variables the class boundary is built either by the sample percentile of the probability density estimated by means of potential functions, or by the estimate of the ‘equivalent’ determinant of the variance–covariance matrix. The equivalent determinant is that of a hypothetical multivariate normal distribution whose mean probability density was obtained by potential functions. The bases of this modelling rule are evaluated by means of Monte Carlo experiments. The results on four datasets are used to measure the performances of this method, which equal and sometimes exceed the performances of parametric class-modelling methods based on linear and quadratic discriminant analysis which were used for comparison.
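As a rough illustration of the class-modelling rule described above, the sketch below estimates a class's probability density with Gaussian potential (kernel) functions and sets the class boundary at a sample percentile of the training densities. The Gaussian kernel, the smoothing parameter, and all function names are illustrative assumptions, not the authors' implementation.

```python
# Sketch of a potential-function (kernel density) class model with a
# percentile-based boundary; assumes a Gaussian potential function with a
# fixed smoothing parameter. Names and data are illustrative.
import numpy as np

def potential_density(train, x, h=0.5):
    """Mean Gaussian potential at point x induced by the training class."""
    d = train.shape[1]
    diff = (train - x) / h
    k = np.exp(-0.5 * np.sum(diff**2, axis=1)) / ((2 * np.pi) ** (d / 2) * h**d)
    return k.mean()

def class_boundary(train, percentile=5, h=0.5):
    """Density threshold: the sample percentile of each training point's density."""
    dens = np.array([potential_density(train, x, h) for x in train])
    return np.percentile(dens, percentile)

# Usage: accept a new sample into the class model if its density exceeds the threshold.
rng = np.random.default_rng(0)
cls = rng.normal(size=(100, 3))
thr = class_boundary(cls)
accepted = potential_density(cls, np.zeros(3)) >= thr
```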

113 citations


Journal Article
TL;DR: In this article, the authors evaluated the performance of image texture processing and integration with ancillary topographic data for a moderate relief, boreal environment in Gros Morne National Park, eastern Canada.
Abstract: This study evaluates the increases in classification accuracy possible from satellite and airborne image texture processing and integration with ancillary topographic data for a moderate relief, boreal environment in Gros Morne National Park, eastern Canada. The texture measures angular second moment, entropy, and inverse difference moment were computed from spatial co-occurrence matrices in different orientations from SPOT multispectral linear array (MLA) and synthetic aperture radar (SAR) imagery. The general system of geomorphometry (elevation, slope, aspect, curvature, relief) was extracted from a co-registered digital elevation model (DEM). Stepwise and linear discriminant analyses were used to rank the relative information content of all variables and to ascertain classification accuracies for different combinations of variables.
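For concreteness, the sketch below computes the three co-occurrence texture measures named in the abstract from a normalized grey-level co-occurrence matrix. The toy matrix and function name are assumptions for illustration, not the study's code.

```python
# Illustrative computation of angular second moment, entropy, and inverse
# difference moment from a grey-level co-occurrence matrix P.
import numpy as np

def texture_measures(P):
    P = P / P.sum()                       # normalize to joint probabilities
    i, j = np.indices(P.shape)
    asm = np.sum(P**2)                    # angular second moment (energy)
    entropy = -np.sum(P[P > 0] * np.log(P[P > 0]))
    idm = np.sum(P / (1.0 + (i - j)**2))  # inverse difference moment (homogeneity)
    return asm, entropy, idm

# Example with a toy 4-level co-occurrence matrix
P = np.array([[4, 2, 1, 0],
              [2, 3, 1, 0],
              [1, 1, 2, 1],
              [0, 0, 1, 1]], dtype=float)
print(texture_measures(P))
```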

104 citations


Journal ArticleDOI
TL;DR: An important aspect of the approach is to exhibit how a priori information regarding nonuniform class membership, uneven distribution between train and test sets, and misclassification costs may be exploited in a regularized manner in the training phase of networks.
Abstract: The problem of multiclass pattern classification using adaptive layered networks is addressed. A special class of networks, i.e., feed-forward networks with a linear final layer, that perform generalized linear discriminant analysis is discussed. This class is sufficiently generic to encompass the behavior of arbitrary feed-forward nonlinear networks. Training the network consists of a least-squares approach which combines a generalized inverse computation to solve for the final layer weights, together with a nonlinear optimization scheme to solve for parameters of the nonlinearities. A general analytic form for the feature extraction criterion is derived, and it is interpreted for specific forms of target coding and error weighting. An important aspect of the approach is to exhibit how a priori information regarding nonuniform class membership, uneven distribution between train and test sets, and misclassification costs may be exploited in a regularized manner in the training phase of networks.
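A minimal sketch of the least-squares step described above: with the nonlinear layer held fixed, the final linear layer is solved in closed form by a generalized (pseudo-)inverse against one-of-K target codes. The tanh hidden layer, synthetic data, and variable names are illustrative assumptions.

```python
# Closed-form solution of the final linear layer by generalized inverse,
# given fixed hidden-layer nonlinearities and one-of-K target coding.
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))                  # inputs
labels = rng.integers(0, 3, size=200)          # 3 classes
T = np.eye(3)[labels]                          # one-of-K target coding

W_hidden = rng.normal(scale=0.5, size=(5, 10)) # fixed nonlinear-layer parameters
H = np.tanh(X @ W_hidden)                      # hidden activations

W_out = np.linalg.pinv(H) @ T                  # generalized-inverse solution
pred = np.argmax(H @ W_out, axis=1)            # classify by largest output
```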

99 citations


Journal ArticleDOI
TL;DR: Compared with the two other techniques, neural networks show a unique ability to detect features hidden in the input data which are not explicitly formulated as input, and their application, particularly in situations with complex data structures, should be investigated with more emphasis.
Abstract: Successful applications of neural network architecture have been described in various fields of science and technology. We have applied one such technique, error back-propagation, to a medical classification problem stemming from clinical chemistry, and we have compared the performance of two different neural networks with results obtained by conventional linear discriminant analysis or by the technique of classification and regression trees. The results obtained by the various models were tested for robustness by jackknife validation ("leave n out" method). Compared with the two other techniques, neural networks show a unique ability to detect features hidden in the input data which are not explicitly formulated as input. Thus, neural network techniques appear promising in the field of clinical chemistry, and their application, particularly in situations with complex data structures, should be investigated with more emphasis.

96 citations


Journal ArticleDOI
01 Mar 1991
TL;DR: Algorithms for dimensionality reduction and feature extraction and their applications as effective pattern recognizers in identifying computer users are presented and the applications of these algorithms could lead to better results in securing access to computer systems.
Abstract: Algorithms for dimensionality reduction and feature extraction and their applications as effective pattern recognizers in identifying computer users are presented. Fisher's linear discriminant technique was used for the reduction of dimensionality of the patterns. An approach for the extraction of physical features from pattern vectors is developed. This approach relies on shuffling two pattern vectors, and is competitive with the use of Fisher's technique in terms of speed and results. An online identification system was developed. The system was tested over a period of five weeks by ten participants, and in 1.17% of cases it was unable to reach a decision. The application of these algorithms to identifying computer users could lead to better results in securing access to computer systems: the user types a password and the system identifies not only the word but also the time between each keystroke and the next.
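The Fisher step can be sketched roughly as follows: inter-keystroke timing vectors are projected onto the direction w = Sw⁻¹(m1 − m2) and thresholded midway between the projected class means. The synthetic timing data and function names are assumptions for illustration only, not the authors' system.

```python
# Fisher's linear discriminant applied to (synthetic) keystroke-timing vectors.
import numpy as np

def fisher_direction(X1, X2):
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    Sw = np.cov(X1, rowvar=False) * (len(X1) - 1) + np.cov(X2, rowvar=False) * (len(X2) - 1)
    return np.linalg.solve(Sw, m1 - m2)        # w = Sw^{-1}(m1 - m2)

rng = np.random.default_rng(2)
user_a = rng.normal(0.18, 0.03, size=(40, 7))  # inter-keystroke times (s), user A
user_b = rng.normal(0.25, 0.04, size=(40, 7))  # same password typed by user B
w = fisher_direction(user_a, user_b)
cut = 0.5 * (user_a.mean(axis=0) + user_b.mean(axis=0)) @ w
is_user_a = (rng.normal(0.18, 0.03, size=7) @ w) > cut
```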

91 citations


Journal ArticleDOI
TL;DR: Back-propagation is more effective than other procedures, sometimes strikingly so, in correctly classifying the dependent variable, even when the amount of noise in the model is high, and ID3 is not consistently superior to procedures in the multiple linear general hypothesis (MLGH) family in terms of effectiveness, either for classification or for causal inference.
Abstract: New computer techniques for data analysis, notably the algorithms associated with neural networks and with expert systems, have not caught on to a significant extent in social science. To appraise these developments, an empirical assessment is conducted in which expert systems and neural network approaches are compared with multiple linear regression, logistic regression, effects analysis, path analysis, and discriminant analysis. A simple method of partitioning neural network output layer connections in terms of input nodes (corresponding to independent variables) is also presented, allowing neural net analysis for modeling as well as classification purposes. It is concluded that back-propagation (neural networks) is more effective than other procedures, sometimes strikingly so, in correctly classifying the dependent variable, even when the amount of noise in the model is high. Back-propagation was of less help, however, in causal inference. None of the techniques performed well by this important criterion. The I...

82 citations


Journal ArticleDOI
TL;DR: In this article, an analytic solution for the theoretical distribution of optimal values for univariate optimal linear discriminant analysis, under the assumption that the data are random and continuous, is presented.
Abstract: Optimal linear discriminant models maximize percentage accuracy for dichotomous classifications, but are rarely used because a theoretical framework that allows one to make valid statements about the statistical significance of the outcomes of such analyses does not exist. This paper describes an analytic solution for the theoretical distribution of optimal values for univariate optimal linear discriminant analysis, under the assumption that the data are random and continuous. We also present the theoretical distribution for sample sizes up to N = 30. The discovery of a statistical framework for evaluating the performance of optimal discriminant models should greatly increase their use by scientists in all disciplines.
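In the univariate case, the "optimal" discriminant amounts to an exhaustive search over cutpoints and directions for the split that maximizes training classification accuracy. A hedged sketch follows, with synthetic data and illustrative names.

```python
# Exhaustive search for the cutpoint and direction that maximize training
# accuracy of a dichotomous classification on a single variable.
import numpy as np

def optimal_cutpoint(x, y):
    """x: 1-D scores; y: 0/1 labels. Returns (cut, direction, accuracy)."""
    best = (None, None, 0.0)
    for cut in np.unique(x):
        for direction in (1, -1):
            pred = (direction * x > direction * cut).astype(int)
            acc = np.mean(pred == y)
            if acc > best[2]:
                best = (cut, direction, acc)
    return best

rng = np.random.default_rng(3)
x = np.concatenate([rng.normal(0.0, 1.0, 15), rng.normal(1.2, 1.0, 15)])
y = np.array([0] * 15 + [1] * 15)
print(optimal_cutpoint(x, y))
```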

76 citations


Journal ArticleDOI
TL;DR: In this article, two ground coffees, Coffea arabica and C. robusta, were clearly separated by cluster analysis and linear discriminant analysis, and two freeze-dried and spray-dried commercial instant coffees were clearly distinguished by pattern recognition techniques.
Abstract: The method was applied to discriminating coffee aromas, essential oils, and volatile compounds with different functional groups. To standardise sample introduction and to remove excess ethanol from volatile mixtures, headspace concentration utilizing a porous polymer trap was incorporated into the sensing system. Pattern recognition techniques such as discriminant analysis and cluster analysis were applied to the normalized response pattern. Two ground coffees, Coffea arabica and C. robusta, and freeze-dried and spray-dried commercial instant coffees were clearly separated by cluster analysis and linear discriminant analysis.

70 citations


Journal ArticleDOI
07 Jul 1991
TL;DR: The Hotelling model is quite useful as a predictive tool unless there are high-pass noise correlations introduced by post-processing of the images, in which case it is suggested that the Hotelling observer be modified to include spatial-frequency-selective channels analogous to those in the visual system.
Abstract: The use of linear discriminant functions, and particularly a discriminant function derived from the work of Harold Hotelling, as a means of assessing image quality is reviewed. The relevant theory of ideal or Bayesian observers is briefly reviewed, and the circumstances under which this observer reduces to a linear discriminant are discussed. The Hotelling observer is suggested as a linear discriminant in more general circumstances where the ideal observer is nonlinear and usually very difficult to calculate. Methods of calculation of the Hotelling discriminant and the associated figure of merit, the Hotelling trace, are discussed. Psychophysical studies carried out at the University of Arizona to test the predictive value of the Hotelling observer are reviewed, and it is concluded that the Hotelling model is quite useful as a predictive tool unless there are high-pass noise correlations introduced by post-processing of the images. In that case, we suggest that the Hotelling observer be modified to include spatial-frequency-selective channels analogous to those in the visual system.
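For orientation, the Hotelling discriminant and the Hotelling trace figure of merit take the following standard textbook form (our notation; a sketch of the usual definitions rather than the paper's exact expressions):

```latex
t(\mathbf{g}) \;=\; \Delta\bar{\mathbf{g}}^{\top}\,\mathbf{S}_{2}^{-1}\,\mathbf{g},
\qquad
J \;=\; \operatorname{tr}\!\left(\mathbf{S}_{2}^{-1}\,\mathbf{S}_{1}\right),
```

where Δḡ is the difference of the class mean images, S₂ the average intraclass scatter (covariance) matrix, and S₁ the interclass scatter of the class means; J is the Hotelling trace.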

67 citations


Journal ArticleDOI
TL;DR: Pattern recognition techniques based on multivariate analysis have been most useful in processing data from chromatography and spectrometry, mainly due to the intrinsic multidimensionality of flavor.
Abstract: Chemometrics is playing an increasingly important role in flavor research. Pattern recognition techniques based on multivariate analysis have been most useful in processing data from chromatography and spectrometry, mainly due to the intrinsic multidimensionality of flavor. Multiple regression analysis and its derivatives, including partial least squares regression (PLS), have been frequently used for correlating instrumental data to sensory properties. Factor analysis and principal component analysis are widely used for searching latent factors and extracting information as unsupervised pattern recognition. Cluster analysis and discriminant analysis have been successful for classification of samples; however, modeling of samples using SIMCA and nonparametric classification such as KNN have also gained popularity for improving accuracy. Simplex optimization has been well established as a technique in chemometrics; however, it is relatively unknown in flavor research. Computer‐aided optimization has...

Journal ArticleDOI
TL;DR: In preliminary applications of the CART algorithms to data from The University of North Carolina Caries Risk Assessment Study, the method produced prediction rules having sensitivities and specificities that were similar to or slightly better than those associated with logistic and discriminant analyses.
Abstract: Caries prediction by Classification And Regression Tree (CART) analysis is an appropriate and powerful alternative or complement to the commonly used classification methods of logistic regression and discriminant analysis, both parametric and nonparametric. The binary classification tree method discussed in this article is designed for complex data and does not require assumptions about the predictor variables or about the presence or absence of interactions among the predictor variables. Furthermore, the results give insight into the structures and interactions in the data and are easy to interpret and apply. In preliminary applications of the CART algorithms to data from The University of North Carolina Caries Risk Assessment Study, the method produced prediction rules having sensitivities and specificities that were similar to or slightly better than those associated with logistic and discriminant analyses. The classification trees constructed tended to involve far fewer predictor variables than requir...
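A present-day sketch of fitting a binary classification tree in the spirit of CART is shown below, using scikit-learn rather than the authors' software; the synthetic predictors, the outcome with a built-in interaction, and the tree settings are all illustrative assumptions.

```python
# Fitting a binary classification tree (CART-style) on synthetic data that
# contains an interaction the tree can discover without it being specified.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(4)
X = rng.normal(size=(300, 6))                 # e.g. clinical/behavioural predictors
y = (X[:, 0] * X[:, 1] > 0).astype(int)       # outcome driven by an interaction
tree = DecisionTreeClassifier(max_depth=3, min_samples_leaf=20).fit(X, y)
print(tree.score(X, y))                       # training accuracy of the fitted tree
```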

Journal ArticleDOI
TL;DR: In this paper, a critical assessment of the quadratic discriminant rule (QD) is presented for large, normal training sets, in the sense of minimizing the overall misclassification rate.

Journal ArticleDOI
TL;DR: In this article, a pattern recognition technique based on piecewise linear discriminant analysis (PLDA) is described and algorithms for the calculation and optimization of the individual discriminants are presented.
Abstract: A pattern recognition technique based on piecewise linear discriminant analysis (PLDA) is described. Algorithms for the calculation and optimization of piecewise linear discriminants are presented. A simplex optimization of the individual discriminants is described, and a new method to optimize a piecewise linear discriminant is proposed and shown to produce significantly improved results over the nonoptimized method. This methodology is demonstrated through the use of a set of Fourier transform infrared interferograms collected by a remote sensor.
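The simplex-optimization step can be sketched as follows: the weights of a single linear discriminant are adjusted by Nelder–Mead to reduce the raw misclassification count. The data, the single-discriminant setting, and the starting point are illustrative assumptions, not the authors' implementation.

```python
# Toy simplex (Nelder-Mead) optimization of one linear discriminant's weights,
# with the raw misclassification count as the objective.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(5)
X = np.vstack([rng.normal(-1.0, 1.0, (50, 4)), rng.normal(1.0, 1.0, (50, 4))])
y = np.array([-1] * 50 + [1] * 50)

def misclassifications(w):
    scores = X @ w[:-1] + w[-1]                # linear discriminant score
    return float(np.sum(np.sign(scores) != y))

w0 = np.append(X[y == 1].mean(axis=0) - X[y == -1].mean(axis=0), 0.0)  # crude start
result = minimize(misclassifications, w0, method="Nelder-Mead")
print(int(result.fun), "training patterns misclassified")
```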

Journal ArticleDOI
TL;DR: In this article, Monte Carlo methods were used to compare the stepwise variable selection procedure in discriminant analysis with stepwise procedure using logistic regression and found that in most situations there was little difference in the probability of selecting the related variables.
Abstract: Monte Carlo methods were used to compare the stepwise variable selection procedure in discriminant analysis with the stepwise procedure using logistic regression. In these studies four of the candidate variables were related to group membership and four were not. The data sets were generated from normal, lognormal, and Bernoulli distributions. Several sample sizes, mean vectors, and covariance matrices were used. In most situations there was little difference between stepwise logistic regression and discriminant analysis in the probability of selecting the related variables. In some situations stepwise discriminant analysis gave a greater probability of selecting the related variables.

Journal ArticleDOI
TL;DR: In split-sample tests using the Butterworth international conflict data set, the neural network outperforms both discriminant analysis and ID3 in terms of accuracy; it is roughly comparable in accuracy to multinomial logit.
Abstract: This article uses a neural network to predict international conflict outcomes, comparing its accuracy to that of models constructed using discriminant analysis, logit analysis, and the rule-based ID3 algorithm. While neural networks originally attracted attention because they mirrored the structure of biological nervous systems, they are used increasingly to solve practical problems of prediction and classification. Neural networks may also be important for modeling international behavior because of structural similarities with some organizational processes used to determine foreign policy. In split-sample tests using the Butterworth international conflict data set, the neural network outperforms both discriminant analysis and ID3 in terms of accuracy; it is roughly comparable in accuracy to multinomial logit. The neural network is less successful than discriminant and logit at predicting nonmodal values of the dependent variable. The variables identified as important in the neural network appear to be si...

Journal ArticleDOI
TL;DR: Preliminary results have found the restricted Coulomb energy (RCE) neural network model to have a testing accuracy of 90.6%, which is approximately 10% better than any of the other techniques investigated.

Journal ArticleDOI
P. Falbo
TL;DR: In this article, the problem of make-up accountancy is dealt with, and a solution is proposed through an extension of the variables included in discriminant models, in order to demonstrate that passing from "level" ratios to dynamic variables, such as trend and stability, improves the performance of discriminant models and implicitly addresses problems of "window dressing".
Abstract: Much of the research applying discriminant analysis techniques to give early warning of bankruptcy has used balance sheet information only partially, since the discriminatory variables typically consist of ‘single year’ financial and operating ratios. Lengthening the time scale of such ratios permits a deeper look into the economic reality of firms, since some strategies can then be observed directly. In certain cases this even becomes necessary, for example when ‘window dressing’ operations are carried out on balance sheets; such operations may seriously compromise the performance of the usual discriminant models. The problem of ‘make-up accountancy’ is dealt with here, and a solution is proposed through an extension of the set of variables included in discriminant models. Our aim is to demonstrate that passing from ‘level’ ratios to dynamic variables, such as trend and stability, improves the performance of discriminant models and implicitly addresses problems of ‘window dressing’. After determining a discriminant model of the traditional kind, we integrate these aspects (i.e., trend and stability) into the model in order to show separately their contribution to the improvement in total discriminatory power. With constant attention to the bank credit selection problem, where discriminant models offer a straightforward solution, this methodology constitutes an extension of discriminant analysis techniques in financial applications.
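A minimal sketch of turning a "level" ratio into the dynamic variables discussed above, assuming trend is taken as the least-squares slope over the years and stability as the dispersion around that trend; the ratio series and names are illustrative, not the paper's definitions.

```python
# Deriving a trend (slope) and a stability measure (residual dispersion)
# from a multi-year series of a financial ratio.
import numpy as np

def trend_and_stability(ratio_series):
    t = np.arange(len(ratio_series))
    slope, intercept = np.polyfit(t, ratio_series, 1)   # least-squares trend line
    residuals = ratio_series - (slope * t + intercept)
    return slope, residuals.std(ddof=1)                  # (trend, stability)

roa = np.array([0.08, 0.07, 0.05, 0.02, -0.01])  # e.g. five years of return on assets
print(trend_and_stability(roa))
```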

Proceedings ArticleDOI
14 Apr 1991
TL;DR: The authors extend to continuous speech recognition (CSR) the Alphanet approach to integrating backprop networks and HMM (hidden Markov model)-based isolated word recognition and present the theory of a method for discriminative training of components of a CSR system, using training data in the form of complete sentences.
Abstract: The authors extend to continuous speech recognition (CSR) the Alphanet approach to integrating backprop networks and HMM (hidden Markov model)-based isolated word recognition. They present the theory of a method for discriminative training of components of a CSR system, using training data in the form of complete sentences. The derivatives of the discriminative score with respect to the parameters are expressed in terms of the posterior probabilities of state occupancies (gammas) under two conditions called 'clamped' and 'free' because they correspond to the two conditions in Boltzmann machine training. The authors compute these clamped and free gammas using the forward-backward algorithm twice, and use the differences to drive the adaptation of a preprocessing data transformation, which can be thought of as replacing the linear transformation which yields MFCCs, or which normalizes a grand covariance matrix.
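In outline, and only as a sketch of the standard discriminative (maximum-mutual-information-style) form rather than the paper's exact expressions, the derivative of the discriminative score with respect to a parameter θ of the front-end transform is driven by the difference between clamped (correct-sentence) and free (all-sentence) state-occupancy posteriors:

```latex
\frac{\partial D}{\partial \theta} \;=\;
\sum_{t}\sum_{j}\Bigl(\gamma_{j}^{\text{clamped}}(t)-\gamma_{j}^{\text{free}}(t)\Bigr)\,
\frac{\partial \log b_{j}(\mathbf{o}_{t})}{\partial \theta},
```

where the γ_j(t) are state posteriors obtained from two passes of the forward–backward algorithm and b_j(o_t) are the state output densities.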

Journal ArticleDOI
TL;DR: In this article, the influence of observations on misclassification probability estimates in linear discriminant analysis is examined and a quadratic approximation is developed for this problem using Pregibon's (1981) case weights scheme.
Abstract: SUMMARY The influence of observations upon misclassification probability estimates in linear discriminant analysis is examined. For a single observation, an exact expression is given confirming two surprising results of Campbell (1978). It also shows that the influence of an observation is governed by two quantities: (a) the difference between its linear discriminant score and that for its sample mean, and (b) its atypicality estimate for its own population. This is analogous to linear model results where an observation's influence depends upon its residual and its leverage. However, important differences from the regression situation are noted. Contours of this one-at-a-time influence can be superimposed on the plot introduced by Critchley & Ford (1985). Examining the joint influence of several observations is complicated by the computational burden and by possible masking effects. A quadratic approximation is developed for this problem using Pregibon's (1981) case weights scheme. This approximation has an error of order n^{-3}, where n denotes the assumed common order of the sample sizes. An example is given.

Journal ArticleDOI
TL;DR: Results with three trivariate populations show that logistic discrimination is preferable to other widely-used methods for multiple group classification with non-normal data, and is comparable to classification by multiple linear discrimination with normal data.
Abstract: Methods of multiple group discriminant analysis have not been fully studied with respect to classification into more than two populations when the covariate distributions are normal or non-normal. The present study examines the classification performance of several multiple discrimination methods under a variety of simulated continuous normal and non-normal covariate distributions. The methods include polychotomous logistic regression, multiple group linear discriminant analysis, kernel density estimation, and rank transformations of the data as input into the linear function. The parameters of interest were distance among populations, configuration of population mean vectors (collinear or forming the vertices of a regular simplex), skewness, kurtosis and bimodality. Simulation of the last three parameters was by log-normal, sinh^{-1} normal and a two-component mixture of normal distributions, respectively. Results with three trivariate populations show that for all distributions, logistic discrimination classifies close to the optimal under Neyman-Pearson allocation. These results suggest that logistic discrimination is preferable to other widely-used methods for multiple group classification with non-normal data, and is comparable to classification by multiple linear discrimination with normal data.

Journal ArticleDOI
TL;DR: A genetic algorithm that determines linear discriminant functions is studied that routinely found the optimal solution while taking significantly less time than exact algorithms which must both find and prove optimality.
Abstract: This paper studies a genetic algorithm that determines linear discriminant functions. The objective is to minimize the number of misclassifications and then to minimize the number of used attributes. The algorithm produces results that are significantly better than Fisher's method for minimizing the probability of misclassification, but not significantly different from those of algorithms that exactly minimize the number of misclassifications. The algorithm has the added benefit that the solutions obtained are more parsimonious than those of the exact algorithms, since two objectives could be considered simultaneously. In our empirical studies, the algorithm routinely found the optimal solution while taking significantly less time than exact algorithms, which must both find and prove optimality.
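A toy genetic-algorithm sketch in the spirit of the approach above is given below: candidate weight vectors of a linear discriminant are ranked first by misclassification count and then by the number of non-zero (used) attributes. The encoding, selection scheme, and parameters are illustrative assumptions, not the paper's implementation.

```python
# Toy GA evolving linear-discriminant weights with a lexicographic fitness:
# misclassification count first, number of used attributes second.
import numpy as np

rng = np.random.default_rng(6)
X = np.vstack([rng.normal(-1.0, 1.0, (60, 5)), rng.normal(1.0, 1.0, (60, 5))])
y = np.array([-1] * 60 + [1] * 60)

def fitness(w):
    errors = int(np.sum(np.sign(X @ w[:-1] + w[-1]) != y))
    used = int(np.count_nonzero(np.abs(w[:-1]) > 1e-6))
    return (errors, used)                       # accuracy first, then parsimony

pop = rng.normal(size=(40, 6))
for generation in range(200):
    ranked = sorted(pop, key=fitness)
    parents = np.array(ranked[:20])                         # truncation selection
    children = parents[rng.integers(0, 20, size=20)].copy()
    children += rng.normal(scale=0.3, size=children.shape)  # mutation
    children[:, :-1] *= rng.random((20, 5)) > 0.2           # occasionally drop an attribute
    pop = np.vstack([parents, children])

best = min(pop, key=fitness)
print(fitness(best))
```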

Journal Article
TL;DR: This chapter introduces two techniques for handling a categorical dependent variable, logistic regression and discriminant analysis, and explores the nature of the weighted composite variable in logistic regression with a dichotomous dependent variable and the main statistical tools that accompany it.
Abstract: In the previous chapter, multiple regression was presented as a flexible technique for analyzing the relationships between multiple independent variables and a single dependent variable. Much of its flexibility is due to the way in which all sorts of independent variables can be accommodated. However, this flexibility stops short of allowing a dependent variable consisting of categories. How then can the analyst deal with data representing multiple independent variables and a categorical dependent variable? How can independent variables be used to account for differences in categories? This chapter introduces two techniques for accomplishing this aim: logistic regression and discriminant analysis. Even though the two techniques often reveal the same patterns in a set of data, they do so in different ways and require different assumptions. As the name implies, logistic regression draws on much of the same logic as ordinary least squares regression, so it is helpful to discuss it first, immediately after Chapter 4. Discriminant analysis sits alongside multivariate analysis of variance, the topic of Chapter 6, so discussing it second will help to build a bridge across the present chapter and the next. That said, the multivariate strategy of forming a composite of weighted independent variables remains central, despite differences in the ways in which it is accomplished. In Subsection 5.1.1 we explore the nature of the weighted composite variable in logistic regression with a dichotomous dependent variable and introduce the main statistical tools that accompany it. Subsection 5.1.2 shows two-group, or "binary," logistic regression in action, first with further analyses

Proceedings ArticleDOI
14 Apr 1991
TL;DR: The authors suggest a linear discriminant function to compute the distance score instead of a conventional average distance, which achieved a 78.1% accuracy, compared to 67.6% with the traditional average method.
Abstract: The authors suggest a linear discriminant function to compute the distance score instead of a conventional average distance. Several discriminative algorithms are proposed to learn the discriminant function. These include one heuristic method, two methods based on the error propagation algorithm, and one method based on the generalized probabilistic descent (GPD) algorithm. The authors study these methods in a speaker-independent speech recognition task involving utterances of the highly confusable English E-set. The results show that the best performance is obtained by using the GPD method, which achieved a 78.1% accuracy, compared to 67.6% with the traditional average method.

Journal ArticleDOI
Mehmet Celenk
01 Sep 1991
TL;DR: In this paper, a new clustering technique is described for segmenting the colour images of natural scenes, which detects image clusters in some linear decision volumes of the (L*, a*, b*) uniform colour space.
Abstract: A new clustering technique is described for segmenting the colour images of natural scenes. The proposed method detects image clusters in some linear decision volumes of the (L*, a*, b*) uniform colour space. The detected clusters are projected onto the line of the Fisher discriminant for 1-D thresholding. This permits the use of all the property values of colour clusters and, in turn, produces better segmentation results than the colour component thresholding schemes. The method is also faster than the existing colour clustering techniques because clusters are specified in linear decision volumes using only 1-D histograms.

Journal ArticleDOI
TL;DR: A new feature extraction method based on the modified "plus e, take away f" algorithm is proposed, and it is shown from experimental results that the proposed method is superior to the ODV method in terms of the error probability.

Book ChapterDOI
01 Jan 1991
TL;DR: The techniques of multivariate analysis are well suited for studying structure-biodegradability relationships when objects are chemical structures and measurements are molecular descriptors and rate of aerobic biodegradation.
Abstract: Multivariate analysis, in the strictest sense, is the study of systems of correlated random variables or random samples from such systems (Gifi 1990). However, practically, multivariate methods deal with the problem of linear representation of relationships among a set of measurements on a number of objects. When objects are chemical structures and measurements are molecular descriptors and the rate of aerobic biodegradation, for instance, the techniques of multivariate analysis are well suited for studying structure-biodegradability relationships.

Journal ArticleDOI
TL;DR: In this paper, the authors show how the logistic model can be adapted to make it robust against outlying observations from the populations, where the model is then used on data collected from the teeth of rats.
Abstract: SUMMARY Logistic discrimination is a well-established method for allocating observations to one of two or more populations. In this paper we show how the logistic model can be adapted to make it robust against outlying observations from the populations. The new model is successfully tested using simulated data against ordinary logistic discrimination and also against logistic discrimination where parameters are fitted using resistant methods. The model is then used on data collected from the teeth of rats. Logistic discrimination is a partially parametric method for discrimination applicable in many situations and under several sampling schemes. Notable references to the method are Day & Kerridge (1967), Anderson (1972, 1979, 1982) and Anderson & Blair (1982). The appeal of the logistic method for discrimination is that only the ratio of likelihoods has to be modelled and not the full distributional form of the underlying variables. Press & Wilson (1978) and Byth & McLachlan (1980) show that in many circumstances logistic discrimination is preferable to the usual linear discrimination based on either normal populations with equal covariance matrices or Fisher's linear discriminant. For the two-group case, suppose x is a vector of observations, H1 and H2 are the two groups, and a is a parameter vector. Then the log-likelihood ratio is modelled as
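The abstract breaks off before the model itself; in the standard formulation of logistic discrimination (our rendering, not a quotation of the paper), the log-likelihood ratio is taken to be linear in x:

```latex
\log\frac{L_{1}(\mathbf{x})}{L_{2}(\mathbf{x})} \;=\; \beta_{0} + \boldsymbol{\beta}^{\top}\mathbf{x},
```

so that only this ratio, and not the full distribution of x within each group, needs to be specified.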

Journal ArticleDOI
TL;DR: In this article, Fisher's discriminant analysis (FDA) is used to obtain a prediction model for dichotomous classifications on the basis of two or more independent variables, where values on independent variables are combined into a single predicted value (Y*) that is compared against a cutpoint and direction in order to make classifications.
Abstract: Fisher's discriminant analysis (FDA) is often used to obtain a prediction model for dichotomous classifications on the basis of two or more independent variables. FDA provides an equation whereby values on independent variables are combined into a single predicted value (Y*) that is compared against a cutpoint and direction in order to make classifications. Theoretically, univariate optimal discriminant analysis employed on these Y* will maximize training classification accuracy. This methodology is illustrated using three examples.

Proceedings ArticleDOI
08 Jul 1991
TL;DR: It was found that for DARPA Data Set I, no single classifier is superior to the others; however, a hybrid set of classifiers yields the best performance.
Abstract: Discusses the development of a hybrid classifier of SDSs (short-duration signals) in the underwater acoustic environment. The classifier is envisioned to include both neural-based and non-neural networks. The authors present a comparison of performance of two neural-based and two non-neural-based classifiers for four signal extractors. It was found that for DARPA Data Set I, no single classifier is superior to the others; however, a hybrid set of classifiers yields the best performance. The neural network classifiers are based on RBFs (radial basis functions) and the MLP (multilayer perceptron) trained with backpropagation. The classical classification techniques of k-nearest neighbor and the Fisher linear discriminant are used for comparison. The signal extraction methods used include two versions of wavelets, autoregressive spectral coefficients and a linear combination of a wavelet with an autoregressive representation.