
Showing papers on "Linear discriminant analysis" published in 1996


Book ChapterDOI
15 Apr 1996
TL;DR: A face recognition algorithm which is insensitive to gross variation in lighting direction and facial expression is developed and the proposed “Fisherface” method has error rates that are significantly lower than those of the Eigenface technique when tested on the same database.
Abstract: We develop a face recognition algorithm which is insensitive to gross variation in lighting direction and facial expression. Taking a pattern classification approach, we consider each pixel in an image as a coordinate in a high-dimensional space. We take advantage of the observation that the images of a particular face under varying illumination direction lie in a 3-D linear subspace of the high dimensional feature space — if the face is a Lambertian surface without self-shadowing. However, since faces are not truly Lambertian surfaces and do indeed produce self-shadowing, images will deviate from this linear subspace. Rather than explicitly modeling this deviation, we project the image into a subspace in a manner which discounts those regions of the face with large deviation. Our projection method is based on Fisher's Linear Discriminant and produces well separated classes in a low-dimensional subspace even under severe variation in lighting and facial expressions. The Eigenface technique, another method based on linearly projecting the image space to a low dimensional subspace, has similar computational requirements. Yet, extensive experimental results demonstrate that the proposed “Fisherface” method has error rates that are significantly lower than those of the Eigenface technique when tested on the same database.

2,428 citations
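
The projection step described in the abstract above can be illustrated with a minimal NumPy sketch of Fisher's linear discriminant. This is not the authors' implementation: the function name `fisher_directions` is hypothetical, and a pseudo-inverse of the within-class scatter stands in for whatever the paper does to handle a singular scatter matrix in the high-dimensional pixel space.

```python
import numpy as np

def fisher_directions(X, y, n_components):
    """Return discriminant directions as columns of a (d, n_components) array.

    X is (n_samples, d); y holds one class label per row of X.
    """
    classes = np.unique(y)
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    S_w = np.zeros((d, d))  # within-class scatter
    S_b = np.zeros((d, d))  # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mean_c = Xc.mean(axis=0)
        centered = Xc - mean_c
        S_w += centered.T @ centered
        diff = (mean_c - mean_all)[:, None]
        S_b += len(Xc) * (diff @ diff.T)
    # Directions maximizing between-class over within-class scatter are the
    # leading eigenvectors of pinv(S_w) @ S_b (assumption: pinv, not PCA first).
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(S_w) @ S_b)
    order = np.argsort(eigvals.real)[::-1]
    return eigvecs[:, order[:n_components]].real
```

Projecting with `X @ fisher_directions(X, y, m)` gives the low-dimensional features; with C classes, at most C - 1 directions carry between-class information.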


Journal ArticleDOI
TL;DR: This paper describes the automatic selection of features from an image training set using the theories of multidimensional discriminant analysis and the associated optimal linear projection, and demonstrates the effectiveness of these most discriminating features for view-based class retrieval from a large database of widely varying real-world objects.
Abstract: This paper describes the automatic selection of features from an image training set using the theories of multidimensional discriminant analysis and the associated optimal linear projection. We demonstrate the effectiveness of these most discriminating features for view-based class retrieval from a large database of widely varying real-world objects presented as "well-framed" views, and compare it with that of the principal component analysis.

1,713 citations


Journal ArticleDOI
TL;DR: A locally adaptive form of nearest neighbor classification is proposed to try to finesse this curse of dimensionality, and a method for global dimension reduction is proposed, that combines local dimension information.
Abstract: Nearest neighbour classification expects the class conditional probabilities to be locally constant, and suffers from bias in high dimensions. We propose a locally adaptive form of nearest neighbour classification to try to ameliorate this curse of dimensionality. We use a local linear discriminant analysis to estimate an effective metric for computing neighbourhoods. We determine the local decision boundaries from centroid information, and then shrink neighbourhoods in directions orthogonal to these local decision boundaries, and elongate them parallel to the boundaries. Thereafter, any neighbourhood-based classifier can be employed, using the modified neighbourhoods. The posterior probabilities tend to be more homogeneous in the modified neighbourhoods. We also propose a method for global dimension reduction, that combines local dimension information. In a number of examples, the methods demonstrate the potential for substantial improvements over nearest neighbour classification.

908 citations


Journal ArticleDOI
TL;DR: This paper fits Gaussian mixtures to each class to facilitate effective classification in non-normal settings, especially when the classes are clustered.
Abstract: Fisher-Rao linear discriminant analysis (LDA) is a valuable tool for multigroup classification. LDA is equivalent to maximum likelihood classification assuming Gaussian distributions for each class. In this paper, we fit Gaussian mixtures to each class to facilitate effective classification in non-normal settings, especially when the classes are clustered. Low dimensional views are an important by-product of LDA; our new techniques inherit this feature. We can control the within-class spread of the subclass centres relative to the between-class spread. Our technique for fitting these models permits a natural blend with nonparametric versions of LDA.

791 citations
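
The equivalence stated in the abstract above can be written out in the standard textbook form, which is not taken from the paper itself. With class means \mu_k, a covariance \Sigma shared by all classes, and priors \pi_k, the Gaussian classifier reduces to linear discriminant functions; the log-prior term drops out when the priors are equal. The second line sketches a mixture density for class k with R_k subclasses, where sharing \Sigma across subclasses is an assumption made here for illustration.

```latex
\delta_k(x) = x^{\top}\Sigma^{-1}\mu_k - \tfrac{1}{2}\,\mu_k^{\top}\Sigma^{-1}\mu_k + \log \pi_k,
\qquad
\hat{y}(x) = \arg\max_k \delta_k(x)

p(x \mid k) = \sum_{r=1}^{R_k} \pi_{kr}\,\phi(x;\, \mu_{kr}, \Sigma)
```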


Journal ArticleDOI
TL;DR: In this article, the authors consider the problem of combining a collection of general regression fit vectors to obtain a better predictive model and develop a general framework for this problem and examine a cross-validation-based proposal called "model mix" or "stacking" in this context.
Abstract: We consider the problem of how to combine a collection of general regression fit vectors to obtain a better predictive model. The individual fits may be from subset linear regression, ridge regression, or something more complex like a neural network. We develop a general framework for this problem and examine a cross-validation-based proposal called “model mix” or “stacking” in this context. We also derive combination methods based on the bootstrap and analytic methods and compare them in examples. Finally, we apply these ideas to classification problems where the estimated combination weights can yield insight into the structure of the problem.

318 citations
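
The cross-validation-based stacking idea in the abstract above can be sketched roughly as follows. The component models (ridge regressions with different penalties), the fold scheme, and the unconstrained least-squares combination are all assumptions chosen for brevity; the function names are hypothetical and the paper's own combination rules may differ.

```python
import numpy as np

def ridge_fit_predict(X_train, y_train, X_test, alpha):
    """Fit ridge regression (no intercept) and predict on X_test."""
    d = X_train.shape[1]
    beta = np.linalg.solve(X_train.T @ X_train + alpha * np.eye(d),
                           X_train.T @ y_train)
    return X_test @ beta

def stacking_weights(X, y, alphas, n_folds=5):
    """Return combination weights and the matrix of out-of-fold predictions."""
    n = len(y)
    folds = np.array_split(np.arange(n), n_folds)
    Z = np.zeros((n, len(alphas)))  # one column of held-out fits per model
    for test_idx in folds:
        train_mask = np.ones(n, dtype=bool)
        train_mask[test_idx] = False
        for j, alpha in enumerate(alphas):
            Z[test_idx, j] = ridge_fit_predict(
                X[train_mask], y[train_mask], X[test_idx], alpha)
    # Regress the response on the held-out fits to get the weights.
    weights, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return weights, Z
```

Because each column of `Z` is built only from held-out predictions, the weights are not rewarded for overfitting by any single component model.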


Journal ArticleDOI
TL;DR: Re-examination of single channel EEG data obtained from normal human subjects suggests that the previous indication of low-dimensional structure was an artifact of autocorrelation in the oversampled signal, and discriminatory analysis indicates that the correlation dimension is a poor discriminator for distinguishing between EEGs recorded at rest and during periods of cognitive activity.

302 citations


Journal ArticleDOI
TL;DR: In this article, an adjusted version of the Euclidean distance metric is proposed that incorporates knowledge of class separation contained in the data; the resulting k-NN method is applied to a real data set, and the selection of optimal values of the parameters k and D is discussed.
Abstract: The last 30 years have seen the development of credit scoring techniques for assessing the creditworthiness of consumer loan applicants. Traditional credit scoring methodology has involved the use of techniques such as discriminant analysis, linear or logistic regression, linear programming and decision trees. In this paper we look at the application of the k-nearest-neighbour (k-NN) method, a standard technique in pattern recognition and nonparametric statistics, to the credit scoring problem. We propose an adjusted version of the Euclidean distance metric which attempts to incorporate knowledge of class separation contained in the data. Our k-NN methodology is applied to a real data set and we discuss the selection of optimal values of the parameters k and D included in the method. To assess the potential of the method we make comparisons with linear and logistic regression and decision trees and graphs. We end by discussing a practical implementation of the proposed k-NN classifier.

299 citations
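
The abstract above does not give the form of the adjusted metric, so the following is only a sketch under an assumed form: squared Euclidean distance plus D times the squared difference along a class-separating direction `w` (for example, a Fisher discriminant direction). The function names and this exact formula are illustrative assumptions, not the paper's stated method.

```python
import numpy as np

def adjusted_distances(X_train, x, w, D):
    """Distances from x to each training row under the assumed metric
    sqrt(|diff|^2 + D * (w . diff)^2); D = 0 recovers Euclidean distance."""
    diff = X_train - x
    return np.sqrt(np.sum(diff ** 2, axis=1) + D * (diff @ w) ** 2)

def knn_predict(X_train, y_train, x, w, k, D):
    """Majority vote among the k nearest neighbours (labels are ints >= 0)."""
    dist = adjusted_distances(X_train, x, w, D)
    nearest = np.argsort(dist)[:k]
    votes = np.bincount(y_train[nearest])
    return int(np.argmax(votes))
```

Larger values of D stretch distances along `w`, so neighbours that differ mainly in the class-separating direction count as farther away.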


Journal ArticleDOI
01 Jul 1996
TL;DR: A complete Decision Support System for financial diagnosis based on Self Organizing Feature Maps (SOFM) provides a complete analysis which goes beyond that of the traditional models based on the construction of a solvency indicator also known as Z score, without renouncing simplicity for the final decision maker.
Abstract: A complete Decision Support System (DSS) for financial diagnosis based on Self Organizing Feature Maps (SOFM) is described. This is a neural network model which, on the basis of the information contained in a multidimensional space — in the case exposed, financial ratios — generates a space of lesser dimensions. In this way, similar input patterns — in the case exposed, companies — are represented close to one another on a map. The neural network has been complemented and compared with multivariate statistical models such as Linear Discriminant Analysis (LDA), as well as with neural models such as the Multilayer Perceptron (MLP). As the principal advantage, this DSS provides a complete analysis which goes beyond that of the traditional models based on the construction of a solvency indicator also known as Z score, without renouncing simplicity for the final decision maker.

246 citations


Journal ArticleDOI
TL;DR: The DWCE algorithm is introduced and results of a preliminary study based on 25 digitized mammograms with biopsy proven masses are presented, which compares morphological feature classification based on sequential thresholding, linear discriminant analysis, and neural network classifiers for reduction of false-positive detections.
Abstract: Presents a novel approach for segmentation of suspicious mass regions in digitized mammograms using a new adaptive density-weighted contrast enhancement (DWCE) filter in conjunction with Laplacian-Gaussian (LG) edge detection. The DWCE enhances structures within the digitized mammogram so that a simple edge detection algorithm can be used to define the boundaries of the objects. Once the object boundaries are known, morphological features are extracted and used by a classification algorithm to differentiate regions within the image. This paper introduces the DWCE algorithm and presents results of a preliminary study based on 25 digitized mammograms with biopsy proven masses. It also compares morphological feature classification based on sequential thresholding, linear discriminant analysis, and neural network classifiers for reduction of false-positive detections.

236 citations


Journal ArticleDOI
TL;DR: Partial least squares (PLS) is proposed as a valuable alternative to PCA for compressing high-dimensional data before performing linear discriminant analysis (LDA).

229 citations


Journal ArticleDOI
TL;DR: An experiment to help identify the relevant higher order features of texture perceived by humans using the techniques of hierarchical cluster analysis, non-parametric multidimensional scaling (MDS), Classification and Regression Tree Analysis (CART), discriminant analysis, and principal component analysis.

Journal ArticleDOI
TL;DR: This article proposes an alternative approach to RDA of discriminant analysis in the Gaussian framework, called EDDA, that is based on the reparameterization of the covariance matrix of a group Gk in terms of its eigenvalue decomposition.
Abstract: Friedman proposed a regularization technique (RDA) of discriminant analysis in the Gaussian framework. RDA uses two regularization parameters to design an intermediate classifier between the linear, the quadratic, and the nearest-means classifiers. In this article we propose an alternative approach, called EDDA, that is based on the reparameterization of the covariance matrix $\Sigma_k$ of a group $G_k$ in terms of its eigenvalue decomposition $\Sigma_k = \lambda_k D_k A_k D_k'$, where $\lambda_k$ specifies the volume of density contours of $G_k$, the diagonal matrix of eigenvalues $A_k$ specifies its shape, and the matrix of eigenvectors $D_k$ specifies its orientation. Variations on constraints concerning volumes, shapes, and orientations ($\lambda_k$, $A_k$, and $D_k$) lead to 14 discrimination models of interest. For each model, we derived the normal theory maximum likelihood parameter estimates. Our approach consists of selecting a model by minimizing the sample-based estimate of future misclassification risk by cross-validation. Numerical experiments on simulated and rea...
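
The volume, shape, and orientation decomposition described in the abstract above can be made concrete with a short NumPy sketch. The normalization det(A) = 1, which makes the volume equal to det(Sigma)^(1/d), is a common convention assumed here rather than something the abstract states; the function name is hypothetical.

```python
import numpy as np

def volume_shape_orientation(sigma):
    """Split a positive-definite covariance into (lam, D, A) with
    sigma = lam * D @ A @ D.T, D orthogonal, A diagonal, det(A) = 1."""
    eigvals, D = np.linalg.eigh(sigma)   # eigenvectors are the columns of D
    d = sigma.shape[0]
    lam = np.prod(eigvals) ** (1.0 / d)  # volume: det(sigma) ** (1 / d)
    A = np.diag(eigvals / lam)           # shape: normalized eigenvalues
    return lam, D, A
```

Constraining `lam`, `A`, or `D` to be equal across groups, or leaving them free, is what generates the family of models the abstract refers to.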

Journal ArticleDOI
TL;DR: It is concluded that in many cases, LDA or QDA should be recommended for practical use, depending on the characteristics of the data, but in those cases where even small gains in classification quality are important, the application of RDA might be useful.

Journal ArticleDOI
TL;DR: Three alternative techniques that can be used to empirically select predictors for neural networks in failure prediction, based on linear discriminant analysis, logit analysis and genetic algorithms, are focused on.
Abstract: We focus on three alternative techniques (linear discriminant analysis, logit analysis and genetic algorithms) that can be used to empirically select predictors for neural networks in failure prediction. The selected techniques all have different assumptions about the relationships between the independent variables. Linear discriminant analysis is based on a linear combination of independent variables, logit analysis uses the logistic cumulative function, and genetic algorithms are a global search procedure based on the mechanics of natural selection and natural genetics. In an empirical test all three selection methods chose different bankruptcy prediction variables. The best prediction results were achieved when using genetic algorithms.

16 Sep 1996
TL;DR: In this paper, three alternative techniques that can be used to empirically select predictors for failure prediction purposes are compared; each makes different assumptions about the relationships between the independent variables.
Abstract: We focus on three alternative techniques that can be used to empirically select predictors for failure prediction purposes. The selected techniques all have different assumptions about the relationships between the independent variables. Linear discriminant analysis is based on a linear combination of independent variables, logit analysis uses the logistic cumulative probability function, and genetic algorithms are a global search procedure based on the mechanics of natural selection and natural genetics. Our aim is to study whether these essential differences between the methods (1) affect the empirical selection of independent variables for the model and (2) lead to significant differences in failure prediction accuracy.

Journal ArticleDOI
TL;DR: This study indicates that a GA can provide versatility in the design of linear or nonlinear classifiers without a trade-off in the effectiveness of the selected features.
Abstract: We investigated a new approach to feature selection, and demonstrated its application in the task of differentiating regions of interest (ROIs) on mammograms as either mass or normal tissue. The classifier included a genetic algorithm (GA) for image feature selection, and a linear discriminant classifier or a backpropagation neural network (BPN) for formulation of the classifier outputs. The GA-based feature selection was guided by higher probabilities of survival for fitter combinations of features, where the fitness measure was the area A_z under the receiver operating characteristic (ROC) curve. We studied the effect of different GA parameters on classification accuracy, and compared the results to those obtained with stepwise feature selection. The data set used in this study consisted of 168 ROIs containing biopsy-proven masses and 504 ROIs containing normal tissue. From each ROI, a total of 587 features were extracted, of which 572 were texture features and 15 were morphological features. The GA was trained and tested with several different partitionings of the ROIs into training and testing sets. With the best combination of the GA parameters, the average test A_z value using a linear discriminant classifier reached 0.90, as compared to 0.89 for stepwise feature selection. Test A_z values with a BPN classifier and a more limited feature pool were 0.90 with GA-based feature selection, and 0.89 for stepwise feature selection. The use of a GA in tailoring classifiers with specific design characteristics was also discussed. This study indicates that a GA can provide versatility in the design of linear or nonlinear classifiers without a trade-off in the effectiveness of the selected features.
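
The fitness measure in the abstract above is an area under the ROC curve. As a rough stand-in, the sketch below computes the empirical (rank-based) area from classifier scores; this is an assumption for illustration, since A_z values in the ROC literature are often obtained from a fitted curve rather than from raw pairwise comparisons. The function name is hypothetical.

```python
import numpy as np

def empirical_auc(scores_pos, scores_neg):
    """Fraction of (positive, negative) score pairs ranked correctly,
    counting ties as half; equals the area under the empirical ROC curve."""
    pos = np.asarray(scores_pos, dtype=float)[:, None]
    neg = np.asarray(scores_neg, dtype=float)[None, :]
    wins = (pos > neg).sum() + 0.5 * (pos == neg).sum()
    return float(wins / (pos.size * neg.size))
```

In a GA-based selection loop, a value like this would be computed on the outputs of the classifier trained with each candidate feature subset, and subsets with higher values would be more likely to survive.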

Journal ArticleDOI
Hongkyu Jo, Ingoo Han
TL;DR: In this article, a new structured model with multiple stages was proposed for bankruptcy prediction, which consists of four phases (training, test, adjustment, and prediction), and three types of input data.
Abstract: Recently, how to integrate classification models to increase prediction performance has become an issue of interest. This paper suggests a new structured model with multiple stages. It consists of four phases (training, test, adjustment, and prediction), and three types of input data (training, testing, and generalization). The integrated model is applied to bankruptcy prediction. A statistical model (discriminant analysis) and two artificial intelligence models (neural network and case-based forecasting) are used in this study. The integration approach produces higher prediction accuracy than individual models.

Journal ArticleDOI
TL;DR: The weighted-Parzen-window classifier requires less computation and storage than the full Parzen-window classifier, and experimental results showed that significant savings could be achieved with only minimal, if any, error rate degradation for synthetic and real data sets.
Abstract: This paper introduces the weighted-Parzen-window classifier. The proposed technique uses a clustering procedure to find a set of reference vectors and weights which are used to approximate the Parzen-window (kernel-estimator) classifier. The weighted-Parzen-window classifier requires less computation and storage than the full Parzen-window classifier. Experimental results showed that significant savings could be achieved with only minimal, if any, error rate degradation for synthetic and real data sets.
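
The idea of approximating a Parzen-window classifier with a small set of weighted reference vectors can be sketched as below. The Gaussian kernel, the shared bandwidth `h`, and the way reference vectors and weights are supplied are assumptions; the clustering procedure that produces them in the paper is not reproduced, and the function names are hypothetical.

```python
import numpy as np

def weighted_parzen_score(x, refs, weights, h):
    """Unnormalized weighted Gaussian kernel estimate at x.
    The normalizing constant is omitted because it is the same for all
    classes when they share the bandwidth h and the dimension."""
    sq = np.sum((refs - x) ** 2, axis=1)
    return float(np.sum(weights * np.exp(-sq / (2.0 * h * h))))

def classify(x, class_models, h):
    """class_models maps label -> (reference vectors, weights)."""
    scores = {label: weighted_parzen_score(x, refs, w, h)
              for label, (refs, w) in class_models.items()}
    return max(scores, key=scores.get)
```

With one reference vector per training sample and equal weights this reduces to the full Parzen-window rule, while fewer reference vectors mean fewer kernel evaluations per query.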

Book
12 Dec 1996
TL;DR: The book covers graphical methods of displaying data; one-way designs; factorial designs; repeated measure designs; simple linear regression and multiple regression analysis; log-linear models and logistic regression; the generalised linear model; distribution-free, computer-intensive models; multivariate analysis I - principal components analysis and exploratory factor analysis; multivariate analysis II - confirmatory factor analysis and covariance structure modelling; multivariate analysis III - cluster analysis, discriminant analysis, and multidimensional scaling; and the assessment of reliability.
Abstract: Data, models, and a little history; graphical methods of displaying data; one-way designs; factorial designs; repeated measure designs; simple linear regression and multiple regression analysis; log-linear models and logistic regression; the generalised linear model; distribution-free, computer-intensive models; multivariate analysis I - principal components analysis and exploratory factor analysis; multivariate analysis II - confirmatory factor analysis and covariance structure modelling; multivariate analysis III - cluster analysis, discriminant analysis, and multidimensional scaling; the assessment of reliability.

Journal ArticleDOI
TL;DR: The results show that recognition rates of about 95% are possible using average spectra; the differing recognition rates by species and fish size are discussed and the two identification methods compared.
Abstract: The paper reports the results of species-recognition-rate measurements on caged aggregations of mackerel, horse mackerel, saithe, haddock, and two sizes of cod. Data on the acoustic backscattering coefficients were collected in eight contiguous bandwidth intervals covering the frequency band between 27 and 54 kHz. The measurements were made during two to six periods of 24 h for each aggregation of fish. Replicate experiments were carried out for mackerel, horse mackerel, and two sizes of cod. The data were processed to give average frequency spectra. The number of independent observations used to establish the mean was varied to examine the species-recognition dependence on the number of independent observations. The mean spectra were analysed using two recognition methods: neural network and discriminant analysis. A neural network was trained on subsets of the data and recognition rates established for the different numbers of samples used to calculate the mean spectra. Classical discriminant analysis was applied using the same data sets. The results of the two identification methods are presented and show that recognition rates of about 95% are possible using average spectra. The differing recognition rates by species and fish sizes are discussed and the two identification methods compared. Implications for the future development of these methods are considered. © 1996 International Council for the Exploration of the Sea

Journal ArticleDOI
TL;DR: Out of 634 adult females of five caprine breeds of Andalusia, it is determined that the differences between breeds, with regard to production level and degree of wildness, were indicated by head length and shin circumference and rump length and the most discriminant variables in groups based on productive ability and cephalic profile.

Journal ArticleDOI
TL;DR: In this article, a multilayered back-propagation neural network was used for liver lesion classification using B-scan ultrasound images for normal, hemangioma and malignant livers.
Abstract: Ultrasound imaging is a powerful tool for characterizing the state of soft tissues; however, in some cases where only subtle differences in images are seen, as in certain liver lesions such as hemangioma and malignancy, existing B-scan methods are inadequate. More detailed analyses of image texture parameters along with artificial neural networks can be utilized to enhance differentiation. From B-scan ultrasound images, 11 texture parameters comprising first, second and run length statistics have been obtained for normal, hemangioma and malignant livers. Tissue characterization was then performed using a multilayered backpropagation neural network. The results for 113 cases have been compared with a classification based on discriminant analysis. For linear discriminant analysis, classification accuracy is 79.6% and with neural networks the accuracy is 100%. The present results show that neural networks classify better than discriminant analysis, demonstrating much potential for clinical application.

Proceedings ArticleDOI
07 May 1996
TL;DR: This paper describes a novel technique specifically developed for gender identification which combines acoustic analysis and pitch which was tested on three British English databases giving less than 1% identification error rate with two seconds of speech.
Abstract: This paper describes a novel technique specifically developed for gender identification which combines acoustic analysis and pitch. Two sets of hidden Markov models, male and female, are matched to the speech using the Viterbi algorithm and the most likely sequence of models with corresponding likelihood scores are produced. Linear discriminant analysis is used to normalise the models and reduce bias towards a particular gender. An enhanced version of the pitch estimation algorithm used for IMBE speech coding is used to give an average pitch estimate for the speaker. The information provided by the acoustic analysis and pitch estimation are combined using a linear classifier to identify the gender of the speech. The system was tested on three British English databases giving less than 1% identification error rate with two seconds of speech. Further tests without optimisation on eleven languages of the OGI database gave error rates less than 5.2% and an average of 2.0%.

Journal ArticleDOI
TL;DR: A selection of classifiers and a selection of dimensionality-reducing techniques are applied to the discrimination of seagrass spectral data; the results indicate a promising future for wavelets in discriminant analysis and for the recently introduced flexible and penalized discriminant analysis.

Journal ArticleDOI
Christof Ebert
TL;DR: The hypothesis whether fuzzy classification applied to criticality prediction provides better results than other classification techniques that have been introduced in this area is investigated.
Abstract: Managing software development and maintenance projects requires predictions about components of the software system that are likely to have a high error rate or that need high development effort. The value of any classification is determined by the accuracy and cost of such predictions. The paper investigates the hypothesis whether fuzzy classification applied to criticality prediction provides better results than other classification techniques that have been introduced in this area. Five techniques for identifying error-prone software components are compared, namely Pareto classification, crisp classification trees, factor-based discriminant analysis, neural networks, and fuzzy classification. The comparison is illustrated with experimental results from the development of industrial real-time projects. A module quality model — with respect to changes — provides both quality of fit (according to past data) and predictive accuracy (according to ongoing projects). Fuzzy classification showed best results in terms of overall predictive accuracy.

Journal ArticleDOI
TL;DR: In this paper, a methodology for the automatic recognition of weld defects, detected by a P-scan ultrasonic system, is developed in two stages. In the first stage, the shape parameters defining the pulse-echo envelope reflected from a generic flaw, defined in the time domain, are selected by Fisher linear discriminant analysis; in the second stage, classification is carried out by a three-layered neural network trained with the backpropagation rule, whose input values are the parameters selected by the Fisher analysis.
Abstract: A methodology for the automatic recognition of weld defects, detected by a P-scan ultrasonic system, has been developed in two stages in the present work. In the first stage, a selection of the shape parameters defining the pulse-echo envelope reflected from a generic flaw, and defined in the time domain, is performed by Fisher linear discriminant analysis. In the second stage the classification is carried out by a three-layered neural network trained with the backpropagation rule, where the input values are the parameters selected by the Fisher analysis. With regard to the neural network learning process, 135 real weld defects have been considered. The defects, distributed among the classes of cracks, slags of inclusion and porosity, had been previously characterized by X-ray inspection. The results obtained confirm the effectiveness of the approach in preserving the discriminant information needed for characterization by an iterative use of Fisher analysis, and in increasing the generalization properties of the layered network by an interpretation of the knowledge embedded in the generated connections and weights. The required computation time allows in-process application.

Proceedings ArticleDOI
07 May 1996
TL;DR: The discriminatory power of different segments of a human face is studied, and an efficient projection-based feature extraction and classification scheme for recognition of human faces is proposed.
Abstract: The discriminatory power of different segments of a human face is studied and a new scheme for face recognition is proposed. We first focus on the linear discriminant analysis (LDA) of human faces in spatial and wavelet domains, which enables us to objectively evaluate the significance of visual information in different parts of the face for identifying the person. The results of this study can be compared with subjective psychovisual findings. The LDA of faces also provides us with a small set of features that carry the most relevant information for face recognition. The features are obtained through the eigenvector analysis of scatter matrices with the objective of maximizing between-class variations and minimizing within-class variations. The result is an efficient projection-based feature extraction and classification scheme for recognition of human faces. For a midsize database of faces excellent classification accuracy is achieved with only four features.

Proceedings Article
18 Jun 1996
TL;DR: In this article, a composite rejector is proposed to eliminate a large fraction of the candidate classes or inputs, which allows a recognition algorithm to dedicate its efforts to a much smaller number of possibilities.
Abstract: The efficiency of pattern recognition is particularly crucial in two scenarios; whenever there are a large number of classes to discriminate, and, whenever recognition must be performed a large number of times. We propose a single technique, namely, pattern rejection, that greatly enhances efficiency in both cases. A rejector is a generalization of a classifier, that quickly eliminates a large fraction of the candidate classes or inputs. This allows a recognition algorithm to dedicate its efforts to a much smaller number of possibilities. Importantly, a collection of rejectors may be combined to form a composite rejector, which is shown to be far more effective than any of its individual components. A simple algorithm is proposed for the construction of each of the component rejectors. Its generality is established through close relationships with the Karhunen-Loeve expansion and Fisher's discriminant analysis. Composite rejectors were constructed for two representative applications, namely, appearance matching based object recognition and local feature detection. The results demonstrate substantial efficiency improvements over existing approaches, most notably Fisher's discriminant analysis.

Proceedings ArticleDOI
25 Aug 1996
TL;DR: A stability measure is proposed and a study on the performance and stability of the following techniques: regularization by the ridge-estimate of the covariance matrix, bootstrapping followed by aggregation and editing combined with pseudo-inversion are presented.
Abstract: In this paper the possibilities for constructing linear classifiers are considered for very small sample sizes. We propose a stability measure and present a study on the performance and stability of the following techniques: regularization by the ridge-estimate of the covariance matrix, bootstrapping followed by aggregation ("bagging") and editing combined with pseudo-inversion. It is shown that by these techniques a smooth transition can be made between the nearest mean classifier and the Fisher discriminant (1936, 1940) based on large sample sizes. Especially for highly correlated data very good results are obtained compared with the nearest mean method.
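
The transition between the nearest mean classifier and the Fisher discriminant mentioned in the abstract above can be seen in a small sketch of a ridge-regularized two-class discriminant direction. The pooled covariance estimate, the parameter name `ridge`, and the function name are assumptions made for illustration, not the paper's exact estimator.

```python
import numpy as np

def ridge_fisher_direction(X0, X1, ridge):
    """Unit-length direction (S + ridge * I)^-1 (m1 - m0) for two classes.

    ridge -> 0 gives the Fisher direction (when S is invertible);
    a very large ridge gives the nearest-mean direction m1 - m0.
    """
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    n0, n1 = len(X0), len(X1)
    # Pooled within-class covariance (each class needs at least 2 samples).
    pooled = ((n0 - 1) * np.cov(X0, rowvar=False)
              + (n1 - 1) * np.cov(X1, rowvar=False)) / (n0 + n1 - 2)
    w = np.linalg.solve(pooled + ridge * np.eye(len(m0)), m1 - m0)
    return w / np.linalg.norm(w)
```

Sweeping `ridge` from small to large values moves the classifier smoothly from one extreme to the other, which is useful when the sample is too small for the covariance estimate to be trusted.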

Journal ArticleDOI
TL;DR: The proposed feature space is compared to those generated by tissue-parameter-weighted images, principal component images, and angle images, demonstrating its superiority for feature extraction and scene segmentation and its relationship with discriminant analysis is discussed.
Abstract: This paper presents development and application of a feature extraction method for magnetic resonance imaging (MRI), without explicit calculation of tissue parameters. A three-dimensional (3-D) feature space representation of the data is generated in which normal tissues are clustered around prespecified target positions and abnormalities are clustered elsewhere. This is accomplished by a linear minimum mean square error transformation of categorical data to target positions. From the 3-D histogram (cluster plot) of the transformed data, clusters are identified and regions of interest (ROI's) for normal and abnormal tissues are defined. These ROI's are used to estimate signature (prototype) vectors for each tissue type which in turn are used to segment the MRI scene. The proposed feature space is compared to those generated by tissue-parameter-weighted images, principal component images, and angle images, demonstrating its superiority for feature extraction and scene segmentation. Its relationship with discriminant analysis is discussed. The method and its performance are illustrated using a computer simulation and MRI images of an egg phantom and a human brain.