
Showing papers on "Principal component analysis published in 1998"


Journal ArticleDOI
TL;DR: This work decomposed eight fMRI data sets from 4 normal subjects performing Stroop color‐naming, the Brown and Peterson word/number task, and control tasks into spatially independent components, and found the ICA algorithm was superior to principal component analysis (PCA) in determining the spatial and temporal extent of task‐related activation.
Abstract: Current analytical techniques applied to functional magnetic resonance imaging (fMRI) data require a priori knowledge or specific assumptions about the time courses of processes contributing to the measured signals. Here we describe a new method for analyzing fMRI data based on the independent component analysis (ICA) algorithm of Bell and Sejnowski ((1995): Neural Comput 7:1129-1159). We decomposed eight fMRI data sets from 4 normal subjects performing Stroop color-naming, the Brown and Peterson word/number task, and control tasks into spatially independent components. Each component consisted of voxel values at fixed three-dimensional locations (a component "map"), and a unique associated time course of activation. Given data from 144 time points collected during a 6-min trial, ICA extracted an equal number of spatially independent components. In all eight trials, ICA derived one and only one component with a time course closely matching the time course of 40-sec alternations between experimental and control tasks. The regions of maximum activity in these consistently task-related components generally overlapped active regions detected by standard correlational analysis, but included frontal regions not detected by correlation. Time courses of other ICA components were transiently task-related, quasiperiodic, or slowly varying. By utilizing higher-order statistics to enforce successively stricter criteria for spatial independence between component maps, both the ICA algorithm and a related fourth-order decomposition technique (Comon (1994): Signal Processing 36:11-20) were superior to principal component analysis (PCA) in determining the spatial and temporal extent of task-related activation. For each subject, the time courses and active regions of the task-related ICA components were consistent across trials and were robust to the addition of simulated noise. Simulated movement artifact and simulated task-related activations added to actual fMRI data were clearly separated by the algorithm. ICA can be used to distinguish between nontask-related signal components, movements, and other artifacts, as well as consistently or transiently task-related fMRI activations, based on only weak assumptions about their spatial distributions.

2,014 citations
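
A minimal sketch of the spatial-ICA idea described above, assuming scikit-learn's FastICA as a stand-in for the Bell-Sejnowski Infomax algorithm and a synthetic (time points x voxels) matrix; the 20-component choice, the voxel count, and the embedded block design are illustrative assumptions, not the authors' setup.

# Hypothetical sketch: spatial ICA of an fMRI-like matrix (time points x voxels).
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
n_timepoints, n_voxels = 144, 2000            # 144 time points, as in the 6-min trials
data = rng.standard_normal((n_timepoints, n_voxels))

# Embed a synthetic "task-related" component: a 40-s on/off block design.
block = np.tile(np.r_[np.ones(16), -np.ones(16)], 5)[:n_timepoints]
task_map = np.zeros(n_voxels); task_map[:50] = 1.0     # 50 "active" voxels
data += np.outer(block, task_map)

# Spatial ICA: voxels are the samples, so the recovered sources are
# spatially independent component maps with associated time courses.
ica = FastICA(n_components=20, max_iter=500, random_state=0)
component_maps = ica.fit_transform(data.T)    # (voxels x components): spatial maps
time_courses = ica.mixing_                    # (time points x components): time courses

# The consistently task-related component is the one whose time course
# correlates most strongly with the block design.
corrs = [abs(np.corrcoef(block, tc)[0, 1]) for tc in time_courses.T]
print("best match to task time course: component", int(np.argmax(corrs)))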


Book
Margaret A. Nemeth1
06 Feb 1998
TL;DR: An overview of applied multivariate methods, with appendices covering matrix results, quadratic forms, eigenvalues and eigenvectors, distances and angles, miscellaneous results, the work attitudes survey and its data file structure, and SPSS and SAS data entry commands.
Abstract: 1. Applied multivariate methods. An Overview of Multivariate Methods. Two Examples. Types of Variables. Data Matrices and Vectors. The Multivariate Normal Distribution. Statistical Computing. Multivariate Outliers. Multivariate Summary Statistics. Standardized Data and/or z-Scores. Exercises. 2. Sample correlations. Statistical Tests and Confidence Intervals. Summary. Exercises. 3. Multivariate data plots. Three-Dimensional Data Plots. Plots of Higher Dimensional Data. Plotting to Check for Multivariate Normality. Exercises. 4. Eigenvalues and eigenvectors. Trace and Determinant. Eigenvalues. Eigenvectors. Geometrical Descriptions (p=2). Geometrical Descriptions (p=3). Geometrical Descriptions (p>3). Exercises. 5. Principal components analysis. Reasons For Doing a PCA. Objectives of a PCA. PCA on the Variance-Covariance Matrix, Sigma. Estimation of Principal Components. Determining the Number of Principal Components. Caveats. PCA on the Correlation Matrix, P. Testing for Independence of the Original Variables. Structural Relationships. Statistical Computing Packages. Exercises. 6. Factor analysis. Objectives of an FA. Caveats. Some History on Factor Analysis. The Factor Analysis Model. Factor Analysis Equations. Solving the Factor Analysis Equations. Choosing the Appropriate Number of Factors. Computer Solutions of the Factor Analysis Equations. Rotating Factors. Oblique Rotation Methods. Factor Scores. Exercises. 7. Discriminant analysis. Discrimination for Two Multivariate Normal Populations. Cost Functions and Prior Probabilities (Two Populations). A General Discriminant Rule (Two Populations). Discriminant Rules (More Than Two Populations). Variable Selection Procedures. Canonical Discriminant Functions. Nearest Neighbour Discriminant Analysis. Classification Trees. Exercises. 8. Logistic regression methods. The Logit Transformation. Logistic Discriminant Analysis (More than Two Populations). Exercises. 9. Cluster analysis. Measures of Similarity and/or Dissimilarity. Graphical Aids in Clustering. Clustering Methods. Multidimensional Scaling. Exercises. 10. Mean vectors and variance-covariance matrices. Inference Procedures for Variance-Covariance Matrices. Inference Procedures for a Mean Vector. Two Sample Procedures. Profile Analyses. Additional Two Groups Analyses. Exercises. 11. Multivariate analysis of variance. MANOVA. Dimensionality of the Alternative Hypothesis. Canonical Variates Analysis. Confidence Regions for Canonical Variates. Exercises. 12. Prediction models and multivariate regression. Multiple Regression. Canonical Correlation Analysis. Factor Analysis and Regression. Exercises. Appendices: matrix results, quadratic forms, eigenvalues and eigenvectors, distances and angles, miscellaneous results; work attitudes survey, data file structure, SPSS data entry commands, SAS data entry commands; family control study.

982 citations


Journal ArticleDOI
TL;DR: It is recommended that in cases where the variables can be separated into meaningful blocks, the standard PCA and PLS methods be used to build the models and then the weights and loadings of the individual blocks and super block and the percentage variation explained in each block be calculated from the results.
Abstract: Multiblock and hierarchical PCA and PLS methods have been proposed in the recent literature in order to improve the interpretability of multivariate models. They have been used in cases where the number of variables is large and additional information is available for blocking the variables into conceptually meaningful blocks. In this paper we compare these methods from a theoretical or algorithmic viewpoint using a common notation and illustrate their differences with several case studies. Undesirable properties of some of these methods, such as convergence problems or loss of data information due to deflation procedures, are pointed out and corrected where possible. It is shown that the objective function of the hierarchical PCA and hierarchical PLS methods is not clear and the corresponding algorithms may converge to different solutions depending on the initial guess of the super score. It is also shown that the results of consensus PCA (CPCA) and multiblock PLS (MBPLS) can be calculated from the standard PCA and PLS methods when the same variable scalings are applied for these methods. The standard PCA and PLS methods require less computation and give better estimation of the scores in the case of missing data. It is therefore recommended that in cases where the variables can be separated into meaningful blocks, the standard PCA and PLS methods be used to build the models and then the weights and loadings of the individual blocks and super block and the percentage variation explained in each block be calculated from the results. © 1998 John Wiley & Sons, Ltd.

682 citations
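
A minimal sketch of the paper's recommendation, assuming a synthetic data matrix and an arbitrary two-block split: fit an ordinary PCA on the full (centred) matrix and then compute block loadings and the percentage of variation explained in each block from the reconstruction. The block boundaries, component count, and scaling are illustrative assumptions.

# Hypothetical sketch: standard PCA on the full matrix, then per-block
# percentage of variation explained, following the recommendation above.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
X = rng.standard_normal((100, 12))                           # 100 samples, 12 variables
blocks = {"process": slice(0, 5), "quality": slice(5, 12)}   # assumed blocking

Xc = X - X.mean(axis=0)                   # column-centre (apply block scaling here if desired)
pca = PCA(n_components=3).fit(Xc)
scores = pca.transform(Xc)                # super scores
Xhat = scores @ pca.components_           # rank-3 reconstruction

for name, cols in blocks.items():
    ss_tot = np.sum(Xc[:, cols] ** 2)
    ss_res = np.sum((Xc[:, cols] - Xhat[:, cols]) ** 2)
    print(f"{name}: {100 * (1 - ss_res / ss_tot):.1f}% of block variation explained")

# Block loadings are simply the corresponding columns of the overall loadings.
block_loadings = {name: pca.components_[:, cols] for name, cols in blocks.items()}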


Book ChapterDOI
14 Apr 1998
TL;DR: A hybrid classifier using PCA and LDA provides a useful framework for other image recognition tasks as well and demonstrates a significant improvement when principal components rather than original images are fed to the LDA classifier.
Abstract: In this paper we describe a face recognition method based on PCA (Principal Component Analysis) and LDA (Linear Discriminant Analysis). The method consists of two steps: first we project the face image from the original vector space to a face subspace via PCA, second we use LDA to obtain a best linear classifier. The basic idea of combining PCA and LDA is to improve the generalization capability of LDA when only few samples per class are available. Using PCA, we are able to construct a face subspace in which we apply LDA to perform classification. Using FERET dataset we demonstrate a significant improvement when principal components rather than original images are fed to the LDA classifier. The hybrid classifier using PCA and LDA provides a useful framework for other image recognition tasks as well.

670 citations
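
A minimal sketch of the two-step PCA + LDA classifier, assuming the Olivetti faces as a stand-in for FERET and scikit-learn's implementations; the 60-component choice and the train/test split are illustrative assumptions.

# Hypothetical sketch of the two-step method: project images into a PCA
# subspace, then train an LDA classifier on the principal components.
from sklearn.datasets import fetch_olivetti_faces
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

faces = fetch_olivetti_faces()
X_train, X_test, y_train, y_test = train_test_split(
    faces.data, faces.target, test_size=0.25, stratify=faces.target, random_state=0)

# Step 1: PCA face subspace; step 2: LDA classifier on the principal components.
clf = make_pipeline(PCA(n_components=60, whiten=True), LinearDiscriminantAnalysis())
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))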



Proceedings ArticleDOI
04 May 1998
TL;DR: It is demonstrated that the document classification accuracy obtained after the dimensionality has been reduced using a random mapping method will be almost as good as the original accuracy if the final dimensionality is sufficiently large.
Abstract: When the data vectors are high-dimensional it is computationally infeasible to use data analysis or pattern recognition algorithms which repeatedly compute similarities or distances in the original data space. It is therefore necessary to reduce the dimensionality before, for example, clustering the data. If the dimensionality is very high, like in the WEBSOM method which organizes textual document collections on a self-organizing map, then even the commonly used dimensionality reduction methods like the principal component analysis may be too costly. It is demonstrated that the document classification accuracy obtained after the dimensionality has been reduced using a random mapping method will be almost as good as the original accuracy if the final dimensionality is sufficiently large (about 100 out of 6000). In fact, it can be shown that the inner product (similarity) between the mapped vectors follows closely the inner product of the original vectors.

434 citations
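
A minimal sketch of the random mapping idea, assuming scikit-learn's GaussianRandomProjection and dense synthetic "document" vectors; the 6000-to-100 reduction mirrors the figures quoted above, but the data themselves are invented.

# Hypothetical sketch: random mapping of high-dimensional document vectors to
# about 100 dimensions, checking that inner products are approximately preserved.
import numpy as np
from sklearn.random_projection import GaussianRandomProjection

rng = np.random.default_rng(2)
docs = rng.random((500, 6000))                    # 500 synthetic "document" vectors
docs /= np.linalg.norm(docs, axis=1, keepdims=True)

proj = GaussianRandomProjection(n_components=100, random_state=0)
mapped = proj.fit_transform(docs)

orig_sim = docs @ docs.T                          # original inner products
mapped_sim = mapped @ mapped.T                    # inner products after random mapping
print("mean absolute similarity error:", np.abs(orig_sim - mapped_sim).mean())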


Journal ArticleDOI
TL;DR: Ensemble averaging was found to be effective in controlling nonlinear instability, the hidden layer could be given a phase space interpretation, and spectral analysis aided in understanding the nonlinear NN relations.
Abstract: Empirical or statistical methods have been introduced into meteorology and oceanography in four distinct stages: 1) linear regression (and correlation), 2) principal component analysis (PCA), 3) canonical correlation analysis, and recently 4) neural network (NN) models. Despite the great popularity of the NN models in many fields, there are three obstacles to adapting the NN method to meteorology–oceanography, especially in large-scale, low-frequency studies: (a) nonlinear instability with short data records, (b) large spatial data fields, and (c) difficulties in interpreting the nonlinear NN results. Recent research shows that these three obstacles can be overcome. For obstacle (a), ensemble averaging was found to be effective in controlling nonlinear instability. For (b), the PCA method was used as a prefilter for compressing the large spatial data fields. For (c), the mysterious hidden layer could be given a phase space interpretation, and spectral analysis aided in understanding the nonlinear NN relations.

427 citations
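
A minimal sketch of two of the remedies described above, assuming a synthetic low-rank "spatial field": PCA is used as a prefilter to compress the field, and an ensemble of small neural networks is averaged to control nonlinear instability. The field construction, network size, and ensemble size are illustrative assumptions.

# Hypothetical sketch: compress a large "spatial field" with PCA, then average
# an ensemble of small neural networks to stabilize the nonlinear fit.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(3)
t = np.linspace(0, 20, 300)
modes = np.c_[np.sin(t), np.cos(0.5 * t)]                   # two dominant temporal modes
patterns = rng.standard_normal((2, 500))                    # their spatial patterns
field = modes @ patterns + 0.3 * rng.standard_normal((300, 500))
target = np.tanh(2 * modes[:, 0]) + 0.1 * rng.standard_normal(300)

pcs = PCA(n_components=5).fit_transform(field)              # PCA prefilter: 500 -> 5 predictors

ensemble = [MLPRegressor(hidden_layer_sizes=(4,), solver="lbfgs",
                         max_iter=2000, random_state=i).fit(pcs, target)
            for i in range(10)]
prediction = np.mean([m.predict(pcs) for m in ensemble], axis=0)   # ensemble average
print("correlation with target:", np.corrcoef(prediction, target)[0, 1])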


Proceedings ArticleDOI
17 Jul 1998
TL;DR: ICA was performed on a set of face images by an unsupervised learning algorithm derived from the principle of optimal information transfer through sigmoidal neurons, which maximizes the mutual information between the input and the output and produces statistically independent outputs under certain conditions.
Abstract: In a task such as face recognition, much of the important information may be contained in the high-order relationships among the image pixels. A number of face recognition algorithms employ principal component analysis (PCA), which is based on the second-order statistics of the image set, and does not address high-order statistical dependencies such as the relationships among three or more pixels. Independent component analysis (ICA) is a generalization of PCA which separates the high-order moments of the input in addition to the second-order moments. ICA was performed on a set of face images by an unsupervised learning algorithm derived from the principle of optimal information transfer through sigmoidal neurons (Bell and Sejnowski, 1995). The algorithm maximizes the mutual information between the input and the output, which produces statistically independent outputs under certain conditions. ICA was performed on the face images under two different architectures, one which separated images across spatial location, and a second which separated the feature code across images. The first architecture provided a statistically independent basis set for the face images that can be viewed as a set of independent facial feature images. The second architecture provided a factorial code, in which the probability of any combination of features can be obtained from the product of their individual probabilities. Both ICA representations were superior to representations based on principal components analysis for recognizing faces across days and changes in expression.

411 citations
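
A minimal sketch of the first architecture (spatially independent basis images), assuming scikit-learn's FastICA in place of the Infomax algorithm and the Olivetti faces in place of the data used above; the 25-component choice and the cosine-similarity matching step are illustrative assumptions.

# Hypothetical sketch of "Architecture I": find spatially independent basis
# images with ICA and represent each face by its mixing coefficients.
import numpy as np
from sklearn.datasets import fetch_olivetti_faces
from sklearn.decomposition import FastICA

faces = fetch_olivetti_faces()
X = faces.data - faces.data.mean(axis=0)          # (400 images x 4096 pixels), mean-centred

ica = FastICA(n_components=25, max_iter=1000, random_state=0)
basis_images = ica.fit_transform(X.T)             # (pixels x 25): independent basis images
face_codes = ica.mixing_                          # (images x 25): coefficients per face

# A face can be compared to others via its ICA coefficients, e.g. nearest
# neighbour with cosine similarity.
q = face_codes[0]
sims = face_codes @ q / (np.linalg.norm(face_codes, axis=1) * np.linalg.norm(q))
print("most similar other face:", int(np.argsort(-sims)[1]))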



Proceedings Article
Christopher M. Bishop1
01 Dec 1998
TL;DR: This paper uses probabilistic reformulation as the basis for a Bayesian treatment of PCA to show that effective dimensionality of the latent space (equivalent to the number of retained principal components) can be determined automatically as part of the Bayesian inference procedure.
Abstract: The technique of principal component analysis (PCA) has recently been expressed as the maximum likelihood solution for a generative latent variable model. In this paper we use this probabilistic reformulation as the basis for a Bayesian treatment of PCA. Our key result is that effective dimensionality of the latent space (equivalent to the number of retained principal components) can be determined automatically as part of the Bayesian inference procedure. An important application of this framework is to mixtures of probabilistic PCA models, in which each component can determine its own effective complexity.

319 citations
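
Bishop's Bayesian PCA is not reproduced here, but a related automatic choice of the latent dimensionality is available off the shelf: scikit-learn's PCA with n_components='mle' applies Minka's Bayesian model selection for probabilistic PCA. A minimal sketch on synthetic data with a known three-dimensional latent space; this is an assumption-laden stand-in, not the paper's ARD-based inference.

# Hypothetical sketch: probabilistic PCA with an automatically chosen latent
# dimensionality (Minka's criterion, related to but distinct from Bayesian PCA).
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(4)
latent = rng.standard_normal((500, 3))            # true latent dimensionality = 3
W = rng.standard_normal((3, 20))
X = latent @ W + 0.1 * rng.standard_normal((500, 20))

pca = PCA(n_components='mle', svd_solver='full').fit(X)
print("inferred effective dimensionality:", pca.n_components_)   # expected: about 3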


Book
01 Jan 1998
TL;DR: ICA is a method for solving the blind source separation problem by finding a linear coordinate system (the unmixing system) such that the resulting signals are as statistically independent from each other as possible.
Abstract: The goal of blind source separation (BSS) is to recover independent sources given only sensor observations that are linear mixtures of independent source signals. The term blind indicates that both the source signals and the way the signals were mixed are unknown. Independent Component Analysis (ICA) is a method for solving the blind source separation problem. It is a way to find a linear coordinate system (the unmixing system) such that the resulting signals are as statistically independent from each other as possible. In contrast to correlation-based transformations such as Principal Component Analysis (PCA), ICA not only decorrelates the signals (2nd-order statistics) but also reduces higher-order statistical dependencies.
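
A minimal sketch of the blind source separation model x = As, assuming FastICA as a generic ICA implementation and two invented source signals; it also shows that PCA, which only decorrelates, does not generally recover the sources.

# Hypothetical sketch of blind source separation: mix two independent signals
# with an unknown matrix A, then recover them with ICA.
import numpy as np
from sklearn.decomposition import FastICA, PCA

t = np.linspace(0, 8, 2000)
sources = np.c_[np.sin(3 * t), np.sign(np.cos(7 * t))]      # two independent sources
A = np.array([[1.0, 0.6], [0.4, 1.0]])                      # unknown mixing matrix
mixtures = sources @ A.T                                    # sensor observations x = A s

recovered = FastICA(n_components=2, random_state=0).fit_transform(mixtures)
decorrelated = PCA(n_components=2).fit_transform(mixtures)  # PCA only decorrelates

# ICA recovers the sources up to permutation, sign and scale.
for i in range(2):
    best = max(abs(np.corrcoef(recovered[:, i], sources[:, j])[0, 1]) for j in range(2))
    print(f"ICA component {i}: max |corr| with a true source = {best:.2f}")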

Journal ArticleDOI
TL;DR: The Gifi system of analyzing categorical data through nonlinear varieties of classical multivariate analysis techniques is characterized by the optimal scaling of categorical variables, which is implemented through alternating least squares algorithms.
Abstract: The Gifi system of analyzing categorical data through nonlinear varieties of classical multivariate analysis techniques is reviewed. The system is characterized by the optimal scaling of categorical variables which is implemented through alternating least squares algorithms. The main technique of homogeneity analysis is presented, along with its extensions and generalizations leading to nonmetric principal components analysis and canonical correlation analysis. Several examples are used to illustrate the methods. A brief account of stability issues and areas of applications of the techniques is also given.

Journal ArticleDOI
TL;DR: Two approaches in aggregating multiple inputs and multiple outputs in the evaluation of decision making units (DMUs), data envelopment analysis (DEA) and principal component analysis (PCA) are compared.

Journal ArticleDOI
TL;DR: In this article, a unified approach to process and sensor fault detection, identification, and reconstruction via principal component analysis is presented, which partitions the measurement space into a principal component subspace where normal variation occurs, and a residual subspace that faults may occupy.
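
A minimal sketch of the general idea (not the authors' exact reconstruction procedure), assuming synthetic "normal operation" data: PCA is fitted on normal data, new measurements are split into the principal component subspace and the residual subspace, and a fault is flagged when the squared prediction error (Q statistic) is large. The empirical 99% threshold and the simulated sensor bias are illustrative assumptions.

# Hypothetical sketch: PCA-based sensor fault detection. Normal variation lives
# in the principal component subspace; a fault shows up as a large residual.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(5)
latent = rng.standard_normal((500, 2))
W = rng.standard_normal((2, 8))
normal = latent @ W + 0.05 * rng.standard_normal((500, 8))   # training data, normal operation

pca = PCA(n_components=2).fit(normal)

def q_statistic(X):
    residual = X - pca.inverse_transform(pca.transform(X))   # part outside the PC subspace
    return np.sum(residual ** 2, axis=1)

limit = np.percentile(q_statistic(normal), 99)               # empirical 99% control limit

faulty = normal[:5].copy()
faulty[:, 3] += 2.0                                          # simulated bias fault on sensor 3
print("Q over limit?", q_statistic(faulty) > limit)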

Journal ArticleDOI
TL;DR: Two methods of multivariate analysis are discussed on their merits: 1) the canonical ordination technique Principal Response Curves (PRC) and 2) the similarity indices of Bray-Curtis and Stander.
Abstract: Experiments in microcosms and mesocosms, which can be carried out in an advanced tier of risk assessment, usually result in large data sets on the dynamics of biological communities of treated and control cosms. Multivariate techniques are an accepted tool to evaluate the community treatment effects resulting from these complex experiments. In this paper two methods of multivariate analysis are discussed on their merits: 1) the canonical ordination technique Principal Response Curves (PRC) and 2) the similarity indices of Bray-Curtis and Stander. For this, the data sets of a microcosm experiment were used to simultaneously study the impact of nutrient loading and insecticide application. Both similarity indices display, in a single graph, the total effect size against time and do not allow a direct interpretation down to the taxon level. In the PRC method, the principal components of the treatment effects are plotted against time. Since the species of the example data sets, react in qualitatively different ways to the treatments, more than one PRC is needed for a proper description of the treatment effects. The first PRC of one of the data sets describes the effects due to the chlorpyrifos addition, the second one the effects as a result of the nutrient loading. The resulting principal response curves jointly summarize the essential features of the response curves of the individual taxa. This paper goes beyond the first PRC to visualize the effects of chemicals at the community level. In both multivariate analysis methods the statistical significance of the effects can be assessed by Monte Carlo permutation testing.
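
A minimal sketch of the similarity-index side of the comparison, assuming an invented abundance table: the Bray-Curtis dissimilarity between treated and control communities is computed at each sampling date. PRC itself, being a constrained ordination, is not reproduced here.

# Hypothetical sketch: Bray-Curtis dissimilarity between treated and control
# communities at each sampling date, as a time course of total effect size.
import numpy as np
from scipy.spatial.distance import braycurtis

rng = np.random.default_rng(6)
n_dates, n_taxa = 10, 15
control = rng.poisson(20, size=(n_dates, n_taxa)).astype(float)
treated = control.copy()
treated[3:, :5] *= 0.2        # sensitive taxa decline after insecticide application at date 3

effect_size = [braycurtis(control[d], treated[d]) for d in range(n_dates)]
for d, e in enumerate(effect_size):
    print(f"date {d}: Bray-Curtis dissimilarity = {e:.2f}")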

Journal ArticleDOI
TL;DR: In this article, five years data on CO, NO, NO2, O3, smoke and SO2 concentrations recorded at one air-pollution monitoring station in the city of Athens were analyzed using principal component analysis(PCA).

Proceedings ArticleDOI
18 May 1998
TL;DR: The "eigenfaces method", originally used in human face recognition, is introduced, to model the sound frequency distribution features and it is shown that it can be a simple and reliable acoustic identification method if the training samples can be properly chosen and classified.
Abstract: The sound (engine, noise, etc.) of a working vehicle provides an important clue, e.g., for surveillance mission robots, to recognize the vehicle type. In this paper, we introduce the "eigenfaces method", originally used in human face recognition, to model the sound frequency distribution features. We show that it can be a simple and reliable acoustic identification method if the training samples can be properly chosen and classified. We treat the frequency spectra of about 200 ms of sound (a "frame") as a vector in a high-dimensional frequency feature space. In this space, we study the vector distribution for each kind of vehicle sound produced under similar working conditions. A collection of typical sound samples is used as the training data set. The mean frequency vector of the training set is first calculated, and subtracted from each vector in the set. To capture the frequency vectors' variation within the training set, we then calculate the eigenvectors of the covariance matrix of the zero-mean-adjusted sample data set. These eigenvectors represent the principal components of the vector distribution: for each such eigenvector, its corresponding eigenvalue indicates its importance in capturing the variation distribution, with the largest eigenvalues accounting for the most variance within this data set. Thus for each set of training data, its mean vector and its most important eigenvectors together characterize its sound signature. When a new frame (not in the training set) is tested, its spectrum vector is compared against the mean vector; the difference vector is then projected into the principal component directions, and the residual is found. The coefficients of the unknown vector, in the training set eigenvector basis subspace, identify the unknown vehicle noise in terms of the classes represented in the training set. The magnitude of the residual vector measures the extent to which the unknown vehicle sound cannot be well characterized by the vehicle sounds included in the training set.
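
A minimal sketch of the eigen-signature procedure described above, implemented directly in NumPy with synthetic spectra as placeholders: mean spectrum, leading eigenvectors of the covariance of the zero-mean frames, projection of a new frame, and the residual magnitude. Frame and bin counts are illustrative assumptions.

# Hypothetical sketch of the "eigenfaces"-style sound signature.
import numpy as np

rng = np.random.default_rng(7)
n_frames, n_bins = 200, 128
training = np.abs(rng.standard_normal((n_frames, n_bins)))   # spectra of ~200 ms frames

mean_vec = training.mean(axis=0)                             # mean frequency vector
centred = training - mean_vec                                # zero-mean-adjusted set
cov = centred.T @ centred / (n_frames - 1)
eigvals, eigvecs = np.linalg.eigh(cov)                       # ascending order
principal = eigvecs[:, -10:]                                 # 10 most important eigenvectors

def match_residual(frame):
    diff = frame - mean_vec                                  # compare against the mean vector
    coeffs = principal.T @ diff                              # project onto principal directions
    residual = diff - principal @ coeffs                     # what the signature cannot explain
    return coeffs, np.linalg.norm(residual)

coeffs, res = match_residual(np.abs(rng.standard_normal(n_bins)))
print("residual magnitude of unknown frame:", round(res, 2))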

Journal ArticleDOI
01 May 1998
TL;DR: A fast and simple algorithm for approximately calculating the principal components (PCs) of a dataset and so reducing its dimensionality is described; it shows a fast convergence rate compared with other methods and robustness to the reordering of the samples.
Abstract: A fast and simple algorithm for approximately calculating the principal components (PCs) of a dataset and so reducing its dimensionality is described. This Simple Principal Components Analysis (SPCA) method was used for dimensionality reduction of two high-dimensional image databases, one of handwritten digits and one of handwritten Japanese characters. It was tested and compared with other techniques. On both databases SPCA shows a fast convergence rate compared with other methods and robustness to the reordering of the samples.
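
The abstract does not spell out the SPCA update rule, so the following is a generic fast approximation in the same spirit rather than the paper's algorithm: power iteration with deflation extracts the leading principal components one at a time without a full eigendecomposition.

# Hypothetical sketch of a fast approximate PCA via power iteration with deflation.
import numpy as np

def approx_pcs(X, n_components=2, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / (len(Xc) - 1)
    components = []
    for _ in range(n_components):
        v = rng.standard_normal(cov.shape[0])
        for _ in range(n_iter):                     # power iteration on the covariance
            v = cov @ v
            v /= np.linalg.norm(v)
        components.append(v)
        cov = cov - np.outer(v, v) * (v @ cov @ v)  # deflate the direction just found
    return np.array(components)

X = np.random.default_rng(8).standard_normal((1000, 50))
print(approx_pcs(X, n_components=3).shape)          # (3, 50)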

Journal ArticleDOI
TL;DR: The NLPCA techniques are used to classify each segment into one of two classes, normal and abnormal (ST+, ST-, or artifact); test results show that using only two nonlinear components and a training set of 1000 normal samples from each file produces a correct classification rate of approximately 80% for the normal beats and higher than 90% for the ischemic beats.
Abstract: The detection of ischemic cardiac beats from a patient's electrocardiogram (ECG) signal is based on the characteristics of a specific part of the beat called the ST segment. The correct classification of the beats relies heavily on the efficient and accurate extraction of the ST segment features. An algorithm is developed for this feature extraction based on nonlinear principal component analysis (NLPCA). NLPCA is a method for nonlinear feature extraction that is usually implemented by a multilayer neural network. It has been observed to have better performance, compared with linear principal component analysis (PCA), in complex problems where the relationships between the variables are not linear. In this paper, the NLPCA techniques are used to classify each segment into one of two classes: normal and abnormal (ST+, ST-, or artifact). During the algorithm training phase, only normal patterns are used, and for classification purposes, we use only two nonlinear features for each ST segment. The distribution of these features is modeled using a radial basis function network (RBFN). Test results using the European ST-T database show that using only two nonlinear components and a training set of 1000 normal samples from each file produce a correct classification rate of approximately 80% for the normal beats and higher than 90% for the ischemic beats.
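
A minimal sketch of NLPCA as a bottleneck autoencoder, assuming scikit-learn's MLPRegressor trained to reconstruct its own input and a synthetic stand-in for the ST-segment features; the two-unit bottleneck supplies the two nonlinear features. The paper's exact network and its RBFN-based classification stage are not reproduced.

# Hypothetical sketch of NLPCA: train an MLP to reproduce its input and use the
# 2-unit bottleneck activations as the two nonlinear features.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(9)
angle = rng.uniform(0, 2 * np.pi, 500)
segments = np.c_[np.cos(angle), np.sin(angle), 0.5 * np.cos(2 * angle)]  # toy "ST segment" features
segments += 0.05 * rng.standard_normal(segments.shape)

auto = MLPRegressor(hidden_layer_sizes=(8, 2, 8), activation="tanh",
                    solver="lbfgs", max_iter=5000, random_state=0)
auto.fit(segments, segments)                       # train to reconstruct the input

def bottleneck_features(X):
    a = X
    for W, b in zip(auto.coefs_[:2], auto.intercepts_[:2]):   # forward pass up to the bottleneck
        a = np.tanh(a @ W + b)
    return a                                        # (n_samples, 2) nonlinear components

print(bottleneck_features(segments[:3]))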

Journal ArticleDOI
TL;DR: The glottal to noise excitation ratio (GNE) is an acoustic measure designed to assess the amount of noise in a pulse train generated by the oscillation of the vocal folds that is found to be independent of variations of fundamental frequency and amplitude.
Abstract: The glottal to noise excitation ratio (GNE) is an acoustic measure designed to assess the amount of noise in a pulse train generated by the oscillation of the vocal folds. So far its properties have only been studied for synthesized signals, where it was found to be independent of variations of fundamental frequency (jitter) and amplitude (shimmer). On the other hand, other features designed for the same purpose like NNE (normalized noise energy) or CHNR (cepstrum based harmonics-to-noise ratio) did not show this independence. This advantage of the GNE over NNE and CHNR, as well as its general applicability in voice quality assessment, is now tested for real speech using a large group of pathologic voices (n=447). A set of four acoustic features is extracted from a total of 22 mostly well-known acoustic voice quality measures by correlation analysis, mutual information analysis, and principal components analysis. Three of these measures are chosen to assess primarily different aspects of signal aperiodicity.

Journal ArticleDOI
TL;DR: In this article, principal components analysis (PCA) and linear discriminant analysis (LDA) were applied to 1H NMR spectra of the apple juices produced from different varieties (Spartan, Bramley, Russet).

Journal ArticleDOI
TL;DR: It is concluded that PCA is an essential statistical tool for event-related potential analysis, but only if applied appropriately.
Abstract: Interpretation of evoked response potentials is complicated by the extensive superposition of multiple electrical events. The most common approach to disentangling these features is principal components analysis (PCA). Critics have demonstrated a number of caveats that complicate interpretation, notably misallocation of variance and latency jitter. This paper describes some further caveats to PCA as well as using simulations to evaluate three potential methods for addressing them: parallel analysis, oblique rotations, and spatial PCA. An improved simulation model is introduced for examining these issues. It is concluded that PCA is an essential statistical tool for event-related potential analysis, but only if applied appropriately.
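
A minimal sketch of parallel analysis, one of the three remedies evaluated above, assuming simulated ERP-like data: components are retained only if their eigenvalues exceed the corresponding eigenvalues obtained from random data of the same size. The 95th-percentile cut-off and the data construction are illustrative assumptions.

# Hypothetical sketch of parallel analysis for deciding how many components to retain.
import numpy as np

rng = np.random.default_rng(10)
n_trials, n_timepoints = 80, 30
signal = 2 * np.outer(rng.standard_normal(n_trials), np.hanning(n_timepoints))   # one real component
data = signal + rng.standard_normal((n_trials, n_timepoints))

def eigenvalues(X):
    return np.sort(np.linalg.eigvalsh(np.corrcoef(X, rowvar=False)))[::-1]

observed = eigenvalues(data)
random_eigs = np.array([eigenvalues(rng.standard_normal(data.shape)) for _ in range(200)])
threshold = np.percentile(random_eigs, 95, axis=0)   # eigenvalues expected from pure noise

print("components to retain:", int(np.sum(observed > threshold)))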

Journal ArticleDOI
TL;DR: In this article, a water deficiency was detected among the spectral population of wheat using principal component analysis (PCA) and unmixing of two endmembers of wheat, related respectively to well-developed and stressed plants.

Proceedings ArticleDOI
16 Aug 1998
TL;DR: Two enhanced Fisher linear discriminant models (EFM) are introduced in order to improve the generalization ability of standard FLD based classifiers such as Fisherfaces; experimental data shows that the EFM models outperform the standard FLD based methods.
Abstract: We introduce two enhanced Fisher linear discriminant (FLD) models (EFM) in order to improve the generalization ability of the standard FLD based classifiers such as Fisherfaces. Similar to Fisherfaces, both EFM models first apply principal component analysis (PCA) for dimensionality reduction before proceeding with FLD type of analysis. EFM-1 implements the dimensionality reduction with the goal to balance between the need that the selected eigenvalues account for most of the spectral energy of the raw data and the requirement that the eigenvalues of the within-class scatter matrix in the reduced PCA subspace are not too small. EFM-2 implements the dimensionality reduction as Fisherfaces do. It proceeds with the whitening of the within-class scatter matrix in the reduced PCA subspace and then chooses a small set of features (corresponding to the eigenvectors of the within-class scatter matrix) so that the smaller trailing eigenvalues are not included in further computation of the between-class scatter matrix. Experimental data using a large set of faces (1,107 images drawn from 369 subjects and including duplicates acquired at a later time under different illumination) from the FERET database shows that the EFM models outperform the standard FLD based methods.
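
A minimal sketch of the EFM-2 step described above, assuming the scikit-learn digits data in place of FERET: reduce with PCA, whiten the within-class scatter in the PCA subspace while discarding its small trailing eigenvalues, then diagonalize the between-class scatter. The 30-dimensional PCA subspace and the eigenvalue cut-off are illustrative assumptions.

# Hypothetical sketch: PCA reduction, whitening of the within-class scatter
# (dropping small trailing eigenvalues), then between-class diagonalization.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X, y = load_digits(return_X_y=True)
Z = PCA(n_components=30).fit_transform(X)                 # step 1: PCA subspace

classes = np.unique(y)
mean_all = Z.mean(axis=0)
Sw = sum(np.cov(Z[y == c], rowvar=False) * (np.sum(y == c) - 1) for c in classes)
Sb = sum(np.sum(y == c) * np.outer(Z[y == c].mean(axis=0) - mean_all,
                                   Z[y == c].mean(axis=0) - mean_all) for c in classes)

w_vals, w_vecs = np.linalg.eigh(Sw)                       # ascending eigenvalues
keep = w_vals > 1e-3 * w_vals.max()                       # drop small trailing eigenvalues
whiten = w_vecs[:, keep] / np.sqrt(w_vals[keep])          # step 2: whitening transform

Sb_w = whiten.T @ Sb @ whiten                             # between-class scatter, whitened space
b_vals, b_vecs = np.linalg.eigh(Sb_w)
discriminants = whiten @ b_vecs[:, -9:]                   # top C-1 = 9 discriminant directions
features = Z @ discriminants
print("discriminant feature shape:", features.shape)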

Journal ArticleDOI
TL;DR: A set of multilinear regression problems, or a singular value decomposition with iterative imputation of the missing values, is proposed; with a substantial amount of missing data the results differ from, and are superior to, those obtained with the naive NIPALS algorithm in common use in chemometrics.
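
A minimal sketch of SVD with iterative imputation of the missing values, assuming a synthetic low-rank matrix: missing cells are initialized with column means, a rank-k model is fitted, the missing cells are replaced by the model reconstruction, and the procedure repeats until it converges. Rank, tolerance, and missing fraction are illustrative assumptions.

# Hypothetical sketch of PCA/SVD with iterative imputation of missing values.
import numpy as np

def pca_impute(X, rank=2, n_iter=100, tol=1e-6):
    missing = np.isnan(X)
    filled = np.where(missing, np.nanmean(X, axis=0), X)    # start from column means
    for _ in range(n_iter):
        mean = filled.mean(axis=0)
        U, s, Vt = np.linalg.svd(filled - mean, full_matrices=False)
        recon = U[:, :rank] * s[:rank] @ Vt[:rank] + mean   # rank-k reconstruction
        new = np.where(missing, recon, X)                   # refill only the missing cells
        if np.max(np.abs(new - filled)) < tol:
            break
        filled = new
    return filled

rng = np.random.default_rng(11)
scores = rng.standard_normal((60, 2)) @ rng.standard_normal((2, 6))
data = scores + 0.05 * rng.standard_normal(scores.shape)
data[rng.random(data.shape) < 0.15] = np.nan                # 15% missing at random
print("imputed matrix has NaNs:", np.isnan(pca_impute(data)).any())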

Journal ArticleDOI
TL;DR: Analysis of variance of each component indicated that the variety effect was highly significant for the 1st, 2nd and 3rd principal components derived from group A coefficients, which were related to the aspect ratio, bluntness of the distal part of the root, and swelling of the middle part, respectively, suggesting that these traits are heritable and can be effectively selected through quantified measures based on elliptic Fourier descriptors presented in this report.
Abstract: Variations of root shape in Japanese radish, due to genotypes, soil types and growth stages, were quantitatively evaluated by principal component scores based on elliptic Fourier descriptors. Photographic images of sampled roots on 35mm color reversal films were converted into digital images. After image processing, the contour of each root was expressed as chain-code and then described by 77 coefficients of elliptic Fourier descriptors. After normalization about size, rotation, and starting point of the contour, two groups of the coefficients, which are related to the symmetrical and asymmetrical variations of shape, were analyzed separately, since artificially determined direction of curvature of the root may influence the results. Principal component analysis of the coefficients showed that the major part of the symmetrical (A) and asymmetrical (B) variations were summarized by at most 5 components. The cumulative contribution was 95.2% and 97.1%, respectively. Analysis of variance of each component indicated that the variety effect was highly significant for the 1st, 2nd and 3rd principal components derived from group A coefficients, which were related to the aspect ratio, bluntness of the distal part of the root, and swelling of the middle part, respectively. This suggests that these traits are heritable and can be effectively selected through quantified measures based on elliptic Fourier descriptors presented in this report. Direction and degree of curvature of root could be analyzed independently of the symmetrical variation.

Journal ArticleDOI
TL;DR: The utility of combining recurrence quantification analysis with principal components analysis to allow for a probabilistic evaluation of the presence of deterministic signals in relatively short data lengths is demonstrated.

Journal ArticleDOI
TL;DR: A fast and robust method of classifying a library of optical stellar spectra for O to M type stars is presented and it is shown that the library can be classified to accuracies similar to those achieved by Gulati et al. but with less computational load.
Abstract: A fast and robust method of classifying a library of optical stellar spectra for O to M type stars is presented. The method employs, as tools: (1) principal component analysis (PCA) for reducing the dimensionality of the data and (2) multilayer back propagation network (MBPN) based artificial neural network (ANN) scheme to automate the process of classification. We are able to reduce the dimensionality of the original spectral data to very few components by using PCA and are able to successfully reconstruct the original spectra. A number of NN architectures are used to classify the library of test spectra. Performance of ANN with this reduced dimension shows that the library can be classified to accuracies similar to those achieved by Gulati et al. but with less computational load. Furthermore, the data compression is so efficient that the NN scheme successfully classifies to the desired accuracy for a wide range of architectures. The procedure will greatly improve our capabilities in handling and analysing large spectral data bases of the future.

Proceedings ArticleDOI
23 Jun 1998
TL;DR: The rationales behind PCA and LDA and the pros and cons of applying them to pattern classification task are illustrated and the improved performance of this combined approach is demonstrated.
Abstract: In face recognition literature, holistic template matching systems and geometrical local feature based systems have been pursued. In the holistic approach, PCA (Principal Component Analysis) and LDA (Linear Discriminant Analysis) are popular ones. More recently, the combination of PCA and LDA has been proposed as a superior alternative over pure PCA and LDA. In this paper, we illustrate the rationales behind these methods and the pros and cons of applying them to pattern classification task. A theoretical performance analysis of LDA suggests applying LDA over the principal components from the original signal space or the subspace. The improved performance of this combined approach is demonstrated through experiments conducted on both simulated data and real data.

Journal ArticleDOI
TL;DR: It is shown that PCA can compress the spectra by a factor of over 30 while retaining essentially all of the useful information in the data set, and that this compression optimally removes noise and can be used to identify unusual spectra.
Abstract: We investigate the application of neural networks to the automation of MK spectral classification. The data set for this project consists of a set of over 5000 optical (3800–5200 A) spectra obtained from objective prism plates from the Michigan Spectral Survey. These spectra, along with their two-dimensional MK classifications listed in the Michigan Henry Draper Catalogue, were used to develop supervised neural network classifiers. We show that neural networks can give accurate spectral type classifications (σ68= 0.82 subtypes, σrms= 1.09 subtypes) across the full range of spectral types present in the data set (B2–M7). We show also that the networks yield correct luminosity classes for over 95 per cent of both dwarfs and giants with a high degree of confidence. Stellar spectra generally contain a large amount of redundant information. We investigate the application of principal components analysis (PCA) to the optimal compression of spectra. We show that PCA can compress the spectra by a factor of over 30 while retaining essentially all of the useful information in the data set. Furthermore, it is shown that this compression optimally removes noise and can be used to identify unusual spectra. This paper is a continuation of the work carried out by von Hippel et al. (Paper I).