
Showing papers on "Principal component analysis published in 2000"


Journal ArticleDOI
22 Dec 2000-Science
TL;DR: An approach to solving dimensionality reduction problems that uses easily measured local metric information to learn the underlying global geometry of a data set; it efficiently computes a globally optimal solution and is guaranteed to converge asymptotically to the true structure.
Abstract: Scientists working with large volumes of high-dimensional data, such as global climate patterns, stellar spectra, or human gene distributions, regularly confront the problem of dimensionality reduction: finding meaningful low-dimensional structures hidden in their high-dimensional observations. The human brain confronts the same problem in everyday perception, extracting from its high-dimensional sensory inputs (30,000 auditory nerve fibers or 10^6 optic nerve fibers) a manageably small number of perceptually relevant features. Here we describe an approach to solving dimensionality reduction problems that uses easily measured local metric information to learn the underlying global geometry of a data set. Unlike classical techniques such as principal component analysis (PCA) and multidimensional scaling (MDS), our approach is capable of discovering the nonlinear degrees of freedom that underlie complex natural observations, such as human handwriting or images of a face under different viewing conditions. In contrast to previous algorithms for nonlinear dimensionality reduction, ours efficiently computes a globally optimal solution, and, for an important class of data manifolds, is guaranteed to converge asymptotically to the true structure.
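This abstract describes the Isomap algorithm. Below is a minimal sketch of the pipeline it outlines, assuming generic NumPy/SciPy building blocks rather than the authors' implementation: a k-nearest-neighbor graph built from local distances, geodesic distances estimated by graph shortest paths, and classical MDS on those distances. The toy helix data and parameter choices are illustrative only.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.sparse.csgraph import shortest_path

def isomap_sketch(X, n_neighbors=10, n_components=2):
    """Minimal Isomap-style embedding: kNN graph -> geodesic distances -> classical MDS.
    Assumes the neighborhood graph is connected."""
    D = squareform(pdist(X))                       # pairwise Euclidean distances
    n = D.shape[0]
    G = np.zeros_like(D)                           # 0 = no edge for dense csgraph input
    for i in range(n):
        idx = np.argsort(D[i])[1:n_neighbors + 1]  # k nearest neighbors (skip self)
        G[i, idx] = D[i, idx]
    G = np.maximum(G, G.T)                         # symmetrize the neighborhood graph
    geo = shortest_path(G, method='D', directed=False)   # geodesic (graph) distances
    # Classical MDS on the geodesic distance matrix.
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (geo ** 2) @ J                  # double-centered squared distances
    w, V = np.linalg.eigh(B)
    order = np.argsort(w)[::-1][:n_components]
    return V[:, order] * np.sqrt(np.maximum(w[order], 0.0))

# Example: a noisy curve embedded in 3-D unrolls to (roughly) a line.
t = np.linspace(0, 4 * np.pi, 400)
X = np.c_[np.cos(t), np.sin(t), t] + 0.05 * np.random.default_rng(0).normal(size=(400, 3))
Y = isomap_sketch(X, n_neighbors=8, n_components=2)
```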

13,652 citations


Journal ArticleDOI
TL;DR: The basic theory and applications of ICA are presented, and the goal is to find a linear representation of non-Gaussian data so that the components are statistically independent, or as independent as possible.

8,231 citations


Journal ArticleDOI
TL;DR: It is proved that the most expressive vectors derived in the null space of the within-class scatter matrix using principal component analysis (PCA) are equal to the optimal discriminant vectors derived in the original space using LDA.

1,447 citations


Book
01 Jan 2000
TL;DR: This book presents multivariate statistical techniques for ecological research, covering ordination with principal components analysis (PCA), cluster analysis, discriminant analysis, and canonical correlation analysis, together with their assumptions, sample size requirements, derivation, interpretation, and limitations.
Abstract: 1 Introduction and Overview.- 1.1 Objectives.- 1.2 Multivariate Statistics: An Ecological Perspective.- 1.3 Multivariate Description and Inference.- 1.4 Multivariate Confusion!.- 1.5 Types of Multivariate Techniques.- 1.5.1 Ordination.- 1.5.2 Cluster Analysis.- 1.5.3 Discriminant Analysis.- 1.5.4 Canonical Correlation Analysis.- 2 Ordination: Principal Components Analysis.- 2.1 Objectives.- 2.2 Conceptual Overview.- 2.2.1 Ordination.- 2.2.2 Principal Components Analysis (PCA).- 2.3 Geometric Overview.- 2.4 The Data Set.- 2.5 Assumptions.- 2.5.1 Multivariate Normality.- 2.5.2 Independent Random Sample and the Effects of Outliers.- 2.5.3 Linearity.- 2.6 Sample Size Requirements.- 2.6.1 General Rules.- 2.6.2 Specific Rules.- 2.7 Deriving the Principal Components.- 2.7.1 The Use of Correlation and Covariance Matrices.- 2.7.2 Eigenvalues and Associated Statistics.- 2.7.3 Eigenvectors and Scoring Coefficients.- 2.8 Assessing the Importance of the Principal Components.- 2.8.1 Latent Root Criterion.- 2.8.2 Scree Plot Criterion.- 2.8.3 Broken Stick Criterion.- 2.8.4 Relative Percent Variance Criterion.- 2.8.5 Significance Tests.- 2.9 Interpreting the Principal Components.- 2.9.1 Principal Component Structure.- 2.9.2 Significance of Principal Component Loadings.- 2.9.3 Interpreting the Principal Component Structure.- 2.9.4 Communality.- 2.9.5 Principal Component Scores and Associated Plots.- 2.10 Rotating the Principal Components.- 2.11 Limitations of Principal Components Analysis.- 2.12 R-Factor Versus Q-Factor Ordination.- 2.13 Other Ordination Techniques.- 2.13.1 Polar Ordination.- 2.13.2 Factor Analysis.- 2.13.3 Nonmetric Multidimensional Scaling.- 2.13.4 Reciprocal Averaging.- 2.13.5 Detrended Correspondence Analysis.- 2.13.6 Canonical Correspondence Analysis.- Appendix 2.1.- 3 Cluster Analysis.- 3.1 Objectives.- 3.2 Conceptual Overview.- 3.3 The Definition of Cluster.- 3.4 The Data Set.- 3.5 Clustering Techniques.- 3.6 Nonhierarchical Clustering.- 3.6.1 Polythetic Agglomerative Nonhierarchical Clustering.- 3.6.2 Polythetic Divisive Nonhierarchical Clustering.- 3.7 Hierarchical Clustering.- 3.7.1 Polythetic Agglomerative Hierarchical Clustering.- 3.7.2 Polythetic Divisive Hierarchical Clustering.- 3.8 Evaluating the Stability of the Cluster Solution.- 3.9 Complementary Use of Ordination and Cluster Analysis.- 3.10 Limitations of Cluster Analysis.- Appendix 3.1.- 4 Discriminant Analysis.- 4.1 Objectives.- 4.2 Conceptual Overview.- 4.2.1 Overview of Canonical Analysis of Discriminance.- 4.2.2 Overview of Classification.- 4.2.3 Analogy with Multiple Regression Analysis and Multivariate Analysis of Variance.- 4.3 Geometric Overview.- 4.4 The Data Set.- 4.5 Assumptions.- 4.5.1 Equality of Variance-Covariance Matrices.- 4.5.2 Multivariate Normality.- 4.5.3 Singularities and Multicollinearity.- 4.5.4 Independent Random Sample and the Effects of Outliers.- 4.5.5 Prior Probabilities Are Identifiable.- 4.5.6 Linearity 153.- 4.6 Sample Size Requirements.- 4.6.1 General Rules.- 4.6.2 Specific Rules.- 4.7 Deriving the Canonical Functions.- 4.7.1 Stepwise Selection of Variables.- 4.7.2 Eigenvalues and Associated Statistics.- 4.7.3 Eigenvectors and Canonical Coefficients.- 4.8 Assessing the Importance of the Canonical Functions.- 4.8.1 Relative Percent Variance Criterion.- 4.8.2 Canonical Correlation Criterion.- 4.8.3 Classification Accuracy.- 4.8.4 Significance Tests.- 4.8.5 Canonical Scores and Associated Plots.- 4.9 Interpreting the Canonical Functions.- 4.9.1 Standardized Canonical 
Coefficients.- 4.9.2 Total Structure Coefficients.- 4.9.3 Covariance-Controlled Partial F-Ratios.- 4.9.4 Significance Tests Based on Resampling Procedures.- 4.9.5 Potency Index.- 4.10 Validating the Canonical Functions.- 4.10.1 Split-Sample Validation.- 4.10.2 Validation Using Resampling Procedures.- 4.11 Limitations of Discriminant Analysis.- Appendix 4.1.- 5 Canonical Correlation Analysis.- 5.1 Objectives.- 5.2 Conceptual Overview.- 5.3 Geometric Overview.- 5.4 The Data Set.- 5.5 Assumptions.- 5.5.1 Multivariate Normality.- 5.5.2 Singularities and Multicollinearity.- 5.5.3 Independent Random Sample and the Effects of Outliers.- 5.5.4 Linearity.- 5.6 Sample Size Requirements.- 5.6.1 General Rules.- 5.6.2 Specific Rules.- 5.7 Deriving the Canonical Variates.- 5.7.1 The Use of Covariance and Correlation Matrices.- 5.7.2 Eigenvalues and Associated Statistics.- 5.7.3 Eigenvectors and Canonical Coefficients.- 5.8 Assessing the Importance of the Canonical Variates.- 5.8.1 Canonical Correlation Criterion.- 5.8.2 Canonical Redundancy Criterion.- 5.8.3 Significance Tests.- 5.8.4 Canonical Scores and Associated Plots.- 5.9 Interpreting the Canonical Variates.- 5.9.1 Standardized Canonical Coefficients.- 5.9.2 Structure Coefficients.- 5.9.3 Canonical Cross-Loadings.- 5.9.4 Significance Tests Based on Resampling Procedures.- 5.10 Validating the Canonical Variates.- 5.10.1 Split-Sample Validation.- 5.10.2 Validation Using Resampling Procedures.- 5.11 Limitations of Canonical Correlation Analysis.- Appendix 5.1.- 6 Summary and Comparison.- 6.1 Objectives.- 6.2 Relationship Among Techniques.- 6.2.1 Purpose and Source of Variation Emphasized.- 6.2.2 Statistical Procedure.- 6.2.3 Type of Statistical Technique and Variable Set Characteristics.- 6.2.4 Data Structure.- 6.2.5 Sampling Design.- 6.3 Complementary Use of Techniques.- Appendix: Acronyms Used in This Book.

1,371 citations


Journal ArticleDOI
TL;DR: A complete adaptive monitoring algorithm that addresses the issues of missing values and outliers is presented and applied to a rapid thermal annealing process in semiconductor processing.

757 citations


Journal ArticleDOI
TL;DR: In this article, the authors developed an information criterion that automatically determines the order of the dimensionality reduction for FDA and DPLS, and showed that FDA is more proficient than PCA for diagnosing faults, both theoretically and by applying these techniques to simulated data collected from the Tennessee Eastman chemical plant simulator.

586 citations


Proceedings Article
01 Jan 2000
TL;DR: By interpreting PCA as density estimation, it is shown how to use Bayesian model selection to estimate the true dimensionality of the data, and the resulting estimate is simple to compute yet guaranteed to pick the correct dimensionality, given enough data.
Abstract: A central issue in principal component analysis (PCA) is choosing the number of principal components to be retained. By interpreting PCA as density estimation, we show how to use Bayesian model selection to estimate the true dimensionality of the data. The resulting estimate is simple to compute yet guaranteed to pick the correct dimensionality, given enough data. The estimate involves an integral over the Stiefel manifold of k-frames, which is difficult to compute exactly. But after choosing an appropriate parameterization and applying Laplace's method, an accurate and practical estimator is obtained. In simulations, it is convincingly better than cross-validation and other proposed algorithms, plus it runs much faster.
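scikit-learn ships a PCA dimensionality estimate based on this Laplace-approximation evidence (Minka's MLE), so the method can be exercised in a few lines; the synthetic data and its true dimensionality below are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data with a known 5-dimensional signal subspace buried in 20 dimensions.
rng = np.random.default_rng(0)
latent = rng.normal(size=(1000, 5))
mixing = rng.normal(size=(5, 20))
X = latent @ mixing + 0.1 * rng.normal(size=(1000, 20))

# n_components='mle' invokes Minka's evidence-based estimate of the dimensionality
# (it requires the full SVD solver).
pca = PCA(n_components='mle', svd_solver='full').fit(X)
print("estimated dimensionality:", pca.n_components_)   # typically 5 here
```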

564 citations


Journal ArticleDOI
TL;DR: A residual-based CVA statistic proposed in this paper gave the best overall sensitivity and promptness, but the initially proposed threshold for the statistic lacked robustness, which motivated increasing the threshold to achieve a specified missed detection rate.

456 citations


Journal ArticleDOI
TL;DR: In this paper, the influence functions and the corresponding asymptotic variances for these robust estimators of eigenvalues and eigenvectors are investigated by a simulation study, and it turns out that the theoretical results and simulations favor the use of S-estimators since they combine a high efficiency with appealing robustness properties.
Abstract: A robust principal component analysis can be easily performed by computing the eigenvalues and eigenvectors of a robust estimator of the covariance or correlation matrix. In this paper we derive the influence functions and the corresponding asymptotic variances for these robust estimators of eigenvalues and eigenvectors. The behaviour of several of these estimators is investigated by a simulation study. It turns out that the theoretical results and simulations favour the use of S-estimators, since they combine a high efficiency with appealing robustness properties. © 2000 Biometrika Trust.
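A minimal sketch of the plug-in idea, assuming scikit-learn's MinCovDet (an MCD estimator) as the robust covariance estimator in place of the S-estimators the paper favours; the eigenvectors and eigenvalues of that robust estimate give the robust principal components.

```python
import numpy as np
from sklearn.covariance import MinCovDet

def robust_pca(X, n_components=2):
    """Robust PCA: eigenvalues/eigenvectors of a robust covariance estimate (MCD here)."""
    mcd = MinCovDet().fit(X)
    evals, evecs = np.linalg.eigh(mcd.covariance_)
    order = np.argsort(evals)[::-1][:n_components]
    scores = (X - mcd.location_) @ evecs[:, order]
    return evals[order], evecs[:, order], scores

# Data with a few gross outliers that would tilt classical PCA.
rng = np.random.default_rng(1)
X = rng.multivariate_normal([0, 0, 0], np.diag([9.0, 4.0, 1.0]), size=300)
X[:10] += 50 * rng.normal(size=(10, 3))
evals, evecs, scores = robust_pca(X, n_components=2)
```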

335 citations


Journal ArticleDOI
TL;DR: This work derives the principal components for high-dimensional random diffusion, which are almost perfect cosines; this resemblance implies that for many proteins the time scales of current simulations are too short to obtain convergence of collective motions.
Abstract: Principal component analysis, also called essential dynamics, is a powerful tool for finding global, correlated motions in atomic simulations of macromolecules. It has become an established technique for analyzing molecular dynamics simulations of proteins. The first few principal components of simulations of large proteins often resemble cosines. We derive the principal components for high-dimensional random diffusion, which are almost perfect cosines. This resemblance between protein simulations and noise implies that for many proteins the time scales of current simulations are too short to obtain convergence of collective motions.
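A minimal NumPy illustration of the phenomenon: the leading principal components of a high-dimensional random walk are compared against half-period cosines. The trajectory length, dimensionality, and seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n_steps, n_dim = 2000, 100
# High-dimensional random diffusion: cumulative sum of independent Gaussian steps.
traj = np.cumsum(rng.normal(size=(n_steps, n_dim)), axis=0)
traj -= traj.mean(axis=0)

# PCA of the trajectory: eigenvectors of the coordinate covariance matrix.
cov = traj.T @ traj / n_steps
evals, evecs = np.linalg.eigh(cov)
order = np.argsort(evals)[::-1]
proj = traj @ evecs[:, order]          # time series of the principal components

# Compare the first few PC time series with cosines cos(k*pi*t/T).
t = np.arange(n_steps)
for k in range(1, 4):
    cosine = np.cos(k * np.pi * t / n_steps)
    r = np.corrcoef(proj[:, k - 1], cosine)[0, 1]
    print(f"PC{k} vs cos({k}*pi*t/T): |r| = {abs(r):.3f}")   # typically close to 1
```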

273 citations


Proceedings ArticleDOI
10 Sep 2000
TL;DR: This work investigates a generalization of PCA, kernel principal component analysis (kernel PCA), for learning low dimensional representations in the context of face recognition and shows that kernel PCA outperforms the eigenface method in face recognition.
Abstract: Eigenface or principal component analysis (PCA) methods have demonstrated their success in face recognition, detection, and tracking. The representation in PCA is based on the second order statistics of the image set, and does not address higher order statistical dependencies such as the relationships among three or more pixels. Higher order statistics (HOS) have been used as a more informative low dimensional representation than PCA for face and vehicle detection. We investigate a generalization of PCA, kernel principal component analysis (kernel PCA), for learning low dimensional representations in the context of face recognition. In contrast to HOS, kernel PCA computes the higher order statistics without the combinatorial explosion of time and memory complexity. While PCA aims to find a second order correlation of patterns, kernel PCA provides a replacement which takes into account higher order correlations. We compare the recognition results using kernel methods with eigenface methods on two benchmarks. Empirical results show that kernel PCA outperforms the eigenface method in face recognition.
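A minimal sketch of the comparison using scikit-learn's PCA and KernelPCA; the digits dataset, RBF kernel width, and nearest-neighbor classifier are stand-ins for the paper's face benchmarks and matching procedure.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA, KernelPCA
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Stand-in data: sklearn's bundled digit images instead of a face database.
X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for name, proj in [("PCA (eigenface-style)", PCA(n_components=40)),
                   ("kernel PCA (RBF)", KernelPCA(n_components=40, kernel="rbf", gamma=1e-3))]:
    Z_tr = proj.fit_transform(X_tr)
    Z_te = proj.transform(X_te)
    acc = KNeighborsClassifier(n_neighbors=1).fit(Z_tr, y_tr).score(Z_te, y_te)
    print(f"{name}: nearest-neighbor accuracy = {acc:.3f}")
```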

Journal ArticleDOI
TL;DR: An analog-circuit fault diagnostic system based on backpropagation neural networks, using wavelet decomposition, principal component analysis, and data normalization as preprocessors; it performs significantly better in fault diagnosis of analog circuits due to the proposed preprocessing techniques.
Abstract: We have developed an analog-circuit fault diagnostic system based on backpropagation neural networks using wavelet decomposition, principal component analysis, and data normalization as preprocessors. The proposed system has the capability to detect and identify faulty components in an analog electronic circuit by analyzing its impulse response. Using wavelet decomposition to preprocess the impulse response drastically reduces the number of inputs to the neural network, simplifying its architecture and minimizing its training and processing time. The second preprocessing by principal component analysis can further reduce the dimensionality of the input space and/or select input features that minimize diagnostic errors. Input normalization removes large dynamic variances over one or more dimensions in input space, which tend to obscure the relevant data fed to the neural network. A comparison of our work with that of Spina and Upadhyaya (see ibid., vol. 44, p. 188-196, 1997), which also employs backpropagation neural networks, reveals that our system requires a much smaller network and performs significantly better in fault diagnosis of analog circuits due to our proposed preprocessing techniques.
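A minimal sketch of the preprocessing chain (wavelet decomposition, normalization, PCA, then a backpropagation network) using PyWavelets and scikit-learn. The synthetic "impulse responses", wavelet choice, component count, and network size are assumptions, not the paper's circuit data or settings.

```python
import numpy as np
import pywt
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def wavelet_features(signal, wavelet="db4", level=4):
    """Coarsest approximation coefficients of a multilevel DWT: a compact summary of the response."""
    return pywt.wavedec(signal, wavelet, level=level)[0]

# Placeholder "impulse responses": damped oscillations whose decay rate encodes the fault class.
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 512)
X, y = [], []
for label, decay in enumerate([3.0, 6.0, 12.0]):
    for _ in range(100):
        sig = np.exp(-decay * t) * np.sin(2 * np.pi * 25 * t) + 0.02 * rng.normal(size=t.size)
        X.append(wavelet_features(sig))
        y.append(label)
X, y = np.array(X), np.array(y)

# Wavelet features -> normalization -> PCA -> small backpropagation network (one possible ordering).
clf = make_pipeline(StandardScaler(), PCA(n_components=10),
                    MLPClassifier(hidden_layer_sizes=(20,), max_iter=2000, random_state=0))
print("training-set accuracy:", clf.fit(X, y).score(X, y))
```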

Journal ArticleDOI
TL;DR: The main advantage of the WSC method is its use of parameters that describe the joint time-frequency localization of spike features to build a fast and unspecialized pattern recognition procedure.

Journal ArticleDOI
TL;DR: In this article, a well-defined variance of reconstruction error (VRE) is proposed to determine the number of principal components in a PCA model for best reconstruction, which avoids the arbitrariness of other methods with monotonic indices.

Journal ArticleDOI
TL;DR: The application of kernel density estimation (KDE) and principal component analysis (PCA) to provide enhanced monitoring of multivariate processes, demonstrating the power and advantages of the KDE approach over parametric density estimation, which is still widely used.
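A minimal sketch of the idea: fit PCA to in-control data, compute a monitoring statistic (the squared prediction error is used here), and set its control limit from a kernel density estimate of that statistic instead of a parametric approximation. All names, dimensions, and the 99% limit are illustrative assumptions.

```python
import numpy as np
from scipy.stats import gaussian_kde
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X_normal = rng.normal(size=(1000, 8)) @ rng.normal(size=(8, 8))   # in-control training data

pca = PCA(n_components=3).fit(X_normal)

def spe(X):
    """Squared prediction error (residual) of each sample with respect to the PCA model."""
    X_hat = pca.inverse_transform(pca.transform(X))
    return np.sum((X - X_hat) ** 2, axis=1)

# Control limit from a KDE of the in-control SPE values (99th percentile of the fitted density).
spe_train = spe(X_normal)
kde = gaussian_kde(spe_train)
grid = np.linspace(0, spe_train.max() * 2, 2000)
cdf = np.cumsum(kde(grid))
cdf /= cdf[-1]
limit = grid[np.searchsorted(cdf, 0.99)]

# New samples are flagged when their SPE exceeds the KDE-derived limit.
X_new = X_normal[:5] + np.array([0, 0, 0, 0, 0, 0, 0, 5.0])       # a crude simulated fault
print(spe(X_new) > limit)
```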

Journal ArticleDOI
TL;DR: Independent Component Analysis (ICA) as mentioned in this paper is a variant of principal component analysis (PCA) in which the components are assumed to be mutually statistically independent instead of merely uncorrelated.
Abstract: SUMMARY This paper is an introduction to the concept of independent component analysis (ICA) which has recently been developed in the area of signal processing. ICA is a variant of principal component analysis (PCA) in which the components are assumed to be mutually statistically independent instead of merely uncorrelated. The stronger condition allows one to remove the rotational invariance of PCA, i.e. ICA provides a meaningful unique bilinear decomposition of two-way data that can be considered as a linear mixture of a number of independent source signals. The discipline of multilinear algebra offers some means to solve the ICA problem. In this paper we briefly discuss four orthogonal tensor decompositions that can be interpreted in terms of higher-order generalizations of the symmetric eigenvalue decomposition. Copyright © 2000 John Wiley & Sons, Ltd.

Journal ArticleDOI
TL;DR: A handbook of statistical analyses using Stata, covering data description and simple inference, regression, analysis of variance, logistic regression, generalized linear models, longitudinal and survival data, and principal components analysis.
Abstract: A Brief Introduction to Stata.- Data Description and Simple Inference.- Multiple Regression.- Analysis of Variance I.- Analysis of Variance II.- Logistic Regression.- Generalized Linear Models.- Analysis of Longitudinal Data I.- Analysis of Longitudinal Data II.- Some Epidemiology.- Survival Analysis.- Principal Components Analysis.- Maximum Likelihood Estimation.

Journal ArticleDOI
TL;DR: An orthonormal version of the PAST algorithm (OPAST) is elaborated on for fast estimation and tracking of the principal subspace and/or principal components of a vector sequence; it guarantees the orthonormality of the weight matrix at each iteration.
Abstract: Subspace decomposition has proven to be an important tool in adaptive signal processing. A number of algorithms have been proposed for tracking the dominant subspace. Among the most robust and most efficient methods is the projection approximation and subspace tracking (PAST) method. This paper elaborates on an orthonormal version of the PAST algorithm for fast estimation and tracking of the principal subspace or/and principal components of a vector sequence. The orthonormal PAST (OPAST) algorithm guarantees the orthonormality of the weight matrix at each iteration. Moreover, it has a linear complexity like the PAST algorithm and a global convergence property like the natural power (NP) method.
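A heavily simplified sketch of projection-approximation subspace tracking, with an explicit QR re-orthonormalization standing in for OPAST's efficient orthonormalization update; the forgetting factor, dimensions, and test signal are illustrative assumptions, not the authors' formulation.

```python
import numpy as np

def past_track(x_stream, r, beta=0.98, orthonormalize=True):
    """Track an r-dimensional dominant subspace from a stream of vectors (PAST-style recursion).
    The QR step below is a simple stand-in for OPAST's linear-complexity orthonormalization."""
    n = x_stream.shape[1]
    W = np.linalg.qr(np.random.default_rng(0).normal(size=(n, r)))[0]
    P = np.eye(r)
    for x in x_stream:
        y = W.T @ x
        h = P @ y
        g = h / (beta + y @ h)
        P = (P - np.outer(g, h)) / beta
        e = x - W @ y
        W = W + np.outer(e, g)
        if orthonormalize:
            W, _ = np.linalg.qr(W)        # keep the weight matrix orthonormal at each step
    return W

# Example: vectors drawn from a 2-D subspace of R^10 plus noise.
rng = np.random.default_rng(1)
basis = np.linalg.qr(rng.normal(size=(10, 2)))[0]
stream = (rng.normal(size=(5000, 2)) @ basis.T) + 0.05 * rng.normal(size=(5000, 10))
W = past_track(stream, r=2)
# The tracked subspace should be close to the true one (singular values near 1).
print(np.linalg.svd(W.T @ basis, compute_uv=False))
```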

01 Jan 2000
TL;DR: The empirical study showed that clustering with the PC’s instead of the original variables does not necessarily improve, and often degrades, cluster quality; the authors would not recommend PCA before clustering except in special circumstances.
Abstract: Motivation: There is a great need to develop analytical methodology to analyze and to exploit the information contained in gene expression data. Because of the large number of genes and the complexity of biological networks, clustering is a useful exploratory technique for analysis of gene expression data. Other classical techniques, such as principal component analysis (PCA), have also been applied to analyze gene expression data. Using different data analysis techniques and different clustering algorithms to analyze the same data set can lead to very different conclusions. Our goal is to study the effectiveness of principal components (PC’s) in capturing cluster structure. In other words, we empirically compared the quality of clusters obtained from the original data set to the quality of clusters obtained from clustering the PC’s using both real and synthetic gene expression data sets. Results: Our empirical study showed that clustering with the PC’s instead of the original variables does not necessarily improve, and often degrades, cluster quality. In particular, the first few PC’s (which contain most of the variation in the data) do not necessarily capture most of the cluster structure. We also showed that clustering with PC’s has different impact on different algorithms and different similarity metrics. Overall, we would not recommend PCA before clustering except in special circumstances. Availability: The software is under development. Contact: kayee@cs.washington.edu Supplementary information: http://www.cs.washington.edu/homes/kayee/pca
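A minimal sketch of the comparison protocol, with synthetic labelled data standing in for the gene-expression sets: cluster the original variables and the first few PCs, then score each clustering against the known classes with the adjusted Rand index.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import adjusted_rand_score

# Synthetic stand-in for a gene-expression matrix: 3 known classes in 50 dimensions.
X, y_true = make_blobs(n_samples=300, n_features=50, centers=3, cluster_std=5.0, random_state=0)

def cluster_ari(data):
    """Cluster with k-means and score agreement with the known classes."""
    labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(data)
    return adjusted_rand_score(y_true, labels)

print("ARI, original variables:", round(cluster_ari(X), 3))
for k in (2, 5, 10):
    Z = PCA(n_components=k).fit_transform(X)
    print(f"ARI, first {k} PCs:", round(cluster_ari(Z), 3))
```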

Journal ArticleDOI
TL;DR: If fMRI data are corrupted by scanner noise only, FCA and PCA show comparable performance; otherwise, FCA outperforms PCA over the entire CNR range of interest in fMRI, particularly for low CNR values.

Journal ArticleDOI
TL;DR: This paper proposes a subband approach to principal component analysis (PCA): PCA is applied on a wavelet subband. The wavelet transform decomposes an image into different frequency subbands, and a midrange frequency subband is used for the PCA representation.
Abstract: Together with the growing interest in the development of human and computer interface and biometric identification, human face recognition has become an active research area since early 1990. Nowadays, principal component analysis (PCA) has been widely adopted as the most promising face recognition algorithm. Yet still, the traditional PCA approach has its limitations: poor discriminatory power and large computational load. In view of these limitations, this article proposes a subband approach to PCA: apply PCA on a wavelet subband. Traditionally, to represent the human face, PCA is performed on the whole facial image. In the proposed method, wavelet transform is used to decompose an image into different frequency subbands, and a midrange frequency subband is used for PCA representation. In comparison with the traditional use of PCA, the proposed method gives better recognition accuracy and discriminatory power; further, the proposed method reduces the computational load significantly when the image database is large, with more than 256 training images. This article details the design and implementation of the proposed method, and presents the encouraging experimental results. © 2000 SPIE and IS&T. (S1017-9909(00)01702-5)
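A minimal sketch of the representation step with PyWavelets: decompose each image with a 2-D wavelet transform, keep one subband, and run PCA on the vectorized subband instead of the raw pixels. The wavelet, level, random placeholder images, and the use of the approximation subband (rather than the paper's midrange-frequency band) are assumptions.

```python
import numpy as np
import pywt
from sklearn.decomposition import PCA

def subband_feature(image, wavelet="db2", level=2):
    """2-D DWT of the image; return the coarsest approximation subband, vectorized."""
    coeffs = pywt.wavedec2(image, wavelet, level=level)
    return coeffs[0].ravel()          # coeffs[0] is the approximation subband

# Placeholder "face database": random 64x64 images (substitute real aligned face images).
rng = np.random.default_rng(0)
images = rng.normal(size=(200, 64, 64))

features = np.array([subband_feature(img) for img in images])
print("subband feature length vs raw pixels:", features.shape[1], "vs", 64 * 64)

# PCA on the (much shorter) subband vectors instead of the raw 4096-pixel vectors.
pca = PCA(n_components=30).fit(features)
embeddings = pca.transform(features)   # use these for nearest-neighbor face matching
```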

Journal ArticleDOI
TL;DR: An application of projection pursuit (PP) is proposed, which seeks to find a set of projections that are "interesting," in the sense that they deviate from the Gaussian distribution assumption, and which can be used for image compression, segmentation, or enhancement for visual analysis.
Abstract: Principal components analysis (PCA) is effective at compressing information in multivariate data sets by computing orthogonal projections that maximize the amount of data variance. Unfortunately, information content in hyperspectral images does not always coincide with such projections. The authors propose an application of projection pursuit (PP), which seeks to find a set of projections that are "interesting," in the sense that they deviate from the Gaussian distribution assumption. Once these projections are obtained, they can be used for image compression, segmentation, or enhancement for visual analysis. To find these projections, a two-step iterative process is followed where they first search for a projection that maximizes a projection index based on the information divergence of the projection's estimated probability distribution from the Gaussian distribution and then reduce the rank by projecting the data onto the subspace orthogonal to the previous projections. To calculate each projection, they use a simplified approach to maximizing the projection index, which does not require an optimization algorithm. It searches for a solution by obtaining a set of candidate projections from the data and choosing the one with the highest projection index. The effectiveness of this method is demonstrated through simulated examples as well as data from the hyperspectral digital imagery collection experiment (HYDICE) and the spatially enhanced broadband array spectrograph system (SEBASS).
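A minimal sketch of the candidate-projection search described above: candidate directions are taken from the data itself, each is scored with a simple non-Gaussianity index (absolute excess kurtosis stands in for the paper's information-divergence index), the best is kept, and the data are deflated onto the orthogonal complement before repeating.

```python
import numpy as np
from scipy.stats import kurtosis

def projection_pursuit(X, n_projections=3):
    """Greedy projection pursuit with data-derived candidate directions and deflation.
    Absolute excess kurtosis is used as a simple stand-in projection index."""
    Xc = X - X.mean(axis=0)
    directions = []
    for _ in range(n_projections):
        candidates = Xc[np.linalg.norm(Xc, axis=1) > 0]
        candidates = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
        scores = [abs(kurtosis(Xc @ c)) for c in candidates]
        best = candidates[int(np.argmax(scores))]
        directions.append(best)
        # Deflate: remove the component of every sample along the chosen direction.
        Xc = Xc - np.outer(Xc @ best, best)
    return np.array(directions)

# Example: 10-dimensional Gaussian background with one heavy-tailed "interesting" direction.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
X[:, 3] = rng.standard_t(df=2, size=500)        # non-Gaussian structure in coordinate 3
W = projection_pursuit(X, n_projections=2)
print(np.round(W[0], 2))                        # first direction should load mostly on coordinate 3
```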

Journal ArticleDOI
TL;DR: In this paper, several capability indices are proposed to summarize the process performance using principal component analysis (PCA) and the corresponding confidence intervals are derived, which is particularly useful in analyzing large sets of correlated data.
Abstract: Quality measures can be used to evaluate a process’s performance. Related quality characteristics such as weight, width, and height can be analyzed jointly using multivariate statistical techniques. Recently, multivariate capability indices have been developed to assess the process capability of a product with multiple quality characteristics. This approach assumes a multivariate normal distribution. However, obtaining these distributions can be a complicated task, making it difficult to derive the needed confidence intervals. Therefore, there is a need for a robust method to assess process performance on non-multivariate-normal data. Principal component analysis (PCA) can transform high-dimensional problems into lower-dimensional problems and provide sufficient information. This method is particularly useful in analyzing large sets of correlated data. Also, the application of PCA does not require the multivariate normal assumption. In this study, several capability indices are proposed to summarize the process performance using PCA. Also, the corresponding confidence intervals are derived. Real-world case studies illustrate the value and power of this methodology.

Journal ArticleDOI
TL;DR: A spatiotemporal method for source localization that takes advantage of the entire EEG time series to reduce the configuration space that must be evaluated, substantially reducing the search complexity and increasing the likelihood of efficiently converging on the correct solution.
Abstract: We consider a spatiotemporal method for source localization, taking advantage of the entire EEG time series to reduce the configuration space we must evaluate. The EEG data are first decomposed into signal and noise subspaces using a principal component analysis (PCA) decomposition. This partitioning allows us to easily discard the noise subspace, which has two primary benefits: the remaining signal is less noisy, and it has lower dimensionality. After PCA, we apply independent component analysis (ICA) on the signal subspace. The ICA algorithm separates multichannel data into activation maps due to temporally independent stationary sources. For each activation map we perform an EEG source localization procedure, looking only for a single dipole per map. By localizing multiple dipoles independently, we substantially reduce our search complexity and increase the likelihood of efficiently converging on the correct solution.
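A minimal sketch of the subspace step with scikit-learn, using synthetic multichannel data in place of EEG; the single-dipole localization of each activation map is omitted.

```python
import numpy as np
from sklearn.decomposition import PCA, FastICA

rng = np.random.default_rng(0)
n_samples, n_channels = 2000, 32
t = np.linspace(0, 2, n_samples)

# Two temporally independent "sources" mixed into 32 channels, plus sensor noise.
sources = np.c_[np.sin(2 * np.pi * 7 * t), np.sign(np.sin(2 * np.pi * 11 * t))]
mixing = rng.normal(size=(2, n_channels))
eeg = sources @ mixing + 0.3 * rng.normal(size=(n_samples, n_channels))

# Step 1: PCA separates a low-dimensional signal subspace from the noise subspace.
pca = PCA(n_components=2)
signal_subspace = pca.fit_transform(eeg)

# Step 2: ICA on the signal subspace recovers temporally independent activation maps;
# each recovered component would then be localized with a single-dipole fit.
ica = FastICA(n_components=2, random_state=0)
activations = ica.fit_transform(signal_subspace)
```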

Journal ArticleDOI
TL;DR: A new kind of fitness function was applied, combining the prediction errors of the calibration set and an independent validation set, and a general statistical criterion for judging the significance of differences between individual calibration models was introduced.

Journal ArticleDOI
TL;DR: In this article, a principal components-based approach known as Min/Max Autocorrelation Factors (MAF) is applied to the modeling of pore-size distributions in partially welded tuff.
Abstract: In many fields of the Earth Sciences, one is interested in the distribution of particle or void sizes within samples. Like many other geological attributes, size distributions exhibit spatial variability, and it is convenient to view them within a geostatistical framework, as regionalized functions or curves. Since they rarely conform to simple parametric models, size distributions are best characterized using their raw spectrum as determined experimentally in the form of a series of abundance measures corresponding to a series of discrete size classes. However, the number of classes may be large and the class abundances may be highly cross-correlated. In order to model the spatial variations of discretized size distributions using current geostatistical simulation methods, it is necessary to reduce the number of variables considered and to render them uncorrelated among one another. This is achieved using a principal components-based approach known as Min/Max Autocorrelation Factors (MAF). For a two-structure linear model of coregionalization, the approach has the attractive feature of producing orthogonal factors ranked in order of increasing spatial correlation. Factors consisting largely of noise and exhibiting pure nugget–effect correlation structures are isolated in the lower rankings, and these need not be simulated. The factors to be simulated are those capturing most of the spatial correlation in the data, and they are isolated in the highest rankings. Following a review of MAF theory, the approach is applied to the modeling of pore-size distributions in partially welded tuff. Results of the case study confirm the usefulness of the MAF approach for the simulation of large numbers of coregionalized variables.
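A minimal NumPy sketch of the MAF construction for samples ordered along one transect: whiten the variables with the ordinary covariance matrix, then eigendecompose the covariance of the lag-1 increments of the whitened data; factors are ordered from noise-like to spatially continuous. The synthetic transect and lag are assumptions.

```python
import numpy as np

def maf(X, lag=1):
    """Min/max autocorrelation factors for a multivariate series ordered along a transect.
    Returns factors ordered from lowest to highest spatial autocorrelation."""
    Xc = X - X.mean(axis=0)
    # Step 1: whiten with the ordinary covariance matrix.
    evals, evecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    Y = Xc @ evecs @ np.diag(1.0 / np.sqrt(evals))
    # Step 2: eigendecompose the covariance of the lag-h increments of the whitened data.
    D = Y[lag:] - Y[:-lag]
    d_evals, d_evecs = np.linalg.eigh(np.cov(D, rowvar=False))
    # Largest increment variance = least spatial continuity (noise-like factors first).
    order = np.argsort(d_evals)[::-1]
    return Y @ d_evecs[:, order]

# Example: two smooth spatial signals mixed into five correlated variables plus white noise.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 500)
signals = np.c_[np.sin(x), np.cos(0.5 * x)]
X = signals @ rng.normal(size=(2, 5)) + 0.5 * rng.normal(size=(500, 5))
factors = maf(X)        # the last columns carry the spatially continuous structure
```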

Journal ArticleDOI
TL;DR: A nonlinear generalization of principal component analysis (PCA) is implemented in a variational framework using a five-layer autoassociative feed-forward neural network, and it is found that as noise is added to the Lorenz attractor, the NLPCA approximations remain superior to the PCA approximations until the noise level is so great that the lower-dimensional nonlinear structure of the data is no longer manifest to the eye.
Abstract: A nonlinear generalization of principal component analysis (PCA), denoted nonlinear principal component analysis (NLPCA), is implemented in a variational framework using a five-layer autoassociative feed-forward neural network. The method is tested on a dataset sampled from the Lorenz attractor, and it is shown that the NLPCA approximations to the attractor in one and two dimensions, explaining 76% and 99.5% of the variance, respectively, are superior to the corresponding PCA approximations, which respectively explain 60% (mode 1) and 95% (modes 1 and 2) of the variance. It is found that as noise is added to the Lorenz attractor, the NLPCA approximations remain superior to the PCA approximations until the noise level is so great that the lower-dimensional nonlinear structure of the data is no longer manifest to the eye. Finally, directions for future work are presented, and a cinematographic technique to visualize the results of NLPCA is discussed.
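A minimal sketch of the autoassociative-network idea, using scikit-learn's MLPRegressor as a stand-in for the paper's five-layer variational implementation: the network reproduces its input through a one-unit bottleneck, so the bottleneck acts as a single nonlinear principal component; the parabolic toy data replace the Lorenz attractor.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
# Stand-in dataset: points near a 1-D nonlinear curve (a parabola) in 2-D, plus noise.
t = rng.uniform(-1, 1, size=1000)
X = np.c_[t, t ** 2] + 0.05 * rng.normal(size=(1000, 2))

# Five layers: input - mapping(8) - bottleneck(1) - demapping(8) - output.
# Training the network to reproduce X from X squeezes the data through the 1-D bottleneck.
nlpca = MLPRegressor(hidden_layer_sizes=(8, 1, 8), activation="tanh",
                     max_iter=5000, random_state=0)
nlpca.fit(X, X)
X_hat = nlpca.predict(X)                      # reconstruction from the nonlinear component
nlpca_err = np.mean(np.sum((X - X_hat) ** 2, axis=1))

# Linear PCA with one component, for comparison.
Xc = X - X.mean(axis=0)
u, s, vt = np.linalg.svd(Xc, full_matrices=False)
X_lin = np.outer(u[:, 0] * s[0], vt[0]) + X.mean(axis=0)
pca_err = np.mean(np.sum((X - X_lin) ** 2, axis=1))
print(f"reconstruction MSE  NLPCA: {nlpca_err:.4f}   PCA(1 mode): {pca_err:.4f}")
```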

Journal ArticleDOI
TL;DR: In this article, it is shown that convolution of derivative filter coefficients with the error covariance matrices for the data tends to reduce the contributions of correlated error, thereby reducing the presence of drift noise.
Abstract: The characteristics of baseline drift are discussed from the perspective of error covariance. From this standpoint, the operation of derivative filters as preprocessing tools for multivariate calibration is explored. It is shown that convolution of derivative filter coefficients with the error covariance matrices for the data tends to reduce the contributions of correlated error, thereby reducing the presence of drift noise. This theory is corroborated by examination of experimental error covariance matrices before and after derivative preprocessing. It is proposed that maximum likelihood principal components analysis (MLPCA) is an optimal method for countering the deleterious effects of drift noise when the characteristics of that noise are known, since MLPCA uses error covariance information to perform a maximum likelihood projection of the data. In simulation and experimental studies, the performance of MLPCR and derivative-preprocessed PCR is compared to that of PCR with multivariate calibration data showing significant levels of drift. MLPCR is found to perform as well as or better than derivative PCR (with the best-suited derivative filter characteristics), provided that reasonable estimates of the drift noise characteristics are available. Recommendations are given for the use of MLPCR with poor estimates of the error covariance information.
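A minimal sketch of the derivative-preprocessing side of the comparison, using a Savitzky-Golay first-derivative filter before principal component regression; MLPCA/MLPCR, which require the error covariance structure, are not reproduced. The synthetic spectra, drift model, and filter settings are assumptions.

```python
import numpy as np
from scipy.signal import savgol_filter
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
wavelengths = np.linspace(0, 1, 200)
peak = np.exp(-((wavelengths - 0.5) ** 2) / 0.005)      # a single analyte band

# Synthetic spectra: analyte signal plus a slowly varying baseline drift and random noise.
conc = rng.uniform(0, 1, size=150)
drift = rng.normal(size=(150, 1)) * wavelengths          # correlated (drift) error
spectra = np.outer(conc, peak) + drift + 0.01 * rng.normal(size=(150, 200))

def pcr_r2(X, y, n_components=5):
    """Principal component regression: PCA scores feeding a linear model; calibration R^2."""
    model = make_pipeline(PCA(n_components=n_components), LinearRegression())
    return model.fit(X, y).score(X, y)

# A first-derivative filter suppresses additive/linear baseline contributions before PCR.
deriv = savgol_filter(spectra, window_length=11, polyorder=2, deriv=1, axis=1)
print("PCR R^2, raw spectra:       ", round(pcr_r2(spectra, conc), 3))
print("PCR R^2, derivative spectra:", round(pcr_r2(deriv, conc), 3))
```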

Journal ArticleDOI
S. K. Vines
TL;DR: An algorithm for producing simple approximate principal components directly from a variance-covariance matrix using a series of 'simplicity preserving' linear transformations, so that the resulting components can always be represented by integers.
Abstract: We introduce an algorithm for producing simple approximate principal components directly from a variance-covariance matrix. At the heart of the algorithm is a series of 'simplicity preserving' linear transformations. Each transformation seeks a direction within a two-dimensional subspace that has maximum variance. However, the choice of directions is limited so that the direction can be represented by a vector of integers whenever the subspace can also be represented by vectors of integers. The resulting approximate components can therefore always be represented by integers. Furthermore, the elements of these integer vectors are often small, particularly for the first few components. We demonstrate the performance of this algorithm on two data sets and show that good approximations to the principal components that are also clearly simple and interpretable can result.

Journal ArticleDOI
TL;DR: This article implements a factor analysis model for pre-processing neurobiological data and shows that an approach to separating noise-contaminated data without knowing the number of independent components is effective.