
Showing papers on "Principal component analysis published in 2013"


Journal ArticleDOI
TL;DR: In this paper, the Principal Orthogonal complEment Thresholding (POET) method was introduced to explore an approximate factor structure for high-dimensional covariance with a conditional sparsity structure, which is the composition of a low-rank matrix plus a sparse matrix.
Abstract: This paper deals with estimation of high-dimensional covariance with a conditional sparsity structure, which is the composition of a low-rank matrix plus a sparse matrix. By assuming a sparse error covariance matrix in a multi-factor model, we allow for cross-sectional correlation even after taking out common but unobservable factors. We introduce the Principal Orthogonal complEment Thresholding (POET) method to explore such an approximate factor structure. The POET estimator includes the sample covariance matrix, the factor-based covariance matrix (Fan, Fan and Lv, 2008), the thresholding estimator (Bickel and Levina, 2008) and the adaptive thresholding estimator (Cai and Liu, 2011) as specific examples. We provide mathematical insights when the factor analysis is approximately the same as the principal component analysis for high-dimensional data. The rates of convergence of the sparse residual covariance matrix and the conditional sparse covariance matrix are studied under various norms, including the spectral norm. It is shown that the impact of estimating the unknown factors vanishes as the dimensionality increases. The uniform rates of convergence for the unobserved factors and their factor loadings are derived. The asymptotic results are also verified by extensive simulation studies.
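
The POET construction lends itself to a compact sketch: eigendecompose the sample covariance, keep the top K principal components as the low-rank factor part, and soft-threshold the remaining "principal orthogonal complement". The code below is a minimal illustration under assumed inputs (the data matrix, number of factors K, and threshold tau are placeholders), not the authors' full estimator with its adaptive thresholding rules.

```python
import numpy as np

def poet_covariance(X, K, tau):
    """Hedged sketch of a POET-style estimator: low-rank part from the top-K
    principal components of the sample covariance, plus a thresholded residual."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / n                      # sample covariance (p x p)
    vals, vecs = np.linalg.eigh(S)         # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:K]       # indices of the K leading eigenvalues
    low_rank = (vecs[:, idx] * vals[idx]) @ vecs[:, idx].T
    R = S - low_rank                       # principal orthogonal complement
    R_thr = np.sign(R) * np.maximum(np.abs(R) - tau, 0.0)  # soft-threshold entries
    np.fill_diagonal(R_thr, np.diag(R))    # keep the residual diagonal untouched
    return low_rank + R_thr

# toy usage: 200 observations of 50 variables, 3 assumed factors
Sigma_hat = poet_covariance(np.random.randn(200, 50), K=3, tau=0.1)
```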

635 citations


Proceedings ArticleDOI
23 Jun 2013
TL;DR: A novel approach to the pedestrian re-identification problem that uses metric learning to improve the state-of-the-art performance on standard public datasets and is an effective way to process observations comprising multiple shots, and is non-iterative: the computation times are relatively modest.
Abstract: Metric learning methods, for person re-identification, estimate a scaling for distances in a vector space that is optimized for picking out observations of the same individual. This paper presents a novel approach to the pedestrian re-identification problem that uses metric learning to improve on the state-of-the-art performance on standard public datasets. Very high dimensional features are extracted from the source color image. A first processing stage performs unsupervised PCA dimensionality reduction, constrained to maintain the redundancy in color-space representation. A second stage further reduces the dimensionality, using a Local Fisher Discriminant Analysis defined by a training set. A regularization step is introduced to avoid singular matrices during this stage. The experiments conducted on three publicly available datasets confirm that the proposed method outperforms the state of the art, including all other known metric learning methods. Furthermore, the method is an effective way to process observations comprising multiple shots, and is non-iterative: the computation times are relatively modest. Finally, a novel statistic is derived to characterize the Match Characteristic: the normalized entropy reduction can be used to define the 'Proportion of Uncertainty Removed' (PUR). This measure is invariant to test set size and provides an intuitive indication of performance.
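
The two-stage pipeline (unsupervised PCA reduction followed by a regularized supervised projection) can be sketched as follows. Local Fisher Discriminant Analysis is not available in scikit-learn, so shrinkage-regularized LDA stands in for it here; the feature matrix, labels, and dimensionalities are placeholders rather than the paper's actual descriptors.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Placeholder high-dimensional colour features and person identities.
X = np.random.randn(400, 2000)          # 400 observations, 2000-dim descriptors
y = np.random.randint(0, 50, size=400)  # 50 identities

# Stage 1: unsupervised PCA dimensionality reduction.
X_pca = PCA(n_components=100).fit_transform(X)

# Stage 2: supervised discriminant projection. Plain LDA with shrinkage stands in
# for the paper's regularised Local Fisher Discriminant Analysis.
lda = LinearDiscriminantAnalysis(solver="eigen", shrinkage="auto")
X_metric = lda.fit(X_pca, y).transform(X_pca)

# Re-identification would then rank gallery images by distance in X_metric.
```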

607 citations


Journal ArticleDOI
TL;DR: Five types of beat classes of arrhythmia, as recommended by the Association for the Advancement of Medical Instrumentation (AAMI), were analyzed, and dimensionality-reduced features were fed to Support Vector Machine, neural network and probabilistic neural network (PNN) classifiers for automated diagnosis.

586 citations


Journal ArticleDOI
TL;DR: JIVE quantifies the amount of joint variation between data types, reduces the dimensionality of the data, and provides new directions for the visual exploration of joint and individual structure.
Abstract: Research in several fields now requires the analysis of datasets in which multiple high-dimensional types of data are available for a common set of objects. In particular, The Cancer Genome Atlas (TCGA) includes data from several diverse genomic technologies on the same cancerous tumor samples. In this paper we introduce Joint and Individual Variation Explained (JIVE), a general decomposition of variation for the integrated analysis of such datasets. The decomposition consists of three terms: a low-rank approximation capturing joint variation across data types, low-rank approximations for structured variation individual to each data type, and residual noise. JIVE quantifies the amount of joint variation between data types, reduces the dimensionality of the data, and provides new directions for the visual exploration of joint and individual structure. The proposed method represents an extension of Principal Component Analysis and has clear advantages over popular two-block methods such as Canonical Correlation Analysis and Partial Least Squares. A JIVE analysis of gene expression and miRNA data on Glioblastoma Multiforme tumor samples reveals gene-miRNA associations and provides better characterization of tumor types.
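
A heavily simplified, non-iterative sketch of the JIVE idea is given below: estimate the joint variation as a low-rank approximation of the row-stacked data blocks, then estimate each block's individual variation from its residual. The block sizes and ranks are illustrative assumptions; the actual JIVE procedure iterates these steps and enforces orthogonality between joint and individual structure.

```python
import numpy as np

def low_rank(M, r):
    """Best rank-r approximation via truncated SVD."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U[:, :r] @ np.diag(s[:r]) @ Vt[:r, :]

def jive_sketch(blocks, joint_rank, indiv_ranks):
    """One-pass illustration of the decomposition X_i = J_i + A_i + E_i."""
    stacked = np.vstack(blocks)                    # all features x common samples
    J = low_rank(stacked, joint_rank)              # joint variation
    out, start = [], 0
    for X, r in zip(blocks, indiv_ranks):
        Ji = J[start:start + X.shape[0]]           # joint piece for this block
        Ai = low_rank(X - Ji, r)                   # individual variation
        out.append((Ji, Ai, X - Ji - Ai))          # (joint, individual, residual)
        start += X.shape[0]
    return out

# toy usage: gene-expression-like and miRNA-like blocks over 100 common samples
parts = jive_sketch([np.random.randn(500, 100), np.random.randn(200, 100)],
                    joint_rank=2, indiv_ranks=[3, 3])
```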

424 citations


Journal ArticleDOI
TL;DR: This article presents MFA, reviews recent extensions, and illustrates it with a detailed example that shows that the common factor scores can be obtained by replacing the original normalized data tables by the normalized factor scores obtained from the PCA of each of these tables.
Abstract: Multiple factor analysis (MFA), also called multiple factorial analysis, is an extension of principal component analysis (PCA) tailored to handle multiple data tables that measure sets of variables coll...
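
The core MFA recipe can be sketched briefly: normalize each table by its first singular value so that no single table dominates, concatenate the weighted tables, and run a global PCA. This is a minimal illustration with made-up tables; it omits the supplementary outputs (partial factor scores, table loadings) that a full MFA reports.

```python
import numpy as np

def mfa_scores(tables, n_components=2):
    """Minimal MFA sketch: weight each centred, column-scaled table by the inverse
    of its first singular value, concatenate, then run a global PCA via SVD."""
    weighted = []
    for T in tables:
        Z = (T - T.mean(axis=0)) / T.std(axis=0, ddof=1)
        s1 = np.linalg.svd(Z, compute_uv=False)[0]   # first singular value
        weighted.append(Z / s1)
    grand = np.hstack(weighted)                      # same observations, all variables
    U, s, Vt = np.linalg.svd(grand, full_matrices=False)
    return U[:, :n_components] * s[:n_components]    # common factor scores

# toy usage: three tables describing the same 30 observations
scores = mfa_scores([np.random.randn(30, 5), np.random.randn(30, 8), np.random.randn(30, 4)])
```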

333 citations


Journal ArticleDOI
TL;DR: In this paper, the authors consider how one of the oldest and most widely applied statistical methods, principal components analysis (PCA), is employed with spatial data, and identify four main methodologies, which are defined as (1) PCA applied to spatial objects, (2) PCA applied to raster data, (3) atmospheric science PCA, and (4) PCA on flows.
Abstract: This article considers critically how one of the oldest and most widely applied statistical methods, principal components analysis (PCA), is employed with spatial data. We first provide a brief guide to how PCA works: This includes robust and compositional PCA variants, links to factor analysis, latent variable modeling, and multilevel PCA. We then present two different approaches to using PCA with spatial data. First we look at the nonspatial approach, which avoids challenges posed by spatial data by using a standard PCA on attribute space only. Within this approach we identify four main methodologies, which we define as (1) PCA applied to spatial objects, (2) PCA applied to raster data, (3) atmospheric science PCA, and (4) PCA on flows. In the second approach, we look at PCA adapted for effects in geographical space by looking at PCA methods adapted for first-order nonstationary effects (spatial heterogeneity) and second-order stationary effects (spatial autocorrelation). We also describe how PCA can be...
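
Of the four non-spatial methodologies listed above, PCA applied to raster data is the easiest to sketch: each band becomes an attribute, each pixel an observation, and the component scores are mapped back onto the grid. The band count and grid size below are arbitrary placeholders.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical raster stack: 6 bands on a 200 x 300 grid.
bands = np.random.rand(6, 200, 300)

# Flatten to a (pixels x bands) attribute table, run a standard PCA,
# then map the component scores back onto the grid as new raster layers.
n_bands, rows, cols = bands.shape
pixels = bands.reshape(n_bands, -1).T            # shape (rows*cols, n_bands)
scores = PCA(n_components=3).fit_transform(pixels)
pc_rasters = scores.T.reshape(3, rows, cols)     # each principal component as a raster
```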

331 citations


Journal ArticleDOI
TL;DR: In this paper, the authors considered both minimax and adaptive estimation of the principal subspace in the high dimensional setting and established the optimal rates of convergence for estimating the subspace which are sharp with respect to all the parameters, thus providing a complete characterization of the difficulty of the estimation problem in terms of the convergence rate.
Abstract: Principal component analysis (PCA) is one of the most commonly used statistical procedures with a wide range of applications. This paper considers both minimax and adaptive estimation of the principal subspace in the high dimensional setting. Under mild technical conditions, we first establish the optimal rates of convergence for estimating the principal subspace which are sharp with respect to all the parameters, thus providing a complete characterization of the difficulty of the estimation problem in terms of the convergence rate. The lower bound is obtained by calculating the local metric entropy and an application of Fano’s lemma. The rate optimal estimator is constructed using aggregation, which, however, might not be computationally feasible. We then introduce an adaptive procedure for estimating the principal subspace which is fully data driven and can be computed efficiently. It is shown that the estimator attains the optimal rates of convergence simultaneously over a large collection of the parameter spaces. A key idea in our construction is a reduction scheme which reduces the sparse PCA problem to a high-dimensional multivariate regression problem. This method is potentially also useful for other related problems.

305 citations


Journal ArticleDOI
TL;DR: In this paper, a new iterative thresholding approach for estimating principal subspaces in the setting where the leading eigenvectors are sparse is proposed, and the new approach recovers the principal subspace and leading eigenvectors consistently, and even optimally, in a range of high-dimensional sparse settings.
Abstract: Principal component analysis (PCA) is a classical dimension reduction method which projects data onto the principal subspace spanned by the leading eigenvectors of the covariance matrix. However, it behaves poorly when the number of features $p$ is comparable to, or even much larger than, the sample size $n$. In this paper, we propose a new iterative thresholding approach for estimating principal subspaces in the setting where the leading eigenvectors are sparse. Under a spiked covariance model, we find that the new approach recovers the principal subspace and leading eigenvectors consistently, and even optimally, in a range of high-dimensional sparse settings. Simulated examples also demonstrate its competitive performance.
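
The flavour of an iterative thresholding scheme for sparse principal subspaces can be conveyed with a short sketch: alternate a power step on the sample covariance, a soft-thresholding step that zeroes small loadings, and re-orthonormalization. The threshold level and iteration count below are arbitrary choices, not the paper's theoretically calibrated ones.

```python
import numpy as np

def iterative_threshold_pca(S, k, tau, n_iter=50):
    """Hedged sketch of iterative thresholding for sparse principal subspaces:
    orthogonal iteration on the covariance S with a soft-threshold step."""
    p = S.shape[0]
    rng = np.random.default_rng(0)
    Q, _ = np.linalg.qr(rng.standard_normal((p, k)))
    for _ in range(n_iter):
        Y = S @ Q                                        # power step
        Y = np.sign(Y) * np.maximum(np.abs(Y) - tau, 0)  # shrink small loadings to zero
        Q, _ = np.linalg.qr(Y)                           # re-orthonormalise
    return Q                                             # sparse-ish basis estimate

# toy usage on a spiked covariance with a sparse leading eigenvector
p = 100
v = np.zeros(p); v[:5] = 1 / np.sqrt(5)
S = 4 * np.outer(v, v) + np.eye(p)
basis = iterative_threshold_pca(S, k=1, tau=0.05)
```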

298 citations


Journal ArticleDOI
TL;DR: Principal component analysis is one of the algorithms used in biometrics; it builds on standard deviation, covariance, and eigenvectors, and it is a tool to reduce multidimensional data to lower dimensions while retaining most of the information.
Abstract: Principal component analysis (PCA) is one of the algorithms used in biometrics. It is a statistical technique that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables. PCA is also a tool to reduce multidimensional data to lower dimensions while retaining most of the information. It builds on standard deviation, covariance, and eigenvectors. This background knowledge is meant to make the PCA section very straightforward, but it can be skipped if the concepts are already familiar.
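
Since this entry is essentially a primer, a textbook implementation of PCA via the covariance matrix and its eigenvectors may be a useful companion; the function and toy data below are illustrative only.

```python
import numpy as np

def pca(X, n_components):
    """Textbook PCA: centre the data, form the covariance matrix, take the
    eigenvectors with the largest eigenvalues, and project onto them."""
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)                 # covariance of the variables
    eigvals, eigvecs = np.linalg.eigh(cov)         # ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    components = eigvecs[:, order]                 # principal directions
    return Xc @ components, eigvals[order]         # scores and explained variances

# toy usage: reduce 10-dimensional observations to 2 dimensions
scores, variances = pca(np.random.randn(500, 10), n_components=2)
```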

295 citations


Journal ArticleDOI
TL;DR: In this article, the authors study conditions under which the latent factors can be estimated asymptotically without rotation, derive the limiting distributions for the estimated factors and factor loadings when N and T are large, and examine how identification of the factors affects inference based on factor-augmented regressions.

279 citations


Journal ArticleDOI
TL;DR: A Bayesian model based on automatic relevance determination (ARD) in which the columns of the dictionary matrix and the rows of the activation matrix are tied together through a common scale parameter in their prior is proposed.
Abstract: This paper addresses the estimation of the latent dimensionality in nonnegative matrix factorization (NMF) with the β-divergence. The β-divergence is a family of cost functions that includes the squared Euclidean distance, Kullback-Leibler (KL) and Itakura-Saito (IS) divergences as special cases. Learning the model order is important as it is necessary to strike the right balance between data fidelity and overfitting. We propose a Bayesian model based on automatic relevance determination (ARD) in which the columns of the dictionary matrix and the rows of the activation matrix are tied together through a common scale parameter in their prior. A family of majorization-minimization (MM) algorithms is proposed for maximum a posteriori (MAP) estimation. A subset of scale parameters is driven to a small lower bound in the course of inference, with the effect of pruning the corresponding spurious components. We demonstrate the efficacy and robustness of our algorithms by performing extensive experiments on synthetic data, the swimmer dataset, a music decomposition example, and a stock price prediction task.
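
The paper's contribution is the ARD prior that prunes superfluous components; that machinery is too involved for a short sketch, but the underlying β-divergence NMF it builds on can be illustrated for β = 1 (the Kullback-Leibler case) with the standard multiplicative updates. The matrix sizes and rank below are placeholders, and the ARD scale parameters are deliberately omitted.

```python
import numpy as np

def nmf_kl(V, K, n_iter=200, eps=1e-9):
    """Multiplicative updates for NMF under the Kullback-Leibler divergence
    (the beta = 1 member of the beta-divergence family); ARD pruning omitted."""
    rng = np.random.default_rng(0)
    F, N = V.shape
    W = rng.random((F, K)) + eps                   # dictionary
    H = rng.random((K, N)) + eps                   # activations
    for _ in range(n_iter):
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + eps)
        WH = W @ H + eps
        W *= ((V / WH) @ H.T) / (H.sum(axis=1)[None, :] + eps)
    return W, H

# toy usage: factorise a nonnegative 64 x 100 matrix with 5 assumed components
W, H = nmf_kl(np.abs(np.random.randn(64, 100)), K=5)
```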

Journal ArticleDOI
TL;DR: In this paper, a finite sample analysis of the detection levels for sparse principal components of a high-dimensional covariance matrix is performed, based on a sparse eigenvalue statistic.
Abstract: We perform a finite sample analysis of the detection levels for sparse principal components of a high-dimensional covariance matrix. Our minimax optimal test is based on a sparse eigenvalue statistic. Alas, computing this test is known to be NP-complete in general, and we describe a computationally efficient alternative test using convex relaxations. Our relaxation is also proved to detect sparse principal components at near optimal detection levels, and it performs well on simulated datasets. Moreover, using polynomial time reductions from theoretical computer science, we bring significant evidence that our results cannot be improved, thus revealing an inherent trade-off between statistical and computational performance.

Journal ArticleDOI
TL;DR: The algorithm proposed in this paper allows the optic disc to be automatically segmented from a fundus image, to facilitate the early detection of certain pathologies and to fully automate the process so as to avoid specialist intervention.
Abstract: The algorithm proposed in this paper allows the optic disc to be automatically segmented from a fundus image. The goal is to facilitate the early detection of certain pathologies and to fully automate the process so as to avoid specialist intervention. The method proposed for the extraction of the optic disc contour is mainly based on mathematical morphology along with principal component analysis (PCA). It makes use of different operations such as the generalized distance function (GDF), a variant of the watershed transformation, the stochastic watershed, and geodesic transformations. The input of the segmentation method is obtained through PCA. The purpose of using PCA is to achieve the grey-scale image that better represents the original RGB image. The implemented algorithm has been validated on five public databases, obtaining promising results. The average values obtained (Jaccard and Dice coefficients of 0.8200 and 0.8932, respectively, an accuracy of 0.9947, and true positive and false positive fractions of 0.9275 and 0.0036) demonstrate that this method is a robust tool for the automatic segmentation of the optic disc. Moreover, it is fairly reliable since it works properly on databases with a large degree of variability and improves the results of other state-of-the-art methods.
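
The PCA step that produces the grey-scale input can be sketched directly: treat each pixel's RGB triple as an observation and keep the first principal component. The image below is a random placeholder standing in for a real fundus photograph, and the morphological segmentation stages are not shown.

```python
import numpy as np
from sklearn.decomposition import PCA

def pca_greyscale(rgb):
    """First-principal-component grey-scale image of an RGB image, the kind of
    input the segmentation pipeline starts from."""
    h, w, _ = rgb.shape
    pixels = rgb.reshape(-1, 3).astype(float)       # one row per pixel, 3 channels
    pc1 = PCA(n_components=1).fit_transform(pixels) # direction of maximum variance
    grey = pc1.reshape(h, w)
    grey -= grey.min()                              # rescale to [0, 1] for display
    return grey / (grey.max() + 1e-12)

# toy usage on a random "image"; a real fundus image would be loaded instead
grey = pca_greyscale(np.random.rand(120, 160, 3))
```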

Journal ArticleDOI
TL;DR: In this paper, a simple time series method for bearing fault feature extraction using singular spectrum analysis (SSA) of the vibration signal is proposed, which is easy to implement and whose fault feature is immune to noise.
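
A bare-bones SSA sketch is shown below: embed the vibration signal in a Hankel trajectory matrix, take its SVD, and reconstruct the leading components by diagonal averaging. The window length, component count, and synthetic signal are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def ssa_components(x, window, n_components):
    """Basic singular spectrum analysis: embed the signal in a trajectory matrix,
    take its SVD, and reconstruct each component by diagonal averaging."""
    N = len(x)
    K = N - window + 1
    traj = np.column_stack([x[i:i + window] for i in range(K)])   # window x K
    U, s, Vt = np.linalg.svd(traj, full_matrices=False)
    comps = []
    for k in range(n_components):
        Mk = s[k] * np.outer(U[:, k], Vt[k])                      # rank-1 piece
        rec = np.array([np.mean(Mk[::-1].diagonal(i - window + 1)) for i in range(N)])
        comps.append(rec)                                         # diagonal averaging
    return np.array(comps)   # leading components approximate the fault-related part

# toy usage: noisy vibration-like signal
t = np.linspace(0, 1, 1000)
signal = np.sin(2 * np.pi * 60 * t) + 0.5 * np.random.randn(1000)
parts = ssa_components(signal, window=100, n_components=3)
```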

Book
21 Mar 2013
TL;DR: In this book, the authors present the analysis of multivariate data with the forward search, covering the forward search for regression data and multivariate transformations to normality.
Abstract: Contents: Preface; Notation.
1 Examples of Multivariate Data: 1.1 Influence, Outliers and Distances; 1.2 A Sketch of the Forward Search; 1.3 Multivariate Normality and our Examples; 1.4 Swiss Heads; 1.5 National Track Records for Women; 1.6 Municipalities in Emilia-Romagna; 1.7 Swiss Bank Notes; 1.8 Plan of the Book.
2 Multivariate Data and the Forward Search: 2.1 The Univariate Normal Distribution (2.1.1 Estimation; 2.1.2 Distribution of Estimators); 2.2 Estimation and the Multivariate Normal Distribution (2.2.1 The Multivariate Normal Distribution; 2.2.2 The Wishart Distribution; 2.2.3 Estimation of O); 2.3 Hypothesis Testing (2.3.1 Hypotheses About the Mean; 2.3.2 Hypotheses About the Variance); 2.4 The Mahalanobis Distance; 2.5 Some Deletion Results (2.5.1 The Deletion Mahalanobis Distance; 2.5.2 The (Bartlett)-Sherman-Morrison-Woodbury Formula; 2.5.3 Deletion Relationships Among Distances); 2.6 Distribution of the Squared Mahalanobis Distance; 2.7 Determinants of Dispersion Matrices and the Squared Mahalanobis Distance; 2.8 Regression; 2.9 Added Variables in Regression; 2.10 The Mean Shift Outlier Model; 2.11 Seemingly Unrelated Regression; 2.12 The Forward Search; 2.13 Starting the Search (2.13.1 The Babyfood Data; 2.13.2 Robust Bivariate Boxplots from Peeling; 2.13.3 Bivariate Boxplots from Ellipses; 2.13.4 The Initial Subset); 2.14 Monitoring the Search; 2.15 The Forward Search for Regression Data (2.15.1 Univariate Regression; 2.15.2 Multivariate Regression); 2.16 Further Reading; 2.17 Exercises; 2.18 Solutions.
3 Data from One Multivariate Distribution: 3.1 Swiss Heads; 3.2 National Track Records for Women; 3.3 Municipalities in Emilia-Romagna; 3.4 Swiss Bank Notes; 3.5 What Have We Seen?; 3.6 Exercises; 3.7 Solutions.
4 Multivariate Transformations to Normality: 4.1 Background; 4.2 An Introductory Example: the Babyfood Data; 4.3 Power Transformations to Approximate Normality (4.3.1 Transformation of the Response in Regression; 4.3.2 Multivariate Transformations to Normality); 4.4 Score Tests for Transformations; 4.5 Graphics for Transformations; 4.6 Finding a Multivariate Transformation with the Forward Search; 4.7 Babyfood Data; 4.8 Swiss Heads; 4.9 Horse Mussels; 4.10 Municipalities in Emilia-Romagna (4.10.1 Demographic Variables; 4.10.2 Wealth Variables; 4.10.3 Work Variables; 4.10.4 A Combined Analysis); 4.11 National Track Records for Women; 4.12 Dyestuff Data; 4.13 Babyfood Data and Variable Selection; 4.14 Suggestions for Further Reading; 4.15 Exercises; 4.16 Solutions.
5 Principal Components Analysis: 5.1 Background; 5.2 Principal Components and Eigenvectors (5.2.1 Linear Transformations and Principal Components; 5.2.2 Lack of Scale Invariance and Standardized Variables; 5.2.3 The Number of Components); 5.3 Monitoring the Forward Search (5.3.1 Principal Components and Variances; 5.3.2 Principal Component Scores; 5.3.3 Correlations Between Variables and Principal Components; 5.3.4 Elements of the Eigenvectors); 5.4 The Biplot and the Singular Value Decomposition; 5.5 Swiss Heads; 5.6 Milk Data; 5.7 Quality of Life; 5.8 Swiss Bank Notes (5.8.1 Forgeries and Genuine Notes; 5.8.2 Forgeries Alone); 5.9 Municipalities in Emilia-Romagna; 5.10 Further Reading; 5.11 Exercises; 5.12 Solutions.
6 Discriminant Analysis: 6.1 Background; 6.2 An Outline of Discriminant Analysis (6.2.1 Bayesian Discrimination; 6.2.2 Quadratic Discriminant Analysis; 6.2.3 Linear Discriminant Analysis; 6.2.4 Estimation of Means and Variances; 6.2.5 Canonical Variates; 6.2.6 Assessment of Discriminant Rules); 6.3 The Forward Search (6.3.1 Step 1: Choice of the Initial Subset; 6.3.2 Step 2: Adding

Journal ArticleDOI
TL;DR: A new method for bearing degradation prediction based on principal component analysis (PCA) and an optimized LS-SVM is proposed, and an accelerated bearing run-to-failure experiment proved the effectiveness of the methodology.

Journal ArticleDOI
TL;DR: In this paper, the Joint Approximate Diagonalization of Eigenmatrices (JADE) algorithm is used to compute ICA parameters; three examples illustrate its performance and highlight the differences between ICA results and those of other methods.
Abstract: Independent Components Analysis (ICA) is a relatively recent method, with an increasing number of applications in chemometrics. Of the many algorithms available to compute ICA parameters, the Joint Approximate Diagonalization of Eigenmatrices (JADE) algorithm is presented here in detail. Three examples are used to illustrate its performance, and highlight the differences between ICA results and those of other methods, such as Principal Components Analysis. A comparison with Parallel Factor Analysis (PARAFAC) is also presented in the case of a three-way data set to show that ICA applied on an unfolded high-order array can give results comparable with those of PARAFAC.
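
The ICA-versus-PCA contrast the paper draws can be reproduced on synthetic mixtures. JADE itself is not shipped with scikit-learn, so the sketch below substitutes FastICA purely to illustrate the workflow; the sources and mixing matrix are made up.

```python
import numpy as np
from sklearn.decomposition import FastICA, PCA

# Two synthetic "pure" sources mixed into three observed channels
# (a stand-in for mixed spectra in a chemometric data set).
rng = np.random.default_rng(0)
t = np.linspace(0, 8, 2000)
S = np.c_[np.sin(3 * t), np.sign(np.sin(7 * t))]
X = S @ rng.random((2, 3))                      # observed mixtures, shape (2000, 3)

# The paper uses the JADE algorithm; FastICA is used here only as a readily
# available substitute to illustrate the independent-component workflow.
ica_sources = FastICA(n_components=2, random_state=0).fit_transform(X)
pca_scores = PCA(n_components=2).fit_transform(X)   # for comparison with PCA
```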

Journal ArticleDOI
TL;DR: An online oversampling principal component analysis (osPCA) algorithm to address the problem of detecting the presence of outliers from a large amount of data via an online updating technique, and the experimental results verify the feasibility of the proposed method in terms of both accuracy and efficiency.
Abstract: Anomaly detection has been an important research topic in data mining and machine learning. Many real-world applications such as intrusion or credit card fraud detection require an effective and efficient framework to identify deviated data instances. However, most anomaly detection methods are typically implemented in batch mode, and thus cannot be easily extended to large-scale problems without sacrificing computation and memory requirements. In this paper, we propose an online oversampling principal component analysis (osPCA) algorithm to address this problem, and we aim at detecting the presence of outliers from a large amount of data via an online updating technique. Unlike prior principal component analysis (PCA)-based approaches, we do not store the entire data matrix or covariance matrix, and thus our approach is especially of interest in online or large-scale problems. By oversampling the target instance and extracting the principal direction of the data, the proposed osPCA allows us to determine the anomaly of the target instance according to the variation of the resulting dominant eigenvector. Since our osPCA need not perform eigen analysis explicitly, the proposed framework is favored for online applications which have computation or memory limitations. Compared with the well-known power method for PCA and other popular anomaly detection algorithms, our experimental results verify the feasibility of our proposed method in terms of both accuracy and efficiency.
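
The oversampling idea behind osPCA can be conveyed with a short sketch: duplicate the target instance, recompute the dominant principal direction, and score the instance by how much that direction rotates. For brevity the sketch recomputes the direction from scratch with an SVD, whereas the paper's point is an online update that avoids exactly this recomputation; the duplication count and toy data are placeholders.

```python
import numpy as np

def leading_direction(X):
    """Dominant principal direction of the (centred) data."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[0]

def ospca_score(X, target, n_dup=10):
    """Oversampling-PCA-style anomaly score: outliers rotate the dominant
    eigenvector more when they are duplicated into the data."""
    u = leading_direction(X)
    X_over = np.vstack([X, np.tile(target, (n_dup, 1))])
    u_over = leading_direction(X_over)
    return 1.0 - abs(u @ u_over)        # larger score = more anomalous

# toy usage: an ordinary point versus an obvious outlier
X = np.random.randn(500, 5)
print(ospca_score(X, X[0]), ospca_score(X, 10 * np.ones(5)))
```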

Journal ArticleDOI
TL;DR: This paper proposes a simple but effective robust LDA version based on L1-norm maximization, which learns a set of local optimal projection vectors by maximizing the ratio of the L1-norm-based between-class dispersion and the L1-norm-based within-class dispersion.
Abstract: Linear discriminant analysis (LDA) is a well-known dimensionality reduction technique, which is widely used for many purposes. However, conventional LDA is sensitive to outliers because its objective function is based on the distance criterion using L2-norm. This paper proposes a simple but effective robust LDA version based on L1-norm maximization, which learns a set of local optimal projection vectors by maximizing the ratio of the L1-norm-based between-class dispersion and the L1-norm-based within-class dispersion. The proposed method is theoretically proved to be feasible and robust to outliers while overcoming the singular problem of the within-class scatter matrix for conventional LDA. Experiments on artificial datasets, standard classification datasets and three popular image databases demonstrate the efficacy of the proposed method.

Journal ArticleDOI
TL;DR: In this article, the problem of estimating the subspace spanned by the principal eigenvectors of the population covariance matrix was studied in high dimensions, where the number of variables can be much larger than the total number of observations.
Abstract: We study sparse principal components analysis in high dimensions, where $p$ (the number of variables) can be much larger than $n$ (the number of observations), and analyze the problem of estimating the subspace spanned by the principal eigenvectors of the population covariance matrix. We introduce two complementary notions of $\ell_{q}$ subspace sparsity: row sparsity and column sparsity. We prove nonasymptotic lower and upper bounds on the minimax subspace estimation error for $0\leq q\leq1$. The bounds are optimal for row sparse subspaces and nearly optimal for column sparse subspaces, they apply to general classes of covariance matrices, and they show that $\ell_{q}$ constrained estimates can achieve optimal minimax rates without restrictive spiked covariance conditions. Interestingly, the form of the rates matches known results for sparse regression when the effective noise variance is defined appropriately. Our proof employs a novel variational $\sin\Theta$ theorem that may be useful in other regularized spectral estimation problems.

Journal ArticleDOI
TL;DR: This paper applied the GARCH-MIDAS (mixed data sampling) model to examine whether information contained in macroeconomic variables can help to predict short-term and long-term components of the return variance.
Abstract: This paper applies the GARCH-MIDAS (mixed data sampling) model to examine whether information contained in macroeconomic variables can help to predict short-term and long-term components of the return variance. A principal component analysis is used to incorporate the information contained in different variables. Our results show that including low-frequency macroeconomic information in the GARCH-MIDAS model improves the prediction ability of the model, particularly for the long-term variance component. Moreover, the GARCH-MIDAS model augmented with the first principal component outperforms all other specifications, indicating that the constructed principal component can be considered as a good proxy of the business cycle.
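
The principal-component step is straightforward to sketch: standardize the macroeconomic indicators and keep the first component as the business-cycle proxy that enters the GARCH-MIDAS long-term component. The panel below is random placeholder data, and the GARCH-MIDAS estimation itself is not shown.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical monthly macroeconomic panel: 240 months x 8 indicators
# (e.g. industrial production, unemployment, inflation, ...).
macro = np.random.randn(240, 8)

# Standardise the variables and extract the first principal component,
# treated here as a business-cycle proxy to feed into a GARCH-MIDAS model.
standardised = (macro - macro.mean(axis=0)) / macro.std(axis=0, ddof=1)
business_cycle_proxy = PCA(n_components=1).fit_transform(standardised).ravel()
```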

Journal ArticleDOI
TL;DR: In this paper, the authors investigate the theory and empirical performance of differentially private approximations to PCA and propose a new method which explicitly optimizes the utility of the output.
Abstract: The principal components analysis (PCA) algorithm is a standard tool for identifying good low-dimensional approximations to high-dimensional data. Many data sets of interest contain private or sensitive information about individuals. Algorithms which operate on such data should be sensitive to the privacy risks in publishing their outputs. Differential privacy is a framework for developing tradeoffs between privacy and the utility of these outputs. In this paper we investigate the theory and empirical performance of differentially private approximations to PCA and propose a new method which explicitly optimizes the utility of the output. We show that the sample complexity of the proposed method differs from the existing procedure in the scaling with the data dimension, and that our method is nearly optimal in terms of this scaling. We furthermore illustrate our results, showing that on real data there is a large performance gap between the existing method and our method.

Journal ArticleDOI
TL;DR: A new approach using the fruit fly optimization algorithm (FOA) is adopted to optimize an artificial neural network model, and results show that the FOA-optimized GRNN model has the best detection capacity.
Abstract: When constructing classification and prediction models, most researchers have used the genetic algorithm, particle swarm optimization, or ant colony optimization to optimize the parameters of artificial neural network models in their previous studies. In this paper, a new approach using the fruit fly optimization algorithm (FOA) is adopted to optimize an artificial neural network model. First, we carried out principal component regression on the results of a questionnaire survey on logistics quality and service satisfaction of online auction sellers to construct our logistics quality and service satisfaction detection model. Relevant principal components in the principal component regression analysis results were selected as independent variables, and the overall satisfaction level toward auction sellers’ logistics service as indicated in the questionnaire survey was selected as the dependent variable for the sample data of this study. In the end, the FOA-optimized general regression neural network (FOAGRNN), the PSO-optimized general regression neural network (PSOGRNN), and other data mining techniques for the ordinary general regression neural network were used to construct a logistics quality and service satisfaction detection model. In the study, 4–6 principal components from the principal component regression analysis were selected as independent variables of the model. Analysis results of the study show that, of the four data mining techniques, the FOA-optimized GRNN model has the best detection capacity.

Journal ArticleDOI
TL;DR: The Hopfield-Potts model is introduced, inspired by the statistical physics of disordered systems, and it is shown how the computation of such statistical patterns makes it possible to accurately predict residue-residue contacts with a much smaller number of parameters than DCA.
Abstract: Various approaches have explored the covariation of residues in multiple-sequence alignments of homologous proteins to extract functional and structural information. Among those are principal component analysis (PCA), which identifies the most correlated groups of residues, and direct coupling analysis (DCA), a global inference method based on the maximum entropy principle, which aims at predicting residue-residue contacts. In this paper, inspired by the statistical physics of disordered systems, we introduce the Hopfield-Potts model to naturally interpolate between these two approaches. The Hopfield-Potts model allows us to identify relevant ‘patterns’ of residues from the knowledge of the eigenmodes and eigenvalues of the residue-residue correlation matrix. We show how the computation of such statistical patterns makes it possible to accurately predict residue-residue contacts with a much smaller number of parameters than DCA. This dimensional reduction allows us to avoid overfitting and to extract contact information from multiple-sequence alignments of reduced size. In addition, we show that low-eigenvalue correlation modes, discarded by PCA, are important to recover structural information: the corresponding patterns are highly localized, that is, they are concentrated in few sites, which we find to be in close contact in the three-dimensional protein fold.

Posted Content
Moritz Hardt1, Eric Price1
TL;DR: A new robust convergence analysis of the noisy power method, a variant of the well-known power method for computing the dominant singular vectors of a matrix, is provided; it shows that the error dependence of the algorithm on the matrix dimension can be replaced by an essentially tight dependence on the coherence of the matrix.
Abstract: We provide a new robust convergence analysis of the well-known power method for computing the dominant singular vectors of a matrix that we call the noisy power method. Our result characterizes the convergence behavior of the algorithm when a significant amount of noise is introduced after each matrix-vector multiplication. The noisy power method can be seen as a meta-algorithm that has recently found a number of important applications in a broad range of machine learning problems including alternating minimization for matrix completion, streaming principal component analysis (PCA), and privacy-preserving spectral analysis. Our general analysis subsumes several existing ad-hoc convergence bounds and resolves a number of open problems in multiple applications including streaming PCA and privacy-preserving singular vector computation.
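
The meta-algorithm itself is compact enough to sketch: repeatedly multiply an orthonormal block by the matrix, inject noise (standing in for sampling error, streaming updates, or privacy noise), and re-orthonormalize. The noise model and parameters below are illustrative, not the ones analyzed in the paper.

```python
import numpy as np

def noisy_power_method(A, k, n_iter=30, noise_scale=1e-3, seed=0):
    """Noisy power iteration: multiply by A, add a perturbation after each
    matrix-vector multiplication, then re-orthonormalise the block."""
    rng = np.random.default_rng(seed)
    X, _ = np.linalg.qr(rng.standard_normal((A.shape[0], k)))
    for _ in range(n_iter):
        Y = A @ X + noise_scale * rng.standard_normal(X.shape)  # noisy power step
        X, _ = np.linalg.qr(Y)                                   # orthonormal basis
    return X              # approximate dominant k-dimensional singular subspace

# toy usage: recover the top-2 subspace of a random symmetric PSD matrix
M = np.random.randn(200, 200); M = M @ M.T
subspace = noisy_power_method(M, k=2)
```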

Journal ArticleDOI
TL;DR: This article proposes a method for obtaining correct curve estimates by accounting for uncertainty in FPC decompositions, and applies this method to sparse observations of CD4 cell counts and to dense white-matter tract profiles.
Abstract: Functional principal components (FPC) analysis is widely used to decompose and express functional observations. Curve estimates implicitly condition on basis functions and other quantities derived from FPC decompositions; however these objects are unknown in practice. In this article, we propose a method for obtaining correct curve estimates by accounting for uncertainty in FPC decompositions. Additionally, pointwise and simultaneous confidence intervals that account for both model- and decomposition-based variability are constructed. Standard mixed model representations of functional expansions are used to construct curve estimates and variances conditional on a specific decomposition. Iterated expectation and variance formulas combine model-based conditional estimates across the distribution of decompositions. A bootstrap procedure is implemented to understand the uncertainty in principal component decomposition quantities. Our method compares favorably to competing approaches in simulation studies that include both densely and sparsely observed functions. We apply our method to sparse observations of CD4 cell counts and to dense white-matter tract profiles. Code for the analyses and simulations is publicly available, and our method is implemented in the R package refund on CRAN.

Journal ArticleDOI
TL;DR: This work proposes a high-dimensional robust principal component analysis algorithm that is efficient, robust to contaminated points, and easily kernelizable, and achieves maximal robustness.
Abstract: Principal component analysis plays a central role in statistics, engineering, and science. Because of the prevalence of corrupted data in real-world applications, much research has focused on developing robust algorithms. Perhaps surprisingly, these algorithms are unequipped (indeed, unable) to deal with outliers in the high-dimensional setting where the number of observations is of the same magnitude as the number of variables of each observation, and the dataset contains some (arbitrarily) corrupted observations. We propose a high-dimensional robust principal component analysis algorithm that is efficient, robust to contaminated points, and easily kernelizable. In particular, our algorithm achieves maximal robustness: it has a breakdown point of 50% (the best possible), while all existing algorithms have a breakdown point of zero. Moreover, our algorithm recovers the optimal solution exactly in the case where the number of corrupted points grows sublinearly in the dimension.

Journal ArticleDOI
TL;DR: It is found that pPCA produces a shape space that preserves the Procrustes distances between objects, that allows shape models to be constructed, and that produces scores that can be used as shape variables for most purposes.
Abstract: Phylogenetic Principal Components Analysis (pPCA) is a recently proposed method for ordinating multivariate data in a way that takes into account the phylogenetic non-independence among species means. We review this method in terms of geometric morphometric shape analysis and compare its properties to ordinary principal components analysis (PCA). We find that pPCA produces a shape space that preserves the Procrustes distances between objects, that allows shape models to be constructed, and that produces scores that can be used as shape variables for most purposes. Unlike ordinary PCA scores, however, the scores on pPC axes are correlated with one another and their variances do not correspond to the eigenvalues of the phylogenetically corrected axes. The pPC axes are oriented by the non-phylogenetic component of shape variation, but the positioning of the scores in the space retains phylogenetic covariance, making the visual information presented in plots a hybrid of non-phylogenetic and phylogenetic. Presuming that all pPCA scores are used as shape variables, there is no difference between them and PCA scores for the construction of distance-based trees (such as UPGMA), for morphological disparity, or for ordinary multivariate statistical analyses (so long as the algorithms are suitable for correlated variables). pPCA scores yield different trait-based trees (such as maximum likelihood trees for continuous traits) because the scores are correlated and because the pPC axes differ from PC axes. pPCA eigenvalues represent the residual shape variance once the phylogenetic covariance has been removed (though there are scaling issues), and as such they provide information on covariance that is independent of phylogeny. Tests for modularity on pPCA eigenvalues will therefore yield different results than ordinary PCA eigenvalues. pPCA can be considered another tool in the kit of geometric morphometrics, but one whose properties are more difficult to interpret than ordinary PCA.

Journal ArticleDOI
TL;DR: The proposed method decomposes the phase map into a set of values of uncorrelated variables called principal components, and then extracts the aberration terms from the first principal component obtained.
Abstract: We present an effective, fast, and straightforward phase aberration compensation method in digital holographic microscopy based on principal component analysis. The proposed method decomposes the phase map into a set of values of uncorrelated variables called principal components, and then extracts the aberration terms from the first principal component obtained. It is effective, fully automatic, and does not require any prior knowledge of the object and the setup. The great performance and limited computational complexity make our approach a very attractive and promising technique for compensating phase aberration in digital holography under time-critical environments.
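
A minimal sketch of the underlying idea, under the assumption that the measured (unwrapped) phase map is dominated by a separable aberration term, is to keep its first principal component (the rank-1 SVD term) as the aberration estimate and subtract it. The synthetic phase map below is a placeholder for a real digital-holography reconstruction, and the sketch omits the extraction of explicit aberration terms that a practical pipeline would add.

```python
import numpy as np

def compensate_phase(phase_map):
    """PCA-flavoured aberration compensation sketch: take the first principal
    component (rank-1 SVD term) of the phase map as the aberration estimate
    and subtract it from the measured phase."""
    U, s, Vt = np.linalg.svd(phase_map, full_matrices=False)
    aberration = s[0] * np.outer(U[:, 0], Vt[0])   # dominant separable term
    return phase_map - aberration, aberration

# toy usage: a smooth background aberration plus a small localized object phase
y, x = np.mgrid[0:256, 0:256] / 256.0
phase = 6 * x + 4 * y**2 + 0.3 * np.exp(-((x - 0.5)**2 + (y - 0.5)**2) / 0.01)
corrected, estimated_aberration = compensate_phase(phase)
```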

Journal ArticleDOI
TL;DR: It is demonstrated that simple extensions of TRCA can provide the most distinctive signals for two tasks and can integrate multiple modalities of information to remove task-unrelated artifacts.