
Showing papers on "Principal component analysis published in 2010"


Journal ArticleDOI
TL;DR: Principal component analysis (PCA) as discussed by the authors is a multivariate technique that analyzes a data table in which observations are described by several inter-correlated quantitative dependent variables; its goal is to extract the important information from the table, to represent it as a set of new orthogonal variables called principal components, and to display the pattern of similarity of the observations and of the variables as points in maps.
Abstract: Principal component analysis (PCA) is a multivariate technique that analyzes a data table in which observations are described by several inter-correlated quantitative dependent variables. Its goal is to extract the important information from the table, to represent it as a set of new orthogonal variables called principal components, and to display the pattern of similarity of the observations and of the variables as points in maps. The quality of the PCA model can be evaluated using cross-validation techniques such as the bootstrap and the jackknife. PCA can be generalized as correspondence analysis (CA) in order to handle qualitative variables and as multiple factor analysis (MFA) in order to handle heterogeneous sets of variables. Mathematically, PCA depends upon the eigen-decomposition of positive semi-definite matrices and upon the singular value decomposition (SVD) of rectangular matrices. Copyright © 2010 John Wiley & Sons, Inc.
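
As a concrete illustration of the SVD route to PCA described above, the following minimal numpy sketch centers a data table, takes its singular value decomposition, and returns component scores, variable loadings, and explained variance; the function name and toy data are illustrative, not from the article.

```python
import numpy as np

def pca_svd(X, n_components=2):
    """PCA of a data table X (rows = observations, columns = variables)
    via the singular value decomposition of the centered matrix."""
    Xc = X - X.mean(axis=0)                              # center each variable
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]      # observation coordinates
    loadings = Vt[:n_components].T                       # variable loadings
    explained = (s ** 2) / (s ** 2).sum()                # variance share per component
    return scores, loadings, explained[:n_components]

# toy usage
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
scores, loadings, ratio = pca_svd(X, n_components=2)
```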

6,398 citations


Journal ArticleDOI
TL;DR: In this article, a guideline for conducting factor analysis, a technique used to estimate the population-level factor structure underlying the given sample data, is provided, along with suggestions for how to carry out preliminary procedures, exploratory and confirmatory factor analyses (EFA and CFA) with SPSS and LISREL syntax examples.
Abstract: The current article provides a guideline for conducting factor analysis, a technique used to estimate the population-level factor structure underlying the given sample data. First, the distinction between exploratory and confirmatory factor analyses (EFA and CFA) is briefly discussed; along with this discussion, the notion of principal component analysis and why it does not provide a valid substitute for factor analysis is noted. Second, a step-by-step walk-through of conducting factor analysis is illustrated; through these walk-through instructions, various decisions that need to be made in factor analysis are discussed and recommendations are provided. Specifically, suggestions for how to carry out preliminary procedures, EFA, and CFA are provided with SPSS and LISREL syntax examples. Finally, some critical issues concerning the appropriate (and not-so-appropriate) use of factor analysis are discussed along with the discussion of recommended practices.
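
The article's point that principal components are not a substitute for common factors can be illustrated outside SPSS/LISREL; the sketch below contrasts PCA with a common-factor model using scikit-learn. The data and component counts are placeholders, and this is not the article's own syntax example.

```python
import numpy as np
from sklearn.decomposition import PCA, FactorAnalysis

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))            # placeholder for survey/test items

# Principal components: linear combinations that maximize explained variance
pc_scores = PCA(n_components=3).fit_transform(X)

# Common-factor model: observed variables = loadings @ factors + unique noise
fa = FactorAnalysis(n_components=3).fit(X)
factor_scores = fa.transform(X)
print(fa.components_.shape)               # factor loadings (n_factors, n_items)
print(fa.noise_variance_.shape)           # unique (item-specific) variances
```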

1,079 citations


Journal ArticleDOI
TL;DR: In this work, a versatile signal processing and analysis framework for Electroencephalogram (EEG) was proposed and a set of statistical features was extracted from the sub-bands to represent the distribution of wavelet coefficients.
Abstract: In this work, we propose a versatile signal processing and analysis framework for the electroencephalogram (EEG). Within this framework, the signals were decomposed into frequency sub-bands using the discrete wavelet transform (DWT), and a set of statistical features was extracted from the sub-bands to represent the distribution of wavelet coefficients. Principal component analysis (PCA), independent component analysis (ICA) and linear discriminant analysis (LDA) were used to reduce the dimensionality of the data. These features were then used as input to a support vector machine (SVM) with two discrete outputs: epileptic seizure or not. The classification performance of the different methods is presented and compared to demonstrate the effectiveness of the classification process. These findings are presented as an example of a method for training and testing a seizure prediction method on data from individual petit mal epileptic patients. Given the heterogeneity of epilepsy, it is likely that methods of this type will be required to configure intelligent devices for treating epilepsy to each individual's neurophysiology prior to clinical operation.
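
A rough Python sketch of the kind of pipeline the abstract describes: DWT sub-band features, PCA for dimensionality reduction, and an SVM classifier. The wavelet, decomposition level, feature set, and toy data are illustrative assumptions rather than the authors' exact choices.

```python
import numpy as np
import pywt
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def subband_features(signal, wavelet="db4", level=5):
    """Statistical summary of DWT coefficients in each sub-band
    (wavelet, level and feature set are illustrative choices)."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    feats = []
    for c in coeffs:
        feats += [np.mean(np.abs(c)), np.std(c), np.mean(c ** 2)]
    return np.array(feats)

# X_raw: (n_segments, n_samples) EEG segments; y: 1 = seizure, 0 = non-seizure (toy data)
rng = np.random.default_rng(0)
X_raw = rng.normal(size=(60, 512))
y = rng.integers(0, 2, size=60)

X = np.vstack([subband_features(s) for s in X_raw])
clf = make_pipeline(StandardScaler(), PCA(n_components=5), SVC(kernel="rbf"))
clf.fit(X, y)
```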

1,010 citations


Journal ArticleDOI
TL;DR: Experimental results on benchmark test images demonstrate that the LPG-PCA method achieves very competitive denoising performance, especially in preserving image fine structure, compared with state-of-the-art denoising algorithms.

654 citations


Book ChapterDOI
01 Jan 2010
TL;DR: In this article, the authors note that for nonlinear regression the assumption of Gaussian measurement error combined with the maximum likelihood principle can be invoked to justify the least-squares criterion; treating classification as a regression problem towards estimating class posterior probabilities, least squares has likewise been employed to train neural network and other classifier topologies to approximate correct labels.
Abstract: INTRODUCTION Learning systems depend on three interrelated components: topologies, cost/performance functions, and learning algorithms. Topologies provide the constraints for the mapping, and the learning algorithms offer the means to find an optimal solution; but the solution is optimal with respect to what? Optimality is characterized by the criterion, and in the neural network literature this is the least addressed component, yet it has a decisive influence on generalization performance. Certainly, the assumptions behind the selection of a criterion should be better understood and investigated. Traditionally, least squares has been the benchmark criterion for regression problems; considering classification as a regression problem towards estimating class posterior probabilities, least squares has been employed to train neural network and other classifier topologies to approximate correct labels. The main motivation to utilize least squares in regression simply comes from the intellectual comfort this criterion provides due to its success in traditional linear least-squares regression applications, which can be reduced to solving a system of linear equations. For nonlinear regression, the assumption of Gaussianity for the measurement error combined with the maximum likelihood principle can be invoked to promote this criterion. In nonparametric regression, the least-squares principle leads to the conditional expectation solution, which is intuitively appealing. Although these are good reasons to use the mean squared error as the cost, it is inherently linked to the assumptions and habits stated above. Consequently, there is information in the error signal that is not captured during the training of nonlinear adaptive systems under non-Gaussian distribution conditions when one insists on second-order statistical criteria. This argument extends to other linear, second-order techniques such as principal component analysis (PCA), linear discriminant analysis (LDA), and canonical correlation analysis (CCA). Recent work tries to generalize these techniques to nonlinear scenarios by utilizing kernel techniques or other heuristics. This begs the question: what other alternative cost functions could be used to train adaptive systems, and how could we establish rigorous techniques for extending useful concepts from linear and second-order statistical techniques to nonlinear and higher-order statistical learning methodologies?
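
The link asserted above between Gaussian measurement error, the maximum likelihood principle, and the least-squares criterion can be made explicit with a standard derivation:

```latex
\begin{aligned}
y_i &= f(x_i;\theta) + \varepsilon_i, \qquad \varepsilon_i \sim \mathcal{N}(0,\sigma^2)\ \text{i.i.d.}\\
\hat{\theta}_{\mathrm{ML}}
 &= \arg\max_{\theta} \sum_{i=1}^{N} \log p(y_i \mid x_i,\theta)
  = \arg\max_{\theta} \sum_{i=1}^{N}
    \left(-\tfrac{1}{2\sigma^2}\bigl(y_i - f(x_i;\theta)\bigr)^2 - \tfrac{1}{2}\log 2\pi\sigma^2\right)\\
 &= \arg\min_{\theta} \sum_{i=1}^{N} \bigl(y_i - f(x_i;\theta)\bigr)^2 .
\end{aligned}
```

When the errors are non-Gaussian, the log-likelihood no longer reduces to a sum of squared errors, which is the gap the information-theoretic criteria discussed in the chapter aim to address.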

615 citations


Proceedings Article
06 Dec 2010
TL;DR: In this paper, an efficient convex optimization-based algorithm called Outlier Pursuit is presented, which under mild assumptions on the uncorrupted points (satisfied, e.g., by the standard generative assumption in PCA problems) recovers the exact optimal low-dimensional subspace, and identifies the corrupted points.
Abstract: Singular Value Decomposition (and Principal Component Analysis) is one of the most widely used techniques for dimensionality reduction: successful and efficiently computable, it is nevertheless plagued by a well-known, well-documented sensitivity to outliers. Recent work has considered the setting where each point has a few arbitrarily corrupted components. Yet, in applications of SVD or PCA such as robust collaborative filtering or bioinformatics, malicious agents, defective genes, or simply corrupted or contaminated experiments may effectively yield entire points that are completely corrupted. We present an efficient convex optimization-based algorithm, which we call Outlier Pursuit, that under some mild assumptions on the uncorrupted points (satisfied, e.g., by the standard generative assumption in PCA problems) recovers the exact optimal low-dimensional subspace and identifies the corrupted points. Such identification of corrupted points that do not conform to the low-dimensional approximation is of paramount interest in bioinformatics, financial applications, and beyond. Our techniques involve matrix decomposition using nuclear norm minimization; however, our results, setup, and approach necessarily differ considerably from the existing line of work in matrix completion and matrix decomposition, since we develop an approach to recover the correct column space of the uncorrupted matrix, rather than the exact matrix itself.
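
A hedged sketch of the convex decomposition underlying Outlier Pursuit, written with cvxpy: the clean part is penalized by the nuclear norm and the corruption by the sum of column-wise l2 norms, which encourages entire corrupted columns (points). The regularization weight and toy data are illustrative, not the paper's.

```python
import cvxpy as cp
import numpy as np

def outlier_pursuit(M, lam=0.3):
    """Decompose M into a low-rank part L and a column-sparse part C.
    lam is a tuning parameter chosen for illustration only."""
    L = cp.Variable(M.shape)
    C = cp.Variable(M.shape)
    objective = cp.Minimize(cp.normNuc(L) + lam * cp.sum(cp.norm(C, 2, axis=0)))
    cp.Problem(objective, [L + C == M]).solve()
    return L.value, C.value

# toy data: rank-2 matrix with the first three columns fully corrupted
rng = np.random.default_rng(0)
M = rng.normal(size=(30, 2)) @ rng.normal(size=(2, 40))
M[:, :3] = rng.normal(scale=5.0, size=(30, 3))
L_hat, C_hat = outlier_pursuit(M)
```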

590 citations


Journal Article
TL;DR: In this paper, a new approach to sparse principal component analysis (sparse PCA) is proposed, which is based on the maximization of a convex function on a compact set.
Abstract: In this paper we develop a new approach to sparse principal component analysis (sparse PCA). We propose two single-unit and two block optimization formulations of the sparse PCA problem, aimed at extracting a single sparse dominant principal component of a data matrix, or more components at once, respectively. While the initial formulations involve nonconvex functions, and are therefore computationally intractable, we rewrite them into the form of an optimization program involving maximization of a convex function on a compact set. The dimension of the search space is decreased enormously if the data matrix has many more columns (variables) than rows. We then propose and analyze a simple gradient method suited for the task. It appears that our algorithm has the best convergence properties when either the objective function or the feasible set is strongly convex, which is the case with our single-unit formulations and can be enforced in the block case. Finally, we demonstrate numerically on a set of random and gene expression test problems that our approach outperforms existing algorithms both in quality of the obtained solution and in computational speed.
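
To convey the single-unit idea, here is a simplified soft-thresholded power iteration for one sparse loading vector; it is an illustration in the spirit of the formulation above, not the paper's GPower algorithm or its convergence analysis.

```python
import numpy as np

def sparse_pc1(X, gamma=0.1, n_iter=200):
    """Soft-thresholded power iteration for a single sparse loading vector.
    gamma controls sparsity; larger values zero out more loadings."""
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / len(Xc)                                # sample covariance
    w = np.random.default_rng(0).normal(size=S.shape[0])
    w /= np.linalg.norm(w)
    for _ in range(n_iter):
        v = S @ w
        v = np.sign(v) * np.maximum(np.abs(v) - gamma, 0.0)  # soft-threshold
        if np.linalg.norm(v) == 0:
            break
        w = v / np.linalg.norm(v)
    return w
```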

534 citations


Journal ArticleDOI
TL;DR: The experimental results have shown that the principal components selected by the separating hyperplanes allow robust reconstruction and interpretation of the data, as well as higher recognition rates using fewer linear features, in situations where the differences between the sample groups are subtle and consequently most difficult for the standard and state-of-the-art PCA selection methods.

515 citations


Posted Content
TL;DR: This result shows that the proposed convex program recovers the low-rank matrix even when a positive fraction of its entries are arbitrarily corrupted, with an error bound proportional to the noise level; it is the first result showing that classical Principal Component Analysis, optimal for small i.i.d. noise, can be made robust to gross sparse errors.
Abstract: In this paper, we study the problem of recovering a low-rank matrix (the principal components) from a high-dimensional data matrix despite both small entry-wise noise and gross sparse errors. Recently, it has been shown that a convex program, named Principal Component Pursuit (PCP), can recover the low-rank matrix when the data matrix is corrupted by gross sparse errors. We further prove that the solution to a related convex program (a relaxed PCP) gives an estimate of the low-rank matrix that is simultaneously stable to small entrywise noise and robust to gross sparse errors. More precisely, our result shows that the proposed convex program recovers the low-rank matrix even though a positive fraction of its entries are arbitrarily corrupted, with an error bound proportional to the noise level. We present simulation results to support our result and demonstrate that the new convex program accurately recovers the principal components (the low-rank matrix) under quite broad conditions. To our knowledge, this is the first result that shows the classical Principal Component Analysis (PCA), optimal for small i.i.d. noise, can be made robust to gross sparse errors; or the first that shows the newly proposed PCP can be made stable to small entry-wise perturbations.
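
A minimal cvxpy sketch of the relaxed (stable) Principal Component Pursuit program described above: nuclear norm plus an l1 penalty, with the residual constrained to a noise budget. The default weight uses the 1/sqrt(max dimension) scaling common in the PCP literature; both it and the noise budget are assumptions to be tuned.

```python
import cvxpy as cp
import numpy as np

def stable_pcp(M, lam=None, delta=1.0):
    """Recover a low-rank L and sparse S from M with an entry-wise noise budget delta."""
    m, n = M.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))        # common PCP-style scaling (assumption)
    L = cp.Variable((m, n))
    S = cp.Variable((m, n))
    objective = cp.Minimize(cp.normNuc(L) + lam * cp.sum(cp.abs(S)))
    constraints = [cp.norm(M - L - S, "fro") <= delta]
    cp.Problem(objective, constraints).solve()
    return L.value, S.value
```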

470 citations


Journal ArticleDOI
TL;DR: An open source Matlab program, the ERP PCA (EP) Toolkit, is presented, intended to supplement existing ERP analysis programs by providing functions for conducting artifact correction, robust averaging, referencing and baseline correction, data editing and visualization, principal components analysis, and robust inferential statistical analysis.

463 citations


Proceedings ArticleDOI
13 Jun 2010
TL;DR: In this article, the solution of a relaxed version of the convex program Principal Component Pursuit (PCP) is shown to recover the low-rank matrix (the principal components) from a high-dimensional data matrix despite both small entry-wise noise and gross sparse errors.
Abstract: In this paper, we study the problem of recovering a low-rank matrix (the principal components) from a high-dimensional data matrix despite both small entry-wise noise and gross sparse errors. Recently, it has been shown that a convex program, named Principal Component Pursuit (PCP), can recover the low-rank matrix when the data matrix is corrupted by gross sparse errors. We further prove that the solution to a related convex program (a relaxed PCP) gives an estimate of the low-rank matrix that is simultaneously stable to small entry-wise noise and robust to gross sparse errors. More precisely, our result shows that the proposed convex program recovers the low-rank matrix even though a positive fraction of its entries are arbitrarily corrupted, with an error bound proportional to the noise level. We present simulation results to support our result and demonstrate that the new convex program accurately recovers the principal components (the low-rank matrix) under quite broad conditions. To our knowledge, this is the first result that shows the classical Principal Component Analysis (PCA), optimal for small i.i.d. noise, can be made robust to gross sparse errors; or the first that shows the newly proposed PCP can be made stable to small entry-wise perturbations.

BookDOI
15 Nov 2010
TL;DR: A textbook covering principal component analysis (PCA), correspondence analysis (CA), multiple correspondence analysis (MCA), and clustering, with worked examples and implementation in the FactoMineR package for R.
Abstract (table of contents):
Principal Component Analysis (PCA): Data - Notation - Examples; Objectives; Studying Individuals; Studying Variables; Relationships between the Two Representations NI and NK; Interpreting the Data; Implementation with FactoMineR; Additional Results; Example: The Decathlon Dataset; Example: The Temperature Dataset; Example of Genomic Data: The Chicken Dataset.
Correspondence Analysis (CA): Data - Notation - Examples; Objectives and the Independence Model; Fitting the Clouds; Interpreting the Data; Supplementary Elements (= Illustrative); Implementation with FactoMineR; CA and Textual Data Processing; Example: The Olympic Games Dataset; Example: The White Wines Dataset; Example: The Causes of Mortality Dataset.
Multiple Correspondence Analysis (MCA): Data - Notation - Examples; Objectives; Defining Distances between Individuals and Distances between Categories; CA on the Indicator Matrix; Interpreting the Data; Implementation with FactoMineR; Addendum; Example: The Survey on the Perception of Genetically Modified Organisms; Example: The Sorting Task Dataset.
Clustering: Data - Issues; Formalising the Notion of Similarity; Constructing an Indexed Hierarchy; Ward's Method; Direct Search for Partitions: K-means Algorithm; Partitioning and Hierarchical Clustering; Clustering and Principal Component Methods; Example: The Temperature Dataset; Example: The Tea Dataset; Dividing Quantitative Variables into Classes.
Appendix: Percentage of Inertia Explained by the First Component or by the First Plane; R Software; Bibliography of Software Packages; Bibliography; Index.

Journal Article
TL;DR: A probabilistic formulation of PCA provides a good foundation for handling missing values, and formulas for doing that are provided, and a novel fast algorithm is introduced and extended to variational Bayesian learning.
Abstract: Principal component analysis (PCA) is a classical data analysis technique that finds linear transformations of data that retain the maximal amount of variance. We study a case where some of the data values are missing, and show that this problem has many features which are usually associated with nonlinear models, such as overfitting and bad locally optimal solutions. A probabilistic formulation of PCA provides a good foundation for handling missing values, and we provide formulas for doing that. In the case of high-dimensional and very sparse data, overfitting becomes a severe problem and traditional algorithms for PCA are very slow. We introduce a novel fast algorithm and extend it to variational Bayesian learning. Different versions of PCA are compared in artificial experiments, demonstrating the effects of regularization and modeling of posterior variance. The scalability of the proposed algorithm is demonstrated by applying it to the Netflix problem.
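
A simple way to see why missing values complicate PCA is the EM-style iteration below, which alternates between refitting a low-rank model and re-imputing the missing cells; it illustrates the problem setting only and is not the paper's probabilistic or variational Bayesian algorithm.

```python
import numpy as np

def pca_missing(X, n_components=2, n_iter=50):
    """Iterative imputation PCA: fill missing cells (NaN) from the current
    low-rank fit, refit, and repeat. Returns the completed matrix,
    component scores, and loadings."""
    X = X.copy()
    missing = np.isnan(X)
    col_means = np.nanmean(X, axis=0)
    X[missing] = np.take(col_means, np.where(missing)[1])   # initial mean imputation
    for _ in range(n_iter):
        mu = X.mean(axis=0)
        U, s, Vt = np.linalg.svd(X - mu, full_matrices=False)
        low_rank = (U[:, :n_components] * s[:n_components]) @ Vt[:n_components] + mu
        X[missing] = low_rank[missing]                       # refill from the model
    scores = U[:, :n_components] * s[:n_components]
    loadings = Vt[:n_components].T
    return X, scores, loadings
```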

Journal ArticleDOI
TL;DR: The genetic underpinnings of an emerging phenotypic model where wheat domestication has transformed a long thin primitive grain to a wider and shorter modern grain are provided.
Abstract: Grain morphology in wheat (Triticum aestivum) has been selected and manipulated even in very early agrarian societies and remains a major breeding target. We undertook a large-scale quantitative analysis to determine the genetic basis of the phenotypic diversity in wheat grain morphology. A high-throughput method was used to capture grain size and shape variation in multiple mapping populations, elite varieties, and a broad collection of ancestral wheat species. This analysis reveals that grain size and shape are largely independent traits in both primitive wheat and in modern varieties. This phenotypic structure was retained across the mapping populations studied, suggesting that these traits are under the control of a limited number of discrete genetic components. We identified the underlying genes as quantitative trait loci that are distinct for grain size and shape and are largely shared between the different mapping populations. Moreover, our results show a significant reduction of phenotypic variation in grain shape in the modern germplasm pool compared with the ancestral wheat species, probably as a result of a relatively recent bottleneck. Therefore, this study provides the genetic underpinnings of an emerging phenotypic model where wheat domestication has transformed a long thin primitive grain to a wider and shorter modern grain.

Journal ArticleDOI
15 Aug 2010-Geoderma
TL;DR: In this paper, the performance of three calibration methods, namely principal component regression (PCR), partial least squares regression (PLSR), and back-propagation neural network (BPNN) analyses, was compared for the accuracy of measurement of selected soil properties: organic carbon (OC) and extractable forms of potassium (K), sodium (Na), magnesium (Mg), and phosphorus (P).
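
For reference, principal component regression and partial least squares regression can both be sketched in a few lines of scikit-learn; the spectra, response, and component counts below are synthetic placeholders, and the BPNN model from the study is omitted.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# X: spectra (n_samples, n_wavelengths); y: a soil property such as organic carbon
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 300))
y = X[:, :10].sum(axis=1) + rng.normal(scale=0.1, size=120)

pcr = make_pipeline(PCA(n_components=10), LinearRegression())   # principal component regression
pls = PLSRegression(n_components=10)                            # partial least squares regression

print("PCR  R2:", cross_val_score(pcr, X, y, cv=5).mean())
print("PLSR R2:", cross_val_score(pls, X, y, cv=5).mean())
```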

Journal ArticleDOI
01 May 2010-Ethology
TL;DR: In this paper, non-technical guidelines for reporting the results of PCA and factor analysis in animal behaviour research are proposed, covering whether the correlation or covariance matrix was used, sample size, how the number of factors was assessed, communalities when sample size is small, and details of factor rotation.
Abstract: Principal component (PCA) and factor analysis (FA) are widely used in animal behaviour research. However, many authors automatically follow questionable practices implemented by default in general-purpose statistical software. Worse still, the results of such analyses in research reports typically omit many crucial details which may hamper their evaluation. This article provides simple non-technical guidelines for PCA and FA. A standard for reporting the results of these analyses is suggested. Studies using PCA and FA must report: (1) whether the correlation or covariance matrix was used; (2) sample size, preferably as a footnote to the table of factor loadings; (3) indices of sampling adequacy; (4) how the number of factors was assessed; (5) communalities when sample size is small; (6) details of factor rotation; (7) if factor scores are computed, present determinacy indices; (8) preferably they should publish the original correlation matrix.

Journal ArticleDOI
TL;DR: The optimal rainfall forecasting model can be derived from MANN coupled with SSA, and the results show that the advantages of MANN over the other models are quite noticeable, particularly for daily rainfall forecasting.

Journal ArticleDOI
TL;DR: The effect of output Y on the X-space decomposition in PLS is analyzed and geometric properties of the PLS structure are revealed.

Journal ArticleDOI
01 Dec 2010
TL;DR: Three well-known feature selection methods, which are Principal Component Analysis (PCA), Genetic Algorithms (GA) and decision trees (CART), are used and the back-propagation neural network is developed for the prediction model.
Abstract: To effectively predict stock price for investors is a very important research problem. In the literature, data mining techniques have been applied to stock (market) prediction. Feature selection, a pre-processing step of data mining, aims at filtering out unrepresentative variables from a given dataset for effective prediction. As using different feature selection methods will lead to different features selected and thus affect the prediction performance, the purpose of this paper is to combine multiple feature selection methods to identify more representative variables for better prediction. In particular, three well-known feature selection methods, which are Principal Component Analysis (PCA), Genetic Algorithms (GA) and decision trees (CART), are used. The combination methods to filter out unrepresentative variables are based on union, intersection, and multi-intersection strategies. For the prediction model, the back-propagation neural network is developed. Experimental results show that the intersection between PCA and GA and the multi-intersection of PCA, GA, and CART perform the best, with accuracies of 79% and 78.98%, respectively. In addition, these two combined feature selection methods filter out nearly 80% of the unrepresentative features from the 85 original variables, resulting in 14 and 17 important features, respectively. These variables are the important factors for stock prediction and can be used for future investment decisions.
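
A reduced sketch of the intersection strategy described above, combining a PCA-loading filter with a CART-style importance filter; the genetic algorithm step is omitted, and the ranking rules are plausible stand-ins rather than the paper's exact procedures.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.tree import DecisionTreeRegressor

def select_by_pca(X, k=20, n_components=5):
    """Rank features by their summed squared loadings on the leading components."""
    pca = PCA(n_components=n_components).fit(X)
    score = (pca.components_ ** 2).sum(axis=0)
    return set(np.argsort(score)[-k:])

def select_by_cart(X, y, k=20):
    """Rank features by CART impurity-based importance."""
    tree = DecisionTreeRegressor(random_state=0).fit(X, y)
    return set(np.argsort(tree.feature_importances_)[-k:])

# intersection strategy: keep only variables both methods agree on
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 85))                 # 85 candidate variables, as in the study
y = X[:, [3, 7, 20]].sum(axis=1) + rng.normal(scale=0.1, size=200)
selected = select_by_pca(X) & select_by_cart(X, y)
```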

Journal ArticleDOI
TL;DR: This work proposes new tools for visualizing large amounts of functional data in the form of smooth curves, including functional versions of the bagplot and boxplot, which make use of the first two robust principal component scores, Tukey's data depth and highest density regions.
Abstract: We propose new tools for visualizing large amounts of functional data in the form of smooth curves. The proposed tools include functional versions of the bagplot and boxplot, which make use of the first two robust principal component scores, Tukey’s data depth and highest density regions. By-products of our graphical displays are outlier detection methods for functional data. We compare these new outlier detection methods with existing methods for detecting outliers in functional data, and show that our methods are better able to identify outliers. An R-package containing computer code and datasets is available in the online supplements.
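
A rough stand-in for the score-based idea behind these displays: project discretized curves onto their first two principal component scores and flag curves that sit far out in that plane by a robust distance. The actual paper uses robust PC scores, Tukey's data depth, and highest density regions, and provides an R package; the sketch below is only a simplified analogue.

```python
import numpy as np
from sklearn.covariance import MinCovDet
from sklearn.decomposition import PCA

def flag_functional_outliers(curves, threshold=3.0):
    """curves: (n_curves, n_gridpoints) matrix of curves sampled on a common grid.
    Returns a boolean mask of curves with a large robust distance in the
    plane of the first two PC scores (threshold is an illustrative cutoff)."""
    scores = PCA(n_components=2).fit_transform(curves)
    mcd = MinCovDet(random_state=0).fit(scores)
    dist = np.sqrt(mcd.mahalanobis(scores))
    return dist > threshold
```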

Proceedings ArticleDOI
12 Nov 2010
TL;DR: The analysis clearly shows that PCA has the potential to perform feature selection and is able to select a number of important individual features from among all the feature components, and the devised algorithm is not only consistent with the nature of PCA but also computationally efficient.
Abstract: Principal component analysis (PCA) has been widely applied in the area of computer science. It is well known that PCA is a popular transform method and that the transform result is not directly related to a single feature component of the original sample. However, in this paper, we try to apply principal component analysis (PCA) to feature selection. The proposed method addresses the feature selection issue from a viewpoint of numerical analysis. The analysis clearly shows that PCA has the potential to perform feature selection and is able to select a number of important individual features from among all the feature components. Our method assumes that different feature components of the original samples have different effects on the feature extraction result and exploits the eigenvectors of the covariance matrix of PCA to evaluate the significance of each feature component of the original sample. When evaluating the significance of the feature components, the proposed method takes a number of eigenvectors into account. It then uses a reasonable scheme to perform feature selection. The devised algorithm is not only consistent with the nature of PCA but also computationally efficient. The experimental results on face recognition show that while the proposed method greatly reduces the dimensionality of the original samples, it does not decrease the recognition accuracy.
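
One plausible reading of the eigenvector-based scoring described above, sketched in numpy: each feature is scored by its eigenvalue-weighted squared weights across the leading eigenvectors of the covariance matrix, and the top-scoring features are kept. The weighting scheme and cutoff are assumptions, not the paper's exact formula.

```python
import numpy as np

def pca_feature_scores(X, n_eigvecs=5):
    """Score each original feature by its eigenvalue-weighted squared weight
    in the leading eigenvectors of the covariance matrix."""
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)            # ascending order
    eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]
    w = eigvals[:n_eigvecs] / eigvals[:n_eigvecs].sum()
    return (eigvecs[:, :n_eigvecs] ** 2) @ w          # one score per feature

rng = np.random.default_rng(0)
X = rng.normal(size=(150, 40))
scores = pca_feature_scores(X)
top_features = np.argsort(scores)[::-1][:10]          # keep the 10 highest-scoring features
```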

Journal ArticleDOI
01 Aug 2010
TL;DR: The traditional L2-norm-based least-squares criterion is sensitive to outliers, while the newly proposed L1-norm 2DPCA is robust.
Abstract: In this paper, we first present a simple but effective L1-norm-based two-dimensional principal component analysis (2DPCA). The traditional L2-norm-based least-squares criterion is sensitive to outliers, while the newly proposed L1-norm 2DPCA is robust. Experimental results demonstrate its advantages.
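
The robustness argument can be seen in the classic L1-norm power iteration for vector data, sketched below; the paper's contribution is the two-dimensional (image-matrix) counterpart, so this is an analogue rather than the proposed 2DPCA-L1 algorithm itself.

```python
import numpy as np

def l1_pca_first_component(X, n_iter=100):
    """Sign-flipping iteration that maximizes sum_i |x_i^T w| over unit w,
    an L1 analogue of the leading principal component."""
    Xc = X - X.mean(axis=0)
    w = Xc[0] / np.linalg.norm(Xc[0])
    for _ in range(n_iter):
        signs = np.sign(Xc @ w)
        signs[signs == 0] = 1.0
        w_new = Xc.T @ signs
        w_new /= np.linalg.norm(w_new)
        if np.allclose(w_new, w):
            break
        w = w_new
    return w
```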

Book
20 Jul 2010
TL;DR: A tutorial overview of several geometric methods for dimension reduction is given, dividing the methods into projective methods and methods that model the manifold on which the data lies.
Abstract: We give a tutorial overview of several geometric methods for dimension reduction. We divide the methods into projective methods and methods that model the manifold on which the data lies. For projective methods, we review projection pursuit, principal component analysis (PCA), kernel PCA, probabilistic PCA, canonical correlation analysis, oriented PCA, and several techniques for sufficient dimension reduction. For the manifold methods, we review multidimensional scaling (MDS), landmark MDS, Isomap, locally linear embedding, Laplacian eigenmaps, and spectral clustering.

Journal ArticleDOI
TL;DR: Sparse singular value decomposition (SSVD) is proposed as a new exploratory analysis tool for biclustering or identifying interpretable row–column associations within high‐dimensional data matrices.
Abstract: Sparse singular value decomposition (SSVD) is proposed as a new exploratory analysis tool for biclustering or identifying interpretable row-column associations within high-dimensional data matrices. SSVD seeks a low-rank, checkerboard structured matrix approximation to data matrices. The desired checkerboard structure is achieved by forcing both the left- and right-singular vectors to be sparse, that is, having many zero entries. By interpreting singular vectors as regression coefficient vectors for certain linear regressions, sparsity-inducing regularization penalties are imposed to the least squares regression to produce sparse singular vectors. An efficient iterative algorithm is proposed for computing the sparse singular vectors, along with some discussion of penalty parameter selection. A lung cancer microarray dataset and a food nutrition dataset are used to illustrate SSVD as a biclustering method. SSVD is also compared with some existing biclustering methods using simulated datasets.
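
A bare-bones version of the alternating idea behind SSVD: soft-threshold the left and right vectors of a rank-1 fit in turn. The fixed penalties below simplify the paper's adaptive, data-driven penalty selection.

```python
import numpy as np

def soft(z, lam):
    """Soft-thresholding operator."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def sparse_svd_rank1(X, lam_u=0.1, lam_v=0.1, n_iter=100):
    """Sparse rank-1 approximation X ~ d * u v^T via alternating
    soft-thresholding, warm-started from the plain SVD."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    u, v = U[:, 0], Vt[0]
    for _ in range(n_iter):
        u = soft(X @ v, lam_u)
        norm_u = np.linalg.norm(u)
        if norm_u == 0:
            break
        u /= norm_u
        v = soft(X.T @ u, lam_v)
        norm_v = np.linalg.norm(v)
        if norm_v == 0:
            break
        v /= norm_v
    d = u @ X @ v                     # scale of the sparse rank-1 layer
    return d, u, v
```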

Journal ArticleDOI
TL;DR: A new monitoring technique using the Canonical Variate Analysis with UCLs derived from the estimated probability density function through kernel density estimations (KDEs) is proposed and applied to the simulated nonlinear Tennessee Eastman Process Plant.
Abstract: Principal Component Analysis (PCA) and Partial Least Squares (PLS) are two commonly used techniques for process monitoring. Both PCA and PLS assume that the data to be analysed are not self-correlated, i.e. time-independent. However, most industrial processes are dynamic, so the assumption of time-independence made by PCA and PLS is invalid in nature. Dynamic extensions to PCA and PLS, so-called DPCA and DPLS, have been developed to address this problem, but do so unsatisfactorily. The Canonical Variate Analysis (CVA), by contrast, is a state-space-based monitoring tool and hence more suitable for dynamic monitoring than DPCA and DPLS. CVA is a linear tool, and traditionally, for simplicity, the upper control limit (UCL) of the monitoring metrics associated with CVA is derived based on a Gaussian assumption. However, most industrial processes are nonlinear and the Gaussian assumption is invalid for such processes, so that CVA with a UCL based on this assumption may not correctly identify underlying faults. In this work, a new monitoring technique using CVA with UCLs derived from the probability density function estimated through kernel density estimation (KDE) is proposed and applied to the simulated nonlinear Tennessee Eastman Process Plant. The proposed CVA with KDE approach significantly improves the monitoring performance and detects faults earlier than the other methods also examined in this study.
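
The KDE-based control limit itself is easy to sketch independently of CVA: fit a kernel density to the monitoring statistic under normal operation and read off a high quantile instead of assuming Gaussianity. The statistic, quantile level, and grid below are illustrative.

```python
import numpy as np
from scipy.stats import gaussian_kde

def kde_control_limit(stat_normal, alpha=0.01, grid_size=5000):
    """Upper control limit as the (1 - alpha) quantile of a kernel density
    fitted to the monitoring statistic under normal operation."""
    kde = gaussian_kde(stat_normal)
    grid = np.linspace(stat_normal.min(), stat_normal.max() * 2, grid_size)
    cdf = np.cumsum(kde(grid))
    cdf /= cdf[-1]
    return grid[np.searchsorted(cdf, 1.0 - alpha)]

# usage: flag samples whose monitoring statistic exceeds the KDE-based UCL
rng = np.random.default_rng(0)
t2_normal = rng.chisquare(df=5, size=2000)     # stand-in for a T^2-type statistic
ucl = kde_control_limit(t2_normal)
```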

Journal ArticleDOI
TL;DR: Empirical study on three real-world databases shows that PNMF can achieve the best or close to the best in clustering and run more efficiently than the compared NMF methods, especially for high-dimensional data.
Abstract: A variant of nonnegative matrix factorization (NMF) which was proposed earlier is analyzed here. It is called projective nonnegative matrix factorization (PNMF). The new method approximately factorizes a projection matrix, minimizing the reconstruction error, into a positive low-rank matrix and its transpose. The dissimilarity between the original data matrix and its approximation can be measured by the Frobenius matrix norm or the modified Kullback-Leibler divergence. Both measures are minimized by multiplicative update rules, whose convergence is proven for the first time. Enforcing orthonormality to the basic objective is shown to lead to an even more efficient update rule, which is also readily extended to nonlinear cases. The formulation of the PNMF objective is shown to be connected to a variety of existing NMF methods and clustering approaches. In addition, the derivation using Lagrangian multipliers reveals the relation between reconstruction and sparseness. For kernel principal component analysis (PCA) with the binary constraint, useful in graph partitioning problems, the nonlinear kernel PNMF provides a good approximation which outperforms an existing discretization approach. Empirical study on three real-world databases shows that PNMF can achieve the best or close to the best in clustering. The proposed algorithm runs more efficiently than the compared NMF methods, especially for high-dimensional data. Moreover, contrary to the basic NMF, the trained projection matrix can be readily used for newly coming samples and demonstrates good generalization.
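
For the Frobenius objective ||X - W W^T X||_F^2 with nonnegative X, a multiplicative update can be obtained by splitting the gradient into positive and negative parts, as sketched below; the exact rule used and proven convergent in the paper should be taken from the paper itself.

```python
import numpy as np

def pnmf_frobenius(X, rank=10, n_iter=200, eps=1e-9):
    """Projective NMF sketch: learn a nonnegative W so that X ~ W W^T X.
    Only S = X X^T is needed inside the iteration."""
    rng = np.random.default_rng(0)
    W = np.abs(rng.normal(size=(X.shape[0], rank)))
    S = X @ X.T
    for _ in range(n_iter):
        SW = S @ W
        numer = 2.0 * SW                                   # negative part of the gradient
        denom = W @ (W.T @ SW) + S @ (W @ (W.T @ W)) + eps  # positive part of the gradient
        W *= numer / denom                                 # multiplicative update
    return W
```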

Journal ArticleDOI
TL;DR: This paper proposes a semi-supervised dimensionality reduction method which preserves the global structure of unlabeled samples in addition to separating labeled samples in different classes from each other and shows the usefulness of SELF through experiments with benchmark and real-world document classification datasets.
Abstract: When only a small number of labeled samples are available, supervised dimensionality reduction methods tend to perform poorly because of overfitting. In such cases, unlabeled samples could be useful in improving the performance. In this paper, we propose a semi-supervised dimensionality reduction method which preserves the global structure of unlabeled samples in addition to separating labeled samples in different classes from each other. The proposed method, which we call SEmi-supervised Local Fisher discriminant analysis (SELF), has an analytic form of the globally optimal solution and it can be computed based on eigen-decomposition. We show the usefulness of SELF through experiments with benchmark and real-world document classification datasets.

Journal ArticleDOI
TL;DR: In this paper, the authors consider nonparametric estimation of the mean and covariance functions for functional/longitudinal data and derive almost sure rates of convergence for principal component analysis using the estimated covariance function.
Abstract: We consider nonparametric estimation of the mean and covariance functions for functional/longitudinal data. Strong uniform convergence rates are developed for estimators that are local-linear smoothers. Our results are obtained in a unified framework in which the number of observations within each curve/cluster can be of any rate relative to the sample size. We show that the convergence rates for the procedures depend on both the number of sample curves and the number of observations on each curve. For sparse functional data, these rates are equivalent to the optimal rates in nonparametric regression. For dense functional data, root-n rates of convergence can be achieved with proper choices of bandwidths. We further derive almost sure rates of convergence for principal component analysis using the estimated covariance function. The results are illustrated with simulation studies.

Journal ArticleDOI
TL;DR: A new formulation for multiway spectral clustering corresponds to a weighted kernel principal component analysis (PCA) approach based on primal-dual least-squares support vector machine (LS-SVM) formulations and exploits the structure of the eigenvectors and the corresponding projections when the clusters are well formed.
Abstract: A new formulation for multiway spectral clustering is proposed. This method corresponds to a weighted kernel principal component analysis (PCA) approach based on primal-dual least-squares support vector machine (LS-SVM) formulations. The formulation allows the extension to out-of-sample points. In this way, the proposed clustering model can be trained, validated, and tested. The clustering information is contained on the eigendecomposition of a modified similarity matrix derived from the data. This eigenvalue problem corresponds to the dual solution of a primal optimization problem formulated in a high-dimensional feature space. A model selection criterion called the balanced line fit (BLF) is also proposed. This criterion is based on the out-of-sample extension and exploits the structure of the eigenvectors and the corresponding projections when the clusters are well formed. The BLF criterion can be used to obtain clustering parameters in a learning framework. Experimental results with difficult toy problems and image segmentation show improved performance in terms of generalization to new samples and computation times.
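
As a generic baseline for multiway spectral clustering (not the paper's weighted kernel PCA / LS-SVM formulation, its out-of-sample extension, or the BLF criterion), the following sketch clusters the leading eigenvectors of a normalized RBF similarity matrix with k-means.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics.pairwise import rbf_kernel

def spectral_clusters(X, n_clusters=3, gamma=1.0):
    """Multiway spectral clustering via the leading eigenvectors of a
    symmetrically normalized kernel matrix, followed by k-means."""
    K = rbf_kernel(X, gamma=gamma)
    d_inv_sqrt = 1.0 / np.sqrt(K.sum(axis=1))
    K_norm = K * np.outer(d_inv_sqrt, d_inv_sqrt)          # symmetric normalization
    eigvals, eigvecs = np.linalg.eigh(K_norm)
    V = eigvecs[:, -n_clusters:]                           # leading eigenvectors
    V /= np.linalg.norm(V, axis=1, keepdims=True)          # row-normalize before k-means
    return KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(V)
```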

Journal ArticleDOI
TL;DR: In this paper, Monte Carlo simulation is applied to compare principal component analysis applied to data envelopment analysis (PCA-DEA) and variable reduction based on partial covariance (VR).