
Showing papers on "Principal component analysis published in 2019"


Journal ArticleDOI
TL;DR: A novel feature extraction method called robust sparse linear discriminant analysis (RSLDA) is proposed to address the limitations of classical LDA, and it achieves competitive performance compared with other state-of-the-art feature extraction methods.
Abstract: Linear discriminant analysis (LDA) is a very popular supervised feature extraction method and has been extended to different variants. However, classical LDA has the following problems: 1) the obtained discriminant projection does not have good interpretability for features; 2) LDA is sensitive to noise; and 3) LDA is sensitive to the selection of the number of projection directions. In this paper, a novel feature extraction method called robust sparse linear discriminant analysis (RSLDA) is proposed to solve the above problems. Specifically, RSLDA adaptively selects the most discriminative features for discriminant analysis by introducing the $l_{2,1}$ norm. An orthogonal matrix and a sparse matrix are also simultaneously introduced to guarantee that the extracted features preserve the main energy of the original data and to enhance robustness to noise; thus, RSLDA has the potential to perform better than other discriminant methods. Extensive experiments on six databases demonstrate that the proposed method achieves competitive performance compared with other state-of-the-art feature extraction methods. Moreover, the proposed method is robust to noisy data.
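As a hedged illustration of the mechanism behind RSLDA's feature selection, the sketch below computes the $l_{2,1}$ norm of a projection matrix and shows how row sparsity translates into dropped features; the matrix `W` is made up for illustration and is not the paper's learned projection.

```python
import numpy as np

# Minimal sketch: the l2,1 norm of a projection matrix W is the sum of the
# l2 norms of its rows. Penalizing it drives entire rows of W to zero, so
# the corresponding input features are dropped -- the mechanism RSLDA uses
# for adaptive feature selection. W below is a made-up example.
def l21_norm(W):
    return np.sum(np.linalg.norm(W, axis=1))

W = np.array([[0.9, -0.4],   # feature 0: active
              [0.0,  0.0],   # feature 1: zeroed out -> not selected
              [0.3,  0.7]])  # feature 2: active

print(l21_norm(W))                     # sum of row norms ~= 1.75
selected = np.linalg.norm(W, axis=1) > 1e-8
print(np.where(selected)[0])           # features kept: [0 2]
```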

261 citations


Journal ArticleDOI
TL;DR: Experimental results show that the proposed hybrid dimensionality reduction method with an ensemble of base learners contributes more critical features and significantly outperforms individual approaches, achieving high accuracy and low false alarm rates.

200 citations


Journal ArticleDOI
TL;DR: The proposed approach provides a reliable and efficient tool for approximating parametrized time-dependent problems, and its effectiveness is illustrated by non-trivial numerical examples.

181 citations


Journal ArticleDOI
TL;DR: Experimental results show that the PCA method can effectively eliminate feature correlation and realize dimension reduction of the feature matrix; that the BLS offers better adaptability, faster computation, and higher classification accuracy; and that the PABSFD method can efficiently and accurately obtain fault diagnosis results.
Abstract: Traditional feature extraction methods are used to extract signal features and construct a fault feature matrix, which suffers from complex structure, high correlation, and redundancy. This increases the complexity of fault classification and seriously affects the accuracy and efficiency of fault identification. To solve these problems, a new fault diagnosis (PABSFD) method based on principal component analysis (PCA) and the broad learning system (BLS) is proposed for rotor systems in this paper. In the proposed PABSFD method, PCA, which reveals the essence of the signal, is used to reduce the dimension of the constructed feature matrix, decrease the linear correlation between features, and eliminate redundant attributes, in order to obtain a low-dimensional feature matrix that retains the essential features for the classification model. Then, the BLS, with low time complexity and high classification accuracy, is adopted as the classification model to realize fault identification; it can efficiently accomplish the fault classification of the rotor system. Finally, actual vibration data of a rotor system are selected to test and verify the effectiveness of the PABSFD method. The experimental results show that PCA can effectively eliminate feature correlation and realize dimension reduction of the feature matrix; that the BLS offers better adaptability, faster computation, and higher classification accuracy; and that the PABSFD method can efficiently and accurately obtain fault diagnosis results.
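A minimal sketch of the PABSFD structure, assuming a scikit-learn pipeline; since no broad learning system ships with scikit-learn, `RidgeClassifier` stands in for the BLS, and `X`/`y` are placeholder data rather than rotor vibration features.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import RidgeClassifier

# PCA decorrelates and compresses the fault feature matrix, then a fast
# classifier labels the fault type. RidgeClassifier is a stand-in for the
# paper's BLS; X and y are synthetic placeholders.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 40))          # placeholder feature matrix
y = rng.integers(0, 4, size=200)        # placeholder fault labels

model = make_pipeline(
    StandardScaler(),
    PCA(n_components=0.95),  # keep components explaining 95% of variance
    RidgeClassifier(),
)
model.fit(X, y)
print(model.named_steps["pca"].n_components_)  # retained dimensionality
```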

170 citations


Journal ArticleDOI
TL;DR: This system is capable of outperforming the Buy and Hold (B&H) strategy in three of the five analyzed financial markets, achieving an average rate of return of 49.26% in the portfolio, while the B&H achieves on average 32.41%.
Abstract: When investing in financial markets, it is crucial to determine a trading signal that can provide the investor with the best entry and exit points of the financial market; however, this is a difficult task and has become a very popular research topic in the financial area. This paper presents an expert system in the financial area that combines Principal Component Analysis (PCA), the Discrete Wavelet Transform (DWT), Extreme Gradient Boosting (XGBoost) and a Multi-Objective Optimization Genetic Algorithm (MOO-GA) in order to achieve high returns with a low level of risk. PCA is used to reduce the dimensionality of the financial input data set and the DWT is used to perform noise reduction on every feature. The resulting data set is then fed to an XGBoost binary classifier that has its hyperparameters optimized by a MOO-GA. The importance of the PCA is analyzed and the results obtained show that it greatly improves the performance of the system. To further improve the results obtained with PCA alone, the PCA and the DWT are then applied together in one system, and the results show that this system is capable of outperforming the Buy and Hold (B&H) strategy in three of the five analyzed financial markets, achieving an average rate of return of 49.26% in the portfolio, while the B&H achieves on average 32.41%.
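The preprocessing chain lends itself to a short sketch. The version below assumes `pywt` for the wavelet denoising and `xgboost` for the classifier; the wavelet ('db4'), the soft-threshold rule, and the synthetic data are illustrative choices rather than the authors' exact settings, and the MOO-GA hyperparameter search is omitted.

```python
import numpy as np
import pywt
from sklearn.decomposition import PCA
from xgboost import XGBClassifier

# Soft-threshold wavelet denoising per feature, PCA on the result, then an
# XGBoost binary classifier -- a hedged sketch of the paper's chain.
def dwt_denoise(signal, wavelet="db4", level=2):
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745      # noise estimate
    thr = sigma * np.sqrt(2 * np.log(len(signal)))      # universal threshold
    coeffs[1:] = [pywt.threshold(c, thr, mode="soft") for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet)[: len(signal)]

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 16))                  # placeholder market features
y = (rng.random(500) > 0.5).astype(int)         # placeholder buy/sell signal

X_dn = np.apply_along_axis(dwt_denoise, 0, X)   # denoise each feature column
X_pc = PCA(n_components=8).fit_transform(X_dn)  # reduce dimensionality
clf = XGBClassifier(n_estimators=100).fit(X_pc, y)
```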

169 citations


Journal ArticleDOI
TL;DR: It is shown that when the number of machines is not unreasonably large, the distributed PCA performs as well as the whole-sample PCA, even without full access to the whole data.
Abstract: Principal component analysis (PCA) is fundamental to statistical machine learning. It extracts latent principal factors that contribute to the most variation of the data. When data are stored across multiple machines, however, communication cost can prohibit the computation of PCA in a central location, and distributed algorithms for PCA are thus needed. This paper proposes and studies a distributed PCA algorithm: each node machine computes the top K eigenvectors and transmits them to the central server; the central server then aggregates the information from all the node machines and conducts a PCA based on the aggregated information. We investigate the bias and variance of the resulting distributed estimator of the top K eigenvectors. In particular, we show that for distributions with symmetric innovation, the empirical top eigenspaces are unbiased and hence the distributed PCA is "unbiased". We derive the rate of convergence for distributed PCA estimators, which depends explicitly on the effective rank of the covariance, the eigen-gap, and the number of machines. We show that when the number of machines is not unreasonably large, the distributed PCA performs as well as the whole-sample PCA, even without full access to the whole data. The theoretical results are verified by an extensive simulation study. We also extend our analysis to the heterogeneous case, where the population covariance matrices are different across local machines but share similar top eigen-structures.
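A compact sketch of one plausible reading of this scheme: each node returns its local top-K eigenvectors, and the server averages the projection matrices $V_l V_l^\top$ before re-extracting the top-K eigenspace. The aggregation rule here is an assumption based on the description above, not a verbatim transcription of the paper's algorithm.

```python
import numpy as np

# Each node computes the top-k eigenvectors of its local sample covariance;
# the server averages the local projection matrices and re-extracts the
# top-k eigenspace from the average.
def local_top_k(X, k):
    cov = np.cov(X, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)          # ascending eigenvalues
    return vecs[:, -k:]                       # top-k eigenvectors

def distributed_pca(shards, k):
    d = shards[0].shape[1]
    avg_proj = np.zeros((d, d))
    for X in shards:                          # one round of communication,
        V = local_top_k(X, k)                 # d*k numbers per node
        avg_proj += V @ V.T
    avg_proj /= len(shards)
    vals, vecs = np.linalg.eigh(avg_proj)
    return vecs[:, -k:]                       # aggregated top-k eigenspace

rng = np.random.default_rng(2)
full = rng.normal(size=(10_000, 20)) @ rng.normal(size=(20, 20))
shards = np.array_split(full, 10)             # 10 node machines
V_hat = distributed_pca(shards, k=3)
V_all = local_top_k(full, k=3)                # whole-sample PCA
# Compare the two 3-dimensional eigenspaces via principal angles:
print(np.linalg.svd(V_hat.T @ V_all, compute_uv=False))  # ~ [1, 1, 1]
```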

138 citations


Journal ArticleDOI
TL;DR: In this article, the authors develop a methodology to conduct principal component analysis at high frequency and construct estimators of realized eigenvalues, eigenvectors, and principal components.
Abstract: We develop the necessary methodology to conduct principal component analysis at high frequency. We construct estimators of realized eigenvalues, eigenvectors, and principal components, and provide ...

131 citations


Journal ArticleDOI
TL;DR: A review of proper orthogonal decomposition (POD) methods for order reduction in a variety of research areas is presented in this paper, where the historical development and basic mathematical formulation of the POD method are introduced.

129 citations


Journal ArticleDOI
TL;DR: The proposed data-mining-based model comprises PCA (principal component analysis), k-means, and logistic regression algorithms, and is shown to be useful for automatically predicting diabetes from patient electronic health record data.

114 citations


Book
10 Jul 2019
TL;DR: This book provides a practical introduction to the statistical analysis of compositional data using R, covering its geometrical properties, exploratory analysis and visualization, cluster, principal component, correlation, discriminant and regression analysis, and methods for high-dimensional compositional data.
Abstract: Contents: Preface; Acknowledgements; Compositional data as a methodological concept; Analyzing compositional data using R; Geometrical properties of compositional data; Exploratory data analysis and visualization; First steps for a statistical analysis; Cluster analysis; Principal component analysis; Correlation analysis; Discriminant analysis; Regression analysis; Methods for high-dimensional compositional data; Compositional tables; Preprocessing issues; Index.

114 citations


Proceedings ArticleDOI
01 Oct 2019
TL;DR: A camera-based real-time face recognition system is built and its algorithm implemented using OpenCV, Haar Cascade, Eigenface, Fisher Face, LBPH, and Python.
Abstract: Face detection and recognition in images or video is a popular subject of research in biometrics. Face recognition in a real-time setting is an exciting area and a rapidly growing challenge, with applications in authentication frameworks. This paper proposes a facial recognition system based on PCA (Principal Component Analysis). Principal component analysis (PCA) is a statistical method under the broad heading of factor analysis. The aim of PCA is to reduce the large amount of data storage to the size of the feature space that is required to represent the data economically. For facial recognition, the wide 1-D pixel vector formed from the 2-D face image is projected onto the compact principal components of the feature space; this is called the eigenspace projection. The eigenspace is determined by identifying the eigenvectors of the covariance matrix computed from a mean-centered collection of face images. We build a camera-based real-time face recognition system and implement the algorithm using OpenCV, Haar Cascade, Eigenface, Fisher Face, LBPH, and Python.
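A NumPy-only sketch of the eigenspace projection described above; the paper's actual system uses OpenCV's Haar cascade detector and its Eigenface/Fisherface/LBPH recognizers, so the training stack `faces` and the nearest-neighbour matcher here are illustrative stand-ins.

```python
import numpy as np

# Minimal eigenface sketch: mean-centre the flattened face images, take the
# eigenvectors of the covariance (via SVD), and recognize by nearest
# neighbour in the projected eigenspace. `faces` is made-up data.
rng = np.random.default_rng(3)
faces = rng.random((50, 32 * 32))             # placeholder training faces
labels = np.repeat(np.arange(10), 5)          # 10 subjects, 5 images each

mean_face = faces.mean(axis=0)
A = faces - mean_face                         # mean-centred data
U, s, Vt = np.linalg.svd(A, full_matrices=False)
eigenfaces = Vt[:20]                          # keep 20 principal components
train_proj = A @ eigenfaces.T                 # project training set

def recognize(face_vec):
    proj = (face_vec - mean_face) @ eigenfaces.T
    dists = np.linalg.norm(train_proj - proj, axis=1)
    return labels[np.argmin(dists)]           # nearest neighbour in eigenspace

print(recognize(faces[7]))                    # recovers subject 1
```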

Journal ArticleDOI
TL;DR: This paper presents a novel approach to detect and classify ice thickness based on pattern recognition through guided ultrasonic waves and Machine Learning, and considers four feature extraction methods to validate the results.

Journal ArticleDOI
TL;DR: It is shown how carefully applied, classical principal component analysis examined via biplots can guide the selections of background compositions and reference elements in the calculation of the enrichment factor.

Journal ArticleDOI
TL;DR: Comparison experiments indicate that dimension reduction techniques can indeed improve SVM modeling performance, and that the Isomap–SVM model with nonlinear global dimension reduction outperforms all candidate models in both qualitative and quantitative analysis.
Abstract: Manufacturing quality prediction models, as an effective measure to monitor quality in advance, have been developed using various data-driven techniques. However, the multiple parameters across the multiple stages of modern manufacturing bring about the curse of dimensionality, leading to difficulties in feature extraction, learning, and quality modeling. To address this issue, three dimension reduction techniques are investigated in this paper, i.e., principal component analysis (PCA), locally linear embedding (LLE), and isometric mapping (Isomap). Specifically, PCA is a linear dimension reduction technique, LLE is a nonlinear reduction technique with a local perspective, and Isomap is a nonlinear reduction technique with a global perspective. After obtaining the low-dimensional information from the PCA, LLE, and Isomap methods respectively, a support vector machine (SVM) is utilized for modeling. To reveal the effectiveness of dimension reduction and compare the differences among the three techniques, two experimental manufacturing datasets are collected from a competition on manufacturing quality control in the Tianchi Data Lab of China. The comparison experiments indicate that dimension reduction techniques can indeed improve SVM modeling performance, and that the Isomap–SVM model with nonlinear global dimension reduction outperforms all candidate models in both qualitative and quantitative analysis.
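The three reducers and the SVM are all available in scikit-learn, so the comparison can be sketched directly; the digits dataset stands in for the Tianchi manufacturing data, and the component counts are illustrative rather than the paper's settings.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.manifold import LocallyLinearEmbedding, Isomap
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Compare linear (PCA), nonlinear-local (LLE), and nonlinear-global (Isomap)
# reduction feeding the same SVM; each reducer is fit per CV fold.
X, y = load_digits(return_X_y=True)
reducers = {
    "PCA": PCA(n_components=10),                      # linear
    "LLE": LocallyLinearEmbedding(n_components=10),   # nonlinear, local
    "Isomap": Isomap(n_components=10),                # nonlinear, global
}
for name, red in reducers.items():
    pipe = make_pipeline(StandardScaler(), red, SVC())
    print(name, cross_val_score(pipe, X, y, cv=3).mean())
```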

Journal ArticleDOI
TL;DR: The most widely used feature extraction techniques, such as EMD and PCA, and feature selection techniques, such as correlation, LDA, and forward selection, are analyzed with respect to performance and accuracy, and it is discussed how dimension reduction is performed in deep learning.

Journal ArticleDOI
TL;DR: This paper proposes a new formulation of linear discriminant analysis via joint $L_{2,1}$-norm minimization on the objective function to induce robustness, so as to efficiently alleviate the influence of outliers and improve the robustness of the proposed method.
Abstract: Dimensionality reduction is a critical technology in the domain of pattern recognition, and linear discriminant analysis (LDA) is one of the most popular supervised dimensionality reduction methods. However, whenever the distance criterion of its objective function uses the $L_2$-norm, it is sensitive to outliers. In this paper, we propose a new formulation of linear discriminant analysis via joint $L_{2,1}$-norm minimization on the objective function to induce robustness, so as to efficiently alleviate the influence of outliers. An efficient iterative algorithm is proposed to solve the optimization problem and proved to be convergent. Extensive experiments are performed on an artificial data set, on UCI data sets, and on four face data sets, which sufficiently demonstrate the efficiency of our approach compared to other methods and its robustness to outliers.

Journal ArticleDOI
TL;DR: The use and misuse of PCA for measuring well-being is discussed, some applications to real data are shown, and the technique is shown to be unsuitable for some types of indicators.
Abstract: The measurement of well-being of people is very difficult because it is characterized by a multiplicity of aspects or dimensions. Principal Components Analysis (PCA) is probably the most popular multivariate statistical technique for reducing data with many dimensions and, often, well-being indicators are reduced to a single index of well-being by using PCA. However, PCA is implicitly based on a reflective measurement model that is not suitable for all types of indicators. In this paper, we discuss the use and misuse of PCA for measuring well-being, and we show some applications to real data.
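A short sketch of the practice the paper scrutinizes: reducing several standardized indicators to a single index via the first principal component. The indicator matrix is made up, and the closing comment reflects the paper's caution rather than any code output.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Collapse several standardized well-being indicators into one index by
# projecting onto the first principal component. Data are placeholders.
rng = np.random.default_rng(4)
indicators = rng.normal(size=(100, 6))        # 100 regions, 6 indicators

Z = StandardScaler().fit_transform(indicators)
pca = PCA(n_components=1).fit(Z)
wellbeing_index = pca.transform(Z).ravel()    # one score per region

print(pca.components_[0])                     # indicator weights (loadings)
print(pca.explained_variance_ratio_[0])       # share of variance captured;
# if this share is low, or the indicators are formative rather than
# reflective, a single PCA index is a questionable summary -- the paper's
# central point.
```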

Journal ArticleDOI
TL;DR: Two new variants using principal component analysis (PCA_BA and PCA_LBA) are designed in this paper to test and improve the global search ability of BA with large-scale problems, and a correlation threshold and generation threshold are determined using the golden section method.
Abstract: The bat algorithm (BA) is a novel evolutionary optimization algorithm, most studies of which have been performed with low-dimensional problems. To test and improve the global search ability of BA with large-scale problems, two new variants using principal component analysis (PCA_BA and PCA_LBA) are designed in this paper. A correlation threshold and generation threshold are determined using the golden section method to enhance the effectiveness of this new strategy. To test performance, CEC’2008 large-scale benchmark functions are utilized and compared with other algorithms; simulation results indicate the validity of this modification.

Journal ArticleDOI
TL;DR: The experimental results on popular available datasets show the superiority of the clustering accuracy of the proposed method over basic clustering methods, such as single, average, and centroid linkage, and over previously proposed hierarchical clustering combination methods.
Abstract: In expert systems, data mining methods are algorithms that simulate humans' problem-solving capabilities. Clustering methods, as unsupervised machine learning methods, are crucial approaches for categorizing similar samples into the same categories. Applying different clustering algorithms to a given dataset produces clusters of different qualities. Hence, many researchers have applied clustering combination methods to reduce the risk of choosing an inappropriate clustering algorithm. In these methods, the outputs of several clustering algorithms are combined: the input hierarchical clusterings are transformed into descriptor matrices, and their combination is achieved by aggregating these descriptor matrices. In previous works, only element-wise aggregation operators have been used, and the relation between the elements of each descriptor matrix has been ignored. However, the value of each element of the descriptor matrix is meaningful in comparison with its other elements. The current study proposes a novel method of combining hierarchical clustering approaches based on principal component analysis (PCA). PCA as an aggregator allows considering all elements of the descriptor matrices. In the proposed approach, basic clusterings are made and transformed into descriptor matrices. Then, a final matrix is extracted from the descriptor matrices using PCA. Next, a final dendrogram is constructed from this matrix to summarize the results of the diverse clusterings. The experimental results on popular available datasets show the superiority of the clustering accuracy of the proposed method over basic clustering methods, such as single, average, and centroid linkage, as well as previously proposed hierarchical clustering combination methods. In addition, statistical tests show that the proposed method significantly outperformed hierarchical clustering combination methods with element-wise averaging operators on almost all tested datasets. Several experiments have also been conducted which confirm the robustness of the proposed method to its parameter setting.
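One heavily hedged reading of this pipeline, assuming the descriptor of a hierarchical clustering is its cophenetic distance matrix and that the PCA aggregation amounts to a rank-1 SVD reconstruction of the stacked descriptors; the paper's exact descriptor construction and aggregation step may well differ.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, cophenet
from scipy.spatial.distance import pdist

# Hedged sketch: describe each base hierarchical clustering by its
# cophenetic distance matrix (condensed form), stack the descriptors,
# take a rank-1 PCA/SVD reconstruction as the aggregation step, and
# build the final dendrogram from the consensus distances. X is made up.
rng = np.random.default_rng(5)
X = rng.normal(size=(30, 4))
D = pdist(X)

methods = ["single", "average", "centroid"]
descriptors = np.stack([cophenet(linkage(D, m)) for m in methods])

mean_desc = descriptors.mean(axis=0)
U, s, Vt = np.linalg.svd(descriptors - mean_desc, full_matrices=False)
approx = mean_desc + np.outer(U[:, 0] * s[0], Vt[0])  # rank-1 reconstruction
consensus = approx.mean(axis=0)                       # aggregated descriptor

final_tree = linkage(np.clip(consensus, 0, None), method="average")
```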

Journal ArticleDOI
03 Apr 2019-Energies
TL;DR: A novel algorithm based on Improved Principal Component Analysis and 1-Dimensional Convolution Neural Network for detection and classification of PQDs is proposed and shows that the proposed method gives significantly higher classification accuracy.
Abstract: The excessive use of power semiconductor devices in a grid utility increases malfunctions of the control system, produces power quality disturbances (PQDs), and reduces electrical component life. The present work proposes a novel algorithm based on Improved Principal Component Analysis (IPCA) and a 1-Dimensional Convolution Neural Network (1-D-CNN) for detection and classification of PQDs. Firstly, IPCA is used to extract statistical features of PQDs such as Root Mean Square, Skewness, Range, Kurtosis, Crest Factor, and Form Factor. IPCA is decomposed into four levels. The principal component (PC) obtained by IPCA contains a maximum amount of the original data as compared to PCA. The 1-D-CNN is also used to extract features such as mean, energy, standard deviation, Shannon entropy, and log-energy entropy. Statistical analysis is employed for optimal feature selection. Secondly, these improved features of the PQDs are fed to the 1-D-CNN-based classifier to gain maximum classification accuracy. The proposed IPCA-1-D-CNN is utilized for classification of 12 types of synthetic and simulated single and multiple PQDs. The simulated PQDs are generated from a modified IEEE bus system with wind energy penetration in the balanced distribution system. Finally, the proposed IPCA-1-D-CNN algorithm has been tested in noisy (50 dB to 20 dB) and noiseless environments. The obtained results are compared with SVM and other existing techniques. The comparative results show that the proposed method gives significantly higher classification accuracy.
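A hedged sketch of a 1-D CNN of the kind described, in PyTorch; the layer sizes, kernel widths, and input length are assumptions, with only the 12-class output taken from the abstract, and the IPCA feature fusion is omitted.

```python
import torch
import torch.nn as nn

# Illustrative 1-D CNN for disturbance-waveform classification. The
# architecture is an assumption, not the authors' specification; only
# the 12-class output follows the abstract.
class PQDNet(nn.Module):
    def __init__(self, n_classes=12):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=7, padding=3), nn.ReLU(),
            nn.MaxPool1d(4),
            nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(8),
        )
        self.classifier = nn.Linear(32 * 8, n_classes)

    def forward(self, x):                 # x: (batch, 1, signal_length)
        z = self.features(x).flatten(1)
        return self.classifier(z)

net = PQDNet()
wave = torch.randn(4, 1, 640)             # 4 placeholder disturbance signals
print(net(wave).shape)                    # torch.Size([4, 12])
```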

Journal ArticleDOI
TL;DR: This work proposes a two-stage machine learning analysis architecture which can accurately predict motor fault modes using only motor vibration time-domain signals, without any complicated preprocessing, and improves the prediction accuracy as evaluated by several classification algorithms.
Abstract: In most fault detection methods, the time-domain signals collected from mechanical equipment usually need to be transformed into the frequency domain or other high-level data, relying heavily on professional knowledge such as signal processing and fault pattern recognition. Contrary to those existing approaches, we propose a two-stage machine learning analysis architecture which can accurately predict motor fault modes using only motor vibration time-domain signals, without any complicated preprocessing. In the first stage, an RNN-based VAE method is proposed which is highly suitable for dimension reduction of time series data. In addition to reducing the dimension of the sequential data from 150*3 to 25 dimensions, our method furthermore improves the prediction accuracy as evaluated by several classification algorithms, while other dimension reduction methods such as the Autoencoder and Variational Autoencoder cannot improve the classification accuracy effectively, or even decrease it. This indicates that the sequential data after dimension reduction via the RNN-based VAE still retain the information of the high-dimensional data. Furthermore, the experimental results demonstrate that the method can be well applied to time series data dimension reduction and shows a significant improvement of the prediction performance; even a simple double-layer neural network can reach over 99% accuracy. In the second stage, Principal Components Analysis (PCA) and Linear Discriminant Analysis (LDA) are used to perform a second dimension reduction, such that the different or unknown fault modes can be clearly visualized and detected.
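The second stage can be sketched compactly with scikit-learn, assuming 25-dimensional VAE latent codes as input (faked below with random data); the RNN-based VAE itself is omitted.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Reduce the (placeholder) 25-dimensional VAE codes to 2-D with PCA
# (unsupervised) and LDA (supervised) so fault modes can be visualized.
rng = np.random.default_rng(6)
codes = rng.normal(size=(300, 25))            # placeholder VAE latent codes
faults = rng.integers(0, 3, size=300)         # placeholder fault-mode labels

xy_pca = PCA(n_components=2).fit_transform(codes)
xy_lda = (LinearDiscriminantAnalysis(n_components=2)
          .fit_transform(codes, faults))
print(xy_pca.shape, xy_lda.shape)             # (300, 2) (300, 2)
```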

Journal ArticleDOI
TL;DR: A number of simple test statistics appropriate for testing PC's are reviewed and a real‐world example is used to illustrate how this can be done using randomization tests.
Abstract: Principal components analysis (PCA) is a common method to summarize a larger set of correlated variables into a smaller and more easily interpretable set of axes of variation. However, the different components need to be distinct from each other to be interpretable; otherwise, they only represent random directions. This is a fundamental assumption of PCA and, thus, needs to be tested every time. Sample correlation matrices will always result in a pattern of decreasing eigenvalues even if there is no structure. Tests are, therefore, needed to discern real patterns from illusory ones. Furthermore, the loadings of the vectors need to be larger than expected for random data to be useful in the calculation of PC scores. PC scores calculated from nondistinct PCs have very large standard errors and cannot be used for biological interpretations. I give a number of examples to illustrate the potential problems with PCA. Robustness of the PCs increases with increasing sample size but not with the number of traits. I review a few simple test statistics appropriate for testing PCs and use a real-world example to illustrate how this can be done using randomization tests. PCA can be very useful, but great care is needed to avoid spurious results.
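A minimal sketch of the kind of randomization test the author advocates: permuting each trait independently destroys correlations while preserving marginal distributions, giving a null distribution of eigenvalues against which the observed ones can be compared. The permutation scheme is a standard one and an assumption about the paper's exact procedure.

```python
import numpy as np

# Randomization test for PC eigenvalues: permute each column (trait)
# independently, recompute correlation-matrix eigenvalues, and compare
# observed eigenvalues with the resulting null distribution.
def pc_randomization_test(X, n_perm=999, seed=0):
    rng = np.random.default_rng(seed)
    obs = np.linalg.eigvalsh(np.corrcoef(X, rowvar=False))[::-1]
    null = np.empty((n_perm, X.shape[1]))
    for i in range(n_perm):
        Xp = np.column_stack([rng.permutation(col) for col in X.T])
        null[i] = np.linalg.eigvalsh(np.corrcoef(Xp, rowvar=False))[::-1]
    # one-sided p-value per component: how often random data produce an
    # eigenvalue at least as large as the observed one
    p = (1 + (null >= obs).sum(axis=0)) / (n_perm + 1)
    return obs, p

rng = np.random.default_rng(7)
X = rng.normal(size=(100, 6))
X[:, 1] = X[:, 0] + 0.3 * rng.normal(size=100)   # one real axis of variation
print(pc_randomization_test(X, n_perm=199))       # only PC1 should be "real"
```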

Journal ArticleDOI
TL;DR: Motivated by the deep learning strategy, a hierarchical statistical model structure is designed to extract multilayer data features, including both linear and nonlinear principal components, with a sparse kernel model used to reduce the computational complexity of nonlinear feature extraction.
Abstract: In order to deeply exploit intrinsic data feature information hidden among the process data, an improved kernel principal component analysis (KPCA) method is proposed, which is referred to as deep principal component analysis (DePCA). Specifically, motivated by the deep learning strategy, we design a hierarchical statistical model structure to extract multilayer data features, including both the linear and nonlinear principal components. To reduce the computation complexity in nonlinear feature extraction, the feature-samples’ selection technique is applied to build the sparse kernel model for DePCA. To integrate the monitoring statistics at each feature layer, Bayesian inference is used to transform the monitoring statistics into fault probabilities, and then, two probability-based DePCA monitoring statistics are constructed by weighting the fault probabilities at all the feature layers. Two case studies involving a simulated nonlinear system and the benchmark Tennessee Eastman process demonstrate the superior fault detection performance of the proposed DePCA method over the traditional KPCA-based methods.
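A heavily hedged sketch of the hierarchical idea only: a linear PCA layer followed by a kernel PCA layer applied to its scores, using scikit-learn; the sparse kernel model, the monitoring statistics, and the Bayesian fusion of fault probabilities from the paper are all omitted.

```python
import numpy as np
from sklearn.decomposition import PCA, KernelPCA

# Two stacked feature layers: linear PCA first, then kernel PCA on the
# first layer's scores -- one plausible reading of the DePCA hierarchy.
# Kernel choice and component counts are illustrative assumptions.
rng = np.random.default_rng(8)
X = rng.normal(size=(400, 12))               # placeholder process data

layer1 = PCA(n_components=6).fit(X)
T1 = layer1.transform(X)                     # linear principal components

layer2 = KernelPCA(n_components=4, kernel="rbf", gamma=0.1).fit(T1)
T2 = layer2.transform(T1)                    # nonlinear components of layer 1

print(T1.shape, T2.shape)                    # (400, 6) (400, 4)
```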

Journal ArticleDOI
TL;DR: The experimental results in the various datasets demonstrate that Mc2PCA is superior to the traditional methods for multivariate time series clustering.

Journal ArticleDOI
TL;DR: In this paper, a class of partially linear functional additive models (PLFAM) is proposed to predict a scalar response by both parametric effects of a multivariate predictor and nonparametric effect of a multi-dimensional functional predictor.
Abstract: We investigate a class of partially linear functional additive models (PLFAM) that predicts a scalar response by both parametric effects of a multivariate predictor and nonparametric effects of a multivariate functional predictor. We jointly model multiple functional predictors that are cross-correlated using multivariate functional principal component analysis (mFPCA), and model the nonparametric effects of the principal component scores as additive components in the PLFAM. To address the high-dimensional nature of functional data, we let the number of mFPCA components diverge to infinity with the sample size, and adopt the component selection and smoothing operator (COSSO) penalty to select relevant components and regularize the fitting. A fundamental difference between our framework and the existing high-dimensional additive models is that the mFPCA scores are estimated with error, and the magnitude of measurement error increases with the order of mFPCA. We establish the asymptotic convergence ...

Journal ArticleDOI
TL;DR: Variations of the classical PCA approach, namely Local and Constrained PCA, are presented and demonstrated on 1D and 2D flames produced by OpenSmoke++ and OpenFoam, for which accurate surrogate models have been developed.

Journal ArticleDOI
TL;DR: The proposed algorithm, based on sequential three-way decisions and a formal description of granular computing, decreases the running time by a factor of 1.5–10 compared to conventional classifiers and the known multi-class decision-theoretic rough sets.

Journal ArticleDOI
TL;DR: This study presented an enhanced PCA-based sensor fault detection method using ensemble empirical mode decomposition (EEMD) denoising and revealed that EEMD-PCA showed better detection performance than PCA for 8 critical sensors.

Journal ArticleDOI
TL;DR: In this article, the authors developed a statistical theory to estimate an unknown factor structure based on financial high-frequency data and derived an estimator for the number of factors and consistent and asymptotically mixed-normal estimators of the loadings and factors under the assumption of a large number of cross-sectional and highfrequency observations.

Journal ArticleDOI
TL;DR: This brief proposes an independent component analysis–principal component analysis (ICA-PCA) method integrated with a relevance vector machine (RVM) for multivariate process monitoring, which simultaneously extracts the non-Gaussian and Gaussian information of multivariate processes and uses the Bayesian-based RVM classifier for fault detection.
Abstract: This brief proposes an independent component analysis–principal component analysis (ICA-PCA) method integrated with a relevance vector machine (RVM) for multivariate process monitoring. Given that the distribution of industrial process variables is mostly non-Gaussian and that PCA cannot deal well with the non-Gaussian part, a hybrid ICA-PCA method is proposed to simultaneously extract the non-Gaussian and Gaussian information of multivariate processes. ICA is first used to monitor the non-Gaussian part of the process, and then the Gaussian part of the residual process is extracted using PCA. After feature extraction, a Bayesian-based classifier named RVM is established to perform fault detection, both avoiding the threshold selection required by traditional methods and compensating for the limitations of a single statistic. The performance of the proposed approach is validated using the Tennessee Eastman process. Simulation results verify the effectiveness of the proposed method.
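One plausible implementation of the two-step extraction, using scikit-learn's FastICA and PCA; the residual split below is an assumption, and the RVM classifier is omitted since none ships with scikit-learn.

```python
import numpy as np
from sklearn.decomposition import FastICA, PCA

# FastICA captures the non-Gaussian part of the process data; PCA is then
# run on the residual (approximately Gaussian) part. Data are placeholders.
rng = np.random.default_rng(9)
X = rng.normal(size=(500, 10))               # placeholder process data

ica = FastICA(n_components=4, random_state=0).fit(X)
S = ica.transform(X)                         # non-Gaussian independent sources
X_nonGauss = ica.inverse_transform(S)        # part explained by the ICs
residual = X - X_nonGauss                    # remaining Gaussian-like part

pca = PCA(n_components=3).fit(residual)
T = pca.transform(residual)                  # Gaussian principal components
print(S.shape, T.shape)                      # (500, 4) (500, 3)
```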