Journal Article

A robust PCR method for high‐dimensional regressors

TLDR
In this article, the authors propose a robust principal component regression (RPCR) method for the multivariate calibration model, which combines a robust principal component analysis (PCA) of the regressors with a robust regression of the response variables on the resulting scores.
Abstract
We consider the multivariate calibration model which assumes that the concentrations of several constituents of a sample are linearly related to its spectrum. Principal component regression (PCR) is widely used for the estimation of the regression parameters in this model. In the classical approach it combines principal component analysis (PCA) on the regressors with least squares regression. However, both stages yield very unreliable results when the data set contains outlying observations. We present a robust PCR (RPCR) method which also consists of two parts. First we apply a robust PCA method for high-dimensional data on the regressors, then we regress the response variables on the scores using a robust regression method. A robust RMSECV value and a robust R² value are proposed as exploratory tools to select the number of principal components. The prediction error is also estimated in a robust way. Moreover, we introduce several diagnostic plots which are helpful to visualize and classify the outliers. The robustness of RPCR is demonstrated through simulations and the analysis of a real data set.
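
The two-stage structure described in the abstract can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' estimator: the abstract does not name the robust PCA or robust regression method used here, so `trimmed_pcr_fit` swaps in two simple stand-ins (PCA on the observations closest to the coordinatewise median, then trimmed least squares via concentration steps). The function names and the `keep` and `n_iter` parameters are hypothetical.

```python
import numpy as np


def pcr_fit(X, y, k):
    """Classical PCR: PCA on the centered regressors, then least squares on k scores."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    # Rows of Vt are the principal directions of the centered regressors.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:k].T                                   # (p, k) loadings
    T = Xc @ P                                     # (n, k) scores
    gamma, *_ = np.linalg.lstsq(T, y - y_mean, rcond=None)
    beta = P @ gamma                               # slopes in the original X space
    return beta, y_mean - x_mean @ beta


def trimmed_pcr_fit(X, y, k, keep=0.75, n_iter=20):
    """Simplified robust stand-in (assumption): trimmed PCA, then trimmed least squares."""
    n = X.shape[0]
    h = int(keep * n)                              # number of observations trusted per stage
    # Stage 1: PCA on the h observations closest to the coordinatewise median.
    med = np.median(X, axis=0)
    core = np.sort(np.argsort(np.linalg.norm(X - med, axis=1))[:h])
    center = X[core].mean(axis=0)
    _, _, Vt = np.linalg.svd(X[core] - center, full_matrices=False)
    P = Vt[:k].T
    T = (X - center) @ P                           # scores for all observations
    # Stage 2: regress y on the scores, refitting on the h smallest absolute residuals.
    A = np.column_stack([np.ones(n), T])
    subset = core
    for _ in range(n_iter):
        coef, *_ = np.linalg.lstsq(A[subset], y[subset], rcond=None)
        new_subset = np.sort(np.argsort(np.abs(y - A @ coef))[:h])
        if np.array_equal(new_subset, subset):
            break                                  # concentration steps have converged
        subset = new_subset
    beta = P @ coef[1:]
    return beta, coef[0] - center @ beta


# Toy data: 10 regressors driven by 2 latent factors, 10% of responses shifted.
rng = np.random.default_rng(0)
Z = rng.normal(size=(200, 2))
X = Z @ rng.normal(size=(2, 10))
y = Z @ np.array([1.5, -2.0]) + 1.0
y[:20] += 100.0                                    # outlying responses
beta_classical, intercept_classical = pcr_fit(X, y, k=2)
beta_trimmed, intercept_trimmed = trimmed_pcr_fit(X, y, k=2)
```

In this toy setting the classical fit is pulled toward the shifted responses, while the trimmed variant discards them after the first refitting step. A high-breakdown robust PCA and regression, as the abstract describes, would replace both stand-in stages.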


Citations
Journal Article

ROBPCA: A New Approach to Robust Principal Component Analysis

TL;DR: The ROBPCA approach, which combines projection pursuit ideas with robust scatter matrix estimation, yields more accurate estimates at noncontaminated datasets and more robust estimates at contaminated data.
Journal Article

LIBRA: a MATLAB library for robust analysis

TL;DR: A MATLAB library of robust statistical methods used by chemometricians, statisticians, chemists, and engineers is introduced and many graphical tools to detect and classify the outliers are provided.
Journal Article

High-Breakdown Robust Multivariate Methods

TL;DR: In this paper, the authors focus on high-breakdown methods, which can deal with a substantial fraction of outliers in the data, and give an overview of recent high-breakdown robust methods for multivariate settings such as covariance estimation, multiple and multivariate regression, discriminant analysis, principal components and multivariate calibration.
Journal Article

Increased sensitivity in neuroimaging analyses using robust regression

TL;DR: This work shows that robust iteratively reweighted least squares (IRLS) at the 2nd level is a computationally efficient technique that both increases statistical power and decreases false positive rates in the presence of outliers and provides software to implement IRLS in group neuroimaging analyses.
Journal Article

Robust methods for partial least squares regression

TL;DR: In this article, the authors introduce robustified versions of the SIMPLS algorithm, which are constructed from a robust covariance matrix for high-dimensional data and robust linear regression, and introduce robust RMSECV and RMSEP values for model calibration and model validation.
References
Book

Applied Multivariate Statistical Analysis

TL;DR: In this article, the authors present an overview of the basic concepts of multivariate analysis, including matrix algebra and random vectors, as well as a strategy for analyzing multivariate models.
Journal Article

Applied Multivariate Statistical Analysis.

TL;DR: In this article, the authors present an overview of the basic concepts of multivariate analysis, including matrix algebra and random vectors, as well as a strategy for analyzing multivariate models.
Book

Robust Regression and Outlier Detection

TL;DR: This book treats robust regression and the detection of outliers, including the special case of one-dimensional location.
Journal Article

Least Median of Squares Regression

TL;DR: In this paper, the sum of the squared residuals in classical least squares regression is replaced by their median, giving an estimator that can resist the effect of nearly 50% of contamination in the data; in the special case of simple regression, it corresponds to finding the narrowest strip covering half of the observations.
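
The least median of squares criterion in the last entry can be illustrated for simple regression. The sketch below is an approximation under stated assumptions, not the algorithm from that paper: it searches only lines passing through pairs of data points and uses `np.median` as the median of the squared residuals. The function name `lms_line` is hypothetical.

```python
import itertools
import numpy as np


def lms_line(x, y):
    """Approximate LMS line: smallest median squared residual over lines through point pairs."""
    best_crit, best_slope, best_intercept = np.inf, 0.0, 0.0
    for i, j in itertools.combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue                               # vertical line, not a regression fit
        slope = (y[j] - y[i]) / (x[j] - x[i])
        intercept = y[i] - slope * x[i]
        crit = np.median((y - slope * x - intercept) ** 2)
        if crit < best_crit:
            best_crit, best_slope, best_intercept = crit, slope, intercept
    return best_slope, best_intercept


# Toy data: points on y = 2x + 1, with the last six responses replaced by 100.
x = np.arange(20, dtype=float)
y = 2.0 * x + 1.0
y[14:] = 100.0
slope, intercept = lms_line(x, y)
ols_slope, ols_intercept = np.polyfit(x, y, 1)     # ordinary least squares, for comparison
```

Because the criterion depends only on the better-fitting half of the observations, the six replaced responses do not move the fitted line, whereas the least squares slope is pulled upward.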