Open Access Journal ArticleDOI

Sparse partial least squares regression for simultaneous dimension reduction and variable selection

TLDR
This work provides an efficient implementation of sparse partial least squares regression and compares it with well‐known variable selection and dimension reduction approaches via simulation experiments and illustrates the practical utility in a joint analysis of gene expression and genomewide binding data.
Abstract
Partial least squares regression has been an alternative to ordinary least squares for handling multicollinearity in several areas of scientific research since the 1960s. It has recently gained much attention in the analysis of high dimensional genomic data. We show that known asymptotic consistency of the partial least squares estimator for a univariate response does not hold with the very large p and small n paradigm. We derive a similar result for a multivariate response regression with partial least squares. We then propose a sparse partial least squares formulation which aims simultaneously to achieve good predictive performance and variable selection by producing sparse linear combinations of the original predictors. We provide an efficient implementation of sparse partial least squares regression and compare it with well-known variable selection and dimension reduction approaches via simulation experiments. We illustrate the practical utility of sparse partial least squares regression in a joint analysis of gene expression and genomewide binding data.
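
For intuition, a minimal Python sketch of the sparse PLS idea for a univariate response is given below. It is an illustration under simplifying assumptions, not the paper's exact formulation: each NIPALS-style direction vector is soft-thresholded before deflation, so the latent components become sparse linear combinations of the original predictors. The function names and the eta tuning parameter are ours.

```python
import numpy as np

def soft_threshold(z, lam):
    """Elementwise soft-thresholding operator."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def sparse_pls(X, y, n_components=2, eta=0.5):
    """Illustrative sparse PLS for a univariate response (NIPALS-style).

    eta in [0, 1) controls sparsity: each direction vector w = X'y is
    soft-thresholded at eta * max|w| before normalization, so the latent
    components use only a subset of the original predictors.
    """
    X = X - X.mean(axis=0)
    y = y - y.mean()
    Xd, yd = X.copy(), y.copy()
    W, T = [], []
    for _ in range(n_components):
        w = Xd.T @ yd                                  # covariance with response
        w = soft_threshold(w, eta * np.abs(w).max())   # sparsify the direction
        norm = np.linalg.norm(w)
        if norm == 0:
            break
        w = w / norm
        t = Xd @ w                                     # latent score
        p = Xd.T @ t / (t @ t)                         # predictor loading
        Xd = Xd - np.outer(t, p)                       # deflate predictors
        yd = yd - t * (t @ yd) / (t @ t)               # deflate response
        W.append(w)
        T.append(t)
    return np.column_stack(W), np.column_stack(T)
```

Regressing the response on the score matrix T yields the fitted model; in the paper, both the sparsity level and the number of components are chosen by cross-validation.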


Citations
Book

Applied Predictive Modeling

Max Kuhn, Kjell Johnson
TL;DR: This research presents a novel and scalable approach called “Smartfitting” that automates the labor-intensive, and therefore time-consuming and expensive, process of designing and implementing statistical regression models.
Journal ArticleDOI

Do we need hundreds of classifiers to solve real world classification problems?

TL;DR: The random forest is clearly the best family of classifiers (3 of the 5 best classifiers are RF), followed by SVM (4 classifiers in the top-10), neural networks and boosting ensembles (5 and 3 members in the top-20, respectively).
Journal ArticleDOI

A review of variable selection methods in Partial Least Squares Regression

TL;DR: A review of the available methods for variable selection within one of the many modeling approaches for high-throughput data, Partial Least Squares Regression, aimed at conveying the characteristics of each method and providing a basis for choosing an appropriate one for a given application.
Journal ArticleDOI

Sparse PLS discriminant analysis: biologically relevant feature selection and graphical displays for multiclass problems

TL;DR: A simple extension of a sparse PLS exploratory approach is proposed to perform variable selection in a multiclass classification framework; it achieves classification performance similar to other wrapper or sparse discriminant analysis approaches on public microarray and SNP data sets.
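
As a hedged illustration of the two-class special case only, the sketch below reuses the sparse_pls helper from the sketch above: labels are coded as ±1, sparse components are extracted, and the sign of the fitted score classifies. The coding, the eta value, and the sign rule are simplifications; the cited method handles multiclass problems and richer decision rules.

```python
import numpy as np

# Toy two-class problem: 5 informative variables out of 50.
rng = np.random.default_rng(2)
y = np.repeat([1.0, -1.0], 50)
X = rng.standard_normal((100, 50))
X[:, :5] += 0.8 * y[:, None]                       # class signal in first 5 variables

W, T = sparse_pls(X, y, n_components=2, eta=0.7)   # helper sketched earlier
coef = np.linalg.lstsq(T, y, rcond=None)[0]        # regress labels on scores
print("selected variables:", np.nonzero(W.any(axis=1))[0])
print("training accuracy:", np.mean(np.sign(T @ coef) == y))
```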
Journal ArticleDOI

Sparse Discriminant Analysis

TL;DR: This work proposes sparse discriminant analysis, a method for performing linear discriminant analysis with a sparseness criterion imposed such that classification and feature selection are performed simultaneously in the high-dimensional setting.
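
To give a rough sense of the optimal-scoring idea behind sparse discriminant analysis, here is a hedged two-class Python sketch: scored class labels are regressed on the predictors under an elastic-net penalty, so the discriminant direction and the selected features emerge together. The scoring formula is the standard two-class case; alpha, l1_ratio, and the function name are illustrative, and the full method iterates between scores and directions for more than two classes.

```python
import numpy as np
from sklearn.linear_model import ElasticNet

def sparse_discriminant_binary(X, labels, alpha=0.05, l1_ratio=0.9):
    """Two-class sparse discriminant sketch via optimal scoring:
    regress scored class labels on X with an elastic-net penalty,
    estimating the direction and selecting features at once."""
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    n = labels.size
    # Standard two-class optimal scores (zero mean over the sample)
    scores = np.where(labels == classes[0],
                      np.sqrt(counts[1] / (n * counts[0])),
                      -np.sqrt(counts[0] / (n * counts[1])))
    fit = ElasticNet(alpha=alpha, l1_ratio=l1_ratio).fit(X, scores)
    return fit.coef_                     # sparse discriminant direction
```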
References
Journal ArticleDOI

Controlling the false discovery rate: a practical and powerful approach to multiple testing

TL;DR: In this paper, a different approach to problems of multiple significance testing is presented: controlling the expected proportion of falsely rejected hypotheses, the false discovery rate, which is equivalent to the FWER when all hypotheses are true but is smaller otherwise.
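
The procedure behind this abstract is short enough to state in code; below is a hedged Python sketch of the Benjamini-Hochberg step-up rule (the function name and interface are ours, and the guarantee assumes independent test statistics).

```python
import numpy as np

def benjamini_hochberg(pvals, q=0.05):
    """Step-up rule: reject the k smallest p-values, where k is the
    largest index with p_(k) <= (k/m) * q; this controls the false
    discovery rate at level q for independent test statistics."""
    pvals = np.asarray(pvals, dtype=float)
    m = pvals.size
    order = np.argsort(pvals)
    below = np.nonzero(pvals[order] <= (np.arange(1, m + 1) / m) * q)[0]
    reject = np.zeros(m, dtype=bool)
    if below.size:
        reject[order[:below[-1] + 1]] = True
    return reject

# The three smallest p-values are rejected at q = 0.05.
print(benjamini_hochberg([0.001, 0.008, 0.027, 0.041, 0.20]))
```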
Journal ArticleDOI

Regression Shrinkage and Selection via the Lasso

TL;DR: A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute values of the coefficients being less than a constant, is proposed.
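
A minimal scikit-learn illustration (the toy data and alpha value are arbitrary, and sklearn's coordinate-descent Lasso stands in for the original algorithm): the L1 constraint sets many coefficients exactly to zero, so shrinkage and variable selection happen in one step.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Toy data: only the first 3 of 20 predictors carry signal.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 20))
beta = np.zeros(20)
beta[:3] = [3.0, -2.0, 1.5]
y = X @ beta + rng.standard_normal(100)

model = Lasso(alpha=0.1).fit(X, y)
print(np.nonzero(model.coef_)[0])   # indices of the predictors the lasso keeps
```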
Book

Matrix Computations

Gene H. Golub
Journal ArticleDOI

Regularization and variable selection via the elastic net

TL;DR: It is shown that the elastic net often outperforms the lasso, while enjoying a similar sparsity of representation, and an algorithm called LARS‐EN is proposed for computing elastic net regularization paths efficiently, much like algorithm LARS does for the lasso.
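
To make the contrast with the lasso concrete, a hedged sketch using scikit-learn's coordinate-descent ElasticNet rather than LARS-EN (alpha and l1_ratio are arbitrary): on two nearly collinear predictors the lasso tends to keep one of the pair, while the elastic net spreads weight across both yet remains sparse.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso

# Two strongly correlated predictors plus one noise predictor.
rng = np.random.default_rng(1)
z = rng.standard_normal(200)
X = np.column_stack([z + 0.01 * rng.standard_normal(200),
                     z + 0.01 * rng.standard_normal(200),
                     rng.standard_normal(200)])
y = X[:, 0] + X[:, 1] + rng.standard_normal(200)

print("lasso:      ", Lasso(alpha=0.1).fit(X, y).coef_)
print("elastic net:", ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y).coef_)
```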