Open Access Journal Article
A Selective Overview of Variable Selection in High Dimensional Feature Space.
Jianqing Fan, Jinchi Lv +1 more
TL;DR: In this paper, a brief account of the recent developments of theory, methods, and implementations for high-dimensional variable selection is presented, with emphasis on independence screening and two-scale methods.
Abstract:
High dimensional statistical problems arise from diverse fields of scientific research and technological development. Variable selection plays a pivotal role in contemporary statistical learning and scientific discoveries. The traditional idea of best subset selection, which can be regarded as a specific form of penalized likelihood, is computationally too expensive for many modern statistical applications. Other forms of penalized likelihood methods have been successfully developed over the last decade to cope with high dimensionality, and have been widely applied to simultaneously select important variables and estimate their effects in high dimensional statistical inference. In this article, we present a brief account of recent developments in theory, methods, and implementations for high dimensional variable selection. Questions of what limits of dimensionality such methods can handle, what role the penalty functions play, and what statistical properties the methods attain rapidly drive advances in the field. The properties of non-concave penalized likelihood and its roles in high dimensional statistical modeling are emphasized. We also review some recent advances in ultra-high dimensional variable selection, with emphasis on independence screening and two-scale methods.
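The two-scale strategy the abstract describes (a crude but fast independence-screening pass to reduce dimensionality, followed by penalized regression on the survivors) can be sketched in a few lines. This is a minimal illustration on synthetic Gaussian data, not the paper's implementation; `sis_screen`, `lasso_cd`, and all constants (`d=20`, `lam=10.0`) are assumptions chosen for the example.

```python
import numpy as np

def sis_screen(X, y, d):
    """Independence screening: rank features by absolute marginal
    correlation with the response and keep the top d."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    corr = np.abs(Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    return np.argsort(corr)[::-1][:d]

def lasso_cd(X, y, lam, n_iter=200):
    """Lasso via cyclic coordinate descent (soft-thresholding updates)."""
    beta = np.zeros(X.shape[1])
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(X.shape[1]):
            r = y - X @ beta + X[:, j] * beta[j]   # partial residual excluding j
            z = X[:, j] @ r
            beta[j] = np.sign(z) * max(abs(z) - lam, 0.0) / col_sq[j]
    return beta

rng = np.random.default_rng(0)
n, p = 100, 1000                                  # p >> n
X = rng.standard_normal((n, p))
y = 3 * X[:, 0] - 2 * X[:, 1] + 0.5 * rng.standard_normal(n)

keep = sis_screen(X, y, d=20)                     # large scale: p -> d features
beta = lasso_cd(X[:, keep], y, lam=10.0)          # small scale: select among survivors
selected = keep[np.abs(beta) > 1e-8]              # original indices of selected features
```

With this setup the two truly active features (indices 0 and 1) survive the screening step and carry the largest fitted coefficients in the second stage.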
Citations
Journal Article
Confidence intervals for low dimensional parameters in high dimensional linear models
Cun-Hui Zhang, Stephanie S. Zhang +1 more
TL;DR: In this article, the authors proposed a method to construct confidence intervals for individual coefficients and linear combinations of several of them in a linear regression model by turning the regression data into an approximate Gaussian sequence of point estimators of individual regression coefficients.
Journal Article
On asymptotically optimal confidence regions and tests for high-dimensional models
TL;DR: A general method for constructing confidence intervals and statistical tests for single or low-dimensional components of a large parameter vector in a high-dimensional model is proposed, and the corresponding theory is developed, including a careful analysis for Gaussian, sub-Gaussian, and bounded correlated designs.
Book
Simultaneous Statistical Inference
TL;DR: A variety of classical and modern type I and type II error rates in multiple hypotheses testing are defined, some relationships between them are analyzed, and different ways to cope with structured systems of hypotheses are considered.
Journal Article
Estimation of (near) low-rank matrices with noise and high-dimensional scaling
TL;DR: Simulations show excellent agreement with the high-dimensional scaling of the error predicted by the theory, whose consequences are illustrated for a number of specific learning models, including low-rank multivariate or multi-task regression, system identification in vector autoregressive processes, and recovery of low-rank matrices from random projections.
References
Journal Article
Controlling the false discovery rate: a practical and powerful approach to multiple testing
Yoav Benjamini, Yosef Hochberg +1 more
TL;DR: In this paper, a different approach to multiple significance testing is presented, which calls for controlling the expected proportion of falsely rejected hypotheses (the false discovery rate); this criterion is equivalent to the familywise error rate when all hypotheses are true but is smaller otherwise.
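The step-up rule summarized above is short enough to sketch directly: sort the p-values, compare each against a linearly growing threshold, and reject everything up to the last p-value under the line. The p-values below are made up for illustration.

```python
import numpy as np

def benjamini_hochberg(pvals, q=0.05):
    """BH step-up: with sorted p-values p_(1) <= ... <= p_(m), find the
    largest k with p_(k) <= k*q/m and reject the k smallest hypotheses.
    Controls the false discovery rate at level q (under independence)."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    passed = p[order] <= np.arange(1, m + 1) * q / m
    reject = np.zeros(m, dtype=bool)
    if passed.any():
        k = np.nonzero(passed)[0].max()   # 0-based index of the last passing p-value
        reject[order[:k + 1]] = True
    return reject

pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.3, 0.9]
reject = benjamini_hochberg(pvals, q=0.05)   # rejects the two smallest p-values here
```

Note the step-up character: a p-value above its own threshold can still be rejected if some larger-indexed p-value passes, which is what makes BH less conservative than Bonferroni-style familywise control.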
Journal Article
Maximum likelihood from incomplete data via the EM algorithm
Journal Article
A new look at the statistical model identification
TL;DR: In this paper, a new estimate, the minimum information theoretic criterion (AIC) estimate (MAICE), is introduced for the purpose of statistical identification; it is free from the ambiguities inherent in the application of conventional hypothesis testing procedures.
Journal Article
Regression Shrinkage and Selection via the Lasso
TL;DR: A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant, is proposed.
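The constrained form described above is equivalent to adding an L1 penalty to the residual sum of squares, and in the special case of an orthonormal design the lasso solution is just soft-thresholding of the OLS coefficients, which shows why the lasso sets coefficients exactly to zero. A small sketch (the coefficient values are made up for illustration):

```python
import numpy as np

def soft_threshold(b, lam):
    """Shrink each coefficient toward zero by lam and truncate at zero."""
    return np.sign(b) * np.maximum(np.abs(b) - lam, 0.0)

# Illustrative OLS coefficients; with an orthonormal design the lasso
# solution is exactly their soft-thresholded version: small coefficients
# (|b| <= lam) are zeroed out, large ones are shrunk by lam.
beta_ols = np.array([3.0, -0.4, 1.2, 0.1])
beta_lasso = soft_threshold(beta_ols, lam=0.5)   # -> [2.5, 0.0, 0.7, 0.0]
```

This shrink-and-truncate behavior is the source of the lasso's simultaneous selection and estimation noted in the abstract above.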
Journal Article
Estimating the Dimension of a Model
TL;DR: In this paper, the problem of selecting one of a number of models of different dimensions is treated by finding its Bayes solution, and evaluating the leading terms of its asymptotic expansion.
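The Bayes-solution criterion this entry summarizes is the BIC, whose k·log(n) penalty grows with sample size, unlike AIC's fixed 2k. A hedged sketch of scoring nested polynomial models with both criteria on synthetic data; the data-generating model and all constants are assumptions for illustration.

```python
import numpy as np

def aic_bic(rss, n, k):
    """Gaussian-likelihood AIC and BIC up to an additive constant:
    AIC penalizes each of the k parameters by 2, BIC by log(n)."""
    ll = n * np.log(rss / n)
    return ll + 2 * k, ll + k * np.log(n)

rng = np.random.default_rng(1)
n = 200
x = rng.standard_normal(n)
y = 1.0 + 2.0 * x + rng.standard_normal(n)     # true model: degree-1 polynomial

scores = {}
for k in range(1, 6):                          # candidate models with 1..5 parameters
    Xk = np.vander(x, k, increasing=True)      # columns 1, x, ..., x^(k-1)
    beta, res, *_ = np.linalg.lstsq(Xk, y, rcond=None)
    rss = res[0] if res.size else float(((y - Xk @ beta) ** 2).sum())
    scores[k] = aic_bic(rss, n, k)             # (AIC, BIC) for this dimension

best_bic = min(scores, key=lambda k: scores[k][1])   # model with smallest BIC
```

Because the BIC penalty log(n) exceeds AIC's 2 once n > 7, BIC is the more conservative of the two and tends to pick the smaller model when both fit comparably.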
Related Papers (5)
Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties
Jianqing Fan, Runze Li +1 more
Least angle regression
Bradley Efron, Trevor Hastie, Iain M. Johnstone, Robert Tibshirani, Hemant Ishwaran, Keith Knight, Jean-Michel Loubes, Pascal Massart, David Madigan, Greg Ridgeway, Saharon Rosset, Ji Zhu, Robert A. Stine, Berwin A. Turlach, Sanford Weisberg +19 more