scispace - formally typeset
Search or ask a question
Author

Jinchi Lv

Other affiliations: Princeton University
Bio: Jinchi Lv is an academic researcher from University of Southern California. The author has contributed to research in topics: Estimator & Curse of dimensionality. The author has an hindex of 29, co-authored 86 publications receiving 7469 citations. Previous affiliations of Jinchi Lv include Princeton University.


Papers
More filters
Journal ArticleDOI
TL;DR: In this article, the authors introduce the concept of sure screening and propose a sure screening method that is based on correlation learning, called sure independence screening, to reduce dimensionality from high to a moderate scale that is below the sample size.
Abstract: Summary. Variable selection plays an important role in high dimensional statistical modelling which nowadays appears in many areas and is key to various scientific discoveries. For problems of large scale or dimensionality p, accuracy of estimation and computational cost are two top concerns. Recently, Candes and Tao have proposed the Dantzig selector using L1-regularization and showed that it achieves the ideal risk up to a logarithmic factor log (p). Their innovative procedure and remarkable result are challenged when the dimensionality is ultrahigh as the factor log (p) can be large and their uniform uncertainty principle can fail. Motivated by these concerns, we introduce the concept of sure screening and propose a sure screening method that is based on correlation learning, called sure independence screening, to reduce dimensionality from high to a moderate scale that is below the sample size. In a fairly general asymptotic framework, correlation learning is shown to have the sure screening property for even exponentially growing dimensionality. As a methodological extension, iterative sure independence screening is also proposed to enhance its finite sample performance. With dimension reduced accurately from high to below sample size, variable selection can be improved on both speed and accuracy, and can then be accomplished by a well-developed method such as smoothly clipped absolute deviation, the Dantzig selector, lasso or adaptive lasso. The connections between these penalized least squares methods are also elucidated.

2,204 citations

Posted Content
TL;DR: The concept of sure screening is introduced and a sure screening method that is based on correlation learning, called sure independence screening, is proposed to reduce dimensionality from high to a moderate scale that is below the sample size.
Abstract: Variable selection plays an important role in high dimensional statistical modeling which nowadays appears in many areas and is key to various scientific discoveries. For problems of large scale or dimensionality $p$, estimation accuracy and computational cost are two top concerns. In a recent paper, Candes and Tao (2007) propose the Dantzig selector using $L_1$ regularization and show that it achieves the ideal risk up to a logarithmic factor $\log p$. Their innovative procedure and remarkable result are challenged when the dimensionality is ultra high as the factor $\log p$ can be large and their uniform uncertainty principle can fail. Motivated by these concerns, we introduce the concept of sure screening and propose a sure screening method based on a correlation learning, called the Sure Independence Screening (SIS), to reduce dimensionality from high to a moderate scale that is below sample size. In a fairly general asymptotic framework, the correlation learning is shown to have the sure screening property for even exponentially growing dimensionality. As a methodological extension, an iterative SIS (ISIS) is also proposed to enhance its finite sample performance. With dimension reduced accurately from high to below sample size, variable selection can be improved on both speed and accuracy, and can then be accomplished by a well-developed method such as the SCAD, Dantzig selector, Lasso, or adaptive Lasso. The connections of these penalized least-squares methods are also elucidated.

1,917 citations

Journal Article
TL;DR: In this paper, a brief account of the recent developments of theory, methods, and implementations for high-dimensional variable selection is presented, with emphasis on independence screening and two-scale methods.
Abstract: High dimensional statistical problems arise from diverse fields of scientific research and technological development. Variable selection plays a pivotal role in contemporary statistical learning and scientific discoveries. The traditional idea of best subset selection methods, which can be regarded as a specific form of penalized likelihood, is computationally too expensive for many modern statistical applications. Other forms of penalized likelihood methods have been successfully developed over the last decade to cope with high dimensionality. They have been widely applied for simultaneously selecting important variables and estimating their effects in high dimensional statistical inference. In this article, we present a brief account of the recent developments of theory, methods, and implementations for high dimensional variable selection. What limits of the dimensionality such methods can handle, what the role of penalty functions is, and what the statistical properties are rapidly drive the advances of the field. The properties of non-concave penalized likelihood and its roles in high dimensional statistical modeling are emphasized. We also review some recent advances in ultra-high dimensional variable selection, with emphasis on independence screening and two-scale methods.

892 citations

Journal ArticleDOI
TL;DR: This work examines covariance matrix estimation in the asymptotic framework that the dimensionality p tends to [infinity] as the sample size n increases and identifies situations under which the factor approach increases performance substantially or marginally.

678 citations

Journal ArticleDOI
TL;DR: It is shown that in the context of generalized linear models, such methods possess model selection consistency with oracle properties even for dimensionality of nonpolynomial order of sample size, for a class of penalized likelihood approaches using folded-concave penalty functions, which were introduced to ameliorate the bias problems of convex penalty functions.
Abstract: Penalized likelihood methods are fundamental to ultrahigh dimensional variable selection. How high dimensionality such methods can handle remains largely unknown. In this paper, we show that in the context of generalized linear models, such methods possess model selection consistency with oracle properties even for dimensionality of nonpolynomial (NP) order of sample size, for a class of penalized likelihood approaches using folded-concave penalty functions, which were introduced to ameliorate the bias problems of convex penalty functions. This fills a long-standing gap in the literature where the dimensionality is allowed to grow slowly with the sample size. Our results are also applicable to penalized likelihood with the L1-penalty, which is a convex function at the boundary of the class of folded-concave penalty functions under consideration. The coordinate optimization is implemented for finding the solution paths, whose performance is evaluated by a few simulation examples and the real data analysis.

390 citations


Cited by
More filters
Journal ArticleDOI
TL;DR: In many important statistical applications, the number of variables or parameters p is much larger than the total number of observations n as discussed by the authors, and it is possible to estimate β reliably based on the noisy data y.
Abstract: In many important statistical applications, the number of variables or parameters p is much larger than the number of observations n. Suppose then that we have observations y=Xβ+z, where β∈Rp is a parameter vector of interest, X is a data matrix with possibly far fewer rows than columns, n≪p, and the zi’s are i.i.d. N(0, σ^2). Is it possible to estimate β reliably based on the noisy data y?

3,539 citations

Posted Content
TL;DR: In this paper, the authors investigated conditions sufficient for identification of average treatment effects using instrumental variables and showed that the existence of valid instruments is not sufficient to identify any meaningful average treatment effect.
Abstract: We investigate conditions sufficient for identification of average treatment effects using instrumental variables. First we show that the existence of valid instruments is not sufficient to identify any meaningful average treatment effect. We then establish that the combination of an instrument and a condition on the relation between the instrument and the participation status is sufficient for identification of a local average treatment effect for those who can be induced to change their participation status by changing the value of the instrument. Finally we derive the probability limit of the standard IV estimator under these conditions. It is seen to be a weighted average of local average treatment effects.

3,154 citations

Journal ArticleDOI
TL;DR: The need to develop appropriate and efficient analytical methods to leverage massive volumes of heterogeneous data in unstructured text, audio, and video formats is highlighted and the need to devise new tools for predictive analytics for structured big data is reinforced.

2,962 citations

Journal ArticleDOI
TL;DR: It is proved that at a universal penalty level, the MC+ has high probability of matching the signs of the unknowns, and thus correct selection, without assuming the strong irrepresentable condition required by the LASSO.
Abstract: We propose MC+, a fast, continuous, nearly unbiased and accurate method of penalized variable selection in high-dimensional linear regression. The LASSO is fast and continuous, but biased. The bias of the LASSO may prevent consistent variable selection. Subset selection is unbiased but computationally costly. The MC+ has two elements: a minimax concave penalty (MCP) and a penalized linear unbiased selection (PLUS) algorithm. The MCP provides the convexity of the penalized loss in sparse regions to the greatest extent given certain thresholds for variable selection and unbiasedness. The PLUS computes multiple exact local minimizers of a possibly nonconvex penalized loss function in a certain main branch of the graph of critical points of the penalized loss. Its output is a continuous piecewise linear path encompassing from the origin for infinite penalty to a least squares solution for zero penalty. We prove that at a universal penalty level, the MC+ has high probability of matching the signs of the unknowns, and thus correct selection, without assuming the strong irrepresentable condition required by the LASSO. This selection consistency applies to the case of $p\gg n$, and is proved to hold for exactly the MC+ solution among possibly many local minimizers. We prove that the MC+ attains certain minimax convergence rates in probability for the estimation of regression coefficients in $\ell_r$ balls. We use the SURE method to derive degrees of freedom and $C_p$-type risk estimates for general penalized LSE, including the LASSO and MC+ estimators, and prove their unbiasedness. Based on the estimated degrees of freedom, we propose an estimator of the noise level for proper choice of the penalty level.

2,727 citations