Strong rules for discarding predictors in lasso-type problems
Robert Tibshirani, Jacob Bien, Jerome H. Friedman, Trevor Hastie, Noah Simon, Jonathan Taylor, Ryan J. Tibshirani
TL;DR: The authors propose strong rules for discarding predictors in lasso regression and related problems, for computational efficiency, complemented with simple checks of the Karush-Kuhn-Tucker (KKT) conditions.
Abstract: We consider rules for discarding predictors in lasso regression and related problems, for computational efficiency. El Ghaoui et al. (2010) propose "SAFE" rules that guarantee that a coefficient will be zero in the solution, based on the inner products of each predictor with the outcome. In this paper we propose strong rules that are not foolproof but rarely fail in practice. These can be complemented with simple checks of the Karush-Kuhn-Tucker (KKT) conditions to provide safe rules that offer substantial speed and space savings in a variety of statistical convex optimization problems.
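The abstract's idea can be illustrated with a minimal numpy sketch. The basic (global) strong rule discards predictor j when |x_j^T y| < 2λ − λ_max, where λ_max = max_j |x_j^T y| is the smallest penalty at which all coefficients are zero; because the rule can occasionally discard an active predictor, it is paired with a KKT check on the candidate solution. Function names here are illustrative, not from the paper's software.

```python
import numpy as np

def strong_rule_discard(X, y, lam, lam_max=None):
    """Basic strong rule: flag predictors expected to have a zero
    coefficient at penalty lam, based on inner products with the
    outcome. Not foolproof; pair with a KKT check afterwards."""
    scores = np.abs(X.T @ y)
    if lam_max is None:
        # smallest penalty at which every lasso coefficient is zero
        lam_max = scores.max()
    return scores < 2 * lam - lam_max  # True = discard

def kkt_violations(X, y, beta, lam):
    """KKT check for the lasso at a candidate solution beta: any
    predictor with |x_j^T (y - X beta)| > lam violates stationarity
    and must be added back before re-solving."""
    resid = y - X @ beta
    return np.abs(X.T @ resid) > lam + 1e-8
```

In the sequential variant used for a path of penalties, the inner products are taken with the residual from the previous penalty value rather than with y itself, which is what makes the rule rarely fail in practice.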
Citations
Discussion
Sure independence screening for ultrahigh dimensional feature space
P. Bickel, Peter Bühlmann, Qiwei Yao, Richard J. Samworth, Peter Hall, D. M. Titterington, J. H. Xue, Christoforos Anagnostopoulos, D. K. Tasoullis, Wenyang Zhang, Y. C. Xia, I. M. Johnstone, S. Richardson, L. Bottolo, J. T. Kent, K. Adragni, R. D. Cook, U. Gather, C. Guddat, Eitan Greenshtein, Gareth M. James, Peter Radchenko, Chenlei Leng, H. S. Wang, E. Levina, J. Zhu, R. Z. Li, Y. F. Liu, N. T. Longford, W. Q. Luo, P. D. Baxter, C. C. Taylor, James Stephen Marron, J. S. Morris, Christian P. Robert, K. M. Yu, Cun-Hui Zhang, Hao Helen Zhang, H. H. Zhou, X. H. Lin, H. Zou +40 more
Dissertation
Accelerating sparse inverse problems using structured approximations
TL;DR: In this paper, a particular family of dictionaries, written as a sum of Kronecker products, is proposed, and stable screening tests are developed to safely identify and discard useless atoms (columns of the dictionary matrix which do not correspond to the solution support).
Dissertation
Genetic risk score based on statistical learning
TL;DR: In this paper, the authors used extreme gradient boosting for imputing genotyped variants, feature engineering to capture recessive and dominant effects in penalized regression, and parameter tuning and stacked regressions to improve polygenic prediction.
Machine Learning on Graphs
TL;DR: The contribution of this thesis is the derivation of the deterministic variational inference update equations for inference on the SHDPHMM, an improvement over the Markov chain Monte Carlo algorithm proposed by Fox, as it allows for direct assessment of convergence and can run faster.
Dissertation
Computational Curation of Open Science Data
TL;DR: Computational Curation of Open Science Data is presented as a probabilistic procedure to estimate the probability that a particular type of data will be chosen for a particular science research project.