Open Access, Posted Content

Strong rules for discarding predictors in lasso-type problems

TLDR
The authors propose strong rules for discarding predictors in lasso regression and related problems to improve computational efficiency; the rules are complemented with simple checks of the Karush-Kuhn-Tucker (KKT) conditions.
Abstract
We consider rules for discarding predictors in lasso regression and related problems, for computational efficiency. El Ghaoui et al. (2010) propose "SAFE" rules that guarantee that a coefficient will be zero in the solution, based on the inner products of each predictor with the outcome. In this paper we propose strong rules that are not foolproof but rarely fail in practice. These can be complemented with simple checks of the Karush-Kuhn-Tucker (KKT) conditions to provide safe rules that offer substantial speed and space savings in a variety of statistical convex optimization problems.
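As a rough illustration of the screen-then-check idea in the abstract, the sketch below applies a strong-style rule and then repairs any mistakes with a KKT check. The abstract itself gives no formulas, so several things here are assumptions: the objective (1/2)||y - Xb||^2 + lam*||b||_1, the discard threshold |x_j^T y| < 2*lam - lam_max (the basic strong rule as commonly stated for this objective), and the plain coordinate-descent solver `lasso_cd`, which is a stand-in written for this example rather than the authors' implementation. All function names are hypothetical.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=500):
    """Plain coordinate descent for (1/2)||y - Xb||^2 + lam*||b||_1 (illustrative solver)."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    resid = y.astype(float).copy()
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Inner product of x_j with the partial residual that excludes x_j.
            rho = X[:, j] @ resid + col_sq[j] * beta[j]
            new = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            resid += X[:, j] * (beta[j] - new)
            beta[j] = new
    return beta

def basic_strong_rule(X, y, lam):
    """Mask of predictors kept by the assumed rule |x_j^T y| >= 2*lam - lam_max."""
    scores = np.abs(X.T @ y)
    lam_max = scores.max()
    return scores >= 2.0 * lam - lam_max

def kkt_violations(X, y, beta, lam, discarded, tol=1e-10):
    """Discarded predictors whose KKT condition |x_j^T (y - X beta)| <= lam fails."""
    grad = np.abs(X.T @ (y - X @ beta))
    return np.flatnonzero(discarded & (grad > lam + tol))

def fit_with_strong_rule(X, y, lam):
    """Fit on the screened predictors, re-adding any that fail the KKT check."""
    keep = basic_strong_rule(X, y, lam)
    while True:
        beta = np.zeros(X.shape[1])
        beta[keep] = lasso_cd(X[:, keep], y, lam)
        bad = kkt_violations(X, y, beta, lam, ~keep)
        if bad.size == 0:
            return beta, keep
        keep[bad] = True  # the rule was wrong for these; restore and refit

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 20))
y = X[:, :3] @ np.array([3.0, -2.0, 1.5]) + 0.1 * rng.standard_normal(50)
lam = 0.5 * np.abs(X.T @ y).max()
beta, keep = fit_with_strong_rule(X, y, lam)
```

The loop is what turns a rule that "rarely fails" into a safe procedure: the screen is only a guess about which coefficients are zero, and the KKT check over the discarded set is what certifies the final solution.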


Citations
Dissertation

Accelerating sparse inverse problems using structured approximations

TL;DR: In this paper, a particular family of dictionaries, written as a sum of Kronecker products, is proposed, and stable screening tests are developed to safely identify and discard useless atoms (columns of the dictionary matrix which do not correspond to the solution support).
Dissertation

Genetic risk score based on statistical learning

TL;DR: In this paper, the authors used extreme gradient boosting for imputing genotyped variants, feature engineering to capture recessive and dominant effects in penalized regression, and parameter tuning and stacked regressions to improve polygenic prediction.

Machine Learning on Graphs

TL;DR: The contribution of this thesis is the derivation of the deterministic variational inference update equations for doing inference on the SHDPHMM, an improvement over the Markov Chain Monte Carlo algorithm proposed by Fox as it allows for direct assessment of convergence and can run faster.
Dissertation

Computational Curation of Open Science Data

TL;DR: Computational Curation of Open Science Data is presented as a probabilistic procedure to estimate the probability that a particular type of data will be chosen for a particular science research project.