Efficient L1 regularized logistic regression

Proceedings Article

Su-In Lee¹, Honglak Lee¹, Pieter Abbeel¹, Andrew Y. Ng¹

16 Jul 2006-pp 401-408

TL;DR: Theoretical results show that the proposed efficient algorithm for L1 regularized logistic regression is guaranteed to converge to the global optimum, and experiments show that it significantly outperforms standard algorithms for solving convex optimization problems.

Abstract: L1 regularized logistic regression is now a workhorse of machine learning: it is widely used for many classification problems, particularly ones with many features. L1 regularized logistic regression requires solving a convex optimization problem. However, standard algorithms for solving convex optimization problems do not scale well enough to handle the large datasets encountered in many practical settings. In this paper, we propose an efficient algorithm for L1 regularized logistic regression. Our algorithm iteratively approximates the objective function by a quadratic approximation at the current point, while maintaining the L1 constraint. In each iteration, it uses the efficient LARS (Least Angle Regression) algorithm to solve the resulting L1 constrained quadratic optimization problem. Our theoretical results show that our algorithm is guaranteed to converge to the global optimum. Our experiments show that our algorithm significantly outperforms standard algorithms for solving convex optimization problems. Moreover, our algorithm outperforms four previously published algorithms that were specifically designed to solve the L1 regularized logistic regression problem.
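
The iteration described in the abstract (a quadratic approximation of the objective at the current point, an L1-aware solve of that quadratic subproblem, then a move toward its solution) can be illustrated with a small sketch. This is not the authors' implementation. It makes several assumptions that are not in the abstract: it uses the penalized form (a multiplier `lam` on the L1 norm) rather than an explicit L1 constraint, it substitutes plain coordinate descent for LARS as the subproblem solver, it uses a simple backtracking line search, and all function names and the toy data are hypothetical.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(a, lam):
    # Closed-form minimizer of 0.5 * (v - a)**2 + lam * |v|.
    if a > lam:
        return a - lam
    if a < -lam:
        return a + lam
    return 0.0

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def objective(X, y, w, lam):
    # Logistic negative log-likelihood (labels in {0, 1}) plus lam * ||w||_1.
    nll = 0.0
    for xi, yi in zip(X, y):
        z = dot(xi, w)
        nll += max(z, 0.0) + math.log1p(math.exp(-abs(z))) - yi * z
    return nll + lam * sum(abs(v) for v in w)

def l1_weighted_least_squares(X, s, t, lam, v0, sweeps=100):
    # ASSUMPTION: coordinate descent stands in for LARS in this sketch.
    # Minimizes 0.5 * sum_i s_i * (t_i - x_i . v)**2 + lam * ||v||_1.
    v = list(v0)
    for _ in range(sweeps):
        for j in range(len(v)):
            rho, a = 0.0, 0.0
            for xi, si, ti in zip(X, s, t):
                partial = dot(xi, v) - xi[j] * v[j]  # prediction without feature j
                rho += si * xi[j] * (ti - partial)
                a += si * xi[j] ** 2
            v[j] = soft_threshold(rho, lam) / a if a > 0 else 0.0
    return v

def fit_l1_logistic(X, y, lam, outer_iters=25):
    w = [0.0] * len(X[0])
    for _ in range(outer_iters):
        # Quadratic (IRLS-style) approximation of the log-likelihood at w.
        z = [dot(xi, w) for xi in X]
        p = [sigmoid(zi) for zi in z]
        s = [max(pi * (1.0 - pi), 1e-6) for pi in p]  # weights, clamped
        t = [zi + (yi - pi) / si for zi, yi, pi, si in zip(z, y, p, s)]
        v = l1_weighted_least_squares(X, s, t, lam, w)
        # Backtracking line search on the true objective between w and v.
        f0 = objective(X, y, w, lam)
        step, accepted = 1.0, None
        while step > 1e-8:
            cand = [wi + step * (vi - wi) for wi, vi in zip(w, v)]
            if objective(X, y, cand, lam) <= f0:
                accepted = cand
                break
            step *= 0.5
        if accepted is None:
            break  # no improving step found
        moved = max(abs(c - wi) for c, wi in zip(accepted, w))
        w = accepted
        if moved < 1e-9:
            break
    return w

# Hypothetical toy data: feature 0 is informative, feature 1 is mostly noise.
X = [[2.0, 0.1], [1.5, -0.2], [1.0, 0.3], [-0.5, -0.2],
     [0.5, 0.1], [-1.0, -0.3], [-1.5, 0.1], [-2.0, -0.1]]
y = [1, 1, 1, 1, 0, 0, 0, 0]
w = fit_l1_logistic(X, y, lam=0.1)
print(w)
```

The line search only accepts steps that do not increase the regularized objective, so the iterates are monotone in this sketch. With a large enough `lam`, every coefficient stays at exactly zero, which is the sparsity behaviour that motivates L1 regularization.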

Citations

Journal Article

Discovery and predictive modeling of urine microbiome, metabolite and cytokine biomarkers in hospitalized patients with community acquired pneumonia

Joseph F. Pierre¹, Oguz Akbilgic², Heather S. Smallwood¹, Xueyuan Cao¹, Elizabeth A. Fitzpatrick¹, Senen Pena³, Stephen Furmanek³, Julio A. Ramirez³, Colleen B. Jonsson¹

University of Tennessee Health Science Center¹, Loyola University Chicago², University of Louisville³

07 Aug 2020-Scientific Reports

TL;DR: Urine from patients hospitalized with pneumonia may serve as a reliable and accessible sample to evaluate biomarkers that may diagnose etiology and predict clinical outcomes, and predictive modeling of 291 microbial and metabolite values achieved a + 90% accuracy in predicting specific pneumonia etiology.

Abstract: Pneumonia is the leading cause of infectious related death costing 12 billion dollars annually in the United States alone. Despite improvements in clinical care, total mortality remains around 4%, with inpatient mortality reaching 5–10%. For unknown reasons, mortality risk remains high even after hospital discharge and there is a need to identify those patients most at risk. Also of importance, clinical symptoms alone do not distinguish viral from bacterial infection which may delay appropriate treatment and may contribute to short-term and long-term mortality. Biomarkers have the potential to provide point of care diagnosis, identify high-risk patients, and increase our understanding of the biology of disease. However, there have been mixed results on the diagnostic performance of many of the analytes tested to date. Urine represents a largely untapped source for biomarker discovery and is highly accessible. To test this hypothesis, we collected urine from hospitalized patients with community-acquired pneumonia (CAP) and performed a comprehensive screen for urinary tract microbiota signatures, metabolite, and cytokine profiles. CAP patients were diagnosed with influenza or bacterial (Streptococcus pneumoniae and Staphylococcus aureus) etiologies and compared with healthy volunteers. Microbiome signatures showed marked shifts in taxonomic levels in patients with bacterial etiology versus influenza and CAP versus normal. Predictive modeling of 291 microbial and metabolite values achieved a + 90% accuracy with LASSO in predicting specific pneumonia etiology. This study demonstrates that urine from patients hospitalized with pneumonia may serve as a reliable and accessible sample to evaluate biomarkers that may diagnose etiology and predict clinical outcomes.

9 citations

Proceedings Article

Model-driven parametric monitoring of high-dimensional nonlinear functional profiles

Gang Liu¹, Chen Kan¹, Yun Chen¹, Hui Yang¹

University of South Florida¹

01 Jan 2014

TL;DR: Experimental results on real-world data from patient monitoring showed that the proposed methodology outperforms traditional methods and effectively identifies a sparse set of sensitive features from high-dimensional datasets for process monitoring and fault diagnostics.

Abstract: In order to cope with system complexity and dynamic environments, modern industries are investing in a variety of sensor networks and data acquisition systems to increase information visibility. Multi-sensor systems bring the proliferation of high-dimensional functional profiles that capture rich information on the evolving dynamics of natural and engineered processes. This provides an unprecedented opportunity for online monitoring of operational quality and integrity of complex systems. However, the classical methodology of statistical process control is not concerned with high-dimensional sensor signals and is limited in its capability to perform multi-sensor fault diagnostics. It is not uncommon that multi-dimensional sensing capabilities are not fully utilized for decision making. This paper presents a new model-driven parametric monitoring strategy for the detection of dynamic fault patterns in high-dimensional functional profiles that are nonlinear and nonstationary. First, we developed a sparse basis function model of high-dimensional functional profiles, thereby reducing the large amount of data to a parsimonious set of model parameters (i.e., weight, shifting and scaling factors) while preserving the information. Further, we utilized the lasso-penalized logistic regression model to select a low-dimensional set of sensitive predictors for fault diagnostics. Experimental results on real-world data from patient monitoring showed that the proposed methodology outperforms traditional methods and effectively identifies a sparse set of sensitive features from high-dimensional datasets for process monitoring and fault diagnostics.

9 citations

Posted Content

Fast Newton Method for Sparse Logistic Regression

Rui Wang, Naihua Xiu, Shenglong Zhou

09 Jan 2019

TL;DR: The proposed method FNSLR, an abbreviation for Newton method for sparse logistic regression, enjoys a very low computational complexity, local quadratic convergence rate and termination within finite steps.

Abstract: Sparse logistic regression has developed tremendously over the past two decades, from its origin in the $\ell_1$-regularized version by Tibshirani (1996) to the sparsity constrained models by Bahmani, Raj, and Boufounos (2013) and Plan and Vershynin (2013). This paper addresses sparsity constrained logistic regression through the classical Newton method. We begin by analysing its first optimality condition to acquire a strong $\tau$-stationary point for some $\tau>0$. This point enables us to equivalently derive a stationary equation system that can be solved efficiently by the Newton method. The proposed method FNSLR, an abbreviation for Newton method for sparse logistic regression, enjoys a very low computational complexity, a local quadratic convergence rate, and termination within finite steps. Numerical experiments on random data and real data demonstrate its superior performance against seven state-of-the-art solvers.

9 citations

Cites methods from "EfficientL 1 regularized logistic r..."

...…Lassplore (Liu et al., 2009a) or SLEP (Liu et al., 2009b); When ν = 0, namely, the $\ell_1$ constrained logistic regression, $\min_{z\in\mathbb{R}^p} \ell(z)$, s.t. $\|z\|_1 \le t$, (5) Lee et al. (2006) developed the IRLS-LARS scheme where LARS (see Efron et al., 2004) was first introduced to address an $\ell_1$ constrained least…...

Journal Article

Inferring Networks from Multiple Samples with Consensus LASSO

Nathalie Villa-Vialaneix¹, Matthieu Vignes², Nathalie Viguerie³, Magali San Cristobal⁴

University of Paris¹, Institut national de la recherche agronomique², Paul Sabatier University³, Institut national des sciences appliquées de Toulouse⁴

01 Mar 2014-Quality Technology and Quantitative Management

TL;DR: This work introduces a novel method for inferring networks from samples obtained in various but related experimental conditions, based on a double penalization: a first penalty controls the global sparsity of the solution, whilst a second penalty makes condition-specific networks consistent with a consensual network.

Abstract: Networks are very useful tools to decipher complex regulatory relationships between genes in an organism. Most work addresses this issue in the context of i.i.d., treated vs. control, or time-series samples. However, many data sets include expression obtained for the same cell type of an organism, but in several conditions. We introduce a novel method for inferring networks from samples obtained in various but related experimental conditions. This approach is based on a double penalization: a first penalty aims at controlling the global sparsity of the solution whilst a second penalty is used to make condition-specific networks consistent with a consensual network. This "consensual network" is introduced to represent the dependency structure between genes, which is shared by all conditions. We show that different "consensus" penalties can be used, some integrating prior (e.g., bibliographic) knowledge and others that are adapted along the optimization scheme. In all situations, the proposed double penalty can be expressed in terms of a LASSO problem and hence solved using standard approaches which address quadratic problems with $L_1$-regularization. This approach is combined with a bootstrap approach and is made available in the R package therese. Our proposal is illustrated on simulated datasets and compared with independent estimations and alternative methods. It is also applied to a real dataset to emphasize the differences in regulatory networks before and after a low-calorie diet.

9 citations

Additional excerpts

...Others proposed to use differentiable approximations of P, such as [14] that takes advantage of the approximation $\|\beta\|_1 \simeq \sum_j \sqrt{\beta_j^2 + \epsilon}$....
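
The differentiable approximation mentioned in this excerpt can be sketched in a few lines. The sketch assumes the reading in which the small constant sits inside each square root, that is, each |β_j| is replaced by sqrt(β_j² + ε). The function names and the default value of `eps` are hypothetical choices for illustration.

```python
import math

def smooth_l1(beta, eps=1e-6):
    # Smooth surrogate for ||beta||_1: sum_j sqrt(beta_j**2 + eps).
    return sum(math.sqrt(b * b + eps) for b in beta)

def smooth_l1_grad(beta, eps=1e-6):
    # Gradient of the surrogate; unlike |b|, it is defined at b = 0.
    return [b / math.sqrt(b * b + eps) for b in beta]

beta = [3.0, -4.0, 0.0]
exact = sum(abs(b) for b in beta)
print(exact, smooth_l1(beta), smooth_l1_grad(beta))
```

The surrogate is never smaller than the exact L1 norm and exceeds it by at most `len(beta) * sqrt(eps)`, so a smaller `eps` tightens the approximation at the cost of a sharper curvature near zero. Because the surrogate is smooth, a generic gradient-based optimizer can be applied, although the resulting coefficients are generally small rather than exactly zero.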

Journal Article

Identification of regulatory modules that stratify lupus disease mechanism through integrating multi-omics data

Ting-You Wang¹, Yong-Fei Wang¹, Yan Zhang¹, Jiangshan Jane Shen¹,², Mengbiao Guo¹, Jing Yang¹, Yun lung Lau¹, Wanling Yang¹

Li Ka Shing Faculty of Medicine, University of Hong Kong¹, Jining Medical University²

06 Mar 2020-Molecular therapy. Nucleic acids

TL;DR: A hierarchical regulatory cascade is identified in which TFs are regulated by DAGs and in turn regulate gene expression, and co-regulatory modules reveal SLE pathogenesis pathways, including the complement cascade, cell cycle regulation, NETosis, and epigenetic regulation.

Abstract: Although recent advances in genetic studies have shed light on systemic lupus erythematosus (SLE), its detailed mechanisms remain elusive. In this study, using datasets on SLE transcriptomic profiles, we identified 750 differentially expressed genes (DEGs) in T and B lymphocytes and peripheral blood cells. Using transcription factor (TF) binding data derived from chromatin immunoprecipitation sequencing (ChIP-seq) experiments from the Encyclopedia of DNA Elements (ENCODE) project, we inferred networks of co-regulated genes (NcRGs) based on binding profiles of the upregulated DEGs by significantly enriched TFs. Modularization analysis of NcRGs identified co-regulatory modules among the DEGs and master TFs vital for each module. Remarkably, the co-regulatory modules stratified the common SLE interferon (IFN) signature and revealed SLE pathogenesis pathways, including the complement cascade, cell cycle regulation, NETosis, and epigenetic regulation. By integrative analyses of disease-associated genes (DAGs), DEGs, and enriched TFs, as well as proteins interacting with them, we identified a hierarchical regulatory cascade with TFs regulated by DAGs, which in turn regulates gene expression. Integrative analysis of multi-omics data provided valuable molecular insights into the molecular mechanisms of SLE.

9 citations

References

Journal Article

Regression Shrinkage and Selection via the Lasso

Robert Tibshirani

01 Jan 1996-Journal of the Royal Statistical Society, Series B (Methodological)

TL;DR: A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant, is proposed.

Abstract: SUMMARY We propose a new method for estimation in linear models. The 'lasso' minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant. Because of the nature of this constraint it tends to produce some coefficients that are exactly 0 and hence gives interpretable models. Our simulation studies suggest that the lasso enjoys some of the favourable properties of both subset selection and ridge regression. It produces interpretable models like subset selection and exhibits the stability of ridge regression. There is also an interesting relationship with recent work in adaptive function estimation by Donoho and Johnstone. The lasso idea is quite general and can be applied in a variety of statistical models: extensions to generalized regression models and tree-based models are briefly described.
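
The claim that the lasso sets some coefficients exactly to 0 is easiest to see in the special case of an orthonormal design, where the solution has a closed form. The sketch below rests on that assumption and uses the penalized form with a multiplier `lam` instead of the constrained form stated in the abstract. Under those assumptions the lasso soft-thresholds each least-squares coefficient, while ridge regression only rescales it. The function names and numbers are hypothetical.

```python
import math

def lasso_orthonormal(beta_ols, lam):
    # Minimizer of 0.5*||y - X b||^2 + lam*||b||_1 when X^T X = I:
    # each least-squares coefficient is moved toward zero by lam and
    # set to exactly zero if its magnitude is at most lam.
    out = []
    for b in beta_ols:
        shrunk = max(abs(b) - lam, 0.0)
        out.append(math.copysign(shrunk, b) if shrunk > 0 else 0.0)
    return out

def ridge_orthonormal(beta_ols, lam):
    # Minimizer of 0.5*||y - X b||^2 + 0.5*lam*||b||^2 when X^T X = I:
    # every coefficient is scaled by the same factor, none becomes zero.
    return [b / (1.0 + lam) for b in beta_ols]

beta_ols = [3.0, -0.4, 0.2, -1.5]
print(lasso_orthonormal(beta_ols, 0.5))
print(ridge_orthonormal(beta_ols, 0.5))
```

In this example the two small coefficients are removed by the lasso and kept, in shrunken form, by ridge regression, which is the contrast between selection and shrinkage that the abstract describes.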

40,785 citations

"EfficientL 1 regularized logistic r..." refers methods in this paper

...(Tibshirani 1996) Several algorithms have been developed to solve L1 constrained least squares problems....
...See, Tibshirani (1996) for details.)...

Book

Convex Optimization

Stephen Boyd¹, Lieven Vandenberghe²

Stanford University¹, University of California, Los Angeles²

01 Mar 2004

TL;DR: A comprehensive introduction to convex optimization that focuses on recognizing convex optimization problems and then finding the most appropriate technique for solving them.

Abstract: Convex optimization problems arise frequently in many different fields. A comprehensive introduction to the subject, this book shows in detail how such problems can be solved numerically with great efficiency. The focus is on recognizing convex optimization problems and then finding the most appropriate technique for solving them. The text contains many worked examples and homework exercises and will appeal to students, researchers and practitioners in fields such as engineering, computer science, mathematics, statistics, finance, and economics.

33,341 citations

Book

Generalized Linear Models

Peter McCullagh¹, John A. Nelder

Imperial College London¹

01 Jan 1983

TL;DR: In this paper, a generalization of the analysis of variance is given for generalized linear models using log-likelihoods, illustrated by examples relating to four distributions: the Normal, Binomial (probit analysis, etc.), Poisson (contingency tables), and gamma (variance components).

Abstract: The technique of iterative weighted linear regression can be used to obtain maximum likelihood estimates of the parameters with observations distributed according to some exponential family and systematic effects that can be made linear by a suitable transformation. A generalization of the analysis of variance is given for these models using log-likelihoods. These generalized linear models are illustrated by examples relating to four distributions: the Normal, Binomial (probit analysis, etc.), Poisson (contingency tables) and gamma (variance components).

23,215 citations

UCI Repository of machine learning databases

Catherine Blake

01 Jan 1998

12,940 citations

"EfficientL 1 regularized logistic r..." refers methods in this paper

...We tested each algorithm’s performance on 12 different datasets, consisting of 9 UCI datasets (Newman et al. 1998), one artificial dataset called Madelon from the NIPS 2003 workshop on feature extraction,3 and two gene expression datasets (Microarray 1 and 2).4 Table 2 gives details on the number…...
...We tested each algorithm’s performance on 12 different real datasets, consisting of 9 UCI datasets (Newman et al. 1998) and 3 gene expression datasets (Microarray 1, 2 and 3) 3....

Journal Article

Generalized Linear Models

Eric R. Ziegel

01 Aug 2002-Technometrics

TL;DR: This is the first book on generalized linear models written by authors not mostly associated with the biological sciences, and it is thoroughly enjoyable to read.

Abstract: This is the first book on generalized linear models written by authors not mostly associated with the biological sciences. Subtitled “With Applications in Engineering and the Sciences,” this book’s authors all specialize primarily in engineering statistics. The first author has produced several recent editions of Walpole, Myers, and Myers (1998), the last reported by Ziegel (1999). The second author has had several editions of Montgomery and Runger (1999), recently reported by Ziegel (2002). All of the authors are renowned experts in modeling. The first two authors collaborated on a seminal volume in applied modeling (Myers and Montgomery 2002), which had its recent revised edition reported by Ziegel (2002). The last two authors collaborated on the most recent edition of a book on regression analysis (Montgomery, Peck, and Vining 2001), reported by Gray (2002), and the first author has had multiple editions of his own regression analysis book (Myers 1990), the latest of which was reported by Ziegel (1991). A comparable book with similar objectives and a more specific focus on logistic regression, Hosmer and Lemeshow (2000), reported by Conklin (2002), presumed a background in regression analysis and began with generalized linear models. The Preface here (p. xi) indicates an identical requirement but nonetheless begins with 100 pages of material on linear and nonlinear regression. Most of this will probably be a review for the readers of the book. Chapter 2, “Linear Regression Model,” begins with 50 pages of familiar material on estimation, inference, and diagnostic checking for multiple regression. The approach is very traditional, including the use of formal hypothesis tests. In industrial settings, use of p values as part of a risk-weighted decision is generally more appropriate. The pedagogic approach includes formulas and demonstrations for computations, although computing by Minitab is eventually illustrated.
Less-familiar material on maximum likelihood estimation, scaled residuals, and weighted least squares provides more specific background for subsequent estimation methods for generalized linear models. This review is not meant to be disparaging. The authors have packed a wealth of useful nuggets for any practitioner in this chapter. It is thoroughly enjoyable to read. Chapter 3, “Nonlinear Regression Models,” is arguably less of a review, because regression analysis courses often give short shrift to nonlinear models. The chapter begins with a great example on the pitfalls of linearizing a nonlinear model for parameter estimation. It continues with the effective balancing of explicit statements concerning the theoretical basis for computations versus the application and demonstration of their use. The details of maximum likelihood estimation are again provided, and weighted and generalized regression estimation are discussed. Chapter 4 is titled “Logistic and Poisson Regression Models.” Logistic regression provides the basic model for generalized linear models. The prior development for weighted regression is used to motivate maximum likelihood estimation for the parameters in the logistic model. The algebraic details are provided. As in the development for linear models, some of the details are pushed into an appendix. In addition to connecting to the foregoing material on regression on several occasions, the authors link their development forward to their following chapter on the entire family of generalized linear models. They discuss score functions, the variance-covariance matrix, Wald inference, likelihood inference, deviance, and overdispersion. Careful explanations are given for the values provided in standard computer software, here PROC LOGISTIC in SAS.
The value in having the book begin with familiar regression concepts is clearly realized when the analogies are drawn between overdispersion and nonhomogeneous variance, or analysis of deviance and analysis of variance. The authors rely on the similarity of Poisson regression methods to logistic regression methods and mostly present illustrations for Poisson regression. These use PROC GENMOD in SAS. The book does not give any of the SAS code that produces the results. Two of the examples illustrate designed experiments and modeling. They include discussion of subset selection and adjustment for overdispersion. The mathematical level of the presentation is elevated in Chapter 5, “The Family of Generalized Linear Models.” First, the authors unify the two preceding chapters under the exponential distribution. The material on the formal structure for generalized linear models (GLMs), likelihood equations, quasilikelihood, the gamma distribution family, and power functions as links is some of the most advanced material in the book. Most of the computational details are relegated to appendixes. A discussion of residuals returns one to a more practical perspective, and two long examples on gamma distribution applications provide excellent guidance on how to put this material into practice. One example is a contrast to the use of linear regression with a log transformation of the response, and the other is a comparison to the use of a different link function in the previous chapter. Chapter 6 considers generalized estimating equations (GEEs) for longitudinal and analogous studies. The first half of the chapter presents the methodology, and the second half demonstrates its application through five different examples. The basis for the general situation is first established using the case with a normal distribution for the response and an identity link.
The importance of the correlation structure is explained, the iterative estimation procedure is shown, and estimation for the scale parameters and the standard errors of the coefficients is discussed. The procedures are then generalized for the exponential family of distributions and quasi-likelihood estimation. Two of the examples are standard repeated-measures illustrations from biostatistical applications, but the last three illustrations are all interesting reworkings of industrial applications. The GEE computations in PROC GENMOD are applied to account for correlations that occur with multiple measurements on the subjects or restrictions to randomizations. The examples show that accounting for correlation structure can result in different conclusions. Chapter 7, “Further Advances and Applications in GLM,” discusses several additional topics. These are experimental designs for GLMs, asymptotic results, analysis of screening experiments, data transformation, modeling for both a process mean and variance, and generalized additive models. The material on experimental designs is more discursive than prescriptive and as a result is also somewhat theoretical. Similar comments apply for the discussion on the quality of the asymptotic results, which wallows a little too much in reports on various simulation studies. The examples on screening and data transformations experiments are again reworkings of analyses of familiar industrial examples and another obvious motivation for the enthusiasm that the authors have developed for using the GLM toolkit. One can hope that subsequent editions will similarly contain new examples that will have caused the authors to expand the material on generalized additive models and other topics in this chapter. Designating myself to review a book that I know I will love to read is one of the rewards of being editor. I read both of the editions of McCullagh and Nelder (1989), which was reviewed by Schuenemeyer (1992).
That book was not fun to read. The obvious enthusiasm of Myers, Montgomery, and Vining and their reliance on their many examples as a major focus of their pedagogy make Generalized Linear Models a joy to read. Every statistician working in any area of applied science should buy it and experience the excitement of these new approaches to familiar activities.

10,520 citations