Author

Małgorzata Bogdan

Other affiliations: Purdue University, Jan Długosz University, Lund University
Bio: Małgorzata Bogdan is an academic researcher from the University of Wrocław. The author has contributed to research in the topics of Model selection and Bayesian information criterion, has an h-index of 19, and has co-authored 84 publications receiving 1632 citations. Previous affiliations of Małgorzata Bogdan include Purdue University and Jan Długosz University.


Papers
Journal ArticleDOI
TL;DR: SLOPE, short for Sorted L-One Penalized Estimation, is introduced in this paper as the solution to a convex program whose regularizer is a sorted l1 norm, which penalizes the regression coefficients according to their rank: the higher the rank (that is, the stronger the signal), the larger the penalty.
Abstract: We introduce a new estimator for the vector of coefficients β in the linear model y = Xβ + z, where X has dimensions n × p with p possibly larger than n. SLOPE, short for Sorted L-One Penalized Estimation, is the solution to minimize (1/2)‖y − Xb‖₂² + λ1|b|(1) + λ2|b|(2) + … + λp|b|(p), where λ1 ≥ λ2 ≥ … ≥ λp ≥ 0 and |b|(1) ≥ |b|(2) ≥ … ≥ |b|(p) are the decreasing absolute values of the entries of b. This is a convex program and we demonstrate a solution algorithm whose computational complexity is roughly comparable to that of classical l1 procedures such as the Lasso. Here, the regularizer is a sorted l1 norm, which penalizes the regression coefficients according to their rank: the higher the rank (that is, the stronger the signal), the larger the penalty. This is similar to the Benjamini and Hochberg [J. Roy. Statist. Soc. Ser. B 57 (1995) 289-300] procedure (BH), which compares more significant p-values with more stringent thresholds. One notable choice of the sequence {λi} is given by the BH critical values λBH(i) = z(iq/(2p)), where q ∈ (0, 1) and z(α) is the upper α-quantile of a standard normal distribution. SLOPE aims to provide finite sample guarantees on the selected model; of special interest is the false discovery rate (FDR), defined as the expected proportion of irrelevant regressors among all selected predictors. Under orthogonal designs, SLOPE with λBH provably controls FDR at level q. Moreover, it also appears to have appreciable inferential properties under more general designs X while having substantial power, as demonstrated in a series of experiments running on both simulated and real data.
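As an illustration of the ingredients above, here is a minimal Python sketch (our own, not code from the paper) that builds the BH-style weight sequence for Gaussian errors and evaluates the SLOPE objective; the function and variable names are ours:

```python
import numpy as np
from scipy.stats import norm

def bh_lambda(p, q=0.1):
    """BH-style weights: lambda_i = Phi^{-1}(1 - i*q/(2p)) for i = 1..p (nonincreasing)."""
    i = np.arange(1, p + 1)
    return norm.ppf(1 - i * q / (2 * p))

def sorted_l1_norm(b, lam):
    """Sorted l1 norm: sum_i lam_i * |b|_(i), where |b|_(1) >= ... >= |b|_(p)."""
    return np.sum(lam * np.sort(np.abs(b))[::-1])

def slope_objective(b, X, y, lam):
    """SLOPE objective: 0.5 * ||y - Xb||_2^2 plus the sorted l1 penalty."""
    resid = y - X @ b
    return 0.5 * np.dot(resid, resid) + sorted_l1_norm(b, lam)
```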

303 citations

Journal ArticleDOI
01 Jun 2004-Genetics
TL;DR: This work investigates the behavior of the Schwarz Bayesian information criterion (BIC), explains the phenomenon of overestimation, and proposes a novel modification of BIC that allows the detection of main effects and pairwise interactions in a backcross population.
Abstract: The problem of locating multiple interacting quantitative trait loci (QTL) can be addressed as a multiple regression problem, with marker genotypes being the regressor variables. An important and difficult part of fitting such a regression model is the estimation of the QTL number and the respective interactions. Among the many model selection criteria that can be used to estimate the number of regressor variables, none are used to estimate the number of interactions. Our simulations demonstrate that epistatic terms appearing in a model without the related main effects cause the standard model selection criteria to have a strong tendency to overestimate the number of interactions, and thus the QTL number. With this as our motivation, we investigate the behavior of the Schwarz Bayesian information criterion (BIC), explain the phenomenon of the overestimation, and propose a novel modification of BIC that allows the detection of main effects and pairwise interactions in a backcross population. Results of an extensive simulation study demonstrate that our modified version of BIC performs very well in practice. Our methodology can be extended to general populations and higher-order interactions.
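To make the idea concrete, here is a minimal sketch of a BIC-style criterion that penalizes the size of the model space of candidate main effects and pairwise interactions in addition to the usual parameter count. The 2p log(...) and 2q log(...) terms are illustrative assumptions; the exact constants in the paper's modified BIC may differ:

```python
import numpy as np

def modified_bic(rss, n, m, p, q):
    """Illustrative mBIC-style criterion (to be minimized).

    rss : residual sum of squares of the fitted regression model
    n   : number of observations
    m   : number of candidate markers (so m candidate main effects and
          m*(m-1)/2 candidate pairwise interactions)
    p   : number of main effects included in the model
    q   : number of pairwise interactions included in the model
    """
    k = p + q                                   # fitted regression terms
    classical_bic = n * np.log(rss / n) + k * np.log(n)
    # extra penalty for the size of the model space (illustrative constants)
    model_space = 2 * p * np.log(m) + 2 * q * np.log(m * (m - 1) / 2)
    return classical_bic + model_space
```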

186 citations

Journal ArticleDOI
TL;DR: It is demonstrated that true features and null features are always interspersed on the Lasso path, and that this phenomenon occurs no matter how strong the effect sizes are.
Abstract: In regression settings where explanatory variables have very low correlations and there are relatively few effects, each of large magnitude, we expect the Lasso to find the important variables with few errors, if any. This paper shows that in a regime of linear sparsity—meaning that the fraction of variables with a nonvanishing effect tends to a constant, however small—this cannot really be the case, even when the design variables are stochastically independent. We demonstrate that true features and null features are always interspersed on the Lasso path, and that this phenomenon occurs no matter how strong the effect sizes are. We derive a sharp asymptotic trade-off between false and true positive rates or, equivalently, between measures of type I and type II errors along the Lasso path. This trade-off states that if we want the type II error (false negative rate) to fall below a critical value, then anywhere on the Lasso path the type I error (false positive rate) must exceed a given threshold; the two errors can never both be low at the same time. Our analysis uses tools from approximate message passing (AMP) theory as well as novel elements to deal with a possibly adaptive selection of the Lasso regularizing parameter.
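The phenomenon can be probed empirically. Below is a small simulation sketch under linear sparsity with an independent Gaussian design, using scikit-learn's lasso_path; the dimensions, sparsity level, and effect sizes are arbitrary choices for illustration:

```python
import numpy as np
from sklearn.linear_model import lasso_path

rng = np.random.default_rng(0)
n, p, k = 500, 1000, 100                   # linear sparsity: k/p is a fixed fraction
X = rng.standard_normal((n, p)) / np.sqrt(n)
beta = np.zeros(p)
beta[:k] = 50.0                            # very strong effects on the first k coordinates
y = X @ beta + rng.standard_normal(n)

alphas, coefs, _ = lasso_path(X, y, n_alphas=100)
support = np.abs(coefs) > 1e-10            # shape (p, n_alphas): selected variables per alpha
tpp = support[:k].sum(axis=0) / k          # true positive proportion along the path
fdp = support[k:].sum(axis=0) / np.maximum(support.sum(axis=0), 1)  # false discovery proportion

for a, t, f in zip(alphas[::10], tpp[::10], fdp[::10]):
    print(f"alpha={a:.4f}  TPP={t:.2f}  FDP={f:.2f}")
```

In line with the trade-off described above, the false discovery proportion is expected to become nonzero well before the true positive proportion approaches one, even though the signals are far above the noise level.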

181 citations

Posted Content
TL;DR: In this article, a new estimator called SLOPE, inspired by modern ideas in multiple testing such as the BHq procedure [Benjamini and Hochberg, 1995], is proposed for sparse regression and variable selection.
Abstract: We introduce a novel method for sparse regression and variable selection, which is inspired by modern ideas in multiple testing. Imagine we have observations from the linear model y = Xβ + z; then we suggest estimating the regression coefficients by means of a new estimator called SLOPE, which is the solution to minimize (1/2)‖y − Xb‖₂² + λ1|b|(1) + λ2|b|(2) + … + λp|b|(p); here, λ1 ≥ λ2 ≥ … ≥ λp ≥ 0 and |b|(1) ≥ |b|(2) ≥ … ≥ |b|(p) are the ordered magnitudes of the entries of b. The regularizer is a sorted L1 norm which penalizes the regression coefficients according to their rank: the higher the rank, the larger the penalty. This is similar to the famous BHq procedure [Benjamini and Hochberg, 1995], which compares the value of a test statistic taken from a family to a critical threshold that depends on its rank in the family. SLOPE is a convex program and we demonstrate an efficient algorithm for computing the solution. We prove that for orthogonal designs with p variables, taking λi = F⁻¹(1 − qi) (F is the cdf of the errors), with qi = iq/(2p), controls the false discovery rate (FDR) for variable selection. When the design matrix is nonorthogonal there are inherent limitations on the FDR level and the power which can be obtained with model selection methods based on L1-like penalties. However, whenever the columns of the design matrix are not strongly correlated, we demonstrate empirically that it is possible to select the parameters λi so as to obtain FDR control at a reasonable level as long as the number of nonzero coefficients is not too large. At the same time, the procedure exhibits increased power over the lasso, which treats all coefficients equally. The paper illustrates further estimation properties of the new selection rule through comprehensive simulation studies.
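Because the program is convex, it can be attacked with proximal methods whose key step is evaluating the proximal operator of the sorted L1 norm. The sketch below evaluates that operator via a pool-adjacent-violators pass; this is a standard construction offered for illustration, not a transcription of the paper's algorithm:

```python
import numpy as np

def _isotonic_nonincreasing(z):
    """Pool-adjacent-violators fit of a nonincreasing sequence to z (least squares)."""
    means, sizes = [], []
    for zi in z:
        means.append(float(zi))
        sizes.append(1)
        # merge adjacent blocks while the nonincreasing constraint is violated
        while len(means) > 1 and means[-2] < means[-1]:
            total = means[-1] * sizes[-1] + means[-2] * sizes[-2]
            sizes[-2] += sizes[-1]
            means[-2] = total / sizes[-2]
            means.pop(); sizes.pop()
    return np.repeat(means, sizes)

def prox_sorted_l1(v, lam):
    """Proximal operator of b -> sum_i lam_i * |b|_(i), with lam nonincreasing and >= 0."""
    sign = np.sign(v)
    absv = np.abs(v)
    order = np.argsort(-absv)                       # sort |v| in decreasing order
    fitted = _isotonic_nonincreasing(absv[order] - lam)
    x_sorted = np.clip(fitted, 0.0, None)           # project onto the nonnegative orthant
    x = np.empty_like(v, dtype=float)
    x[order] = x_sorted                             # undo the sort
    return sign * x
```

Plugging this operator into a generic proximal-gradient loop yields a simple, if not the fastest, SLOPE solver.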

107 citations

Journal ArticleDOI
24 Jan 2022-Galaxies
TL;DR: In this article, a Markov Chain Monte Carlo (MCMC) analysis was used to obtain the value of H0, assuming Gaussian priors that restrict the parameter spaces to values expected from our prior knowledge of current cosmological models and that avoid phantom Dark Energy models with w < −1.
Abstract: The 4 to 6 σ difference in the Hubble constant (H0) between the values observed with the local probes (Cepheids and Supernovae Ia, SNe Ia) and with the high-z probes (the Cosmic Microwave Background obtained from the Planck data) still challenges the astrophysics and cosmology community. Previous analysis has shown that there is an evolution in the Hubble constant that scales as f(z) = H0/(1+z)^η, where H0 is H0(z=0) and η is the evolutionary parameter. Here, we investigate whether this evolution still holds by using the SNe Ia gathered in the Pantheon sample and the Baryon Acoustic Oscillations. We assume H0 = 70 km s−1 Mpc−1 as the local value and divide the Pantheon into three bins ordered in increasing values of redshift. Similar to our previous analysis, but varying two cosmological parameters simultaneously (H0 and Ω0m in the ΛCDM model; H0 and wa in the w0waCDM model), for each bin we implement a Markov Chain Monte Carlo (MCMC) analysis to obtain the value of H0, assuming Gaussian priors that restrict the parameter spaces to values expected from our prior knowledge of the current cosmological models and that avoid phantom Dark Energy models with w < −1. Subsequently, the values of H0 are fitted with the model f(z). Our results show that a decreasing trend with η ∼ 10−2 is still visible in this sample. The η coefficient is consistent with zero within 2.0 σ for the ΛCDM model, while the deviation reaches up to 5.8 σ for the w0waCDM model. This trend, if not due to statistical fluctuations, could be explained through a hidden astrophysical bias, such as the effect of stretch evolution, or it may require new theoretical models, one possibility being modified gravity theories such as f(R). This analysis is meant to further cast light on the evolution of H0 and does not specifically focus on constraining the other parameters. This work is also preparatory to understanding how the combined probes still show an evolution of H0 with redshift and what the current status is of simulations on GRB cosmology aimed at obtaining uncertainties on Ω0m comparable with those achieved through SNe Ia.
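For concreteness, the final fitting step described above, fitting f(z) = H0/(1+z)^η to per-bin H0 estimates, can be sketched as follows. The bin redshifts, values, and uncertainties are placeholders for illustration only, not the paper's results:

```python
import numpy as np
from scipy.optimize import curve_fit

def f(z, H0_local, eta):
    """Evolving-H0 model: f(z) = H0 / (1 + z)**eta."""
    return H0_local / (1.0 + z) ** eta

# Placeholder binned estimates (illustrative values, not the paper's results).
z_bins  = np.array([0.15, 0.45, 0.9])
H0_bins = np.array([70.0, 69.8, 69.5])
H0_err  = np.array([0.3, 0.4, 0.6])

popt, pcov = curve_fit(f, z_bins, H0_bins, sigma=H0_err,
                       absolute_sigma=True, p0=[70.0, 0.0])
H0_fit, eta_fit = popt
eta_err = np.sqrt(pcov[1, 1])
print(f"eta = {eta_fit:.3e} +/- {eta_err:.3e}")   # significance ~ eta_fit / eta_err
```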

105 citations


Cited by
01 Aug 2000
TL;DR: A Bioentrepreneur course on the assessment of medical technology in the context of commercialization, addressing many issues unique to biomedical products.
Abstract: BIOE 402. Medical Technology Assessment. 2 or 3 hours. Bioentrepreneur course. Assessment of medical technology in the context of commercialization. Objectives, competition, market share, funding, pricing, manufacturing, growth, and intellectual property; many issues unique to biomedical products. Course Information: 2 undergraduate hours. 3 graduate hours. Prerequisite(s): Junior standing or above and consent of the instructor.

4,833 citations

Book
27 Nov 2013
TL;DR: The many different interpretations of proximal operators and algorithms are discussed, their connections to many other topics in optimization and applied mathematics are described, some popular algorithms are surveyed, and a large number of examples of proximal operators that commonly arise in practice are provided.
Abstract: This monograph is about a class of optimization algorithms called proximal algorithms. Much like Newton's method is a standard tool for solving unconstrained smooth optimization problems of modest size, proximal algorithms can be viewed as an analogous tool for nonsmooth, constrained, large-scale, or distributed versions of these problems. They are very generally applicable, but are especially well-suited to problems of substantial recent interest involving large or high-dimensional datasets. Proximal methods sit at a higher level of abstraction than classical algorithms like Newton's method: the base operation is evaluating the proximal operator of a function, which itself involves solving a small convex optimization problem. These subproblems, which generalize the problem of projecting a point onto a convex set, often admit closed-form solutions or can be solved very quickly with standard or simple specialized methods. Here, we discuss the many different interpretations of proximal operators and algorithms, describe their connections to many other topics in optimization and applied mathematics, survey some popular algorithms, and provide a large number of examples of proximal operators that commonly arise in practice.
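As a concrete instance of the closed-form subproblems mentioned above: the proximal operator of the l1 norm is soft thresholding, and alternating it with a gradient step gives the proximal gradient method. A generic sketch (not code from the monograph), applied to a lasso problem:

```python
import numpy as np

def prox_l1(v, t):
    """Proximal operator of t*||.||_1: soft thresholding (closed form)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def proximal_gradient(grad_f, prox_g, x0, step, n_iter=500):
    """Minimize f(x) + g(x): gradient step on smooth f, prox step on nonsmooth g."""
    x = x0.copy()
    for _ in range(n_iter):
        x = prox_g(x - step * grad_f(x), step)
    return x

# Example: lasso, f(x) = 0.5*||Ax - b||^2, g(x) = lam*||x||_1
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 200))
b = rng.standard_normal(50)
lam = 0.1
step = 1.0 / np.linalg.norm(A, 2) ** 2     # 1/L, with L the Lipschitz constant of grad f
x_hat = proximal_gradient(lambda x: A.T @ (A @ x - b),
                          lambda v, t: prox_l1(v, lam * t),
                          np.zeros(200), step)
```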

3,627 citations

Journal ArticleDOI
TL;DR: This paper re-examines the Bayesian paradigm for model selection and proposes an extended family of Bayesian information criteria, which take into account both the number of unknown parameters and the complexity of the model space.
Abstract: SUMMARY The ordinary Bayesian information criterion is too liberal for model selection when the model space is large. In this paper, we re-examine the Bayesian paradigm for model selection and propose an extended family of Bayesian information criteria, which take into account both the number of unknown parameters and the complexity of the model space. Their consistency is established, in particular allowing the number of covariates to increase to infinity with the sample size. Their performance in various situations is evaluated by simulation studies. It is demonstrated that the extended Bayesian information criteria incur a small loss in the positive selection rate but tightly control the false discovery rate, a desirable property in many applications. The extended Bayesian information criteria are extremely useful for variable selection in problems with a moderate sample size but with a huge number of covariates, especially in genome-wide association studies, which are now an active area in genetics research.
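A common form of the extended criterion adds a model-space term to the ordinary BIC: EBIC_gamma = -2 log L + k log n + 2*gamma*log C(p, k), with gamma in [0, 1]. A minimal sketch under that form (our transcription; consult the paper for the precise definition):

```python
import math

def ebic(log_likelihood, k, n, p, gamma=0.5):
    """Extended BIC (to be minimized); gamma = 0 recovers the ordinary BIC.

    log_likelihood : maximized log-likelihood of the candidate model
    k              : number of covariates in the model
    n              : sample size
    p              : total number of candidate covariates
    """
    bic = -2.0 * log_likelihood + k * math.log(n)
    model_space_penalty = 2.0 * gamma * math.log(math.comb(p, k))  # size of the model space
    return bic + model_space_penalty
```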

1,472 citations

Journal ArticleDOI
TL;DR: In this article, the authors proposed a new approach to sparsity called the horseshoe estimator, which, like estimators arising from double-exponential and Cauchy priors, is a member of the family of multivariate scale mixtures of normals.
Abstract: This paper proposes a new approach to sparsity called the horseshoe estimator. The horseshoe is a close cousin of other widely used Bayes rules arising from, for example, double-exponential and Cauchy priors, in that it is a member of the same family of multivariate scale mixtures of normals. But the horseshoe enjoys a number of advantages over existing approaches, including its robustness, its adaptivity to different sparsity patterns, and its analytical tractability. We prove two theorems that formally characterize both the horseshoe's adeptness at large outlying signals, and its super-efficient rate of convergence to the correct estimate of the sampling density in sparse situations. Finally, using a combination of real and simulated data, we show that the horseshoe estimator corresponds quite closely to the answers one would get by pursuing a full Bayesian model-averaging approach using a discrete mixture prior to model signals and noise.
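As an illustration of the scale-mixture-of-normals representation mentioned above, a short sketch that draws coefficients from a horseshoe prior, beta_i | lambda_i, tau ~ N(0, (lambda_i*tau)^2) with lambda_i ~ half-Cauchy(0, 1); the parameter names are ours:

```python
import numpy as np

def sample_horseshoe_prior(p, tau=1.0, rng=None):
    """Draw p coefficients from the horseshoe prior.

    beta_i | lambda_i, tau ~ N(0, (lambda_i * tau)^2), lambda_i ~ half-Cauchy(0, 1):
    heavy tails let large signals escape shrinkage, while the mass near zero
    shrinks noise aggressively.
    """
    rng = np.random.default_rng(rng)
    lam = np.abs(rng.standard_cauchy(p))      # half-Cauchy local scales
    return rng.normal(0.0, lam * tau, size=p)

# In the unit-variance normal-means model with tau = 1, the shrinkage weights
# kappa_i = 1 / (1 + lambda_i^2) follow a Beta(1/2, 1/2) ("horseshoe") distribution,
# piling up near 0 (no shrinkage) and 1 (full shrinkage).
print(sample_horseshoe_prior(5, rng=0))
```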

1,260 citations