Journal Article

Variable selection via the composite likelihood method for multilevel longitudinal data with missing responses and covariates

TL;DR: A unified penalized composite likelihood framework is developed to handle data with missingness and variable selection issues and is justified both rigorously with theoretical results and numerically with simulation studies.
About: This article was published in Computational Statistics & Data Analysis on 2019-07-01. The article focuses on the topics: Missing data & Covariate.
References
Journal Article
TL;DR: A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant, is proposed.
Abstract: We propose a new method for estimation in linear models. The 'lasso' minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant. Because of the nature of this constraint it tends to produce some coefficients that are exactly 0 and hence gives interpretable models. Our simulation studies suggest that the lasso enjoys some of the favourable properties of both subset selection and ridge regression. It produces interpretable models like subset selection and exhibits the stability of ridge regression. There is also an interesting relationship with recent work in adaptive function estimation by Donoho and Johnstone. The lasso idea is quite general and can be applied in a variety of statistical models: extensions to generalized regression models and tree-based models are briefly described.

40,785 citations
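
The lasso abstract above states the estimator as a constrained least-squares problem. A minimal sketch follows, assuming the equivalent penalized (Lagrangian) form, in which an ℓ1 penalty with weight `lam` replaces the constraint, and a plain coordinate-descent solver. The function names `soft_threshold` and `lasso_cd`, the 1/(2n) scaling of the loss, and the fixed iteration count are illustrative choices, not taken from the cited article.

```python
import numpy as np

def soft_threshold(z, t):
    """Shrink z toward zero by t; values with |z| <= t become exactly 0."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by coordinate descent.

    Illustrative sketch: no intercept, no convergence check.
    """
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n   # per-column mean of squares
    r = y - X @ b                       # current residual
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue                # skip all-zero columns
            r += X[:, j] * b[j]         # residual with coordinate j removed
            rho = X[:, j] @ r / n
            b[j] = soft_threshold(rho, lam) / col_sq[j]
            r -= X[:, j] * b[j]         # put updated coordinate back
    return b
```

The soft-thresholding step is what sets some coefficients exactly to 0, which is the behaviour the abstract highlights. Larger values of `lam` correspond to a tighter constraint and a sparser fit.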

Journal Article
TL;DR: In this article, penalized likelihood approaches are proposed to handle variable selection problems, and it is shown that the newly proposed estimators perform as well as the oracle procedure in variable selection; namely, they work as well as if the correct submodel were known.
Abstract: Variable selection is fundamental to high-dimensional statistical modeling, including nonparametric regression. Many approaches in use are stepwise selection procedures, which can be computationally expensive and ignore stochastic errors in the variable selection process. In this article, penalized likelihood approaches are proposed to handle these kinds of problems. The proposed methods select variables and estimate coefficients simultaneously. Hence they enable us to construct confidence intervals for estimated parameters. The proposed approaches are distinguished from others in that the penalty functions are symmetric, nonconcave on (0, ∞), and have singularities at the origin to produce sparse solutions. Furthermore, the penalty functions should be bounded by a constant to reduce bias and satisfy certain conditions to yield continuous solutions. A new algorithm is proposed for optimizing penalized likelihood functions. The proposed ideas are widely applicable. They are readily applied to a variety of ...

8,314 citations
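
The abstract above lists the properties required of the penalty (symmetric, nonconcave on (0, ∞), singular at the origin, bounded by a constant) without naming one. As a hedged illustration, the sketch below evaluates the SCAD penalty, a standard penalty with these properties. Using SCAD here is an assumption on my part, since this listing does not name the penalty. The function name `scad_penalty` is hypothetical, and `a=3.7` is the commonly used default.

```python
import numpy as np

def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty evaluated elementwise (illustrative sketch).

    Linear near zero (lasso-like, singular at the origin), quadratic in a
    transition zone, and constant beyond a * lam, so it is bounded.
    """
    t = np.abs(np.asarray(theta, dtype=float))   # symmetric in theta
    linear = lam * t
    quad = -(t ** 2 - 2 * a * lam * t + lam ** 2) / (2 * (a - 1))
    const = (a + 1) * lam ** 2 / 2
    return np.where(t <= lam, linear, np.where(t <= a * lam, quad, const))
```

Because the penalty is flat for large |theta|, large coefficients are not shrunk further, which is how a bounded penalty reduces bias relative to an ℓ1 penalty.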

Journal Article
Hui Zou
TL;DR: A new version of the lasso is proposed, called the adaptive lasso, where adaptive weights are used for penalizing different coefficients in the ℓ1 penalty, and the nonnegative garotte is shown to be consistent for variable selection.
Abstract: The lasso is a popular technique for simultaneous estimation and variable selection. Lasso variable selection has been shown to be consistent under certain conditions. In this work we derive a necessary condition for the lasso variable selection to be consistent. Consequently, there exist certain scenarios where the lasso is inconsistent for variable selection. We then propose a new version of the lasso, called the adaptive lasso, where adaptive weights are used for penalizing different coefficients in the ℓ1 penalty. We show that the adaptive lasso enjoys the oracle properties; namely, it performs as well as if the true underlying model were given in advance. Similar to the lasso, the adaptive lasso is shown to be near-minimax optimal. Furthermore, the adaptive lasso can be solved by the same efficient algorithm for solving the lasso. We also discuss the extension of the adaptive lasso in generalized linear models and show that the oracle properties still hold under mild regularity conditions. As a bypro...

6,765 citations
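
The adaptive lasso abstract above notes that the weighted problem can be solved with the same algorithm as the ordinary lasso. One way to sketch this is to rescale each column by the inverse of its weight, solve an ordinary lasso, and rescale the solution back. The sketch assumes weights of the form 1/|b_init|^gamma with an ordinary least-squares initial estimate. The names `lasso_cd` and `adaptive_lasso`, the small constant `eps`, and the coordinate-descent solver are illustrative choices, not taken from the cited article.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Ordinary lasso, (1/(2n))||y - Xb||^2 + lam*||b||_1, by coordinate descent."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = y - X @ b
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            r += X[:, j] * b[j]
            rho = X[:, j] @ r / n
            b[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r -= X[:, j] * b[j]
    return b

def adaptive_lasso(X, y, lam, gamma=1.0, eps=1e-12):
    """Weighted lasso with weights w_j = 1 / |b_init_j|**gamma (sketch)."""
    # Initial estimate: ordinary least squares (assumes n > p, full rank).
    b_init, *_ = np.linalg.lstsq(X, y, rcond=None)
    scale = np.abs(b_init) ** gamma + eps   # scale_j = 1 / w_j
    # Penalizing w_j * |b_j| equals an ordinary lasso on rescaled columns.
    b_scaled = lasso_cd(X * scale, y, lam)
    return b_scaled * scale                 # map back to the original scale
```

Coefficients with a small initial estimate receive a large weight and are removed more readily, while large coefficients are penalized only lightly.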

Journal Article
TL;DR: Methods that simultaneously model the data and the drop-out process within a unified model-based framework are discussed, and possible extensions outlined.
Abstract: Subjects often drop out of longitudinal studies prematurely, yielding unbalanced data with unequal numbers of measures for each subject. Modern software programs for handling unbalanced longitudinal data improve on methods that discard the incomplete cases by including all the data, but also yield biased inferences under plausible models for the drop-out process. This article discusses methods that simultaneously model the data and the drop-out process within a unified model-based framework. Models are classified into two broad classes—random-coefficient selection models and random-coefficient pattern-mixture models—depending on how the joint distribution of the data and drop-out mechanism is factored. Inference is likelihood-based, via maximum likelihood or Bayesian methods. A number of examples in the literature are placed in this framework, and possible extensions outlined. Data collection on the nature of the drop-out process is advocated to guide the choice of model. In cases where the drop-...

1,469 citations

Journal Article
TL;DR: A survey of recent developments in the theory and application of composite likelihood is provided in this paper, building on the review paper of Varin (2008), where a range of application areas, including geostatistics, spatial extremes, and space-time models, as well as clustered and longitudinal data and time series, are considered.
Abstract: A survey of recent developments in the theory and application of composite likelihood is provided, building on the review paper of Varin (2008). A range of application areas, including geostatistics, spatial extremes, and space-time models, as well as clustered and longitudinal data and time series, are considered. The important area of applications to statistical genetics is omitted, in light of Larribe and Fearnhead (2011). Emphasis is given to the development of the theory, and the current state of knowledge on efficiency and robustness of composite likelihood inference.

1,034 citations
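
For orientation, the general form of a composite likelihood can be written as a weighted product of component likelihoods. The notation below (components L_k, nonnegative weights w_k, and the pairwise special case for longitudinal observations y_1, ..., y_m) is an assumed, commonly used formulation and is not quoted from the abstracts on this page.

```latex
% Composite likelihood built from K component likelihoods L_k
% with nonnegative weights w_k (notation assumed for illustration).
\[
  \mathrm{CL}(\theta; y) = \prod_{k=1}^{K} L_k(\theta; y)^{w_k},
  \qquad
  c\ell(\theta; y) = \sum_{k=1}^{K} w_k \log L_k(\theta; y).
\]
% Pairwise special case: each component is a bivariate marginal density.
\[
  \mathrm{CL}_{\text{pair}}(\theta; y)
    = \prod_{i=1}^{m-1} \prod_{j=i+1}^{m} f(y_i, y_j; \theta)^{w_{ij}}.
\]
```

Each component only requires a low-dimensional marginal or conditional density, which is why the approach is attractive when the full joint likelihood is hard to specify or compute.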