Journal ArticleDOI

Penalized Composite Quasi-Likelihood for Ultrahigh-Dimensional Variable Selection

TL;DR: A data-driven weighted linear combination of convex loss functions, together with a weighted L1-penalty, is proposed, and a strong oracle property is established: the proposed method has both model selection consistency and estimation efficiency for the true non-zero coefficients.
Abstract: In high-dimensional model selection problems, penalized least-squares approaches have been extensively used. This paper addresses the question of both robustness and efficiency of penalized model selection methods and proposes a data-driven weighted linear combination of convex loss functions, together with a weighted L1-penalty. It is completely data-adaptive and does not require prior knowledge of the error distribution. The weighted L1-penalty is used both to ensure the convexity of the penalty term and to ameliorate the bias caused by the L1-penalty. In the setting with dimensionality much larger than the sample size, we establish a strong oracle property of the proposed method, which possesses both model selection consistency and estimation efficiency for the true non-zero coefficients. As specific examples, we introduce a robust composite L1-L2 method and an optimal composite quantile method, and evaluate their performance in both simulated and real data examples.
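To make the composite loss concrete, the sketch below evaluates an equally weighted composite quantile loss (a weighted sum of check losses at several quantile levels) on toy data. It is an illustration only: the paper chooses the combination weights data-adaptively to optimize efficiency, and the function names (`check_loss`, `composite_quantile_loss`) and the equal weights are our own choices, not the authors'.

```python
import numpy as np

def check_loss(t, tau):
    """Quantile check loss: rho_tau(t) = t * (tau - 1{t < 0})."""
    return t * (tau - (t < 0))

def composite_quantile_loss(beta, b, X, y, taus, weights):
    """Weighted sum of check losses over quantile levels tau_k,
    sharing one slope vector beta but separate intercepts b_k."""
    r = y - X @ beta
    return sum(w * np.sum(check_loss(r - bk, tau))
               for w, bk, tau in zip(weights, b, taus))

# toy data with heavy-tailed errors, where pure least squares suffers
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))
beta_true = np.array([1.0, -2.0, 0.0])
y = X @ beta_true + rng.standard_t(df=2, size=50)

taus = [0.25, 0.5, 0.75]
b = np.quantile(y - X @ beta_true, taus)      # oracle intercepts, for the sketch only
print(composite_quantile_loss(beta_true, b, X, y, taus, weights=[1/3, 1/3, 1/3]))
```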


Citations
Journal ArticleDOI
TL;DR: In this article, a Bayesian regularized composite quantile regression (CQR) method with a group bridge penalty is adopted to conduct covariate selection and estimation in CQR.
Abstract: A Bayesian regularized composite quantile regression (CQR) method with a group bridge penalty is adopted to conduct covariate selection and estimation in CQR. The MCMC algorithm for posterior inference is improved by employing a scale mixture of normals representation of the asymmetric Laplace distribution (ALD). The suggested algorithm uses priors for the regression coefficients that are scale mixtures of multivariate uniform distributions with a particular Gamma distribution as the mixing distribution. Simulation results and analyses of real data show that the suggested MCMC sampler has excellent mixing and outperforms current approaches in terms of prediction accuracy and model selection.
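The scale-mixture construction referenced here is the standard representation of the ALD as a normal with exponential mixing, the device that makes Gibbs updates conditionally Gaussian in Bayesian quantile regression. A minimal sketch, with the scale fixed to 1 and names of our own choosing:

```python
import numpy as np

def ald_draws(n, tau, mu=0.0, rng=None):
    """Sample the asymmetric Laplace distribution (ALD) at quantile level tau
    via its scale mixture of normals: y = mu + theta*z + sqrt(kappa2*z)*u,
    with z ~ Exp(1) and u ~ N(0, 1). Conditional on z, y is Gaussian, which
    is what Gibbs samplers for Bayesian quantile regression exploit."""
    rng = rng or np.random.default_rng()
    theta = (1 - 2 * tau) / (tau * (1 - tau))
    kappa2 = 2 / (tau * (1 - tau))
    z = rng.exponential(size=n)
    u = rng.standard_normal(n)
    return mu + theta * z + np.sqrt(kappa2 * z) * u

# sanity check: by construction, P(Y <= mu) = tau
y = ald_draws(200_000, tau=0.25, rng=np.random.default_rng(1))
print(np.quantile(y, 0.25))   # close to 0
```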
Journal Article
TL;DR: A new Bayesian lasso inference scheme for variable selection in the composite quantile regression (CQR) model is proposed, constructing a hierarchical structure within the Gibbs sampler under the assumption that the residual term comes from a skew Laplace distribution.
Abstract: In this paper, we propose a new Bayesian lasso inference scheme for variable selection in the composite quantile regression (CQR) model. The suggested approach constructs a hierarchical structure within the Gibbs sampler under the assumption that the residual term comes from a skew Laplace distribution (asymmetric Laplace distribution) and assigns scale mixture of uniforms (SMU) priors to the coefficients of the composite quantile regression model. Our proposed method is compared to existing methods through simulation studies and real data examples.
01 Jan 2012
TL;DR: This thesis proposes a non-concave penalized composite likelihood estimator for sparse Ising models, with asymptotic oracle properties established under NP-dimensionality, and shows that the nonparanormal graphical model can be efficiently estimated by a unified regularized rank-based scheme that does not require estimating the unknown transformation functions.
Abstract: High-dimensional graphical models are important tools for characterizing complex interactions within a large-scale system. In this thesis, our emphasis is to utilize the increasingly popular regularization technique to learn sparse graphical models, and our focus is on two types of graphs: the Ising model for binary data and the nonparanormal graphical model for continuous data. In the first part, we propose an efficient procedure for learning a sparse Ising model based on a non-concave penalized composite likelihood, which extends the methodology and theory of non-concave penalized likelihood. An efficient solution path algorithm is devised by using a novel coordinate-minorization-ascent algorithm. Asymptotic oracle properties of our proposed estimator are established with NP-dimensionality. We demonstrate its finite sample performance via simulation studies and real applications to study the Human Immunodeficiency Virus type 1 protease structure. In the second part, we study the nonparanormal graphical model, which is much more robust than the Gaussian graphical model while retaining the good interpretability of the latter. In this thesis we show that the nonparanormal graphical model can be efficiently estimated by using a unified regularized rank estimation scheme which does not require estimating the unknown transformation functions in the nonparanormal graphical model. In particular, we study the rank-based graphical lasso, the rank-based Dantzig selector and the rank-based CLIME. We establish their theoretical properties in the setting where the dimension is nearly exponentially large relative to the sample size. It is shown that the proposed rank-based estimators work as well as their oracle counterparts in both simulated and real data.
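The rank-based idea is concrete enough to sketch: replace the sample correlation with a sine-transformed Kendall's tau, which is invariant to the unknown monotone transformations, and feed it to any Gaussian graphical estimator. The sketch below uses scikit-learn's graphical lasso; this is the generic Gaussian-copula recipe, not necessarily the thesis's exact estimators, and the data are toy.

```python
import numpy as np
from scipy.stats import kendalltau
from sklearn.covariance import graphical_lasso

def rank_correlation(X):
    """Sine-transformed Kendall's tau: for nonparanormal data, sin(pi*tau/2)
    estimates the latent Gaussian correlation without ever estimating the
    monotone marginal transformation functions."""
    p = X.shape[1]
    S = np.eye(p)
    for j in range(p):
        for k in range(j + 1, p):
            tau, _ = kendalltau(X[:, j], X[:, k])
            S[j, k] = S[k, j] = np.sin(np.pi * tau / 2)
    return S

# latent Gaussian with AR(1) correlation, observed through exp() (monotone)
rng = np.random.default_rng(2)
cov = np.array([[0.5 ** abs(i - j) for j in range(4)] for i in range(4)])
Z = rng.multivariate_normal(np.zeros(4), cov, size=500)
X = np.exp(Z)                                  # marginals are now log-normal

S = rank_correlation(X)
_, precision = graphical_lasso(S, alpha=0.05)  # rank-based graphical lasso
print(np.round(precision, 2))
```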
DissertationDOI
10 Aug 2014
TL;DR: Quantile regression for a single index model combined with a variable selection technique is applied to financial data in this thesis, and the evaluation is conducted by backtesting.
Abstract: In the financial market there are many different risk factors surrounding a specified financial firm, for example credit risk, liquidity risk and market risk, and other firms can affect this firm as well. It is therefore important to identify the relevant risk factors and to detect possible contagion effects from other firms on this specified firm. The conditional value at risk (CoVaR) can measure these risks and is applied in this paper. Quantile regression is the basic method for estimating CoVaR. Since the impact of other risk factors on the specified financial firm is often nonlinear, the single index model (SIM), as a semiparametric estimator, plays an important role. Selecting the relevant risk factors is handled by a variable selection technique. In short, quantile regression for a single index model combined with variable selection is carried out on financial data in this paper, and the evaluation is conducted by backtesting.

Cites background from "Penalized Composite Quasi-Likelihood for Ultrahigh-Dimensional Variable Selection"

  • ...4: The true link functions (black) and the estimated link functions (red) with $\beta^{*\top}_{(1)} = (5, 4, 3, 2, 1)$....

  • ...15 In the different $\beta^{*}_{(1)}$ case, three different $\beta^{*}_{(1)}$'s are given as follows: (a) $\beta^{*\top}_{(1)} = (5, 5, 5, 5, 5)$, (b) $\beta^{*\top}_{(1)} = (5, 4, 3, 2, 1)$, (c) $\beta^{*\top}_{(1)} = (5, 2, 1, 0$....

  • ...Three different $\beta^{*}_{(1)}$: $\beta^{*\top}_{(1)} = (5, 5, 5, 5, 5)$, $\beta^{*\top}_{(1)} = (5, 4, 3, 2, 1)$ and $\beta^{*\top}_{(1)} = (5, 2, 1, 0$....
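For the CoVaR application described in this dissertation, the two-step logic can be sketched with a plain linear quantile regression. The dissertation's single index model and variable selection step are omitted here; the simulated data, the 5% tail level, and the variable names are illustrative assumptions.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 2000
x_firm = rng.standard_normal(n)                 # firm return
x_sys = 0.6 * x_firm + rng.standard_normal(n)   # system return with a contagion channel

q = 0.05                                        # tail level
var_firm = np.quantile(x_firm, q)               # firm's unconditional VaR

# quantile regression of the system return on the firm return at level q
fit = sm.QuantReg(x_sys, sm.add_constant(x_firm)).fit(q=q)

# CoVaR: the system's q-quantile given the firm sits at its own VaR
covar = fit.params @ np.array([1.0, var_firm])
print(f"VaR(firm) = {var_firm:.2f}, CoVaR(system | firm at VaR) = {covar:.2f}")
```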

References
Journal ArticleDOI
TL;DR: A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant, is proposed.
Abstract: We propose a new method for estimation in linear models. The 'lasso' minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant. Because of the nature of this constraint it tends to produce some coefficients that are exactly 0 and hence gives interpretable models. Our simulation studies suggest that the lasso enjoys some of the favourable properties of both subset selection and ridge regression. It produces interpretable models like subset selection and exhibits the stability of ridge regression. There is also an interesting relationship with recent work in adaptive function estimation by Donoho and Johnstone. The lasso idea is quite general and can be applied in a variety of statistical models: extensions to generalized regression models and tree-based models are briefly described.

40,785 citations
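For readers who want to see the sparsity mechanism in action, here is a minimal sketch with scikit-learn, which solves the equivalent Lagrangian form of the constrained problem described above; the toy data and the alpha value are arbitrary choices of ours.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(4)
X = rng.standard_normal((100, 8))
y = X @ np.array([2.0, 0, 0, -1.5, 0, 0, 0, 0.8]) + 0.1 * rng.standard_normal(100)

# the L1 penalty drives weak coefficients to exactly zero
fit = Lasso(alpha=0.1).fit(X, y)
print(np.round(fit.coef_, 2))   # sparse: several entries are exactly 0
```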

Journal ArticleDOI
TL;DR: In this article, penalized likelihood approaches are proposed to handle variable selection problems, and it is shown that the newly proposed estimators perform as well as the oracle procedure in variable selection; namely, they work as well as if the correct submodel were known.
Abstract: Variable selection is fundamental to high-dimensional statistical modeling, including nonparametric regression. Many approaches in use are stepwise selection procedures, which can be computationally expensive and ignore stochastic errors in the variable selection process. In this article, penalized likelihood approaches are proposed to handle these kinds of problems. The proposed methods select variables and estimate coefficients simultaneously. Hence they enable us to construct confidence intervals for estimated parameters. The proposed approaches are distinguished from others in that the penalty functions are symmetric, nonconcave on (0, ∞), and have singularities at the origin to produce sparse solutions. Furthermore, the penalty functions should be bounded by a constant to reduce bias and satisfy certain conditions to yield continuous solutions. A new algorithm is proposed for optimizing penalized likelihood functions. The proposed ideas are widely applicable. They are readily applied to a variety of ...

8,314 citations
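The penalty family this abstract describes includes SCAD as its canonical member. A sketch of the SCAD penalty (using the a = 3.7 recommended in the paper) exhibits the three properties named above: a singularity at the origin for sparsity, a quadratic taper for bias reduction, and boundedness by a constant so large coefficients are not shrunk.

```python
import numpy as np

def scad_penalty(t, lam, a=3.7):
    """SCAD penalty of Fan & Li (2001): L1 near zero (singularity at the
    origin, hence sparse solutions), quadratic taper on (lam, a*lam]
    (reduced bias), constant beyond a*lam (no shrinkage of large effects)."""
    t = np.abs(t)
    small = t <= lam
    mid = (t > lam) & (t <= a * lam)
    return np.where(small, lam * t,
           np.where(mid, (2 * a * lam * t - t**2 - lam**2) / (2 * (a - 1)),
                    (a + 1) * lam**2 / 2))

print(scad_penalty(np.array([0.0, 0.5, 1.0, 2.0, 3.7, 50.0]), lam=1.0))
# flat at (a + 1)/2 = 2.35 for |t| >= 3.7: the penalty is bounded by a constant
```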

Journal ArticleDOI
TL;DR: A publicly available algorithm that requires only the same order of magnitude of computational effort as ordinary least squares applied to the full set of covariates is described.
Abstract: The purpose of model selection algorithms such as All Subsets, Forward Selection and Backward Elimination is to choose a linear model on the basis of the same set of data to which the model will be applied. Typically we have available a large collection of possible covariates from which we hope to select a parsimonious set for the efficient prediction of a response variable. Least Angle Regression (LARS), a new model selection algorithm, is a useful and less greedy version of traditional forward selection methods. Three main properties are derived: (1) A simple modification of the LARS algorithm implements the Lasso, an attractive version of ordinary least squares that constrains the sum of the absolute regression coefficients; the LARS modification calculates all possible Lasso estimates for a given problem, using an order of magnitude less computer time than previous methods. (2) A different LARS modification efficiently implements Forward Stagewise linear regression, another promising new model selection method; this connection explains the similar numerical results previously observed for the Lasso and Stagewise, and helps us understand the properties of both methods, which are seen as constrained versions of the simpler LARS algorithm. (3) A simple approximation for the degrees of freedom of a LARS estimate is available, from which we derive a Cp estimate of prediction error; this allows a principled choice among the range of possible LARS estimates. LARS and its variants are computationally efficient: the paper describes a publicly available algorithm that requires only the same order of magnitude of computational effort as ordinary least squares applied to the full set of covariates.

7,828 citations
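Property (1), the full Lasso path via a LARS modification, is directly available in scikit-learn; a minimal sketch on toy data of our own:

```python
import numpy as np
from sklearn.linear_model import lars_path

rng = np.random.default_rng(5)
X = rng.standard_normal((120, 10))
y = X @ np.array([3, 0, 0, 1.5, 0, 0, 0, 0, 2, 0]) + rng.standard_normal(120)

# method="lasso" is the LARS modification that traces all Lasso solutions
alphas, active, coefs = lars_path(X, y, method="lasso")
print("order in which variables enter the model:", active)
print("coefficient path shape (p, n_steps):", coefs.shape)
```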


"Penalized Composite Quasi-Likelihoo..." refers background or methods in this paper

  • ...(16) can be recast as a penalized weighted least-squares regression $$\operatorname*{argmin}_{\beta}\; \sum_{i=1}^{n} \left[ w_1 \bigl|Y_i - X_i^{\top}\hat{\beta}^{(0)}\bigr| + w_2 \bigl(Y_i - X_i^{\top}\beta\bigr)^2 \right] + n \sum_{j=1}^{p} \gamma_{\lambda}\bigl(|\hat{\beta}_j^{(0)}|\bigr)\,|\beta_j|,$$ which can be efficiently solved by pathwise coordinate optimization (Friedman et al., 2008) or least angle regression (Efron et al., 2004)....

  • ...are all nonnegative. This class of problems can be solved with fast and efficient computational algorithms such as pathwise coordinate optimization (Friedman et al., 2008) and least angle regression (Efron et al., 2004). One particular example is the combination of $L_1$ and $L_2$ regressions, in which $K = 2$, $\rho_1(t) = |t - b_0|$ and $\rho_2(t) = t^2$. Here $b_0$ denotes the median of the error distribution $\varepsilon$. If the error distribution is sym...

  • ...$\sum_{i=1}^{n} \left[ w_1 \bigl|Y_i - X_i^{\top}\hat{\beta}^{(0)}\bigr| + w_2 \bigl(Y_i - X_i^{\top}\beta\bigr)^2 \right] + n \sum_{j=1}^{p} \gamma_{\lambda}\bigl(|\hat{\beta}_j^{(0)}|\bigr)\,|\beta_j|$, which can be efficiently solved by pathwise coordinate optimization (Friedman et al., 2008) or least angle regression (Efron et al., 2004). If $b_0 \neq 0$, the penalized least-squares problem (16) is somewhat different from (5), since we have an additional parameter $b_0$. Using the same arguments, and treating $b_0$ as an additional parameter...
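The quoted composite $L_1$-$L_2$ objective with a weighted $L_1$ penalty is a convex program, so a direct way to reproduce it on toy data is a generic convex solver. The sketch below states that objective in CVXPY; it is a readable stand-in under our own assumptions (fixed loss weights, a crude least-squares pilot for the penalty weights, a scalar 0.05 standing in for the regularization level), not the pathwise coordinate or LARS implementation the paper recommends.

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(6)
n, p = 200, 12
X = rng.standard_normal((n, p))
beta_true = np.array([4.0, -3.0, 2.0] + [0.0] * (p - 3))
y = X @ beta_true + rng.standard_t(df=3, size=n)

# adaptive penalty weights gamma_j from a crude initial estimate beta0
beta0 = np.linalg.lstsq(X, y, rcond=None)[0]
gamma = 1.0 / (np.abs(beta0) + 1e-3)

w1, w2 = 0.5, 0.5                     # fixed loss weights for the sketch
beta = cp.Variable(p)
r = y - X @ beta
objective = (w1 * cp.norm1(r) + w2 * cp.sum_squares(r)
             + 0.05 * gamma @ cp.abs(beta))
cp.Problem(cp.Minimize(objective)).solve()
print(np.round(beta.value, 2))        # near-sparse estimate of beta_true
```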

Journal ArticleDOI
Hui Zou
TL;DR: A new version of the lasso is proposed, called the adaptive lasso, where adaptive weights are used for penalizing different coefficients in the ℓ1 penalty, and the nonnegative garrote is shown to be consistent for variable selection.
Abstract: The lasso is a popular technique for simultaneous estimation and variable selection. Lasso variable selection has been shown to be consistent under certain conditions. In this work we derive a necessary condition for the lasso variable selection to be consistent. Consequently, there exist certain scenarios where the lasso is inconsistent for variable selection. We then propose a new version of the lasso, called the adaptive lasso, where adaptive weights are used for penalizing different coefficients in the l1 penalty. We show that the adaptive lasso enjoys the oracle properties; namely, it performs as well as if the true underlying model were given in advance. Similar to the lasso, the adaptive lasso is shown to be near-minimax optimal. Furthermore, the adaptive lasso can be solved by the same efficient algorithm for solving the lasso. We also discuss the extension of the adaptive lasso in generalized linear models and show that the oracle properties still hold under mild regularity conditions. As a bypro...

6,765 citations
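The claim that the adaptive lasso "can be solved by the same efficient algorithm for solving the lasso" rests on a column-rescaling reduction, sketched below with scikit-learn; the pilot estimate and weight formula here are one common choice, not necessarily the paper's.

```python
import numpy as np
from sklearn.linear_model import Lasso

def adaptive_lasso(X, y, alpha=0.05):
    """Adaptive lasso via reduction to an ordinary lasso: rescale each column
    by its weight w_j, run the lasso, then map coefficients back. With
    X_j' = X_j / w_j, the penalty sum_j |b_j'| equals sum_j w_j |b_j|."""
    beta0 = np.linalg.lstsq(X, y, rcond=None)[0]   # pilot estimate
    w = 1.0 / (np.abs(beta0) + 1e-3)               # heavy penalty where pilot ~ 0
    fit = Lasso(alpha=alpha, fit_intercept=False).fit(X / w, y)
    return fit.coef_ / w

rng = np.random.default_rng(7)
X = rng.standard_normal((200, 10))
y = X @ np.array([3.0, -2.0, 0, 0, 1.0, 0, 0, 0, 0, 0]) + rng.standard_normal(200)
print(np.round(adaptive_lasso(X, y), 2))
```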

Journal ArticleDOI
TL;DR: In this article, a new approach toward a theory of robust estimation is presented, which treats in detail the asymptotic theory of estimating a location parameter for contaminated normal distributions, and exhibits estimators that are asymptotically most robust (in a sense to be specified) among all translation invariant estimators.
Abstract: This paper contains a new approach toward a theory of robust estimation; it treats in detail the asymptotic theory of estimating a location parameter for contaminated normal distributions, and exhibits estimators—intermediaries between sample mean and sample median—that are asymptotically most robust (in a sense to be specified) among all translation invariant estimators. For the general background, see Tukey (1960) (p. 448 ff.)

5,628 citations
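The estimators this abstract refers to are Huber's M-estimators of location. A compact sketch via iteratively reweighted averaging; the tuning constant k = 1.345 is the conventional 95%-efficiency choice from later practice, not a value from the 1964 paper.

```python
import numpy as np

def huber_location(x, k=1.345, tol=1e-8, max_iter=100):
    """Huber M-estimate of location by iteratively reweighted averaging.
    Points within k robust-scale units keep weight 1; points further out are
    down-weighted, interpolating between the sample mean and the median."""
    mu = np.median(x)
    scale = np.median(np.abs(x - mu)) / 0.6745     # MAD, consistent for sigma
    for _ in range(max_iter):
        r = np.abs(x - mu) / scale
        w = np.minimum(1.0, k / np.maximum(r, 1e-12))
        mu_new = np.sum(w * x) / np.sum(w)
        if abs(mu_new - mu) < tol:
            break
        mu = mu_new
    return mu

rng = np.random.default_rng(8)
x = np.concatenate([rng.standard_normal(95), np.full(5, 25.0)])  # 5% gross outliers
print(f"mean {np.mean(x):.2f} | median {np.median(x):.2f} | huber {huber_location(x):.2f}")
```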