Penalized Composite Quasi-Likelihood for Ultrahigh-Dimensional Variable Selection
Citations
8 citations
Cites background or methods from "Penalized Composite Quasi-Likelihoo..."
...Bradic et al. (2011) consider a weighted penalized CQR estimator and its oracle properties when the error distribution is unknown....
[...]
...In Bradic et al. (2011), the optimal value of the weights is ν = A⁻¹f, which achieves the lower bound (fᵀA⁻¹f)⁻¹ for the variance. Since such weights may be negative and thus lead to a nonconvex objective function that is hard to optimize, an alternative weight vector ν+ is obtained by minimizing νᵀAν subject to all weights being nonnegative and fᵀν = 1. There is no explicit expression for the nonnegative optimal weights ν+. The authors show by simulations that both types of optimal weights outperform the equally weighted estimator. Notably, Bradic et al. (2011) comment on the computational complexity of the composite quantile estimation method with a large number of quantiles, but report that usually k ≤ 10 suffices....
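Although ν+ has no closed form, for given A and f it is a small convex quadratic program and is easy to compute numerically. A minimal sketch, assuming a generic positive-definite A and positive f (placeholder inputs, not values from the paper), with SciPy's SLSQP standing in for whatever QP solver one might actually use:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical inputs for illustration only: A is a k x k positive-definite
# matrix and f a positive k-vector, as in the variance lower bound (f'A^{-1}f)^{-1}.
rng = np.random.default_rng(0)
k = 5
M = rng.normal(size=(k, k))
A = M @ M.T + np.eye(k)          # symmetric positive definite
f = np.abs(rng.normal(size=k))   # assumed positive in this sketch

# Unconstrained optimal weights nu = A^{-1} f (entries may be negative).
nu_unc = np.linalg.solve(A, f)

# Nonnegative weights nu+: minimize nu' A nu  s.t.  f' nu = 1, nu >= 0.
res = minimize(
    fun=lambda nu: nu @ A @ nu,
    x0=np.full(k, 1.0 / f.sum()),      # feasible start: f' x0 = 1
    jac=lambda nu: 2.0 * A @ nu,
    constraints=[{"type": "eq", "fun": lambda nu: f @ nu - 1.0}],
    bounds=[(0.0, None)] * k,
    method="SLSQP",
)
nu_plus = res.x
```

Because the objective is a positive-definite quadratic form over a convex feasible set, any local solution the solver returns is the global one, and its value is bounded below by 1/(fᵀA⁻¹f), the value of the unconstrained optimum after rescaling.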
[...]
...Under the above-mentioned assumptions and the assumptions of Theorem 2 of Bradic et al. (2011), for the model-averaged penalized quantile predictions it holds that √n (n⁻¹ Uₐ(XₐᵀXₐ)⁻¹Uₐᵀ)^(−1/2) {ω₁Uₐ(β̂_{a,τ₁,pen} − βₐ) + … + ω_k Uₐ(β̂_{a,τ_k,pen} − βₐ)} →d N_r(0, (ωᵀΩω)I_r), where β̂_{a,τ_l,pen} is the…...
[...]
...For high-dimensional models, only the composite estimator has been considered (Bradic et al., 2011)....
[...]
...Consider now a sparse high-dimensional linear model as in Bradic et al. (2011), Y = Xβ + ε (5), with independent and identically distributed mean-zero errors ε and with p, the number of columns of X, large relative to the sample size n, allowing for an exponential order such that log(p) = O(n^δ) with…...
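A toy instance of such a sparse model with log(p) of order n^δ can be simulated directly; the values of n, δ, and the sparsity pattern below are illustrative assumptions, not values taken from the paper:

```python
import numpy as np

# Toy instance of the sparse high-dimensional linear model Y = X beta + eps
# with p growing exponentially: here log(p) = n^delta exactly.
rng = np.random.default_rng(1)
n, delta = 100, 0.5
p = int(np.exp(n ** delta))          # n^0.5 = 10, so p = exp(10) ~ 22026 >> n
s = 5                                # sparsity: only s nonzero coefficients
beta = np.zeros(p)
beta[:s] = [3.0, 1.5, 2.0, -1.0, 0.5]
X = rng.normal(size=(n, p))
eps = rng.normal(size=n)             # i.i.d. mean-zero errors
Y = X @ beta + eps
```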
[...]
7 citations
Cites methods from "Penalized Composite Quasi-Likelihoo..."
...Following Bradic et al. (2011), we set the number of quantiles to be K = 9 and the quantile vector T = (0.1, 0.2, …, 0.9)....
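This is the equally spaced grid τ_l = l/(K+1) with K = 9. A small sketch of that grid together with an equally weighted composite check loss; the one-intercept-per-quantile parameterization is an illustrative simplification, not the cited implementation:

```python
import numpy as np

# Equally spaced quantile grid: K = 9, tau = (0.1, ..., 0.9).
K = 9
tau = np.arange(1, K + 1) / (K + 1)

def check_loss(u, t):
    """Quantile (check) loss rho_t(u) = u * (t - 1{u < 0})."""
    return u * (t - (u < 0))

def composite_quantile_loss(residuals, taus, intercepts):
    """Equally weighted composite quantile loss: one intercept b_l per
    quantile level, with the shared slope absorbed into the residuals."""
    return sum(check_loss(residuals - b, t).mean()
               for t, b in zip(taus, intercepts))
```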
[...]
7 citations
Cites background or methods from "Penalized Composite Quasi-Likelihoo..."
...The criterion for the choice of weights is to maximize the efficiency of the estimator (Bradic, Fan, and Wang, 2011)....
[...]
...Bradic, Fan, and Wang (2011) choose the weight vector by minimizing the scalar function....
[...]
...They considered the composite loss function as an approximation to the unknown log-likelihood function of the error distribution (Bradic, Fan, and Wang, 2011), while ACME considers each loss component as a model targeting a different profile of the conditional distribution....
[...]
...For completely overlapping models, Bradic, Fan, and Wang (2011) and Zou and Yuan (2008) used composite loss functions with the goal of improving efficiency of the regression parameter estimators....
[...]
...We also compared with penalized composite quasi-likelihood (PCQ) in Bradic, Fan, and Wang (2011), which was developed for a classical linear model....
[...]
References
7,828 citations
"Penalized Composite Quasi-Likelihoo..." refers background or methods in this paper
...…(16) can be recast as a penalized weighted least-squares regression argmin_β ∑ᵢ₌₁ⁿ { w₁|Yᵢ − Xᵢᵀβ − b₀| + w₂(Yᵢ − Xᵢᵀβ)² } + n ∑ⱼ₌₁ᵖ γ_λ(|β̂ⱼ⁽⁰⁾|)|βⱼ|, which can be efficiently solved by pathwise coordinate optimization (Friedman et al., 2008) or least angle regression (Efron et al., 2004)....
[...]
...) are all nonnegative. This class of problems can be solved with fast and efficient computational algorithms such as pathwise coordinate optimization (Friedman et al., 2008) and least angle regression (Efron et al., 2004). One particular example is the combination of L₁ and L₂ regressions, in which K = 2, ρ₁(t) = |t − b₀| and ρ₂(t) = t². Here b₀ denotes the median of the error distribution ε. If the error distribution is sym...
[...]
...∑ᵢ₌₁ⁿ { w₁|Yᵢ − Xᵢᵀβ − b₀| + w₂(Yᵢ − Xᵢᵀβ)² } + n ∑ⱼ₌₁ᵖ γ_λ(|β̂ⱼ⁽⁰⁾|)|βⱼ|, which can be efficiently solved by pathwise coordinate optimization (Friedman et al., 2008) or least angle regression (Efron et al., 2004). If b₀ ≠ 0, the penalized least-squares problem (16) is somewhat different from (5) since we have an additional parameter b₀. Using the same arguments, and treating b₀ as an additional parameter ...
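The combined L₁/L₂ penalized objective discussed here can be written down directly. A minimal sketch, in which gamma is a placeholder for the data-driven penalty weights γ_λ(|β̂ⱼ⁽⁰⁾|) (an assumption for illustration, not the authors' solver); with nonnegative w₁, w₂ the objective is convex in β:

```python
import numpy as np

def composite_objective(beta, X, Y, b0, w1, w2, gamma):
    """Combined L1/L2 composite loss plus weighted L1 penalty:
    sum_i [ w1 |Y_i - x_i'beta - b0| + w2 (Y_i - x_i'beta)^2 ]
      + n * sum_j gamma_j |beta_j|.
    gamma stands in for the adaptive weights gamma_lambda(|beta_j^(0)|)."""
    r = Y - X @ beta
    loss = np.sum(w1 * np.abs(r - b0) + w2 * r ** 2)
    penalty = len(Y) * np.sum(gamma * np.abs(beta))
    return loss + penalty
```

Convexity (for w₁, w₂ ≥ 0) is what makes the pathwise coordinate and least-angle strategies mentioned above applicable.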
[...]