
Showing papers in "Biometrika in 2013"


Journal ArticleDOI
Simon N. Wood
TL;DR: The problem of testing smooth components of an extended generalized additive model for equality to zero is considered, and a Wald-type test of f = 0 is proposed; it is shown that care must be taken in selecting the rank used in the test statistic.
Abstract: The problem of testing smooth components of an extended generalized additive model for equality to zero is considered. Confidence intervals for such components exhibit good across-the-function coverage probabilities if based on the approximate result \hat f(i) ~ N{f(i),V_f(i,i)}, where f is the vector of evaluated values for the smooth component of interest and V_f is the covariance matrix for f according to the Bayesian view of the smoothing process. Based on this result, a Wald-type test of f=0 is proposed. It is shown that care must be taken in selecting the rank used in the test statistic. The method complements previous work by extending applicability beyond the Gaussian case, while considering tests of zero effect rather than testing the parametric hypothesis given by the null space of the component’s smoothing penalty. The proposed p-values are routine and efficient to compute from a fitted model, without requiring extra model fits or null distribution simulation.
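For illustration only, here is a minimal sketch, under our own simplifying assumptions rather than the paper's implementation, of the kind of rank-truncated Wald statistic described: the evaluated smooth is combined with a rank-r pseudo-inverse of its Bayesian covariance matrix, and a plain chi-squared reference distribution is used even though the paper shows that the rank choice and reference distribution require more care.

```python
# Sketch of a rank-truncated Wald-type statistic for testing f = 0; the names,
# the rank handling and the chi-squared reference distribution are our own
# simplifications, not the procedure of the paper.
import numpy as np
from scipy import stats

def wald_smooth_test(fhat, Vf, r):
    """fhat: evaluated smooth (1-d array); Vf: its Bayesian covariance; r: test rank."""
    vals, vecs = np.linalg.eigh(Vf)
    idx = np.argsort(vals)[::-1][:r]             # keep the r largest eigenvalues
    Vf_rank_r_pinv = vecs[:, idx] @ np.diag(1.0 / vals[idx]) @ vecs[:, idx].T
    T = float(fhat @ Vf_rank_r_pinv @ fhat)      # quadratic form in fhat
    return T, stats.chi2.sf(T, df=r)             # simplified p-value
```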

297 citations


Journal ArticleDOI
TL;DR: In this paper, the problem of detecting associations between random vectors of any dimension is considered and a powerful test that is applicable in all dimensions and consistent against all alternatives is proposed. The test has a simple form, is easy to implement, and has good power.
Abstract: We consider the problem of detecting associations between random vectors of any dimension. Few tests of independence exist that are consistent against all dependent alternatives. We propose a powerful test that is applicable in all dimensions and consistent against all alternatives. The test has a simple form, is easy to implement, and has good power.
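The paper's own statistic is not reproduced here; as a rough stand-in, the sketch below runs a permutation test of independence between two random vectors using distance covariance, a different statistic that is also applicable in any dimension and consistent against all alternatives with finite first moments. All names and defaults are ours.

```python
import numpy as np

def dcov2(X, Y):
    """Squared (V-statistic) sample distance covariance between the rows of X and Y."""
    def centred(Z):
        D = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=-1)
        return D - D.mean(axis=0) - D.mean(axis=1)[:, None] + D.mean()
    return float((centred(X) * centred(Y)).mean())

def independence_test(X, Y, n_perm=999, seed=0):
    """Permutation p-value for independence of the rows of X and Y."""
    rng = np.random.default_rng(seed)
    obs = dcov2(X, Y)
    perm = [dcov2(X, Y[rng.permutation(len(Y))]) for _ in range(n_perm)]
    pval = (1 + sum(p >= obs for p in perm)) / (n_perm + 1)
    return obs, pval
```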

232 citations


Journal ArticleDOI
TL;DR: This work proposes an alternative to Q- and A-learning that maximizes a doubly robust augmented inverse probability weighted estimator for population mean outcome over a restricted class of regimes.
Abstract: A dynamic treatment regime is a list of sequential decision rules for assigning treatment based on a patient's history. Q- and A-learning are two main approaches for estimating the optimal regime, i.e., that yielding the most beneficial outcome in the patient population, using data from a clinical trial or observational study. Q-learning requires postulated regression models for the outcome, while A-learning involves models for that part of the outcome regression representing treatment contrasts and for treatment assignment. We propose an alternative to Q- and A-learning that maximizes a doubly robust augmented inverse probability weighted estimator for population mean outcome over a restricted class of regimes. Simulations demonstrate the method's performance and robustness to model misspecification, which is a key concern.

197 citations


Journal ArticleDOI
TL;DR: In this article, an adaptive nuclear norm penalization approach was proposed for low-rank matrix approximation, and it was used to develop a new reduced rank estimation method for high-dimensional multivariate regression.
Abstract: We propose an adaptive nuclear norm penalization approach for low-rank matrix approximation, and use it to develop a new reduced rank estimation method for high-dimensional multivariate regression. The adaptive nuclear norm is defined as the weighted sum of the singular values of the matrix, and it is generally non-convex under the natural restriction that the weight decreases with the singular value. However, we show that the proposed non-convex penalized regression method has a global optimal solution obtained from an adaptively soft-thresholded singular value decomposition. The method is computationally efficient, and the resulting solution path is continuous. The rank consistency of and prediction/estimation performance bounds for the estimator are established for a high-dimensional asymptotic regime. Simulation studies and an application in genetics demonstrate its efficacy.
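A minimal sketch of the adaptively soft-thresholded singular value decomposition mentioned in the abstract; the specific weight form w_i = 1/(s_i + eps)^gamma, which decreases with the singular value, and all parameter names are our own assumptions.

```python
import numpy as np

def adaptive_svt(Y, lam, gamma=2.0, eps=1e-8):
    """Adaptively soft-threshold the singular values of Y (assumed weight form)."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    w = 1.0 / (s + eps) ** gamma          # weights decrease as singular values grow
    s_thr = np.maximum(s - lam * w, 0.0)  # adaptive soft-thresholding
    return (U * s_thr) @ Vt               # low-rank estimate of Y
```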

194 citations


Journal ArticleDOI
TL;DR: The results of Genton et al. (2011) on the efficiency gain from triplewise over pairwise likelihood are generalized to the Brown–Resnick model, showing that the gain is substantial only for very smooth processes, which are generally unrealistic in applications.
Abstract: Genton et al. (2011) investigated the gain in efficiency when triplewise, rather than pairwise, likelihood is used to fit the popular Smith max-stable model for spatial extremes. We generalize their results to the Brown–Resnick model and show that the efficiency gain is substantial only for very smooth processes, which are generally unrealistic in applications.

158 citations


Journal ArticleDOI
TL;DR: In this article, the authors derived sufficient conditions on the prior specification that guarantee convergence to a true density at a rate that is minimax optimal for the smoothness class to which the true density belongs.
Abstract: We show that rate-adaptive multivariate density estimation can be performed using Bayesian methods based on Dirichlet mixtures of normal kernels with a prior distribution on the kernel’s covariance matrix parameter. We derive sufficient conditions on the prior specification that guarantee convergence to a true density at a rate that is minimax optimal for the smoothness class to which the true density belongs. No prior knowledge of smoothness is assumed. The sufficient conditions are shown to hold for the Dirichlet location mixture-of-normals prior with a Gaussian base measure and an inverse Wishart prior on the covariance matrix parameter.

152 citations


Journal ArticleDOI
TL;DR: The authors proposed an estimator that is more robust than doubly robust estimators, based on weighting complete cases using weights other than inverse probability when estimating the population mean of a response variable subject to ignorable missingness.
Abstract: We propose an estimator that is more robust than doubly robust estimators, based on weighting complete cases using weights other than inverse probability when estimating the population mean of a response variable subject to ignorable missingness. We allow multiple models for both the propensity score and the outcome regression. Our estimator is consistent if any of the multiple models is correctly specified. Such multiple robustness against model misspecification is a significant improvement over double robustness, which allows only one propensity score model and one outcome regression model. Our estimator attains the semiparametric efficiency bound when one propensity score model and one outcome regression model are correctly specified, without requiring knowledge of which models are correct.

147 citations


Journal ArticleDOI
TL;DR: In this article, the authors propose an alternative approach that involves linear projection of all the data points onto a lower-dimensional subspace, and demonstrate the superiority of this approach from a theoretical perspective and through simulated and real data examples.
Abstract: Gaussian processes are widely used in nonparametric regression, classification and spatiotemporal modelling, facilitated in part by a rich literature on their theoretical properties. However, one of their practical limitations is expensive computation, typically on the order of n^3, where n is the number of data points, in performing the necessary matrix inversions. For large datasets, storage and processing also lead to computational bottlenecks, and numerical stability of the estimates and predicted values degrades with increasing n. Various methods have been proposed to address these problems, including predictive processes in spatial data analysis and the subset-of-regressors technique in machine learning. The idea underlying these approaches is to use a subset of the data, but this raises questions concerning sensitivity to the choice of subset and limitations in estimating fine-scale structure in regions that are not well covered by the subset. Motivated by the literature on compressive sensing, we propose an alternative approach that involves linear projection of all the data points onto a lower-dimensional subspace. We demonstrate the superiority of this approach from a theoretical perspective and through simulated and real data examples.
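As a rough sketch of the idea, assuming a squared-exponential kernel and a Gaussian random projection (both our choices, not necessarily the paper's), the predictive mean below conditions on m projected observations Phi @ y rather than on all n data points, so only an m x m system is solved; forming the full kernel matrix here is only for brevity and would itself be avoided in a serious implementation.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / lengthscale ** 2)

def projected_gp_mean(X, y, Xstar, m, noise_var=0.1, seed=0):
    """GP predictive mean after projecting y onto m random linear combinations."""
    n = X.shape[0]
    Phi = np.random.default_rng(seed).normal(size=(m, n)) / np.sqrt(m)
    K = rbf_kernel(X, X)                              # n x n prior covariance
    A = Phi @ (K + noise_var * np.eye(n)) @ Phi.T     # m x m covariance of Phi @ y
    alpha = np.linalg.solve(A, Phi @ y)
    return rbf_kernel(Xstar, X) @ Phi.T @ alpha       # predictive mean at Xstar
```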

127 citations


Journal ArticleDOI
TL;DR: A new variable screening technique for binary classification based on the Kolmogorov–Smirnov statistic is proposed and it is proved that this so-called Kolmogorov filter enjoys the sure screening property under much weakened model assumptions.
Abstract: Variable screening techniques have been proposed to mitigate the impact of high dimensionality in classification problems, including t-test marginal screening (Fan & Fan, 2008) and maximum marginal likelihood screening (Fan & Song, 2010). However, these methods rely on strong modelling assumptions that are easily violated in real applications. To circumvent the parametric modelling assumptions, we propose a new variable screening technique for binary classification based on the Kolmogorov–Smirnov statistic. We prove that this so-called Kolmogorov filter enjoys the sure screening property under much weakened model assumptions. We supplement our theoretical study by a simulation study.
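A minimal sketch of the screening step, assuming the natural implementation of ranking features by the two-sample Kolmogorov–Smirnov statistic between the two classes and keeping the top d; the function name and the choice of d are ours.

```python
import numpy as np
from scipy.stats import ks_2samp

def kolmogorov_filter(X, y, d):
    """X: n x p feature matrix; y: 0/1 labels; returns indices of d screened features."""
    ks = np.array([ks_2samp(X[y == 0, j], X[y == 1, j]).statistic
                   for j in range(X.shape[1])])
    return np.argsort(ks)[::-1][:d]   # features with the largest KS statistics
```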

126 citations


Journal ArticleDOI
TL;DR: A covariate-adjusted precision matrix estimation method using constrained ℓ1 minimization is proposed, which can be easily implemented by linear programming and results in significant improvements in both precision matrix estimation and graphical structure selection when compared to the standard Gaussian graphical model assuming constant means.
Abstract: Motivated by analysis of genetical genomics data, we introduce a sparse high-dimensional multivariate regression model for studying conditional independence relationships among a set of genes adjusting for possible genetic effects. The precision matrix in the model specifies a covariate-adjusted Gaussian graph, which presents the conditional dependence structure of gene expression after the confounding genetic effects on gene expression are taken into account. We present a covariate-adjusted precision matrix estimation method using a constrained ℓ1 minimization, which can be easily implemented by linear programming. Asymptotic convergence rates in various matrix norms and sign consistency are established for the estimators of the regression coefficients and the precision matrix, allowing both the number of genes and the number of genetic variants to diverge. Simulation shows that the proposed method results in significant improvements in both precision matrix estimation and graphical structure selection when compared to the standard Gaussian graphical model assuming constant means. The proposed method is also applied to analyze a yeast genetical genomics dataset for the identification of the gene network among a set of genes in the mitogen-activated protein kinase pathway.

103 citations


Journal ArticleDOI
Marco Frei, Hans R. Künsch
TL;DR: This work proposes automatic choices of the parameter γ such that the update stays as close as possible to the particle filter update subject to avoiding degeneracy, and shows that this procedure leads to updates that are able to handle non-Gaussian features of the forecast sample even in high-dimensional situations.
Abstract: In many applications of Monte Carlo nonlinear filtering, the propagation step is computationally expensive, and hence, the sample size is limited. With small sample sizes, the update step becomes crucial. Particle filtering suffers from the well-known problem of sample degeneracy. Ensemble Kalman filtering avoids this, at the expense of treating non-Gaussian features of the forecast distribution incorrectly. Here we introduce a procedure which makes a continuous transition indexed by γ ∈ [0, 1] between the ensemble and the particle filter update. We propose automatic choices of the parameter γ such that the update stays as close as possible to the particle filter update subject to avoiding degeneracy. In various examples, we show that this procedure leads to updates which are able to handle non-Gaussian features of the prediction sample even in high-dimensional situations.

Journal ArticleDOI
Simon N. Wood
TL;DR: In this paper, the authors exploit the link between random effects and penalized regression to develop a simple test for a zero effect, which can be used with generalized linear mixed models, including those estimated by penalized quasilikelihood.
Abstract: Testing that random effects are zero is difficult, because the null hypothesis restricts the corresponding variance parameter to the edge of the feasible parameter space. In the context of generalized linear mixed models, this paper exploits the link between random effects and penalized regression to develop a simple test for a zero effect. The idea is to treat the variance components not being tested as fixed at their estimates and then to express the likelihood ratio as a readily computed quadratic form in the predicted values of the random effects. Under the null hypothesis this has the distribution of a weighted sum of squares of independent standard normal random variables. The test can be used with generalized linear mixed models, including those estimated by penalized quasilikelihood.

Journal ArticleDOI
TL;DR: Building on the consistency of the maximum likelihood estimate in the β-model established by Chatterjee, Diaconis and Sly (2011) as the number of vertices goes to infinity, the authors obtain its asymptotic normality by approximating the inverse of the Fisher information matrix.
Abstract: Chatterjee, Diaconis and Sly (2011) recently established the consistency of the maximum likelihood estimate in the β-model when the number of vertices goes to infinity. By approximating the inverse of the Fisher information matrix, we obtain its asymptotic normality under mild conditions. Simulation studies and a data example illustrate the theoretical results.
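For concreteness, a small sketch of maximum likelihood estimation in the β-model via the usual fixed-point iteration on the degree sequence; the starting values, the stopping rule and the assumption that all degrees are positive are our own simplifications.

```python
import numpy as np

def beta_model_mle(A, max_iter=500, tol=1e-8):
    """A: symmetric 0/1 adjacency matrix, zero diagonal, all degrees positive."""
    d = A.sum(axis=1).astype(float)               # observed degrees
    beta = np.log(d) - np.log(len(d))             # crude starting values
    for _ in range(max_iter):
        denom = 1.0 + np.exp(beta[:, None] + beta[None, :])
        S = np.exp(beta)[None, :] / denom         # S[i, j] = exp(beta_j) / (1 + exp(beta_i + beta_j))
        np.fill_diagonal(S, 0.0)                  # exclude j = i
        new_beta = np.log(d) - np.log(S.sum(axis=1))
        if np.max(np.abs(new_beta - beta)) < tol:
            return new_beta
        beta = new_beta
    return beta
```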

Journal ArticleDOI
TL;DR: In this article, asymptotic results for a Gaussian process over a fixed domain with Matérn covariance function are extended to the case of jointly estimating the range and the variance of the process.
Abstract: Two canonical problems in geostatistics are estimating the parameters in a specified family of stochastic process models and predicting the process at new locations. We show that asymptotic results for a Gaussian process over a fixed domain with Matérn covariance function, previously proven only in the case of a fixed range parameter, can be extended to the case of jointly estimating the range and the variance of the process. Moreover, we show that intuition and approximations derived from asymptotics using a fixed range parameter can be problematic when applied to finite samples, even for large sample sizes. In contrast, we show via simulation that performance is improved and asymptotic approximations are applicable for smaller sample sizes when the parameters are jointly estimated. These effects are particularly apparent when the process is mean square differentiable or the effective range of spatial correlation is small.

Journal ArticleDOI
TL;DR: A framework for conditional simulation of max-stable processes is proposed, with closed forms for Brown–Resnick and Schlather processes; the method is tested on simulated data and applied to extreme rainfall around Zurich and extreme temperatures in Switzerland.
Abstract: Since many environmental processes such as heat waves or precipitation are spatial in extent, it is likely that a single extreme event affects several locations, so areal modelling of extremes is essential if their spatial dependence is to be appropriately taken into account. This paper proposes a framework for conditional simulations of max-stable processes and gives closed forms for Brown–Resnick and Schlather processes. We test the method on simulated data and give an application to extreme rainfall around Zurich and extreme temperature in Switzerland. Results show that the proposed framework provides accurate conditional simulations and can handle real-sized problems.

Journal ArticleDOI
TL;DR: A novel framework is proposed that encompasses most of the Gaussian Markov random field-based multivariate disease mapping models in the literature, and allows comparison of all these models in a common context, thus helping to understand them better.
Abstract: This paper deals with multivariate disease mapping. We propose a novel framework that encompasses most of the models already proposed. Our framework starts with a simple identity, reformulating Kronecker products of covariance matrices as simple matrix products. This formula is computationally convenient, and its generalizations reproduce most of the proposals in the disease mapping literature. Use of the identity leads to a flexible, general and computationally convenient modelling framework, making it possible to combine spatial dependence structures and different relationships between diseases with limited effort. Moreover, as the proposed modelling framework covers most of the Gaussian Markov random field-based multivariate disease mapping models in the literature, it allows comparison of all these models in a common context, thus helping us to understand them better.
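The paper's exact identity is not reproduced here, but the flavour of rewriting a Kronecker-structured covariance as ordinary matrix products can be seen in the standard vec identity (A kron B) vec(X) = vec(B X A^T), checked numerically below; treating this as the identity used in the paper is an assumption on our part.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(3, 3))          # e.g. a between-disease covariance
B = rng.normal(size=(4, 4))          # e.g. a spatial covariance
X = rng.normal(size=(4, 3))

lhs = np.kron(A, B) @ X.flatten(order="F")   # vec(X) stacks columns
rhs = (B @ X @ A.T).flatten(order="F")       # vec(B X A^T)
print(np.allclose(lhs, rhs))                 # True
```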

Journal ArticleDOI
TL;DR: A method is proposed for estimating from data a weighting function that yields consistent estimators of population means using incomplete data and covariates measured with error; a simulation study shows that the resulting estimator is consistent, with no bias and small variance.
Abstract: Inverse probability-weighted estimators are widely used in applications where data are missing due to nonresponse or censoring and in the estimation of causal effects from observational studies. Current estimators rely on ignorability assumptions for response indicators or treatment assignment and outcomes being conditional on observed covariates which are assumed to be measured without error. However, measurement error is common for the variables collected in many applications. For example, in studies of educational interventions, student achievement as measured by standardized tests is almost always used as the key covariate for removing hidden biases, but standardized test scores may have substantial measurement errors. We provide several expressions for a weighting function that can yield a consistent estimator for population means using incomplete data and covariates measured with error. We propose a method to estimate the weighting function from data. The results of a simulation study show that the estimator is consistent and has no bias and small variance.

Journal ArticleDOI
TL;DR: A proportion adaptive segment selection procedure that automatically adjusts to the unknown proportions of the carriers of the segment variants is developed and applied to analyze neuroblastoma samples, identifying a large number of copy number variants that are missed by single-sample methods.
Abstract: Copy number variants are an important type of genetic structural variation appearing in germline DNA, ranging from common to rare in a population. Both rare and common copy number variants have been reported to be associated with complex diseases, so it is important to identify both simultaneously based on a large set of population samples. We develop a proportion adaptive segment selection procedure that automatically adjusts to the unknown proportions of the carriers of the segment variants. We characterize the detection boundary that separates the region where a segment variant is detectable by some method from the region where it cannot be detected. Although the detection boundaries are very different for the rare and common segment variants, it is shown that the proposed procedure can reliably identify both whenever they are detectable. Compared with methods for single-sample analysis, this procedure gains power by pooling information from multiple samples. The method is applied to analyze neuroblastoma samples and identifies a large number of copy number variants that are missed by single-sample methods.

Journal ArticleDOI
TL;DR: In this article, a class of flexible functional nonlinear regression models is introduced, in which random predictor curves are coupled with scalar responses; predictions obtained from fitting these models are shown to be consistent and asymptotically normal.
Abstract: SUMMARY We introduce continuously additive models, which can be viewed as extensions of additive regression models with vector predictors to the case of infinite-dimensional predictors. This approach produces a class of flexible functional nonlinear regression models, where random predictor curves are coupled with scalar responses. In continuously additive modelling, integrals taken over a smooth surface along graphs of predictor functions relate the predictors to the responses in a nonlinear fashion. We use tensor product basis expansions to fit the smooth regression surface that characterizes the model. In a theoretical investigation, we show that the predictions obtained from fitting continuously additive models are consistent and asymptotically normal. We also consider extensions to generalized responses. The proposed class of models outperforms existing functional regression models in simulations and real-data examples.

Journal ArticleDOI
TL;DR: A strong orthogonal array of strength t enjoys better space-filling properties than a comparable orthogonal array in all dimensions lower than t, while retaining the space-filling properties of the latter in t dimensions.
Abstract: This paper introduces, constructs and studies a new class of arrays, called strong orthogonal arrays, as suitable designs for computer experiments. A strong orthogonal array of strength t enjoys better space-filling properties than a comparable orthogonal array in all dimensions lower than t while retaining the space-filling properties of the latter in t dimensions. Latin hypercubes based on strong orthogonal arrays of strength t are more space-filling than comparable orthogonal array-based Latin hypercubes in all g dimensions for any 2 ≤ g ≤ t - 1.

Journal ArticleDOI
TL;DR: In this article, the authors investigate the asymptotic behavior of posterior distributions of regression coefficients in high-dimensional linear models as the number of dimensions grows with the number of observations, and show that the posterior distribution concentrates in neighbourhoods of the true parameter under simple sufficient conditions.
Abstract: We investigate the asymptotic behaviour of posterior distributions of regression coefficients in high-dimensional linear models as the number of dimensions grows with the number of observations. We show that the posterior distribution concentrates in neighbourhoods of the true parameter under simple sufficient conditions. These conditions hold under popular shrinkage priors given some sparsity assumptions. Some key words: Bayesian lasso; Generalized double Pareto prior; Heavy tail; High-dimensional data; Horseshoe prior; Posterior consistency; Shrinkage estimation.

Journal ArticleDOI
TL;DR: In this article, the authors make two contributions to the computational geometry of decomposable graphs, aimed primarily at facilitating statistical inference about such graphs where they arise as assumed conditional independence structures in stochastic models.
Abstract: This paper makes two contributions to the computational geometry of decomposable graphs, aimed primarily at facilitating statistical inference about such graphs where they arise as assumed conditional independence structures in stochastic models. The first of these provides sufficient conditions under which it is possible to completely connect two disconnected cliques of vertices, or perform the reverse procedure, yet maintain decomposability of the graph. The second is a new Markov chain Monte Carlo sampler for arbitrary positive distributions on decomposable graphs, taking a junction tree representing the graph as its state variable. The resulting methodology is illustrated with two numerical experiments.

Journal ArticleDOI
TL;DR: Saddlepoint approximations are derived for the normalizing constants of multicomponent Fisher–Bingham distributions; each is essentially a multivariate saddlepoint density approximation for the joint distribution of a set of quadratic forms in normal variables, and in the challenging high-dimensional settings considered in this paper the approximations perform very well in all examples.
Abstract: In an earlier paper Kume & Wood (2005) showed how the normalizing constant of the Fisher–Bingham distribution on a sphere can be approximated with high accuracy using a univariate saddlepoint density approximation. In this sequel, we extend the approach to a more general setting and derive saddlepoint approximations for the normalizing constants of multicomponent Fisher–Bingham distributions on Cartesian products of spheres, and Fisher–Bingham distributions on Stiefel manifolds. In each case, the approximation for the normalizing constant is essentially a multivariate saddlepoint density approximation for the joint distribution of a set of quadratic forms in normal variables. Both first-order and second-order saddlepoint approximations are considered. Computational algorithms, numerical results and theoretical properties of the approximations are presented. In the challenging high-dimensional settings considered in this paper the saddlepoint approximations perform very well in all examples considered. Some key words: Directional data; Fisher matrix distribution; Kent distribution; Orientation statistics.

Journal ArticleDOI
TL;DR: This paper proposes an estimating equation approach with a new weight function, and establishes the consistency and asymptotic normality of the resulting estimator.
Abstract: The case-cohort study design, used to reduce costs in large cohort studies, is a random sample of the entire cohort, named the subcohort, augmented with subjects having the disease of interest but not in the subcohort sample. When several diseases are of interest, several case-cohort studies may be conducted using the same subcohort, with each disease analyzed separately, ignoring the additional exposure measurements collected on subjects with the other diseases. This is not an efficient use of the data, and in this paper, we propose more efficient estimators. We consider both joint and separate analyses for the multiple diseases. We propose an estimating equation approach with a new weight function, and we establish the consistency and asymptotic normality of the resulting estimator. Simulation studies show that the proposed methods using all available information gain efficiency. We apply our proposed method to the data from the Busselton Health Study.

Journal ArticleDOI
TL;DR: It is shown that under certain monotonicity assumptions about its effect on the treatment and on the outcome, an effect measure controlling for the mismeasured confounder will fall between its corresponding crude and the true effect measures.
Abstract: Suppose we are interested in the effect of a binary treatment on an outcome where that relationship is confounded by an ordinal confounder. We assume that the true confounder is not observed, rather we observe a nondifferentially mismeasured version of it. We show that under certain monotonicity assumptions about its effect on the treatment and on the outcome, an effect measure controlling for the mismeasured confounder will fall between its corresponding crude and the true effect measures. We present results for coarsened, and, under further assumptions, for multiple misclassified confounders.

Journal ArticleDOI
TL;DR: In this paper, the authors use the conditional bias associated with a sample unit to derive robust estimators that are obtained by downweighting the most influential sample units, and show that the resulting estimator is consistent.
Abstract: We argue that the conditional bias associated with a sample unit can be a useful measure of influence in finite population sampling. We use the conditional bias to derive robust estimators that are obtained by downweighting the most influential sample units. Under the model-based approach to inference, our proposed robust estimator is closely related to the well-known estimator of Chambers (1986). Under the design-based approach, it possesses the desirable feature of being applicable with most sampling designs used in practice. For stratified simple random sampling, it is essentially equivalent to the estimator of Kokic & Bell (1994). The proposed robust estimator depends on a tuning constant. In this paper, we propose a method for determining the tuning constant and show that the resulting estimator is consistent. Results from a simulation study suggest that our approach improves the efficiency of standard nonrobust estimators when the population contains units that may be influential if selected in the sample.

Journal ArticleDOI
TL;DR: In this paper, the authors use the theory of normal variance-mean mixtures to derive a data augmentation scheme that unifies a wide class of statistical models under a single framework.
Abstract: We use the theory of normal variance-mean mixtures to derive a data-augmentation scheme that unifies a wide class of statistical models under a single framework. This generalizes existing theory on normal variance mixtures for priors in regression and classification. It also allows variants of the expectation-maximization algorithm to be brought to bear on a much wider range of models than previously appreciated. We demonstrate the resulting gains in accuracy and stability on several examples, including sparse quantile regression and binary logistic regression.

Journal ArticleDOI
TL;DR: In this article, a penalized likelihood approach to nonparametric multivariate spectral analysis through the minimization of the penalized Whittle negative log-likelihood is proposed, which allows for varying levels of smoothness among spectral components while accounting for the positive definiteness of spectral matrices.
Abstract: Nonparametric estimation procedures that can flexibly account for varying levels of smoothness among different functional parameters, such as penalized likelihoods, have been developed in a variety of settings. However, geometric constraints on power spectra have limited the development of such methods when estimating the power spectrum of a vector-valued time series. This article introduces a penalized likelihood approach to nonparametric multivariate spectral analysis through the minimization of a penalized Whittle negative log-likelihood. This likelihood is derived from the large-sample distribution of the periodogram and includes a penalty function that forms a measure of regularity on multivariate power spectra. The approach allows for varying levels of smoothness among spectral components while accounting for the positive definiteness of spectral matrices and the Hermitian and periodic structures of power spectra as functions of frequency. The consistency of the proposed estimator is derived and its empirical performance is demonstrated in a simulation study and in an analysis of indoor air quality.

Journal ArticleDOI
TL;DR: In this article, the authors apply concepts from partial identification to the domain of finite population sampling and propose a method for interval estimation of a population mean when the probabilities of sample selection lie within a posited interval.
Abstract: Applying concepts from partial identification to the domain of finite population sampling, we propose a method for interval estimation of a population mean when the probabilities of sample selection lie within a posited interval. The interval estimate is derived from sharp bounds on the Hajek (1971) estimator of the population mean. We demonstrate the method's utility for sensitivity analysis by applying it to a sample of needles collected as part of a syringe tracking and testing programme in New Haven, Connecticut.
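A minimal sketch, not the authors' implementation, of computing bounds on a Hajek-type weighted mean when each selection probability is only known to lie in an interval [p_lo, p_hi]: since sum(w*y)/sum(w) is at least t exactly when sum(w*(y - t)) is nonnegative, the largest and smallest attainable values over the box of admissible weights can be found by bisection on t. All names and the stopping rule are our assumptions.

```python
import numpy as np

def hajek_bounds(y, p_lo, p_hi, iters=100):
    """Bounds on sum(w*y)/sum(w) with each w_i in [1/p_hi_i, 1/p_lo_i]."""
    y = np.asarray(y, dtype=float)
    a, b = 1.0 / np.asarray(p_hi, float), 1.0 / np.asarray(p_lo, float)

    def extreme(t, maximise):
        # max (or min) of sum_i w_i * (y_i - t) over the box of admissible weights
        take_upper = (y > t) if maximise else (y < t)
        w = np.where(take_upper, b, a)
        return np.sum(w * (y - t))

    def solve(maximise):
        lo, hi = float(y.min()), float(y.max())
        for _ in range(iters):                 # bisection on the bound value t
            mid = 0.5 * (lo + hi)
            if extreme(mid, maximise) > 0:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    return solve(False), solve(True)           # (lower bound, upper bound)
```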

Journal ArticleDOI
TL;DR: In this article, two ways of modifying the inverse probability weights to improve efficiency while retaining consistency are investigated, and the robustness of the estimators to misspecification of the auxiliary weight model or the regression model of interest is discussed.
Abstract: Sampling related to the outcome variable of a regression analysis conditional on covariates is called informative sampling and may lead to bias in ordinary least squares estimation. Weighting by the reciprocal of the inclusion probability approximately removes such bias but may inflate variance. This paper investigates two ways of modifying such weights to improve efficiency while retaining consistency. One approach is to multiply the inverse probability weights by functions of the covariates. The second is to smooth the weights given values of the outcome variable and covariates. Optimal ways of constructing weights by these two approaches are explored. Both approaches require the fitting of auxiliary weight models. The asymptotic properties of the resulting estimators are investigated and linearization variance estimators are obtained. The approach is extended to pseudo maximum likelihood estimation for generalized linear models. The properties of the different weighted estimators are compared in a limited simulation study. The robustness of the estimators to misspecification of the auxiliary weight model or of the regression model of interest is discussed.