
Showing papers in "Biometrika in 2004"


Journal ArticleDOI
TL;DR: The asymptotic properties of formal maximum likelihood estimators in applications in which only a single q × 1 vector of observations is observed are examined, and conditions under which consistent estimators of parameters result from the approximate likelihood using only pairwise joint distributions are studied.
Abstract: For likelihood-based inference involving distributions in which high-dimensional dependencies are present it may be useful to use approximate likelihoods based, for example, on the univariate or bivariate marginal distributions. The asymptotic properties of formal maximum likelihood estimators in such cases are outlined. In particular, applications in which only a single q × 1 vector of observations is observed are examined. Conditions under which consistent estimators of parameters result from the approximate likelihood using only pairwise joint distributions are studied. Some examples are analysed in detail.

448 citations
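
As a rough, hypothetical illustration of the pairwise-likelihood idea summarised above, the sketch below maximises a sum of bivariate normal log-densities over all pairs of components of a single q × 1 observation from an exchangeable Gaussian model. The model choice, the value of `rho_true` and the function names are assumptions for the demonstration, not the paper's example; whether such an estimator is consistent as q grows is exactly the kind of question the paper studies.

```python
# Illustrative sketch (not the paper's code): maximum pairwise-likelihood
# estimation of the common correlation rho in an exchangeable Gaussian model,
# using a single q x 1 observation vector y with known unit variances.
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)
q, rho_true = 50, 0.4
Sigma = rho_true * np.ones((q, q)) + (1 - rho_true) * np.eye(q)
y = rng.multivariate_normal(np.zeros(q), Sigma)   # one q x 1 observation

def neg_pairwise_loglik(rho):
    """Sum of bivariate normal log-densities over all pairs (i < j)."""
    nll = 0.0
    cov = np.array([[1.0, rho], [rho, 1.0]])
    for i in range(q):
        for j in range(i + 1, q):
            nll -= multivariate_normal.logpdf(y[[i, j]], mean=[0, 0], cov=cov)
    return nll

fit = minimize_scalar(neg_pairwise_loglik, bounds=(-0.99, 0.99), method="bounded")
print("pairwise-likelihood estimate of rho:", round(fit.x, 3))
```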


Journal ArticleDOI
TL;DR: The cube method as discussed by the authors selects approximately balanced samples with equal or unequal inclusion probabilities and any number of auxiliary variables; the variances of estimators of the totals of the variables of interest are reduced, depending on the correlations of these variables with the controlled auxiliary variables.
Abstract: A balanced sampling design is defined by the property that the Horvitz-Thompson estimators of the population totals of a set of auxiliary variables equal the known totals of these variables. Therefore the variances of estimators of totals of all the variables of interest are reduced, depending on the correlations of these variables with the controlled variables. In this paper, we develop a general method, called the cube method, for selecting approximately balanced samples with equal or unequal inclusion probabilities and any number of auxiliary variables.

242 citations


Journal ArticleDOI
TL;DR: In this article, a covariance selection model is defined in terms of the Markov properties, i.e. conditional independences associated with G, which in turn are equivalent to specified zeros among the set of pairwise partial correlation coefficients.
Abstract: A multivariate Gaussian graphical Markov model for an undirected graph G, also called a covariance selection model or concentration graph model, is defined in terms of the Markov properties, i.e. conditional independences associated with G, which in turn are equivalent to specified zeros among the set of pairwise partial correlation coefficients. By means of Fisher's z-transformation and Šidák's correlation inequality, conservative simultaneous confidence intervals for the entire set of partial correlations can be obtained, leading to a simple method for model selection that controls the overall error rate for incorrect edge inclusion. The simultaneous p-values corresponding to the partial correlations are partitioned into three disjoint sets, a significant set S, an indeterminate set I and a nonsignificant set N. Our model selection method selects two graphs, a graph G_{S∪I} whose edges correspond to the set S ∪ I, and a more conservative graph G_S whose edges correspond to S only. Similar considerations apply to covariance graph models, which are defined in terms of marginal independence rather than conditional independence. The method is applied to some well-known examples and to simulated data.

213 citations
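
The Fisher z / Šidák step described in this abstract can be sketched as below. The sample partial correlations are taken from the inverse sample covariance matrix, and the standard error on the z-scale is the usual large-sample approximation; the data, function names and tolerance choices are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (assumed details): simultaneous Sidak-adjusted confidence
# intervals for all pairwise partial correlations, obtained via Fisher's
# z-transformation; an edge is excluded whenever its interval covers zero.
import numpy as np
from scipy.stats import norm

def partial_correlations(X):
    """Partial correlations from the inverse of the sample covariance matrix."""
    K = np.linalg.inv(np.cov(X, rowvar=False))           # concentration matrix
    d = np.sqrt(np.diag(K))
    return -K / np.outer(d, d) + 2 * np.eye(K.shape[1])  # diagonal set to 1

def simultaneous_intervals(X, alpha=0.05):
    n, p = X.shape
    R = partial_correlations(X)
    m = p * (p - 1) // 2                                  # number of pairs
    alpha_sidak = 1 - (1 - alpha) ** (1 / m)              # Sidak adjustment
    z = norm.ppf(1 - alpha_sidak / 2)
    se = 1 / np.sqrt(n - p - 1)                           # approx. s.e. on z-scale
    intervals = {}
    for i in range(p):
        for j in range(i + 1, p):
            zr = np.arctanh(R[i, j])                      # Fisher z-transform
            intervals[(i, j)] = (np.tanh(zr - z * se), np.tanh(zr + z * se))
    return intervals

X = np.random.default_rng(1).normal(size=(200, 5))
for pair, ci in simultaneous_intervals(X).items():
    print(pair, np.round(ci, 3))
```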


Journal ArticleDOI
TL;DR: In this paper, the authors extend the idea of cross-validation to choose the smoothing parameters of the double-kernel local linear regression for estimating a conditional density, which optimises the estimated conditional density function by minimising the integrated squared error.
Abstract: SUMMARY We extend the idea of cross-validation to choose the smoothing parameters of the 'double-kernel' local linear regression for estimating a conditional density. Our selection rule optimises the estimated conditional density function by minimising the integrated squared error. We also discuss three other bandwidth selection rules, an ad hoc method used by Fan et al. (1996), a bootstrap method of Hall et al. (1999) for bandwidth selection in the estimation of conditional distribution functions, modified by Bashtannyk & Hyndman (2001) to cover conditional density functions, and finally a simple approach proposed by Hyndman & Yao (2002). The performance of the new approach is compared with these three methods by simulation studies, and our method performs outstandingly well. The method is illustrated by an application to estimating the transition density and the Value-at-Risk of treasury-bill data.

146 citations


Journal ArticleDOI
TL;DR: Fractional hot deck imputation as discussed by the authors replaces each missing observation with a set of imputed values and assigns a weight to each imputed value; a consistent replication variance estimation procedure is proposed for estimators computed with fractional imputation.
Abstract: To compensate for item nonresponse, hot deck imputation procedures replace missing values with values that occur in the sample. Fractional hot deck imputation replaces each missing observation with a set of imputed values and assigns a weight to each imputed value. Under the model in which observations in an imputation cell are independently and identically distributed, fractional hot deck imputation is shown to be an effective imputation procedure. A consistent replication variance estimation procedure for estimators computed with fractional imputation is suggested. Simulations show that fractional imputation and the suggested variance estimator are superior to multiple imputation estimators in general, and much superior to multiple imputation for estimating the variance of a domain mean.

143 citations
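
A hedged toy version of fractional hot deck imputation within a single imputation cell is sketched below: each missing item receives M donor values, each carrying a fractional weight of 1/M, and the cell mean is computed from the weighted completed data. The cell structure, the choice of M and all variable names are assumptions, and the paper's replication variance estimator is not reproduced here.

```python
# Hedged illustration (details assumed, not from the paper): fractional hot deck
# imputation within one imputation cell.
import numpy as np

rng = np.random.default_rng(2)
y = rng.normal(10, 2, size=30)
missing = rng.random(30) < 0.3                  # item nonresponse indicator
donors = y[~missing]

M = 5                                           # number of fractional imputations
rows = []                                       # (value, weight) pairs
for i, yi in enumerate(y):
    if missing[i]:
        for d in rng.choice(donors, size=M, replace=False):
            rows.append((d, 1.0 / M))           # fractional weight 1/M per donor
    else:
        rows.append((yi, 1.0))

values, weights = map(np.array, zip(*rows))
print("fractionally imputed cell mean:", np.average(values, weights=weights))
print("respondent-only mean:          ", donors.mean())
```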


Journal ArticleDOI
TL;DR: In this paper, a general class of semiparametric transformation cure models is studied for the analysis of survival data with long-term survivors; the resulting estimators are asymptotically normal, with a variance-covariance matrix that has a closed form and can be consistently estimated by the usual plug-in method.
Abstract: A general class of semiparametric transformation cure models is studied for the analysis of survival data with long-term survivors. It combines a logistic regression for the probability of event occurrence with the class of transformation models for the time of occurrence. Included as special cases are the proportional hazards cure model ( Farewell, 1982; Kuk & Chen, 1992; Sy & Taylor, 2000; Peng & Dear, 2000) and the proportional odds cure model. Generalised estimating equations are proposed for parameter estimation. It is shown that the resulting estimators are asymptotically normal, with variance-covariance matrix that has a closed form and can be consistently estimated by the usual plug-in method. Simulation studies show that the proposed approach is appropriate for practical use. An application to data from a breast cancer study is given to illustrate the methodology.

143 citations


Journal ArticleDOI
TL;DR: In this paper, a Bayesian information criterion is proposed to evaluate models estimated by the maximum penalised likelihood method or the method of regularisation; the criterion is applied to the choice of smoothing parameters and the number of basis functions in radial basis function network models.
Abstract: SUMMARY By extending Schwarz's (1978) basic idea we derive a Bayesian information criterion which enables us to evaluate models estimated by the maximum penalised likelihood method or the method of regularisation. The proposed criterion is applied to the choice of smoothing parameters and the number of basis functions in radial basis function network models. Monte Carlo experiments were conducted to examine the performance of the nonlinear modelling strategy of estimating the weight parameters by regularisation and then determining the adjusted parameters by the Bayesian information criterion. The simulation results show that our modelling procedure performs well in various situations.

140 citations
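
The paper derives a specific criterion for penalised likelihood; purely as an illustrative stand-in, the sketch below scores regularised radial basis function network fits with a generic BIC-type quantity whose dimension term is the trace of the smoother matrix (effective degrees of freedom). All tuning details here are assumptions and not the authors' criterion.

```python
# Sketch only: generic BIC-type selection of the ridge penalty and the number
# of Gaussian radial basis functions, as a stand-in for the paper's criterion.
import numpy as np

rng = np.random.default_rng(3)
x = np.sort(rng.uniform(0, 1, 100))
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, 100)

def rbf_design(x, m, width=0.1):
    centres = np.linspace(0, 1, m)
    return np.exp(-(x[:, None] - centres[None, :]) ** 2 / (2 * width ** 2))

def bic_like(x, y, m, lam):
    B = rbf_design(x, m)
    n = len(y)
    S = B @ np.linalg.solve(B.T @ B + lam * np.eye(m), B.T)   # smoother matrix
    resid = y - S @ y
    df = np.trace(S)                                          # effective d.f.
    sigma2 = resid @ resid / n
    return n * np.log(sigma2) + np.log(n) * df                # BIC-type score

grid = [(m, lam) for m in (5, 10, 20) for lam in (1e-4, 1e-2, 1e0)]
best = min(grid, key=lambda p: bic_like(x, y, *p))
print("selected (basis functions, penalty):", best)
```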


Journal ArticleDOI
TL;DR: In this article, a quantitative measure, design sensitivity, is proposed for measuring the contribution such strategies make in distinguishing causal effects from hidden biases, and several common strategies are then evaluated in terms of their contribution to design sensitivity.
Abstract: SUMMARY Outside the field of statistics, the literature on observational studies offers advice about research designs or strategies for judging whether or not an association is causal, such as multiple operationalism or a dose-response relationship. These useful suggestions are typically informal and qualitative. A quantitative measure, design sensitivity, is proposed for measuring the contribution such strategies make in distinguishing causal effects from hidden biases. Several common strategies are then evaluated in terms of their contribution to design sensitivity. A related method for computing the power of a sensitivity analysis is also developed.

136 citations


Journal ArticleDOI
TL;DR: In this article, the authors pointed out a problem in the multi-move sampler as proposed by Shephard & Pitt (1997) and provided an alternative correct formulation for this problem.
Abstract: This note points out a problem in the multi-move sampler as proposed by Shephard & Pitt (1997) and provides an alternative correct formulation.

134 citations


Journal ArticleDOI
TL;DR: In this article, the bias of the parameter estimator is shown to be of order h^3 when a symmetric kernel is used, where h is the bandwidth, and the variance is of order n^{-1} and efficient in the semiparametric sense.
Abstract: Motivated by two practical problems, we propose a new procedure for estimating a semivarying-coefficient model. Asymptotic properties are established which show that the bias of the parameter estimator is of order h^3 when a symmetric kernel is used, where h is the bandwidth, and the variance is of order n^{-1} and efficient in the semiparametric sense. Undersmoothing is unnecessary for the root-n consistency of the estimators. Therefore, commonly used bandwidth selection methods can be employed. A model selection method is also developed. Simulations demonstrate how the proposed method works. Some insights are obtained into the two motivating problems by using the proposed models.

133 citations


Journal ArticleDOI
TL;DR: In this article, the authors consider the estimation of the effect of the received treatment in randomised clinical trials with non-compliance and a dichotomous outcome using structural mean models.
Abstract: In this paper we consider the estimation of the effect of received treatment in randomised clinical trials with non-compliance and a dichotomous outcome using structural mean models. We allow both for the assigned and received treatments to be continuous, categorical or ordinal and for the possibility that the assigned treatment has a direct effect on the outcome through pathways other than the received treatment. We also consider the application of our results to observational studies. The parameters of a structural mean model measure, on an appropriate scale, how the effect of the received treatment on the treated population varies across levels of pre-treatment covariates. Thus, these models are useful for assessing whether or not the effect of received treatment is modified by baseline covariates. Robins (1989, 1994) showed that, when the randomisation probabilities are known, both additive and multiplicative structural mean models that respectively impose a linear

Journal ArticleDOI
TL;DR: This work proposes prior probability models for variance-covariance matrices that allow a researcher to represent substantive prior information about the strength of correlations among a set of variables, and discusses appropriate posterior simulation schemes to implement posterior inference in the proposed models.
Abstract: We propose prior probability models for variance-covariance matrices in order to address two important issues. First, the models allow a researcher to represent substantive prior information about the strength of correlations among a set of variables. Secondly, even in the absence of such information, the increased flexibility of the models mitigates dependence on strict parametric assumptions in standard prior models. For example, the model allows a posteriori different levels of uncertainty about correlations among different subsets of variables. We achieve this by including a clustering mechanism in the prior probability model. Clustering is with respect to variables and pairs of variables. Our approach leads to shrinkage towards a mixture structure implied by the clustering. We discuss appropriate posterior simulation schemes to implement posterior inference in the proposed models, including the evaluation of normalising constants that are functions of parameters of interest. The normalising constants result from the restriction that the correlation matrix be positive definite. We discuss examples based on simulated data, a stock return dataset and a population genetics dataset.

Journal ArticleDOI
TL;DR: In this article, the authors show that smoothing spline estimators are asymptotically equivalent to a seemingly unrelated kernel estimator for any working covariance matrix and that both estimators can be obtained iteratively by applying conventional kernel or spline smoothing to pseudo-observations.
Abstract: correlation. We show that a smoothing spline estimator is asymptotically equivalent to a recently proposed seemingly unrelated kernel estimator of Wang (2003) for any working covariance matrix. We show that both estimators can be obtained iteratively by applying conventional kernel or spline smoothing to pseudo-observations. This result allows us to study the asymptotic properties of the smoothing spline estimator by deriving its asymptotic bias and variance. We show that smoothing splines are consistent for an arbitrary working covariance and have the smallest variance when assuming the true covariance. We further show that both the seemingly unrelated kernel estimator and the smoothing spline estimator are nonlocal unless working independence is assumed but have asymptotically negligible bias. Their finite sample performance is compared through simulations. Our results justify the use of efficient, non-local estimators such as smoothing splines for clustered/longitudinal data.

Journal ArticleDOI
TL;DR: In this article, a paradox in which knowing the nuisance parameters can reduce the efficiency of estimating the parameters of interest is elucidated via the projected estimating function, defined as the projection of the score function on to a given estimating function, and a numerical assessment is conducted to investigate the improvement of the asymptotic efficiency of estimators.
Abstract: SUMMARY This paper is concerned with a paradox associated with parameter estimation in the presence of nuisance parameters. In a statistical model with unknown nuisance parameters, the efficiency of an estimator of a parameter usually increases when the nuisance parameters are known. However, the opposite phenomenon can sometimes occur. In this paper, we elucidate the occurrence of this paradox by examining estimating functions. In particular, we focus on the projected estimating function, which is defined by the projection of the score function on to a given estimating function. A sufficient condition for the paradox to occur is the orthogonality of the two components of the projected estimating function corresponding to parameters of interest and nuisance parameters. In addition, a numerical assessment is conducted in the context of a simple model to investigate the improvement of the asymptotic efficiency of estimators.

Journal ArticleDOI
TL;DR: In this article, the Lagrangian model is used to compute the normalising constant of an unnormalised joint likelihood expressible as a product of factors, which can be used for computing the normalizing constant and other summations.
Abstract: Let n S-valued categorical variables be jointly distributed according to a distribution known only up to an unknown normalising constant. For an unnormalised joint likelihood expressible as a product of factors, we give an algebraic recursion which can be used for computing the normalising constant and other summations. A saving in computation is achieved when each factor contains a lagged subset of the components combining in the joint distribution, with maximum computational efficiency as the subsets attain their minimum size. If each subset contains at most r+1 of the n components in the joint distribution, we term this a lag-r model, whose normalising constant can be computed using a forward recursion in O(S^{r+1}) computations, as opposed to O(S^n) for the direct computation. We show how a lag-r model represents a Markov random field and allows a neighbourhood structure to be related to the unnormalised joint likelihood. We illustrate the method by showing how the normalising constant of the Ising or autologistic model can be computed.
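
For a concrete, minimal instance of the forward recursion, the sketch below computes the normalising constant of a one-dimensional Ising/autologistic chain (a lag-1 model with S = 2 states) and checks it against direct enumeration; the parameter values and function names are arbitrary assumptions for the demonstration.

```python
# Illustrative sketch: forward recursion for the normalising constant of a
# lag-1 model -- a one-dimensional Ising/autologistic chain with unnormalised
# likelihood  prod_i exp(alpha*x_i + beta*x_i*x_{i-1}),  x_i in {0, 1}.
# The recursion costs O(n * S^2), versus O(S^n) by enumeration.
import itertools
import numpy as np

alpha, beta, n, states = 0.3, 0.8, 12, (0, 1)

def factor(x_prev, x_curr):
    return np.exp(alpha * x_curr + beta * x_curr * x_prev)

def normalising_constant_forward():
    # phi[s] = sum of unnormalised weights over partial configurations ending in state s
    phi = {s: np.exp(alpha * s) for s in states}          # first component
    for _ in range(n - 1):
        phi = {s: sum(phi[t] * factor(t, s) for t in states) for s in states}
    return sum(phi.values())

def normalising_constant_brute_force():
    total = 0.0
    for x in itertools.product(states, repeat=n):
        w = np.exp(alpha * x[0])
        for i in range(1, n):
            w *= factor(x[i - 1], x[i])
        total += w
    return total

print(normalising_constant_forward(), normalising_constant_brute_force())
```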

Journal ArticleDOI
TL;DR: In this paper, the authors analyse the simplest two-equation seemingly unrelated regressions model and demonstrate that its likelihood may have up to five stationary points, and thus up to three local modes.
Abstract: SUMMARY We analyse the simplest two-equation seemingly unrelated regressions model and demonstrate that its likelihood may have up to five stationary points, and thus there may be up to three local modes. Consequently the estimates obtained via iterative estimation methods may depend on starting values. We further show that the probability of multimodality vanishes asymptotically. Monte Carlo simulations suggest that multimodality rarely occurs if the seemingly unrelated regressions model is true, but can become more frequent if the model is misspecified. The existence of multimodality in the likelihood for seemingly unrelated regressions models contradicts several claims in the literature.

Journal ArticleDOI
TL;DR: In this paper, the authors showed that the posterior distribution of the log odds ratios in a Bayesian analysis can be computed using a relatively simple model, the logistic regression model, which treats data as though generated prospectively and which does not involve nuisance parameters for the exposure distribution.
Abstract: SUMMARY The natural likelihood to use for a case-control study is a 'retrospective' likelihood, i.e. a likelihood based on the probability of exposure given disease status. Prentice & Pyke (1979) showed that, when a logistic regression form is assumed for the probability of disease given exposure, the maximum likelihood estimators and asymptotic covariance matrix of the log odds ratios obtained from the retrospective likelihood are the same as those obtained from the 'prospective' likelihood, i.e. that based on probability of disease given exposure. We prove a similar result for the posterior distribution of the log odds ratios in a Bayesian analysis. This means that the Bayesian analysis of case-control studies may be done using a relatively simple model, the logistic regression model, which treats data as though generated prospectively and which does not involve nuisance parameters for the exposure distribution.
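
A small sketch of what this result licenses in practice: case-control data analysed with an ordinary prospective Bayesian logistic regression, here fitted by random-walk Metropolis with vague normal priors. The simulated data, priors and sampler settings are assumptions; the point is only that no nuisance parameters for the exposure distribution appear, and that the posterior for the log odds ratio (not the intercept) is the quantity of interest.

```python
# Minimal sketch (assumed setup): treating case-control data as if prospective
# and fitting a Bayesian logistic regression by random-walk Metropolis.
import numpy as np

rng = np.random.default_rng(4)
# toy case-control data: binary exposure x, disease indicator d
x = np.concatenate([rng.binomial(1, 0.6, 100),   # 100 cases, exposure prob 0.6
                    rng.binomial(1, 0.3, 100)])  # 100 controls, exposure prob 0.3
d = np.concatenate([np.ones(100), np.zeros(100)])

def log_post(theta):
    a, b = theta                                  # intercept and log odds ratio
    eta = a + b * x
    loglik = np.sum(d * eta - np.log1p(np.exp(eta)))
    logprior = -(a**2 + b**2) / (2 * 10.0**2)     # N(0, 10^2) priors
    return loglik + logprior

theta, lp, draws = np.zeros(2), log_post(np.zeros(2)), []
for _ in range(20000):
    prop = theta + rng.normal(0, 0.2, size=2)     # random-walk proposal
    lp_prop = log_post(prop)
    if np.log(rng.random()) < lp_prop - lp:
        theta, lp = prop, lp_prop
    draws.append(theta[1])

burned = np.array(draws[5000:])
print("posterior mean and 95% interval for log odds ratio:",
      burned.mean().round(2), np.quantile(burned, [0.025, 0.975]).round(2))
```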

Journal ArticleDOI
TL;DR: In this paper, a class of semiparametric estimators is proposed in the general setting of functional measurement error models; the estimators follow from estimating equations based on the semiparametric efficient score derived under a possibly incorrect distributional assumption for the unobserved 'measured with error' covariates.
Abstract: SUMMARY A class of semiparametric estimators is proposed in the general setting of functional measurement error models. The estimators follow from estimating equations that are based on the semiparametric efficient score derived under a possibly incorrect distributional assumption for the unobserved 'measured with error' covariates. It is shown that

Journal ArticleDOI
TL;DR: In this paper, the authors study methods for constructing confidence intervals and confidence bands for estimators of receiver operating characteristics, and show that substantial undersmoothing is necessary if coverage properties are not to be impaired.
Abstract: SUMMARY We study methods for constructing confidence intervals and confidence bands for estimators of receiver operating characteristics. Particular emphasis is placed on the way in which smoothing should be implemented, when estimating either the characteristic itself or its variance. We show that substantial undersmoothing is necessary if coverage properties are not to be impaired. A theoretical analysis of the problem suggests an empirical, plug-in rule for bandwidth choice, optimising the coverage accuracy of interval estimators. The performance of this approach is explored. Our preferred technique is based on asymptotic approximation, rather than a more sophisticated approach using the bootstrap, since the latter requires a multiplicity of smoothing parameters all of which must be chosen in nonstandard ways. It is shown that the asymptotic method can give very good performance.

Journal ArticleDOI
TL;DR: In this paper, a case-cohort design for failure time data from the Atherosclerosis Risk in Communities (ARIC) study is presented, in which covariates are assembled only for a subcohort randomly selected from the entire cohort, and for any additional cases outside the subcohort.
Abstract: SUMMARY In a case-cohort design introduced by Prentice (1986), covariates are assembled only for a subcohort randomly selected from the entire cohort, and any additional cases outside the subcohort. Semiparametric transformation models are considered here for failure time data from the case-cohort design. Weighted estimating equations are proposed for estimation of the regression parameters. A procedure for estimating the survival probability at given covariate levels is also provided. Asymptotic properties are derived for the estimators using finite population sampling theory, U-statistics theory and martingale convergence results. The finite-sample properties of the proposed estimators, as well as the efficiency relative to the full cohort estimators, are assessed via simulation studies. A case-cohort dataset from the Atherosclerosis Risk in Communities study is used to illustrate the estimating procedure.

Journal ArticleDOI
TL;DR: In this article, the authors considered marginal additive hazards models for multivariate survival data in which individuals may experience events of several types and there may also be correlation between individuals and proposed a resampling technique for constructing simultaneous confidence bands for the survival curve of a specific subject.
Abstract: SUMMARY Marginal additive hazards models are considered for multivariate survival data in which individuals may experience events of several types and there may also be correlation between individuals. Estimators are proposed for the parameters of such models and for the baseline hazard functions. The estimators of the regression coefficients are shown asymptotically to follow a multivariate normal distribution with a sandwich-type covariance matrix that can be consistently estimated. The estimated baseline and subject-specific cumulative hazard processes are shown to converge weakly to a zero mean Gaussian random field. The weak convergence properties for the corresponding survival processes are established. A resampling technique is proposed for constructing simultaneous confidence bands for the survival curve of a specific subject. The methodology is extended to a multivariate version of a class of partly parametric additive hazards model. Simulation studies are conducted to assess finite sample properties, and the method is illustrated with an application to development of coronary heart diseases and cardiovascular accidents in the Framingham Heart Study.

Journal ArticleDOI
TL;DR: In this article, the authors show that the quadratic inference functions lead to bounded influence functions and the corresponding M-estimator has a redescending property, but the generalised estimating equation approach does not.
Abstract: In the presence of data contamination or outliers, some empirical studies have indicated that the two methods of generalised estimating equations and quadratic inference functions appear to have rather different robustness behaviour. This paper presents a theoretical investigation from the perspective of the influence function to identify the causes for the difference. We show that quadratic inference functions lead to bounded influence functions and the corresponding M-estimator has a redescending property, but the generalised estimating equation approach does not. We also illustrate that, unlike generalised estimating equations, quadratic inference functions can still provide consistent estimators even if part of the data is contaminated. We conclude that the quadratic inference function is a preferable method to the generalised estimating equation as far as robustness is concerned. This conclusion is supported by simulations and real-data examples.

Journal ArticleDOI
TL;DR: In this paper, a bias correction method for the generalised likelihood ratio test of Fan et al. was proposed, which can be applied to test whether or not a residual series is white noise.
Abstract: There are few techniques available for testing whether or not a family of parametric time series models fits a set of data reasonably well without serious restrictions on the forms of alternative models. In this paper, we consider generalised likelihood ratio tests of whether or not the spectral density function of a stationary time series admits certain parametric forms. We propose a bias correction method for the generalised likelihood ratio test of Fan et al. (2001). In particular, our methods can be applied to test whether or not a residual series is white noise. Sampling properties of the proposed tests are established. A bootstrap approach is proposed for estimating the null distribution of the test statistics. Simulation studies investigate the accuracy of the proposed bootstrap estimate and compare the power of the various ways of constructing the generalised likelihood ratio tests as well as some classic methods like the Cramér-von Mises and Ljung-Box tests. Our results favour the newly proposed bias reduction method using the local likelihood estimator.

Journal ArticleDOI
TL;DR: In this article, the properties of ordinary and generalised least squares estimators in a simple linear regression with stationary autocorrelated errors were studied, and explicit expressions for the variances of the regression parameter estimators were derived for some common time series autoregressive structures, including a first-order autoregression and general moving averages.
Abstract: This paper studies properties of ordinary and generalised least squares estimators in a simple linear regression with stationary autocorrelated errors. Explicit expressions for the variances of the regression parameter estimators are derived for some common time series autocorrelation structures, including a first-order autoregression and general moving averages. Applications of the results include confidence intervals and an example where the variance of the trend slope estimator does not increase with increasing autocorrelation.
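
The paper gives closed-form expressions; as a numerical counterpart, the following sketch evaluates the exact OLS (sandwich) and GLS variances of the trend-slope estimator under AR(1) errors directly from the error covariance matrix. The sample size and AR(1) parameters are arbitrary assumptions, and the formulas used are the standard sandwich and generalised least squares expressions rather than the paper's simplified ones.

```python
# Sketch (standard formulas, assumed here rather than the paper's closed forms):
# exact variances of the OLS and GLS trend-slope estimators in a simple linear
# regression with AR(1) errors.
import numpy as np

n, phi, sigma2 = 50, 0.6, 1.0
t = np.arange(1, n + 1)
X = np.column_stack([np.ones(n), t])                     # intercept + trend
Sigma = sigma2 / (1 - phi**2) * phi ** np.abs(np.subtract.outer(t, t))  # AR(1) cov

XtX_inv = np.linalg.inv(X.T @ X)
var_ols = XtX_inv @ X.T @ Sigma @ X @ XtX_inv            # (X'X)^-1 X'SX (X'X)^-1
var_gls = np.linalg.inv(X.T @ np.linalg.solve(Sigma, X)) # (X' Sigma^-1 X)^-1

print("OLS slope variance:", var_ols[1, 1])
print("GLS slope variance:", var_gls[1, 1])
```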

Journal ArticleDOI
TL;DR: In this article, the authors analyse doubly censored data using semiparametric transformation models and derive the asymptotic distributions of the proposed estimators, and illustrate their approach with a viral load dataset from a recent AIDS clinical trial.
Abstract: SUMMARY We analyse doubly censored data using semiparametric transformation models. We provide inference procedures for the regression parameters and derive the asymptotic distributions of the proposed estimators. Procedures for model checking and model selection are also discussed. We illustrate our approach with a viral-load dataset from a recent AIDS clinical trial.

Journal ArticleDOI
TL;DR: In this paper, two methods of model selection are discussed for changepoint-like problems, especially those arising in genetic linkage analysis, and compared theoretically and on examples from the literature.
Abstract: SUMMARY Two methods of model selection are discussed for changepoint-like problems, especially those arising in genetic linkage analysis. The first is a method that selects the model with the smallest p-value, while the second is a modification of the Bayes information criterion. The methods are compared theoretically and on examples from the literature. For these examples, they are roughly comparable although the p-value-based method is somewhat more liberal in selecting a high-dimensional model.

Journal ArticleDOI
TL;DR: In this paper, a modification of the Robbins-Monro procedure was proposed for binary data under some reasonable approximations, and the improvement obtained by using the optimal procedure for the estimation of extreme quantiles is substantial.
Abstract: SUMMARY The Robbins-Monro procedure does not perform well in the estimation of extreme quantiles, because the procedure is implemented using asymptotic results, which are not suitable for binary data. Here we propose a modification of the Robbins-Monro procedure and derive the optimal procedure for binary data under some reasonable approximations. The improvement obtained by using the optimal procedure for the estimation of extreme quantiles is substantial.
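
For context, below is a sketch of the standard (unmodified) Robbins-Monro procedure applied to an extreme quantile with binary responses; the probit response curve, step-size constant and run lengths are assumptions for illustration. The sluggish, biased behaviour of such untuned runs at extreme quantiles is the kind of problem the paper's modification addresses.

```python
# Illustrative sketch of the standard Robbins-Monro procedure (not the authors'
# modified version) for estimating an extreme quantile from binary responses:
# at each step a binary outcome is observed at the current level x_n and the
# level is updated by  x_{n+1} = x_n - (c/n) * (y_n - p).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(5)
p = 0.99                                   # target: the 99% quantile
true_quantile = norm.ppf(p)                # responses follow a probit curve

def robbins_monro(n_steps=2000, x0=0.0, c=5.0):
    x = x0
    for n in range(1, n_steps + 1):
        y = rng.random() < norm.cdf(x)     # binary response at current level
        x = x - (c / n) * (y - p)          # Robbins-Monro update
    return x

estimates = np.array([robbins_monro() for _ in range(200)])
print("true quantile:", round(true_quantile, 3),
      " mean estimate:", round(estimates.mean(), 3),
      " rmse:", round(np.sqrt(np.mean((estimates - true_quantile) ** 2)), 3))
```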

Journal ArticleDOI
TL;DR: In this paper, the authors propose an EM algorithm for estimating the parameters of a weakly parameterised competing risks model with masked causes of failure and second-stage data; the method is applied to a real dataset, and the asymptotic and robustness properties of the estimators are investigated through simulation.
Abstract: SUMMARY In this paper we propose inference methods based on the EM algorithm for estimating the parameters of a weakly parameterised competing risks model with masked causes of failure and second-stage data. With a carefully chosen definition of complete data, the maximum likelihood estimation of the cause-specific hazard functions and of the masking probabilities is performed via an EM algorithm. Both the E- and M-steps can be solved in closed form under the full model and under some restricted models of interest. We illustrate the flexibility of the method by showing how grouped data and tests of common hypotheses in the literature on missing cause of death can be handled. The method is applied to a real dataset and the asymptotic and robustness properties of the estimators are investigated through simulation.

Journal ArticleDOI
TL;DR: In this article, the authors proposed estimating equations for fitting proportional hazards regression models to the gap times and applied them to renal failure data to assess the association between demographic covariates and both the time until wait-listing and the time from waiting to kidney transplantation.
Abstract: Sequentially ordered multivariate failure time data are often observed in biomedical studies and inter-event, or gap, times are often of interest. Generally, standard hazard regression methods cannot be applied to the gap times because of identifiability issues and induced dependent censoring. We propose estimating equations for fitting proportional hazards regression models to the gap times. Model parameters are shown to be consistent and asymptotically normal. Simulation studies reveal the appropriateness of the asymptotic approximations in finite samples. The proposed methods are applied to renal failure data to assess the association between demographic covariates and both time until wait-listing and time from wait-listing to kidney transplantation.

Journal ArticleDOI
TL;DR: In this paper, a functional generalised linear model is proposed which includes extensions of standard models in multi-state survival analysis, and the estimators are the basis for new tests of the covariate effects and for the estimation of models in which greater structure is imposed on the parameters.
Abstract: SUMMARY We consider regression for response and covariates which are temporal processes observed over intervals. A functional generalised linear model is proposed which includes extensions of standard models in multi-state survival analysis. Simple nonparametric estimators of time-indexed parameters are developed using 'working independence' estimating equations and are shown to be uniformly consistent and to converge weakly to Gaussian processes. The procedure does not require smoothing or a Markov assumption, unlike approaches based on transition intensities. The usual definition of optimal estimating equations for parametric models is then generalised to the functional model and the optimum is identified in a class of functional generalised estimating equations. Simulations demonstrate large efficiency gains relative to working independence at times where censoring is heavy. The estimators are the basis for new tests of the covariate effects and for the estimation of models in which greater structure is imposed on the parameters, providing novel goodness-of-fit tests. The methodology's practical utility is illustrated in a data analysis.