
Showing papers in "Biometrika in 2001"


Journal ArticleDOI
TL;DR: In this article, it was shown that the likelihood ratio statistic based on the Kullback-Leibler information criterion, for testing the null hypothesis that a random sample is drawn from a k0-component normal mixture distribution against the alternative that it is drawn from a k1-component normal mixture distribution, is asymptotically distributed as a weighted sum of independent chi-squared random variables with one degree of freedom, under general regularity conditions.
Abstract: We demonstrate that, under a theorem proposed by Vuong, the likelihood ratio statistic based on the Kullback-Leibler information criterion of the null hypothesis that a random sample is drawn from a k0-component normal mixture distribution against the alternative hypothesis that the sample is drawn from a k1-component normal mixture distribution is asymptotically distributed as a weighted sum of independent chi-squared random variables with one degree of freedom, under general regularity conditions. We report simulation studies of two cases where we are testing a single normal versus a two-component normal mixture and a two-component normal mixture versus a three-component normal mixture. An empirical adjustment to the likelihood ratio statistic is proposed that appears to improve the rate of convergence to the limiting distribution.
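
To illustrate the quantity being studied, the following minimal sketch (assuming scikit-learn's GaussianMixture and purely illustrative settings, not the authors' code) fits one- and two-component normal mixtures to a sample generated under the null and computes the likelihood ratio statistic whose limiting law the paper characterises.

```python
# Minimal sketch (not the authors' code): likelihood ratio statistic for a
# 1-component versus 2-component normal mixture, using scikit-learn.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
x = rng.normal(size=(500, 1))             # data generated under the null (k0 = 1)

ll = {}
for k in (1, 2):
    gm = GaussianMixture(n_components=k, n_init=5, random_state=0).fit(x)
    ll[k] = gm.score(x) * len(x)          # total log-likelihood

lr_stat = 2 * (ll[2] - ll[1])             # likelihood ratio statistic
print(f"2 log LR = {lr_stat:.3f}")        # its null law is a weighted sum of chi-squared(1)
```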

3,531 citations


Journal ArticleDOI
TL;DR: A Bayesian method, for fitting curves to data drawn from an exponential family, that uses splines for which the number and locations of knots are free parameters, which performs well and is illustrated in two neuroscience applications.
Abstract: We describe a Bayesian method, for fitting curves to data drawn from an exponential family, that uses splines for which the number and locations of knots are free parameters. The method uses reversible jump Markov chain Monte Carlo to change the knot configurations and a locality heuristic to speed up mixing. For nonnormal models, we approximate the integrated likelihood ratios needed to compute acceptance probabilities by using the Bayesian information criterion, BIC, under priors that make this approximation accurate. Our technique is based on a marginalised chain on the knot number and locations, but we provide methods for inference about the regression coefficients, and functions of them, in both normal and nonnormal models. Simulation results suggest that the method performs well, and we illustrate the method in two neuroscience applications.

444 citations


Journal ArticleDOI
TL;DR: A new Markov chain Monte Carlo approach to Bayesian analysis of discretely observed diffusion processes and shows that, because of full dependence between the missing paths and the volatility of the diffusion, the rate of convergence of basic algorithms can be arbitrarily slow if the amount of the augmentation is large.
Abstract: In this paper, we introduce a new Markov chain Monte Carlo approach to Bayesian analysis of discretely observed diffusion processes. We treat the paths between any two data points as missing data. As such, we show that, because of full dependence between the missing paths and the volatility of the diffusion, the rate of convergence of basic algorithms can be arbitrarily slow if the amount of the augmentation is large. We offer a transformation of the diffusion which breaks down dependency between the transformed missing paths and the volatility of the diffusion. We then propose two efficient Markov chain Monte Carlo algorithms to sample from the posterior distribution of the transformed missing observations and the parameters of the diffusion. We apply our results to examples involving simulated data and also to Eurodollar short-rate data.

343 citations


Journal ArticleDOI
TL;DR: In this article, hierarchical generalised linear models are developed as a synthesis of generalised linear models, mixed linear models and structured dispersions, and the restricted maximum likelihood method for the estimation of dispersion is extended to this wider class of models.
Abstract: SUMMARY Hierarchical generalised linear models are developed as a synthesis of generalised linear models, mixed linear models and structured dispersions. We generalise the restricted maximum likelihood method for the estimation of dispersion to the wider class and show how the joint fitting of models for mean and dispersion can be expressed by two interconnected generalised linear models. The method allows models with (i) any combination of a generalised linear model distribution for the response with any conjugate distribution for the random effects, (ii) structured dispersion components, (iii) different link and variance functions for the fixed and random effects, and (iv) the use of quasilikelihoods in place of likelihoods for either or both of the mean and dispersion models. Inferences can be made by applying standard procedures, in particular those for model checking, to components of either generalised linear model. We also show by numerical studies that the new method gives an efficient estimation procedure for a substantial class of models of practical importance. Likelihood-type inference is extended to this wide class of models in a unified way.

325 citations


Journal ArticleDOI
TL;DR: In this article, a simple method is presented for estimating the proportional hazards model parameters that requires no assumption on the distribution of the random effects; the routine assumption that the random effects are normally distributed need not hold in practice.
Abstract: SUMMARY A common objective in longitudinal studies is to characterise the relationship between a failure time process and time-independent and time-dependent covariates. Time-dependent covariates are generally available as longitudinal data collected periodically during the course of the study. We assume that these data follow a linear mixed effects model with normal measurement error and that the hazard of failure depends both on the underlying random effects describing the covariate process and other time-independent covariates through a proportional hazards relationship. A routine assumption is that the random effects are normally distributed; however, this need not hold in practice. Within this framework, we develop a simple method for estimating the proportional hazards model parameters that requires no assumption on the distribution of the random effects. Large-sample properties are discussed, and finite-sample performance is assessed and compared to competing methods via simulation.

291 citations


Journal ArticleDOI
TL;DR: In this article, the impact of model violations on the estimate of a regression coefficient in a generalised linear mixed model is investigated, and the authors evaluate the asymptotic relative bias that results from incorrect assumptions regarding the random effects.
Abstract: SUMMARY We investigate the impact of model violations on the estimate of a regression coefficient in a generalised linear mixed model. Specifically, we evaluate the asymptotic relative bias that results from incorrect assumptions regarding the random effects. We compare the impact of model violation for two parameterisations of the regression model. Substantial bias in the conditionally specified regression point estimators can result from using a simple random intercepts model when either the random effects distribution depends on measured covariates or there are autoregressive random effects. A marginally specified regression structure that is estimated using maximum likelihood is much less susceptible to bias resulting from random effects model misspecification.

280 citations


Journal ArticleDOI
TL;DR: In this article, the idea of delaying the rejection and adapting the proposal distribution, due to Tierney & Mira (1999), was extended to generate a more flexible class of methods that applies in particular to a variable-dimension setting.
Abstract: SUMMARY In a Metropolis-Hastings algorithm, rejection of proposed moves is an intrinsic part of ensuring that the chain converges to the intended target distribution. However, persistent rejection, perhaps in particular parts of the state space, may indicate that locally the proposal distribution is badly calibrated to the target. As an alternative to careful off-line tuning of state-dependent proposals, the basic algorithm can be modified so that, on rejection, a second attempt to move is made. A different proposal can be generated from a new distribution that is allowed to depend on the previously rejected proposal. We generalise this idea of delaying the rejection and adapting the proposal distribution, due to Tierney & Mira (1999), to generate a more flexible class of methods that applies in particular to a variable-dimension setting. The approach is illustrated by two pedagogical examples and a more realistic application to a changepoints analysis for point processes. Some key words: Adaptive; Changepoint; Efficiency of Markov chain Monte Carlo estimation; Integrated autocorrelation time; Peskun ordering.
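
As a concrete illustration of the delayed-rejection idea, the sketch below performs one delayed-rejection Metropolis step on the real line. It assumes Gaussian random-walk proposals and a standard normal target, and it is not the authors' variable-dimension construction; the second-stage acceptance probability follows the Tierney & Mira form, using both proposal densities and the first-stage rejection probabilities.

```python
# Minimal sketch (assumed details, not the authors' code) of one delayed-rejection
# Metropolis step with Gaussian random-walk proposals on a 1-d target.
import numpy as np
from scipy.stats import norm

def log_pi(x):
    """Example log target: standard normal (purely illustrative)."""
    return -0.5 * x**2

def dr_step(x, rng, s1=2.0, s2=0.5):
    # Stage 1: ordinary Metropolis proposal.
    y1 = x + s1 * rng.standard_normal()
    a1_x = min(1.0, np.exp(log_pi(y1) - log_pi(x)))
    if rng.random() < a1_x:
        return y1
    # Stage 2: a second, smaller proposal centred at the rejected point y1.
    y2 = y1 + s2 * rng.standard_normal()
    a1_y2 = min(1.0, np.exp(log_pi(y1) - log_pi(y2)))
    # Delayed-rejection acceptance ratio (Tierney & Mira form), in logs.
    num = (log_pi(y2) + norm.logpdf(y1, y2, s1) + norm.logpdf(x, y1, s2)
           + np.log(max(1.0 - a1_y2, 1e-300)))
    den = (log_pi(x) + norm.logpdf(y1, x, s1) + norm.logpdf(y2, y1, s2)
           + np.log(max(1.0 - a1_x, 1e-300)))
    if rng.random() < np.exp(min(0.0, num - den)):
        return y2
    return x

rng = np.random.default_rng(1)
chain = [0.0]
for _ in range(5000):
    chain.append(dr_step(chain[-1], rng))
```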

280 citations


Journal ArticleDOI
TL;DR: In this article, the authors considered a variant of the competing risks problem in which a terminal event censors a non-terminal event, but not vice versa, and formulated the joint distribution of the events via a gamma frailty model in the upper wedge where data are observable, with the marginal distributions unspecified.
Abstract: SUMMARY We consider a variation of the competing risks problem in which a terminal event censors a non-terminal event, but not vice versa. The joint distribution of the events is formulated via a gamma frailty model in the upper wedge where data are observable (Day et al., 1997), with the marginal distributions unspecified. An estimator for the association parameter is obtained from a concordance estimating function. A novel plug-in estimator for the marginal distribution of the non-terminal event is shown to be uniformly consistent and to converge weakly to a Gaussian process. The assumptions on the joint distribution outside the upper wedge are weaker than those usually made in competing risks analyses. Simulations demonstrate that the methods work well with practical sample sizes. The proposals are illustrated with data on morbidity and mortality in leukaemia patients.

270 citations


Journal ArticleDOI
TL;DR: In this article, a simple resampling method that repeatedly perturbs the objective function is proposed for estimating the covariance matrix of the estimator of a vector of parameters of interest; inferences can then be made based on a large collection of the resulting optimisers.
Abstract: Suppose that under a semiparametric setting an estimator of a vector of parameters of interest is obtained by optimising an objective function which has a U-process structure. The covariance matrix of the estimator is generally a function of the underlying density function, which may be difficult to estimate well by conventional methods. In this paper, we present a simple resampling method by perturbing the objective function repeatedly. Inferences of the parameters can then be made based on a large collection of the resulting optimisers. We illustrate our proposal by three examples with a heteroscedastic regression model.
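
A minimal sketch of the resampling idea, under assumptions not taken from the paper (a least-absolute-deviation regression as the objective and independent mean-one exponential perturbation weights): the objective is perturbed and re-optimised repeatedly, and the spread of the resulting optimisers estimates the covariance matrix of the estimator.

```python
# Illustrative sketch of resampling by perturbing the objective function.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n = 200
x = np.column_stack([np.ones(n), rng.standard_normal(n)])
y = x @ np.array([1.0, 2.0]) + rng.standard_normal(n) * (1 + 0.5 * np.abs(x[:, 1]))

def lad(beta, w):
    return np.sum(w * np.abs(y - x @ beta))          # (weighted) LAD objective

beta_hat = minimize(lad, x0=[0.0, 0.0], args=(np.ones(n),), method="Nelder-Mead").x

# Perturb the objective with positive i.i.d. weights and re-optimise repeatedly.
draws = []
for _ in range(500):
    w = rng.exponential(1.0, size=n)                 # mean-one perturbation weights
    draws.append(minimize(lad, x0=beta_hat, args=(w,), method="Nelder-Mead").x)

cov_hat = np.cov(np.array(draws), rowvar=False)      # resampling covariance estimate
```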

264 citations


Journal ArticleDOI
TL;DR: Within-cluster resampling is proposed in this article as a new method for analysing clustered data; the authors present asymptotic normality theory and provide a consistent variance estimator for the within-cluster resampling estimator.
Abstract: Within-cluster resampling is proposed as a new method for analysing clustered data. Although the focus of this paper is clustered binary data, the within-cluster resampling asymptotic theory is general for many types of clustered data. Within-cluster resampling is a simple but computationally intensive estimation method. Its main advantage over other marginal analysis methods, such as generalised estimating equations (Liang & Zeger, 1986; Zeger & Liang, 1986) is that it remains valid when the risk for the outcome of interest is related to the cluster size, which we term nonignorable cluster size. We present theory for the asymptotic normality and provide a consistent variance estimator for the within-cluster resampling estimator. Simulations and an example are developed that assess the finite-sample behaviour of the new method and show that when both methods are valid its performance is similar to that of generalised estimating equations.
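
A minimal sketch of the point estimator only (toy data and illustrative settings; the paper's consistent variance estimator, which combines within-resample and between-resample variability, is not shown): draw one observation per cluster, fit an ordinary model to the resampled data, repeat, and average.

```python
# Illustrative sketch of within-cluster resampling for clustered binary data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

# Toy clustered data in which cluster size is related to outcome risk
# ("nonignorable cluster size" in the paper's terminology).
clusters = []
for i in range(200):
    size = rng.integers(1, 8)
    p = 0.2 + 0.05 * size
    xi = rng.standard_normal(size)
    yi = rng.binomial(1, np.clip(p + 0.1 * xi, 0.01, 0.99))
    clusters.append((yi, xi))

Q = 500
betas = []
for _ in range(Q):
    y_s, x_s = [], []
    for y, x in clusters:
        j = rng.integers(len(y))                   # sample one observation per cluster
        y_s.append(y[j]); x_s.append(x[j])
    X = sm.add_constant(np.array(x_s))
    fit = sm.GLM(np.array(y_s), X, family=sm.families.Binomial()).fit()
    betas.append(fit.params)

beta_wcr = np.mean(betas, axis=0)                  # within-cluster resampling estimator
```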

225 citations


Journal ArticleDOI
TL;DR: In this article, the Gibbs sampler is used to estimate the parameters of a generalised linear mixed model with nonignorable missing response data and with nonmonotone patterns of missing data in the response variable.
Abstract: SUMMARY We propose a method for estimating parameters in the generalised linear mixed model with nonignorable missing response data and with nonmonotone patterns of missing data in the response variable. We develop a Monte Carlo EM algorithm for estimating the parameters in the model via the Gibbs sampler. For the normal random effects model, we derive a novel analytical form for the E- and M-steps, which is facilitated by integrating out the random effects. This form leads to a computationally feasible and extremely efficient Monte Carlo EM algorithm for computing maximum likelihood estimates and standard errors. In addition, we propose a very general joint multinomial model for the missing data indicators, which can be specified via a sequence of one-dimensional conditional distributions. This multinomial model allows for an arbitrary correlation structure between the missing data indicators, and has the potential of reducing the number of nuisance parameters. Real datasets from the International Breast Cancer Study Group and an environmental study involving dyspnoea in cotton workers are presented to illustrate the proposed methods.

Journal ArticleDOI
TL;DR: In this article, the notion of degrees of freedom is extended to richly-parameterised models, including linear hierarchical and random-effect models, some smoothers and spatial models, and combinations of these.
Abstract: SUMMARY Drawing on linear model theory, we rigorously extend the notion of degrees of freedom to richly-parameterised models, including linear hierarchical and random-effect models, some smoothers and spatial models, and combinations of these. The number of degrees of freedom is often much smaller than the number of parameters. Our notion of degrees of freedom is compatible with similar ideas long associated with smoothers, but is applicable to new classes of models and can be interpreted using the projection theory of linear models. We use an example to illustrate the two applications of setting prior distributions for variances and fixing model complexity by fixing degrees of freedom.
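
For ridge-type random-effect and smoothing fits, the idea referred to here reduces to the familiar trace of the matrix mapping observations to fitted values; a minimal numerical sketch under an illustrative penalty (not the authors' notation) is:

```python
# Minimal sketch: effective degrees of freedom of a ridge-type (random-effect-like)
# fit as the trace of the hat matrix mapping y to fitted values. Illustrative only.
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 10
X = rng.standard_normal((n, p))

for lam in (0.0, 1.0, 10.0, 100.0):
    H = X @ np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)   # hat matrix
    print(f"lambda = {lam:6.1f}  degrees of freedom = {np.trace(H):5.2f}")
# lambda = 0 recovers p; larger penalties shrink the degrees of freedom below p,
# so the degrees of freedom can be far smaller than the number of parameters.
```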

Journal ArticleDOI
TL;DR: The important practical issues of optimally choosing the window shape and the block size are addressed in detail, while some finite-sample simulations are presented validating the good performance of the tapered block bootstrap.
Abstract: SUMMARY We introduce and study a tapered block bootstrap methodology that yields an improvement over the well-known block bootstrap for time series of Künsch (1989). The asymptotic validity and the favourable bias properties of the tapered block bootstrap are shown. The important practical issues of optimally choosing the window shape and the block size are addressed in detail, while some finite-sample simulations are presented validating the good performance of the tapered block bootstrap.
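
A simplified sketch of the resampling step, under assumptions not stated in the abstract (a triangular taper and the rescaling sqrt(b)/||w||_2, chosen so that a flat taper recovers the ordinary moving-block bootstrap); the paper's optimal window-shape and block-size choices are not shown.

```python
# Illustrative sketch of a tapered block bootstrap for the sample mean of a
# stationary series: blocks are centred, tapered, rescaled and concatenated.
import numpy as np

def tapered_block_bootstrap(x, b, n_boot, rng):
    n = len(x)
    xbar = x.mean()
    t = (np.arange(1, b + 1) - 0.5) / b
    w = 1.0 - np.abs(2.0 * t - 1.0)                 # simple triangular taper on (0, 1)
    scale = np.sqrt(b) / np.linalg.norm(w)          # flat w would give scale = 1
    k = n // b                                      # blocks per pseudo-series
    means = np.empty(n_boot)
    for r in range(n_boot):
        starts = rng.integers(0, n - b + 1, size=k)
        blocks = [xbar + scale * w * (x[s:s + b] - xbar) for s in starts]
        means[r] = np.concatenate(blocks).mean()
    return means

rng = np.random.default_rng(0)
eps = rng.standard_normal(500)
x = np.empty(500)
x[0] = eps[0]
for t in range(1, 500):
    x[t] = 0.6 * x[t - 1] + eps[t]                  # AR(1) toy series

boot_means = tapered_block_bootstrap(x, b=25, n_boot=1000, rng=rng)
se_hat = boot_means.std(ddof=1)                     # bootstrap standard error of the mean
```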

Journal ArticleDOI
TL;DR: In this article, the authors compared the minimum divergence estimator of Basu et al. (1998) to a competing Minimum Divergence estimator which turns out to be equivalent to a method proposed from a different perspective by Windham (1995), which can be applied for any parametric model and contain maximum likelihood as a special case.
Abstract: This paper compares the minimum divergence estimator of Basu et al. (1998) to a competing minimum divergence estimator which turns out to be equivalent to a method proposed from a different perspective by Windham (1995). Both methods can be applied for any parametric model and contain maximum likelihood as a special case. Efficiencies are compared under model conditions, and robustness properties are studied. Overall the two methods are found to perform quite similarly. Some relatively small advantages of the former method over the latter are identified.
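
For concreteness, a minimal sketch of the Basu et al. (1998) minimum density power divergence estimator for a normal model (illustrative tuning constant and data; the Windham-type competitor is not shown). The tuning constant a interpolates between maximum likelihood (a tending to 0) and more robust, outlier-downweighting fits.

```python
# Illustrative sketch (not the authors' code) of the minimum density power
# divergence estimator of Basu et al. (1998) for a normal location-scale model:
# minimise  int f_theta^{1+a} - (1 + 1/a) * mean(f_theta(x_i)^a)  over theta.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def dpd_objective(params, x, a):
    mu, log_sigma = params
    sigma = np.exp(log_sigma)
    integral = (2 * np.pi * sigma**2) ** (-a / 2) / np.sqrt(1 + a)  # int f^{1+a}, normal case
    return integral - (1 + 1 / a) * np.mean(norm.pdf(x, mu, sigma) ** a)

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0, 1, 95), rng.normal(8, 1, 5)])     # 5% outliers

fit = minimize(dpd_objective, x0=[np.median(x), 0.0], args=(x, 0.5), method="Nelder-Mead")
mu_hat, sigma_hat = fit.x[0], np.exp(fit.x[1])
```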

Journal ArticleDOI
TL;DR: This paper considers patients who are heterogeneous with respect to some prognostic factors and assumes that the response from each patient is a continuous variable, and considers a normal linear model and provides an allocation design with due attention to the prognostic Factors.
Abstract: SUMMARY Adaptive designs are often used in clinical trials to force balance in sequential allocation of patients between two or more competitive treatments. Sometimes, from ethical considerations, the goal may be to allocate a larger number of patients to the better treatment in the course of the trial. In the present paper we consider patients who are heterogeneous with respect to some prognostic factors and assume that the response from each patient is a continuous variable. We consider a normal linear model and provide an allocation design with due attention to the prognostic factors. The loss of efficiency incurred by using a balanced design is also indicated. Issues of inference based on adaptive allocation are also discussed. Finally a guideline is provided for the appropriate choice of the design parameters.

Journal ArticleDOI
TL;DR: Latin hypercube designs suitable for factor screening are presented and they are shown to be efficient in terms of runs required per factor as well as having optimal and orthogonal properties.
Abstract: SUMMARY Latin hypercube designs are often used in computer experiments as they ensure that few design points are redundant when there is effect sparsity. In this paper, designs suitable for factor screening are presented and they are shown to be efficient in terms of runs required per factor as well as having optimal and orthogonal properties. Designs orthogonal under full second-order models are also constructed.
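
A minimal sketch of the basic construction (a random Latin hypercube only; the orthogonality and screening refinements discussed in the paper are not shown): each factor's n levels are a random permutation of equally spaced points, so every one-dimensional projection is evenly covered and no run is redundant under effect sparsity.

```python
# Minimal sketch: a random Latin hypercube design of n runs for k factors in [0, 1]^k.
import numpy as np

def latin_hypercube(n, k, rng):
    u = rng.random((n, k))                              # jitter within each cell
    perms = np.column_stack([rng.permutation(n) for _ in range(k)])
    return (perms + u) / n                              # one point per row, k factors per column

rng = np.random.default_rng(0)
design = latin_hypercube(n=8, k=3, rng=rng)
```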

Journal ArticleDOI
TL;DR: A systematic and rigorous treatment of the Procrustes tangent projection is developed here to facilitate its use in applications, including an analytical description of bilateral symmetry of objects.
Abstract: SUMMARY In shape analysis with concentrated data, the Procrustes tangent projection of shapes plays a fundamental role since this projection can be used to convert shape analysis to standard multivariate analysis. In view of its importance, a systematic and rigorous treatment of the Procrustes tangent projection is developed here to facilitate its use in applications. One important application involves an analytical description of bilateral symmetry of objects, together with some related distributional results. An explicit expression for the tangent projection is also derived.

Journal ArticleDOI
TL;DR: In this article, the authors generalize the mixture autoregressive, MAR, model to the logistic mixture auto-regressive with exogenous variables, LMARX, model for the modelling of nonlinear time series.
Abstract: SUMMARY We generalise the mixture autoregressive, MAR, model to the logistic mixture autoregressive with exogenous variables, LMARX, model for the modelling of nonlinear time series. The models consist of a mixture of two Gaussian transfer function models with the mixing proportions changing over time. The model can also be considered as a generalisation of the self-exciting threshold autoregressive, SETAR, model and the open-loop threshold autoregressive, TARSO, model. The advantages of the LMARX model over other nonlinear time series models include a wider range of shape-changing predictive distributions, the ability to handle cycles and conditional heteroscedasticity in the time series and better point prediction. Estimation is easily done via a simple EM algorithm and the model selection problem is addressed. The models are applied to two real datasets and compared with other competing models.
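
An illustrative simulation from a two-regime logistic mixture autoregression (toy coefficients assumed here, without exogenous variables) shows how the mixing proportion, and hence the shape of the conditional predictive distribution, changes with the lagged series:

```python
# Illustrative simulation from a two-component mixture autoregression whose mixing
# proportion follows a logistic function of the lagged value, in the spirit of the
# LMARX family described above (coefficients are purely illustrative).
import numpy as np

rng = np.random.default_rng(0)
n = 1000
y = np.zeros(n)
for t in range(1, n):
    p = 1.0 / (1.0 + np.exp(-(0.5 + 1.5 * y[t - 1])))      # time-varying mixing proportion
    if rng.random() < p:
        y[t] = 0.8 * y[t - 1] + 0.5 * rng.standard_normal()    # regime 1
    else:
        y[t] = -0.5 * y[t - 1] + 1.0 * rng.standard_normal()   # regime 2
```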

Journal ArticleDOI
TL;DR: A test based on isotonic regression is developed for monotonic trends in short-range dependent sequences and is applied to Argentina rainfall data and global warming data; the test also provides another perspective on changepoint problems.
Abstract: A test based on isotonic regression is developed for monotonic trends in short range dependent sequences and is applied to Argentina rainfall data and global warming data. This test provides another perspective for changepoint problems. The isotonic test is shown to be more powerful than some existing tests for trend.
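
A rough sketch of an isotonic-regression trend statistic (not the paper's exact statistic): the proportion of variation explained by a monotone fit. The naive permutation calibration below assumes exchangeability and ignores the short-range dependence that the paper's critical values account for, so it is illustrative only.

```python
# Illustrative isotonic trend statistic with a naive permutation calibration.
import numpy as np
from sklearn.isotonic import IsotonicRegression

def iso_stat(y):
    t = np.arange(len(y))
    fit = IsotonicRegression().fit_transform(t, y)      # best monotone (increasing) fit
    return np.sum((fit - y.mean()) ** 2) / np.sum((y - y.mean()) ** 2)

rng = np.random.default_rng(0)
y = 0.02 * np.arange(100) + rng.standard_normal(100)    # weak upward trend plus noise

obs = iso_stat(y)
perm = [iso_stat(rng.permutation(y)) for _ in range(999)]
p_value = (1 + sum(s >= obs for s in perm)) / 1000.0
```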

Journal ArticleDOI
TL;DR: An inferential method for frailty models via hierarchical likelihood is developed, which gives a simple unified framework and a numerically efficient algorithm.
Abstract: Systematic likelihood inference for frailty models is possible using Lee & Nelder's (1996) hierarchical likelihood. We develop an inferential method for frailty models via hierarchical likelihood, which gives a simple unified framework and a numerically efficient algorithm. Simulations and practical examples are presented to illustrate the new method.

Journal ArticleDOI
TL;DR: The idea is to generalise Cook's (1977) approach to the conditional expectation of the complete-data loglikelihood function in the EM algorithm by proposing several case-deletion measures for assessing the influence of an observation for complicated models with real missing data or hypothetical missing data corresponding to latent random variables.
Abstract: SUMMARY This paper proposes several case-deletion measures for assessing the influence of an observation for complicated models with real missing data or hypothetical missing data corresponding to latent random variables. The idea is to generalise Cook's (1977) approach to the conditional expectation of the complete-data loglikelihood function in the EM algorithm. On the basis of the diagnostic measures, a procedure is proposed for detecting influential observations. Two examples illustrate our methodology. We show that the method can be applied efficiently to a wide variety of complicated problems that are difficult to handle by existing methods.

Journal ArticleDOI
TL;DR: In this article, the authors investigated shrinkage methods for constructing predictive distributions and showed that there exists a shrinkage predictive distribution dominating the Bayesian predictive distribution based on the vague prior when the dimension is not less than three.
Abstract: SUMMARY We investigate shrinkage methods for constructing predictive distributions. We consider the multivariate normal model with a known covariance matrix and show that there exists a shrinkage predictive distribution dominating the Bayesian predictive distribution based on the vague prior when the dimension is not less than three. Kullback-Leibler divergence from the true distribution to a predictive distribution is adopted as a loss function. Some key words: Invariance; James-Stein estimator; Kullback-Leibler divergence; Stein's prior; Vague prior.

Journal ArticleDOI
TL;DR: In this article, a semi-parametric partially generalised linear model for clustered data using estimating equations is considered, where the mean of the outcome variable depends on some covariates parametrically and a cluster-level covariate nonparametrically.
Abstract: We consider estimation in a semiparametric partially generalised linear model for clustered data using estimating equations. A marginal model is assumed where the mean of the outcome variable depends on some covariates parametrically and a cluster-level covariate nonparametrically. A profile-kernel method allowing for working correlation matrices is developed. We show that the nonparametric part of the model can be estimated using standard nonparametric methods, including smoothing-parameter estimation, and the parametric part of the model can be estimated in a profile fashion. The asymptotic distributions of the parameter estimators are derived, and the optimal estimators of both the nonparametric and parametric parts are shown to be obtained when the working correlation matrix equals the actual correlation matrix. The asymptotic covariance matrix of the parameter estimator is consistently estimated by the sandwich estimator. We show that the semiparametric efficient score takes on a simple form and our profile-kernel method is semiparametric efficient. The results for the case where the nonparametric part of the model is an observation-level covariate are noted to be dramatically different.

Journal ArticleDOI
Wei Pan
TL;DR: In this article, a modification to the Liang-Zeger prescription for implementing the robust variance estimator is proposed, which can consistently estimate the variance matrix of the estimated regression coefficient in the generalized estimating equation approach.
Abstract: SUMMARY The variance matrix of the estimated regression coefficient in the Liang-Zeger generalised estimating equation approach can be consistently estimated by the so-called sandwich or robust estimator. In this note, we propose a modification to the Liang-Zeger prescription for implementing the robust variance estimator. Analytical and numerical evidence shows the superior performance of our proposal.
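
For reference, a minimal sketch of the cluster-robust sandwich covariance A^{-1} B A^{-1} for a linear working-independence fit (illustrative only; Pan's modification, which pools the residual cross-products across clusters, is not implemented here).

```python
# Illustrative sketch of the robust (sandwich) covariance for clustered data
# under a linear model with working independence.
import numpy as np

def sandwich_cov(X, y, cluster):
    beta = np.linalg.solve(X.T @ X, X.T @ y)       # working-independence estimate
    resid = y - X @ beta
    A = X.T @ X
    B = np.zeros((X.shape[1], X.shape[1]))
    for c in np.unique(cluster):
        idx = cluster == c
        s = X[idx].T @ resid[idx]                  # cluster score contribution
        B += np.outer(s, s)
    A_inv = np.linalg.inv(A)
    return beta, A_inv @ B @ A_inv                 # robust covariance of beta

rng = np.random.default_rng(0)
n_clusters, m = 50, 4
cluster = np.repeat(np.arange(n_clusters), m)
u = np.repeat(rng.standard_normal(n_clusters), m)  # shared cluster effect
x = rng.standard_normal(n_clusters * m)
y = 1.0 + 2.0 * x + u + rng.standard_normal(n_clusters * m)
X = np.column_stack([np.ones_like(x), x])
beta_hat, V_robust = sandwich_cov(X, y, cluster)
```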

Journal ArticleDOI
TL;DR: In this paper, a new model selection criterion for single-index models, AICC, is proposed, which minimizes the expected Kullback-Leibler distance between the true and candidate models.
Abstract: We derive a new model selection criterion for single‐index models, AICC, by minimising the expected Kullback-Leibler distance between the true and candidate models. The proposed criterion selects not only relevant variables but also the smoothing parameter for an unknown link function. Thus, it is a general selection criterion that provides a unified approach to model selection across both parametric and nonparametric functions. Monte Carlo studies demonstrate that AICC performs satisfactorily in most situations. We illustrate the practical use of AICC with an empirical example for modelling the hedonic price function for cars. In addition, we extend the applicability of AICC to partially linear and additive single‐index models.

Journal ArticleDOI
TL;DR: A general class of semiparametric hazards regression models for survival data is proposed and studied in this paper, which includes some popular classes of models as subclasses, such as Cox's proportional hazards model, the accelerated failure time model and a recently proposed class of models called the accelerated hazards model.
Abstract: A general class of semiparametric hazards regression models for survival data is proposed and studied. This general class includes some popular classes of models as subclasses, such as Cox's proportional hazards model, the accelerated failure time model and a recently proposed class of models called the accelerated hazards model. In the general class of models, a covariate's effect is identified as having two separate components, namely a time-scale change on hazard progression and a relative hazards ratio. The new model is flexible in modelling survival data and may yield more accurate prediction of an individual's survival process. By way of the nested structure that includes the proportional hazards model, the accelerated failure time model and the accelerated hazards model, the general class of models may provide a numerical tool for determining which of them is more appropriate for a given dataset.
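
A hazard specification consistent with this description (notation assumed here rather than taken from the paper) is

```latex
\lambda(t \mid Z) \;=\; \lambda_0\!\left(t\, e^{\beta' Z}\right) e^{\gamma' Z},
```

so that beta = 0 recovers Cox's proportional hazards model, gamma = 0 gives the accelerated hazards model, and gamma = beta gives the accelerated failure time model; beta plays the role of the time-scale change on hazard progression and gamma that of the relative hazards ratio.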

Journal ArticleDOI
TL;DR: In this paper, the authors proposed a new span selector based on the generalised crossvalidation function derived from the gamma deviance, which asymptotically behave like independently distributed chi-squared.
Abstract: A consistent estimator for the spectral density of a stationary random process can be obtained by smoothing the periodograms across frequency. An important component of smoothing is the choice of the span. Lee (1997) proposed a span selector that was erroneously claimed to be unbiased for the mean squared error. The naive use of mean squared error has some important drawbacks in this context because the variance of the periodogram depends on its mean, i.e. the spectrum. We propose a new span selector based on the generalised cross-validation function derived from the gamma deviance. This criterion, originally developed for use in fitting generalised additive models, utilises the approximate full likelihood of periodograms, which asymptotically behave like independently distributed chi-squared, i.e. gamma, random variables. The proposed span selector is very simple and easily implemented. Simulation results suggest that the proposed span selector generally outperforms those obtained under a mean squared error criterion.
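
A minimal sketch of the idea, under assumed details not taken from the paper: a circular moving-average smoother of span m (for which tr(H)/n is roughly 1/m) and a generalised cross-validation function built from the gamma deviance of the periodogram ordinates.

```python
# Illustrative span selection for periodogram smoothing via a gamma-deviance GCV.
import numpy as np

def periodogram(x):
    n = len(x)
    f = np.abs(np.fft.rfft(x - x.mean())) ** 2 / n
    return f[1:-1]                                     # drop frequencies 0 and pi

def smooth(I, m):
    pad = np.concatenate([I[-(m // 2):], I, I[:m // 2]])   # circular padding
    return np.convolve(pad, np.ones(m) / m, mode="valid")

def gamma_gcv(I, m):
    fhat = smooth(I, m)
    dev = 2.0 * np.sum((I - fhat) / fhat - np.log(I / fhat))   # gamma deviance
    n = len(I)
    return (dev / n) / (1.0 - 1.0 / m) ** 2                    # tr(H)/n ~ 1/m here

rng = np.random.default_rng(0)
x = np.empty(1024)
x[0] = rng.standard_normal()
for t in range(1, len(x)):
    x[t] = 0.7 * x[t - 1] + rng.standard_normal()              # AR(1) test series

I = periodogram(x)
best_span = min(range(3, 101, 2), key=lambda m: gamma_gcv(I, m))
```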

Journal ArticleDOI
TL;DR: In this paper, the authors combine and extend the work of Madigan & York (1997) and Dellaportas & Forster (1999) using reversible jump Markov chain Monte Carlo simulation to calculate posterior model probabilities which can then be used to estimate model-averaged statistics of interest.
Abstract: SUMMARY We consider the problem of estimating the total size of a population from a series of incomplete census data. We observe that inference is typically highly sensitive to the choice of model and we demonstrate how Bayesian model averaging techniques easily overcome this problem. We combine and extend the work of Madigan & York (1997) and Dellaportas & Forster (1999) using reversible jump Markov chain Monte Carlo simulation to calculate posterior model probabilities which can then be used to estimate model-averaged statistics of interest. We provide a detailed description of the simulation procedures involved and consider a wide variety of modelling issues, such as the range of models considered, their parameterisation, both prior choice and sensitivity, and computational efficiency. We consider a detailed example concerning adolescent injuries in Pennsylvania on the basis of medical, school and survey data. In the context of this example, we discuss the relationship between posterior model probabilities and the associated information criteria values for model selection. We also discuss cost-efficiency issues with particular reference to inclusion and exclusion of sources on the grounds of cost. We consider a decision-theoretic approach, which balances the cost and accuracy of different combinations of data sources to guide future decisions on data collection.

Journal ArticleDOI
TL;DR: In this article, it was shown that a factorial design is uniquely determined by its J-characteristics, just as a regular factorial design is uniquely determined by its defining relation, and the projection justification of minimum G2-aberration is established.
Abstract: Deng & Tang (1999) introduced the generalised resolution and minimum G-aberration criteria for assessing nonregular fractional factorials. In Tang & Deng (1999), a relaxed variant of minimum G-aberration, called minimum G2-aberration, is proposed and studied. These criteria are defined using a set of J values, called J-characteristics. In this paper, we show that a factorial design is uniquely determined by its J-characteristics, just as a regular factorial design is uniquely determined by its defining relation. The theorem is given through an explicit formula that relates the set of design points to that of J-characteristics. Through this formula, projection justification of minimum G2-aberration is established.
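
A minimal sketch of the J-characteristics themselves (standard plus/minus-one coding assumed): for each non-empty subset s of columns, J_s is the absolute value of the sum over runs of the product of the columns in s. In the regular example below, the only non-zero value corresponds to the defining word.

```python
# Minimal sketch: J-characteristics of a two-level design coded in +/-1.
import numpy as np
from itertools import combinations

def j_characteristics(D):
    n, k = D.shape
    out = {}
    for r in range(1, k + 1):
        for s in combinations(range(k), r):
            out[s] = abs(int(D[:, s].prod(axis=1).sum()))
    return out

# Example: a regular 2^(3-1) fraction with defining relation I = ABC.
D = np.array([[ 1,  1,  1],
              [ 1, -1, -1],
              [-1,  1, -1],
              [-1, -1,  1]])
J = j_characteristics(D)   # J for columns (0, 1, 2) equals 4 (= n); all others are 0
```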

Journal ArticleDOI
TL;DR: In this paper, the authors use attributeable effects to expand the scope of pivotal arguments to include displacement effects and the Mann-Whitney-Wilcoxon statistic, and in each case removing an appropriate attributable effect restores the familiar null randomisation distribution of the associated statistic, yielding exact inferences.
Abstract: SUMMARY In randomisation and permutation inference, pivotal arguments remove the hypothesised treatment effect, thereby basing inferences on the null distribution in which the treatment has no effect. This is common, for instance, with additive treatment effects. The current paper uses 'attributable effects' to expand substantially the scope of pivotal arguments. Attributable effects are defined for three cases, namely the 2 x 2 contingency table, displacement effects and the Mann-Whitney-Wilcoxon statistic, and in each case removing an appropriate attributable effect restores the familiar null randomisation distribution of the associated statistic, yielding exact inferences. The procedure extends immediately for use in sensitivity analysis in nonrandomised observational studies.