Showing papers in "Biometrika in 1997"


Journal ArticleDOI
TL;DR: In this article, a new way of introducing a parameter to expand a family of distributions is introduced and applied to yield a new two-parameter extension of the exponential distribution which may serve as a competitor to such commonly-used two-parameter families of life distributions as the Weibull, gamma and lognormal distributions.
Abstract: SUMMARY A new way of introducing a parameter to expand a family of distributions is introduced and applied to yield a new two-parameter extension of the exponential distribution which may serve as a competitor to such commonly-used two-parameter families of life distributions as the Weibull, gamma and lognormal distributions. In addition, the general method is applied to yield a new three-parameter Weibull distribution. Families expanded using the method introduced here have the property that the minimum of a geometric number of independent random variables with common distribution in the family has a distribution again in the family. Bivariate versions are also considered.
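
The geometric-minimum closure property invites a quick numerical check. The sketch below assumes, since the abstract does not display it, that the extended exponential has the survival function S(x) = αe^{−λx} / (1 − (1 − α)e^{−λx}) usually associated with this construction; the minimum of a Geometric(p) number of unit-rate exponentials should then follow this law with α = p.

import numpy as np

rng = np.random.default_rng(0)
lam, p, n_rep = 1.0, 0.3, 50_000

# Minimum of a Geometric(p) number of iid Exp(lam) lifetimes.
N = rng.geometric(p, size=n_rep)                       # support 1, 2, ...
mins = np.array([rng.exponential(1.0 / lam, k).min() for k in N])

# Compare the empirical survival function with the assumed family member.
x = np.linspace(0.2, 3.0, 5)
emp = np.array([(mins > xi).mean() for xi in x])
thy = p * np.exp(-lam * x) / (1.0 - (1.0 - p) * np.exp(-lam * x))
for xi, e, t in zip(x, emp, thy):
    print(f"x = {xi:.1f}   empirical {e:.4f}   formula {t:.4f}")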

1,016 citations


Journal ArticleDOI
TL;DR: Methods for estimating non-Gaussian time series models rely on Markov chain Monte Carlo to carry out simulation smoothing and Bayesian posterior analysis of parameters, and on importance sampling to estimate the likelihood function for classical inference.
Abstract: SUMMARY In this paper we provide methods for estimating non-Gaussian time series models. These techniques rely on Markov chain Monte Carlo to carry out simulation smoothing and Bayesian posterior analysis of parameters, and on importance sampling to estimate the likelihood function for classical inference. The time series structure of the models is used to ensure that our simulation algorithms are efficient.

732 citations


Journal ArticleDOI
TL;DR: In this paper, Monte Carlo simulation is used to obtain accurate approximations to the loglikelihood of state space models for observations with non-Gaussian distributions, illustrated with a series in which the observations have a Poisson distribution and a series in which the observation errors have a t-distribution.
Abstract: State space models are considered for observations which have non-Gaussian distributions. We obtain accurate approximations to the loglikelihood for such models by Monte Carlo simulation. Devices are introduced which improve the accuracy of the approximations and which increase computational efficiency. The loglikelihood function is maximised numerically to obtain estimates of the unknown hyperparameters. Standard errors of the estimates due to simulation are calculated. Details are given for the important special cases where the observations come from an exponential family distribution and where the observation equation is linear but the observation errors are non-Gaussian. The techniques are illustrated with a series for which the observations have a Poisson distribution and a series for which the observation errors have a t-distribution.
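
To give the flavour of simulation-based likelihood evaluation in this setting, here is a generic bootstrap particle filter for a Poisson local-level model. It is a sketch of the same general task only, not the authors' method, which constructs carefully chosen Gaussian importance densities; the variance sig_eta and the data are invented.

import numpy as np
from scipy.stats import poisson

def pf_loglik(y, sig_eta, n_part=2000, seed=0):
    # Model: alpha_t = alpha_{t-1} + N(0, sig_eta^2), y_t ~ Poisson(exp(alpha_t)).
    rng = np.random.default_rng(seed)
    alpha = rng.normal(0.0, 1.0, n_part)          # rough diffuse initialisation
    loglik = 0.0
    for yt in y:
        alpha = alpha + rng.normal(0.0, sig_eta, n_part)       # propagate
        w = poisson.pmf(yt, np.exp(alpha))                     # reweight
        loglik += np.log(w.mean())                             # likelihood increment
        alpha = rng.choice(alpha, size=n_part, p=w / w.sum())  # resample
    return loglik

y = np.array([2, 1, 0, 3, 2, 4, 1, 0, 1, 2])    # invented count series
print(pf_loglik(y, sig_eta=0.2))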

462 citations


Journal ArticleDOI
TL;DR: In this paper, a global score test for the null hypothesis that all the variance components are zero is proposed, which is a locally asymptotically most stringent test and does not require specifying the joint distribution of the random effects.
Abstract: SUMMARY There is considerable interest in testing for overdispersion, correlation and heterogeneity across groups in biomedical studies. In this paper, we cast the problem in the framework of generalised linear models with random effects. We propose a global score test for the null hypothesis that all the variance components are zero. This test is a locally asymptotically most stringent test and is robust in the special sense that the test does not require specifying the joint distribution of the random effects. We also propose individual score tests and their approximations for testing the variance components separately. Both tests can be easily implemented using existing statistical software. We illustrate these tests with an application to the study of heterogeneity of mating success across males and females in an experiment on salamander matings, and evaluate their performance through simulation.
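
As a concrete one-component special case, the classical score statistic for extra-Poisson variation tests a single variance component equal to zero: T = Σ{(y_i − μ̂_i)² − y_i} / √(2 Σ μ̂_i²), asymptotically standard normal under the null. This textbook statistic, not the paper's general global test, is sketched below on simulated data.

import numpy as np
import statsmodels.api as sm
from scipy.stats import norm

rng = np.random.default_rng(1)
x = rng.normal(size=200)
mu_true = np.exp(0.5 + 0.3 * x)
# Gamma-mixed Poisson responses, so mildly overdispersed under the alternative.
y = rng.poisson(mu_true * rng.gamma(5.0, 1.0 / 5.0, size=200))

fit = sm.GLM(y, sm.add_constant(x), family=sm.families.Poisson()).fit()
mu = fit.fittedvalues
T = ((y - mu) ** 2 - y).sum() / np.sqrt(2.0 * (mu ** 2).sum())
print("score statistic:", T, "  one-sided p-value:", 1 - norm.cdf(T))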

339 citations


Journal ArticleDOI
TL;DR: In this paper, a new nonparametric estimator of the dependence function of a bivariate extreme value distribution is proposed and shown to be uniformly strongly convergent and asymptotically unbiased; it also yields a nonparametric estimator of Tawn's dependence measure.
Abstract: SUMMARY A bivariate extreme value distribution with fixed marginals is generated by a one-dimensional map called a dependence function. This paper proposes a new nonparametric estimator of this function. Its asymptotic properties are examined, and its small-sample behaviour is compared to that of other rank-based and likelihood-based procedures. The new estimator is shown to be uniformly strongly convergent and asymptotically unbiased. Through simulations, it is also seen to perform reasonably well against the maximum likelihood estimator based on the correct model and to have smaller L1, L2 and L∞ errors than any existing nonparametric alternative. The √n-consistency of the proposed estimator leads to nonparametric estimation of Tawn's (1988) dependence measure that may be used to test independence in small samples.
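
For context, the classical Pickands (1981) estimator, the usual baseline for estimators of this kind, is a few lines of code: with unit exponential margins, min{X/(1 − t), Y/t} is exponential with rate A(t). The sketch below implements only this baseline, under independence, where A(t) ≡ 1; the paper's new rank-based estimator is not reproduced.

import numpy as np

rng = np.random.default_rng(2)
n = 5000
u, v = rng.uniform(size=n), rng.uniform(size=n)   # independence copula: A(t) = 1
x, y = -np.log(u), -np.log(v)                     # unit exponential margins

def pickands(t):
    # min(X/(1-t), Y/t) ~ Exp(A(t)), so A(t) is estimated by n / sum of minima.
    xi = np.minimum(x / (1.0 - t), y / t)
    return n / xi.sum()

for t in (0.25, 0.5, 0.75):
    print(f"t={t}: A_hat={pickands(t):.3f}   (true A(t)=1 under independence)")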

293 citations


Journal ArticleDOI
TL;DR: In this article, an approach is proposed to optimal design of experiments for estimating random-effects regression models, where the population designs are defined by the number of subjects and the individual designs to be performed.
Abstract: SUMMARY An approach is proposed to optimal design of experiments for estimating random-effects regression models. The population designs are defined by the number of subjects and the individual designs to be performed. Cost functions associated with individual designs are incorporated. For a given maximal cost, an algorithm is proposed for finding the statistical population design that maximises the determinant of the Fisher information matrix of the population parameters. The Fisher information matrix is formulated for linear models and normal distributions. The approach is applied to the design of an optimal experiment in toxicokinetics using a first-order linearisation of the model. Several cost functions and designs of various orders are studied. An example illustrates the optimal population designs and the increased efficiency of some optimal designs over more standard designs.
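
A schematic of the criterion for a linear random-coefficient model: candidate individual designs are scored by the determinant of the population Fisher information for the fixed effects, subject to a budget. All variance parameters, candidate times and costs below are invented for illustration.

import itertools
import numpy as np

D = np.diag([0.5, 0.1])   # random-effects covariance (assumed known)
sig2 = 1.0                # residual variance

def info_per_subject(times):
    # Fixed-effects information for y = X beta + X b + eps with b ~ N(0, D):
    # M = X' V^{-1} X, where V = X D X' + sig2 * I.
    X = np.column_stack([np.ones(len(times)), times])
    V = X @ D @ X.T + sig2 * np.eye(len(times))
    return X.T @ np.linalg.solve(V, X)

# Candidate individual designs of 2 or 3 sampling times, with a per-sample
# cost of 10 and a total budget of 1500 (all hypothetical numbers).
grid = [0.0, 1.0, 2.0, 4.0, 8.0, 12.0]
candidates = [c for r in (2, 3) for c in itertools.combinations(grid, r)]
cost_per_sample, budget = 10, 1500

def design_score(times):
    n_subjects = budget // (cost_per_sample * len(times))
    return np.linalg.det(n_subjects * info_per_subject(np.array(times)))

best = max(candidates, key=design_score)
print("D-optimal design under the budget:", best)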

271 citations


Journal ArticleDOI
TL;DR: In this paper, a Kolmogorov-Smirnov-type statistic is proposed to test the validity of the logistic link function under a case-control sampling plan.
Abstract: SUMMARY We test the logistic regression assumption under a case-control sampling plan. After reparameterisation, the assumed logistic regression model is equivalent to a two-sample semiparametric model in which the log ratio of two density functions is linear in the data. By identifying this model with a biased sampling model, we propose a Kolmogorov-Smirnov-type statistic to test the validity of the logistic link function. Moreover, we point out that this test statistic can also be used in mixture sampling. We present a bootstrap procedure along with some results on simulation and on analysis of two real datasets.

248 citations


Journal ArticleDOI
TL;DR: In this paper, an alternative methodology for extreme values of univariate time series was developed, by assuming that the time series is Markovian and using bivariate extreme value theory to suggest appropriate models for the transition distributions.
Abstract: In recent research on extreme value statistics, there has been an extensive development of threshold methods, first in the univariate case and subsequently in the multivariate case as well. In this paper, an alternative methodology for extreme values of univariate time series is developed, by assuming that the time series is Markovian and using bivariate extreme value theory to suggest appropriate models for the transition distributions. A new likelihood representation for threshold methods is presented which we apply to a Markovian time series. An important motivation for developing this kind of theory is the possibility of calculating probability distributions for functionals of extreme events. We address this issue by showing how a theory of compound Poisson limits for additive functionals can be combined with simulation to obtain numerical solutions for problems of practical interest. The methods are illustrated by application to temperature data.

227 citations


Journal ArticleDOI
TL;DR: In this article, the authors consider fitting categorical regression models to data obtained by either stratified or nonstratified case-control, or response selective, sampling from a finite population with known population totals in each response category.
Abstract: SUMMARY We consider fitting categorical regression models to data obtained by either stratified or nonstratified case-control, or response selective, sampling from a finite population with known population totals in each response category. With certain models, such as the logistic with appropriate constant terms, a method variously known as conditional maximum likelihood (Breslow & Cain, 1988) or pseudo-conditional likelihood (Wild, 1991), which involves the prospective fitting of a pseudo-model, yields maximum likelihood estimates from case-control data. We extend these results by showing that maximum likelihood estimates for any model can be found by iterating this process with a simple updating of offset parameters. Attention is also paid to estimation of the asymptotic covariance matrix. One benefit of the results of this paper is the ability to obtain maximum likelihood estimates of the parameters of logistic models for stratified case-control studies, cf. Breslow & Cain (1988) and Scott & Wild (1991), using an ordinary logistic regression program, even when the stratum constants are modelled.
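
The basic device, prior to the paper's iterative extension, can be sketched as an ordinary logistic fit with a fixed offset equal to the log ratio of the sampling fractions; the data, totals and covariate below are invented.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n0, n1 = 200, 200            # sampled controls and cases
N0, N1 = 90_000, 10_000      # known population totals per response category

# Toy covariate: equal-variance normal shift gives a logistic-linear density ratio.
x = np.r_[rng.normal(0.0, 1.0, n0), rng.normal(0.5, 1.0, n1)]
y = np.r_[np.zeros(n0), np.ones(n1)]
offset = np.full_like(y, np.log((n1 / N1) / (n0 / N0)))

fit = sm.GLM(y, sm.add_constant(x), offset=offset,
             family=sm.families.Binomial()).fit()
print(fit.params)            # intercept is corrected to the population scale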

223 citations


Journal ArticleDOI
TL;DR: In this article, Bayes/frequentist correspondences between p-values and posterior probabilities of the null hypothesis are extended to multiple testing, in particular to the Bonferroni method, in which p-values are adjusted by multiplying by k, the number of tests considered.
Abstract: SUMMARY Bayes/frequentist correspondences between the p-value and the posterior probability of the null hypothesis have been studied in univariate hypothesis testing situations. This paper extends these comparisons to multiple testing and in particular to the Bonferroni multiple testing method, in which p-values are adjusted by multiplying by k, the number of tests considered. In the Bayesian setting, prior assessments may need to be adjusted to account for multiple hypotheses, resulting in corresponding adjustments to the posterior probabilities. Conditions are given for which the adjusted posterior probabilities roughly correspond to Bonferroni adjusted p-values.
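
The frequentist side of the correspondence is one line of code (hypothetical p-values below); the Bayesian analogue discussed in the paper instead rescales the prior probabilities assigned to the k null hypotheses.

import numpy as np

p = np.array([0.004, 0.020, 0.130, 0.650])   # hypothetical raw p-values
k = len(p)
p_bonf = np.minimum(k * p, 1.0)              # Bonferroni: multiply by k, cap at 1
print(p_bonf)                                # [0.016 0.08  0.52  1.  ]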

222 citations


Journal ArticleDOI
TL;DR: In this paper, a model for longitudinal ordinal data with non-random dropout was proposed, which combines the multivariate Dale model with a logistic regression model for dropout.
Abstract: A model is proposed for longitudinal ordinal data with nonrandom drop-out, which combines the multivariate Dale model for longitudinal ordinal data with a logistic regression model for drop-out. Since response and drop-out are modelled as conditionally independent given complete data, the resulting likelihood can be maximised relatively simply, using the EM algorithm, which with acceleration is acceptably fast and, with appropriate additions, can produce estimates of precision. The approach is illustrated with an example. Such modelling of nonrandom drop-out requires caution because the interpretation of the fitted models depends on assumptions that are unexaminable in a fundamental sense, and the conclusions cannot be regarded as necessarily robust. The main role of such modelling may be as a component of a sensitivity analysis.

Journal ArticleDOI
TL;DR: In this article, necessary and sufficient conditions for the redundancy of a wide class of nonlinear models for data distributed according to the exponential family are established, and the likelihood surfaces for parameter-redundant models possess completely flat ridges.
Abstract: SUMMARY Necessary and sufficient conditions are established for the parameter redundancy of a wide class of nonlinear models for data distributed according to the exponential family. The likelihood surfaces for parameter-redundant models possess completely flat ridges. Whether a model is parameter redundant can be established by checking the rank of a derivative matrix, using a symbolic algebra package. A feature of contingency table applications is the need to extend conclusions from particular to general dimensions. We meet this via an extension theorem. Examples are given from the area of animal survival estimation using mark-recapture/recovery data.
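
The derivative-matrix rank check lends itself directly to symbolic computation. Below is a minimal sympy sketch on an invented toy model (not one from the paper): the mean vector mu = (ac, bc, (a + b)c) involves three parameters, but only the two products ac and bc are identifiable, and the rank test detects this.

import sympy as sp

a, b, c = sp.symbols("a b c", positive=True)
mu = sp.Matrix([a * c, b * c, (a + b) * c])   # toy mean vector
Dmat = mu.jacobian([a, b, c])                 # derivative matrix
print(Dmat.rank())                            # 2 < 3: parameter redundant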

Journal ArticleDOI
TL;DR: In this article, the authors proposed a minimum chi-squared estimator for the parameters of an ergodic system of stochastic differential equations with partially observed state, and proved that the efficiency of the estimator approaches that of maximum likelihood as the number of moment functions entering the chi-square criterion increases.
Abstract: SUMMARY We propose a minimum chi-squared estimator for the parameters of an ergodic system of stochastic differential equations with partially observed state. We prove that the efficiency of the estimator approaches that of maximum likelihood as the number of moment functions entering the chi-squared criterion increases and as the number of past observations entering each moment function increases. The minimised criterion is asymptotically chi-squared and can be used to test system adequacy. When a fitted system is rejected, inspecting studentised moments suggests how the fitted system might be modified to improve the fit. The method and diagnostic tests are applied to daily observations on the U.S. dollar to Deutschmark exchange rate from 1977 to 1992.

Journal ArticleDOI
Oliver Linton1
TL;DR: In this article, the integration method of Linton & Nielsen (1995) provides starting values for a one-step backfitting procedure that estimates additive nonparametric regression models.
Abstract: SUMMARY We define a new procedure for estimating additive nonparametric regression models. We use the integration method of Linton & Nielsen (1995) to provide starting values that are then used in a one-step backfitting procedure. We show that our new method is efficient in a certain sense and dominates the straight integration method according to mean squared error.

Journal ArticleDOI
TL;DR: In this paper, the authors present a method that enables them to calculate the survival distribution of quality adjusted lifetime and, using martingale theory for counting processes, show that the estimator is consistent and asymptotically normally distributed, with an asymptotic variance estimate that can be obtained analytically.
Abstract: SUMMARY Quality adjusted survival analysis is a new approach to therapy evaluation in clinical trials. It has received much attention recently because of its ability to take patients' quality of life into consideration. In this paper, we present a method that enables us to calculate the survival distribution of quality adjusted lifetime. Using martingale theory for counting processes, we can show that our estimator is consistent and asymptotically normally distributed, and its asymptotic variance estimate can be obtained analytically. Simulation experiments are conducted to compare our estimator with the true underlying distribution for two cases that are of practical importance.

Journal ArticleDOI
TL;DR: In a simulation study it is verified that the modified AIC and modified Cp provide better approximations to their risk functions, and better model selection, than AIC and Cp.
Abstract: The Akaike information criterion, AIC, and Mallows' Cp criterion have been proposed as approximately unbiased estimators for their risks or underlying criterion functions. In this paper we propose modified AIC and Cp for selecting multivariate linear regression models. Our modified AIC and modified Cp are intended to reduce bias in situations where the collection of candidate models includes both underspecified and overspecified models. In a simulation study it is verified that the modified AIC and modified Cp provide better approximations to their risk functions, and better model selection, than AIC and Cp.
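
For reference, the unmodified criteria in the univariate case are AIC = n log(RSS/n) + 2k and Cp = RSS/σ̂² − n + 2k, with σ̂² from the largest candidate model. The sketch below computes both for a correctly specified and an overspecified model; the paper's modified criteria add small-sample bias corrections not reproduced here.

import numpy as np

def aic_cp(y, X, sigma2_full):
    # Ordinary least squares fit, then the two classical criteria.
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = ((y - X @ beta) ** 2).sum()
    aic = n * np.log(rss / n) + 2 * k
    cp = rss / sigma2_full - n + 2 * k
    return aic, cp

rng = np.random.default_rng(4)
n = 100
x1, x2 = rng.normal(size=(2, n))
y = 1.0 + 2.0 * x1 + rng.normal(size=n)       # x2 is irrelevant

X_full = np.column_stack([np.ones(n), x1, x2])
beta_f, *_ = np.linalg.lstsq(X_full, y, rcond=None)
s2_full = ((y - X_full @ beta_f) ** 2).sum() / (n - 3)

for cols, name in [((0, 1), "true model"), ((0, 1, 2), "overspecified")]:
    print(name, aic_cp(y, X_full[:, cols], s2_full))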

Journal ArticleDOI
TL;DR: In this paper, a general regression model is proposed to evaluate covariate effects on ROC's. The method is illustrated on data from a study of multiformat photographic images used for scintigraphy.
Abstract: SUMMARY Receiver operating characteristic curves (ROC's) are used to evaluate diagnostic tests when test results are not binary. They describe the inherent capacity of the test for distinguishing between truly diseased and nondiseased subjects. Although methodology for estimating and for comparing ROC's is well developed, to date no general framework exists for evaluating covariate effects on ROC's. We formulate a general regression model which allows the effects of covariates on test accuracy to be succinctly summarised. Such covariates might include, for example, characteristics of the patient or test environment, test type or severity of disease. The regression models are shown to arise naturally from some classic models for continuous or ordinal test data. Regression parameters are fitted using an estimating equation approach. The method is illustrated on data from a study of multiformat photographic images used for scintigraphy.
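
For background, the empirical ROC curve itself is short to compute: it plots the true positive rate against the false positive rate over all thresholds. The sketch below uses invented normal test results; the paper's regression machinery for covariate effects is not reproduced here.

import numpy as np

rng = np.random.default_rng(5)
healthy = rng.normal(0.0, 1.0, 300)    # test results, nondiseased subjects
diseased = rng.normal(1.2, 1.0, 200)   # test results, diseased subjects

# Sweep thresholds from +inf down to -inf so the curve runs from (0,0) to (1,1).
thresholds = np.r_[np.inf, np.sort(np.r_[healthy, diseased])[::-1], -np.inf]
tpr = [(diseased > t).mean() for t in thresholds]
fpr = [(healthy > t).mean() for t in thresholds]
auc = np.trapz(tpr, fpr)
print(f"empirical AUC = {auc:.3f}")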

Journal ArticleDOI
TL;DR: In this article, the authors develop empirical likelihood in time series models: by application of Whittle's estimation method, one obtains an M-estimator of the parameter of a stationary time series from approximately independent observations, the periodogram ordinates.
Abstract: SUMMARY This paper develops empirical likelihood in time series models. By application of Whittle's estimation method, one obtains an M-estimator of the parameter of a stationary time series from approximately independent observations, the periodogram ordinates. This estimator is used to obtain an empirical likelihood ratio which is asymptotically distributed as χ² and which can be used to construct confidence regions. A procedure for the Bartlett correction is also proposed. Finally, small sample properties of the empirical likelihood confidence regions are explored through a simulation.
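
Whittle's approximation, on which the construction rests, treats periodogram ordinates as approximately independent exponential variables with means f(ω_j; θ). Below is a minimal sketch of Whittle estimation for an AR(1) parameter; the empirical likelihood layered on top of these estimating equations is the paper's contribution and is not reproduced.

import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(6)
n, phi_true = 512, 0.6
x = np.zeros(n)
for t in range(1, n):                      # simulate an AR(1) series
    x[t] = phi_true * x[t - 1] + rng.normal()

j = np.arange(1, (n - 1) // 2 + 1)
w = 2 * np.pi * j / n
I = np.abs(np.fft.fft(x)[j]) ** 2 / (2 * np.pi * n)   # periodogram ordinates

def neg_whittle(phi, sig2=1.0):
    # AR(1) spectral density f(w) = sig2 / (2*pi*(1 - 2*phi*cos(w) + phi^2)).
    f = sig2 / (2 * np.pi * (1 - 2 * phi * np.cos(w) + phi ** 2))
    return np.sum(np.log(f) + I / f)

res = minimize_scalar(neg_whittle, bounds=(-0.99, 0.99), method="bounded")
print("Whittle estimate of phi:", res.x)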

Journal ArticleDOI
TL;DR: In this article, it is observed that the conditional probability of ever being included in a nested case-control study can be obtained, where conditioning is on the information needed to carry out the study; these inclusion probabilities are used as inverse weights in pseudolikelihoods.
Abstract: In nested case-control studies the controls are sampled from the risk set at the failure times of the cases. The analytical basis for such studies has been limited to semiparametric estimators under proportional-hazard models. In this paper it is observed that conditional inclusion probabilities of ever being included in the nested case-control study can be obtained, where conditioning is on the information needed to carry out a nested case-control study. The inclusion probabilities are used in pseudolikelihoods by weighting the individual log-likelihood contributions by their inverse. This makes it possible to fit parametric regression models. Also a new semiparametric estimator is obtained under the proportional-hazard model. The methods are illustrated by simulation experiments and by application to a dataset.
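
A sketch of the inclusion probabilities in code, under the usual nested case-control scheme with m controls drawn per case; the formula below is the standard one for this design and is stated as an assumption, since the abstract does not display it. A never-failing subject is included only if drawn at some failure time while at risk, so p_i = 1 − Π over failure times t_j ≤ time_i of (1 − m/(n(t_j) − 1)), with n(t_j) the risk-set size; cases have p_i = 1.

import numpy as np

def inclusion_probs(time, event, m=1):
    p = np.ones(len(time))
    fail_times = np.sort(time[event == 1])
    for i in np.where(event == 0)[0]:              # potential controls only
        at_risk = fail_times[fail_times <= time[i]]
        n_risk = np.array([np.sum(time >= t) for t in at_risk])
        # Clip guards against tiny risk sets where m >= n(t_j) - 1.
        factor = np.clip(1.0 - m / (n_risk - 1.0), 0.0, 1.0)
        p[i] = 1.0 - np.prod(factor)
    return p

rng = np.random.default_rng(8)
t = rng.exponential(1.0, size=100)     # invented follow-up times
d = rng.binomial(1, 0.3, size=100)     # invented event indicators
p = inclusion_probs(t, d, m=2)
print(p[:5])   # each subject actually sampled gets pseudolikelihood weight 1/p_i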

Journal ArticleDOI
TL;DR: In this paper, a Bayesian methodology for estimating the size of a closed population from multiple incomplete administrative lists is proposed, which allows for a variety of dependence structures between the lists, can make use of covariates, and explicitly accounts for model uncertainty.
Abstract: SUMMARY A Bayesian methodology for estimating the size of a closed population from multiple incomplete administrative lists is proposed. The approach allows for a variety of dependence structures between the lists, can make use of covariates, and explicitly accounts for model uncertainty. Interval estimates from this approach are compared to frequentist and previously published Bayesian approaches. Several examples are considered.

Journal ArticleDOI
TL;DR: In this paper, the authors consider priors obtained by ensuring approximate frequentist validity of posterior quantiles and the posterior distribution function, and show that, at the second order of approximation, the two approaches do not necessarily lead to identical conclusions.
Abstract: SUMMARY The paper considers priors obtained by ensuring approximate frequentist validity of (a) posterior quantiles, and of (b) the posterior distribution function. It is seen that, at the second order of approximation, the two approaches do not necessarily lead to identical conclusions. Examples are given to illustrate this. The role of invariance in the context of probability matching is also discussed.

Journal ArticleDOI
TL;DR: In this paper, two methods are proposed for handling missing covariates when using the Cox proportional hazards model; the proposed approach provides a consistent regression parameter estimator when the probability of missingness depends on the failure or censoring time as well as on the observed covariates.
Abstract: SUMMARY We propose two methods for handling missing covariates when using the Cox proportional hazards model. The maximum partial likelihood estimator based only on study subjects having complete covariates does not utilise all available information. Also it is biased when the probability of missingness depends on the failure or censoring time. Our suggestion is to impute the conditional expectation of the statistic involving missing covariates given the available information. The proposed method provides a consistent regression parameter estimator when the probability of missingness depends on the failure or censoring time as well as on the observed covariates. Also the proposed estimator is more efficient than the estimator suggested previously by Lin & Ying (1993), when data are missing completely at random.

Journal ArticleDOI
Mike West1
TL;DR: In this article, a constructive result on time series decomposition is presented and illustrated, which is useful in analysis of an observed time series through inference about underlying, latent component series that may have physical interpretations.
Abstract: A constructive result on time series decomposition is presented and illustrated. Developed through dynamic linear models, the decomposition is useful in analysis of an observed time series through inference about underlying, latent component series that may have physical interpretations. Particular special cases include state space autoregressive component models, in which the decomposition is especially useful for isolating latent, quasi-cyclical components. Brief summaries of analyses of some geological records related to climatic change illustrate the result.
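
To convey the flavour of the autoregressive special case: writing an AR(p) component in state space form, the eigenvalues of its companion (transition) matrix split the series into real, trend-like components and complex-conjugate, quasi-cyclical ones, the latter with period 2π/arg(λ) and damping given by |λ|. A small sketch with invented AR(2) coefficients:

import numpy as np

phi = np.array([1.2, -0.8])          # illustrative AR(2) coefficients
G = np.array([[phi[0], phi[1]],      # companion / state transition matrix
              [1.0,    0.0]])
eig = np.linalg.eigvals(G)
for lam in eig[np.imag(eig) > 0]:    # report one of each conjugate pair
    print(f"modulus={abs(lam):.3f}  period={2 * np.pi / np.angle(lam):.2f}")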

Journal ArticleDOI
TL;DR: In this paper, a nonparametric model for the unknown joint distribution for the missing data, the observed covariates and the proxy is proposed, which defines the measurement error component of the model which relates the missing covariates X with a proxy W.
Abstract: SUMMARY We develop a model and a numerical estimation scheme for a Bayesian approach to inference in case-control studies with errors in covariables. The model proposed in this paper is based on a nonparametric model for the unknown joint distribution for the missing data, the observed covariates and the proxy. This nonparametric distribution defines the measurement error component of the model which relates the missing covariates X with a proxy W. The oxymoron 'nonparametric Bayes' refers to a class of flexible mixture distributions. For the likelihood of disease, given covariates, we choose a logistic regression model. By using a parametric disease model and nonparametric exposure model we obtain robust, interpretable results quantifying the effect of exposure.

Journal ArticleDOI
TL;DR: In this article, a method based on a generalised influence function and generalised Cook statistic is introduced to assess the local influence of small perturbations on a statistic of interest, and this method is equivalent to Cook's work based on normal curvature in the likelihood framework.
Abstract: SUMMARY Based on the definitions of a generalised influence function and a generalised Cook statistic, the local influence of small perturbations on the eigenvalues and eigenvectors of a covariance matrix is studied for population and sample versions. The results based on the correlation matrix are also derived and some related topics are discussed. The method is shown to be equivalent, in the likelihood framework, to Cook's (1986) approach based on normal curvature; since local influence on some statistics in multivariate analysis is not easy to study using Cook's likelihood displacement, the generalised Cook statistic is expected to have wider applications, and it is employed here to assess local influence in principal components analysis. Finally, an example is used for illustration.
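
The classical perturbation fact underlying this kind of influence analysis is that a small symmetric perturbation εE moves the i-th eigenvalue of a symmetric matrix S by approximately ε v_i'Ev_i, where v_i is the corresponding unit eigenvector. A quick numerical check:

import numpy as np

rng = np.random.default_rng(7)
A = rng.normal(size=(4, 4)); S = A @ A.T          # a covariance-like matrix
E = rng.normal(size=(4, 4)); E = (E + E.T) / 2    # symmetric perturbation
eps = 1e-5

vals, vecs = np.linalg.eigh(S)
vals_perturbed = np.linalg.eigvalsh(S + eps * E)
# First-order prediction: lambda_i + eps * v_i' E v_i for each i.
pred = vals + eps * np.einsum("ij,ik,kj->j", vecs, E, vecs)
print(np.max(np.abs(vals_perturbed - pred)))      # discrepancy is O(eps^2)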

Journal ArticleDOI
TL;DR: In this article, the Cox model marginal survivor function and pairwise correlation models are specified for a multivariate failure time vector and the corresponding mean and covariance structure for the cumulative baseline hazard variates and standard baseline hazard function estimators are used to develop joint estimating equations for hazard ratio and correlation parameters, in the absence of censorship.
Abstract: SUMMARY Cox model marginal survivor function and pairwise correlation models are specified for a multivariate failure time vector. The corresponding mean and covariance structure for the cumulative baseline hazard variates and standard baseline hazard function estimators are used to develop joint estimating equations for hazard ratio and correlation parameters, in the absence of censorship. Semiparametric models for pairwise survivor functions are required to generalise these equations to allow arbitrary right censorship. Under Clayton model bivariate distributions the resulting equations lead to joint estimators of hazard ratio and cross ratio parameters, and to inferences with useful and ready interpretation. For example, these estimates yield summary measures of pairwise dependency that have been adjusted for covariate effects on marginal hazard rates. Solutions to the proposed estimating equations are shown to be quite generally consistent and asymptotically normally distributed. Moderate sample size properties are examined in simulation studies, and illustrations are provided.

Journal ArticleDOI
TL;DR: In this paper, the authors investigated the effect of long-range dependence on bandwidth selection for kernel regression with the plug-in method of Herrmann, Gasser & Kneip.
Abstract: SUMMARY We investigate the effect of long-range dependence on bandwidth selection for kernel regression with the plug-in method of Herrmann, Gasser & Kneip (1992). A new bandwidth estimator is proposed to allow for long-range dependence. Properties of the proposed estimator are investigated theoretically and via simulation. We find that the proposed estimator performs well in terms of integrated squared error of the estimated trend, allowing us to incorporate both deterministic nonlinear features having an unknown structure and long-range dependence into a single model. The method is illustrated using biweekly measurements of the volume of the Great Salt Lake.

Journal ArticleDOI
TL;DR: This work adapts frailty models to relevant censoring processes, and proposes a predictively oriented test statistic and a related estimation procedure based on summing concurrent martingales, each representing a 'landmark' analysis.
Abstract: Serial biological markers are often used in medicine as prognostic indicators. For example, elevated tumour antigen tests presage tumour recurrence; a rapid decrease in the CD4 cell counts of HIV+ individuals indicates high risk for developing AIDS. Methods for evaluating the usefulness of markers should address how the prediction of future risk is revised, in a way that assists the making of decisions at the time such an indicator is observed. Proportional hazards modelling with time-varying covariates, a common approach, seems unnatural for this purpose because its orientation is explanatory; it addresses the question 'Is the individual suffering a failure more likely to have had the marker?', and allows use of all information prior to the failure, rather than the more relevant predictive question 'Does the appearance of the marker alter the individual's subsequent risk enough to warrant intervention today?'. Frailty models do address this question, but available methods for their analysis make censoring assumptions which are inappropriate in the predictive setting. We adapt frailty models to relevant censoring processes, and propose a predictively oriented test statistic and a related estimation procedure based on summing concurrent martingales, each representing a 'landmark' analysis.

Journal ArticleDOI
TL;DR: In this article, new bivariate survival function estimators are proposed for the case where the dependence relationship between the censoring variables is modelled, and large sample properties of the proposed estimators are discussed.
Abstract: SUMMARY New bivariate survival function estimators are proposed in the case where the dependence relationship between the censoring variables is modelled. Specific examples include the cases where the censoring variables are univariate, mutually independent or specified by a marginal model. Large sample properties of the proposed estimators are discussed. The finite sample performance of the proposed estimators compared with other fully nonparametric estimators is studied via simulations. A real data example is given.

Journal ArticleDOI
TL;DR: In this paper, the authors proposed a new span selector based on unbiased risk estimation, which does not require the unknown spectrum to possess a second derivative, which is a typical requirement of most plug-in type or spline-based methods.
Abstract: SUMMARY One approach to estimating the spectral density of a stationary time series is to smooth the periodogram and one important component of this approach is the choice of the span for smoothing. This note proposes a new span selector which is based on unbiased risk estimation. The proposed span selector is simple, and does not impose strong conditions on the unknown spectrum. For example, it does not require the unknown spectrum to possess a second derivative, which is a typical requirement of most plug-in type or spline-based methods. The finite sample performance of the proposed span selector is illustrated via a small simulation.