
Showing papers in "Journal of The Royal Statistical Society Series C-applied Statistics in 2001"


Journal ArticleDOI
TL;DR: In this paper, a unified approach for Bayesian inference via Markov chain Monte Carlo (MCMC) simulation in generalized additive and semiparametric mixed models is presented, which is particularly appropriate for discrete and other fundamentally non-Gaussian responses, where Gibbs sampling techniques developed for Gaussian models cannot be applied.
Abstract: Most regression problems in practice require flexible semiparametric forms of the predictor for modelling the dependence of responses on covariates. Moreover, it is often necessary to add random effects accounting for overdispersion caused by unobserved heterogeneity or for correlation in longitudinal or spatial data. We present a unified approach for Bayesian inference via Markov chain Monte Carlo (MCMC) simulation in generalized additive and semiparametric mixed models. Different types of covariates, such as usual covariates with fixed effects, metrical covariates with nonlinear effects, unstructured random effects, trend and seasonal components in longitudinal data and spatial covariates, are all treated within the same general framework by assigning appropriate priors with different forms and degrees of smoothness. The approach is particularly appropriate for discrete and other fundamentally non-Gaussian responses, where Gibbs sampling techniques developed for Gaussian models cannot be applied, but it also works well for Gaussian responses. We use the close relation between nonparametric regression and dynamic or state space models to develop posterior sampling procedures, based on Markov random field priors. They include recent Metropolis-Hastings block move algorithms for dynamic generalized linear models and extensions for spatial covariates as building blocks. We illustrate the approach with a number of applications that arose out of consulting cases, showing that the methods are computationally feasible even in problems with many covariates and large data sets.
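
To make the flavour of such samplers concrete, here is a deliberately minimal sketch, not the authors' implementation: a Poisson response smoothed over a single metrical covariate with a second-order random-walk Markov random field prior, updated by a crude whole-block random-walk Metropolis-Hastings step. The data, the fixed smoothing variance and the step size are invented, and the paper's block move algorithms are far more efficient than this.

```python
# Minimal illustrative sketch, not the authors' implementation: Bayesian
# smoothing of a Poisson response over one metrical covariate, with a
# second-order random-walk (RW2) Markov random field prior on the function
# values and a crude whole-block random-walk Metropolis-Hastings update.
import numpy as np

rng = np.random.default_rng(0)

# simulated data: counts whose log-mean is a smooth function of x
n = 100
x = np.sort(rng.uniform(0, 1, n))
f_true = np.sin(2 * np.pi * x)
y = rng.poisson(np.exp(f_true))

# RW2 prior precision matrix K = D'D, with D the second-difference operator
D = np.diff(np.eye(n), n=2, axis=0)
K = D.T @ D
tau2 = 0.01                      # variance of the RW2 increments (held fixed)

def log_post(f):
    # Poisson log-likelihood plus RW2 log-prior, up to additive constants
    return np.sum(y * f - np.exp(f)) - 0.5 / tau2 * f @ K @ f

f = np.zeros(n)
step = 0.02
keep = []
for it in range(20000):
    prop = f + step * rng.standard_normal(n)       # block random-walk proposal
    if np.log(rng.uniform()) < log_post(prop) - log_post(f):
        f = prop
    if it >= 5000 and it % 10 == 0:
        keep.append(f.copy())

f_hat = np.mean(keep, axis=0)    # posterior mean estimate of the smooth curve
```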

487 citations


Journal ArticleDOI
TL;DR: In this article, a general model for the joint analysis of [T, Y|X] is presented and used to estimate [T|X] and other related functionals by using the relevant information in both T and Y.
Abstract: Summary. In biomedical and public health research, both repeated measures of biomarkers Y as well as times T to key clinical events are often collected for a subject. The scientific question is how the distribution of the responses [T, Y|X] changes with covariates X. [T|X] may be the focus of the estimation, where Y can be used as a surrogate for T. Alternatively, T may be the time to drop-out in a study in which [Y|X] is the target for estimation. Also, the focus of a study might be on the effects of covariates X on both T and Y or on some underlying latent variable which is thought to be manifested in the observable outcomes. In this paper, we present a general model for the joint analysis of [T, Y|X] and apply the model to estimate [T|X] and other related functionals by using the relevant information in both T and Y. We adopt a latent variable formulation like that of Fawcett and Thomas and use it to estimate several quantities of clinical relevance to determine the efficacy of a treatment in a clinical trial setting. We use a Markov chain Monte Carlo algorithm to estimate the model's parameters. We illustrate the methodology with an analysis of data from a clinical trial comparing risperidone with a placebo for the treatment of schizophrenia.

237 citations


Journal ArticleDOI
TL;DR: In this paper, a logistic random-intercepts model was used in the context of a longitudinal clinical trial where the Gauss-Hermite method gave valid results only for a high number of quadrature points (Q).
Abstract: Although generalized linear mixed models are recognized to be of major practical importance, it is also known that they can be computationally demanding. The problem is the evaluation of the integral in calculating the marginalized likelihood. The straightforward method is based on the Gauss–Hermite technique, based on Gaussian quadrature points. Another approach is provided by the class of penalized quasi-likelihood methods. It is commonly believed that the Gauss–Hermite method works relatively well in simple situations but fails in more complicated structures. However, we present here a strikingly simple example of a logistic random-intercepts model in the context of a longitudinal clinical trial where the method gives valid results only for a high number of quadrature points (Q). As a consequence, this result warns the practitioner to examine routinely the dependence of the results on Q. The adaptive Gaussian quadrature, as implemented in the new SAS procedure NLMIXED, offered the solution to our problem. However, even the adaptive version of Gaussian quadrature needs careful handling to ensure convergence.
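
As a concrete illustration of the quadrature issue (with invented data and parameter values, not the trial analysed in the paper), the following sketch approximates the marginal log-likelihood of a single cluster from a logistic random-intercepts model by non-adaptive Gauss-Hermite quadrature and shows how the value can change with the number of quadrature points Q when the random-intercept standard deviation is large.

```python
# Sketch with invented data: marginal log-likelihood of one cluster from a
# logistic random-intercepts model, approximated by (non-adaptive)
# Gauss-Hermite quadrature with Q points.  With a large random-intercept
# standard deviation, small Q can give visibly wrong answers.
import numpy as np
from scipy.special import expit

rng = np.random.default_rng(1)

beta0, sigma = -1.0, 3.0                  # fixed intercept and random-effect s.d.
b = sigma * rng.standard_normal()         # the cluster's random intercept
y = rng.binomial(1, expit(beta0 + b), size=8)

def marginal_loglik(Q):
    # nodes/weights for int exp(-t^2) g(t) dt; substitute b = sqrt(2)*sigma*t
    t, w = np.polynomial.hermite.hermgauss(Q)
    eta = beta0 + np.sqrt(2.0) * sigma * t[:, None]          # Q x n_obs
    lik = np.prod(expit(eta) ** y * (1.0 - expit(eta)) ** (1 - y), axis=1)
    return np.log(np.sum(w * lik) / np.sqrt(np.pi))

for Q in (3, 5, 10, 20, 50):
    print(Q, marginal_loglik(Q))          # watch the dependence on Q
```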

214 citations


Journal ArticleDOI
TL;DR: It is shown that, given an estimate of the average age‐specific hazard of infection, a particular leading left eigenfunction is required to specify R0, and it is proposed that the choice of model be guided by a criterion based on similarity of their contact functions, which allows model uncertainty to be taken into account.
Abstract: The basic reproduction number of an infection, R0, is the average number of secondary infections generated by a single typical infective individual in a totally susceptible population. It is directly related to the effort required to eliminate infection. We consider statistical methods for estimating R0 from age-stratified serological survey data. The main difficulty is indeterminacy, since the contacts between individuals of different ages are not observed. We show that, given an estimate of the average age-specific hazard of infection, a particular leading left eigenfunction is required to specify R0. We review existing methods of estimation in the light of this indeterminacy. We suggest using data from several infections transmitted via the same route, and we propose that the choice of model be guided by a criterion based on similarity of their contact functions. This approach also allows model uncertainty to be taken into account. If one infection induces no lasting immunity, we show that the only additional assumption required to estimate R0 is that the contact function is symmetric. When matched data on two or more infections transmitted by the same route are available, the methods may be extended to incorporate the effect of individual heterogeneity. The approach can also be applied in partially vaccinated populations and to populations comprising loosely linked communities. The methods are illustrated with data on hepatitis A, mumps, rubella, parvovirus, Haemophilus influenzae type b and measles infection.
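
For intuition only: once ages are grouped into classes and a contact (who-acquires-infection-from-whom) matrix is assumed, R0 is the dominant eigenvalue of the next-generation matrix. The matrix, population sizes and infectious period below are hypothetical; the indeterminacy discussed in the paper is precisely that such a matrix is not identified from serological data without extra assumptions such as symmetry.

```python
# Illustration only: with ages grouped into classes and an *assumed* contact
# rate matrix beta (symmetric here), R0 is the dominant eigenvalue of the
# next-generation matrix.  All numbers are hypothetical.
import numpy as np

N = np.array([1000.0, 1000.0, 1000.0])    # susceptible population per age class
D = 0.02                                  # mean infectious period (years)
beta = np.array([[8.0, 2.0, 1.0],         # hypothetical contact rates
                 [2.0, 6.0, 2.0],
                 [1.0, 2.0, 4.0]]) / 1000.0

# element (i, j): expected new infections in class i per infective in class j
next_gen = beta * N[:, None] * D
R0 = np.max(np.real(np.linalg.eigvals(next_gen)))
print(R0)
```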

187 citations


Journal ArticleDOI
TL;DR: This paper extends the definition of surrogate validity proposed by Buyse and co-workers, based on the quality of both trial-level and individual-level associations between the surrogate and true end points, to the case of two failure time end points.
Abstract: Before a surrogate end point can replace a final (true) end point in the evaluation of an experimental treatment, it must be formally ‘validated’. The validation will typically require large numbers of observations. It is therefore useful to consider situations in which data are available from several randomized experiments. For two normally distributed end points Buyse and co-workers suggested a new definition of validity in terms of the quality of both trial level and individual level associations between the surrogate and true end points. This paper extends this approach to the important case of two failure time end points, using bivariate survival modelling. The method is illustrated by using two actual sets of data from cancer clinical trials.

184 citations


Journal ArticleDOI
TL;DR: In this paper, the authors proposed an approach which identifies and incorporates both sources of uncertainty in inference: imprecision due to finite sampling and ignorance due to incompleteness, which produces sets of estimates (regions of ignorance) and sets of confidence regions (combined into regions of uncertainty).
Abstract: Classical inferential procedures induce conclusions from a set of data to a population of interest, accounting for the imprecision resulting from the stochastic component of the model. Less attention is devoted to the uncertainty arising from (unplanned) incompleteness in the data. Through the choice of an identifiable model for non-ignorable non-response, one narrows the possible data-generating mechanisms to the point where inference only suffers from imprecision. Some proposals have been made for assessing the sensitivity to these modelling assumptions; many are based on fitting several plausible but competing models. For example, we could assume that the missing data are missing at random in one model, and then fit an additional model where non-random missingness is assumed. On the basis of data from a Slovenian plebiscite, conducted in 1991, to prepare for independence, it is shown that such an ad hoc procedure may be misleading. We propose an approach which identifies and incorporates both sources of uncertainty in inference: imprecision due to finite sampling and ignorance due to incompleteness. A simple sensitivity analysis considers a finite set of plausible models. We take this idea one step further by considering more degrees of freedom than the data support. This produces sets of estimates (regions of ignorance) and sets of confidence regions (combined into regions of uncertainty).
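
The idea of a region of ignorance can be illustrated with a toy binomial example (all counts invented): rather than committing to one model for the non-respondents, every plausible value of their success probability is entertained, giving an interval of point estimates; attaching a confidence interval to each end gives a region of uncertainty.

```python
# Toy illustration (invented counts): a single proportion with non-response.
# Varying the assumed success probability of non-respondents over [0, 1]
# traces out the region of ignorance; adding sampling error at each end gives
# a region of uncertainty.
import numpy as np

n_obs, y_obs, n_mis = 800, 440, 200        # respondents, 'yes' answers, missing
n = n_obs + n_mis

def estimate(p_mis):
    # overall proportion if non-respondents say 'yes' with probability p_mis
    return (y_obs + p_mis * n_mis) / n

lo, hi = estimate(0.0), estimate(1.0)              # region of ignorance
se = np.sqrt(0.25 / n)                             # conservative binomial s.e.
print("ignorance:", (lo, hi))
print("uncertainty:", (lo - 1.96 * se, hi + 1.96 * se))
```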

110 citations


Journal ArticleDOI
TL;DR: A space–time model for use in environmental monitoring applications is developed as a high dimensional multivariate state space time series model, in which the cross-covariance structure is derived from the spatial context of the component series, in such a way that its interpretation is essentially independent of the particular set of spatial locations at which the data are recorded.
Abstract: Motivated by a specific problem concerning the relationship between radar reflectance and rainfall intensity, the paper develops a space–time model for use in environmental monitoring applications. The model is cast as a high dimensional multivariate state space time series model, in which the cross-covariance structure is derived from the spatial context of the component series, in such a way that its interpretation is essentially independent of the particular set of spatial locations at which the data are recorded. We develop algorithms for estimating the parameters of the model by maximum likelihood, and for making spatial predictions of the radar calibration parameters by using realtime computations. We apply the model to data from a weather radar station in Lancashire, England, and demonstrate through empirical validation the predictive performance of the model.

103 citations


Journal ArticleDOI
TL;DR: In the analysis of paired comparison data concerning European universities and students' characteristics, it is demonstrated how to incorporate subject‐specific information into Bradley–Terry‐type models.
Abstract: Summary. Preference decisions will usually depend on the characteristics of both the judges and the objects being judged. In the analysis of paired comparison data concerning European universities and students' characteristics, it is demonstrated how to incorporate subject-specific information into Bradley-Terry-type models. Using this information it is shown that preferences for universities and therefore university rankings are dramatically different for different groups of students. A log-linear representation of a generalized Bradley-Terry model is specified which allows simultaneous modelling of subject- and object-specific covariates and interactions between them. A further advantage of this approach is that standard software for fitting log-linear models, such as GLIM, can be used.
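
A hedged sketch of the underlying idea (simulated data, and the logit rather than the log-linear parameterization fitted with GLIM in the paper): a Bradley-Terry model can be fitted as a binary regression in which the design row for a comparison of objects i and j is the difference of their indicator vectors; subject-specific covariates would enter as interactions with these columns.

```python
# Sketch with simulated data: Bradley-Terry as a binary regression.  The row
# for a comparison of objects i and j is the difference of their indicator
# vectors (object 0 is the reference); subject covariates would be added as
# interactions with these columns.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
n_obj, n_cmp = 4, 500
worth = np.array([0.8, 0.3, 0.0, -1.1])            # true log-worths

i_idx = rng.integers(0, n_obj, n_cmp)
j_idx = (i_idx + rng.integers(1, n_obj, n_cmp)) % n_obj
p_win = 1.0 / (1.0 + np.exp(-(worth[i_idx] - worth[j_idx])))
wins = rng.binomial(1, p_win)                      # 1 if object i is preferred

X = np.zeros((n_cmp, n_obj))
X[np.arange(n_cmp), i_idx] += 1.0
X[np.arange(n_cmp), j_idx] -= 1.0
X = X[:, 1:]                                       # identifiability constraint

def nll(beta):
    eta = X @ beta
    return np.sum(wins * np.logaddexp(0.0, -eta) + (1 - wins) * np.logaddexp(0.0, eta))

fit = minimize(nll, np.zeros(n_obj - 1), method="BFGS")
print(fit.x)                                       # log-worths relative to object 0
```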

84 citations


Journal ArticleDOI
TL;DR: In this paper, the authors proposed a method for estimating parameters in generalized linear models when the outcome variable is missing for some subjects and the missing data mechanism is non-ignorable, without having to specify a nonignorable model.
Abstract: We propose a method for estimating parameters in generalized linear models when the outcome variable is missing for some subjects and the missing data mechanism is non-ignorable. We assume throughout that the covariates are fully observed. One possible method for estimating the parameters is maximum likelihood with a non-ignorable missing data model. However, caution must be used when fitting non-ignorable missing data models because certain parameters may be inestimable for some models. Instead of fitting a non-ignorable model, we propose the use of auxiliary information in a likelihood approach to reduce the bias, without having to specify a non-ignorable model. The method is applied to a mental health study.

68 citations


Journal ArticleDOI
TL;DR: In this paper, the authors show that the non-informative priors in use in the literature generate a bias towards wider date ranges which does not in general reflect substantial prior knowledge.
Abstract: Bayesian methods are now widely used for analysing radiocarbon dates. We find that the non-informative priors in use in the literature generate a bias towards wider date ranges which does not in general reflect substantial prior knowledge. We recommend using a prior in which the difference between the earliest and latest dates has a uniform distribution. We show how such priors are derived from a simple physical model of the deposition and observation process. We illustrate this in a case-study, examining the effect that various priors have on the reconstructed dates. Bayes factors are used to help to decide model choice problems.

63 citations


Journal ArticleDOI
TL;DR: A control chart based on conditional expected values is proposed to detect changes in the mean strength when censoring occurs due to competing risks, together with a similar chart to protect against possible confounding caused by changes in the mean of the censoring mechanism.
Abstract: In industry, process monitoring is widely employed to detect process changes rapidly. However, in some industrial applications observations are censored. For example, when testing breaking strengths and failure times often a limited stress test is performed. With censored observations, a direct application of traditional monitoring procedures is not appropriate. When the censoring occurs due to competing risks, we propose a control chart based on conditional expected values to detect changes in the mean strength. To protect against possible confounding caused by changes in the mean of the censoring mechanism we also suggest a similar chart to detect changes in the mean censoring level. We provide an example of monitoring bond strength to illustrate the application of this methodology.
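
A simplified sketch of the conditional-expected-value (CEV) idea under assumptions that are cruder than the paper's (fixed right censoring at a test limit, normal strengths, invented values; the paper's setting involves competing-risk censoring): censored observations are replaced by their conditional expectation under the in-control distribution, and subgroup means of the resulting values are charted.

```python
# Simplified sketch: censored observations are replaced by the conditional
# expected value E[X | X > C] under the in-control distribution, and subgroup
# means are charted.  All values are invented.
import numpy as np
from scipy.stats import norm

mu0, sigma0, C = 100.0, 5.0, 105.0        # in-control mean/s.d. and test limit
z = (C - mu0) / sigma0
cev = mu0 + sigma0 * norm.pdf(z) / norm.sf(z)      # E[X | X > C] in control

rng = np.random.default_rng(3)
strengths = rng.normal(mu0, sigma0, size=(20, 5))  # 20 subgroups of size 5
censored = strengths >= C
observed = np.where(censored, C, strengths)        # what the limited test records
cev_values = np.where(censored, cev, observed)     # censored values -> CEV
chart_stat = cev_values.mean(axis=1)               # plot against control limits
print(chart_stat)
```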

Journal ArticleDOI
TL;DR: In this article, the likelihood function is built for a general model with k changepoints and applied to the data set; the parameters are estimated, and life-table and transition probabilities for treatments in different periods of time are given.
Abstract: The likelihood function is built for a general model with k changepoints and applied to the data set; the parameters are estimated, and life-table and transition probabilities for treatments in different periods of time are given. The survival probability functions for different treatments are plotted and compared with the corresponding function for the homogeneous model. The survival functions for the various cohorts submitted for treatment are fitted to the empirical survival functions.

Journal ArticleDOI
TL;DR: The effects of a prospective drug utilization review and patients' characteristics on total in‐patient and out‐patient health care charges are examined and a linear regression model with a non‐constant variance (heteroscedasticity) is proposed.
Abstract: We examine the effects of a prospective drug utilization review and patients' characteristics on total in-patient and out-patient health care charges. Our analysis of charges is complicated by the fact that the total health care charges are skewed. A log-transformation of these charges can normalize their distribution but may not stabilize their variance. To handle these problems, we propose a linear regression model with a non-constant variance (heteroscedasticity). Using results from a fitted linear regression model for log-transformed charges, we also discuss interpreting the regression coefficients in the original scale and estimating the total health care charges to individual patients. Employing these methods, we analyse total health care charges for drug utilization review patients with hypertension and identify patients' factors that are related to their total health care charges.
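
A rough sketch of the general approach under simplifying assumptions (simulated data, a crude two-stage variance-function estimate, and a lognormal retransformation): charges are modelled on the log scale with a covariate-dependent variance, and predictions on the original charge scale use E[Y | x] = exp(x'beta + sigma^2(x)/2). None of the specific choices below are the paper's.

```python
# Rough sketch (simulated data): regression for log charges with a variance
# that depends on a covariate, and retransformation of fitted values to the
# charge scale via the lognormal mean E[Y | x] = exp(x'beta + sigma^2(x) / 2).
import numpy as np

rng = np.random.default_rng(4)
n = 500
x = rng.uniform(0, 1, n)
sigma = 0.4 + 0.6 * x                      # heteroscedasticity on the log scale
log_y = 6.0 + 1.5 * x + sigma * rng.standard_normal(n)

X = np.column_stack([np.ones(n), x])

beta_ols = np.linalg.lstsq(X, log_y, rcond=None)[0]
resid2 = (log_y - X @ beta_ols) ** 2
gamma = np.linalg.lstsq(X, np.log(resid2 + 1e-12), rcond=None)[0]
sigma2_hat = np.exp(X @ gamma)             # rough estimate of the variance function

w = 1.0 / sigma2_hat                       # weighted refit of the mean model
beta_wls = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * log_y))
charges_hat = np.exp(X @ beta_wls + 0.5 * sigma2_hat)   # back on the charge scale
print(beta_wls, charges_hat[:5])
```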

Journal ArticleDOI
TL;DR: In this article, the development and application of non-stationary time series models for multiple EEG series generated from individual subjects in a clinical neuropsychiatric setting are discussed; the subjects are depressed patients experiencing generalized tonic-clonic seizures elicited by electroconvulsive therapy (ECT) given as antidepressant treatment.
Abstract: Summary. Multiple time series of scalp electrical potential activity are generated routinely in electroencephalographic (EEG) studies. Such recordings provide important non-invasive data about brain function in human neuropsychiatric disorders. Analyses of EEG traces aim to isolate characteristics of their spatiotemporal dynamics that may be useful in diagnosis, or may improve the understanding of the underlying neurophysiology or may improve treatment through identifying predictors and indicators of clinical outcomes. We discuss the development and application of nonstationary time series models for multiple EEG series generated from individual subjects in a clinical neuropsychiatric setting. The subjects are depressed patients experiencing generalized tonic-clonic seizures elicited by electroconvulsive therapy (ECT) as antidepressant treatment. Two varieties of models-dynamic latent factor models and dynamic regression models-are introduced and studied. We discuss model motivation and form, and aspects of statistical analysis including parameter identifiability, posterior inference and implementation of these models via Markov chain Monte Carlo techniques. In an application to the analysis of a typical set of 19 EEG series recorded during an ECT seizure at different locations over a patient's scalp, these models reveal time-varying features across the series that are strongly related to the placement of the electrodes. We illustrate various model outputs, the exploration of such time-varying spatial structure and its relevance in the ECT study, and in basic EEG research in general.

Journal ArticleDOI
TL;DR: The relationship between body weight and the onset of tumour is modelled nonparametrically through a penalized spline, allowing non-linearity in the effects of body weight, and a simple extension of the penalized spline model allows this relationship to vary from one experiment to another.
Abstract: Summary. The analysis of animal carcinogenicity data is complicated by various statistical issues. A topic of recent debate is how to control for the effect of the animals' body weight on the outcome of interest, the onset of tumours. We propose a method which incorporates historical information from the control animals in previously conducted experiments. We allow non-linearity in the effects of body weight by modelling the relationship nonparametrically through a penalized spline. A simple extension of the penalized spline model allows the relationship between weight and onset of the tumour to vary from one experiment to another.
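
A minimal penalized-spline sketch under assumed choices (truncated-line basis, 15 equally spaced knots, a fixed smoothing parameter, simulated data): the fit minimises the residual sum of squares plus a ridge penalty on the knot coefficients.

```python
# Minimal sketch (simulated data, assumed knots and smoothing parameter): a
# penalized spline using a truncated-line basis, fitted by ridge-penalized
# least squares with the penalty applied to the knot coefficients only.
import numpy as np

rng = np.random.default_rng(5)
n = 200
w = np.sort(rng.uniform(0, 1, n))                  # body weight, rescaled
y = np.sin(3 * w) + 0.1 * rng.standard_normal(n)   # stand-in outcome

knots = np.linspace(0.05, 0.95, 15)
X = np.column_stack([np.ones(n), w] + [np.maximum(w - k, 0.0) for k in knots])
P = np.diag([0.0, 0.0] + [1.0] * len(knots))       # penalise knot terms only

lam = 1.0                                          # smoothing parameter
b = np.linalg.solve(X.T @ X + lam * P, X.T @ y)
fitted = X @ b
```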

Journal ArticleDOI
TL;DR: A simulation‐based approach to decision theoretic Bayesian optimal design is proposed for choosing sampling times for the anticancer agent paclitaxel, using criteria related to the total area under the curve, the time above a critical threshold and the sampling cost.
Abstract: We propose a simulation-based approach to decision theoretic Bayesian optimal design. The underlying probability model is a population pharmacokinetic model which allows for correlated responses (drug concentrations) and patient-to-patient heterogeneity. We consider the problem of choosing sampling times for the anticancer agent paclitaxel, using criteria related to the total area under the curve, the time above a critical threshold and the sampling cost.

Journal ArticleDOI
TL;DR: The transmission probability for the human immunodeficiency virus is estimated from seroconversion data of a cohort of injecting drug users in Thailand using maximum likelihood methods; the model incorporates each IDU's reported frequency of needle-sharing and injecting acts.
Abstract: We estimate the transmission probability for the human immunodeficiency virus from seroconversion data of a cohort of injecting drug users (IDUs) in Thailand. The transmission probability model developed accounts for interval censoring and incorporates each IDU's reported frequency of needle sharing and injecting acts. Using maximum likelihood methods, the per needle sharing act transmission probability estimate between infectious and susceptible IDUs is 0.008. The effects of covariates, disease dynamics, mismeasured exposure information and the uncertainty of the disease prevalence on the transmission probability estimate are considered.
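
A hedged sketch of a per-act transmission-probability likelihood of this general form, with all numbers invented: if a susceptible injector reports n sharing acts in an interval and pi is an assumed probability that an act is with an infectious partner, the seroconversion probability is 1 - (1 - pi*p)^n, and p is estimated by maximum likelihood. Interval censoring, covariates and mismeasured exposure, which the paper handles, are ignored here.

```python
# Hedged sketch (all numbers invented): per-act transmission probability p.
# With n reported sharing acts in an interval and an assumed probability pi
# that an act involves an infectious partner,
# P(seroconversion) = 1 - (1 - pi*p)^n.
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(6)
pi = 0.35                                   # assumed prevalence among partners
p_true = 0.008
n_acts = rng.poisson(30, size=400)          # reported sharing acts per interval
sero = rng.binomial(1, 1 - (1 - pi * p_true) ** n_acts)

def nll(p):
    q = np.clip(1 - (1 - pi * p) ** n_acts, 1e-12, 1 - 1e-12)
    return -np.sum(sero * np.log(q) + (1 - sero) * np.log(1 - q))

fit = minimize_scalar(nll, bounds=(1e-6, 0.5), method="bounded")
print(fit.x)                                # maximum likelihood estimate of p
```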

Journal ArticleDOI
TL;DR: A new form of non‐linear autoregressive time series is proposed to model solar radiation data, by specifying joint marginal distributions at low lags to be multivariate Gaussian mixtures.
Abstract: A new form of non-linear autoregressive time series is proposed to model solar radiation data, by specifying joint marginal distributions at low lags to be multivariate Gaussian mixtures. The model is also a type of multiprocess dynamic linear model, but with the advantage that the likelihood has a closed form.

Journal ArticleDOI
TL;DR: Methodological issues arising in studies involving the repeated occurrence of certain events, including the validity of semiparametric methods for multiplicative hazard-based models and the possibilities for marginal analysis of successive gap times, are discussed in conjunction with an examination of observational data on repeated shunt failures for a population of children with hydrocephalus.
Abstract: We consider studies involving the repeated occurrence of certain events, in which the emphasis is on the gaps or times between events. Interesting methodological issues arise in such situations, including the validity of semiparametric methods for multiplicative hazard-based models and the possibilities for marginal analysis of successive gap times. We discuss these and other points in conjunction with an examination of observational data on repeated shunt failures for a population of children with hydrocephalus.

Journal ArticleDOI
TL;DR: It is well known that, when sample observations are independent, the area under the receiver operating characteristic (ROC) curve calculated by the trapezoidal rule corresponds to the Wilcoxon statistic; this paper constructs an average ROC curve for correlated data from multiple readers and derives nonparametric methods to estimate and compare the areas under such curves.
Abstract: Summary. It is well known that, when sample observations are independent, the area under the receiver operating characteristic (ROC) curve corresponds to the Wilcoxon statistic if the area is calculated by the trapezoidal rule. Correlated ROC curves arise often in medical research and have been studied by various parametric methods. On the basis of the Mann-Whitney U-statistics for clustered data proposed by Rosner and Grove, we construct an average ROC curve and derive nonparametric methods to estimate the area under the average curve for correlated ROC curves obtained from multiple readers. For the more complicated case where, in addition to multiple readers examining results on the same set of individuals, two or more diagnostic tests are involved, we derive analytic methods to compare the areas under correlated average ROC curves for these diagnostic tests. We demonstrate our methods in an example and compare our results with those obtained by other methods. The nonparametric average ROC curve and the analytic methods that we propose are easy to explain and simple to implement.
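
A quick numerical check of the equivalence quoted at the start of the abstract, for independent (unclustered) data: the trapezoidal-rule area under the empirical ROC curve equals the Mann-Whitney estimate of P(case score > control score), with ties counted as one-half.

```python
# Numerical check for independent data: the trapezoidal area under the
# empirical ROC curve equals the Mann-Whitney estimate of
# P(case score > control score), ties counted as one-half.
import numpy as np

rng = np.random.default_rng(7)
cases = rng.normal(1.0, 1.0, 60)            # scores for diseased subjects
controls = rng.normal(0.0, 1.0, 80)         # scores for non-diseased subjects

# Mann-Whitney form
diff = cases[:, None] - controls[None, :]
auc_mw = np.mean((diff > 0) + 0.5 * (diff == 0))

# trapezoidal rule over all empirical operating points
thresholds = np.unique(np.concatenate([cases, controls, [np.inf, -np.inf]]))[::-1]
tpr = np.array([np.mean(cases >= t) for t in thresholds])
fpr = np.array([np.mean(controls >= t) for t in thresholds])
auc_trap = np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0)

print(auc_mw, auc_trap)                     # the two agree
```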

Journal ArticleDOI
TL;DR: In this article, a Bayesian multivariate time series model is used for the analysis of the dynamics of carbon monoxide atmospheric concentrations at four sites, and the problem of joint temporal prediction is considered for the case where data are observed at only a few sites and it is not possible to fit a complex space-time model.
Abstract: Summary. We use a Bayesian multivariate time series model for the analysis of the dynamics of carbon monoxide atmospheric concentrations. The data are observed at four sites. It is assumed that the logarithm of the observed process can be represented as the sum of unobservable components: a trend, a daily periodicity, a stationary autoregressive signal and an erratic term. Bayesian analysis is performed via Gibbs sampling. In particular, we consider the problem of joint temporal prediction when data are observed at a few sites and it is not possible to fit a complex space-time model. A retrospective analysis of the trend component is also given, which is important in that it explains the evolution of the variability in the observed process.

Journal ArticleDOI
TL;DR: This paper deals with the analysis of population-based (unmatched) case-control studies in which the controls (and sometimes the cases) are obtained through a complex multistage survey.
Abstract: Summary. The use of complex sampling designs in population-based case-control studies is becoming more common, particularly for sampling the control population. This is prompted by all the usual cost and logistical benefits that are conferred by multistage sampling. Complex sampling has often been ignored in analysis but, with the advent of packages like SUDAAN, survey-weighted analyses that take account of the sample design can be carried out routinely. This paper explores this approach and more efficient alternatives, which can also be implemented by using readily available software. In this paper we deal with the analysis of population-based (unmatched) case-control studies in which the controls (and sometimes the cases) are obtained through a complex multistage survey. Such studies are becoming reasonably common. This paper was motivated by a study funded by the New Zealand Ministry of Health and the Health Research Council looking at meningitis in young children. The study population consists of all children under 9 years of age in the Auckland region. There were about 250 cases of meningitis over the 3-year duration of the study and all these are included. In addition, a sample of controls was drawn from the remaining children in the study population by a complex multistage design. At the first stage, a sample of 300 census mesh blocks (each containing roughly 70 households) was drawn with probability proportional to the number of houses in the block. Then a systematic sample of 20 households was selected from each chosen mesh block and children from these households were selected for the study with varying probabilities that depend on age and ethnicity as in Table 1. These probabilities were chosen to match the expected frequencies among the cases. The cluster sample sizes varied from 1 to 6 and a total of approximately 250 controls was achieved. This corresponds to a sampling fraction of about 1 in 400, so cases are sampled at a rate that is 400 times that for controls. Similar population-based case-control studies are described in Graubard et al. (1989) and Fears and Gail (2000). For example, Fears and Gail (2000) discussed a population-based case-control study of the effects of ultraviolet radiation on non-melanoma skin cancer. They had a sample of approximately 3000 cases, stratified by age, in nine US regions over a 1-year period and a complex sample of approximately 8000 controls collected from the same regions over the same period. In the control sample, 100 clusters of 100 telephone numbers were
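
A sketch of the survey-weighted (pseudo-likelihood) analysis mentioned above, with invented data and weights: each subject's contribution to the logistic log-likelihood is weighted by the inverse of its selection probability. Design-based variance estimation for the clustered, multistage sample, as provided by packages such as SUDAAN, is not shown.

```python
# Sketch (invented data and weights): survey-weighted logistic regression for
# an unmatched case-control study, weighting each subject's log-likelihood
# contribution by the inverse of its selection probability.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(8)
n = 600
x = rng.normal(size=n)
case = rng.binomial(1, 1.0 / (1.0 + np.exp(-(-2.0 + 0.7 * x))))
weight = np.where(case == 1, 1.0, 400.0)    # cases certain, controls ~1 in 400

X = np.column_stack([np.ones(n), x])

def weighted_nll(beta):
    eta = X @ beta
    return np.sum(weight * (np.logaddexp(0.0, eta) - case * eta))

fit = minimize(weighted_nll, np.zeros(2), method="BFGS")
print(fit.x)                                # pseudo-likelihood estimates
```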

Journal ArticleDOI
TL;DR: This paper proposes a Bayesian method for analysing data from animal carcinogenicity experiments and accommodates occult tumours and censored onset times without restricting tumour lethality, relying on cause‐of‐death data, or requiring interim sacrifices.
Abstract: Statistical inference about tumorigenesis should focus on the tumour incidence rate. Unfortunately, in most animal carcinogenicity experiments, tumours are not observable in live animals and censoring of the tumour onset times is informative. In this paper, we propose a Bayesian method for analysing data from such studies. Our approach focuses on the incidence of tumours and accommodates occult tumours and censored onset times without restricting tumour lethality, relying on cause-of-death data, or requiring interim sacrifices. We represent the underlying state of nature by a multistate stochastic process and assume general probit models for the time-specific transition rates. These models allow the incorporation of covariates, historical control data and subjective prior information. The inherent flexibility of this approach facilitates the interpretation of results, particularly when the sample size is small or the data are sparse. We use a Gibbs sampler to estimate the relevant posterior distributions. The methods proposed are applied to data from a US National Toxicology Program carcinogenicity study.

Journal ArticleDOI
TL;DR: In this article, the adaptive weights smoothing method is extended to time series of images, which typically occur in functional and dynamic MRI, and it is shown that both signal detection in functional MRI and the analysis of dynamic MRI can benefit from spatially adaptive smoothing.
Abstract: We consider the problem of statistical inference for functional and dynamic magnetic resonance imaging (MRI). A new approach is proposed which extends the adaptive weights smoothing procedure of Polzehl and Spokoiny that was originally designed for image denoising. We demonstrate how the adaptive weights smoothing method can be applied to time series of images, which typically occur in functional and dynamic MRI. It is shown how signal detection in functional MRI and the analysis of dynamic MRI can benefit from spatially adaptive smoothing. The performance of the procedure is illustrated by using real and simulated data.

Journal ArticleDOI
TL;DR: In this paper, a stochastic transition framework is proposed to represent the transition from one state to another over age, where individuals are viewed as belonging to one of the states at a given age but, with development, pass to another.
Abstract: Psychological theories often posit the existence of several different states. Individuals are viewed as belonging to one of the states at a given age, but with development pass to another state. A main problem in evaluating such theories is representing the transition from one state to another over age. A stochastic transition framework is proposed which should be useful in many different settings. The model is illustrated with data from a cognitive development task.

Journal ArticleDOI
TL;DR: Results for a revised model indicate that a degree of overdispersion exists, but that the estimates of origin–destination flow rates are quite insensitive to the change in model specification.
Abstract: The road system in region RA of Leicester has vehicle detectors embedded in many of the network's road links. Vehicle counts from these detectors can provide transportation researchers with a rich source of data. However, for many projects it is necessary for researchers to have an estimate of origin-to-destination vehicle flow rates. Obtaining such estimates from data observed on individual road links is a non-trivial statistical problem, made more difficult in the present context by non-negligible measurement errors in the vehicle counts collected. The paper uses road link traffic count data from April 1994 to estimate the origin-destination flow rates for region RA. A model for the error prone traffic counts is developed, but the resulting likelihood is not available in closed form. Nevertheless, it can be smoothly approximated by using Monte Carlo integration. The approximate likelihood is combined with prior information from a May 1991 survey in a Bayesian framework. The posterior is explored using the Hastings-Metropolis algorithm, since its normalizing constant is not available. Preliminary findings suggest that the data are overdispersed according to the original model. Results for a revised model indicate that a degree of overdispersion exists, but that the estimates of origin-destination flow rates are quite insensitive to the change in model specification.
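
A greatly simplified sketch of the estimation problem (tiny invented network, link counts treated as exact Poisson observations, so no measurement-error model and no Monte Carlo approximation of the likelihood as in the paper): origin-destination flow rates are explored with a random-walk Metropolis-Hastings sampler on their logarithms.

```python
# Greatly simplified sketch (tiny invented network): link counts are treated
# as exact Poisson observations of A @ x, with A the routing matrix and x the
# origin-destination flow rates; the posterior of log(x) is explored by
# random-walk Metropolis-Hastings.
import numpy as np

rng = np.random.default_rng(9)

A = np.array([[1.0, 0.0],                  # 3 links, 2 OD pairs
              [1.0, 1.0],
              [0.0, 1.0]])
x_true = np.array([30.0, 50.0])            # true OD flow rates
y = rng.poisson(A @ x_true)                # observed link counts

def log_post(log_x):
    lam = A @ np.exp(log_x)
    # Poisson log-likelihood plus a vague normal prior on the log rates
    return np.sum(y * np.log(lam) - lam) - 0.5 * np.sum((log_x - 3.0) ** 2) / 10.0

log_x = np.log(np.full(2, 20.0))
draws = []
for it in range(20000):
    prop = log_x + 0.1 * rng.standard_normal(2)
    if np.log(rng.uniform()) < log_post(prop) - log_post(log_x):
        log_x = prop
    if it >= 5000 and it % 10 == 0:
        draws.append(np.exp(log_x))

print(np.mean(draws, axis=0))              # posterior mean OD flow rates
```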

Journal ArticleDOI
TL;DR: In this article, the authors generalize the waterhammer equations to account for uncertainty in the description of the liquid and the pipe-line, the behaviour of the boundaries of the pipe line and the method of solution.
Abstract: The waterhammer equations are a pair of partial differential equations that describe the behaviour of an incompressible fluid in a pipe-line. We generalize these equations to account for uncertainty in the description of the liquid and the pipe-line, the behaviour of the boundaries of the pipe-line and the method of solution. We illustrate applications of our model to pipe-line design and to realtime pipe-line monitoring, e.g. for detecting leaks, and discuss the general features of our approach to the careful sourcing of uncertainty in deterministic models.

Journal ArticleDOI
TL;DR: In this paper, the authors extend the estimating equations of Lipsitz and Ibrahim for Cox's proportional hazards model with ignorably missing covariate data to the case of non-ignorably missing covariate data, obtaining parameter estimates via a Monte Carlo EM algorithm.
Abstract: A common occurrence in clinical trials with a survival end point is missing covariate data. With ignorably missing covariate data, Lipsitz and Ibrahim proposed a set of estimating equations to estimate the parameters of Cox's proportional hazards model. They proposed to obtain parameter estimates via a Monte Carlo EM algorithm. We extend those results to non-ignorably missing covariate data. We present a clinical trials example with three partially observed laboratory markers which are used as covariates to predict survival.

Journal ArticleDOI
TL;DR: The model confirmed that the LPA response is significantly impaired in individuals infected with the human immunodeficiency virus (HIV), and that the reduction in response for shipped and overnight samples relative to fresh samples was significantly stronger among HIV-infected individuals.
Abstract: The lymphocyte proliferative assay (LPA) of immune competence was conducted on 52 subjects, with up to 36 processing conditions per subject, to evaluate whether samples could be shipped or stored overnight, rather than being processed on fresh blood as currently required. The LPA study resulted in clustered binary data, with both cluster level and cluster-varying covariates. Two modelling strategies for the analysis of such clustered binary data are through the cluster-specific and population-averaged approaches. Whereas most research in this area has focused on the analysis of matched pairs data, in many situations, such as the LPA study, cluster sizes are naturally larger. Through considerations of interpretation and efficiency of these models when applied to large clusters, the mixed effect cluster-specific model was selected as most appropriate for the analysis of the LPA data. The model confirmed that the LPA response is significantly impaired in individuals infected with the human immunodeficiency virus (HIV). The LPA response was found to be significantly lower for shipped and overnight samples than for fresh samples, and this effect was significantly stronger among HIV-infected individuals. Surprisingly, an anticoagulant effect was not detected.

Journal ArticleDOI
TL;DR: The effect of partial dependence in a binary sequence on tests for the presence of a changepoint or changed segment are investigated and exemplified in the context of modelling non‐coding deoxyribonucleic acid (DNA).
Abstract: Summary. The effect of partial dependence in a binary sequence on tests for the presence of a changepoint or changed segment are investigated and exemplified in the context of modelling noncoding deoxyribonucleic acid (DNA). For the levels of dependence that are commonly seen in such DNA, the null distributions of the test statistics are approximately correct and so conclusions based on them are still valid. A strong dependence would, however, invalidate the use of such procedures.