
Showing papers in "Statistics in Medicine in 2012"


Journal ArticleDOI
TL;DR: In this article, the authors formalize the application of multivariate meta-analysis and meta-regression to synthesize estimates of multi-parameter associations obtained from different studies, and propose a two-stage analysis for investigating the non-linear exposure-response relationship between temperature and non-accidental mortality using time-series data from multiple cities.
Abstract: In this paper, we formalize the application of multivariate meta-analysis and meta-regression to synthesize estimates of multi-parameter associations obtained from different studies. This modelling approach extends the standard two-stage analysis used to combine results across different sub-groups or populations. The most straightforward application is for the meta-analysis of non-linear relationships, described for example by regression coefficients of splines or other functions, but the methodology easily generalizes to any setting where complex associations are described by multiple correlated parameters. The modelling framework of multivariate meta-analysis is implemented in the package mvmeta within the statistical environment R. As an illustrative example, we propose a two-stage analysis for investigating the non-linear exposure–response relationship between temperature and non-accidental mortality using time-series data from multiple cities. Multivariate meta-analysis represents a useful analytical tool for studying complex associations through a two-stage procedure. Copyright © 2012 John Wiley & Sons, Ltd.
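For reference, the two-stage model described above can be written in a standard form (the notation here is generic, not quoted from the paper):

\[ \hat{\theta}_i \sim N(\theta_i, S_i), \qquad \theta_i \sim N(X_i \beta, \Psi), \]

where theta_i-hat is the vector of estimated parameters (for example, spline coefficients) from study i, S_i is its within-study (co)variance matrix, X_i contains any study-level meta-regression covariates, and Psi is the between-study covariance matrix estimated in the second stage.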

501 citations


Journal ArticleDOI
TL;DR: The proposed heterogeneity statistics can be used alongside all the usual estimates and inferential procedures used in multivariate meta-analysis and may be used when applying any procedure for fitting the multivariate random effects model.
Abstract: Measures that quantify the impact of heterogeneity in univariate meta-analysis, including the very popular I2 statistic, are now well established. Multivariate meta-analysis, where studies provide multiple outcomes that are pooled in a single analysis, is also becoming more commonly used. The question of how to quantify heterogeneity in the multivariate setting is therefore raised. It is the univariate R2 statistic, the ratio of the variance of the estimated treatment effect under the random and fixed effects models, that generalises most naturally, so this statistic provides our basis. This statistic is then used to derive a multivariate analogue of I2. We also provide a multivariate H2 statistic, the ratio of a generalisation of Cochran's heterogeneity statistic and its associated degrees of freedom, with an accompanying generalisation of the usual I2 statistic. Our proposed heterogeneity statistics can be used alongside all the usual estimates and inferential procedures used in multivariate meta-analysis. We apply our methods to some real datasets and show how our statistics are equally appropriate in the context of multivariate meta-regression, where study level covariate effects are included in the model. Our heterogeneity statistics may be used when applying any procedure for fitting the multivariate random effects model. Copyright © 2012 John Wiley & Sons, Ltd.
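As a reminder of the univariate quantities being generalised (standard definitions, given here only for orientation; the multivariate analogues derived in the paper are not reproduced):

\[ R^2 = \frac{\widehat{\mathrm{Var}}_R(\hat{\theta})}{\widehat{\mathrm{Var}}_F(\hat{\theta})}, \qquad H^2 = \frac{Q}{k-1}, \qquad I^2 = \frac{H^2 - 1}{H^2} = \frac{Q - (k-1)}{Q}, \]

where Q is Cochran's heterogeneity statistic, k is the number of studies, and the subscripts R and F denote the random and fixed effects models.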

448 citations


Journal ArticleDOI
Xinhua Liu1
TL;DR: This paper introduces an alternative to the traditional methods based on the Youden index and the closest-to-(0, 1) criterion for threshold selection, and applies the method to a measure of blood arsenic levels, selecting a cut point to be used as a warning threshold.
Abstract: In biomedical research and practice, quantitative tests or biomarkers are often used for diagnostic or screening purposes, with a cut point established on the quantitative measurement to aid binary classification. This paper introduces an alternative to the traditional methods based on the Youden index and the closest-to-(0, 1) criterion for threshold selection. A concordance probability evaluating the classification accuracy of a dichotomized measure is defined as an objective function of the possible cut point. A nonparametric approach is used to search for the optimal cut point maximizing the objective function. The procedure is shown to perform well in a simulation study. Using data from a real-world study of arsenic-induced skin lesions, we apply the method to a measure of blood arsenic levels, selecting a cut point to be used as a warning threshold.
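A rough illustration of nonparametric cut-point selection on simulated data, comparing the Youden index, the closest-to-(0, 1) distance, and a concordance-type objective taken here to be the product of sensitivity and specificity (whether that matches the paper's exact definition is an assumption):

```python
import numpy as np

rng = np.random.default_rng(0)
controls = rng.normal(0.0, 1.0, 500)       # marker values in non-diseased subjects
cases = rng.normal(1.2, 1.0, 300)          # marker values in diseased subjects

def se_sp(cut):
    """Empirical sensitivity and specificity for the rule 'positive if marker > cut'."""
    se = np.mean(cases > cut)
    sp = np.mean(controls <= cut)
    return se, sp

# Candidate cut points: the observed marker values
candidates = np.unique(np.concatenate([cases, controls]))
se, sp = np.array([se_sp(c) for c in candidates]).T

youden = candidates[np.argmax(se + sp - 1)]                      # Youden index J = Se + Sp - 1
closest = candidates[np.argmin((1 - se) ** 2 + (1 - sp) ** 2)]   # closest to (0, 1) on the ROC curve
concord = candidates[np.argmax(se * sp)]                         # product-of-accuracies objective

print(f"Youden: {youden:.2f}, closest-to-(0,1): {closest:.2f}, concordance-type: {concord:.2f}")
```

With symmetric costs the three criteria typically pick similar cut points; they diverge when sensitivity and specificity are very unbalanced across the candidate range.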

438 citations


Journal ArticleDOI
TL;DR: The authors provide insight into why indirect measures such as biomarkers may fail to provide reliable evidence about the benefit-to-risk profile of interventions, discuss the definitions of clinically meaningful endpoints and surrogate endpoints, and give examples from recent clinical trials.
Abstract: One of the most important considerations in designing clinical trials is the choice of outcome measures. These outcome measures could be clinically meaningful endpoints that are direct measures of how patients feel, function, and survive. Alternatively, indirect measures, such as biomarkers that include physical signs of disease, laboratory measures, and radiological tests, often are considered as replacement endpoints or 'surrogates' for clinically meaningful endpoints. We discuss the definitions of clinically meaningful endpoints and surrogate endpoints, and provide examples from recent clinical trials. We provide insight into why indirect measures such as biomarkers may fail to provide reliable evidence about the benefit-to-risk profile of interventions. We also discuss the nature of evidence that is important in assessing whether treatment effects on a biomarker reliably predict effects on a clinically meaningful endpoint, and provide insights into why this reliability is specific to the context of use of the biomarker.

375 citations


Journal ArticleDOI
TL;DR: This paper presents a method that explicitly incorporates a prespecified probability of achieving the prespecified width or lower limit of a confidence interval; the resultant closed-form formulas are shown to be very accurate.
Abstract: The number of subjects required to estimate the intraclass correlation coefficient in a reliability study has usually been determined on the basis of the expected width of a confidence interval. However, this approach fails to explicitly consider the probability of achieving the desired interval width and may thus provide sample sizes that are too small to have adequate chance of achieving the desired precision. In this paper, we present a method that explicitly incorporates a prespecified probability of achieving the prespecified width or lower limit of a confidence interval. The resultant closed-form formulas are shown to be very accurate. Copyright © 2012 John Wiley & Sons, Ltd.

296 citations


Journal ArticleDOI
TL;DR: It is demonstrated that in the setting of linear discriminant analysis, under the assumptions of multivariate normality, all three measures can be presented as functions of the squared Mahalanobis distance, which affords an interpretation of the magnitude of these measures in the familiar language of effect size for uncorrelated variables.
Abstract: Net reclassification and integrated discrimination improvements have been proposed as alternatives to the increase in the AUC for evaluating improvement in the performance of risk assessment algorithms introduced by the addition of new phenotypic or genetic markers. In this paper, we demonstrate that in the setting of linear discriminant analysis, under the assumptions of multivariate normality, all three measures can be presented as functions of the squared Mahalanobis distance. This relationship affords an interpretation of the magnitude of these measures in the familiar language of effect size for uncorrelated variables. Furthermore, it allows us to conclude that net reclassification improvement can be viewed as a universal measure of effect size. Our theoretical developments are illustrated with an example based on the Framingham Heart Study risk assessment model for high risk men in primary prevention of cardiovascular disease.
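One way to see the connection in the two-class, equal-covariance normal (LDA) setting: the AUC of the optimal linear score is a monotone function of the squared Mahalanobis distance between the groups (a standard identity; the corresponding expressions for the NRI and IDI derived in the paper are not reproduced here):

\[ \Delta^2 = (\mu_1 - \mu_0)^\top \Sigma^{-1} (\mu_1 - \mu_0), \qquad \mathrm{AUC} = \Phi\!\left(\sqrt{\Delta^2/2}\right), \]

so the gain in AUC from adding markers is Phi(sqrt(Delta^2_new / 2)) minus Phi(sqrt(Delta^2_old / 2)).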

259 citations


Journal ArticleDOI
TL;DR: The aim of this article is to address translational aspects of competing risks to the clinical community and the importance of agreement between the competing risks methodology and the study aim, in particular the distinction between etiologic and prognostic research questions.
Abstract: Life expectancy has dramatically increased in industrialized nations over the last 200 years. The aging of populations carries over to clinical research and leads to an increasing representation of elderly and multimorbid individuals in study populations. Clinical research in these populations is complicated by the fact that individuals are likely to experience several potential disease endpoints that prevent some disease-specific endpoint of interest from occurring. Large developments in competing risks methodology have been achieved over the last decades, but we assume that recognition of competing risks in the clinical community is still marginal. It is the aim of this article to address translational aspects of competing risks to the clinical community. We describe clinical populations where competing risks issues may arise. We then discuss the importance of agreement between the competing risks methodology and the study aim, in particular the distinction between etiologic and prognostic research questions. In a review of 50 clinical studies performed in individuals susceptible to competing risks published in high-impact clinical journals, we found competing risks issues in 70% of all articles. Better recognition of issues related to competing risks and of statistical methods that deal with competing risks in accordance with the aim of the study is needed. Copyright © 2011 John Wiley & Sons, Ltd.

253 citations


Journal ArticleDOI
TL;DR: It is shown that balancing treatment groups using stratification leads to correlation between the treatment groups, and if this correlation is ignored and an unadjusted analysis is performed, standard errors for the treatment effect will be biased upwards, resulting in 95% confidence intervals that are too wide, type I error rates that are too low and a reduction in power.
Abstract: Many clinical trials restrict randomisation using stratified blocks or minimisation to balance prognostic factors across treatment groups. It is widely acknowledged in the statistical literature that the subsequent analysis should reflect the design of the study, and any stratification or minimisation variables should be adjusted for in the analysis. However, a review of recent general medical literature showed only 14 of 41 eligible studies reported adjusting their primary analysis for stratification or minimisation variables. We show that balancing treatment groups using stratification leads to correlation between the treatment groups. If this correlation is ignored and an unadjusted analysis is performed, standard errors for the treatment effect will be biased upwards, resulting in 95% confidence intervals that are too wide, type I error rates that are too low and a reduction in power. Conversely, an adjusted analysis will give valid inference. We explore the extent of this issue using simulation for continuous, binary and time-to-event outcomes where treatment is allocated using stratified block randomisation or minimisation.

247 citations


Journal ArticleDOI
TL;DR: It is shown that if the added predictor is not statistically significantly associated with the outcome, the null distribution is non-normal, contrary to the assumption of the DeLong test; the authors recommend that for nested models only the test of association be performed for the new predictors and, if the result is significant, the change in AUC be estimated with an appropriate confidence interval, which can be based on the DeLong approach.
Abstract: The area under the receiver operating characteristic curve (AUC of ROC) is a widely used measure of discrimination in risk prediction models. Routinely, the Mann-Whitney statistic is used as an estimator of the AUC, while the change in AUC is tested by the DeLong test. However, very often, in settings where the model is developed and tested on the same dataset, the added predictor is statistically significantly associated with the outcome but fails to produce a significant improvement in the AUC. No conclusive resolution exists to explain this finding. In this paper, we will show that the reason lies in the inappropriate application of the DeLong test in the setting of nested models. Using numerical simulations and a theoretical argument based on generalized U-statistics, we show that if the added predictor is not statistically significantly associated with the outcome, the null distribution is non-normal, contrary to the assumption of the DeLong test. Our simulations of different scenarios show that the loss of power because of such a misuse of the DeLong test leads to a conservative test for small and moderate effect sizes. This problem does not exist in cases of predictors that are associated with the outcome and for non-nested models. We suggest that for nested models, only the test of association be performed for the new predictors, and, if the result is significant, the change in AUC be estimated with an appropriate confidence interval, which can be based on the DeLong approach.
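A sketch of the recommended workflow (test the association first, then report the change in AUC with a confidence interval): a likelihood ratio test for the added predictor followed by an interval for the AUC difference, with a bootstrap interval used here as a simple stand-in for a DeLong-based interval; the simulated data are purely illustrative.

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 2000
x1 = rng.normal(size=n)                      # established predictor
x2 = rng.normal(size=n)                      # candidate new predictor
lp = -1.0 + 0.8 * x1 + 0.3 * x2              # true linear predictor (illustrative values)
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-lp)))

# Step 1: likelihood ratio test of association for the added predictor.
base = sm.Logit(y, sm.add_constant(x1)).fit(disp=0)
full = sm.Logit(y, sm.add_constant(np.column_stack([x1, x2]))).fit(disp=0)
lr_stat = 2.0 * (full.llf - base.llf)
p_value = stats.chi2.sf(lr_stat, df=1)

# Step 2: only if the association is significant, report the change in AUC
# with a confidence interval (bootstrap here, as a simple stand-in).
p_base, p_full = base.predict(), full.predict()
if p_value < 0.05:
    d_auc = roc_auc_score(y, p_full) - roc_auc_score(y, p_base)
    boot = []
    for _ in range(500):
        idx = rng.integers(0, n, n)
        boot.append(roc_auc_score(y[idx], p_full[idx]) - roc_auc_score(y[idx], p_base[idx]))
    lo, hi = np.percentile(boot, [2.5, 97.5])
    print(f"LRT p = {p_value:.3g}; delta AUC = {d_auc:.3f} (95% CI {lo:.3f} to {hi:.3f})")
else:
    print(f"No significant association (LRT p = {p_value:.3g}); stop here.")
```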

192 citations


Journal ArticleDOI
TL;DR: The authors find it useful to check whether functionals of the transition hazards satisfy three simple principles, which may be used as criteria for practical interpretability in survival analysis and multistate models.
Abstract: The basic parameters in both survival analysis and more general multistate models, including the competing risks model and the illness–death model, are the transition hazards. It is often necessary to supplement the analysis of such models with other model parameters, which are all functionals of the transition hazards. Unfortunately, not all such functionals are equally meaningful in practical contexts, even though they may be mathematically well defined. We have found it useful to check whether the functionals satisfy three simple principles, which may be used as criteria for practical interpretability. Copyright © 2011 John Wiley & Sons, Ltd.

183 citations


Journal ArticleDOI
TL;DR: An introduction to ATSs and SMART designs is provided and design issues unique to SMARTs that are best addressed in an external pilot study prior to the full-scale SMART are identified.
Abstract: There is growing interest in how best to adapt and readapt treatments to individuals to maximize clinical benefit. In response, adaptive treatment strategies (ATS), which operationalize adaptive, sequential clinical decision making, have been developed. From a patient's perspective an ATS is a sequence of treatments, each individualized to the patient's evolving health status. From a clinician's perspective, an ATS is a sequence of decision rules that input the patient's current health status and output the next recommended treatment. Sequential multiple assignment randomized trials (SMART) have been developed to address the sequencing questions that arise in the development of ATSs, but SMARTs are relatively new in clinical research. This article provides an introduction to ATSs and SMART designs. This article also discusses the design of SMART pilot studies to address feasibility concerns, and to prepare investigators for a full-scale SMART. We consider an example SMART for the development of an ATS in the treatment of pediatric generalized anxiety disorders. Using the example SMART, we identify and discuss design issues unique to SMARTs that are best addressed in an external pilot study prior to the full-scale SMART. We also address the question of how many participants are needed in a SMART pilot study. A properly executed pilot study can be used to effectively address concerns about acceptability and feasibility in preparation for (that is, prior to) executing a full-scale SMART.

Journal ArticleDOI
TL;DR: This research evaluates the operating characteristics of statistical methods for identifying adverse events associated with medical products when they are applied to real-world observational healthcare data.
Abstract: Background: Expanded availability of observational healthcare data (both administrative claims and electronic health records) has prompted the development of statistical methods for identifying adverse events associated with medical products, but the operating characteristics of these methods when applied to real-world data are unknown. Methods: We studied the performance of eight analytic methods for estimating the strength of association, relative risk (RR), and associated standard error for 53 drug–adverse event outcome pairs, both positive and negative controls. The methods were applied to a network of ten observational healthcare databases, comprising over 130 million lives. Performance measures included sensitivity, specificity, and positive predictive value of the methods at RR thresholds achieving statistical significance, as well as threshold-free measures such as the area under the receiver operating characteristic curve (AUC). Results: Although no specific method demonstrated superior performance, the aggregate results provide a benchmark and baseline expectation for risk identification method performance. At traditional levels of statistical significance (RR > 1, p < 0.05), false positive rates exceeded 18%, with positive predictive value < 38%. The best predictive model, high-dimensional propensity score, achieved an AUC = 0.77. At 50% sensitivity, the false positive rate ranged from 16% to 30%. At a 10% false positive rate, sensitivity of the methods ranged from 9% to 33%. Conclusions: Systematic processes for risk identification can provide useful information to supplement an overall safety assessment, but assessment of method performance suggests a substantial chance of identifying false positive associations. Copyright © 2012 John Wiley & Sons, Ltd.

Journal ArticleDOI
TL;DR: The utility of closed-form expressions for simulating event times is illustrated by using Monte Carlo simulations to estimate the statistical power to detect as statistically significant the effect of different types of binary time-varying covariates.
Abstract: Simulations and Monte Carlo methods serve an important role in modern statistical research. They allow for an examination of the performance of statistical procedures in settings in which analytic and mathematical derivations may not be feasible. A key element in any statistical simulation is the existence of an appropriate data-generating process: one must be able to simulate data from a specified statistical model. We describe data-generating processes for the Cox proportional hazards model with time-varying covariates when event times follow an exponential, Weibull, or Gompertz distribution. We consider three types of time-varying covariates: first, a dichotomous time-varying covariate that can change at most once from untreated to treated (e.g., organ transplant); second, a continuous time-varying covariate such as cumulative exposure at a constant dose to radiation or to a pharmaceutical agent used for a chronic condition; third, a dichotomous time-varying covariate with a subject being able to move repeatedly between treatment states (e.g., current compliance or use of a medication). In each setting, we derive closed-form expressions that allow one to simulate survival times so that survival times are related to a vector of fixed or time-invariant covariates and to a single time-varying covariate. We illustrate the utility of our closed-form expressions for simulating event times by using Monte Carlo simulations to estimate the statistical power to detect as statistically significant the effect of different types of binary time-varying covariates. This is compared with the statistical power to detect as statistically significant a binary time-invariant covariate.
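A minimal sketch for the first covariate type (a binary covariate that switches once from untreated to treated), assuming an exponential baseline hazard: event times are obtained by inverting the cumulative hazard. The parameter values and the distribution of the switch time t0 are illustrative choices, not taken from the paper.

```python
import numpy as np

def sim_switch_once(n, lam=0.1, beta_fixed=0.5, gamma=0.7, rng=None):
    """Simulate event times under an exponential baseline hazard with a fixed
    binary covariate x and a binary time-varying covariate z(t) that switches
    from 0 to 1 at a subject-specific time t0 (e.g., organ transplant).
    Hazard: h(t) = lam * exp(beta_fixed * x + gamma * z(t))."""
    rng = rng or np.random.default_rng()
    x = rng.binomial(1, 0.5, n)              # fixed covariate
    t0 = rng.exponential(5.0, n)             # switch time of the time-varying covariate
    u = rng.uniform(size=n)
    h0 = lam * np.exp(beta_fixed * x)        # hazard before the switch
    h1 = h0 * np.exp(gamma)                  # hazard after the switch
    target = -np.log(u)                      # cumulative hazard reached at the event time
    # If the cumulative hazard hits the target before t0, the event occurs in the
    # pre-switch period; otherwise solve on the post-switch segment.
    t = np.where(target < h0 * t0,
                 target / h0,
                 t0 + (target - h0 * t0) / h1)
    return t, x, t0

times, x, t0 = sim_switch_once(1000)
print(times[:5])
```

The same inversion strategy carries over to Weibull or Gompertz baselines by replacing the piecewise cumulative hazard accordingly.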

Journal ArticleDOI
TL;DR: This work developed a statistical methodology based on the generalized propensity score in order to estimate treatment effects in the case of multiple treatments and used these methods to assess the relative effectiveness of individual treatments in the multiple-treatment IMPACT clinical trial.
Abstract: The propensity score method is widely used in clinical studies to estimate the effect of a treatment with two levels on patient's outcomes. However, due to the complexity of many diseases, an effective treatment often involves multiple components. For example, in the practice of Traditional Chinese Medicine (TCM), an effective treatment may include multiple components, e.g. Chinese herbs, acupuncture, and massage therapy. In clinical trials involving TCM, patients could be randomly assigned to either the treatment or control group, but they or their doctors may make different choices about which treatment component to use. As a result, treatment components are not randomly assigned. Rosenbaum and Rubin proposed the propensity score method for binary treatments, and Imbens extended their work to multiple treatments. These authors defined the generalized propensity score as the conditional probability of receiving a particular level of the treatment given the pre-treatment variables. In the present work, we adopted this approach and developed a statistical methodology based on the generalized propensity score in order to estimate treatment effects in the case of multiple treatments. Two methods were discussed and compared: propensity score regression adjustment and propensity score weighting. We used these methods to assess the relative effectiveness of individual treatments in the multiple-treatment IMPACT clinical trial. The results reveal that both methods perform well when the sample size is moderate or large.
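A minimal sketch of generalized-propensity-score weighting with three treatment levels, assuming a multinomial logistic model for treatment assignment; the data-generating step, coefficient values, and outcome model are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 3000
X = rng.normal(size=(n, 3))                        # pre-treatment covariates
# Hypothetical non-random assignment to 3 treatment components
W = np.array([[0.5, -0.3, 0.0],
              [0.0, 0.4, -0.5],
              [-0.2, 0.0, 0.3]])
logits = X @ W.T
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
treat = np.array([rng.choice(3, p=p) for p in probs])
y = 1.0 * treat + X @ np.array([0.5, 0.5, 0.5]) + rng.normal(size=n)

# Generalized propensity score: P(T = t | X), from a multinomial logistic model
gps_model = LogisticRegression(max_iter=1000).fit(X, treat)
gps = gps_model.predict_proba(X)                   # n x 3 matrix of scores
w = 1.0 / gps[np.arange(n), treat]                 # inverse probability weights

# Weighted outcome mean for each treatment level (IPW estimate of E[Y(t)])
for t in range(3):
    m = treat == t
    print(t, np.sum(w[m] * y[m]) / np.sum(w[m]))
```

The regression-adjustment alternative mentioned in the abstract would instead include the estimated scores as covariates in an outcome model rather than weighting by their inverse.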

Journal ArticleDOI
TL;DR: Using an extensive literature review to assess how Bayesian methods are used in clinical trials, it is found they are most commonly used for dose finding, efficacy monitoring, toxicity monitoring, diagnosis/decision making, and studying pharmacokinetics/pharmacodynamics.
Abstract: Although the frequentist paradigm has been the predominant approach to clinical trial design since the 1940s, it has several notable limitations. Advancements in computational algorithms and computer hardware have greatly enhanced the alternative Bayesian paradigm. Compared with its frequentist counterpart, the Bayesian framework has several unique advantages, and its incorporation into clinical trial design is occurring more frequently. Using an extensive literature review to assess how Bayesian methods are used in clinical trials, we find them most commonly used for dose finding, efficacy monitoring, toxicity monitoring, diagnosis/decision making, and studying pharmacokinetics/pharmacodynamics. The additional infrastructure required for implementing Bayesian methods in clinical trials may include specialized software programs to run the study design, simulation and analysis, and web-based applications, all of which are particularly useful for timely data entry and analysis. Trial success requires not only the development of proper tools but also timely and accurate execution of data entry, quality control, adaptive randomization, and Bayesian computation. The relative merit of the Bayesian and frequentist approaches continues to be the subject of debate in statistics. However, more evidence can be found showing the convergence of the two camps, at least at the practical level. Ultimately, better clinical trial methods lead to more efficient designs, lower sample sizes, more accurate conclusions, and better outcomes for patients enrolled in the trials. Bayesian methods offer attractive alternatives for better trials. More Bayesian trials should be designed and conducted to refine the approach and demonstrate their real benefit in action.

Journal ArticleDOI
TL;DR: The authors describe how covariates can influence the mood variances and extend the statistical model by adding a subject-level random effect to the within-subject variance specification.
Abstract: Ecological Momentary Assessment (EMA) and/or Experience Sampling (ESM) methods are increasingly used in health studies to study subjective experiences within changing environmental contexts. In these studies, up to thirty or forty observations are often obtained for each subject. Because there are so many measurements per subject, one can characterize a subject’s mean and variance, and specify models for both. In this article, we focus on an adolescent smoking study using EMA where interest is on characterizing changes in mood variation. We describe how covariates can influence the mood variances, and also extend the statistical model by adding a subject-level random effect to the within-subject variance specification. This permits subjects to have influence on the mean, or location, and variability, or (square of the) scale, of their mood responses. These mixed-effects location scale models have useful applications in many research areas where interest centers on the joint modeling of the mean and variance structure.
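In outline, a mixed-effects location scale model of the kind described augments the usual mixed model for the mean with a log-linear model for the within-subject variance (a generic formulation, not quoted from the paper):

\[ y_{ij} = x_{ij}^\top \beta + \upsilon_i + \varepsilon_{ij}, \quad \upsilon_i \sim N(0, \sigma^2_\upsilon), \quad \varepsilon_{ij} \sim N(0, \sigma^2_{\varepsilon_{ij}}), \quad \log \sigma^2_{\varepsilon_{ij}} = w_{ij}^\top \tau + \omega_i, \quad \omega_i \sim N(0, \sigma^2_\omega), \]

where upsilon_i is the usual location (mean) random effect and omega_i is the subject-level random scale effect added to the within-subject variance specification, with covariates w_ij allowed to influence the variance through tau.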

Journal ArticleDOI
TL;DR: The parametric g-formula is a viable alternative to inverse probability weighting of marginal structural models and g-estimation of structural nested models for the analysis of complex longitudinal data; applied to two US-based human immunodeficiency virus cohorts, it estimates that highly active antiretroviral therapy substantially reduces the hazard of AIDS or death.
Abstract: The parametric g-formula can be used to contrast the distribution of potential outcomes under arbitrary treatment regimes. Like g-estimation of structural nested models and inverse probability weighting of marginal structural models, the parametric g-formula can appropriately adjust for measured time-varying confounders that are affected by prior treatment. However, there have been few implementations of the parametric g-formula to date. Here, we apply the parametric g-formula to assess the impact of highly active antiretroviral therapy on time to AIDS or death in two US-based HIV cohorts including 1,498 participants. These participants contributed approximately 7,300 person-years of follow-up of which 49% was exposed to HAART and 382 events occurred; 259 participants were censored due to drop out. Using the parametric g-formula, we estimated that antiretroviral therapy substantially reduces the hazard of AIDS or death (HR=0.55; 95% confidence limits [CL]: 0.42, 0.71). This estimate was similar to one previously reported using a marginal structural model 0.54 (95% CL: 0.38, 0.78). The 6.5-year difference in risk of AIDS or death was 13% (95% CL: 8%, 18%). Results were robust to assumptions about temporal ordering, and extent of history modeled, for time-varying covariates. The parametric g-formula is a viable alternative to inverse probability weighting of marginal structural models and g-estimation of structural nested models for the analysis of complex longitudinal data.
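For reference, in its standard discrete-time form the g-formula identifies the mean of the potential outcome under a treatment regime a-bar by standardising over the joint distribution of the time-varying confounders:

\[ E\big[Y^{\bar{a}}\big] = \sum_{\bar{l}} E\big[Y \mid \bar{A} = \bar{a}, \bar{L} = \bar{l}\big] \prod_{k=0}^{K} f\big(l_k \mid \bar{a}_{k-1}, \bar{l}_{k-1}\big), \]

with the conditional means and densities replaced by fitted parametric models and the sum evaluated by Monte Carlo simulation in the parametric g-formula.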

Journal ArticleDOI
TL;DR: The authors review the practical implications of different link functions for regression of the absolute risk (or cumulative incidence) of an event and propose tools to justify these models in comparison with traditional approaches that combine a series of cause-specific Cox regression models or use the Fine-Gray model.
Abstract: In survival analysis with competing risks, the transformation model allows different functions between the outcome and explanatory variables. However, the model's prediction accuracy and the interpretation of parameters may be sensitive to the choice of link function. We review the practical implications of different link functions for regression of the absolute risk (or cumulative incidence) of an event. Specifically, we consider models in which the regression coefficients β have the following interpretation: The probability of dying from cause D during the next t years changes with a factor exp(β) for a one unit change of the corresponding predictor variable, given fixed values for the other predictor variables. The models have a direct interpretation for the predictive ability of the risk factors. We propose some tools to justify the models in comparison with traditional approaches that combine a series of cause-specific Cox regression models or use the Fine-Gray model. We illustrate the methods with the use of bone marrow transplant data.
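Concretely, the interpretation quoted above corresponds to a log link for the cumulative incidence, whereas the Fine-Gray model corresponds to a complementary log-log link (standard formulations written in generic notation):

\[ \log F_1(t \mid x) = \log F_{1,0}(t) + x^\top \beta \quad \text{(log link; } e^{\beta} \text{ acts as a risk ratio)}, \]
\[ \log\!\big\{-\log\big(1 - F_1(t \mid x)\big)\big\} = \log\!\big\{-\log\big(1 - F_{1,0}(t)\big)\big\} + x^\top \gamma \quad \text{(Fine-Gray, cloglog link)}, \]

where F_1(t | x) is the absolute risk (cumulative incidence) of cause D by time t.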

Journal ArticleDOI
TL;DR: The history of reporting guidelines for randomised trials culminating in the CONSORT Statement in 1996 is described and the subsequent development and extension of CONSORT and related initiatives aimed at improving the reliability of the medical research literature are considered.
Abstract: An extensive and growing number of reviews of the published literature demonstrate that health research publications have frequent deficiencies. Of particular concern are poor reports of randomised trials, which make it difficult or impossible for readers to assess how the research was conducted, to evaluate the reliability of the findings, or to place them in the context of existing research evidence. As a result, published reports of trials often cannot be used by clinicians to inform patient care or to inform public health policy, and the data cannot be included in systematic reviews. Reporting guidelines are designed to identify the key information that researchers should include in a report of their research. We describe the history of reporting guidelines for randomised trials culminating in the CONSORT Statement in 1996. We detail the subsequent development and extension of CONSORT and consider related initiatives aimed at improving the reliability of the medical research literature. Copyright © 2012 John Wiley & Sons, Ltd.

Journal ArticleDOI
TL;DR: It is shown that the optimal allocation to the control group is smaller than previously advocated and that the gain in efficiency is generally small, and a method is proposed that combines quick evaluation of specific designs and an efficient stochastic search to find the optimal design parameters.
Abstract: In drug development, there is often uncertainty about the most promising among a set of different treatments. Multi-arm multi-stage (MAMS) trials provide large gains in efficiency over separate randomised trials of each treatment. They allow a shared control group, dropping of ineffective treatments before the end of the trial and stopping the trial early if sufficient evidence of a treatment being superior to control is found. In this paper, we discuss optimal design of MAMS trials. An optimal design has the required type I error rate and power but minimises the expected sample size at some set of treatment effects. Finding an optimal design requires searching over stopping boundaries and sample size, potentially a large number of parameters. We propose a method that combines quick evaluation of specific designs and an efficient stochastic search to find the optimal design parameters. We compare various potential designs motivated by the design of a phase II MAMS trial. We also consider allocating more patients to the control group, as has been carried out in real MAMS studies. We show that the optimal allocation to the control group, although greater than a 1:1 ratio, is smaller than previously advocated and that the gain in efficiency is generally small.

Journal ArticleDOI
TL;DR: Simulations showed that when age affected the excess mortality hazard, most estimators, including specific survival, were biased and only two estimators were appropriate to estimate net survival.
Abstract: Net survival, the one that would be observed if cancer were the only cause of death, is the most appropriate indicator to compare cancer mortality between areas or countries. Several parametric and non-parametric methods have been developed to estimate net survival, particularly when the cause of death is unknown. These methods are based either on the relative survival ratio or on the additive excess hazard model, the latter using the general population mortality hazard to estimate the excess mortality hazard (the hazard related to net survival). The present work used simulations to compare estimator abilities to estimate net survival in different settings such as the presence/absence of an age effect on the excess mortality hazard or on the potential time of follow-up, knowing that this covariate has an effect on the general population mortality hazard too. It showed that when age affected the excess mortality hazard, most estimators, including specific survival, were biased. Only two estimators were appropriate to estimate net survival. The first is based on a multivariable excess hazard model that includes age as covariate. The second is non-parametric and is based on the inverse probability weighting. These estimators take differently into account the informative censoring induced by the expected mortality process. The former offers great flexibility whereas the latter requires neither the assumption of a specific distribution nor a model-building strategy. Because of its simplicity and availability in commonly used software, the nonparametric estimator should be considered by cancer registries for population-based studies.
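The quantities involved can be summarised as follows: the observed mortality hazard is split into the expected (general population) hazard and the excess hazard, and net survival is the survival corresponding to the excess hazard alone (standard definitions):

\[ \lambda_{obs}(t) = \lambda_{exp}(t) + \lambda_{exc}(t), \qquad S_{net}(t) = \exp\!\left(-\int_0^t \lambda_{exc}(u)\, du\right). \]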

Journal ArticleDOI
TL;DR: A new method for an adaptive clinical trial with co-primary analyses in a predefined subgroup and the full population based on the conditional error function principle is proposed and used in a simulation study to demonstrate that the new method is more powerful than previously suggested analysis strategies.
Abstract: Growing interest in personalised medicine and targeted therapies is leading to an increase in the importance of subgroup analyses. If it is planned to view treatment comparisons in both a predefined subgroup and the full population as co-primary analyses, it is important that the statistical analysis controls the familywise type I error rate. Spiessens and Debois (Cont. Clin. Trials, 2010, 31, 647–656) recently proposed an approach specific for this setting, which incorporates an assumption about the correlation based on the known sizes of the different groups, and showed that this is more powerful than generic multiple comparisons procedures such as the Bonferroni correction. If recruitment is slow relative to the length of time taken to observe the outcome, it may be efficient to conduct an interim analysis. In this paper, we propose a new method for an adaptive clinical trial with co-primary analyses in a predefined subgroup and the full population based on the conditional error function principle. The methodology is generic in that we assume test statistics can be taken to be normally distributed rather than making any specific distributional assumptions about individual patient data. In a simulation study, we demonstrate that the new method is more powerful than previously suggested analysis strategies. Furthermore, we show how the method can be extended to situations when the selection is not based on the final but on an early outcome. We use a case study in a targeted therapy in oncology to illustrate the use of the proposed methodology with non-normal outcomes. Copyright © 2012 John Wiley & Sons, Ltd.

Journal ArticleDOI
TL;DR: A series of novel Bayesian statistical MTC models are developed to allow for the simultaneous synthesis of IPD and AD, potentially incorporating study and individual level covariates and producing markedly more accurate treatment-covariate interaction estimates.
Abstract: Mixed treatment comparisons (MTC) extend the traditional pair-wise meta-analytic framework to synthesize information on more than two interventions. Although most MTCs use aggregate data (AD), a proportion of the evidence base might be available at the individual level (IPD). We develop a series of novel Bayesian statistical MTC models to allow for the simultaneous synthesis of IPD and AD, potentially incorporating study and individual level covariates. The effectiveness of different interventions to increase the provision of functioning smoke alarms in households with children was used as a motivating dataset. This included 20 studies (11 AD and 9 IPD), including 11 500 participants. Incorporating the IPD into the network allowed the inclusion of information on subject level covariates, which produced markedly more accurate treatment-covariate interaction estimates than an analysis solely on the AD from all studies. Including evidence at the IPD level in the MTC is desirable when exploring participant level covariates; even when IPD is available only for a fraction of the studies. Such modelling may not only reduce inconsistencies within networks of trials but also assist the estimation of intervention subgroup effects to guide more individualised treatment decisions.

Journal ArticleDOI
TL;DR: Three existing penalised methods that have been proposed to improve predictive accuracy, namely ridge, lasso and the garotte, are evaluated using simulated data derived from two clinical datasets; the results suggest that significant improvements are possible by taking a penalised modelling approach.
Abstract: Prognostic models for survival outcomes are often developed by fitting standard survival regression models, such as the Cox proportional hazards model, to representative datasets. However, these models can be unreliable if the datasets contain few events, which may be the case if either the disease or the event of interest is rare. Specific problems include predictions that are too extreme, and poor discrimination between low-risk and high-risk patients. The objective of this paper is to evaluate three existing penalised methods that have been proposed to improve predictive accuracy. In particular, ridge, lasso and the garotte, which use penalised maximum likelihood to shrink coefficient estimates and in some cases omit predictors entirely, are assessed using simulated data derived from two clinical datasets. The predictions obtained using these methods are compared with those from Cox models fitted using standard maximum likelihood. The simulation results suggest that Cox models fitted using maximum likelihood can perform poorly when there are few events, and that significant improvements are possible by taking a penalised modelling approach. The ridge method generally performed the best, although lasso is recommended if variable selection is required.
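A rough sketch of ridge- and lasso-type penalised Cox fits using the lifelines package (the garotte is not shown, having no off-the-shelf implementation here); the simulated data and the fixed penalty strength are illustrative assumptions, and in practice the penalizer would be tuned, for example by cross-validation.

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(2)
n, p = 150, 10                                    # few events relative to the number of predictors
X = rng.normal(size=(n, p))
lp = 0.7 * X[:, 0] - 0.5 * X[:, 1]                # only two informative predictors
T = rng.exponential(np.exp(-lp))                  # event times with hazard exp(lp)
C = rng.exponential(2.0, n)                       # censoring times
df = pd.DataFrame(X, columns=[f"x{j}" for j in range(p)])
df["time"] = np.minimum(T, C)
df["event"] = (T <= C).astype(int)

# Ridge-type penalty (l1_ratio=0) versus lasso-type penalty (l1_ratio=1)
for name, l1 in [("ridge", 0.0), ("lasso", 1.0)]:
    cph = CoxPHFitter(penalizer=0.1, l1_ratio=l1)
    cph.fit(df, duration_col="time", event_col="event")
    print(name, cph.params_.round(2).to_dict())
```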

Journal ArticleDOI
TL;DR: Results show that the 'borrowing of strength' from a multivariate meta-analysis can reduce the impact of ORB on the pooled treatment effect estimates, and the use of the Pearson correlation is examined as a novel approach for dealing with missing within-study correlations.
Abstract: Multivariate meta-analysis allows the joint synthesis of multiple correlated outcomes from randomised trials, and is an alternative to a separate univariate meta-analysis of each outcome independently. Usually not all trials report all outcomes; furthermore, outcome reporting bias (ORB) within trials, where an outcome is measured and analysed but not reported on the basis of the results, may cause a biased set of the evidence to be available for some outcomes, potentially affecting the significance and direction of meta-analysis results. The multivariate approach, however, allows one to ‘borrow strength’ across correlated outcomes, to potentially reduce the impact of ORB. Assuming ORB missing data mechanisms, we aim to investigate the magnitude of bias in the pooled treatment effect estimates for multiple outcomes using univariate meta-analysis, and to determine whether the ‘borrowing of strength’ from multivariate meta-analysis can reduce the impact of ORB. A simulation study was conducted for a bivariate fixed effect meta-analysis of two correlated outcomes. The approach is illustrated by application to a Cochrane systematic review. Results show that the ‘borrowing of strength’ from a multivariate meta-analysis can reduce the impact of ORB on the pooled treatment effect estimates. We also examine the use of the Pearson correlation as a novel approach for dealing with missing within-study correlations, and provide an extension to bivariate random-effects models that reduce ORB in the presence of heterogeneity. Copyright © 2012 John Wiley & Sons, Ltd.
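For the bivariate fixed effect setting simulated in the paper, the model can be written as (standard notation):

\[ \begin{pmatrix} y_{1i} \\ y_{2i} \end{pmatrix} \sim N\!\left( \begin{pmatrix} \theta_1 \\ \theta_2 \end{pmatrix},\ \begin{pmatrix} s_{1i}^2 & \rho_{Wi}\, s_{1i} s_{2i} \\ \rho_{Wi}\, s_{1i} s_{2i} & s_{2i}^2 \end{pmatrix} \right), \]

where y_1i and y_2i are the two outcome estimates from study i with standard errors s_1i and s_2i, and rho_Wi is the within-study correlation; it is this correlation that lets a reported outcome lend strength to one that is missing or selectively unreported.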

Journal ArticleDOI
TL;DR: A flexible spatial scan statistic implemented with the restricted likelihood ratio proposed by Tango (2008) is shown to eliminate the 30-nearest-neighbour limitation, to require much less computational time, and, via Monte Carlo simulation, to detect clusters of any shape reasonably well as the relative risk of the cluster becomes large.
Abstract: Spatial scan statistics are widely used tools for the detection of disease clusters. In particular, the circular spatial scan statistic proposed by Kulldorff (1997) has been utilized in a wide variety of epidemiological studies and disease surveillance. However, as it cannot detect noncircular, irregularly shaped clusters, many authors have proposed different spatial scan statistics, including the elliptic version of Kulldorff's scan statistic. The flexible spatial scan statistic proposed by Tango and Takahashi (2005) has also been used for detecting irregularly shaped clusters. However, because of its heavy computational load, this method imposes a practical limit of a maximum of 30 nearest neighbors when searching for candidate clusters. In this paper, we show that a flexible spatial scan statistic implemented with the restricted likelihood ratio proposed by Tango (2008) (1) eliminates the limitation of 30 nearest neighbors and (2) requires much less computational time than the original flexible spatial scan statistic. Monte Carlo simulation further shows that it is able to detect clusters of any shape reasonably well as the relative risk of the cluster becomes large. We illustrate the proposed spatial scan statistic with data on mortality from cerebrovascular disease in the Tokyo Metropolitan area, Japan.
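For orientation, the log likelihood ratio that such scan statistics maximise over candidate clusters Z is, under the usual Poisson model (the restricted likelihood ratio of Tango (2008) adds a further screening condition on the regions joined into Z):

\[ \mathrm{LLR}(Z) = c_Z \log\frac{c_Z}{e_Z} + (C - c_Z)\log\frac{C - c_Z}{C - e_Z} \quad \text{if } c_Z > e_Z, \text{ and } 0 \text{ otherwise}, \]

where c_Z and e_Z are the observed and expected numbers of cases in Z and C is the total number of cases.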

Journal ArticleDOI
TL;DR: It is argued that causal effects defined by CMRIER may be more appropriate in many situations, particularly those with policy considerations, if the dynamic interventions are set to be realistic.
Abstract: One of the identifiability assumptions of causal effects defined by marginal structural model (MSM) parameters is the experimental treatment assignment (ETA) assumption. Practical violations of this assumption frequently occur in data analysis when certain exposures are rarely observed within some strata of the population. The inverse probability of treatment weighted (IPTW) estimator is particularly sensitive to violations of this assumption; however, we demonstrate that this is a problem for all estimators of causal effects. This is due to the fact that the ETA assumption is about information (or lack thereof) in the data. A new class of causal models, causal models for realistic individualized exposure rules (CMRIER), is based on dynamic interventions. CMRIER generalize MSM, and their parameters remain fully identifiable from the observed data, even when the ETA assumption is violated, if the dynamic interventions are set to be realistic. Examples of such realistic interventions are provided. We argue that causal effects defined by CMRIER may be more appropriate in many situations, particularly those with policy considerations. Through simulation studies, we examine the performance of the IPTW estimator of the CMRIER parameters in contrast to that of the MSM parameters. We also apply the methodology to a real data analysis in air pollution epidemiology to illustrate the interpretation of the causal effects defined by CMRIER.
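In this notation, the ETA (positivity) assumption and the IPTW estimator that is sensitive to its violation can be written as (standard forms):

\[ g(a \mid W) = P(A = a \mid W) > 0 \ \text{for all treatments } a \text{ of interest}, \qquad \hat{\psi}_{IPTW}(a) = \frac{1}{n} \sum_{i=1}^{n} \frac{I(A_i = a)\, Y_i}{g_n(a \mid W_i)}, \]

so strata in which the estimated g_n(a | W_i) is near zero receive extreme weights, which is the practical violation described above.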

Journal ArticleDOI
TL;DR: Simulation studies show that the measures proposed by Kent and O'Quigley, R(PM)(2), and by Royston and Sauerbrei, R(D)(2), appear to be the best overall at quantifying predictive ability; it should be noted, however, that neither measure is perfect.
Abstract: Measures of predictive ability play an important role in quantifying the clinical significance of prognostic factors. Several measures have been proposed to evaluate the predictive ability of survival models in the last two decades, but no single measure is consistently used. The proposed measures can be classified into the following categories: explained variation, explained randomness, and predictive accuracy. The three categories are conceptually different and are based on different principles. Several new measures have been proposed since Schemper and Stare's study in 1996 on some of the existing measures. This paper is the first of two papers that study the proposed measures systematically by applying a set of criteria that a measure of predictive ability should possess in the context of survival analysis. The present paper focuses on the explained variation category, and part II studies the proposed measures in the other categories. Simulation studies are used to examine the performance of five explained variation measures with respect to these criteria, discussing their strengths and shortcomings. Our simulation studies show that the measures proposed by Kent and O'Quigley, R(PM)(2), and Royston and Sauerbrei, R(D)(2), appear to be the best overall at quantifying predictive ability. However, it should be noted that neither measure is perfect; R(PM)(2) is sensitive to outliers and R(D)(2) to (marked) non-normality of the distribution of the prognostic index. The results show that the other measures perform poorly, primarily because they are adversely affected by censoring.

Journal ArticleDOI
TL;DR: This paper explores sample size calculation for early phase clinical trials and pilot studies from a Bayesian decision-theoretic perspective, obtaining sample sizes that would be appropriate for studies funded by a large funder such as a public sector body or major pharmaceutical company.
Abstract: Methodology for sample size calculation for phase III clinical trials is well established and widely used. In contrast, for earlier phase clinical trials or pilot studies, although there is an acceptance that the methods used for phase III trials are not appropriate, there is little consensus over methods that should be used. This paper explores this problem from a Bayesian decision-theoretic perspective. The aim is to obtain sample sizes that would be appropriate for studies funded by a large funder such as a public sector body or major pharmaceutical company. The sample sizes obtained are optimal in that they minimise the average number of patients required per successfully identified effective therapy or equivalently maximise the number of effective therapies successfully identified over a long period. It is indicated that the number of patients included in a phase II clinical trial should be approximately 0.03 times that planned to be included in the phase III study. This is similar to that proposed by other researchers in this area, though rather smaller than actually used for many phase II trials. Copyright © 2011 John Wiley & Sons, Ltd.

Journal ArticleDOI
TL;DR: This paper provides an overview of the exposome paradigm, which complements the (epi)genome while providing a multitude of opportunities for intervention if exposures can be eliminated or minimized, and discusses its implications for health across the lifespan.
Abstract: Life expectancy is an overall measure of population health, and has reflected a positive trajectory for several decades in the United States when estimated either at birth or 65 years of age [1]. Despite such improvements, life expectancy varies by socio-demographic characteristics and the availability of adequate medical care. Concerns about the continued sustainability of improvements in life expectancy have grown in light of the obesity epidemic currently occurring in the United States and in other countries across the globe [2, 3]. For example, in 2010, all 50 States comprising the United States reported an obesity prevalence of 20% or more, with 12 states reporting a prevalence of 30% or more [2]. Overall, approximately 33% of adults and 17% of children in the United States are currently estimated to be obese [4]. Other notable changes in population health include the marked reduction in fertility rates over time [5], which has partially but not entirely been attributed to changes in cultural norms and behaviors regarding childbearing practices. In fact, an evolving body of evidence is suggestive of temporal declines in human fecundity, which is defined as the biologic capacity of men and women for reproduction irrespective of pregnancy intentions [6]. The body of evidence supporting a decline in male fecundity over the past few decades includes diminished semen quality and increasing rates of genital-urinary malformations and testicular cancer, all of which are hypothesized to originate in utero or the so-called TDS or testicular dysgenesis syndrome [7]. This conceptual framework has recently been extended to women. Specifically, the ovarian dysgenesis syndrome (ODS) posits that female fecundity is established at conception or in utero with early impairments arising during prepubertal or reproductive years as manifested by alterations in the onset or progression of puberty or gynecologic and gravid disorders, respectively [8]. Assuming that human fecundity may be positively associated with survival as recently reported for semen quality [9], its decline may be at a 'critical tipping point' for human health as recently suggested [10], underscoring the importance of new research paradigms such as the exposome for assessing the early origins of fecundity and its implications for health across the lifespan. How might researchers further impact the health and well-being of populations across the globe? Certainly, new research paradigms are needed for transforming how we think about health and disease and design research accordingly. Novel paradigm changes are already underway (e.g., genome and epigenome), though noticeably absent is a paradigm that captures the multitude of environmental exposures that impact human health and disease. This data gap prompted the development of the exposome paradigm, which complements the (epi)genome while providing a multitude of opportunities for intervention if exposures can be eliminated or minimized. The exposome paradigm focuses on the simultaneous measurement of a multitude of biomarkers including those that originate from external and internal sources. External environmental exposures may include chemicals or physical agents such as radiation among many other types of exposures, while internal environmental exposures arise from bodily functions and processes that govern homeostasis. Internal exposures may include chemicals or biomarkers generated via inflammation or stress along with various other pathways.
Of note is the absence of biomarkers for some external environmental exposures (e.g., noise or vibration), resulting in missing data or the need for proxy biomarkers. These issues are further discussed below along with the unique aspects of the exposome such as the longitudinal and high-dimensional nature of biomarkers across the lifespan. This paper provides a brief overview of the exposome paradigm along with the resources needed for getting started, research hurdles and challenges to overcome, and opportunities for discovery. The overview is organized as responses to five questions: 1) What is the exposome? 2) Why is the timing right for exposome research? 3) What resources are needed for moving forward? 4) What research hurdles and challenges need to be overcome? and 5) What impact might the exposome have for transforming population health? We use human fecundity to illustrate how the exposome might be implemented, though the issues pertain to most (non-Mendelian) health outcomes.