
Showing papers in "Statistics in Medicine in 2010"


Journal ArticleDOI
TL;DR: Two methods for checking the consistency of direct and indirect evidence in mixed treatment comparisons (back-calculation and node-splitting) are presented within a hierarchical Bayesian framework implemented using WinBUGS and R; both are useful for identifying potential inconsistencies in different types of network and illustrate how the direct and indirect evidence combine to produce the posterior MTC estimates of relative treatment effects.
Abstract: Pooling of direct and indirect evidence from randomized trials, known as mixed treatment comparisons (MTC), is becoming increasingly common in the clinical literature. MTC allows coherent judgements on which of several treatments is the most effective and produces estimates of the relative effects of each treatment compared with every other treatment in a network. We introduce two methods for checking consistency of direct and indirect evidence. The first method (back-calculation) infers the contribution of indirect evidence from the direct evidence and the output of an MTC analysis and is useful when the only available data consist of pooled summaries of the pairwise contrasts. The second, more general but computationally intensive, method is based on 'node-splitting', which separates evidence on a particular comparison (node) into 'direct' and 'indirect' and can be applied to networks where trial-level data are available. Methods are illustrated with examples from the literature. We take a hierarchical Bayesian approach to MTC implemented using WinBUGS and R. We show that both methods are useful in identifying potential inconsistencies in different types of network and that they illustrate how the direct and indirect evidence combine to produce the posterior MTC estimates of relative treatment effects. This allows users to understand how MTC synthesis is pooling the data, and what is 'driving' the final estimates. We end with some considerations on the modelling assumptions being made, the problems with the extension of the back-calculation method to trial-level data, and discuss our methods in the context of the existing literature.
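The analyses in this paper were implemented in WinBUGS and R. As an illustration only, a node-splitting consistency check of the same general kind can be sketched with the gemtc R package (which calls JAGS rather than WinBUGS); the toy network, treatment labels and counts below are hypothetical and follow gemtc's arm-based data format.

```r
library(gemtc)  # Bayesian MTC via JAGS; not the WinBUGS code used in the paper

# hypothetical arm-level data: a small network of treatments A, B, C containing a loop
arm.data <- data.frame(
  study      = c("s1", "s1", "s2", "s2", "s3", "s3", "s4", "s4"),
  treatment  = c("A", "B", "A", "C", "B", "C", "A", "B"),
  responders = c(12, 15, 10, 18, 14, 16, 9, 13),
  sampleSize = c(100, 100, 95, 95, 110, 110, 80, 80)
)

network <- mtc.network(data.ab = arm.data)
model   <- mtc.model(network, linearModel = "random")   # random-effects consistency model
fit     <- mtc.run(model)
summary(fit)                                            # posterior relative treatment effects

# node-splitting: separate direct and indirect evidence for each splittable comparison
ns <- mtc.nodesplit(network, linearModel = "random")
summary(ns)
```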

1,559 citations


Journal ArticleDOI
TL;DR: Distributed lag non-linear models (DLNMs) provide a modelling framework that can simultaneously represent non-linear exposure–response dependencies and delayed effects, based on the definition of a 'cross-basis' and implemented in the package dlnm within the statistical environment R.
Abstract: Environmental stressors often show effects that are delayed in time, requiring the use of statistical models that are flexible enough to describe the additional time dimension of the exposure-response relationship. Here we develop the family of distributed lag non-linear models (DLNM), a modelling framework that can simultaneously represent non-linear exposure-response dependencies and delayed effects. This methodology is based on the definition of a 'cross-basis', a bi-dimensional space of functions that describes simultaneously the shape of the relationship along both the space of the predictor and the lag dimension of its occurrence. In this way the approach provides a unified framework for a range of models that have previously been used in this setting, and new more flexible variants. This family of models is implemented in the package dlnm within the statistical environment R. To illustrate the methodology we use examples of DLNMs to represent the relationship between temperature and mortality, using data from the National Morbidity, Mortality, and Air Pollution Study (NMMAPS) for New York during the period 1987-2000.
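The dlnm package named in the abstract constructs the cross-basis directly. A minimal sketch of the workflow follows, using the Chicago NMMAPS data bundled with the package rather than the New York series analysed in the paper; the basis choices are for illustration only.

```r
library(dlnm)
library(splines)

# cross-basis for temperature: natural splines over both the predictor and the lag dimension
cb <- crossbasis(chicagoNMMAPS$temp, lag = 30,
                 argvar = list(fun = "ns", df = 5),
                 arglag = list(fun = "ns", df = 4))

# quasi-Poisson time-series regression adjusting for seasonality, trend and day of week
model <- glm(death ~ cb + ns(time, df = 7 * 14) + dow,
             family = quasipoisson(), data = chicagoNMMAPS)

# predictions over the exposure-lag surface, centred at 21 degrees C
pred <- crosspred(cb, model, cen = 21, by = 1)
plot(pred, "overall", xlab = "Temperature (C)", ylab = "RR")   # overall cumulative association
```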

1,398 citations


Journal ArticleDOI
TL;DR: An SAS macro is developed and presented here that creates an RCS function of continuous exposures, displays graphs showing the dose‐response association with 95 per cent confidence interval between one main continuous exposure and an outcome when performing linear, logistic, or Cox models, as well as linear and logistic‐generalized estimating equations.
Abstract: Taking into account a continuous exposure in regression models by using categorization, when non-linear dose-response associations are expected, has been widely criticized. As one alternative, restricted cubic spline (RCS) functions are powerful tools (i) to characterize a dose-response association between a continuous exposure and an outcome, (ii) to visually and/or statistically check the assumption of linearity of the association, and (iii) to minimize residual confounding when adjusting for a continuous exposure. Because their implementation with SAS® software is limited, we developed and present here an SAS macro that (i) creates an RCS function of continuous exposures, (ii) displays graphs showing the dose-response association with 95 per cent confidence interval between one main continuous exposure and an outcome when performing linear, logistic, or Cox models, as well as linear and logistic-generalized estimating equations, and (iii) provides statistical tests for overall and non-linear associations. We illustrate the SAS macro using the third National Health and Nutrition Examination Survey data to investigate adjusted dose-response associations (with different models) between calcium intake and bone mineral density (linear regression), folate intake and hyperhomocysteinemia (logistic regression), and serum high-density lipoprotein cholesterol and cardiovascular mortality (Cox model).
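The macro itself is SAS code; as a rough R analogue (not the authors' macro), the rms package fits restricted cubic splines and reports overall and non-linearity tests. The data set and variable names below are hypothetical.

```r
library(rms)

# hypothetical NHANES-style data frame with folate intake, hyperhomocysteinemia (0/1) and adjusters
dd <- datadist(nhanes3); options(datadist = "dd")

fit <- lrm(hyperhomocysteinemia ~ rcs(folate, 4) + age + sex, data = nhanes3)

anova(fit)                                  # tests of the overall and non-linear association
plot(Predict(fit, folate, fun = plogis))    # dose-response curve with 95% CI on the probability scale
```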

1,185 citations


Journal ArticleDOI
TL;DR: The authors examine the performance of various CART‐based propensity score models using simulated data and suggest that ensemble methods, especially boosted CART, may be useful for propensity score weighting.
Abstract: Machine learning techniques such as classification and regression trees (CART) have been suggested as promising alternatives to logistic regression for the estimation of propensity scores. The authors examined the performance of various CART-based propensity score models using simulated data. Hypothetical studies of varying sample sizes (n=500, 1000, 2000) with a binary exposure, continuous outcome, and 10 covariates were simulated under seven scenarios differing by degree of non-linear and non-additive associations between covariates and the exposure. Propensity score weights were estimated using logistic regression (all main effects), CART, pruned CART, and the ensemble methods of bagged CART, random forests, and boosted CART. Performance metrics included covariate balance, standard error, per cent absolute bias, and 95 per cent confidence interval (CI) coverage. All methods displayed generally acceptable performance under conditions of either non-linearity or non-additivity alone. However, under conditions of both moderate non-additivity and moderate non-linearity, logistic regression had subpar performance, whereas ensemble methods provided substantially better bias reduction and more consistent 95 per cent CI coverage. The results suggest that ensemble methods, especially boosted CART, may be useful for propensity score weighting.
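A hand-rolled sketch of boosted-CART propensity score weighting with the gbm package is shown below (the twang package offers a more complete implementation); data and variable names are hypothetical, and robust standard errors would be needed in practice.

```r
library(gbm)

# hypothetical data frame 'dat': 0/1 exposure 'treat', continuous outcome 'y', covariates x1..x10
ps.fit <- gbm(treat ~ . - y, data = dat, distribution = "bernoulli",
              n.trees = 3000, interaction.depth = 3, shrinkage = 0.01)

e <- predict(ps.fit, dat, n.trees = 3000, type = "response")   # estimated propensity scores
w <- ifelse(dat$treat == 1, 1 / e, 1 / (1 - e))                # ATE-type IPTW weights

summary(lm(y ~ treat, data = dat, weights = w))                # weighted outcome model
```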

713 citations


Journal ArticleDOI
TL;DR: Theoretical arguments and simulation studies compare complete-case analysis with multiple imputation implemented under a missing at random assumption; MI is generally more efficient, but neither method is uniformly less biased, so the choice should reflect the missing data mechanism.
Abstract: When missing data occur in one or more covariates in a regression model, multiple imputation (MI) is widely advocated as an improvement over complete-case analysis (CC). We use theoretical arguments and simulation studies to compare these methods with MI implemented under a missing at random assumption. When data are missing completely at random, both methods have negligible bias, and MI is more efficient than CC across a wide range of scenarios. For other missing data mechanisms, bias arises in one or both methods. In our simulation setting, CC is biased towards the null when data are missing at random. However, when missingness is independent of the outcome given the covariates, CC has negligible bias and MI is biased away from the null. With more general missing data mechanisms, bias tends to be smaller for MI than for CC. Since MI is not always better than CC for missing covariate problems, the choice of method should take into account what is known about the missing data mechanism in a particular substantive application. Importantly, the choice of method should not be based on comparison of standard errors. We propose new ways to understand empirical differences between MI and CC, which may provide insights into the appropriateness of the assumptions underlying each method, and we propose a new index for assessing the likely gain in precision from MI: the fraction of incomplete cases among the observed values of a covariate (FICO).
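The FICO index defined here is simple to compute, and MI under a missing at random assumption is available in the mice package; a sketch with hypothetical data frame and variable names.

```r
library(mice)

# hypothetical data frame 'dat' with outcome y and covariates x, z (x partially missing)
imp    <- mice(dat, m = 20, printFlag = FALSE)      # multiple imputation under MAR
fit.mi <- pool(with(imp, lm(y ~ x + z)))            # pooled MI estimates
fit.cc <- lm(y ~ x + z, data = dat)                 # complete-case analysis
summary(fit.mi); summary(fit.cc)

# FICO for covariate x: fraction of incomplete cases among observations where x is observed
obs.x <- !is.na(dat$x)
fico  <- mean(!complete.cases(dat[obs.x, ]))
```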

612 citations


Journal ArticleDOI
TL;DR: A natural and easily implemented multivariate extension of the DerSimonian and Laird procedure is proposed, which is accessible to applied researchers and provides a much less computationally intensive alternative to existing methods.
Abstract: Multivariate meta-analysis is increasingly used in medical statistics. In the univariate setting, the non-iterative method proposed by DerSimonian and Laird is a simple and now standard way of performing random effects meta-analyses. We propose a natural and easily implemented multivariate extension of this procedure which is accessible to applied researchers and provides a much less computationally intensive alternative to existing methods. In a simulation study, the proposed procedure performs similarly in almost all ways to the more established iterative restricted maximum likelihood approach. The method is applied to some real data sets and an extension to multivariate meta-regression is described.
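The mvmeta R package provides a non-iterative multivariate method-of-moments estimator of this kind alongside REML (a sketch assuming its mvmeta() interface); the study estimates and within-study covariances below are hypothetical.

```r
library(mvmeta)

# hypothetical study-specific estimates for two correlated outcomes
y <- cbind(out1 = c(0.40, 0.21, 0.58, 0.32, 0.45),
           out2 = c(0.12, 0.05, 0.30, 0.18, 0.22))

# within-study (co)variances, one row per study: var(out1), cov(out1, out2), var(out2)
S <- cbind(c(0.040, 0.090, 0.050, 0.060, 0.055),
           c(0.010, 0.020, 0.012, 0.015, 0.014),
           c(0.050, 0.080, 0.060, 0.070, 0.065))

fit.mm   <- mvmeta(y, S = S, method = "mm")     # non-iterative multivariate method of moments
fit.reml <- mvmeta(y, S = S, method = "reml")   # iterative REML for comparison
summary(fit.mm); summary(fit.reml)
```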

506 citations


Journal ArticleDOI
TL;DR: It is shown that the problems of the standard summary-measures approach can be overcome in most cases occurring in practice by replacing the approximate normal within-study likelihood with the appropriate exact likelihood, which leads to a generalized linear mixed model that can be fitted in standard statistical software.
Abstract: We consider random effects meta-analysis where the outcome variable is the occurrence of some event of interest. The data structures handled are where one has one or more groups in each study, and in each group either the number of subjects with and without the event, or the number of events and the total duration of follow-up is available. Traditionally, the meta-analysis follows the summary measures approach based on the estimates of the outcome measure(s) and the corresponding standard error(s). This approach assumes an approximate normal within-study likelihood and treats the standard errors as known. This approach has several potential disadvantages, such as not accounting for the standard errors being estimated, not accounting for correlation between the estimate and the standard error, the use of an (arbitrary) continuity correction in case of zero events, and the normal approximation being bad in studies with few events. We show that these problems can be overcome in most cases occurring in practice by replacing the approximate normal within-study likelihood by the appropriate exact likelihood. This leads to a generalized linear mixed model that can be fitted in standard statistical software. For instance, in the case of odds ratio meta-analysis, one can use the non-central hypergeometric distribution likelihood leading to mixed-effects conditional logistic regression. For incidence rate ratio meta-analysis, it leads to random effects logistic regression with an offset variable. We also present bivariate and multivariate extensions. We present a number of examples, especially with rare events, including an example of network meta-analysis.
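Generalized linear mixed models of this type are available in the metafor package; the sketch below assumes its rma.glmm() interface, uses hypothetical sparse 2x2 counts, and fits the conditional model with the exact (non-central hypergeometric) likelihood.

```r
library(metafor)

# hypothetical 2x2 counts from trials with rare events
dat <- data.frame(ai  = c(1, 0, 3, 2, 0),          # events, treatment arm
                  n1i = c(120, 95, 150, 80, 60),
                  ci  = c(4, 1, 6, 5, 2),          # events, control arm
                  n2i = c(118, 100, 145, 82, 59))

# random-effects OR meta-analysis with the exact conditional likelihood,
# i.e. mixed-effects conditional logistic regression; no continuity corrections needed
fit <- rma.glmm(measure = "OR", ai = ai, n1i = n1i, ci = ci, n2i = n2i,
                data = dat, model = "CM.EL")
summary(fit)
```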

492 citations


Journal ArticleDOI
TL;DR: The NCI method may be used for estimating the distribution of usual nutrient intake for populations and subpopulations as part of a unified framework of estimation of usual intake of dietary constituents.
Abstract: It is of interest to estimate the distribution of usual nutrient intake for a population from repeat 24-h dietary recall assessments. A mixed effects model and quantile estimation procedure, developed at the National Cancer Institute (NCI), may be used for this purpose. The model incorporates a Box–Cox parameter and covariates to estimate usual daily intake of nutrients; model parameters are estimated via quasi-Newton optimization of a likelihood approximated by adaptive Gaussian quadrature. The parameter estimates are used in a Monte Carlo approach to generate empirical quantiles; standard errors are estimated by bootstrap. The NCI method is illustrated and compared with current estimation methods, including the individual mean and the semi-parametric method developed at Iowa State University (ISU), using data from a random sample and computer simulations. Both the NCI and ISU methods for nutrients are superior to the distribution of individual means. For simple (no covariate) models, quantile estimates are similar between the NCI and ISU methods. The bootstrap approach used by the NCI method to estimate standard errors of quantiles appears preferable to Taylor linearization. One major advantage of the NCI method is its ability to provide estimates for subpopulations through the incorporation of covariates into the model. The NCI method may be used for estimating the distribution of usual nutrient intake for populations and subpopulations as part of a unified framework of estimation of usual intake of dietary constituents. Copyright © 2010 John Wiley & Sons, Ltd.

419 citations


Journal ArticleDOI
TL;DR: A doubly robust version of IPTW had superior performance compared with the other propensity-score methods and resulted in unbiased estimation of risk differences, treatment effects with the lowest standard errors, confidence intervals with the correct coverage rates, and correct type I error rates.
Abstract: Propensity score methods are increasingly being used to estimate the effects of treatments on health outcomes using observational data. There are four methods for using the propensity score to estimate treatment effects: covariate adjustment using the propensity score, stratification on the propensity score, propensity-score matching, and inverse probability of treatment weighting (IPTW) using the propensity score. When outcomes are binary, the effect of treatment on the outcome can be described using odds ratios, relative risks, risk differences, or the number needed to treat. Several clinical commentators suggested that risk differences and numbers needed to treat are more meaningful for clinical decision making than are odds ratios or relative risks. However, there is a paucity of information about the relative performance of the different propensity-score methods for estimating risk differences. We conducted a series of Monte Carlo simulations to examine this issue. We examined bias, variance estimation, coverage of confidence intervals, mean-squared error (MSE), and type I error rates. A doubly robust version of IPTW had superior performance compared with the other propensity-score methods. It resulted in unbiased estimation of risk differences, treatment effects with the lowest standard errors, confidence intervals with the correct coverage rates, and correct type I error rates. Stratification, matching on the propensity score, and covariate adjustment using the propensity score resulted in minor to modest bias in estimating risk differences. Estimators based on IPTW had lower MSE compared with other propensity-score methods. Differences between IPTW and propensity-score matching may reflect that these two methods estimate the average treatment effect and the average treatment effect for the treated, respectively. Copyright © 2010 John Wiley & Sons, Ltd.
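A doubly robust (augmented IPTW) estimate of a risk difference can be written in a few lines of base R; the sketch below uses hypothetical variable names (binary treatment z, binary outcome y, covariates x1, x2) and omits the robust variance estimation that would be needed in practice.

```r
# hypothetical data frame 'dat' with binary treatment z, binary outcome y, covariates x1, x2
e   <- glm(z ~ x1 + x2, family = binomial, data = dat)$fitted.values    # propensity score

m1  <- glm(y ~ x1 + x2, family = binomial, data = subset(dat, z == 1))  # outcome model, treated
m0  <- glm(y ~ x1 + x2, family = binomial, data = subset(dat, z == 0))  # outcome model, untreated
mu1 <- predict(m1, newdata = dat, type = "response")
mu0 <- predict(m0, newdata = dat, type = "response")

# augmented IPTW (doubly robust) estimates of the two marginal risks and their difference
p1 <- mean(dat$z * dat$y / e             - (dat$z - e) / e       * mu1)
p0 <- mean((1 - dat$z) * dat$y / (1 - e) + (dat$z - e) / (1 - e) * mu0)
rd <- p1 - p0
```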

318 citations


Journal ArticleDOI
TL;DR: A systematic strategy for addressing the challenge of building a good enough mixed effects model is suggested, along with easily implemented practical advice for model building.
Abstract: Mixed effects models have become very popular, especially for the analysis of longitudinal data. One challenge is how to build a good enough mixed effects model. In this paper, we suggest a systematic strategy for addressing this challenge and introduce easily implemented practical advice to build mixed effects models. A general discussion of the scientific strategies motivates the recommended five-step procedure for model fitting. The need to model both the mean structure (the fixed effects) and the covariance structure (the random effects and residual error) creates the fundamental flexibility and complexity. Some very practical recommendations help to conquer the complexity. Centering, scaling, and full-rank coding of all the predictor variables radically improve the chances of convergence, computing speed, and numerical accuracy. Applying computational and assumption diagnostics from univariate linear models to mixed model data greatly helps to detect and solve the related computational problems. The approach helps to fit more general covariance models, a crucial step in selecting a credible covariance model needed for defensible inference. A detailed demonstration of the recommended strategy is based on data from a published study of a randomized trial of a multicomponent intervention to prevent young adolescents' alcohol use. The discussion highlights a need for additional covariance and inference tools for mixed models. The discussion also highlights the need for improving how scientists and statisticians teach and review the process of finding a good enough mixed model.
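The centring and scaling advice translates directly into code; a sketch in lme4 (not necessarily the software used in the paper), with hypothetical variable names loosely mirroring the adolescent alcohol-use trial.

```r
library(lme4)

# centre and scale continuous predictors: helps convergence, computing speed and accuracy
trial$time.c <- as.numeric(scale(trial$time))
trial$age.c  <- as.numeric(scale(trial$age))

# mean structure (fixed effects) plus covariance structure (random intercept and slope)
fit <- lmer(alcohol_use ~ time.c * intervention + age.c + (1 + time.c | subject),
            data = trial, REML = TRUE)
summary(fit)
```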

233 citations


Journal ArticleDOI
TL;DR: Improving the methods and infrastructure for CER will require sustained attention to issues including meaningful involvement of patients, consumers, clinicians, payers, and policymakers in key phases of CER study design and implementation; development of methodological 'best practices' for CER study design; and improvements in research infrastructure.
Abstract: Comparative effectiveness research (CER) has received substantial attention as a potential approach for improving health outcomes while lowering costs of care, and for improving the relevance and quality of clinical and health services research. The Institute of Medicine defines CER as 'the conduct and synthesis of systematic research comparing different interventions and strategies to prevent, diagnose, treat, and monitor health conditions. The purpose of this research is to inform patients, providers, and decision-makers, responding to their expressed needs, about which interventions are most effective for which patients under specific circumstances.' Improving the methods and infrastructure for CER will require sustained attention to the following issues: (1) Meaningful involvement of patients, consumers, clinicians, payers, and policymakers in key phases of CER study design and implementation; (2) Development of methodological 'best practices' for the design of CER studies that reflect decision-maker needs and balance internal validity with relevance, feasibility and timeliness; and (3) Improvements in research infrastructure to enhance the validity and efficiency with which CER studies are implemented. The approach to addressing each of these issues should be informed by the understanding that the primary purpose of CER is to help health care decision makers make informed clinical and health policy decisions.

Journal ArticleDOI
TL;DR: It is shown that generalized pairwise comparisons extend well-known non-parametric tests, and they lead to a general measure of the difference between the groups called the 'proportion in favor of treatment', denoted Δ, which is related to traditional measures of treatment effect for a single variable.
Abstract: This paper extends the idea behind the U-statistic of the Wilcoxon-Mann-Whitney test to perform generalized pairwise comparisons between two groups of observations. The observations are outcomes captured by a single variable, possibly repeatedly measured, or by several variables of any type (e.g. discrete, continuous, time to event). When several outcomes are considered, they must be prioritized. We show that generalized pairwise comparisons extend well-known non-parametric tests, and illustrate their interest using data from two randomized clinical trials. We also show that they lead to a general measure of the difference between the groups called the 'proportion in favor of treatment', denoted Δ, which is related to traditional measures of treatment effect for a single variable.
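For a single continuous outcome, the 'proportion in favor of treatment' reduces to a tally over all treatment-control pairs; a base-R sketch (the prioritization rules for multiple outcomes are not shown).

```r
# Delta for one continuous outcome (higher assumed better): wins minus losses over all pairs
prop_in_favor <- function(treat, control) {
  d <- outer(treat, control, "-")                # every treatment-vs-control pairwise difference
  (sum(d > 0) - sum(d < 0)) / length(d)
}

set.seed(1)
prop_in_favor(rnorm(50, mean = 1), rnorm(50, mean = 0))   # simulated example
```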

Journal ArticleDOI
TL;DR: A spatial scan statistic for multinomial data is proposed and applied to meningitis data consisting of five different disease categories to identify areas with distinct disease-type patterns in two counties in the U.K.
Abstract: As a geographical cluster detection analysis tool, the spatial scan statistic has been developed for different types of data such as Bernoulli, Poisson, ordinal, exponential and normal. Another interesting data type is multinomial. For example, one may want to find clusters where the disease-type distribution is statistically significantly different from the rest of the study region when there are different types of disease. In this paper, we propose a spatial scan statistic for such data, which is useful for geographical cluster detection analysis for categorical data without any intrinsic order information. The proposed method is applied to meningitis data consisting of five different disease categories to identify areas with distinct disease-type patterns in two counties in the U.K. The performance of the method is evaluated through a simulation study.

Journal ArticleDOI
TL;DR: A new confidence interval is proposed that has better coverage than the DerSimonian-Laird method and is less sensitive to publication bias; it is centred on a fixed effects estimate but allows for heterogeneity by including an assessment of the extra uncertainty induced by the random effects setting.
Abstract: The DerSimonian-Laird confidence interval for the average treatment effect in meta-analysis is widely used in practice when there is heterogeneity between studies. However, it is well known that its coverage probability (the probability that the interval actually includes the true value) can be substantially below the target level of 95 per cent. It can also be very sensitive to publication bias. In this paper, we propose a new confidence interval that has better coverage than the DerSimonian-Laird method, and that is less sensitive to publication bias. The key idea is to note that fixed effects estimates are less sensitive to such biases than random effects estimates, since they put relatively more weight on the larger studies and relatively less weight on the smaller studies. Whereas the DerSimonian-Laird interval is centred on a random effects estimate, we centre our confidence interval on a fixed effects estimate, but allow for heterogeneity by including an assessment of the extra uncertainty induced by the random effects setting. Properties of the resulting confidence interval are studied by simulation and compared with other random effects confidence intervals that have been proposed in the literature. An example is briefly discussed.
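The metafor package appears to implement an interval of this type through its hc() function; a sketch assuming that interface, using the BCG vaccine trials bundled with the package.

```r
library(metafor)

# BCG vaccine trials shipped with metafor, expressed as log odds ratios
dat <- escalc(measure = "OR", ai = tpos, bi = tneg, ci = cpos, di = cneg, data = dat.bcg)

res <- rma(yi, vi, data = dat, method = "DL")   # standard DerSimonian-Laird random-effects fit
hc(res)                                         # interval centred on the fixed-effects estimate,
                                                # widened for between-study heterogeneity
```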

Journal ArticleDOI
TL;DR: Logistic quantile regression constitutes an effective method for modelling continuous bounded outcomes, filling a gap analogous to the one long addressed for binary outcomes by the widespread use of logistic and probit regression.
Abstract: When research interest lies in continuous outcome variables that take on values within a known range (e.g. a visual analog scale for pain between 0 and 100 mm), the traditional statistical methods, such as least-squares regression, mixed-effects models, and even classic nonparametric methods such as the Wilcoxon test, may prove inadequate. Frequency distributions of bounded outcomes are often unimodal, U-shaped, and J-shaped. To the best of our knowledge, in the biomedical and epidemiological literature bounded outcomes have seldom been analyzed by appropriate methods that, for one, correctly constrain inference to lie within the feasible range of values. In many respects, continuous bounded outcomes can be likened to probabilities or propensities. Yet, what has long been heeded when modeling the probability of binary outcomes with the widespread use of logistic and probit regression so far appears to have been overlooked with continuous bounded outcomes, with consequences at times disastrous. Logistic quantile regression constitutes an effective method to fill this gap.
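The core idea, quantile regression on a logit-transformed bounded outcome with back-transformation justified by the equivariance of quantiles to monotone transforms, can be sketched with the quantreg package; variable names and the small boundary adjustment are hypothetical.

```r
library(quantreg)

# hypothetical data: VAS pain score 'vas' bounded in [0, 100], treatment and age covariates
eps    <- 0.5                                              # shift scores away from the boundaries
y.star <- log((dat$vas + eps) / (100 + eps - dat$vas))     # logit-type transform to the real line

fit <- rq(y.star ~ treatment + age, tau = 0.5, data = dat) # median regression on transformed scale

# quantiles are equivariant under monotone transforms: map fitted quantiles back to the 0-100 scale
inv <- function(eta) ((100 + eps) * exp(eta) - eps) / (1 + exp(eta))
q50 <- inv(predict(fit))
```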

Journal ArticleDOI
TL;DR: A previous approach to estimate the crude probability of death in population‐based cancer studies used life table methods, but it is shown how the estimates can be obtained after fitting a relative survival model.
Abstract: Relative survival is used extensively in population-based cancer studies to measure patient survival correcting for causes of death not related to the disease of interest. An advantage of relative survival is that it provides a measure of mortality associated with a particular disease, without the need for information on cause of death. Relative survival provides a measure of net mortality, i.e. the probability of death due to cancer in the absence of other causes. This is a useful measure, but it is also of interest to measure crude mortality, i.e. the probability of death due to cancer in the presence of other causes. A previous approach to estimate the crude probability of death in population-based cancer studies used life table methods, but we show how the estimates can be obtained after fitting a relative survival model. We adopt flexible parametric models for relative survival, which use restricted cubic splines for the baseline cumulative excess hazard and for any time-dependent effects. We illustrate the approach using an example of men diagnosed with prostate cancer in England and Wales showing the differences in net and crude survival for different ages.

Journal ArticleDOI
TL;DR: A new strategy is put forward for situations where there is no a priori information about 'when' and 'where' differences between experimental conditions appear in the spatio-temporal domain; this requires simultaneously testing numerous hypotheses, which increases the risk of false positives.
Abstract: Current analysis of event-related potentials (ERP) data is usually based on the a priori selection of channels and time windows of interest for studying the differences between experimental conditions in the spatio-temporal domain. In this work we put forward a new strategy designed for situations when there is no a priori information about 'when' and 'where' these differences appear in the spatio-temporal domain, which requires simultaneously testing numerous hypotheses and thereby increases the risk of false positives. This issue is known as the problem of multiple comparisons and has been managed with methods such as the permutation test and methods that control the false discovery rate (FDR). Although the former has been previously applied, to our knowledge, FDR methods have not been introduced in ERP data analysis. Here we compare the performance (on simulated and real data) of the permutation test and two FDR methods (Benjamini and Hochberg (BH) and local-fdr, by Efron). All these methods have been shown to be valid for dealing with the problem of multiple comparisons in ERP analysis, avoiding the ad hoc selection of channels and/or time windows. FDR methods are a good alternative to the common and computationally more expensive permutation test. The BH method for independent tests gave the best overall performance regarding the balance between type I and type II errors. The local-fdr method is preferable for high dimensional (multichannel) problems where most of the tests conform to the empirical null hypothesis. Differences among the methods according to assumptions, null distributions and dimensionality of the problem are also discussed.
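A small simulated sketch of the two FDR approaches compared here: Benjamini-Hochberg via base R's p.adjust and Efron's local fdr via the locfdr package (assuming that package's interface); the permutation test is not shown.

```r
set.seed(2)
z <- c(rnorm(1800), rnorm(200, mean = 3))   # many null tests plus a block of true effects
p <- 2 * pnorm(-abs(z))                     # two-sided p-values, one per (channel, time) test

p.bh <- p.adjust(p, method = "BH")          # Benjamini-Hochberg FDR control
table(discovery = p.bh < 0.05)

library(locfdr)
res <- locfdr(z)                            # Efron's local fdr with an estimated empirical null
```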

Journal ArticleDOI
TL;DR: The AUC has a well‐understood weakness when comparing ROC curves which cross, but it also has the more fundamental weakness of failing to balance different kinds of misdiagnoses effectively.
Abstract: Because accurate diagnosis lies at the heart of medicine, it is important to be able to evaluate the effectiveness of diagnostic tests. A variety of accuracy measures are used. One particularly widely used measure is the AUC, the area under the receiver operating characteristic (ROC) curve. This measure has a well-understood weakness when comparing ROC curves which cross. However, it also has the more fundamental weakness of failing to balance different kinds of misdiagnoses effectively. This is not merely an aspect of the inevitable arbitrariness in choosing a performance measure, but is a core property of the way the AUC is defined. This property is explored, and an alternative, the H measure, is described. Copyright © 2010 John Wiley & Sons, Ltd.
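The AUC is the Wilcoxon-Mann-Whitney statistic and can be computed in a few lines; the H measure is available in the hmeasure R package (the sketch assumes its HMeasure() interface), shown here on simulated scores.

```r
# AUC as the rank-based Wilcoxon-Mann-Whitney statistic (base R)
auc <- function(score, y) {                        # y: 0/1 class labels
  r  <- rank(score)
  n1 <- sum(y == 1); n0 <- sum(y == 0)
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

set.seed(3)
y     <- rbinom(200, 1, 0.3)
score <- y + rnorm(200)                            # simulated classifier scores
auc(score, y)

library(hmeasure)
HMeasure(true.class = y, scores = score)$metrics   # reports the H measure alongside the AUC
```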

Journal ArticleDOI
TL;DR: CFL is proposed as an attractive solution to the stratified analysis of binary data, irrespective of the occurrence of monotone likelihood; confidence intervals based on the penalized conditional likelihood had close-to-nominal coverage rates and the corresponding tests yielded the highest power among all methods compared.
Abstract: Conditional logistic regression is used for the analysis of binary outcomes when subjects are stratified into several subsets, e.g. matched pairs or blocks. Log odds ratio estimates are usually found by maximizing the conditional likelihood. This approach eliminates all strata-specific parameters by conditioning on the number of events within each stratum. However, in the analyses of both an animal experiment and a lung cancer case-control study, conditional maximum likelihood (CML) resulted in infinite odds ratio estimates and monotone likelihood. Estimation can be improved by using Cytel Inc.'s well-known LogXact software, which provides a median unbiased estimate and exact or mid-p confidence intervals. Here, we suggest and outline point and interval estimation based on maximization of a penalized conditional likelihood in the spirit of Firth's (Biometrika 1993; 80:27-38) bias correction method (CFL). We present comparative analyses of both studies, demonstrating some advantages of CFL over competitors. We report on a small-sample simulation study where CFL log odds ratio estimates were almost unbiased, whereas LogXact estimates showed some bias and CML estimates exhibited serious bias. Confidence intervals and tests based on the penalized conditional likelihood had close-to-nominal coverage rates and yielded highest power among all methods compared, respectively. Therefore, we propose CFL as an attractive solution to the stratified analysis of binary data, irrespective of the occurrence of monotone likelihood. A SAS program implementing CFL is available at: http://www.muw.ac.at/msi/biometrie/programs.

Journal ArticleDOI
TL;DR: This paper examines the properties of the adaptive kernel estimator by both asymptotic analysis and a simulation study, finding advantages over the fixed kernel approach in both cases.
Abstract: Kernel smoothing is routinely used for the estimation of relative risk based on point locations of disease cases and sampled controls over a geographical region. Typically, fixed-bandwidth kernel estimation has been employed, despite the widely recognized problems experienced with this methodology when the underlying densities exhibit the type of spatial inhomogeneity frequently seen in geographical epidemiology. A more intuitive approach is to utilize a spatially adaptive, variable smoothing parameter. In this paper, we examine the properties of the adaptive kernel estimator by both asymptotic analysis and a simulation study, finding advantages over the fixed kernel approach in both cases. We also look at practical issues with implementation of the adaptive relative risk estimator (including bandwidth choice and boundary correction), and develop a computationally inexpensive method for generating tolerance contours to highlight areas of significantly elevated risk.

Journal ArticleDOI
TL;DR: This paper proposes a method for the adjustment of the usual group-sequential boundaries to maintain strong control of the familywise error rate even when short-term endpoint data are used for the treatment selection at the first interim analysis.
Abstract: Seamless phase II/III designs allow strong control of the familywise type I error rate when the most promising of a number of experimental treatments is selected at an interim analysis to continue along with the control treatment. If the primary endpoint is observed only after long-term follow-up it may be desirable to use correlated short-term endpoint data available at the interim analysis to inform the treatment selection. If short-term data are available for some patients for whom the primary endpoint is not available, basing treatment selection on these data may, however, lead to inflation of the type I error rate. This paper proposes a method for the adjustment of the usual group-sequential boundaries to maintain strong control of the familywise error rate even when short-term endpoint data are used for the treatment selection at the first interim analysis. This method allows the use of the short-term data, leading to an increase in power when these data are correlated with the primary endpoint data.

Journal ArticleDOI
TL;DR: This work proposes alternative approaches based on Poisson random effects models to make inference about the relative risk between two treatment groups and shows that the proposed methods perform well when the underlying event rates are low.
Abstract: Meta-analysis provides a useful framework for combining information across related studies and has been widely utilized to combine data from clinical studies in order to evaluate treatment efficacy. More recently, meta-analysis has also been used to assess drug safety. However, because adverse events are typically rare, standard methods may not work well in this setting. Most popular methods use fixed or random effects models to combine effect estimates obtained separately for each individual study. In the context of very rare outcomes, effect estimates from individual studies may be unstable or even undefined. We propose alternative approaches based on Poisson random effects models to make inference about the relative risk between two treatment groups. Simulation studies show that the proposed methods perform well when the underlying event rates are low. The methods are illustrated using data from a recent meta-analysis (N. Engl. J. Med. 2007; 356(24):2457-2471) of 48 comparative trials involving rosiglitazone, a type 2 diabetes drug, with respect to its possible cardiovascular toxicity.
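The general approach, Poisson random-effects models fitted to arm-level counts rather than to per-study effect estimates, can be sketched with lme4; the exact models used in the paper may differ, and the data layout and variable names below are hypothetical.

```r
library(lme4)

# hypothetical long-format data 'arms': one row per trial arm, with event counts,
# person-time of follow-up and a 0/1 treatment indicator
fit <- glmer(events ~ treat + (1 | trial) + (0 + treat | trial) + offset(log(persontime)),
             family = poisson, data = arms)

exp(fixef(fit)["treat"])   # pooled rate ratio for treatment vs control
```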

Journal ArticleDOI
TL;DR: Properties of the index J(3), defined as the accuracy, or the maximum correct classification, for a given three-class classification problem, are studied, and the methods are applied to data from an MRS study on human immunodeficiency virus (HIV) patients.
Abstract: We study properties of the index J(3), defined as the accuracy, or the maximum correct classification, for a given three-class classification problem. Specifically, using J(3) one can assess the discrimination between the three distributions and obtain an optimal pair of cut-off points c(1) < c(2).

Journal ArticleDOI
TL;DR: A new Bayesian deterministic inference approach for latent Gaussian models using integrated nested Laplace approximations (INLA) is compared with MCMC and maximum likelihood; the results indicate that INLA is more stable and gives generally better coverage probabilities for the pooled estimates and less biased estimates of variance parameters.
Abstract: For bivariate meta-analysis of diagnostic studies, likelihood approaches are very popular. However, they often run into numerical problems with possible non-convergence. In addition, the construction of confidence intervals is controversial. Bayesian methods based on Markov chain Monte Carlo (MCMC) sampling could be used, but are often difficult to implement, and require long running times and diagnostic convergence checks. Recently, a new Bayesian deterministic inference approach for latent Gaussian models using integrated nested Laplace approximations (INLA) has been proposed. With this approach MCMC sampling becomes redundant as the posterior marginal distributions are directly and accurately approximated. By means of a real data set we investigate the influence of the prior information provided and compare the results obtained by INLA, MCMC, and the maximum likelihood procedure SAS PROC NLMIXED. Using a simulation study we further extend the comparison of INLA and SAS PROC NLMIXED by assessing their performance in terms of bias, mean-squared error, coverage probability, and convergence rate. The results indicate that INLA is more stable and gives generally better coverage probabilities for the pooled estimates and less biased estimates of variance parameters. The user-friendliness of INLA is demonstrated by documented R-code.
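A heavily simplified sketch of the INLA workflow (a univariate random-effects logistic model rather than the paper's bivariate model for diagnostic studies), assuming the INLA package's standard formula interface; data and variable names are hypothetical.

```r
library(INLA)   # installed from https://www.r-inla.org rather than CRAN

# hypothetical data frame 'studies': y = true positives, n = diseased subjects, one row per study
formula <- y ~ 1 + f(study, model = "iid")          # study-level random effect
fit <- inla(formula, family = "binomial", Ntrials = n, data = studies)

summary(fit)    # posterior marginals are approximated directly; no MCMC convergence checks
```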

Journal ArticleDOI
TL;DR: This work advocates a simple graphic that provides further insight into discrimination, namely a histogram or dot plot of the risk score in the outcome groups, and discusses the comparative merits of the c-index and the (standardized) mean difference in risk score between the outcome groups.
Abstract: Logistic regression models are widely used in medicine for predicting patient outcome (prognosis) and constructing diagnostic tests (diagnosis). Multivariable logistic models yield an (approximately) continuous risk score, a transformation of which gives the estimated event probability for an individual. A key aspect of model performance is discrimination, that is, the model's ability to distinguish between patients who have (or will have) an event of interest and those who do not (or will not). Graphical aids are important in understanding a logistic model. The receiver-operating characteristic (ROC) curve is familiar, but not necessarily easy to interpret. We advocate a simple graphic that provides further insight into discrimination, namely a histogram or dot plot of the risk score in the outcome groups. The most popular performance measure for the logistic model is the c-index, numerically equivalent to the area under the ROC curve. We discuss the comparative merits of the c-index and the (standardized) mean difference in risk score between the outcome groups. The latter statistic, sometimes known generically as the effect size, has been computed in slightly different ways by several different authors, including Glass, Cohen and Hedges. An alternative measure is the overlap between the distributions in the outcome groups, defined as the area under the minimum of the two density functions. The larger the overlap, the weaker the discrimination. Under certain assumptions about the distribution of the risk score, the c-index, effect size and overlap are functionally related. We illustrate the ideas with simulated and real data sets.
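The advocated graphic and the associated statistics are easy to produce in base R; simulated risk scores are used for illustration.

```r
set.seed(4)
y  <- rbinom(300, 1, 0.4)
lp <- 1.2 * y + rnorm(300)                 # simulated risk score (linear predictor)

# overlaid histograms of the risk score in the two outcome groups
br <- seq(min(lp), max(lp), length.out = 25)
hist(lp[y == 0], breaks = br, col = rgb(0, 0, 1, 0.4), main = "", xlab = "risk score")
hist(lp[y == 1], breaks = br, col = rgb(1, 0, 0, 0.4), add = TRUE)

# c-index via its rank (Wilcoxon) representation, and the standardized mean difference
cindex <- (mean(rank(lp)[y == 1]) - (sum(y) + 1) / 2) / sum(y == 0)
sp     <- sqrt(((sum(y) - 1) * var(lp[y == 1]) + (sum(1 - y) - 1) * var(lp[y == 0])) / (length(y) - 2))
smd    <- (mean(lp[y == 1]) - mean(lp[y == 0])) / sp
```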

Journal ArticleDOI
TL;DR: This paper shows how to efficiently build fully and partially independent conditional (FIC/PIC) sparse approximations for the GP on a two-dimensional surface, and how to conduct approximate inference using the expectation propagation algorithm and the Laplace approximation.
Abstract: Gaussian process (GP) models are widely used in disease mapping as they provide a natural framework for modeling spatial correlations. Their challenges, however, lie in computational burden and memory requirements. In disease mapping models, the other difficulty is inference, which is analytically intractable due to the non-Gaussian observation model. In this paper, we address both these challenges. We show how to efficiently build fully and partially independent conditional (FIC/PIC) sparse approximations for the GP on a two-dimensional surface, and how to conduct approximate inference using the expectation propagation (EP) algorithm and the Laplace approximation (LA). We also propose to combine FIC with a compactly supported covariance function to construct a computationally efficient additive model that can model long and short length-scale spatial correlations simultaneously. The benefit of these approximations is computational. The sparse GPs speed up the computations and reduce the memory requirements. The posterior inference via EP and Laplace approximation is much faster and is practically as accurate as via Markov chain Monte Carlo.

Journal ArticleDOI
TL;DR: It is established that multiple-treatments meta-regression is a good method for examining whether novel agent effects are present; estimation of their magnitude in the three worked examples suggests an exaggeration of the hazard ratio by 6 per cent (2-11 per cent).
Abstract: Multiple-treatments meta-analyses are increasingly used to evaluate the relative effectiveness of several competing regimens. In some fields which evolve with the continuous introduction of new agents over time, it is possible that in trials comparing older with newer regimens the effectiveness of the latter is exaggerated. Optimism bias, conflicts of interest and other forces may be responsible for this exaggeration, but its magnitude and impact, if any, needs to be formally assessed in each case. Whereas such novelty bias is not identifiable in a pair-wise meta-analysis, it is possible to explore it in a network of trials involving several treatments. To evaluate the hypothesis of novel agent effects and adjust for them, we developed a multiple-treatments meta-regression model fitted within a Bayesian framework. When there are several multiple-treatments meta-analyses for diverse conditions within the same field/specialty with similar agents involved, one may consider either different novel agent effects in each meta-analysis or may consider the effects to be exchangeable across the different conditions and outcomes. As an application, we evaluate the impact of modelling and adjusting for novel agent effects for chemotherapy and other non-hormonal systemic treatments for three malignancies. We present the results and the impact of different model assumptions to the relative ranking of the various regimens in each network. We established that multiple-treatments meta-regression is a good method for examining whether novel agent effects are present and estimation of their magnitude in the three worked examples suggests an exaggeration of the hazard ratio by 6 per cent (2-11 per cent).

Journal ArticleDOI
TL;DR: The meta‐analysis provided evidence that ethanol intake was related to esophageal SCC risk in a nonlinear fashion, and a statistically significant excess risk for moderate and intermediate doses of alcohol was also observed, with no evidence of a threshold effect.
Abstract: A fundamental challenge in meta-analyses of published epidemiological dose-response data is the estimate of the function describing how the risk of disease varies across different levels of a given exposure. Issues in trend estimate include within studies variability, between studies heterogeneity, and nonlinear trend components. We present a method, based on a two-step process, that addresses simultaneously these issues. First, two-term fractional polynomial models are fitted within each study included in the meta-analysis, taking into account the correlation between the reported estimates for different exposure levels. Second, the pooled dose-response relationship is estimated considering the between studies heterogeneity, using a bivariate random-effects model. This method is illustrated by a meta-analysis aimed to estimate the shape of the dose-response curve between alcohol consumption and esophageal squamous cell carcinoma (SCC). Overall, 14 case-control studies and one cohort study, including 3000 cases of esophageal SCC, were included. The meta-analysis provided evidence that ethanol intake was related to esophageal SCC risk in a nonlinear fashion. High levels of alcohol consumption resulted in a substantial risk of esophageal SCC as compared to nondrinkers. However, a statistically significant excess risk for moderate and intermediate doses of alcohol was also observed, with no evidence of a threshold effect.

Journal ArticleDOI
TL;DR: This work presents a conditional maximized sequential probability ratio test (CMaxSPRT), which adjusts for the uncertainty in the expected counts and incorporates the randomness and variability from both the historical data and the surveillance population.
Abstract: The importance of post-marketing surveillance for drug and vaccine safety is well recognized as rare but serious adverse events may not be detected in pre-approval clinical trials. In such surveillance, a sequential test is preferable, in order to detect potential problems as soon as possible. Various sequential probability ratio tests (SPRT) have been applied in near real-time vaccine and drug safety surveillance, including Wald's classical SPRT with a single alternative and the Poisson-based maximized SPRT (MaxSPRT) with a composite alternative. These methods require that the expected number of events under the null hypothesis is known as a function of time t. In practice, the expected counts are usually estimated from historical data. When a large sample size from the historical data is lacking, the SPRTs are biased due to the variance in the estimate of the expected number of events. We present a conditional maximized sequential probability ratio test (CMaxSPRT), which adjusts for the uncertainty in the expected counts. Our test incorporates the randomness and variability from both the historical data and the surveillance population. Evaluations of the statistical power for CMaxSPRT are presented under different scenarios.

Journal ArticleDOI
TL;DR: Practical considerations for establishing efficient study designs to estimate relevant target doses are presented, including optimal designs for estimating both the minimum effective dose and the dose achieving a certain percentage of the maximum treatment effect.
Abstract: A key objective in the clinical development of a medicinal drug is the determination of an adequate dose level and, more broadly, the characterization of its dose response relationship. If the dose is set too high, safety and tolerability problems are likely to result, while selecting too low a dose makes it difficult to establish adequate efficacy in the confirmatory phase, possibly leading to a failed program. Hence, dose finding studies are of critical importance in drug development and need to be planned carefully. In this paper, we focus on practical considerations for establishing efficient study designs to estimate relevant target doses. We consider optimal designs for estimating both the minimum effective dose and the dose achieving a certain percentage of the maximum treatment effect. These designs are compared with D-optimal designs for a given dose response model. Extensions to robust designs accounting for model uncertainty are also discussed. A case study is used to motivate and illustrate the methods from this paper.
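Optimal-design calculations of this kind are available in the DoseFinding R package; the sketch below assumes its Mods()/optDesign() interface, and the candidate models, dose grid, model weights and clinical relevance threshold are hypothetical.

```r
library(DoseFinding)

doses <- c(0, 10, 25, 50, 100, 150)

# two hypothetical candidate dose-response shapes (Emax and sigmoid Emax guesstimates)
mods <- Mods(emax = 25, sigEmax = c(50, 3), doses = doses, placEff = 0, maxEff = 1)

# model-robust D-optimal design: optimal allocation weights across the candidate doses
optDesign(mods, probs = c(0.5, 0.5), doses = doses, designCrit = "Dopt")

# target-dose ("TD") criterion: the dose achieving a clinically relevant effect Delta over placebo
optDesign(mods, probs = c(0.5, 0.5), doses = doses, designCrit = "TD", Delta = 0.5)
```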