
Showing papers in "Statistical Methods in Medical Research in 2016"


Journal ArticleDOI
TL;DR: This paper looks at how to choose an external pilot trial sample size in order to minimise the sample size of the overall clinical trial programme, that is, the pilot and the main trial together, and produces a method of calculating the optimal solution.
Abstract: Sample size justification is an important consideration when planning a clinical trial, not only for the main trial but also for any preliminary pilot trial. When the outcome is a continuous variable, the sample size calculation requires an accurate estimate of the standard deviation of the outcome measure. A pilot trial can be used to get an estimate of the standard deviation, which could then be used to anticipate what may be observed in the main trial. However, an important consideration is that pilot trials often estimate the standard deviation parameter imprecisely. This paper looks at how we can choose an external pilot trial sample size in order to minimise the sample size of the overall clinical trial programme, that is, the pilot and the main trial together. We produce a method of calculating the optimal solution to the required pilot trial sample size when the standardised effect size for the main trial is known. However, as it may not be possible to know the standardised effect size to be used prior to the pilot trial, approximate rules are also presented. For a main trial designed with 90% power and two-sided 5% significance, we recommend pilot trial sample sizes per treatment arm of 75, 25, 15 and 10 for standardised effect sizes that are extra small (≤0.1), small (0.2), medium (0.5) or large (0.8), respectively.
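
As a point of reference, here is a minimal Python sketch of the standard per-arm sample size formula into which a pilot-based standard deviation (via the standardised effect size d) would be fed; it is illustrative only and is not the paper's pilot-size optimisation method.

from math import ceil
from scipy.stats import norm

def per_arm_n(d, alpha=0.05, power=0.90):
    # Normal-approximation sample size per arm for a two-arm comparison of means,
    # with standardised effect size d = (difference in means) / SD.
    z_alpha = norm.ppf(1 - alpha / 2)
    z_beta = norm.ppf(power)
    return ceil(2 * (z_alpha + z_beta) ** 2 / d ** 2)

for d in (0.2, 0.5, 0.8):
    print(d, per_arm_n(d))   # roughly 526, 85 and 33 per arm at 90% power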

783 citations


Journal ArticleDOI
TL;DR: In this paper, the authors propose a new parameterisation of the BYM (Besag, York and Mollié) model, which allows the hyperparameters of the two random effects to be seen independently from each other.
Abstract: In recent years, disease mapping studies have become a routine application within geographical epidemiology and are typically analysed within a Bayesian hierarchical model formulation. A variety of model formulations for the latent level have been proposed but all come with inherent issues. In the classical BYM (Besag, York and Mollié) model, the spatially structured component cannot be seen independently from the unstructured component. This makes prior definitions for the hyperparameters of the two random effects challenging. There are alternative model formulations that address this confounding; however, the issue of how to choose interpretable hyperpriors is still unsolved. Here, we discuss a recently proposed parameterisation of the BYM model that leads to improved parameter control as the hyperparameters can be seen independently from each other. Furthermore, the need for a scaled spatial component is addressed, which facilitates assignment of interpretable hyperpriors and makes these transferable between spatial applications with different graph structures. The hyperparameters themselves are used to define flexible extensions of simple base models. Consequently, penalised complexity priors for these parameters can be derived based on the information-theoretic distance from the flexible model to the base model, giving priors with clear interpretation. We provide implementation details for the new model formulation which preserves sparsity properties, and we investigate systematically the model performance and compare it to existing parameterisations. Through a simulation study, we show that the new model performs well, showing both good learning abilities and good shrinkage behaviour. In terms of model choice criteria, the proposed model performs at least as well as existing parameterisations, but only the new formulation offers parameters that are interpretable and hyperpriors that have a clear meaning.
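
As a sketch of the parameterisation being described (LaTeX; notation assumed to follow the common BYM2 write-up, with u_* the scaled structured effect), the combined random effect is

\[
b = \frac{1}{\sqrt{\tau_b}}\left(\sqrt{1-\phi}\, v + \sqrt{\phi}\, u_*\right),
\]

where v is the unstructured (i.i.d. normal) component, u_* is the intrinsic CAR component scaled so that its generalised variance equals one, \tau_b is the overall precision and \phi \in [0,1] is the proportion of the marginal variance explained by the structured effect. Penalised complexity priors are then assigned to \tau_b and \phi.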

261 citations


Journal ArticleDOI
TL;DR: The simulations suggest that the second method has greater potential to produce substantial bias reductions than the first, particularly when the missing values are predictive of treatment assignment.
Abstract: In many observational studies, analysts estimate treatment effects using propensity scores, e.g. by matching or sub-classifying on the scores. When some values of the covariates are missing, analysts can use multiple imputation to fill in the missing data, estimate propensity scores based on the m completed datasets, and use the propensity scores to estimate treatment effects. We compare two approaches to implement this process. In the first, the analyst estimates the treatment effect using propensity score matching within each completed data set, and averages the m treatment effect estimates. In the second approach, the analyst averages the m propensity scores for each record across the completed datasets, and performs propensity score matching with these averaged scores to estimate the treatment effect. We compare properties of both methods via simulation studies using artificial and real data. The simulations suggest that the second method has greater potential to produce substantial bias reductions than the first, particularly when the missing values are predictive of treatment assignment.
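
A hedged Python sketch of the two strategies (variable names are hypothetical; imputed_X is a list of m completed covariate matrices, with treatment and outcome assumed fully observed):

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

def propensity(X, treat):
    return LogisticRegression(max_iter=1000).fit(X, treat).predict_proba(X)[:, 1]

def matched_effect(ps, treat, y):
    # 1:1 nearest-neighbour matching of treated subjects to controls on the score
    nn = NearestNeighbors(n_neighbors=1).fit(ps[treat == 0].reshape(-1, 1))
    idx = nn.kneighbors(ps[treat == 1].reshape(-1, 1), return_distance=False).ravel()
    return y[treat == 1].mean() - y[treat == 0][idx].mean()

def within_approach(imputed_X, treat, y):
    # approach 1: match within each completed data set, then average the m estimates
    return np.mean([matched_effect(propensity(X, treat), treat, y) for X in imputed_X])

def across_approach(imputed_X, treat, y):
    # approach 2: average each record's propensity score across imputations, match once
    ps_bar = np.mean([propensity(X, treat) for X in imputed_X], axis=0)
    return matched_effect(ps_bar, treat, y)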

247 citations


Journal ArticleDOI
TL;DR: A model-based framework for the assessment of calibration in the binary setting is described that provides natural extensions to the survival data setting, and it is shown that Poisson regression models can be used to easily assess calibration in prognostic models.
Abstract: Current methods used to assess calibration are limited, particularly in the assessment of prognostic models. Methods for testing and visualizing calibration (e.g. the Hosmer-Lemeshow test and calibration slope) have been well thought out in the binary regression setting. However, extension of these methods to Cox models is less well known and could be improved. We describe a model-based framework for the assessment of calibration in the binary setting that provides natural extensions to the survival data setting. We show that Poisson regression models can be used to easily assess calibration in prognostic models. In addition, we show that a calibration test suggested for use in survival data has poor performance. Finally, we apply these methods to the problem of external validation of a risk score developed for the general population when assessed in a special patient population (i.e. patients with particular comorbidities, such as rheumatoid arthritis).
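
A minimal sketch, assuming one has each subject's event indicator and model-based expected cumulative hazard over follow-up, of a Poisson-regression check of calibration-in-the-large (the full framework in the paper also covers calibration slope and smooth calibration curves):

import numpy as np
import statsmodels.api as sm

def calibration_in_the_large(events, expected):
    # events: 0/1 event indicator per subject
    # expected: model-predicted cumulative hazard over each subject's follow-up
    X = np.ones((len(events), 1))                       # intercept only
    fit = sm.GLM(events, X, family=sm.families.Poisson(),
                 offset=np.log(expected)).fit()
    smr = np.exp(fit.params[0])                         # observed/expected ratio
    return smr, fit.pvalues[0]                          # test of SMR = 1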

205 citations


Journal ArticleDOI
TL;DR: The unique features of the data within each cohort that have implications for the application of linear spline multilevel models are described, for example, differences in the density and inter-individual variation in measurement occasions, and multiple sources of measurement with varying measurement error.
Abstract: Childhood growth is of interest in medical research concerned with determinants and consequences of variation from healthy growth and development. Linear spline multilevel modelling is a useful approach for deriving individual summary measures of growth, which overcomes several data issues (co-linearity of repeat measures, the requirement for all individuals to be measured at the same ages and bias due to missing data). Here, we outline the application of this methodology to model individual trajectories of length/height and weight, drawing on examples from five cohorts from different generations and different geographical regions with varying levels of economic development. We describe the unique features of the data within each cohort that have implications for the application of linear spline multilevel models, for example, differences in the density and inter-individual variation in measurement occasions, and multiple sources of measurement with varying measurement error. After providing example Stata syntax and a suggested workflow for the implementation of linear spline multilevel models, we conclude with a discussion of the advantages and disadvantages of the linear spline approach compared with other growth modelling methods such as fractional polynomials, more complex spline functions and other non-linear models.
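
A hedged Python counterpart (the paper itself provides Stata syntax): a linear spline multilevel model fitted with statsmodels MixedLM on synthetic data, with knots arbitrarily placed at 3 and 12 months.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_child, ages = 50, np.array([0.0, 2, 6, 12, 18, 24])
df = pd.DataFrame({"id": np.repeat(np.arange(n_child), len(ages)),
                   "age": np.tile(ages, n_child)})

# Linear spline basis with knots at 3 and 12 months: each coefficient is the
# mean growth rate (kg/month) within the corresponding age window.
df["s0"] = df["age"].clip(upper=3)
df["s1"] = (df["age"] - 3).clip(lower=0, upper=9)
df["s2"] = (df["age"] - 12).clip(lower=0)

a = rng.normal(0, 0.4, n_child)[df["id"]]      # child-level intercept deviations
b = rng.normal(0, 0.10, n_child)[df["id"]]     # child-level early-growth slope deviations
df["weight"] = (3.5 + a + (0.9 + b) * df["s0"] + 0.45 * df["s1"]
                + 0.2 * df["s2"] + rng.normal(0, 0.3, len(df)))

# Random intercept and early-infancy slope shown for brevity; in practice the
# random part would usually include all spline slopes.
model = smf.mixedlm("weight ~ s0 + s1 + s2", df, groups=df["id"], re_formula="~s0")
print(model.fit().summary())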

159 citations


Journal ArticleDOI
TL;DR: It is found that stratification on the propensity score resulted in the greatest bias and a method based on earlier work by Cole and Hernán tended to have the best performance for estimating absolute effects of treatment on survival outcomes.
Abstract: Observational studies are increasingly being used to estimate the effect of treatments, interventions and exposures on outcomes that can occur over time. Historically, the hazard ratio, which is a relative measure of effect, has been reported. However, medical decision making is best informed when both relative and absolute measures of effect are reported. When outcomes are time-to-event in nature, the effect of treatment can also be quantified as the change in mean or median survival time due to treatment and the absolute reduction in the probability of the occurrence of an event within a specified duration of follow-up. We describe how three different propensity score methods, propensity score matching, stratification on the propensity score and inverse probability of treatment weighting using the propensity score, can be used to estimate absolute measures of treatment effect on survival outcomes. These methods are all based on estimating marginal survival functions under treatment and lack of treatment. We then conducted an extensive series of Monte Carlo simulations to compare the relative performance of these methods for estimating the absolute effects of treatment on survival outcomes. We found that stratification on the propensity score resulted in the greatest bias. Caliper matching on the propensity score and a method based on earlier work by Cole and Hernán tended to have the best performance for estimating absolute effects of treatment on survival outcomes. When the prevalence of treatment was less extreme, inverse probability of treatment weighting-based methods tended to perform better than matching-based methods.
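
A minimal sketch of the inverse probability of treatment weighting route to an absolute effect (hypothetical variable names; lifelines' Kaplan-Meier fitter accepts case weights):

import numpy as np
from sklearn.linear_model import LogisticRegression
from lifelines import KaplanMeierFitter

def iptw_risk_difference(X, treated, time, event, horizon):
    ps = LogisticRegression(max_iter=1000).fit(X, treated).predict_proba(X)[:, 1]
    w = np.where(treated == 1, 1 / ps, 1 / (1 - ps))        # ATE weights
    surv = {}
    for arm in (0, 1):
        mask = treated == arm
        km = KaplanMeierFitter()
        km.fit(time[mask], event_observed=event[mask], weights=w[mask])
        surv[arm] = float(km.survival_function_at_times(horizon).iloc[0])
    # absolute reduction in the probability of an event by `horizon` due to treatment
    return (1 - surv[0]) - (1 - surv[1])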

144 citations


Journal ArticleDOI
TL;DR: A broad range of statistical issues related to multi-arm multi-stage trials are explored including a comparison of different ways to power a multi-arm multi-stage trial; choosing the allocation ratio to the control group compared to other experimental arms; the consequences of adding additional experimental arms during a multi-arm multi-stage trial, and how one might control the type-I error rate when this is necessary.
Abstract: Multi-arm multi-stage designs can improve the efficiency of the drug-development process by evaluating multiple experimental arms against a common control within one trial. This reduces the number of patients required compared to a series of trials testing each experimental arm separately against control. By allowing for multiple stages, experimental treatments can be eliminated early from the study if they are unlikely to be significantly better than control. Using the TAILoR trial as a motivating example, we explore a broad range of statistical issues related to multi-arm multi-stage trials including a comparison of different ways to power a multi-arm multi-stage trial; choosing the allocation ratio to the control group compared to other experimental arms; the consequences of adding additional experimental arms during a multi-arm multi-stage trial, and how one might control the type-I error rate when this is necessary; and modifying the stopping boundaries of a multi-arm multi-stage design to account for unknown variance in the treatment outcome. Multi-arm multi-stage trials represent a large financial investment, and so considering their design carefully is important to ensure efficiency and that they have a good chance of succeeding.
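
A toy Python simulation of the basic multi-arm multi-stage mechanics (two stages, three experimental arms; the bounds used here are arbitrary placeholders, not boundaries derived to control the family-wise error rate as discussed in the paper):

import numpy as np

rng = np.random.default_rng(1)

def one_mams_trial(effects=(0.0, 0.0, 0.3), n_stage=50, futility=0.0, efficacy=2.24):
    # Stage 1: recruit n_stage per arm (control plus each experimental arm)
    control = rng.normal(0, 1, n_stage)
    arms = {k: rng.normal(d, 1, n_stage) for k, d in enumerate(effects)}
    z1 = {k: (x.mean() - control.mean()) / np.sqrt(2 / n_stage) for k, x in arms.items()}
    kept = [k for k, z in z1.items() if z > futility]        # drop futile arms
    if not kept:
        return []
    # Stage 2: continue control and surviving arms, then test at the final bound
    control = np.concatenate([control, rng.normal(0, 1, n_stage)])
    winners = []
    for k in kept:
        x = np.concatenate([arms[k], rng.normal(effects[k], 1, n_stage)])
        z2 = (x.mean() - control.mean()) / np.sqrt(2 / (2 * n_stage))
        if z2 > efficacy:
            winners.append(k)
    return winners

print(one_mams_trial())   # indices of experimental arms declared effective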

82 citations


Journal ArticleDOI
TL;DR: This paper considers how random-effects logistic regression models may be used in a number of different types of meta-analyses, including Bayesian network meta-analysis and the bivariate and hierarchical summary receiver operating characteristic (ROC) models for meta-Analyses of diagnostic test accuracy.
Abstract: Where individual participant data are available for every randomised trial in a meta-analysis of dichotomous event outcomes, "one-stage" random-effects logistic regression models have been proposed as a way to analyse these data. Such models can also be used even when individual participant data are not available and we have only summary contingency table data. One benefit of this one-stage regression model over conventional meta-analysis methods is that it maximises the correct binomial likelihood for the data and so does not require the common assumption that effect estimates are normally distributed. A second benefit of using this model is that it may be applied, with only minor modification, in a range of meta-analytic scenarios, including meta-regression, network meta-analyses and meta-analyses of diagnostic test accuracy. This single model can potentially replace the variety of often complex methods used in these areas. This paper considers, with a range of meta-analysis examples, how random-effects logistic regression models may be used in a number of different types of meta-analyses. This one-stage approach is compared with widely used meta-analysis methods including Bayesian network meta-analysis and the bivariate and hierarchical summary receiver operating characteristic (ROC) models for meta-analyses of diagnostic test accuracy.
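
As a sketch of the model meant (LaTeX; notation assumed: trial i, arm indicator x_ij equal to 1 for the experimental arm, r_ij events out of n_ij patients):

\[
r_{ij} \sim \mathrm{Binomial}(n_{ij}, p_{ij}), \qquad
\operatorname{logit}(p_{ij}) = \mu_i + (\theta + b_i)\, x_{ij}, \qquad
b_i \sim N(0, \tau^2),
\]

where \mu_i is the trial-specific baseline log odds, \theta the summary log odds ratio and \tau^2 the between-trial heterogeneity; the exact binomial likelihood is maximised, so no normality assumption is needed for the trial-level effect estimates.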

82 citations


Journal ArticleDOI
TL;DR: The results provide strong empirical evidence that gene–gene correlations cannot be ignored, owing to the significant variance inflation they produce in the enrichment scores, and should be taken into account when estimating gene set enrichment significance.
Abstract: Since its first publication in 2003, the Gene Set Enrichment Analysis method, based on the Kolmogorov-Smirnov statistic, has been heavily used, modified, and also questioned. Recently, a simplified approach using a one-sample t-test score to assess enrichment and ignoring gene-gene correlations was proposed by Irizarry et al. 2009 as a serious contender. The argument criticizes Gene Set Enrichment Analysis’s nonparametric nature and its use of an empirical null distribution as unnecessary and hard to compute. We refute these claims by careful consideration of the assumptions of the simplified method and its results, including a comparison with Gene Set Enrichment Analysis on a large benchmark set of 50 datasets. Our results provide strong empirical evidence that gene–gene correlations cannot be ignored due to the significant variance inflation they produce in the enrichment scores and should be taken into account when estimating gene set enrichment significance. In addition, we discuss the challenges th...
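
The variance-inflation point can be written down directly (LaTeX sketch of a standard result, not a formula quoted from the paper): for a gene set of size n whose gene-level statistics t_i have variance \sigma^2 and average pairwise correlation \bar{\rho},

\[
\operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} t_i\right)
= \frac{\sigma^2}{n}\bigl(1 + (n-1)\bar{\rho}\bigr),
\]

so treating the genes as independent understates the variance of the set-level score and overstates enrichment significance whenever \bar{\rho} > 0.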

81 citations


Journal ArticleDOI
TL;DR: This study outlines the key steps, from the literature search to sensitivity analysis, necessary to perform a valid NMA of binomial data, exploiting Markov Chain Monte Carlo approaches, and specifies the requirements for different models and parameter interpretations.
Abstract: This study presents an overview of conceptual and practical issues of a network meta-analysis (NMA), particularly focusing on its application to randomised controlled trials with a binary outcome of interest. We start from general considerations on NMA to specifically appraise how to collect study data, structure the analytical network and specify the requirements for different models and parameter interpretations, with the ultimate goal of providing physicians and clinician-investigators a practical tool to understand pros and cons of NMA. Specifically, we outline the key steps, from the literature search to sensitivity analysis, necessary to perform a valid NMA of binomial data, exploiting Markov Chain Monte Carlo approaches. We also apply this analytical approach to a case study on the beneficial effects of volatile agents compared to total intravenous anaesthetics for surgery to further clarify the statistical details of the models, diagnostics and computations. Finally, datasets and models for the freeware WinBUGS package are presented for the anaesthetic agent example.
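
As a sketch of the binomial NMA model typically fitted by MCMC in this setting (LaTeX; standard notation assumed, with b the baseline treatment of trial i and arm k receiving treatment t_k):

\[
r_{ik} \sim \mathrm{Binomial}(n_{ik}, p_{ik}), \qquad
\operatorname{logit}(p_{ik}) = \mu_i + \delta_{i,bk}, \qquad
\delta_{i,bk} \sim N\!\bigl(d_{t_k} - d_{t_b},\ \sigma^2\bigr),
\]

with \delta_{i,bb} = 0, d_{\mathrm{ref}} = 0 for the reference treatment, and a common heterogeneity variance \sigma^2; the consistency assumption d_{AB} = d_{B} - d_{A} is what allows direct and indirect evidence to be combined.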

80 citations


Journal ArticleDOI
TL;DR: This work provides an analytic approach to assess the noncollapsibility effect in a point-exposure study, gives a general formula for expressing the noncollapsibility effect, and demonstrates that collapsibility can have an important impact on estimation in practice.
Abstract: One approach to quantifying the magnitude of confounding in observational studies is to compare estimates with and without adjustment for a covariate, but this strategy is known to be defective for noncollapsible measures such as the odds ratio. Comparing estimates from marginal structural and standard logistic regression models, the total difference between crude and conditional effects can be decomposed into the sum of a noncollapsibility effect and confounding bias. We provide an analytic approach to assess the noncollapsibility effect in a point-exposure study and provide a general formula for expressing the noncollapsibility effect. Next, we provide a graphical approach that illustrates the relationship between the noncollapsibility effect and the baseline risk, and reveals the behavior of the noncollapsibility effect for a range of different exposure and covariate effects. Various observations about noncollapsibility can be made from the different scenarios with or without confounding; for example, the magnitude of the effect of the covariate plays a more important role in the noncollapsibility effect than does the effect of the exposure. In order to explore the noncollapsibility effect of the odds ratio in the presence of time-varying confounding, we simulated an observational cohort study. The magnitude of noncollapsibility was generally comparable to the effect in the point-exposure study in our simulation settings. Finally, in an applied example we demonstrate that collapsibility can have an important impact on estimation in practice.
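
A small numeric illustration in Python of odds ratio noncollapsibility without confounding (numbers arbitrary): a balanced covariate Z is independent of exposure, the conditional odds ratio is 2 in both strata, yet the marginal odds ratio is smaller.

def odds(p):
    return p / (1 - p)

def invodds(o):
    return o / (1 + o)

p0 = {0: 0.20, 1: 0.70}                                   # baseline risk by Z
p1 = {z: invodds(2 * odds(p)) for z, p in p0.items()}     # conditional OR = 2 in each stratum

marg0 = 0.5 * p0[0] + 0.5 * p0[1]                         # marginal risks (Z is 50/50)
marg1 = 0.5 * p1[0] + 0.5 * p1[1]
print(odds(p1[0]) / odds(p0[0]), odds(p1[1]) / odds(p0[1]))  # 2.0 and 2.0
print(odds(marg1) / odds(marg0))                             # about 1.68 < 2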

Journal ArticleDOI
TL;DR: A modification of Fleiss’ kappa, not affected by paradoxes, is proposed and subsequently generalized to the case of ordinal variables, extending the use of s* to a bivariate case.
Abstract: Assessing the inter-rater agreement between observers, in the case of ordinal variables, is an important issue in both statistical theory and biomedical applications. Typically, this problem has been dealt with through the use of Cohen’s weighted kappa, which is a modification of the original kappa statistic, proposed for nominal variables in the case of two observers. Fleiss (1971) put forth a generalization of kappa in the case of multiple observers, but both Cohen’s and Fleiss’ kappa could have a paradoxical behavior, which may lead to a difficult interpretation of their magnitude. In this paper, a modification of Fleiss’ kappa, not affected by paradoxes, is proposed, and subsequently generalized to the case of ordinal variables. Monte Carlo simulations are used both to test statistical hypotheses and to calculate percentile and bootstrap-t confidence intervals based on this statistic. The normal asymptotic distribution of the proposed statistic is demonstrated. Our results are applied to the classica...

Journal ArticleDOI
TL;DR: It is concluded that pilot studies will usually be too small to estimate parameters required for estimating a sample size for a main cluster randomised trial with sufficient precision and too small to provide reliable estimates of rates for process measures such as recruitment or follow-up rates.
Abstract: There is currently a lot of interest in pilot studies conducted in preparation for randomised controlled trials. This paper focuses on sample size requirements for external pilot studies for cluster randomised trials. We consider how large an external pilot study needs to be to assess key parameters for input to the main trial sample size calculation when the primary outcome is continuous, and to estimate rates, for example recruitment rates, with reasonable precision. We used simulation to provide the distribution of the expected number of clusters for the main trial under different assumptions about the natural cluster size, intra-cluster correlation, eventual cluster size in the main trial, and various decisions made at the piloting stage. We chose intra-cluster correlation values and pilot study size to reflect those commonly reported in the literature. Our results show that estimates of sample size required for the main trial are likely to be biased downwards and very imprecise unless the pilot study includes large numbers of clusters and individual participants. We conclude that pilot studies will usually be too small to estimate parameters required for estimating a sample size for a main cluster randomised trial (e.g. the intra-cluster correlation coefficient) with sufficient precision and too small to provide reliable estimates of rates for process measures such as recruitment or follow-up rates.
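
For orientation, a back-of-envelope Python sketch of the main-trial calculation that such a pilot feeds: an individually randomised sample size inflated by the design effect 1 + (m - 1) x ICC and converted to clusters per arm; the ICC input is exactly the quantity the paper shows is estimated imprecisely by small pilots.

import math
from scipy.stats import norm

def clusters_per_arm(d, icc, m, alpha=0.05, power=0.90):
    # d: standardised effect size; icc: intra-cluster correlation; m: cluster size
    n_individual = 2 * (norm.ppf(1 - alpha / 2) + norm.ppf(power)) ** 2 / d ** 2
    design_effect = 1 + (m - 1) * icc
    return math.ceil(n_individual * design_effect / m)

print(clusters_per_arm(d=0.3, icc=0.05, m=20))   # clusters per arm for these inputs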

Journal ArticleDOI
TL;DR: This paper considers the competing cause scenario and, assuming the time-to-event to follow the Weibull distribution, derives the necessary steps of the expectation maximization algorithm for estimating the parameters of different cure rate survival models.
Abstract: Recently, a flexible cure rate survival model has been developed by assuming the number of competing causes of the event of interest to follow the Conway-Maxwell-Poisson distribution. This model includes some of the well-known cure rate models discussed in the literature as special cases. Data obtained from cancer clinical trials are often right censored and the expectation maximization algorithm can be used in this case to efficiently estimate the model parameters based on right censored data. In this paper, we consider the competing cause scenario and, assuming the time-to-event to follow the Weibull distribution, we derive the necessary steps of the expectation maximization algorithm for estimating the parameters of different cure rate survival models. The standard errors of the maximum likelihood estimates are obtained by inverting the observed information matrix. The method of inference developed here is examined by means of an extensive Monte Carlo simulation study. Finally, we illustrate the proposed methodology with real data on cancer recurrence.
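
As a sketch of the simplest member of this family (LaTeX; the Poisson special case of the competing-causes formulation with a Weibull time-to-event is shown here, rather than the full Conway-Maxwell-Poisson model):

\[
S_{\mathrm{pop}}(t) = \exp\{-\theta F(t)\}, \qquad F(t) = 1 - e^{-\lambda t^{\gamma}},
\]

where \theta is the mean number of competing causes, so the cure fraction is p_0 = \lim_{t\to\infty} S_{\mathrm{pop}}(t) = e^{-\theta}; the EM algorithm treats the unobserved number of causes for each censored subject as missing data.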

Journal ArticleDOI
TL;DR: A time-continuous model is introduced and discrete observations are simulated in order to judge the relationship between the DAG and the immediate causal model and it is found that there is no clear relationship; indeed the Bayesian network described by the DAG may not relate to the causal model.
Abstract: Directed acyclic graphs (DAGs) play a large role in the modern approach to causal inference. DAGs describe the relationship between measurements taken at various discrete times including the effect of interventions. The causal mechanisms, on the other hand, would naturally be assumed to be a continuous process operating over time in a cause-effect fashion. How does such immediate causation, that is causation occurring over very short time intervals, relate to DAGs constructed from discrete observations? We introduce a time-continuous model and simulate discrete observations in order to judge the relationship between the DAG and the immediate causal model. We find that there is no clear relationship; indeed the Bayesian network described by the DAG may not relate to the causal model. Typically, discrete observations of a process will obscure the conditional dependencies that are represented in the underlying mechanistic model of the process. It is therefore doubtful whether DAGs are always suited to describe causal relationships unless time is explicitly considered in the model. We relate the issues to mechanistic modeling by using the concept of local (in)dependence. An example using data from the Swiss HIV Cohort Study is presented.

Journal ArticleDOI
TL;DR: This article presents an overview and tutorial of statistical methods for meta-analysis of diagnostic tests under two scenarios: (1) when the reference test can be considered a gold standard and (2) whenThe reference test cannot be consideredA gold standard.
Abstract: In this article, we present an overview and tutorial of statistical methods for meta-analysis of diagnostic tests under two scenarios: (1) when the reference test can be considered a gold standard and (2) when the reference test cannot be considered a gold standard. In the first scenario, we first review the conventional summary receiver operating characteristics approach and a bivariate approach using linear mixed models. Both approaches require direct calculations of study-specific sensitivities and specificities. We next discuss the hierarchical summary receiver operating characteristics curve approach for jointly modeling positivity criteria and accuracy parameters, and the bivariate generalized linear mixed models for jointly modeling sensitivities and specificities. We further discuss the trivariate generalized linear mixed models for jointly modeling prevalence, sensitivities and specificities, which allows us to assess the correlations among the three parameters. These approaches are based on the exact binomial distribution and thus do not require an ad hoc continuity correction. Lastly, we discuss a latent class random effects model for meta-analysis of diagnostic tests when the reference test itself is imperfect for the second scenario. A number of case studies with detailed annotated SAS code in MIXED and NLMIXED procedures are presented to facilitate the implementation of these approaches.
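
As a sketch of the bivariate generalized linear mixed model mentioned above (LaTeX; standard notation assumed), the study-specific sensitivities and specificities are modelled jointly on the logit scale:

\[
\operatorname{logit}(\mathrm{Se}_i) = \mu_{Se} + u_i, \qquad
\operatorname{logit}(\mathrm{Sp}_i) = \mu_{Sp} + v_i, \qquad
(u_i, v_i)^{\top} \sim N\!\left(\mathbf{0},
\begin{pmatrix} \sigma_u^2 & \rho\,\sigma_u\sigma_v \\ \rho\,\sigma_u\sigma_v & \sigma_v^2 \end{pmatrix}\right),
\]

with the true positives and true negatives of study i following exact binomial distributions given Se_i and Sp_i, so no continuity correction is needed, and \rho capturing the across-study correlation between sensitivity and specificity.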

Journal ArticleDOI
TL;DR: A weighted AUC_{C,D}(t) with time- and data-dependent weights is proposed as a summary measure of the mean AUC_{C,D}(t), restricted to a finite time range to ensure its clinical relevance.
Abstract: Assessments of the discriminative performance of prognostic models have led to the development of several measures that extend the concept of discrimination, as evaluated by the receiver operating characteristic curve and the area under the receiver operating characteristic curve (AUC), from diagnostic settings. Thus, several time-dependent receiver operating characteristic curves and AUC(t) measures have been proposed. One of the most used, the cumulative/dynamic AUC_{C,D}(t), is the probability that, given two randomly chosen patients, one having failed before t and the other having failed after t, the prognostic marker will be correctly ranked. In this paper, we propose a weighted AUC_{C,D}(t) with time- and data-dependent weights as a summary measure of the mean AUC_{C,D}(t), restricted to a finite time range to ensure its clinical relevance. A simulation study shows that the estimated restricted mean AUC increased with the strength of association of the covariate with the outcome, with low impact of censoring, and adequate coverage of bootstrap confidence intervals. We illustrate this methodology on two real datasets from two randomized clinical trials to assess the prognostic factors of overall mortality in patients who have compensated cirrhosis and to assess the prognostic factors of event-free survival in patients who have acute myeloid leukemia.
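
In symbols (LaTeX sketch; the paper's particular weight function is not reproduced here), with marker M and event time T,

\[
\mathrm{AUC}_{C,D}(t) = P\bigl(M_i > M_j \mid T_i \le t < T_j\bigr), \qquad
\overline{\mathrm{AUC}}_{C,D} = \int_{\tau_1}^{\tau_2} w(t)\,\mathrm{AUC}_{C,D}(t)\,dt,
\qquad \int_{\tau_1}^{\tau_2} w(t)\,dt = 1,
\]

where [\tau_1, \tau_2] is the clinically relevant, finite time range and w(t) is the time- and data-dependent weight.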

Journal ArticleDOI
TL;DR: A comparison of the classification performance of a number of important and widely used machine learning algorithms, namely Random Forests, Support Vector Machines, Linear Discriminant Analysis (LDA) and k-Nearest Neighbour; LDA was found to be the method of choice in terms of average generalisation error as well as stability (precision) of the error estimates.
Abstract: Background: Recent literature on the comparison of machine learning methods has raised questions about the neutrality, unbiasedness and utility of many comparative studies. Reporting of results on fa...
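
A generic Python sketch of the kind of comparison described, using cross-validated error on synthetic data (this is not the paper's benchmark and implies nothing about its conclusions):

from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=300, n_features=20, n_informative=5, random_state=0)
models = {"RF": RandomForestClassifier(random_state=0), "SVM": SVC(),
          "LDA": LinearDiscriminantAnalysis(), "kNN": KNeighborsClassifier()}
for name, clf in models.items():
    acc = cross_val_score(clf, X, y, cv=10)       # 10-fold cross-validated accuracy
    print(f"{name}: error = {1 - acc.mean():.3f} (sd {acc.std():.3f})")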

Journal ArticleDOI
TL;DR: A test that is unbiased for non-normal data, for small sample sizes as well as for two-sided alternatives and that can be computed for high-dimensional data has been recently proposed and is based on the ranks of the interpoint Euclidean distances between observations.
Abstract: The multivariate location problem is addressed. The most familiar method to address the problem is the Hotelling test. When the hypothesis of normal distributions holds, the Hotelling test is optimal. Unfortunately, in practice the distributions underlying the samples are generally unknown and, without assuming normality, the finite sample unbiasedness of the Hotelling test is not guaranteed. Moreover, high-dimensional data are increasingly encountered when analyzing medical and biological problems, and in these situations the Hotelling test performs poorly or cannot be computed. A test that is unbiased for non-normal data, for small sample sizes as well as for two-sided alternatives and that can be computed for high-dimensional data has been recently proposed and is based on the ranks of the interpoint Euclidean distances between observations. Five modifications of this test are proposed and compared to the original test and the Hotelling test. Unbiasedness and consistency of the tests are proven and the problem of power computation is addressed. It is shown that two of the modified interpoint distance-based tests are always more powerful than the original test. Particularly, the modified test based on the Tippett criterion is suggested when the assumption of normality is not tenable and/or in the case of high-dimensional data with complex dependence structures, which are typical in molecular biology and medical imaging. A practical application to a case-control study where functional magnetic resonance imaging is used is discussed.
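
A hedged Python sketch in the same spirit, using a permutation test built on interpoint Euclidean distances (an energy-type statistic rather than the rank-based statistics studied in the paper):

import numpy as np
from scipy.spatial.distance import cdist

def energy_stat(x, y):
    # between-sample minus within-sample mean interpoint distances
    return 2 * cdist(x, y).mean() - cdist(x, x).mean() - cdist(y, y).mean()

def permutation_test(x, y, n_perm=999, seed=0):
    rng = np.random.default_rng(seed)
    pooled, n = np.vstack([x, y]), len(x)
    obs = energy_stat(x, y)
    perm = []
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))
        perm.append(energy_stat(pooled[idx[:n]], pooled[idx[n:]]))
    return (1 + sum(p >= obs for p in perm)) / (n_perm + 1)   # permutation p-value

rng = np.random.default_rng(1)
x = rng.normal(0.0, 1, size=(20, 50))        # small samples, high dimension
y = rng.normal(0.3, 1, size=(20, 50))
print(permutation_test(x, y))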

Journal ArticleDOI
TL;DR: It is shown that no one method can handle all plausible visit scenarios and it is suggested that careful analysis of the visit process should inform the choice of analytic method for the outcomes.
Abstract: When data are collected longitudinally, measurement times often vary among patients. This is of particular concern in clinic-based studies, for example retrospective chart reviews. Here, typically no two patients will share the same set of measurement times and moreover, it is likely that the timing of the measurements is associated with disease course; for example, patients may visit more often when unwell. While there are statistical methods that can help overcome the resulting bias, these make assumptions about the nature of the dependence between visit times and outcome processes, and the assumptions differ across methods. The purpose of this paper is to review the methods available with a particular focus on how the assumptions made line up with visit processes encountered in practice. Through this we show that no one method can handle all plausible visit scenarios and suggest that careful analysis of the visit process should inform the choice of analytic method for the outcomes. Moreover, there are some commonly encountered visit scenarios that are not handled well by any method, and we make recommendations with regard to study design that would minimize the chances of these problematic visit scenarios arising.

Journal ArticleDOI
TL;DR: It is concluded that evidence of efficacy based on a series of (smaller) trials may lower the error rates compared with using a single well-powered trial.
Abstract: There is debate about whether clinical trials with suboptimal power are justified and whether results from large studies are more reliable than the (combined) results of smaller trials. We quantified the error rates for evaluations based on single conventionally powered trials (80% or 90% power) versus evaluations based on the random-effects meta-analysis of a series of smaller trials. When a treatment was assumed to have no effect but heterogeneity was present, the error rates for a single trial were increased more than 10-fold above the nominal rate, even for low heterogeneity. Conversely, for meta-analyses on a series of trials, the error rates were correct. When selective publication was present, the error rates were always increased, but they still tended to be lower for a series of trials than for single trials. We conclude that evidence of efficacy based on a series of (smaller) trials may lower the error rates compared with using a single well-powered trial. Only when both heterogeneity and selective publi...

Journal ArticleDOI
Yize Zhao, Qi Long
TL;DR: Numerical studies show that in the presence of high-dimensional data the standard multiple imputation approach performs poorly and the imputation approach using Bayesian lasso regression achieves, in most cases, better performance than the other imputation methods including the standard imputation approach using the correctly specified imputation model.
Abstract: Missing data are frequently encountered in biomedical, epidemiologic and social research. It is well known that a naive analysis without adequate handling of missing data may lead to bias and/or loss of efficiency. Partly due to its ease of use, multiple imputation has become increasingly popular in practice for handling missing data. However, it is unclear what is the best strategy to conduct multiple imputation in the presence of high-dimensional data. To answer this question, we investigate several approaches of using regularized regression and Bayesian lasso regression to impute missing values in the presence of high-dimensional data. We compare the performance of these methods through numerical studies, in which we also evaluate the impact of the dimension of the data, the size of the true active set for imputation, and the strength of correlation. Our numerical studies show that in the presence of high-dimensional data the standard multiple imputation approach performs poorly and the imputation approach using Bayesian lasso regression achieves, in most cases, better performance than the other imputation methods including the standard imputation approach using the correctly specified imputation model. Our results suggest that Bayesian lasso regression and its extensions are better suited for multiple imputation in the presence of high-dimensional data than the other regression methods.
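
A rough Python sketch in the same spirit using scikit-learn: regularised-regression imputation run m times with posterior sampling to obtain multiple completed data sets (BayesianRidge is a ridge-type stand-in here; the paper's Bayesian lasso imputation is not a scikit-learn routine).

import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.linear_model import BayesianRidge

def multiply_impute(X, m=5):
    completed = []
    for k in range(m):
        # sample_posterior=True makes each run a stochastic draw, giving m imputations
        imp = IterativeImputer(estimator=BayesianRidge(), sample_posterior=True,
                               random_state=k)
        completed.append(imp.fit_transform(X))
    return completed  # list of m completed data matrices

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 50))
X[rng.random(X.shape) < 0.1] = np.nan     # 10% of values set missing at random
datasets = multiply_impute(X)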

Journal ArticleDOI
TL;DR: It is concluded that positivity bias was of limited magnitude and did not explain the large differences in the point estimates, and propensity score-matching for ATT and inverse probability of treatment weighting for average treatment effect yield substantially different estimates of treatment effect.
Abstract: Objective: Propensity score matching is typically used to estimate the average treatment effect for the treated while inverse probability of treatment weighting aims at estimating the population average treatment effect. We illustrate how different estimands can result in very different conclusions. Study design: We applied the two propensity score methods to assess the effect of continuous positive airway pressure on mortality in patients hospitalized for acute heart failure. We used Monte Carlo simulations to investigate the important differences in the two estimates. Results: Continuous positive airway pressure application increased hospital mortality overall, but no continuous positive airway pressure effect was found on the treated. Potential reasons were (1) violation of the positivity assumption; (2) treatment effect was not uniform across the distribution of the propensity score. From simulations, we concluded that positivity bias was of limited magnitude and did not explain the large differences in the p...
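
The two estimands correspond to different weights; a minimal Python sketch with hypothetical arrays treated (0/1) and ps (estimated propensity scores):

import numpy as np

def ate_weights(treated, ps):
    # inverse probability of treatment weighting: targets the whole population (ATE)
    return np.where(treated == 1, 1.0 / ps, 1.0 / (1.0 - ps))

def att_weights(treated, ps):
    # weights controls by the odds of treatment: targets the treated population (ATT)
    return np.where(treated == 1, 1.0, ps / (1.0 - ps))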

Journal ArticleDOI
TL;DR: It is concluded that the coverage probability for assessing agreement is the preferred agreement index on the basis of computational simplicity, its ability for rapid identification of discordant measurements to provide guidance for review and retraining, and its consistent evaluation of data quality across multiple reviewers, populations, and continuous/categorical data.
Abstract: Clinical core laboratories, such as Echocardiography core laboratories, are increasingly used in clinical studies with imaging outcomes as primary, secondary, or surrogate endpoints. While many factors contribute to the quality of measurements of imaging variables, an essential step in ensuring the value of imaging data includes formal assessment and control of reproducibility via intra-observer and inter-observer reliability. There are many different agreement/reliability indices in the literature. However, different indices may lead to different conclusions and it is not clear which index is the preferred choice as an overall indication of data quality and a tool for providing guidance on improving quality and reliability in a core lab setting. In this paper, we pre-specify the desirable characteristics of an agreement index for assessing and improving reproducibility in a core lab setting; we compare existing agreement indices in terms of these characteristics to choose a preferred index. We conclude that, among the existing indices reviewed, the coverage probability for assessing agreement is the preferred agreement index on the basis of computational simplicity, its ability for rapid identification of discordant measurements to provide guidance for review and retraining, and its consistent evaluation of data quality across multiple reviewers, populations, and continuous/categorical data.
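
A minimal Python sketch of the coverage probability index for paired continuous readings, assuming a pre-specified clinically acceptable difference d (variable names hypothetical):

import numpy as np

def coverage_probability(y1, y2, d):
    # proportion of replicate pairs that agree to within the acceptable difference d
    y1, y2 = np.asarray(y1, float), np.asarray(y2, float)
    return np.mean(np.abs(y1 - y2) <= d)

# e.g. coverage_probability(reader1_ef, reader2_ef, d=5.0) for ejection fraction readings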

Journal ArticleDOI
TL;DR: In this paper, a copula-based dependence model was used to investigate the bias caused by dependent censoring on gene selection, and an alternative gene selection procedure was then developed and illustrated on non-small-cell lung cancer data.
Abstract: Dependent censoring arises in biomedical studies when the survival outcome of interest is censored by competing risks. In survival data with microarray gene expressions, gene selection based on the univariate Cox regression analyses has been used extensively in medical research, which, however, is only valid under the independent censoring assumption. In this paper, we first consider a copula-based framework to investigate the bias caused by dependent censoring on gene selection. Then, we utilize the copula-based dependence model to develop an alternative gene selection procedure. Simulations show that the proposed procedure adjusts for the effect of dependent censoring and thus outperforms the existing method when dependent censoring is indeed present. The non-small-cell lung cancer data are analyzed to demonstrate the usefulness of our proposal. We implemented the proposed method in the R package "compound.Cox".

Journal ArticleDOI
TL;DR: A joint model that consists of a multilevel item response theory (MLIRT) model for the multiple longitudinal outcomes, and a Cox’s proportional hazard model with piecewise constant baseline hazards for the event time data is developed.
Abstract: In many clinical trials, studying neurodegenerative diseases including Parkinson’s disease (PD), multiple longitudinal outcomes are collected in order to fully explore the multidimensional impairment caused by these diseases. The follow-up of some patients can be stopped by some outcome-dependent terminal event, e.g. death and dropout. In this article, we develop a joint model that consists of a multilevel item response theory (MLIRT) model for the multiple longitudinal outcomes, and a Cox’s proportional hazard model with piecewise constant baseline hazards for the event time data. Shared random effects are used to link together two models. The model inference is conducted using a Bayesian framework via Markov Chain Monte Carlo simulation implemented in BUGS language. Our proposed model is evaluated by simulation studies and is applied to the DATATOP study, a motivating clinical trial assessing the effect of tocopherol on PD among patients with early PD.

Journal ArticleDOI
TL;DR: Three extensions of the two-part random effects models, allowing the positive values to follow a generalized gamma distribution, a log-skew-normal distribution, and a normal distribution after the Box-Cox transformation are considered, finding that all three models provide a significantly better fit than the log-normal model, and there exists strong evidence for heteroscedasticity.
Abstract: Two-part random effects models (Olsen and Schafer,(1) Tooze et al.(2)) have been applied to repeated measures of semi-continuous data, characterized by a mixture of a substantial proportion of zero values and a skewed distribution of positive values. In the original formulation of this model, the natural logarithm of the positive values is assumed to follow a normal distribution with a constant variance parameter. In this article, we review and consider three extensions of this model, allowing the positive values to follow (a) a generalized gamma distribution, (b) a log-skew-normal distribution, and (c) a normal distribution after the Box-Cox transformation. We allow for the possibility of heteroscedasticity. Maximum likelihood estimation is shown to be conveniently implemented in SAS Proc NLMIXED. The performance of the methods is compared through applications to daily drinking records in a secondary data analysis from a randomized controlled trial of topiramate for alcohol dependence treatment. We find that all three models provide a significantly better fit than the log-normal model, and there exists strong evidence for heteroscedasticity. We also compare the three models by the likelihood ratio tests for non-nested hypotheses (Vuong(3)). The results suggest that the generalized gamma distribution provides the best fit, though no statistically significant differences are found in pairwise model comparisons.
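
As a sketch of the original two-part random effects model that these extensions start from (LaTeX; standard notation assumed, with subject i and occasion j):

\[
\operatorname{logit} P(Y_{ij} > 0 \mid u_i) = x_{ij}^{\top}\beta + u_i, \qquad
\log Y_{ij} \mid (Y_{ij} > 0, v_i) = z_{ij}^{\top}\gamma + v_i + \varepsilon_{ij}, \qquad
\varepsilon_{ij} \sim N(0, \sigma^2),
\]

with (u_i, v_i) bivariate normal random effects; the extensions reviewed above replace the log-normal assumption for the positive part with a generalized gamma, log-skew-normal or Box-Cox-transformed normal distribution and allow \sigma^2 to depend on covariates.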

Journal ArticleDOI
TL;DR: The use of a compositional data perspective for statistical analyses in the field of nutritional epidemiology is described, based on isometric log-ratio transformation, which allows full inferences about each element of dietary composition and adjustment by total energy intake.
Abstract: The purpose of epidemiological studies of nutrition and disease is to investigate the effects of specific dietary components regardless of total energy intake, but this is sometimes hampered by the compositional nature of dietary data. Compositional data are those that measure parts of a whole, such as percentages or proportions, and particular methodologies have been developed to allow their statistical analysis and theoretical and practical applications in various sciences. This paper describes the use of a compositional data perspective for statistical analyses in the field of nutritional epidemiology. The approach is based on isometric log-ratio transformation and has been previously proposed for the construction of regression models using compositional explanatory variables. The new isometric log-ratio variables allow full inferences about each element of dietary composition and adjustment by total energy intake. Using data from an Italian population-based study, logistic regression models were fitted to evaluate the effects of the intake of macronutrients (proteins, fats and carbohydrates) on the odds of having metabolic syndrome in middle-aged subjects.
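
A small Python sketch of one standard set of isometric log-ratio coordinates for a three-part macronutrient composition (the paper's specific balance choice may differ); the resulting coordinates, together with total energy intake, could then enter a logistic regression.

import numpy as np

def ilr(x):
    # isometric log-ratio coordinates of a D-part composition (one standard basis)
    x = np.asarray(x, float)
    x = x / x.sum(axis=-1, keepdims=True)               # close to proportions
    D = x.shape[-1]
    z = []
    for i in range(1, D):
        gm = np.exp(np.log(x[..., :i]).mean(axis=-1))   # geometric mean of the first i parts
        z.append(np.sqrt(i / (i + 1)) * np.log(gm / x[..., i]))
    return np.stack(z, axis=-1)

print(ilr([0.17, 0.35, 0.48]))   # two ilr coordinates for protein/fat/carbohydrate shares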

Journal ArticleDOI
TL;DR: An extensive review of the existing methods for biomarker combination is provided, a new combination method (the nonparametric stepwise approach) is proposed, and leave-one-pair-out cross-validation is used to empirically evaluate and compare the performance of different linear combination methods in yielding the largest area under the receiver operating characteristic curve.
Abstract: Multiple diagnostic tests or biomarkers can be combined to improve diagnostic accuracy. The problem of finding the optimal linear combinations of biomarkers to maximise the area under the receiver operating characteristic curve has been extensively addressed in the literature. The purpose of this article is threefold: (1) to provide an extensive review of the existing methods for biomarker combination; (2) to propose a new combination method, namely, the nonparametric stepwise approach; (3) to use the leave-one-pair-out cross-validation method, instead of the re-substitution method, which is overoptimistic and hence might lead to wrong conclusions, to empirically evaluate and compare the performance of different linear combination methods in yielding the largest area under the receiver operating characteristic curve. A data set of Duchenne muscular dystrophy was analysed to illustrate the applications of the discussed combination methods.
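
A minimal Python sketch of one linear combination rule (logistic regression coefficients) and its re-substitution AUC on synthetic data; as noted above, re-substitution is over-optimistic, which is why the paper evaluates combinations with leave-one-pair-out cross-validation instead.

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# four synthetic "biomarkers" and a binary disease status
X, y = make_classification(n_samples=200, n_features=4, n_informative=4,
                           n_redundant=0, random_state=0)
score = LogisticRegression(max_iter=1000).fit(X, y).decision_function(X)
print("re-substitution AUC:", round(roc_auc_score(y, score), 3))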

Journal ArticleDOI
TL;DR: A large number of quantities such as transition probabilities, cumulative probabilities and life expectancies are reviewed in an illness-death model, perhaps the most common multi-state model in the medical literature, and a way to estimate them in addition to the transition intensities and the regression parameters is proposed.
Abstract: Multi-state models allow subjects to move among a finite number of states during a follow-up period. Most often, the objects of study are the transition intensities. The impact of covariates on them can also be studied by specifying regression models. Thus, estimation in multi-state models is usually focused on the transition intensities (or the cumulative transition intensities) and on the regression parameters. However, from a clinical or epidemiological point of view, other quantities could provide additional information and may be more relevant to answer practical questions. For example, given a set of covariates for a subject, it may be of interest to estimate the probability to experience a future event or the expected time without any event. To address these kinds of issues, we need to estimate quantities such as transition probabilities, cumulative probabilities and life expectancies. The purpose of this paper is to review a large number of these quantities in an illness-death model which is perha...
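
As a sketch of two of these quantities in a Markov illness-death model (LaTeX; states 0 = healthy, 1 = ill, 2 = dead, with transition intensities \alpha_{hj}):

\[
P_{00}(s,t) = \exp\!\left(-\int_s^t \bigl(\alpha_{01}(u) + \alpha_{02}(u)\bigr)\,du\right), \qquad
P_{01}(s,t) = \int_s^t P_{00}(s,u)\,\alpha_{01}(u)\,P_{11}(u,t)\,du,
\]

where P_{11}(u,t) = \exp\!\left(-\int_u^t \alpha_{12}(v)\,dv\right); life expectancies follow by integrating the relevant transition probabilities over t, and covariates enter through regression models on the \alpha_{hj}.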