
Showing papers in "Pharmaceutical Statistics in 2014"


Journal ArticleDOI
TL;DR: This manuscript reviews several methods for historical borrowing, illustrating how key parameters in each method affect borrowing behavior, and then compares these methods on the basis of mean square error, power, and type I error.
Abstract: Clinical trials rarely, if ever, occur in a vacuum. Generally, large amounts of clinical data are available prior to the start of a study, particularly on the current study's control arm. There is obvious appeal in using (i.e., 'borrowing') this information. With historical data providing information on the control arm, more trial resources can be devoted to the novel treatment while retaining accurate estimates of the current control arm parameters. This can result in more accurate point estimates, increased power, and reduced type I error in clinical trials, provided the historical information is sufficiently similar to the current control data. If this assumption of similarity is not satisfied, however, one can acquire increased mean square error of point estimates due to bias and either reduced power or increased type I error depending on the direction of the bias. In this manuscript, we review several methods for historical borrowing, illustrating how key parameters in each method affect borrowing behavior, and then we compare these methods on the basis of mean square error, power, and type I error. We emphasize two main themes. First, we discuss the idea of 'dynamic' (versus 'static') borrowing. Second, we emphasize the decision process involved in determining whether or not to include historical borrowing in terms of the perceived likelihood that the current control arm is sufficiently similar to the historical data. Our goal is to provide a clear review of the key issues involved in historical borrowing and provide a comparison of several methods useful for practitioners.
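The static-versus-dynamic distinction can be made concrete with a power prior, one common borrowing device: the historical control data are down-weighted by a factor a0 in [0, 1] that is fixed in advance under static borrowing and tuned to prior-data agreement under dynamic borrowing. A minimal sketch for a binomial control response rate, with purely illustrative counts (not from the paper):

```python
# Static power-prior borrowing for a binomial control rate (illustrative
# sketch only; the paper reviews several methods, including dynamic ones).
# a0 = 0 ignores the historical data, a0 = 1 pools it fully; static
# methods fix a0 in advance, dynamic methods let the data choose it.

def power_prior_posterior(x_c, n_c, x_h, n_h, a0, a=1.0, b=1.0):
    """Beta posterior parameters for the current control rate,
    discounting the historical data by the weight a0 in [0, 1]."""
    alpha = a + x_c + a0 * x_h
    beta = b + (n_c - x_c) + a0 * (n_h - x_h)
    return alpha, beta

# Historical controls: 30/100 responders; current controls: 12/40.
for a0 in (0.0, 0.5, 1.0):
    alpha, beta = power_prior_posterior(12, 40, 30, 100, a0)
    print(a0, round(alpha / (alpha + beta), 3))
```

With these numbers, a0 = 0 gives a posterior mean based on the 40 current controls alone, while a0 = 1 effectively pools all 140 observations; a dynamic method would shrink a0 toward zero as the historical and current response rates diverge.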

333 citations


Journal ArticleDOI
Rebecca Finch1
TL;DR: The book is a summary of multiple testing in pharmaceutical clinical trials, covering a broad range of topics from multiple testing in dose-finding studies, adaptively designed studies and microarray experiments to discussing the regulatory issues and modern multiplicity procedures such as gatekeeping.
Abstract: The book is a summary of multiple testing in pharmaceutical clinical trials, covering a broad range of topics from multiple testing in dose-finding studies, adaptively designed studies and microarray experiments to discussing the regulatory issues and modern multiplicity procedures such as gatekeeping. It is aimed predominantly at biostatisticians working in preclinical and clinical trials. The book consists of seven chapters. Each chapter starts with an introduction to the multiple testing issues, followed by a theoretical description of the available statistical techniques and examples. However, the book does not cover all areas of multiplicity faced by clinical trial biostatisticians, such as pharmacokinetic/pharmacodynamic modelling, and Bayesian theory is not considered. Chapter 1 begins the book with a summary of multiplicity problems from a regulatory perspective. The chapter is written by two influential regulatory statisticians from the European and FDA regulatory environments and provides a broad overview of the different areas where multiplicity issues may arise. The chapter covers the regulatory issues with regard to multiple endpoints, multiple dose comparisons, subgroups and multiplicity concerns in special situations. It also provides some methods for reducing the multiplicity within clinical trials, such as using hierarchical ordering and composite endpoints, and mentions situations where adjustment is not considered necessary. Chapter 2 provides the theoretical foundation for the rest of the book. The authors begin by introducing the different error rates, including the definition and importance of the family-wise error rate. The authors go on to explain popular multiple testing principles (union-intersection and intersection-union testing, and the closure and partitioning principles), followed by clear explanations (with examples) of commonly used approaches to multiple testing, such as procedures based on univariate p-values (e.g.
Bonferroni, Fallback and Hochberg), parametric testing procedures (e.g. Dunnett procedures) and resampling-based procedures. Chapter 3 gives an overview of multiple testing problems in dose-finding studies and how trend tests are used for the detection of dose–response signals. The authors also provide definitions for the minimum effective and maximum safe doses and how they are estimated. Power and sample size calculations are also included in the chapter for the maximum safe dose estimation. Finally, the authors explain model-based methods that can be used to estimate an adequate dose to achieve a desired response. Chapter 4 continues the principles and procedures introduced in Chapter 2, covering different methods for adjusting for multiple endpoints. There is a lot of repetition within the ‘at-least-one’ procedures section and the contents of Chapter 2, which is noticeable when reading the book in order, but does allow for the two chapters to be stand-alone. As well as at-least-one procedures, the chapter includes global procedures for assessing the overall efficacy of a treatment, ‘all-or-none’ procedures and the superiority-noninferiority approach. Chapter 5 focuses on gatekeeping procedures, which offer a more flexible hierarchical structure than the fixed-sequence procedure (introduced in Chapter 2). The authors walk the reader through serial, parallel and tree gatekeeping procedures, using worked (easy to understand) examples along the way. The chapter could be improved by discussing the graphical approach published by Bretz et al. [1] when explaining implementing gatekeeping procedures. Chapter 6 starts with a review of the design and analysis of adaptive trials and the multiplicity issues that can arise above those already present in fixed design trials. 
Repeated hypothesis testing at interim analyses and sample size adjustment (in both blinded and unblinded settings) are reviewed, including discussion on stopping boundaries and formulae for updating the sample size. The authors discuss applications of the closure procedure to adaptive designs based on combination tests and conditional error rates, which allow trial design modifications based on unblinded interim data. The chapter finishes with a description of two case studies based on adaptive treatment selection and subgroup selection at an interim analysis. The final chapter covers the design and analysis of microarray experiments for pharmacogenomics. The chapter starts with a clear overview of microarrays and introduces the two stages of pharmacogenetic development. The introduction is easy to understand, even for readers lacking experience in the field. The multiplicity concerns around individual biomarkers and subgroups are then discussed, along with the control of multiple error rates (not just the family-wise error rate). The design of pharmacogenomic studies is demonstrated with the use of a case study. In conclusion, the book is well written and covers a wide range of clinical trial settings in which multiple testing issues arise. Descriptions of the procedures are supported with clinical trial-related examples, with many of the chapters providing guidance on implementation in commonly used software applications (SAS and R), making the book a useful tool for biostatisticians dealing with multiple testing problems in clinical trials. Each chapter begins with an overview of the multiple testing issues faced within different clinical trial settings, which may be of interest to clinical trial practitioners. However, with the
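The univariate p-value procedures named in Chapter 2 are simple enough to state in code; a minimal sketch of two of them, Bonferroni (single-step) and Hochberg (step-up), using made-up p-values:

```python
# Minimal implementations of two p-value-based procedures mentioned in
# Chapter 2: Bonferroni and Hochberg. Both control the family-wise error
# rate; Hochberg additionally requires non-negative dependence among the
# test statistics.

def bonferroni(pvals, alpha=0.05):
    m = len(pvals)
    return [p <= alpha / m for p in pvals]

def hochberg(pvals, alpha=0.05):
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # ascending p-values
    reject = [False] * m
    # Step up from the largest p-value; the first one meeting its
    # threshold implies rejection of it and all smaller p-values.
    for rank in range(m - 1, -1, -1):
        i = order[rank]
        if pvals[i] <= alpha / (m - rank):
            for j in order[:rank + 1]:
                reject[j] = True
            break
    return reject

p = [0.01, 0.015, 0.02, 0.4]
print(bonferroni(p))  # only the smallest p-value clears 0.05/4
print(hochberg(p))    # the three smallest are rejected
```

The example shows why step-up procedures are attractive: on the same p-values, Hochberg rejects three hypotheses where Bonferroni rejects only one, at the same family-wise level.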

92 citations


Journal ArticleDOI
TL;DR: Under the proposed design, the posterior estimates of the model parameters are continuously updated to inform the decisions of dose assignment and early stopping, and the design is competitive and outperforms some existing designs.
Abstract: In early phase dose-finding cancer studies, the objective is to determine the maximum tolerated dose, defined as the highest dose with an acceptable dose-limiting toxicity rate. Finding this dose for drug-combination trials is complicated because of drug–drug interactions, and many trial designs have been proposed to address this issue. These designs rely on complicated statistical models that typically are not familiar to clinicians, and are rarely used in practice. The aim of this paper is to propose a Bayesian dose-finding design for drug combination trials based on standard logistic regression. Under the proposed design, we continuously update the posterior estimates of the model parameters to make the decisions of dose assignment and early stopping. Simulation studies show that the proposed design is competitive and outperforms some existing designs. We also extend our design to handle delayed toxicities. Copyright © 2014 John Wiley & Sons, Ltd.
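The "standard logistic regression" working model can be sketched directly: the probability of dose-limiting toxicity is a logistic function of the two doses, and the design assigns the combination whose estimated toxicity is closest to the target rate. The intercept, slopes, dose grid, and target below are hypothetical placeholders, not values from the paper:

```python
import math

# Working model of the kind the design builds on: probability of a
# dose-limiting toxicity as a logistic function of the two (standardized)
# doses in the combination. All parameter values are illustrative.

def p_tox(d1, d2, b0=-3.0, b1=1.2, b2=0.8):
    eta = b0 + b1 * d1 + b2 * d2
    return 1.0 / (1.0 + math.exp(-eta))

# After each cohort, posterior estimates of (b0, b1, b2) are updated and
# the next cohort receives the combination whose estimated toxicity is
# closest to the target rate.
target = 0.30
grid = [(d1, d2) for d1 in (0.5, 1.0, 1.5) for d2 in (0.5, 1.0, 1.5)]
best = min(grid, key=lambda d: abs(p_tox(*d) - target))
print(best)
```

The appeal the abstract describes is visible here: unlike bespoke combination models, every term in this working model is familiar logistic regression.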

55 citations


Journal ArticleDOI
TL;DR: A general class of Bayesian group sequential designs is presented, where multiple criteria based on the posterior distribution can be defined to reflect clinically meaningful decision criteria on whether to stop or continue the trial at the interim analyses.
Abstract: Bayesian approaches to the monitoring of group sequential designs have two main advantages compared with classical group sequential designs: first, they facilitate implementation of interim success and futility criteria that are tailored to the subsequent decision making, and second, they allow inclusion of prior information on the treatment difference and on the control group. A general class of Bayesian group sequential designs is presented, where multiple criteria based on the posterior distribution can be defined to reflect clinically meaningful decision criteria on whether to stop or continue the trial at the interim analyses. To evaluate the frequentist operating characteristics of these designs, both simulation methods and numerical integration methods are proposed, as implemented in the corresponding R package gsbDesign. Normal approximations are used to allow fast calculation of these characteristics for various endpoints. The practical implementation of the approach is illustrated with several clinical trial examples from different phases of drug development, with various endpoints, and informative priors.
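A minimal sketch of such posterior decision criteria under the normal approximation, with an illustrative prior, thresholds, and interim estimate (the gsbDesign package implements the full designs, including the operating-characteristic calculations):

```python
from math import sqrt
from statistics import NormalDist

# Interim decision criteria of the kind described: under a normal
# approximation, a N(m0, s0^2) prior on the treatment difference is
# combined with the interim estimate, and posterior probabilities are
# checked against success and futility thresholds. Numbers illustrative.

def posterior(m0, s0, est, se):
    """Normal-normal update: returns posterior mean and SD."""
    w0, w1 = 1.0 / s0**2, 1.0 / se**2        # precisions
    var = 1.0 / (w0 + w1)
    return (w0 * m0 + w1 * est) * var, sqrt(var)

def decide(m, s, success_cut=0.975, futility_cut=0.10):
    p_pos = 1.0 - NormalDist(m, s).cdf(0.0)  # P(difference > 0 | data)
    if p_pos >= success_cut:
        return "stop for success"
    if p_pos <= futility_cut:
        return "stop for futility"
    return "continue"

m, s = posterior(m0=0.0, s0=10.0, est=2.5, se=1.0)
print(decide(m, s))
```

With a weak prior (s0 = 10) the posterior essentially follows the interim estimate; here P(difference > 0) is about 0.99, so the success criterion is met. An informative prior on the control group would enter through the same update.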

51 citations


Journal ArticleDOI
TL;DR: In the absence of placebo-controlled trials, determining the non-inferiority (NI) margin for comparing an experimental treatment with an active comparator is based on carefully selected, well-controlled historical clinical trials.
Abstract: In the absence of placebo-controlled trials, determining the non-inferiority (NI) margin for comparing an experimental treatment with an active comparator is based on carefully selected well-controlled historical clinical trials. With this approach, information on the effect of the active comparator from other sources including observational studies and early phase trials is usually ignored because of the need to maintain active comparator effect across trials. This may lead to conservative estimates of the margin that translate into larger sample-size requirements for the design and subsequent frequentist analysis, longer trial durations, and higher drug development costs. In this article, we provide methodological approaches to determine NI margins that can utilize all relevant historical data through a novel power adjusted Bayesian meta-analysis, with Dirichlet process priors, that puts ordered weights on the amount of information a set of data contributes. We also provide a Bayesian decision rule for the non-inferiority analysis that is based on a broader use of available prior information and a sample-size determination that is based on this Bayesian decision rule. Finally, the methodology is illustrated through several examples. Published 2013. This article is a U.S. Government work and is in the public domain in the USA.
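For context, the conventional fixed-margin ("95-95") derivation that this Bayesian meta-analytic approach generalizes can be sketched in a few lines; the effect estimate, standard error, and retention fraction below are purely illustrative:

```python
# Conventional fixed-margin sketch for context: M1 is a conservative
# (lower 95% confidence bound) estimate of the active comparator's effect
# over placebo taken from historical trials, and M2 retains a
# pre-specified fraction of it as the NI margin. Numbers illustrative.

def ni_margin(effect, se, retain=0.5, z=1.96):
    m1 = effect - z * se    # lower confidence bound of historical effect
    m2 = retain * m1        # margin preserving `retain` of that effect
    return m1, m2

m1, m2 = ni_margin(effect=0.20, se=0.05, retain=0.5)
print(round(m1, 3), round(m2, 3))
```

The article's point is that basing `effect` and `se` on a handful of well-controlled trials discards other relevant sources; its power-adjusted Bayesian meta-analysis instead weights all available data when forming the historical-effect estimate behind the margin.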

50 citations


Journal ArticleDOI
TL;DR: Inferentially seamless studies are one of the best-known adaptive trial designs, but regulatory guidance suggests that statistical issues associated with study conduct are not as well understood as they should be.
Abstract: Background: Inferentially seamless studies are one of the best-known adaptive trial designs. Statistical inference for these studies is a well-studied problem. Regulatory guidance suggests that statistical issues associated with study conduct are not as well understood. Some of these issues are caused by the need for early pre-specification of the phase III design and the absence of sponsor access to unblinded data. Before statisticians decide to choose a seamless IIb/III design for their programme, they should consider whether these pitfalls will be an issue for their programme. Methods: We consider four case studies whose designs met with varying degrees of success. We explore the reasons for this variation to identify characteristics of drug development programmes that lend themselves well to inferentially seamless trials and other characteristics that warn of difficulties. Results: Seamless studies require increased upfront investment and planning to enable the phase III design to be specified at the outset of phase II. Pivotal, inferentially seamless studies are unlikely to allow meaningful sponsor access to unblinded data before study completion. This limits a sponsor's ability to reflect new information in the phase III portion. Conclusions: When few clinical data have been gathered about a drug, phase II data will answer many unresolved questions. Committing to phase III plans and study designs before phase II begins introduces extra risk to drug development. However, seamless pivotal studies may be an attractive option when the clinical setting and development programme allow, for example, when revisiting dose selection. Copyright © 2014 John Wiley & Sons, Ltd.

37 citations


Journal ArticleDOI
TL;DR: The negative multinomial distribution is used to apply this approach to analyses of recurrent events and other similar outcomes; the methods are illustrated by a trial in severe asthma where the primary endpoint was the rate of exacerbations and the primary analysis was based on the negative binomial model.
Abstract: Statistical analyses of recurrent event data have typically been based on the missing at random assumption. One implication of this is that, if data are collected only when patients are on their randomized treatment, the resulting de jure estimator of treatment effect corresponds to the situation in which the patients adhere to this regime throughout the study. For confirmatory analysis of clinical trials, sensitivity analyses are required to investigate alternative de facto estimands that depart from this assumption. Recent publications have described the use of multiple imputation methods based on pattern mixture models for continuous outcomes, where imputation for the missing data for one treatment arm (e.g. the active arm) is based on the statistical behaviour of outcomes in another arm (e.g. the placebo arm). This has been referred to as controlled imputation or reference-based imputation. In this paper, we use the negative multinomial distribution to apply this approach to analyses of recurrent events and other similar outcomes. The methods are illustrated by a trial in severe asthma where the primary endpoint was rate of exacerbations and the primary analysis was based on the negative binomial model.
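The reference-based idea for counts can be sketched as follows: an active-arm patient who withdraws has the unobserved remainder of follow-up imputed under the reference (placebo) event rate, with a gamma-Poisson mixture producing negative binomial counts. The rates, dispersion, and follow-up below are illustrative, not from the asthma trial:

```python
import math
import random

# Reference-based ("copy reference") imputation sketch for recurrent
# events: for an active-arm patient who withdraws, the unobserved rest of
# follow-up is imputed under the placebo (reference) event rate. A gamma
# frailty mixed over a Poisson count gives negative binomial draws. All
# parameter values are illustrative.

def poisson_draw(lam, rng):
    """Inversion sampler; adequate for small means."""
    u, p, k = rng.random(), math.exp(-lam), 0
    cum = p
    while u > cum:
        k += 1
        p *= lam / k
        cum += p
    return k

def impute_remaining_events(t_left, ref_rate, shape, rng, n_imp=100):
    """Multiple-imputation draws of post-withdrawal event counts."""
    draws = []
    for _ in range(n_imp):
        lam = rng.gammavariate(shape, ref_rate * t_left / shape)
        draws.append(poisson_draw(lam, rng))
    return draws

rng = random.Random(1)
draws = impute_remaining_events(t_left=0.5, ref_rate=2.0, shape=1.5, rng=rng)
print(sum(draws) / len(draws))
```

Each imputed dataset combines the observed on-treatment counts with one such draw; the analysis (e.g. a negative binomial model) is then run per imputation and the results combined with Rubin's rules.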

36 citations


Journal ArticleDOI
TL;DR: A method is presented for calculating the sample size under the assumption of a piecewise exponential distribution for the cancer vaccine group and an exponential distribution for the placebo group as the survival model, and the impact of delayed effect timing on both the choice of the Fleming-Harrington weights and the increment in the required number of events is discussed.
Abstract: In recent years, immunological science has evolved, and cancer vaccines are available for treating existing cancers. Because cancer vaccines require time to elicit an immune response, a delayed treatment effect is expected. Accordingly, the use of weighted log-rank tests with the Fleming-Harrington class of weights is proposed for evaluation of survival endpoints. We present a method for calculating the sample size under the assumption of a piecewise exponential distribution for the cancer vaccine group and an exponential distribution for the placebo group as the survival model. The impact of delayed effect timing on both the choice of the Fleming-Harrington weights and the increment in the required number of events is discussed.
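The Fleming-Harrington class referred to here weights the log-rank increments by

```latex
w(t) = \left\{\hat{S}(t^{-})\right\}^{\rho} \left\{1 - \hat{S}(t^{-})\right\}^{\gamma},
\qquad \rho, \gamma \ge 0,
```

where \(\hat{S}\) is the pooled Kaplan-Meier estimate. Taking \(\rho = \gamma = 0\) recovers the ordinary log-rank test, while \(\rho = 0\), \(\gamma > 0\) down-weights early events, matching the delayed separation of survival curves expected with cancer vaccines.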

35 citations


Journal ArticleDOI
Devan V. Mehrotra1
TL;DR: An analysis of covariance approach is recommended that uses the within-subject difference in treatment responses as the dependent variable and the corresponding difference in baseline responses as a covariate.
Abstract: In many two-period, two-treatment (2 × 2) crossover trials, for each subject, a continuous response of interest is measured before and after administration of the assigned treatment within each period. The resulting data are typically used to test a null hypothesis involving the true difference in treatment response means. We show that the power achieved by different statistical approaches is greatly influenced by (i) the 'structure' of the variance-covariance matrix of the vector of within-subject responses and (ii) how the baseline (i.e., pre-treatment) responses are accounted for in the analysis. For (ii), we compare different approaches including ignoring one or both period baselines, using a common change from baseline analysis (which we advise against), using functions of one or both baselines as period-specific or period-invariant covariates, and doing joint modeling of the post-baseline and baseline responses with corresponding mean constraints for the latter. Based on theoretical arguments and simulation-based type I error rate and power properties, we recommend an analysis of covariance approach that uses the within-subject difference in treatment responses as the dependent variable and the corresponding difference in baseline responses as a covariate. Data from three clinical trials are used to illustrate the main points.
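The recommended analysis can be sketched with ordinary least squares: the within-subject difference in treatment responses is the dependent variable, the same-oriented difference in period baselines is the covariate, and a +/-1 sequence code absorbs the period effect so that the intercept estimates the treatment difference. The data values below are invented for illustration:

```python
# ANCOVA sketch for a 2x2 crossover (illustrative, invented data):
# d = within-subject difference in treatment responses (A minus B);
# b = same-oriented difference in period baselines; z = +1/-1 sequence
# code absorbing the period effect. The intercept estimates the
# treatment difference.

def ols(X, y):
    """Least squares via normal equations and Gaussian elimination."""
    n, k = len(X), len(X[0])
    A = [[sum(X[i][p] * X[i][q] for i in range(n)) for q in range(k)]
         for p in range(k)]
    b = [sum(X[i][p] * y[i] for i in range(n)) for p in range(k)]
    for p in range(k):                       # forward elimination
        for q in range(p + 1, k):
            f = A[q][p] / A[p][p]
            for r in range(p, k):
                A[q][r] -= f * A[p][r]
            b[q] -= f * b[p]
    beta = [0.0] * k
    for p in range(k - 1, -1, -1):           # back substitution
        s = sum(A[p][r] * beta[r] for r in range(p + 1, k))
        beta[p] = (b[p] - s) / A[p][p]
    return beta

# (d, b, z) per subject -- hypothetical numbers.
data = [(1.8, 0.5, 1), (2.4, 1.0, 1), (1.1, -0.2, 1),
        (2.1, 0.8, -1), (1.5, 0.1, -1), (2.6, 1.2, -1)]
X = [[1.0, b, z] for _, b, z in data]
y = [d for d, _, _ in data]
tau_hat = ols(X, y)[0]                       # estimated treatment difference
print(round(tau_hat, 3))
```

Using the baseline difference as a covariate, rather than subtracting it from the response, is exactly the distinction the authors' power comparisons turn on.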

35 citations


Journal ArticleDOI
TL;DR: The safety sub-team of the BSWG explores the use of Bayesian methods when applied to drug safety meta-analysis and network meta-analysis related to the cardiovascular safety of non-steroidal anti-inflammatory drugs.
Abstract: The Drug Information Association Bayesian Scientific Working Group (BSWG) was formed in 2011 with a vision to ensure that Bayesian methods are well understood and broadly utilized for design and analysis and throughout the medical product development process, and to improve industrial, regulatory, and economic decision making. The group, composed of individuals from academia, industry, and regulatory agencies, has as its mission to facilitate the appropriate use and contribute to the progress of Bayesian methodology. In this paper, the safety sub-team of the BSWG explores the use of Bayesian methods when applied to drug safety meta-analysis and network meta-analysis. Guidance is presented on the conduct and reporting of such analyses. We also discuss different structural model assumptions and provide discussion on prior specification. The work is illustrated through a case study involving a network meta-analysis related to the cardiovascular safety of non-steroidal anti-inflammatory drugs.

27 citations


Journal ArticleDOI
TL;DR: It is demonstrated how this inflation of the significance level can be adjusted for to achieve control of the type I error rate at a pre-specified level, and some refinements of the re-estimation procedure are proposed to improve the power properties, in particular in scenarios with small sample sizes.
Abstract: In drug development, bioequivalence studies are used to indirectly demonstrate clinical equivalence of a test formulation and a reference formulation of a specific drug by establishing their equivalence in bioavailability. These studies are typically run as crossover studies. In the planning phase of such trials, investigators and sponsors are often faced with a high variability in the coefficients of variation of the typical pharmacokinetic endpoints such as the area under the concentration curve or the maximum plasma concentration. Adaptive designs have recently been considered to deal with this uncertainty by adjusting the sample size based on the accumulating data. Because regulators generally favor sample size re-estimation procedures that maintain the blinding of the treatment allocations throughout the trial, we propose in this paper a blinded sample size re-estimation strategy and investigate its error rates. We show that the procedure, although blinded, can lead to some inflation of the type I error rate. In the context of an example, we demonstrate how this inflation of the significance level can be adjusted for to achieve control of the type I error rate at a pre-specified level. Furthermore, some refinements of the re-estimation procedure are proposed to improve the power properties, in particular in scenarios with small sample sizes.
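The mechanics of a blinded re-estimation can be sketched as follows, using a normal approximation to the TOST sample-size calculation for a 2x2 crossover (the paper works with the exact procedures and studies the resulting error rates); all design values below are illustrative:

```python
from math import ceil, log, sqrt
from statistics import NormalDist

# Blinded sample size re-estimation sketch for a 2x2 crossover
# bioequivalence trial with limits 0.80-1.25 and an assumed true ratio of
# 0.95. A normal approximation to the TOST power (taking the variance of
# the log-mean difference as 2*sigma_w^2/N) replaces the exact t-based
# calculation used in practice. All design values are illustrative.

z = NormalDist().inv_cdf

def total_n(cv, gmr=0.95, alpha=0.05, power=0.80):
    """Approximate total N from the within-subject CV."""
    sigma = sqrt(log(1 + cv**2))             # within-subject SD, log scale
    delta = min(log(1.25 / gmr), log(gmr / 0.80))
    return ceil(2 * (sigma * (z(1 - alpha) + z(power)) / delta) ** 2)

# Planned with CV = 0.25; the blinded interim estimate of the CV comes
# back at 0.35, so the total sample size is revised upward.
print(total_n(0.25), "->", total_n(0.35))
```

The paper's caution applies to exactly this step: even though the interim CV estimate uses no unblinded treatment information, re-estimation of this kind can still inflate the type I error slightly, which is why the authors adjust the significance level.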

Journal ArticleDOI
TL;DR: The survey results support findings from the literature and provide additional insight on regulatory acceptance of Bayesian methods and information on the need for a Bayesian infrastructure within an organization.
Abstract: Bayesian applications in medical product development have recently gained popularity. Despite many advances in Bayesian methodology and computations, increase in application across the various areas of medical product development has been modest. The DIA Bayesian Scientific Working Group (BSWG), which includes representatives from industry, regulatory agencies, and academia, has adopted the vision to ensure Bayesian methods are well understood, accepted more broadly, and appropriately utilized to improve decision making and enhance patient outcomes. As Bayesian applications in medical product development are wide ranging, several sub-teams were formed to focus on various topics such as patient safety, non-inferiority, prior specification, comparative effectiveness, joint modeling, program-wide decision making, analytical tools, and education. The focus of this paper is on the recent effort of the BSWG Education sub-team to administer a Bayesian survey to statisticians across 17 organizations involved in medical product development. We summarize results of this survey, from which we provide recommendations on how to accelerate progress in Bayesian applications throughout medical product development. The survey results support findings from the literature and provide additional insight on regulatory acceptance of Bayesian methods and information on the need for a Bayesian infrastructure within an organization. The survey findings support the claim that only modest progress in areas of education and implementation has been made recently, despite substantial progress in Bayesian statistical research and software availability. Copyright © 2013 John Wiley & Sons, Ltd.

Journal ArticleDOI
TL;DR: The approach to prepare data for sharing with other researchers in a way that minimises risk with respect to the privacy and confidentiality of research participants, ensures compliance with current data privacy legal requirements and yet retains utility of the anonymised datasets for research purposes is described.
Abstract: In May 2013, GlaxoSmithKline (980 Great West Road, Brentford, Middlesex, TW8 9GS, UK) established a new online system to enable scientific researchers to request access to anonymised patient level clinical trial data. Providing access to individual patient data collected in clinical trials enables conduct of further research that may help advance medical science or improve patient care. In turn, this helps ensure that the data provided by research participants are used to maximum effect in the creation of new knowledge and understanding. However, when providing access to individual patient data, maintaining the privacy and confidentiality of research participants is critical. This article describes the approach we have taken to prepare data for sharing with other researchers in a way that minimises risk with respect to the privacy and confidentiality of research participants, ensures compliance with current data privacy legal requirements and yet retains utility of the anonymised datasets for research purposes. We recognise that there are different possible approaches and that broad consensus is needed.

Journal ArticleDOI
TL;DR: A bias-correction method for imputation-based AUCs is introduced; the bias-corrected estimate successfully compensated for the overestimation in the simulation studies, and the estimation of the imputation-based AUCs is illustrated using breast cancer data.
Abstract: A cure rate model is a survival model incorporating the cure rate with the assumption that the population contains both uncured and cured individuals. It is a powerful statistical tool for prognostic studies, especially in cancer. The cure rate is important for making treatment decisions in clinical practice. The proportional hazards (PH) cure model can predict the cure rate for each patient. This contains a logistic regression component for the cure rate and a Cox regression component to estimate the hazard for uncured patients. A measure for quantifying the predictive accuracy of the cure rate estimated by the Cox PH cure model is required, as there has been a lack of previous research in this area. We used the Cox PH cure model for the breast cancer data; however, the area under the receiver operating characteristic curve (AUC) could not be estimated because many patients were censored. In this study, we used imputation-based AUCs to assess the predictive accuracy of the cure rate from the PH cure model. We examined the precision of these AUCs using simulation studies. The results demonstrated that the imputation-based AUCs were estimable and their biases were negligibly small in many cases, although the ordinary AUC could not be estimated. Additionally, we introduced a bias-correction method for the imputation-based AUCs and found that the bias-corrected estimate successfully compensated for the overestimation in the simulation studies. We also illustrated the estimation of the imputation-based AUCs using breast cancer data.
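The two-component structure described, logistic for cure and Cox for the uncured, is the standard mixture cure formulation:

```latex
S_{\mathrm{pop}}(t \mid x, z) = \pi(z) + \{1 - \pi(z)\}\, S_u(t \mid x),
\qquad \operatorname{logit} \pi(z) = \gamma^{\top} z,
```

where \(\pi(z)\) is the cure probability and the survival of uncured patients \(S_u(t \mid x)\) follows a proportional hazards model \(h_u(t \mid x) = h_0(t)\exp(\beta^{\top} x)\).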

Journal ArticleDOI
TL;DR: The safety subteam of the Drug Information Association Bayesian Scientific Working Group evaluates challenges associated with current methods for designing and analyzing safety trials and provides an overview of several suggested Bayesian opportunities that may increase efficiency of safety trials along with relevant case examples.
Abstract: Safety assessment is essential throughout medical product development. There has been increased awareness of the importance of safety trials recently, in part due to recent US Food and Drug Administration guidance related to thorough assessment of cardiovascular risk in the treatment of type 2 diabetes. Bayesian methods provide great promise for improving the conduct of safety trials. In this paper, the safety subteam of the Drug Information Association Bayesian Scientific Working Group evaluates challenges associated with current methods for designing and analyzing safety trials and provides an overview of several suggested Bayesian opportunities that may increase efficiency of safety trials along with relevant case examples.

Journal ArticleDOI
TL;DR: Based on simulations, it is shown that the current practice of ignoring centre heterogeneity can be seriously misleading, and the performances of the frailty modelling approach over competing methods are illustrated.
Abstract: Conducting a clinical trial at multiple study centres raises the issue of whether and how to adjust for centre heterogeneity in the statistical analysis. In this paper, we address this issue for multicentre clinical trials with a time-to-event outcome. Based on simulations, we show that the current practice of ignoring centre heterogeneity can be seriously misleading, and we illustrate the performance of the frailty modelling approach over competing methods. Special attention is paid to the problem of misspecification of the frailty distribution. The appendix provides sample code in R and in SAS to perform the analyses in this paper.
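The frailty model in question multiplies each centre's hazard by a shared, unobserved random effect:

```latex
h_{ij}(t) = u_i\, h_0(t) \exp\!\left(\beta^{\top} x_{ij}\right),
\qquad u_i \overset{\text{iid}}{\sim} G, \quad \mathbb{E}[u_i] = 1,
```

where \(u_i\) is the frailty of centre \(i\) acting on all its patients \(j\). A gamma distribution for \(G\) is the common default, and the paper's special attention to misspecification concerns exactly this choice.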

Journal ArticleDOI
TL;DR: To help explain the non-negligible chance of failing to reproduce a previous positive finding, a conceptual analogy between phases II-III development stages and interim analyses of a trial with a group sequential design is drawn.
Abstract: It is frequently noted that an initial clinical trial finding was not reproduced in a later trial. This is often met with some surprise. Yet, there is a relatively straightforward reason partially responsible for this observation. In this article, we examine this reason by first reviewing some findings in a recent publication in the Journal of the American Medical Association. To help explain the non-negligible chance of failing to reproduce a previous positive finding, we compare a series of trials to successive diagnostic tests used for identifying a condition. To help explain the suspicion that the treatment effect, when observed in a subsequent trial, seems to have decreased in magnitude, we draw a conceptual analogy between the phase II-III development stages and interim analyses of a trial with a group sequential design. Both analogies remind us that what we observed in an early trial could be a false positive or a random high. We discuss statistical sources for these occurrences and explain why it is important for statisticians to take these into consideration when designing and interpreting trial results.

Journal ArticleDOI
TL;DR: A new approach is proposed that builds on the previously proposed approaches and uses data available at the interim analysis to estimate parameters and then, on the basis of these estimates, chooses the treatment selection method with the highest probability of correctly selecting the most effective treatment.
Abstract: Seamless phase II/III clinical trials are conducted in two stages with treatment selection at the first stage. In the first stage, patients are randomized to a control or one of k > 1 experimental treatments. At the end of this stage, interim data are analysed, and a decision is made concerning which experimental treatment should continue to the second stage. If the primary endpoint is observable only after some period of follow-up, at the interim analysis data may be available on some early outcome on a larger number of patients than those for whom the primary endpoint is available. These early endpoint data can thus be used for treatment selection. For two previously proposed approaches, the power has been shown to be greater for one or other method depending on the true treatment effects and correlations. We propose a new approach that builds on the previously proposed approaches and uses data available at the interim analysis to estimate these parameters and then, on the basis of these estimates, chooses the treatment selection method with the highest probability of correctly selecting the most effective treatment. This method is shown to perform well compared with the two previously described methods for a wide range of true parameter values. In most cases, the performance of the new method is either similar to or, in some cases, better than either of the two previously proposed methods. © 2014 The Authors. Pharmaceutical Statistics published by John Wiley & Sons Ltd.

Journal ArticleDOI
TL;DR: The adaptive test does not require knowledge of the multivariate distribution of test statistics, it is applicable in a wide range of scenarios including trials with multiple treatment comparisons, endpoints or subgroups, or combinations thereof.
Abstract: Multiple testing procedures defined by directed, weighted graphs have recently been proposed as an intuitive visual tool for constructing multiple testing strategies that reflect the often complex contextual relations between hypotheses in clinical trials. Many well-known sequentially rejective tests, such as (parallel) gatekeeping tests or hierarchical testing procedures, are special cases of the graph-based tests. We generalize these graph-based multiple testing procedures to adaptive trial designs with an interim analysis. These designs permit mid-trial design modifications based on unblinded interim data as well as external information, while providing strong familywise error rate control. To maintain the familywise error rate, the adaptation rule need not be prespecified in detail. Because the adaptive test does not require knowledge of the multivariate distribution of test statistics, it is applicable in a wide range of scenarios, including trials with multiple treatment comparisons, endpoints or subgroups, or combinations thereof. Examples of adaptations are dropping of treatment arms, selection of subpopulations, and sample size reassessment. If, at the interim analysis, it is decided to continue the trial as planned, the adaptive test reduces to the originally planned multiple testing procedure. Only if adaptations are actually implemented does an adjusted test need to be applied. The procedure is illustrated with a case study, and its operating characteristics are investigated by simulations. Copyright © 2014 John Wiley & Sons, Ltd.
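The core sequentially rejective algorithm behind such graph-based procedures (in its non-adaptive, Bonferroni-based form) is compact enough to sketch. The hypothesis weights, transition matrix, and p-values below are illustrative assumptions, not values from the paper:

```python
def graphical_test(p, w, G, alpha=0.05):
    """Sequentially rejective graph-based multiple test (Bonferroni-based).

    p: raw p-values; w: initial hypothesis weights summing to at most 1;
    G: transition matrix, G[j][l] = fraction of H_j's weight passed to H_l
    when H_j is rejected. Returns the set of rejected hypothesis indices.
    """
    active = set(range(len(p)))
    w = list(w)
    G = [row[:] for row in G]
    rejected = set()
    while True:
        cand = [j for j in active if w[j] > 0 and p[j] <= w[j] * alpha]
        if not cand:
            return rejected
        j = cand[0]                      # reject any rejectable hypothesis
        active.discard(j)
        rejected.add(j)
        # redistribute H_j's weight along its outgoing edges (old w, old G)
        new_w = {l: w[l] + w[j] * G[j][l] for l in active}
        new_G = [row[:] for row in G]
        for l in active:
            for k in active:
                if l == k:
                    continue
                denom = 1.0 - G[l][j] * G[j][l]
                new_G[l][k] = ((G[l][k] + G[l][j] * G[j][k]) / denom
                               if denom > 0 else 0.0)
        for l in active:
            w[l] = new_w[l]
        G = new_G

# Two-node graph with equal weights, each node passing all weight to the other
w0 = [0.5, 0.5]
G0 = [[0.0, 1.0], [1.0, 0.0]]
print(graphical_test([0.01, 0.04], w0, G0))  # both hypotheses rejected
```

With equal initial weights and all weight passed between two nodes, this graph reduces to Holm's step-down procedure, which is a convenient way to sanity-check the weight-update step.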

Journal ArticleDOI
TL;DR: It is found that the score-based methods on the whole have the best two-sided coverage, although they have slight deficiencies for confidence levels of 90% or lower. The Brown-Li 'Jeffreys' method appears to perform reasonably well and, in most situations, has better one-sided coverage than the widely recommended alternatives.
Abstract: This paper uses graphical methods to illustrate and compare the coverage properties of a number of methods for calculating confidence intervals for the difference between two independent binomial proportions. We investigate both small-sample and large-sample properties of both two-sided and one-sided coverage, with an emphasis on asymptotic methods. In terms of aligning the smoothed coverage probability surface with the nominal confidence level, we find that the score-based methods on the whole have the best two-sided coverage, although they have slight deficiencies for confidence levels of 90% or lower. For an easily taught, hand-calculated method, the Brown-Li 'Jeffreys' method appears to perform reasonably well, and in most situations, it has better one-sided coverage than the widely recommended alternatives. In general, we find that the one-sided properties of many of the available methods are surprisingly poor. In fact, almost none of the existing asymptotic methods achieve equal coverage on both sides of the interval, even with large sample sizes, and consequently if used as a non-inferiority test, the type I error rate (which is equal to the one-sided non-coverage probability) can be inflated. The only exception is the Gart-Nam 'skewness-corrected' method, which we express using modified notation in order to include a bias correction for improved small-sample performance, and an optional continuity correction for those seeking more conservative coverage. Using a weighted average of two complementary methods, we also define a new hybrid method that almost matches the performance of the Gart-Nam interval.
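As one concrete member of the score-based family compared here, the sketch below implements Newcombe's square-and-add interval for the difference of two proportions, built from per-group Wilson score intervals. This is a standard comparator of the kind evaluated in the paper, not the new hybrid method it proposes; the counts in the example are invented:

```python
import math

def wilson_interval(x, n, z=1.959964):
    """Wilson score interval for a single binomial proportion x/n."""
    p = x / n
    centre = (p + z**2 / (2 * n)) / (1 + z**2 / n)
    half = (z / (1 + z**2 / n)) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return centre - half, centre + half

def newcombe_diff_ci(x1, n1, x2, n2, z=1.959964):
    """Newcombe's square-and-add interval for p1 - p2: combine the
    per-group Wilson limits around the observed difference."""
    p1, p2 = x1 / n1, x2 / n2
    l1, u1 = wilson_interval(x1, n1, z)
    l2, u2 = wilson_interval(x2, n2, z)
    d = p1 - p2
    lower = d - math.sqrt((p1 - l1) ** 2 + (u2 - p2) ** 2)
    upper = d + math.sqrt((u1 - p1) ** 2 + (p2 - l2) ** 2)
    return lower, upper

# hypothetical data: 56/70 responders vs 48/80, observed difference 0.2
lo, hi = newcombe_diff_ci(56, 70, 48, 80)
print(round(lo, 3), round(hi, 3))
```

Note the interval is generally asymmetric about the observed difference, which is precisely how score-based methods improve one-sided coverage relative to the simple Wald interval.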

Journal ArticleDOI
TL;DR: Two distinct tests sharing a common test statistic, a test for symmetry of distribution and a test for the median of a symmetric distribution, are proposed to be referred to by different names: 'test for symmetry based on signed-rank statistic' and 'test for median based on signed-rank statistic', respectively.
Abstract: In statistical literature, the term 'signed-rank test' (or 'Wilcoxon signed-rank test') has been used to refer to two distinct tests: a test for symmetry of distribution and a test for the median of a symmetric distribution, sharing a common test statistic. To avoid potential ambiguity, we propose to refer to those two tests by different names, as 'test for symmetry based on signed-rank statistic' and 'test for median based on signed-rank statistic', respectively. The utility of such terminological differentiation should become evident through our discussion of how those tests connect and contrast with the sign test and the one-sample t-test. Published 2014. This article is a U.S. Government work and is in the public domain in the USA.
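A minimal sketch of the shared machinery: the same statistic W+ (sum of ranks of the positive values) drives both tests, applied either to a raw sample (symmetry about zero) or to differences from a hypothesized median. The paired differences below are invented, and the large-sample normal approximation without a tie correction is an assumption of this sketch:

```python
import math

def signed_rank_test(diffs):
    """Wilcoxon signed-rank statistic W+ with a large-sample normal
    approximation for the two-sided p-value (no tie variance correction,
    so this is a sketch intended for untied data)."""
    d = [x for x in diffs if x != 0.0]       # drop zeros, as is conventional
    n = len(d)
    if n == 0:
        raise ValueError("no nonzero differences")
    order = sorted(range(n), key=lambda i: abs(d[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:                              # midranks in case of tied |d|
        j = i
        while j + 1 < n and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        mid = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mid
        i = j + 1
    w_plus = sum(r for r, x in zip(ranks, d) if x > 0)
    mu = n * (n + 1) / 4
    sigma = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mu) / sigma
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return w_plus, p

diffs = [1.2, -0.8, 2.5, 1.9, -0.3, 3.1, 0.7, 2.2]  # hypothetical paired diffs
w, p = signed_rank_test(diffs)
print(w, round(p, 3))
```

Used on differences x_i minus a hypothesized median m, this is the 'test for median' usage; used on a single sample directly, it is the 'test for symmetry' usage, which is exactly the distinction the paper draws.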

Journal ArticleDOI
TL;DR: An oncology research network considering a reasonable trial in melanoma, including adolescents, will compete for recruitment with the PIP-triggered trials designed by regulatory tunnel vision and sponsored by companies under EMA-imposed pressure.
Abstract: The European Medicines Agency (EMA) website lists all diseases that officially exist in adults only. The class waiver for juvenile melanoma was revoked in 2008 referring to US SEER statistics. This statistical justification is misleading. Melanoma in adolescents is much rarer than claimed by EMA/Paediatric Committee; <1/4 of adolescents with melanoma need systemic treatment; separate efficacy studies are neither medically justified nor feasible. The scarce adolescent patients should be allowed to participate in adult trials. To force companies to investigate them separately turns them into paediatric hostages, to adapt the term therapeutic orphans coined in 1968 by Shirkey. There are now five melanoma Paediatric Investigation Plans (PIPs). Probably none of the PIP-triggered clinical studies will ever be completed; we propose to call them ghost studies. An oncology research network considering a reasonable trial in melanoma, including adolescents, will compete for recruitment with the PIP-triggered trials designed by regulatory tunnel vision and sponsored by companies under EMA-imposed pressure. EMA/Paediatric Committee's territorial enthusiasm ("our patients") damages oncology research. Copyright © 2014 John Wiley & Sons, Ltd.

Journal ArticleDOI
TL;DR: In this paper, the authors consider an extension of the nonlinear mixed-effects models in which random effects and within-subject errors are assumed to be distributed according to a rich class of parametric models that are often used for robust inference.
Abstract: A common assumption in nonlinear mixed-effects models is the normality of both random effects and within-subject errors. However, such assumptions make inferences vulnerable to the presence of outliers. More flexible distributions are therefore necessary for modeling both sources of variability in this class of models. In the present paper, I consider an extension of the nonlinear mixed-effects models in which random effects and within-subject errors are assumed to be distributed according to a rich class of parametric models that are often used for robust inference. The class of distributions I consider is the scale mixture of multivariate normal distributions, which consists of a wide range of symmetric and continuous distributions. This class includes heavy-tailed multivariate distributions, such as the Student's t, slash, and contaminated normal. With the scale mixture of multivariate normal distributions, robustification is achieved from the tail behavior of the different distributions. A Bayesian framework is adopted, and MCMC is used to carry out posterior analysis. Model comparison using different criteria was considered. The procedures are illustrated using a real dataset from a pharmacokinetic study. I contrast results from the normal and robust models and show how the implementation can be used to detect outliers. Copyright © 2013 John Wiley & Sons, Ltd.
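The scale-mixture representation the paper relies on is easy to demonstrate directly: a Student's t variate arises as a normal whose precision is mixed over a gamma distribution. The sketch below is plain Monte Carlo (not the paper's MCMC) and checks that the sample variance of the mixture approaches the t-distribution value df/(df - 2); the degrees of freedom and sample size are illustrative:

```python
import math
import random

random.seed(1)

def t_via_scale_mixture(df, size):
    """Draw Student-t variates as a scale mixture of normals:
    u ~ Gamma(shape=df/2, rate=df/2), then x | u ~ N(0, 1/u)."""
    out = []
    for _ in range(size):
        u = random.gammavariate(df / 2.0, 2.0 / df)  # scale = 1/rate
        out.append(random.gauss(0.0, 1.0 / math.sqrt(u)))
    return out

x = t_via_scale_mixture(6, 200000)
m = sum(x) / len(x)
v = sum((xi - m) ** 2 for xi in x) / (len(x) - 1)
print(v)  # should be close to the t_6 variance df/(df-2) = 1.5
```

The heaviness of the tails is governed entirely by the mixing distribution on u, which is what allows this class of models to down-weight outlying subjects or observations.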

Journal ArticleDOI
TL;DR: It is argued that the adaptive design methodology of Li et al. (Biostatistics 3:277–287) for two-stage trials with mid-trial sample size adjustment is closer in principle to a group sequential design, in spite of its obvious adaptive element.
Abstract: In this paper, we review the adaptive design methodology of Li et al. (Biostatistics 3:277–287) for two-stage trials with mid-trial sample size adjustment. We argue that it is closer in principle to a group sequential design, in spite of its obvious adaptive element. Several extensions are proposed that aim to make it an even more attractive and transparent alternative to a standard (fixed sample size) trial for funding bodies to consider. These enable a cap to be put on the maximum sample size and the trial data to be analysed using standard methods at its conclusion. The regulatory view of trials incorporating unblinded sample size re-estimation is also discussed. © 2014 The Authors. Pharmaceutical Statistics published by John Wiley & Sons, Ltd.

Journal ArticleDOI
TL;DR: This work considers the cases where both cost and effectiveness are assumed to have a normal distribution and when costs are gamma distributed and effectiveness is normally distributed and proposes a Bayesian approach to correct cost-effectiveness studies for unmeasured confounding.
Abstract: Unmeasured confounding is a common problem in observational studies. Failing to account for unmeasured confounding can result in biased point estimators and poor performance of hypothesis tests and interval estimators. We provide examples of the impacts of unmeasured confounding on cost-effectiveness analyses using observational data along with a Bayesian approach to correct estimation. Assuming validation data are available, we propose a Bayesian approach to correct cost-effectiveness studies for unmeasured confounding. We consider the cases where both cost and effectiveness are assumed to have a normal distribution and when costs are gamma distributed and effectiveness is normally distributed. Simulation studies were conducted to determine the impact of ignoring the unmeasured confounder and to determine the size of the validation data required to obtain valid inferences.


Journal ArticleDOI
TL;DR: This work proposes to analyze the treatment effect on tumor growth kinetics using a joint modeling framework accounting for the informative missing mechanism to better characterize treatment effect and thereby inform decision-making in phase II oncology trials.
Abstract: The tumor burden (TB) process is postulated to be the primary mechanism through which most anticancer treatments provide benefit. In phase II oncology trials, the biologic effects of a therapeutic agent are often analyzed using conventional endpoints for best response, such as objective response rate and progression-free survival, both of which cause loss of information. On the other hand, graphical methods including spider plot and waterfall plot lack any statistical inference when there is more than one treatment arm. Therefore, longitudinal analysis of TB data is well recognized as a better approach for treatment evaluation. However, the longitudinal TB process suffers from informative missingness because of progression or death. We propose to analyze the treatment effect on tumor growth kinetics using a joint modeling framework accounting for the informative missing mechanism. Our approach is illustrated by simulation studies in multiple settings and an application to a non-small-cell lung cancer data set. The proposed analyses can be performed in early-phase clinical trials to better characterize treatment effect and thereby inform decision-making.

Journal ArticleDOI
TL;DR: An algebraic expression for the variance of the treatment difference in a complete two-period design is derived, and it is shown that under a 'no difference' null, correlation does not result in variance inflation in this design.
Abstract: A complete two-period experimental design has been defined as one in which subjects are randomized to treatment, observed for the occurrence of an event of interest, re-randomized, and observed again for the event in a second period. A 4-year vaccine efficacy trial was planned to compare a high-dose vaccine with a standard-dose vaccine. Subjects would be randomized each year, and subjects who had participated in a previous year would be allowed to re-enroll in a subsequent year and would be re-randomized. A question of interest is whether positive correlation between observations on subjects who re-enrolled would inflate the variance of test statistics. The effect of re-enrollment and correlation on type I error in a 4-year trial is investigated by simulation. As conducted, the trial met its power requirements after two years. Subjects therefore included some who participated for a single year and others who participated in both years. Those who participated in both years constituted a complete two-period design. An algebraic expression for the variance of the treatment difference in a complete two-period design is derived. It is shown that under a 'no difference' null, correlation does not result in variance inflation in this design. When there is a treatment difference, there is variance inflation, but it is small. In the vaccine efficacy trial, the effect of correlation on the statistical inference was negligible. Copyright © 2014 John Wiley & Sons, Ltd.
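The key intuition behind the 'no variance inflation under the null' result can be illustrated with a small Monte Carlo sketch (not the paper's algebraic derivation). Because treatment is re-randomized independently in period 2, the two period-specific treatment differences are uncorrelated under the null even though outcomes within a subject are positively correlated. The sample size, event probability, and within-subject correlation below are invented for illustration:

```python
import random

random.seed(2014)

def one_replicate(n=100, p=0.5, rho=0.5):
    """One replicate of a complete two-period design under a 'no
    difference' null: within-subject outcomes are positively correlated,
    but treatment is independently re-randomized in each period."""
    y1 = [1 if random.random() < p else 0 for _ in range(n)]
    # period-2 outcome repeats period 1 with probability rho, else is fresh
    y2 = [a if random.random() < rho else (1 if random.random() < p else 0)
          for a in y1]
    diffs = []
    for y in (y1, y2):
        arm = [random.random() < 0.5 for _ in range(n)]  # fresh randomization
        trt = [v for v, a in zip(y, arm) if a]
        ctl = [v for v, a in zip(y, arm) if not a]
        diffs.append(sum(trt) / len(trt) - sum(ctl) / len(ctl))
    return diffs

reps = [one_replicate() for _ in range(2000)]
d1 = [r[0] for r in reps]
d2 = [r[1] for r in reps]
m1, m2 = sum(d1) / len(d1), sum(d2) / len(d2)
cov = sum((a - m1) * (b - m2) for a, b in zip(d1, d2)) / (len(d1) - 1)
print(cov)  # empirical covariance of the two period estimates: near zero
```

With zero covariance between the period estimates, the variance of their combination is the same as if the two periods had enrolled entirely separate subjects, which is the sense in which correlation causes no inflation under the null.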

Journal ArticleDOI
TL;DR: This work proposes to utilize the primary analysis based on a mixed-effects model for repeated measures to draw inference about the treatment effect under the extended placebo-based pattern-mixture model and applies the proposed method to a clinical study of major depressive disorders.
Abstract: Pattern-mixture models provide a general and flexible framework for sensitivity analyses of nonignorable missing data in longitudinal studies. The placebo-based pattern-mixture model handles missing data in a transparent and clinically interpretable manner. We extend this model to include a sensitivity parameter that characterizes the gradual departure of the missing data mechanism from being missing at random toward being missing not at random under the standard placebo-based pattern-mixture model. We derive the treatment effect implied by the extended model. We propose to utilize the primary analysis based on a mixed-effects model for repeated measures to draw inference about the treatment effect under the extended placebo-based pattern-mixture model. We use simulation studies to confirm the validity of the proposed method. We apply the proposed method to a clinical study of major depressive disorders.

Journal ArticleDOI
TL;DR: For fitting the Emax model to continuous dose-response data when maximum likelihood estimation fails to converge, bootstrap resampling and profile-likelihood approaches are shown to be comparable with or better than other methods that have been used, with the profile-likelihood method outperforming bootstrap resampling.
Abstract: We consider fitting the so-called Emax model to continuous response data from clinical trials designed to investigate the dose-response relationship for an experimental compound. When there is insufficient information in the data to estimate all of the parameters because the high-dose asymptote is ill defined, maximum likelihood estimation fails to converge. We explore the use of either bootstrap resampling or the profile likelihood to make inferences about effects and doses required to give a particular effect, using limits on the parameter values to obtain the value of the maximum likelihood when the high-dose asymptote is ill defined. The results obtained show these approaches to be comparable with or better than some others that have been used when maximum likelihood estimation fails to converge, and that the profile-likelihood method outperforms the bootstrap resampling approach.
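The profiling idea can be sketched as follows: for any fixed ED50, the Emax model E(d) = E0 + Emax * d / (ED50 + d) is linear in E0 and Emax, so those two parameters can be solved in closed form and the residual sum of squares (equivalent to the profile likelihood under normal errors) minimized over a bounded ED50 grid, the bound playing the role of the parameter limits mentioned above. This is an illustrative sketch, not the paper's exact procedure, and all numerical values (true curve, doses, noise level, grid) are assumptions:

```python
import math
import random

random.seed(7)

def fit_emax_profile(doses, ys, ed50_grid):
    """Profile least-squares fit of E(d) = E0 + Emax * d / (ED50 + d):
    for each candidate ED50, solve (E0, Emax) by simple linear regression
    of y on x = d / (ED50 + d), and keep the ED50 minimizing the SSE."""
    best = None
    for ed50 in ed50_grid:
        x = [d / (ed50 + d) for d in doses]
        n = len(x)
        mx, my = sum(x) / n, sum(ys) / n
        sxx = sum((xi - mx) ** 2 for xi in x)
        sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, ys))
        emax = sxy / sxx
        e0 = my - emax * mx
        sse = sum((yi - e0 - emax * xi) ** 2 for xi, yi in zip(x, ys))
        if best is None or sse < best[0]:
            best = (sse, e0, emax, ed50)
    return best  # (SSE, E0, Emax, ED50)

# simulate one dose-response data set from an assumed true Emax curve
true_e0, true_emax, true_ed50 = 2.0, 10.0, 25.0
doses, ys = [], []
for d in [0, 5, 10, 25, 50, 100, 200]:
    for _ in range(5):
        mu = true_e0 + true_emax * d / (true_ed50 + d)
        doses.append(d)
        ys.append(mu + random.gauss(0.0, 0.5))

# bounded, log-spaced ED50 grid from 0.5 to 1000
grid = [math.exp(math.log(0.5) + i * (math.log(1000) - math.log(0.5)) / 200)
        for i in range(201)]
sse, e0_hat, emax_hat, ed50_hat = fit_emax_profile(doses, ys, grid)
print(e0_hat, emax_hat, ed50_hat)
```

Plotting the profiled SSE against the ED50 grid also shows directly when the high-dose asymptote is ill defined: the profile becomes flat toward large ED50 instead of having a clear minimum, which is the convergence failure the paper addresses.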