
Showing papers in "Statistics in Medicine in 1998"


Journal ArticleDOI
TL;DR: The propensity score, defined as the conditional probability of being treated given the covariates, can be used to balance the covariates in the two groups, and therefore reduce bias.
Abstract: In observational studies, investigators have no control over the treatment assignment. The treated and non-treated (that is, control) groups may have large differences on their observed covariates, and these differences can lead to biased estimates of treatment effects. Even traditional covariance analysis adjustments may be inadequate to eliminate this bias. The propensity score, defined as the conditional probability of being treated given the covariates, can be used to balance the covariates in the two groups, and therefore reduce this bias. In order to estimate the propensity score, one must model the distribution of the treatment indicator variable given the observed covariates. Once estimated, the propensity score can be used to reduce bias through matching, stratification (subclassification), regression adjustment, or some combination of all three. In this tutorial we discuss the uses of propensity score methods for bias reduction, give references to the literature and illustrate the uses through applied examples.
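
A minimal sketch of the estimation and subclassification steps described above, assuming a pandas data frame with hypothetical treatment, covariate and outcome column names; logistic regression is only one possible model for the treatment indicator.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

def propensity_strata(df, treatment, covariates, n_strata=5):
    """Estimate the propensity score by logistic regression and form quintile strata."""
    model = LogisticRegression(max_iter=1000).fit(df[covariates], df[treatment])
    out = df.copy()
    out["pscore"] = model.predict_proba(df[covariates])[:, 1]
    out["stratum"] = pd.qcut(out["pscore"], q=n_strata, labels=False)
    return out

def stratified_effect(df, outcome, treatment):
    """Average the within-stratum treated-minus-control mean differences,
    weighting each stratum by its share of the sample."""
    diffs = df.groupby("stratum").apply(
        lambda g: g.loc[g[treatment] == 1, outcome].mean()
                  - g.loc[g[treatment] == 0, outcome].mean())
    weights = df.groupby("stratum").size() / len(df)
    return float((diffs * weights).sum())
```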

4,948 citations


Journal ArticleDOI
TL;DR: A number of methods of extracting estimates of the log hazard ratio and its variance in a variety of situations are presented to improve the efficiency and reliability of meta-analyses of the published literature with survival-type endpoints.
Abstract: Meta-analyses aim to provide a full and comprehensive summary of related studies which have addressed a similar question. When the studies involve time to event (survival-type) data the most appropriate statistics to use are the log hazard ratio and its variance. However, these are not always explicitly presented for each study. In this paper a number of methods of extracting estimates of these statistics in a variety of situations are presented. Use of these methods should improve the efficiency and reliability of meta-analyses of the published literature with survival-type endpoints.
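
Two of the simpler extraction routes discussed in this literature can be sketched as follows (an illustration, not the paper's complete set of methods): recovering the log hazard ratio and its variance either from a reported hazard ratio with a confidence interval, or from a reported log-rank observed-minus-expected statistic with its variance.

```python
import numpy as np
from scipy.stats import norm

def loghr_from_ci(hr, lower, upper, conf_level=0.95):
    """Log hazard ratio and its variance from a reported HR and confidence interval."""
    z = norm.ppf(1 - (1 - conf_level) / 2)
    return np.log(hr), ((np.log(upper) - np.log(lower)) / (2 * z)) ** 2

def loghr_from_o_minus_e(o_minus_e, v):
    """Log hazard ratio and its variance from a reported observed-minus-expected
    number of events (O - E) in the research arm and its variance V."""
    return o_minus_e / v, 1.0 / v
```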

3,998 citations


Journal ArticleDOI
TL;DR: Criteria appropriate to the evaluation of proposed interval estimate methods for proportions include: closeness of the achieved coverage probability to its nominal value; whether intervals are located too close to or too distant from the middle of the scale; expected interval width; avoidance of aberrations such as limits outside [0,1] or zero width intervals; and ease of use.
Abstract: Simple interval estimate methods for proportions exhibit poor coverage and can produce evidently inappropriate intervals. Criteria appropriate to the evaluation of various proposed methods include: closeness of the achieved coverage probability to its nominal value; whether intervals are located too close to or too distant from the middle of the scale; expected interval width; avoidance of aberrations such as limits outside [0,1] or zero width intervals; and ease of use, whether by tables, software or formulae. Seven methods for the single proportion are evaluated on 96,000 parameter space points. Intervals based on tail areas and the simpler score methods are recommended for use. In each case, methods are available that aim to align either the minimum or the mean coverage with the nominal 1 - alpha.
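
As an illustration of the score approach recommended here, a minimal sketch of the Wilson score interval for a single proportion r/n (no continuity correction):

```python
from scipy.stats import norm

def wilson_score_interval(r, n, conf_level=0.95):
    """Wilson score interval for a single proportion r/n."""
    z = norm.ppf(1 - (1 - conf_level) / 2)
    p = r / n
    centre = (p + z ** 2 / (2 * n)) / (1 + z ** 2 / n)
    half = (z / (1 + z ** 2 / n)) * (p * (1 - p) / n + z ** 2 / (4 * n ** 2)) ** 0.5
    return centre - half, centre + half

# e.g. wilson_score_interval(0, 20) gives (0.0, about 0.16) rather than a zero-width interval
```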

3,825 citations


Journal ArticleDOI
TL;DR: A functional approximation to earlier exact results is shown to have excellent agreement with the exact results and one can use it easily without intensive numerical computation.
Abstract: A method is developed to calculate the required number of subjects k in a reliability study, where reliability is measured using the intraclass correlation rho. The method is based on a functional approximation to earlier exact results. The approximation is shown to have excellent agreement with the exact results and one can use it easily without intensive numerical computation. Optimal design configurations are also discussed; for reliability values of about 40 per cent or higher, use of two or three observations per subject will minimize the total number of observations required.

1,795 citations


Journal ArticleDOI
TL;DR: Two new approaches which also avoid aberrations are developed and evaluated, and a tail area profile likelihood based method produces the best coverage properties, but is difficult to calculate for large denominators.
Abstract: Several existing unconditional methods for setting confidence intervals for the difference between binomial proportions are evaluated. Computationally simpler methods are prone to a variety of aberrations and poor coverage properties. The closely interrelated methods of Mee and Miettinen and Nurminen perform well but require a computer program. Two new approaches which also avoid aberrations are developed and evaluated. A tail area profile likelihood based method produces the best coverage properties, but is difficult to calculate for large denominators. A method combining Wilson score intervals for the two proportions to be compared also performs well, and is readily implemented irrespective of sample size.
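
A sketch of the last approach mentioned, combining Wilson score intervals for the two proportions ("square and add"); this is an illustration under the usual independent-samples setup, with the single-proportion interval repeated for self-containment.

```python
from scipy.stats import norm

def wilson_interval(r, n, conf_level=0.95):
    z = norm.ppf(1 - (1 - conf_level) / 2)
    p = r / n
    centre = (p + z ** 2 / (2 * n)) / (1 + z ** 2 / n)
    half = (z / (1 + z ** 2 / n)) * (p * (1 - p) / n + z ** 2 / (4 * n ** 2)) ** 0.5
    return centre - half, centre + half

def diff_interval_square_and_add(r1, n1, r2, n2, conf_level=0.95):
    """Confidence interval for p1 - p2 from two independent samples, built by
    combining the two Wilson score intervals."""
    p1, p2 = r1 / n1, r2 / n2
    l1, u1 = wilson_interval(r1, n1, conf_level)
    l2, u2 = wilson_interval(r2, n2, conf_level)
    d = p1 - p2
    return (d - ((p1 - l1) ** 2 + (u2 - p2) ** 2) ** 0.5,
            d + ((u1 - p1) ** 2 + (p2 - l2) ** 2) ** 0.5)
```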

1,634 citations


Journal ArticleDOI
TL;DR: To update the British growth reference, anthropometric data for weight, height, body mass index and head circumference from 17 distinct surveys representative of England, Scotland and Wales were analysed by maximum penalized likelihood using the LMS method.
Abstract: To update the British growth reference, anthropometric data for weight, height, body mass index (weight/height²) and head circumference from 17 distinct surveys representative of England, Scotland and Wales (37,700 children, age range 23 weeks gestation to 23 years) were analysed by maximum penalized likelihood using the LMS method. This estimates the measurement centiles in terms of three age-sex-specific cubic spline curves: the L curve (Box-Cox power to remove skewness), M curve (median) and S curve (coefficient of variation). A two-stage fitting procedure was developed to model the age trends in median weight and height, and simulation was used to estimate confidence intervals for the fitted centiles. The reference converts measurements to standard deviation scores (SDS) that are very close to Normally distributed - the means, medians and skewness for the four measurements are effectively zero overall, with standard deviations very close to one and only slight evidence of positive kurtosis beyond ±2 SDS. The ability to express anthropometry as SDS greatly simplifies growth assessment.
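
For reference, the SDS conversion implied by the fitted L, M and S curves can be written in a few lines (a sketch; the L, M and S values must be looked up for the child's age and sex):

```python
import numpy as np

def lms_sds(y, L, M, S):
    """Convert a measurement y to a standard deviation score given the
    age- and sex-specific L (Box-Cox power), M (median) and S (coefficient
    of variation) values."""
    if abs(L) < 1e-12:
        return np.log(y / M) / S              # limiting form as L -> 0
    return ((y / M) ** L - 1.0) / (L * S)
```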

1,105 citations


Journal ArticleDOI
TL;DR: This paper suggests use of sample size formulae for comparing means or for comparing proportions in order to calculate the required sample size for a simple logistic regression model.
Abstract: A sample size calculation for logistic regression involves complicated formulae. This paper suggests use of sample size formulae for comparing means or for comparing proportions in order to calculate the required sample size for a simple logistic regression model. One can then adjust the required sample size for a multiple logistic regression model by a variance inflation factor. This method requires no assumption of low response probability in the logistic model as in a previous publication. One can similarly calculate the sample size for linear regression models. This paper also compares the accuracy of some existing sample-size software for logistic regression with computer power simulations. An example illustrates the methods.
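
A sketch of the general idea for a single binary covariate, using the standard two-proportion sample size formula followed by a variance inflation adjustment for a multiple logistic model; the paper's own formulae differ in detail, and rho here denotes the multiple correlation of the covariate of interest with the remaining covariates.

```python
from scipy.stats import norm

def n_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Per-group sample size for comparing two independent proportions
    (standard normal-approximation formula)."""
    za, zb = norm.ppf(1 - alpha / 2), norm.ppf(power)
    return (za + zb) ** 2 * (p1 * (1 - p1) + p2 * (1 - p2)) / (p1 - p2) ** 2

def inflate_for_multiple_logistic(n_simple, rho_sq):
    """Variance inflation adjustment: divide the simple-model sample size by 1 - rho^2."""
    return n_simple / (1.0 - rho_sq)

# e.g. event probabilities 0.10 vs 0.20 at the two covariate levels, rho^2 = 0.2:
# inflate_for_multiple_logistic(n_two_proportions(0.10, 0.20), 0.2)
```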

963 citations


Journal ArticleDOI
TL;DR: Two new algorithms for fitting binormal ROC curves to continuously-distributed data are developed: a true ML algorithm (LABROC4) and a quasi-ML algorithm (LABROC5) that requires substantially less computation with large data sets.
Abstract: We show that truth-state runs in rank-ordered data constitute a natural categorization of continuously-distributed test results for maximum likelihood (ML) estimation of ROC curves. On this basis, we develop two new algorithms for fitting binormal ROC curves to continuously-distributed data: a true ML algorithm (LABROC4) and a quasi-ML algorithm (LABROC5) that requires substantially less computation with large data sets. Simulation studies indicate that both algorithms produce reliable estimates of the binormal ROC curve parameters a and b, the ROC-area index Az, and the standard errors of those estimates.
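
For reference, the binormal ROC curve that both algorithms fit is fully determined by the parameters a and b; a minimal sketch of the curve and the area index Az:

```python
import numpy as np
from scipy.stats import norm

def binormal_tpf(a, b, fpf):
    """True-positive fraction of the binormal ROC curve at false-positive fraction fpf."""
    return norm.cdf(a + b * norm.ppf(fpf))

def binormal_auc(a, b):
    """Area under the binormal ROC curve, the Az index."""
    return norm.cdf(a / np.sqrt(1.0 + b ** 2))
```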

865 citations


Journal ArticleDOI
TL;DR: It is concluded that the test of heterogeneity should not be the sole determinant of model choice in meta-analysis, and inspection of relevant normal plots, as well as clinical insight, may be more relevant to both the investigation and modelling of heterogeneity.
Abstract: The investigation of heterogeneity is a crucial part of any meta-analysis. While it has been stated that the test for heterogeneity has low power, this has not been well quantified. Moreover the assumptions of normality implicit in the standard methods of meta-analysis are often not scrutinized in practice. Here we simulate how the power of the test for heterogeneity depends on the number of studies included, the total information (that is total weight or inverse variance) available and the distribution of weights among the different studies. We show that the power increases with the total information available rather than simply the number of studies, and that it is substantially lowered if, as is quite common in practice, one study comprises a large proportion of the total information. We also describe normal plots that are useful in assessing whether the data conform to a fixed effect or random effects model, together with appropriate tests, and give an application to the analysis of a multi-centre trial of blood pressure reduction. We conclude that the test of heterogeneity should not be the sole determinant of model choice in meta-analysis, and inspection of relevant normal plots, as well as clinical insight, may be more relevant to both the investigation and modelling of heterogeneity.
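
A small simulation sketch of the kind of power calculation described, assuming normally distributed study estimates with known within-study variances and between-study variance tau2 (all names and defaults are illustrative):

```python
import numpy as np
from scipy.stats import chi2

def q_test_power(v_within, tau2, n_sim=10000, alpha=0.05, seed=0):
    """Simulated power of Cochran's Q test for heterogeneity."""
    rng = np.random.default_rng(seed)
    v = np.asarray(v_within, float)
    w = 1.0 / v
    crit = chi2.ppf(1 - alpha, df=len(v) - 1)
    rejections = 0
    for _ in range(n_sim):
        y = rng.normal(0.0, np.sqrt(v + tau2))      # study estimates around a common mean
        ybar = np.sum(w * y) / np.sum(w)
        rejections += np.sum(w * (y - ybar) ** 2) > crit
    return rejections / n_sim

# e.g. ten equally informative studies: q_test_power([0.1] * 10, tau2=0.05)
```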

639 citations


Journal ArticleDOI
TL;DR: The paper takes the reader through the relevant practicalities of model fitting, interpretation and criticism and demonstrates that, in a simple case such as this, analyses based upon these model-based approaches produce reassuringly similar inferences to standard analyses based upon more conventional methods.
Abstract: Much of the research in epidemiology and clinical science is based upon longitudinal designs which involve repeated measurements of a variable of interest in each of a series of individuals. Such designs can be very powerful, both statistically and scientifically, because they enable one to study changes within individual subjects over time or under varied conditions. However, this power arises because the repeated measurements tend to be correlated with one another, and this must be taken into proper account at the time of analysis or misleading conclusions may result. Recent advances in statistical theory and in software development mean that studies based upon such designs can now be analysed more easily, in a valid yet flexible manner, using a variety of approaches which include the use of generalized estimating equations, and mixed models which incorporate random effects. This paper provides a particularly simple illustration of the use of these two approaches, taking as a practical example the analysis of a study which examined the response of portable peak expiratory flow meters to changes in true peak expiratory flow in 12 children with asthma. The paper takes the reader through the relevant practicalities of model fitting, interpretation and criticism and demonstrates that, in a simple case such as this, analyses based upon these model-based approaches produce reassuringly similar inferences to standard analyses based upon more conventional methods.
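
A sketch of the two approaches using statsmodels on synthetic repeated-measures data (the peak flow data themselves are not reproduced here): a GEE with an exchangeable working correlation, and a linear mixed model with a random intercept per subject.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# synthetic data: 12 subjects, 4 measurements each, subject-specific intercepts
rng = np.random.default_rng(0)
n_subj, n_rep = 12, 4
subj = np.repeat(np.arange(n_subj), n_rep)
x = rng.normal(size=n_subj * n_rep)
y = 1.0 + 0.5 * x + rng.normal(size=n_subj)[subj] + rng.normal(scale=0.5, size=n_subj * n_rep)
df = pd.DataFrame({"id": subj, "x": x, "y": y})

# generalized estimating equations with an exchangeable working correlation
gee_fit = smf.gee("y ~ x", groups="id", data=df,
                  cov_struct=sm.cov_struct.Exchangeable(),
                  family=sm.families.Gaussian()).fit()

# linear mixed model with a random intercept for each subject
lmm_fit = smf.mixedlm("y ~ x", data=df, groups=df["id"]).fit()

print(gee_fit.params)
print(lmm_fit.params)
```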

627 citations


Journal ArticleDOI
TL;DR: An adaptive dose escalation scheme for use in cancer phase I clinical trials that makes use of all the information available at the time of each dose assignment, and directly addresses the ethical need to control the probability of overdosing is described.
Abstract: We describe an adaptive dose escalation scheme for use in cancer phase I clinical trials. The method is fully adaptive, makes use of all the information available at the time of each dose assignment, and directly addresses the ethical need to control the probability of overdosing. It is designed to approach the maximum tolerated dose as fast as possible subject to the constraint that the predicted proportion of patients who receive an overdose does not exceed a specified value. We conducted simulations to compare the proposed method with four up-and-down designs, two stochastic approximation methods, and with a variant of the continual reassessment method. The results showed the proposed method effective as a means to control the frequency of overdosing. Relative to the continual reassessment method, our scheme overdosed a smaller proportion of patients, exhibited fewer toxicities and estimated the maximum tolerated dose with comparable accuracy. When compared to the non-parametric schemes, our method treated fewer patients at either subtherapeutic or severely toxic dose levels, treated more patients at optimal dose levels and estimated the maximum tolerated dose with smaller average bias and mean squared error. Hence, the proposed method is a promising alternative to currently used cancer phase I clinical trial designs.
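
The overdose-control allocation step can be sketched as follows, assuming posterior draws of the maximum tolerated dose are available from some dose-toxicity model (not shown); this illustrates the general escalation-with-overdose-control idea rather than the paper's exact implementation.

```python
import numpy as np

def next_dose(mtd_posterior_draws, dose_grid, feasibility_bound=0.25):
    """Assign the highest available dose not exceeding the feasibility-bound
    quantile of the posterior distribution of the MTD, so that the predicted
    probability of overdosing the next patient stays at or below the bound."""
    dose_grid = np.sort(np.asarray(dose_grid, float))
    limit = np.quantile(mtd_posterior_draws, feasibility_bound)
    allowed = dose_grid[dose_grid <= limit]
    return float(allowed[-1]) if allowed.size else float(dose_grid[0])
```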

Journal ArticleDOI
TL;DR: Better methods for setting confidence intervals for the difference theta between binomial proportions based on paired data are described, based on the profile likelihood obtained by conditionally maximizing the proportion of discordant pairs.
Abstract: Existing methods for setting confidence intervals for the difference theta between binomial proportions based on paired data perform inadequately. The asymptotic method can produce limits outside the range of validity. The 'exact' conditional method can yield an interval which is effectively only one-sided. Both these methods also have poor coverage properties. Better methods are described, based on the profile likelihood obtained by conditionally maximizing the proportion of discordant pairs. A refinement (methods 5 and 6) which aligns 1-alpha with an aggregate of tail areas produces appropriate coverage properties. A computationally simpler method based on the score interval for the single proportion also performs well (method 10).

Journal ArticleDOI
TL;DR: The purpose of this tutorial is to illustrate and compare available methods which correctly treat the data as being interval-censored, and to allow those familiar with the application of standard survival analysis techniques the option of applying appropriate methods when presented with interval-censored data.
Abstract: In standard time-to-event or survival analysis, occurrence times of the event of interest are observed exactly or are right-censored, meaning that it is only known that the event occurred after the last observation time. There are numerous methods available for estimating the survival curve and for testing and estimation of the effects of covariates in this context. In some situations, however, the times of the events of interest may only be known to have occurred within an interval of time. In clinical trials, for example, patients are often seen at pre-scheduled visits but the event of interest may occur in between visits. These data are interval-censored. Owing to the lack of well-known statistical methodology and available software, a common ad hoc approach is to assume that the event occurred at the end (or beginning or midpoint) of each interval, and then apply methods for standard time-to-event data. However, this approach can lead to invalid inferences, and in particular will tend to underestimate the standard errors of the estimated parameters. The purpose of this tutorial is to illustrate and compare available methods which correctly treat the data as being interval-censored. It is not meant to be a full review of all existing methods, but only those which are available in standard statistical software, or which can be easily programmed. All approaches will be illustrated on two data sets and compared with methods which ignore the interval-censored nature of the data. We hope this tutorial will allow those familiar with the application of standard survival analysis techniques the option of applying appropriate methods when presented with interval-censored data.
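
One of the simplest correct treatments, a parametric (here Weibull) maximum likelihood fit in which each subject contributes the probability that the event time falls in its observed interval, can be sketched with scipy; right-censoring is encoded as an infinite right endpoint. This is an illustrative special case, not the full range of methods covered in the tutorial.

```python
import numpy as np
from scipy.optimize import minimize

def weibull_interval_mle(left, right):
    """Weibull ML fit to interval-censored data: each subject contributes
    S(left) - S(right); use right = np.inf for right-censored subjects."""
    left = np.asarray(left, float)
    right = np.asarray(right, float)

    def surv(t, shape, scale):
        s = np.zeros_like(t)
        finite = np.isfinite(t)
        s[finite] = np.exp(-((t[finite] / scale) ** shape))   # S(inf) = 0
        return s

    def negloglik(log_params):
        shape, scale = np.exp(log_params)                      # keep parameters positive
        p = surv(left, shape, scale) - surv(right, shape, scale)
        return -np.sum(np.log(np.clip(p, 1e-300, None)))

    start_scale = max(float(np.median(np.where(np.isfinite(right), right, left))), 1e-6)
    fit = minimize(negloglik, x0=np.log([1.0, start_scale]), method="Nelder-Mead")
    shape, scale = np.exp(fit.x)
    return shape, scale
```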

Journal ArticleDOI
TL;DR: The Ohio data set has been of particular interest because of the suggestion that a nuclear facility in the southwest of the state may have caused increased levels of lung cancer there, but it is argued here that the data are inadequate for a proper investigation of this issue.
Abstract: This paper combines existing models for longitudinal and spatial data in a hierarchical Bayesian framework, with particular emphasis on the role of time- and space-varying covariate effects. Data analysis is implemented via Markov chain Monte Carlo methods. The methodology is illustrated by a tentative re-analysis of Ohio lung cancer data 1968-1988. Two approaches that adjust for unmeasured spatial covariates, particularly tobacco consumption, are described. The first includes random effects in the model to account for unobserved heterogeneity; the second adds a simple urbanization measure as a surrogate for smoking behaviour. The Ohio data set has been of particular interest because of the suggestion that a nuclear facility in the southwest of the state may have caused increased levels of lung cancer there. However, we contend here that the data are inadequate for a proper investigation of this issue.

Journal ArticleDOI
TL;DR: The proposed artificial neural network (ANN) approach can be applied for the estimation of the functional relationships between covariates and time in survival data to improve model predictivity in the presence of complex prognostic relationships.
Abstract: Flexible modelling in survival analysis can be useful both for exploratory and predictive purposes. Feed forward neural networks were recently considered for flexible non-linear modelling of censored survival data through the generalization of both discrete and continuous time models. We show that by treating the time interval as an input variable in a standard feed forward network with logistic activation and entropy error function, it is possible to estimate smoothed discrete hazards as conditional probabilities of failure. We considered an easily implementable approach with a fast selection criterion for the best configurations. Examples on data sets from two clinical trials are provided. The proposed artificial neural network (ANN) approach can be applied for the estimation of the functional relationships between covariates and time in survival data to improve model predictivity in the presence of complex prognostic relationships.
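
The key data step, treating the time interval as an extra input and fitting a network to the resulting person-period records, can be sketched as follows; sklearn's MLPClassifier stands in for the paper's specific logistic-activation network, and all variable names are hypothetical.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def person_period(X, last_interval, event):
    """Expand each subject into one record per time interval at risk, appending
    the interval index as an extra input; the label marks failure in that interval."""
    rows, labels = [], []
    for x, t, d in zip(X, last_interval, event):
        for j in range(int(t) + 1):
            rows.append(np.append(x, j))
            labels.append(1 if (d == 1 and j == t) else 0)
    return np.asarray(rows), np.asarray(labels)

# Hypothetical usage, with X covariates, last_interval the index of the final
# interval observed and event = 1 for failure there, 0 for censoring:
# Xpp, ypp = person_period(X, last_interval, event)
# net = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000).fit(Xpp, ypp)
# smoothed discrete hazards: net.predict_proba(Xpp)[:, 1]
```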

Journal ArticleDOI
TL;DR: Two random-effects approaches are proposed for the regression meta-analysis of multiple correlated outcomes, and their use is compared with fixed-effects models and with separate-outcomes models in a meta-analysis of periodontal clinical trials.
Abstract: Earlier work showed how to perform fixed-effects meta-analysis of studies or trials when each provides results on more than one outcome per patient and these multiple outcomes are correlated. That fixed-effects generalized-least-squares approach analyzes the multiple outcomes jointly within a single model, and it can include covariates, such as duration of therapy or quality of trial, that may explain observed heterogeneity of results among the trials. Sometimes the covariates explain all the heterogeneity, and the fixed-effects regression model is appropriate. However, unexplained heterogeneity may often remain, even after taking into account known or suspected covariates. Because fixed-effects models do not make allowance for this remaining unexplained heterogeneity, the potential exists for bias in estimated coefficients, standard errors and p-values. We propose two random-effects approaches for the regression meta-analysis of multiple correlated outcomes. We compare their use with fixed-effects models and with separate-outcomes models in a meta-analysis of periodontal clinical trials. A simulation study shows the advantages of the random-effects approach. These methods also facilitate meta-analysis of trials that compare more than two treatments.

Journal ArticleDOI
TL;DR: A method of sequential analysis for randomized clinical trials that allows use of all prior data in a trial to determine the use and weighting of subsequent observations is presented.
Abstract: I present a method of sequential analysis for randomized clinical trials that allows use of all prior data in a trial to determine the use and weighting of subsequent observations. One continues to assign subjects until one has 'used up' all the variance of the test statistic. There are many strategies to determine the weights including Bayesian methods (though the proposal is a frequentist design). I explore further the self-designing aspect of the randomized trial to note that in some cases it makes good sense (i) to change the weighting on components of a multivariate endpoint, (ii) to add or drop treatment arms (especially in a parallel group dose ranging/efficacy/safety trial), (iii) to select sites to use as the trial goes on, (iv) to change the test statistic and (v) even to rethink the whole drug development paradigm to shorten drug development time while keeping current standards for the level of evidence necessary for approval.

Journal ArticleDOI
TL;DR: This example concerning mastitis in dairy cows is exceptional in that from a simple plot of the data two outlying observations can be identified that are the source of the apparent evidence for non-random dropout and also provide an explanation of the behaviour of the sensitivity analysis.
Abstract: The outcome-based selection model of Diggle and Kenward for repeated measurements with non-random dropout is applied to a very simple example concerning the occurrence of mastitis in dairy cows, in which the occurrence of mastitis can be modelled as a dropout process. It is shown through sensitivity analysis how the conclusions concerning the dropout mechanism depend crucially on untestable distributional assumptions. This example is exceptional in that from a simple plot of the data two outlying observations can be identified that are the source of the apparent evidence for non-random dropout and also provide an explanation of the behaviour of the sensitivity analysis. It is concluded that a plausible model for the data does not require the assumption of non-random dropout.

Journal ArticleDOI
TL;DR: This study compares eight different risk-adjustment methods as applied to a CABG surgery population of 28 providers to create a common metric by which to display the results of these various risk-adjustment methodologies with regard to dichotomous outcomes such as in-hospital mortality.
Abstract: Risk-adjustment and provider profiling have become common terms as the medical profession attempts to measure quality and assess value in health care. One of the areas of care most thoroughly developed in this regard is quality assessment for coronary artery bypass grafting (CABG). Because in-hospital mortality following CABG has been studied extensively, risk-adjustment mechanisms are already being used in this area for provider profiling. This study compares eight different risk-adjustment methods as applied to a CABG surgery population of 28 providers. Five of the methods use an external risk-adjustment algorithm developed in an independent population, while the other three rely on an internally developed logistic model. The purposes of this study are to: (i) create a common metric by which to display the results of these various risk-adjustment methodologies with regard to dichotomous outcomes such as in-hospital mortality, and (ii) to compare how these risk-adjustment methods quantify the 'outlier' standing of providers. Section 2 describes the data, the external and internal risk-adjustment algorithms, and eight approaches to provider profiling. Section 3 then demonstrates the results of applying these methods on a data set specifically collected for quality improvement.
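
As a simple illustration of one way to put providers on a common metric (not one of the paper's eight methods specifically), observed deaths can be set against the expected deaths obtained by summing patient-level predicted probabilities from a risk model; all column names are hypothetical.

```python
import numpy as np
import pandas as pd
from scipy.stats import binom

def provider_profile(df, provider="provider", died="died", p_hat="p_hat"):
    """Observed versus expected in-hospital deaths per provider, where p_hat holds
    patient-level predicted mortality probabilities from some risk model."""
    rows = []
    for prov, g in df.groupby(provider):
        n = len(g)
        observed = int(g[died].sum())
        expected = float(g[p_hat].sum())
        # crude flag: chance of at least `observed` deaths if every patient had
        # the provider's average predicted risk
        p_high = binom.sf(observed - 1, n, expected / n)
        rows.append({"provider": prov, "n": n, "observed": observed,
                     "expected": round(expected, 1),
                     "O_to_E": round(observed / expected, 2) if expected > 0 else np.nan,
                     "p_high": round(p_high, 4)})
    return pd.DataFrame(rows)
```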

Journal ArticleDOI
TL;DR: The significant correlation between the CR and the TE suggests that, rather than merely pooling the TE into a single summary estimate, investigators should search for the causes of heterogeneity related to patient characteristics and treatment protocols to determine when treatment is most beneficial and that they should plan to study this heterogeneity in clinical trials.
Abstract: If the control rate (CR) in a clinical trial represents the incidence or the baseline severity of illness in the study population, the size of treatment effects may tend to vary with the size of control rates. To investigate this hypothesis, we examined 115 meta-analyses covering a wide range of medical applications for evidence of a linear relationship between the CR and three treatment effect (TE) measures: the risk difference (RD), the log relative risk (RR), and the log odds ratio (OR). We used a hierarchical model that estimates the true regression while accounting for the random error in the measurement of, and the functional dependence between, the observed TE and the CR. Using a two standard error rule of significance, we found the control rate was about two times more likely to be significantly related to the RD (31 per cent) than to the RR (13 per cent) or the OR (14 per cent). Correlations between TE and CR were more likely when the meta-analysis included 10 or more trials and if patient follow-up was less than six months and homogeneous. Use of weighted linear regression (WLR) of the observed TE on the observed CR instead of the hierarchical model underestimated standard errors and overestimated the number of significant results by a factor of two. The significant correlation between the CR and the TE suggests that, rather than merely pooling the TE into a single summary estimate, investigators should search for the causes of heterogeneity related to patient characteristics and treatment protocols to determine when treatment is most beneficial, and that they should plan to study this heterogeneity in clinical trials.

Journal ArticleDOI
TL;DR: This paper will serve as an introduction to the problem of missing QOL data in cancer clinical trials and provide an estimation of its magnitude, and approaches to its prevention and solution.
Abstract: Measurement of quality of life (QOL) in cancer clinical trials has increased in recent years as more groups realize the importance of such endpoints. A key problem has been missing data. Some QOL data may unavoidably be missing, as for example when patients are too ill to complete forms. Other important sources are potentially avoidable and can broadly be divided into three categories: (i) methodological factors; (ii) logistic and administrative factors; (iii) patient-related factors. Logistic and administrative factors, for example, staff oversights, have proven to be most important. Since most QOL measurements require patient self-report, it is usually not possible to rectify the failure to collect baseline data or any follow-up assessments. There is strong evidence that such data are not 'missing at random', and cannot be ignored without introducing bias. Although several approaches to the analysis of partly missing data have been described, none is entirely satisfactory. Prevention of avoidable missing data is better than attempted cure. In July 1996, an international conference on missing QOL data in cancer clinical trials reported the experience of most major groups involved. This paper will serve as an introduction to the problem and provide an estimation of its magnitude, and approaches to its prevention and solution.

Journal ArticleDOI
TL;DR: Model selection criteria such as the Akaike information criterion (AIC) are applied to choose among generalized linear models when checking for a difference in T4 cell counts between two disease groups.
Abstract: When testing for a treatment effect or a difference among groups, the distributional assumptions made about the response variable can have a critical impact on the conclusions drawn. For example, controversy has arisen over transformations of the response (Keene). An alternative approach is to use some member of the family of generalized linear models. However, this raises the issue of selecting the appropriate member, a problem of testing non-nested hypotheses. Standard model selection criteria, such as the Akaike information criterion (AIC), can be used to resolve such problems. These procedures for comparing generalized linear models are applied to checking for a difference in T4 cell counts between two disease groups. We conclude that appropriate model selection criteria should be specified in the protocol for any study, including clinical trials, in order that optimal inferences can be drawn about treatment differences.
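
A sketch of this kind of comparison with statsmodels on synthetic positive-valued responses: fit several candidate generalized linear models and compare their AIC values (the families and links shown are illustrative choices, not necessarily those considered in the paper).

```python
import numpy as np
import statsmodels.api as sm

# synthetic positive responses for two groups (illustrative, not the T4 data)
rng = np.random.default_rng(1)
group = np.repeat([0, 1], 20)
y = rng.gamma(shape=4.0, scale=100.0 * (1.0 + 0.3 * group))
X = sm.add_constant(group)

candidates = {
    "gaussian, identity link": sm.families.Gaussian(),
    "gamma, log link": sm.families.Gamma(link=sm.families.links.Log()),
    "inverse Gaussian, log link": sm.families.InverseGaussian(link=sm.families.links.Log()),
}
for name, family in candidates.items():
    fit = sm.GLM(y, X, family=family).fit()
    print(f"{name:28s} AIC = {fit.aic:9.1f}   group effect = {fit.params[1]:.3f}")
```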

Journal ArticleDOI
TL;DR: This paper adapted the generalized estimating equation (GEE) approach of Liang and Zeger to sample size calculations for discrete and continuous outcome variables, and used the damped exponential family of correlation structures described in Munoz et al. for the working correlation matrix among the repeated measures.
Abstract: Derivation of the minimum sample size is an important consideration in an applied research effort. When the outcome is measured at a single time point, sample size procedures are well known and widely applied. The corresponding situation for longitudinal designs, however, is less well developed. In this paper, we adapt the generalized estimating equation (GEE) approach of Liang and Zeger to sample size calculations for discrete and continuous outcome variables. The non-central version of the Wald χ2 test is considered. We use the damped exponential family of correlation structures described in Munoz et al. for the ‘working’ correlation matrix among the repeated measures. We present a table of minimum sample sizes for binary outcomes, and discuss extensions that account for unequal allocation, staggered entry and loss to follow-up. © 1998 John Wiley & Sons, Ltd.
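
As a special case of such calculations (exchangeable correlation and a time-averaged comparison of a binary outcome, rather than the damped exponential family used in the paper), the single-time-point sample size can simply be multiplied by the usual design effect:

```python
from scipy.stats import norm

def n_repeated_binary(p1, p2, m, rho, alpha=0.05, power=0.80):
    """Subjects per group for comparing two proportions measured m times per
    subject under an exchangeable within-subject correlation rho; unequal
    allocation, staggered entry and loss to follow-up are ignored here."""
    za, zb = norm.ppf(1 - alpha / 2), norm.ppf(power)
    n_single = (za + zb) ** 2 * (p1 * (1 - p1) + p2 * (1 - p2)) / (p1 - p2) ** 2
    return n_single * (1 + (m - 1) * rho) / m

# e.g. four visits per subject and rho = 0.3: n_repeated_binary(0.25, 0.40, m=4, rho=0.3)
```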

Journal ArticleDOI
TL;DR: The critical consideration for studies with covariance analyses planned as the primary method for comparing treatments is the specification of the covariables in the protocol (or in an amendment or formal plan prior to any unmasking of the study).
Abstract: Analysis of covariance is an effective method for addressing two considerations for randomized clinical trials. One is reduction of variance for estimates of treatment effects and thereby the production of narrower confidence intervals and more powerful statistical tests. The other is the clarification of the magnitude of treatment effects through adjustment of corresponding estimates for any random imbalances between the treatment groups with respect to the covariables. The statistical basis of covariance analysis can be either non-parametric, with reliance only on the randomization in the study design, or parametric through a statistical model for a postulated sampling process. For non-parametric methods, there are no formal assumptions for how a response variable is related to the covariables, but strong correlation between response and covariables is necessary for variance reduction. Computations for these methods are straightforward through the application of weighted least squares to fit linear models to the differences between treatment groups for the means of the response variable and the covariables jointly with a specification that has null values for the differences that correspond to the covariables. Moreover, such analysis is similarly applicable to dichotomous indicators, ranks or integers for ordered categories, and continuous measurements. Since non-parametric covariance analysis can have many forms, the ones which are planned for a clinical trial need careful specification in its protocol. A limitation of non-parametric analysis is that it does not directly address the magnitude of treatment effects within subgroups based on the covariables or the homogeneity of such effects. For this purpose, a statistical model is needed. When the response criterion is dichotomous or has ordered categories, such a model may have a non-linear nature which determines how covariance adjustment modifies results for treatment effects. Insight concerning such modifications can be gained through their evaluation relative to non-parametric counterparts. Such evaluation usually indicates that alternative ways to compare treatments for a response criterion with adjustment for a set of covariables mutually support the same conclusion about the strength of treatment effects. This robustness is noteworthy since the alternative methods for covariance analysis have substantially different rationales and assumptions. Since findings can differ in important ways across alternative choices for covariables (as opposed to methods for covariance adjustment), the critical consideration for studies with covariance analyses planned as the primary method for comparing treatments is the specification of the covariables in the protocol (or in an amendment or formal plan prior to any unmasking of the study).

Journal ArticleDOI
TL;DR: The generalised F mixture model can relax the usual stronger distributional assumptions and allow the analyst to uncover structure in the data that might otherwise have been missed, illustrated by fitting the model to data from large-scale clinical trials with long follow-up of lymphoma patients.
Abstract: Cure rate estimation is an important issue in clinical trials for diseases such as lymphoma and breast cancer and mixture models are the main statistical methods. In the last decade, mixture models under different distributions, such as exponential, Weibull, log-normal and Gompertz, have been discussed and used. However, these models involve stronger distributional assumptions than is desirable and inferences may not be robust to departures from these assumptions. In this paper, a mixture model is proposed using the generalized F distribution family. Although this family is seldom used because of computational difficulties, it has the advantage of being very flexible and including many commonly used distributions as special cases. The generalised F mixture model can relax the usual stronger distributional assumptions and allow the analyst to uncover structure in the data that might otherwise have been missed. This is illustrated by fitting the model to data from large-scale clinical trials with long follow-up of lymphoma patients. Computational problems with the model and model selection methods are discussed. Comparison of maximum likelihood estimates with those obtained from mixture models under other distributions are included.

Journal ArticleDOI
TL;DR: It is shown that imputation using the mean of all observed items in the same subscale may be an inappropriate method for many of the items in quality of life questionnaires, and would result in biased or misleading estimates.
Abstract: Missing data has been a problem in many quality of life studies. This paper focuses upon the issues involved in handling forms which contain one or more missing items, and reviews the alternative procedures. One of the most widely practised approaches is imputation using the mean of all observed items in the same subscale. This, together with the related estimation of the subscale score, is based upon traditional psychometric approaches to scale design and analysis. We show that it may be an inappropriate method for many of the items in quality of life questionnaires, and would result in biased or misleading estimates. We provide examples of items and subscales which violate the psychometric foundations that underpin simple mean imputation. A checklist is proposed for examining the adequacy of simple imputation, and some alternative procedures are indicated.
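
For reference, the simple imputation procedure under discussion can be written in a few lines; the half-rule threshold shown is a common scoring convention and is an assumption here, not part of the paper.

```python
import numpy as np
import pandas as pd

def impute_subscale_mean(items: pd.DataFrame, min_answered=0.5):
    """Replace each missing item by the mean of the subject's observed items in the
    same subscale; subjects answering fewer than min_answered of the items keep
    the subscale missing."""
    answered = items.notna().mean(axis=1)
    row_means = items.mean(axis=1, skipna=True)
    imputed = items.apply(lambda col: col.fillna(row_means))
    imputed.loc[answered < min_answered] = np.nan
    return imputed
```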

Journal ArticleDOI
TL;DR: In this article, a score-based confidence interval for the difference of two proportions in clinical trials, case-control studies and sensitivity comparison studies of two laboratory tests is derived.
Abstract: This paper considers a model for the difference of two proportions in a paired or matched design of clinical trials, case-control studies and also sensitivity comparison studies of two laboratory tests. This model includes a parameter indicating both interpatient variability of response probabilities and their correlation. Under the proposed model, we derive a one-sided test for equivalence based upon the efficient score. Equivalence is defined here as not more than 100* per cent inferior. McNemar's test for significance is shown to be a special case of the proposed test. Further, a score-based confidence interval for the difference of two proportions is derived. One of the features of these methods is applicability to the 2×2 table with off-diagonal zero cells; all the McNemar type tests and confidence intervals published so far cannot apply to such data. A Monte Carlo simulation study shows that the proposed test has empirical significance levels closer to the nominal alpha-level than the other tests recently proposed and further that the proposed confidence interval has better empirical coverage probability than those of the four published methods. © 1998 John Wiley & Sons, Ltd.

Journal ArticleDOI
TL;DR: A unified overview of various analytic approaches to correct for non-compliance in randomized trials is provided and several new structural (causal) models are introduced: the coarse structural nested models, the non-nested marginal structural models and the continuous-time structural nested model, and their properties are compared with those of previously proposedStructural nested models.
Abstract: In randomized trials comparing a new therapy to standard therapy, the sharp null hypothesis of equivalent therapeutic efficacy does not imply the intent-to-treat null hypothesis of equal outcome distributions in the two treatment arms if non-compliance is present. As a consequence, the development of analytic methods that adjust for non-compliance is of particular importance in equivalence trials comparing a new therapy to standard therapy. This paper provides, in the context of equivalence trials, a unified overview of various analytic approaches to correct for non-compliance in randomized trials. The overview focuses on comparing and contrasting the plausibility, robustness, and strength of assumptions required by each method and their programming and computational burdens. In addition, several new structural (causal) models are introduced: the coarse structural nested models, the non-nested marginal structural models and the continuous-time structural nested models, and their properties are compared with those of previously proposed structural nested models. The fundamental assumption that allows us to correct for non-compliance is that the decision whether or not to continue to comply with assigned therapy at time t is random (that is, ignorable or explainable) conditional on the history up to t of measured pre- and time-dependent post-randomization prognostic factors. In the final sections of the paper, we consider how the consequences of violations of our assumption of conditionally ignorable non-compliance can be explored through a sensitivity analysis. Finally, the analytic methods described in this paper can also be used to estimate the causal effect of a time-varying treatment from observational data.

Journal ArticleDOI
TL;DR: The proposed method, which extends McNemar's test to the case where the observations are sampled in clusters, is a good alternative to Eliasziw and Donner's test when, in practice, little is known about the correlation pattern of the data.
Abstract: McNemar's test is often used to compare two proportions estimated from paired observations. We propose a method extending this to the case where the observations are sampled in clusters. The proposed method is simple to implement and makes no assumptions about the correlation structure. We conducted a Monte Carlo simulation study to compare the size and power of the proposed method with a test developed earlier by Eliasziw and Donner. In the presence of intracluster correlation, the size of McNemar's test can greatly exceed the nominal level. The size of Eliasziw and Donner's test is also inflated for some correlation patterns. The proposed method, on the other hand, is close to the nominal size for a variety of correlation patterns, although it is slightly less powerful than Eliasziw and Donner's procedure. The proposed method is a good alternative to Eliasziw and Donner's test when, in practice, little is known about the correlation pattern of the data.

Journal ArticleDOI
TL;DR: A rank-invariant non-parametric method of analysis is presented that is valid regardless of the number of response categories and related to the joint distribution of paired observations that makes it possible to measure separately the individual order-preserved categorical changes.
Abstract: Subjective judgements of complex variables are commonly recorded as ordered categorical data. The rank-invariant properties of such data are well known, and there are various statistical approaches to the analysis and modelling of ordinal data. This paper focuses on the non-additive property of ordered categorical data in the analysis of change. A rank-invariant non-parametric method of analysis is presented that is valid regardless of the number of response categories. The unique feature of this method is the augmented ranking approach that is related to the joint distribution of paired observations. This approach makes it possible to measure separately the individual order-preserved categorical changes, which are attributable to the group change, and the individual categorical changes that are not consistent with the pattern of group change. The method is applied to analysis of change in a three-point scale and in a visual analogue scale of continuous ordinal responses.