
Showing papers in "Statistics in Medicine in 1991"


Journal ArticleDOI
TL;DR: This paper provides an overview of methods for creating and analysing multiply-imputed data sets, and illustrates the dramatic improvements possible when using multiple rather than single imputation.
Abstract: Multiple imputation for non-response replaces each missing value by two or more plausible values. The values can be chosen to represent both uncertainty about the reasons for non-response and uncertainty about which values to impute assuming the reasons for non-response are known. This paper provides an overview of methods for creating and analysing multiply-imputed data sets, and illustrates the dramatic improvements possible when using multiple rather than single imputation. A major application of multiple imputation to public-use files from the 1970 census is discussed, and several exploratory studies related to health care that have used multiple imputation are described.

1,273 citations
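
As a rough illustration of the combining step for multiply-imputed data (Rubin's rules, not code taken from the paper), the sketch below pools m completed-data estimates of a single parameter; all numbers are invented.

```python
import numpy as np

def pool_rubin(estimates, variances):
    """Combine m completed-data estimates and their variances using Rubin's rules."""
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    m = len(estimates)
    qbar = estimates.mean()                     # pooled point estimate
    ubar = variances.mean()                     # within-imputation variance
    b = estimates.var(ddof=1)                   # between-imputation variance
    t = ubar + (1 + 1 / m) * b                  # total variance
    rel_increase = (1 + 1 / m) * b / ubar       # relative increase in variance due to non-response
    df = (m - 1) * (1 + 1 / rel_increase) ** 2  # Satterthwaite-type degrees of freedom
    return qbar, np.sqrt(t), df

# Hypothetical estimates of a regression coefficient from m = 5 imputed data sets.
est = [0.42, 0.45, 0.39, 0.44, 0.41]
var = [0.010, 0.011, 0.009, 0.010, 0.012]
print(pool_rubin(est, var))
```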


Journal ArticleDOI
TL;DR: A general parametric approach is presented, which utilizes efficient score statistics and Fisher's information, and relates this to different methods suggested by previous authors.
Abstract: Meta-analysis provides a systematic and quantitative approach to the summary of results from randomized studies. Whilst many authors have published actual meta-analyses concerning specific therapeutic questions, less has been published about comprehensive methodology. This article presents a general parametric approach, which utilizes efficient score statistics and Fisher's information, and relates this to different methods suggested by previous authors. Normally distributed, binary, ordinal and survival data are considered. Both the fixed effects and random effects model for treatments are described.

728 citations
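
The fixed-effect combination the abstract alludes to can be sketched as follows: each trial contributes an efficient score Z_i (e.g. O - E) and a Fisher information V_i for a common log odds ratio, and the pooled estimate is the ratio of their sums. The values below are invented for illustration.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical per-trial efficient scores (observed minus expected, O - E)
# and Fisher information values V for a common log odds ratio.
Z = np.array([4.2, -1.5, 6.8, 2.1])
V = np.array([10.0, 6.5, 14.2, 8.3])

theta = Z.sum() / V.sum()              # fixed-effect estimate of the log odds ratio
se = 1.0 / np.sqrt(V.sum())            # its standard error
ci = theta + np.array([-1, 1]) * norm.ppf(0.975) * se

# Simple heterogeneity statistic: sum of V_i (Z_i/V_i - theta)^2, roughly chi-squared on k-1 df.
q = np.sum(V * (Z / V - theta) ** 2)

print(f"log OR {theta:.3f}, 95% CI for OR {np.exp(ci).round(2)}, Q = {q:.2f}")
```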


Journal ArticleDOI
TL;DR: Standard errors are constructed for the parameters of all groups, without the need to select a baseline group; these can be regarded as relating to roughly independent parameters, so that groups can be compared efficiently without knowledge of the covariances.
Abstract: We discuss the problem of describing multiple group comparisons in survival analysis using the Cox model, and in matched case-control studies. The standard method of comparing the risk in each group with a baseline group is unsatisfactory because the standard errors and confidence limits relate to correlated parameters, all dependent on precision within the baseline group. We describe the construction of standard errors for the parameters of all groups, without the need to select a baseline group. These standard errors can be regarded as relating to roughly independent parameters, so that groups can be compared efficiently without knowledge of the covariances. The method should assist in graphical presentation of relative risks, and in the combination of results from published studies. Two examples are presented.

397 citations


Journal ArticleDOI
TL;DR: The statistical properties of an alternative estimator of biologic efficacy that avoids the potential selection bias inherent in a comparison of compliant subgroups are derived and illustrated in the analysis of a randomized community trial of the impact of vitamin A supplementation on children's mortality.
Abstract: We define ‘biologic efficacy’ as the effect of treatment for all persons who receive the therapeutic agent to which they were assigned. It measures the biologic action of treatment among compliant persons. In a randomized trial with one treatment and one placebo control, one can theoretically estimate efficacy by comparing persons who complete the treatment regimen with controls who similarly complete the control regimen. In practice, however, we make this comparison with reservation because a control protocol often presents a different challenge for compliance than does the treatment, so that the compliant subgroups are not comparable. Standard practice employs intent-to-treat comparisons in which one compares those randomized to treatment and control without reference to whether they actually received the treatment. Intent-to-treat comparisons estimate the ‘programmatic effectiveness’ of a treatment rather than its biologic efficacy. This paper introduces and derives the statistical properties of an alternative estimator of biologic efficacy that avoids the potential selection bias inherent in a comparison of compliant subgroups. The method applies to randomized trials with a dichotomous outcome measure, whether or not a placebo is given to the control group. The idea is to compare the compliers in the treatment group to an inferred control subgroup chosen to eliminate selection bias. The methodology was motivated by and is illustrated in the analysis of a randomized community trial of the impact of vitamin A supplementation on children's mortality.

335 citations
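
A minimal sketch of the idea as I read the abstract, under the assumption that treatment-arm non-compliers stand in for the unobservable control-arm non-compliers; all counts are hypothetical, and the authors' estimator and its variance may differ in detail.

```python
# Hypothetical counts from a two-arm trial with a dichotomous outcome (deaths).
n_treat_comp, d_treat_comp = 9000, 12         # treatment arm, compliers
n_treat_noncomp, d_treat_noncomp = 3000, 15   # treatment arm, non-compliers
n_control, d_control = 12000, 50              # whole control arm (compliance unobserved)

# By randomization, the compliance proportion in the treatment arm is assumed
# to apply to the control arm as well.
p_comply = n_treat_comp / (n_treat_comp + n_treat_noncomp)
r_treat_comp = d_treat_comp / n_treat_comp
r_treat_noncomp = d_treat_noncomp / n_treat_noncomp
r_control = d_control / n_control

# Inferred event rate among the control subgroup who *would have* complied:
# untreated non-compliers are assumed to behave like treatment-arm non-compliers.
r_control_comp = (r_control - (1 - p_comply) * r_treat_noncomp) / p_comply

efficacy = 1 - r_treat_comp / r_control_comp   # relative-risk-based efficacy among compliers
print(f"estimated biologic efficacy: {efficacy:.2%}")
```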


Journal ArticleDOI
TL;DR: A method is described which is based on low order polynomial curves (linear, quadratic or occasionally cubic), together with guidelines for when and how to apply a logarithmic transformation to the variable analysed, testing for departures from normality, and assessment of the adequacy of the reference range.
Abstract: Reference ranges which take time (such as age) into account are often required in medicine, but simple, systematic and efficient statistical methods for constructing them are lacking. A method is described which is based on low order polynomial curves (linear, quadratic or occasionally cubic), together with guidelines for when and how to apply a logarithmic transformation to the variable analysed, testing for departures from normality, and assessment of the adequacy of the reference range which is constructed from the regression line plus or minus a multiple of the standard deviation. Standard statistical packages may be used to carry out the calculations. The question of comparing two or more groups of patients is addressed. Three examples are discussed in detail.

299 citations
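
A minimal sketch of the basic recipe (a low-order polynomial regression on age, plus or minus a multiple of the residual standard deviation), on simulated data; the paper's guidelines on log transformation and normality checks are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
age = rng.uniform(20, 80, 300)
y = 1.0 + 0.02 * age - 0.0001 * age**2 + rng.normal(0, 0.15, age.size)  # simulated analyte

# Fit a quadratic regression of the (possibly log-transformed) variable on age.
coef = np.polyfit(age, y, deg=2)
fitted = np.polyval(coef, age)
resid_sd = np.std(y - fitted, ddof=3)        # residual SD (3 estimated coefficients)

# 95% age-specific reference range: fitted curve plus or minus 1.96 residual SDs.
grid = np.linspace(20, 80, 7)
centre = np.polyval(coef, grid)
lower, upper = centre - 1.96 * resid_sd, centre + 1.96 * resid_sd
for a, lo, hi in zip(grid, lower, upper):
    print(f"age {a:4.0f}: {lo:.2f} to {hi:.2f}")
```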


Journal ArticleDOI
TL;DR: A model-based approach developed by Cox is adapted for use in model validation, which allows identification of problematic predictor variables in the prediction model as well as influential observations in the validation data that adversely affect the fit of the model.
Abstract: This paper presents a comprehensive approach to the validation of logistic prediction models. It reviews measures of overall goodness-of-fit, and indices of calibration and refinement. Using a model-based approach developed by Cox, we adapt logistic regression diagnostic techniques for use in model validation. This allows identification of problematic predictor variables in the prediction model as well as influential observations in the validation data that adversely affect the fit of the model. In appropriate situations, recommendations are made for correction of models that provide poor fit.

280 citations
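
A hedged sketch of the Cox-style calibration check mentioned in the abstract: regress the validation outcomes on the linear predictor from the existing model, so that an intercept near 0 and a slope near 1 indicate good calibration. The data and the likelihood-ratio comparison below are illustrative, not the paper's worked example.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)

# Hypothetical validation sample: linear predictor from a previously fitted model,
# plus the observed binary outcomes (deliberately miscalibrated truth).
lin_pred = rng.normal(-1.0, 1.2, 500)                            # eta = logit of predicted risk
y = rng.binomial(1, 1 / (1 + np.exp(-(0.3 + 0.8 * lin_pred))))

# Cox-style recalibration model: logit P(y = 1) = a + b * eta.
recal = sm.Logit(y, sm.add_constant(lin_pred)).fit(disp=0)
a, b = recal.params
print(f"calibration intercept {a:.2f} (ideal 0), slope {b:.2f} (ideal 1)")

# Likelihood-ratio comparison of perfect calibration (a = 0, b = 1) against the refit.
loglik_perfect = np.sum(y * lin_pred - np.log1p(np.exp(lin_pred)))
lr = 2 * (recal.llf - loglik_perfect)
print(f"LR statistic for (0, 1): {lr:.2f} on 2 df")
```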


Journal ArticleDOI
TL;DR: The problem of the definition of actual treatment is investigated in the context of a recent clinical trial; the resulting as-treated analyses gave results that were at times inconsistent or counter-intuitive, and that neither helped to confirm nor further explained the intention to treat analysis.
Abstract: The primary analysis of a randomized clinical trial should compare patients in their randomly assigned treatment groups (intention to treat analysis). When a substantial number of subjects fail to take a prescribed medication or are switched to a different study medication, it is tempting to consider treatment comparisons using only those subjects with treatment as actually received rather than as prescribed. There are several arguments against this approach: the prognostic balance brought about by randomization is likely to be disturbed; sample size will be reduced; and the validity of the statistical test procedures will be undermined. Further, results of analysis by treatment actually received may suffer from a bias introduced by using compliance, a factor often related to outcome independently of the treatment received, to determine the groups for comparison. The extent and nature of this bias will be related to the definition of compliance in an as treated analysis, a definition which could be unintentionally self-serving. We have investigated the problem of the definition of actual treatment in the context of a recent clinical trial. We used several definitions to classify patients as having received or not received treatment as prescribed. These definitions, when used in as treated analyses, provided results that were at times inconsistent or counter-intuitive, and which neither helped to confirm nor further explain the intention to treat analysis.

269 citations


Journal ArticleDOI
TL;DR: A method is presented for obtaining approximate confidence limits for the weighted sum of Poisson parameters as linear functions of the confidence limits for a single Poisson parameter, the unweighted sum.
Abstract: Directly standardized mortality rates are examples of weighted sums of Poisson rate parameters. If the numbers of events are large then normal approximations can be used to calculate confidence intervals, but these are inadequate if the numbers are small. We present a method for obtaining approximate confidence limits for the weighted sum of Poisson parameters as linear functions of the confidence limits for a single Poisson parameter, the unweighted sum. The location and length of the proposed interval depend on the method used to obtain confidence limits for the single parameter. Therefore several methods for obtaining confidence intervals for a single Poisson parameter are compared. For single parameters and for weighted sums of parameters, simulation suggests that the coverage of the proposed intervals is close to the nominal confidence levels. The method is illustrated using data on rates of myocardial infarction obtained as part of the WHO MONICA Project in Augsburg, Federal Republic of Germany.

264 citations
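
A sketch of the construction as the abstract describes it: exact limits for the unweighted total count are shifted and rescaled to give limits for the weighted sum. The stratum counts and weights are invented, and the exact-limit method used for the single parameter is only one of the several the paper compares.

```python
import numpy as np
from scipy.stats import chi2

# Hypothetical age-stratified event counts and standardizing weights
# (standard-population proportions divided by person-years), so rate = sum(w_i * x_i).
x = np.array([2, 5, 1, 0, 3])
w = np.array([0.8, 1.2, 2.5, 3.1, 0.6]) * 1e-4

y = np.sum(w * x)                 # directly standardized rate
var_y = np.sum(w**2 * x)          # estimated variance of the weighted sum
X = x.sum()                       # unweighted total count

# Exact limits for a single Poisson mean based on the total count X.
alpha = 0.05
x_lo = 0.5 * chi2.ppf(alpha / 2, 2 * X) if X > 0 else 0.0
x_hi = 0.5 * chi2.ppf(1 - alpha / 2, 2 * X + 2)

# Limits for the weighted sum as linear functions of the limits for X.
scale = np.sqrt(var_y / X)
y_lo, y_hi = y + scale * (x_lo - X), y + scale * (x_hi - X)
print(f"standardized rate {y:.2e}, approx 95% CI ({y_lo:.2e}, {y_hi:.2e})")
```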


Journal ArticleDOI
TL;DR: This paper reviews changes in the use of statistics in medical journals during the 1980s, focusing on research design, statistical analysis, the presentation of results, medical journal policy, and the misuse of statistics.
Abstract: This paper reviews changes in the use of statistics in medical journals during the 1980s. Aspects considered are research design, statistical analysis, the presentation of results, medical journal policy (including statistical refereeing), and the misuse of statistics. Despite some notable successes, the misuse of statistics in medical papers remains common.

249 citations


Journal ArticleDOI
TL;DR: The main emphasis is on the concept of the multiple level of significance (controlling the experimentwise, or familywise, error rate in the strong sense), which can be achieved by applying the principle of closed tests.
Abstract: The basic ideas of multiple testing are outlined and the problem of how to control the probability of erroneous decisions is discussed. The main emphasis is on the concept of the multiple level of significance (controlling the experimentwise, or familywise, error rate in the strong sense), which can be achieved by applying the principle of closed tests. Various practical situations encountered in multiple testing in clinical trials are considered: more than one end point; more than two treatments, such as comparisons with a single control, comparisons using ordered alternatives, all pairwise comparisons and contrast methods; and more than one trial. Tests based on global statistics, the union-intersection principle and other criteria are discussed. The application of the multiple test concept in sequential sampling is investigated. Finally some comments are made on multiple power, multiple confidence intervals and directed decisions.

222 citations
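
As one concrete example of a procedure that controls the familywise error rate in the strong sense and can be derived from the closed-testing principle, here is Holm's step-down method applied to invented p-values; the paper itself covers a much wider range of situations.

```python
def holm_stepdown(p_values, alpha=0.05):
    """Holm's step-down procedure: reject hypotheses until the first non-rejection."""
    order = sorted(range(len(p_values)), key=lambda i: p_values[i])
    m = len(p_values)
    rejected = [False] * m
    for rank, i in enumerate(order):
        if p_values[i] <= alpha / (m - rank):   # compare k-th smallest p with alpha/(m-k+1)
            rejected[i] = True
        else:
            break                               # stop at the first non-significant step
    return rejected

# Hypothetical raw p-values for four endpoints in one trial.
print(holm_stepdown([0.011, 0.030, 0.004, 0.210]))
```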


Journal ArticleDOI
TL;DR: Channeling is a form of allocation bias, where drugs with similar therapeutic indications are prescribed to groups of patients with prognostic differences, and claimed advantages of a new drug may channel it to patients with special pre-existing morbidity.
Abstract: Channeling is a form of allocation bias, where drugs with similar therapeutic indications are prescribed to groups of patients with prognostic differences. Claimed advantages of a new drug may channel it to patients with special pre-existing morbidity, with the consequence that disease states can be incorrectly attributed to use of the drug. For the study of adverse drug reactions, large databases supply information on co-medication and morbidity of patients. For diseases with a stepped-care approach, the drug history of patients, as available from some databases, can show channeling of drugs to patients with markers of relatively severe disease.

Journal ArticleDOI
TL;DR: It appears that the continual reassessment method is preferable to other contending schemes.
Abstract: We discuss some of the statistical approaches to the design and analysis of phase I clinical trials in cancer. An attempt is made to identify the issues, particular to this type of trial, that should be addressed by an appropriate methodology. A brief review of schemes currently in use is provided together with our views of the extent to which any particular scheme addresses the main issues. Some simulations are provided together with graphical illustration of the operating characteristics of the particular methods. It appears that the continual reassessment method is preferable to other contending schemes.
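
A rough sketch of one updating cycle of the continual reassessment method under a one-parameter power model with a grid prior; the toxicity skeleton, target rate and prior spread below are invented, and the formulation in the paper may differ.

```python
import numpy as np

skeleton = np.array([0.05, 0.10, 0.20, 0.35, 0.50])  # hypothetical prior guesses of toxicity per dose
target = 0.25                                          # target dose-limiting-toxicity rate

# One-parameter power model: p_d(a) = skeleton_d ** exp(a), with a ~ N(0, 1.34^2) prior,
# evaluated on a grid rather than by numerical integration.
a_grid = np.linspace(-4, 4, 801)
prior = np.exp(-0.5 * (a_grid / 1.34) ** 2)
prior /= prior.sum()

def next_dose(n_treated, n_tox):
    """Posterior-mean toxicity per dose given counts so far; pick the dose closest to target."""
    probs = skeleton[None, :] ** np.exp(a_grid)[:, None]        # grid x dose toxicity probabilities
    loglik = (n_tox * np.log(probs) + (n_treated - n_tox) * np.log1p(-probs)).sum(axis=1)
    post = prior * np.exp(loglik - loglik.max())
    post /= post.sum()
    post_tox = post @ probs                                      # posterior mean toxicity per dose
    return int(np.argmin(np.abs(post_tox - target))), post_tox

# Hypothetical data: patients treated and toxicities observed at each of the 5 dose levels so far.
dose, post_tox = next_dose(np.array([3, 3, 3, 0, 0]), np.array([0, 0, 2, 0, 0]))
print("recommended next dose level:", dose, post_tox.round(2))
```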

Journal ArticleDOI
TL;DR: Determination of the day of luteal transition to estimate ovulation requires only first-morning urine specimens, requires no correction for day-to-day variations in urine concentration, and can be applied to a mid-cycle window of data.
Abstract: A modification of the Royston ratio algorithm for estimating the day of ovulation from ratios of urinary metabolites of estrogen and progesterone is demonstrated and validated. Ovulation can be diagnosed by frequent ultrasound, frequent plasma LH assays or frequent urinary LH assays; daily morning urine LH assays confirm ovulation by a peak at least 1 standard deviation above baseline (defined as the mean of Days 1-6) that is followed by 4 consecutive lower values. The algorithm proposed here does not require a peak, only a 60% drop in the ratio over a 3-day period. The new algorithm scans 5-day sequences of ratios starting with Day 7, looking for a sequence in which the 1st day is the highest and the ratio values for the last 2 days are 40% or less of the 1st day's. The 2nd day is designated as the day of luteal transition. This method was demonstrated using a sample of 87 cycles from 28 healthy women trying to conceive. The algorithm was evaluated in 3 ways using a sample of 283 cycles from 188 women in a WHO study of ovulation prediction: by variation in the difference between the day of the LH peak and the day of luteal transition; by variance in the length of the luteal phase; and by the proportion of cycles with an identifiable day of luteal transition. A 40% descent cut-off gave the best concordance with the LH peak, the least variance in luteal phase length and the best estimate of the day of ovulation. 97% of LH peaks were identified, 88% of the selected days of luteal transition fell within 2 days of the LH peak, and the mean luteal phase length was 13.4 days with a variance of 3.8. Using the WHO-sample cycles without clear LH peaks, a day of luteal transition could be determined in 90%, and the mean luteal phase length was 13.3 days with a variance of 5.0. The method was compared with 4 other steroid algorithms and gave more precise results in every case. Finally, the new algorithm was validated using data from a 3rd sample of 60 cycles from 17 women, with comparable results. This algorithm is useful because it determines the day of ovulation in 90% of cycles, does not require a creatinine assay to adjust for urine concentration, and does not require data from very early or late cycle days.
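
The scanning rule in the abstract translates almost directly into code; the sketch below assumes a list of daily estrogen-to-progesterone metabolite ratios indexed from cycle Day 1, and the example cycle is invented.

```python
def day_of_luteal_transition(ratios):
    """Scan 5-day windows starting at cycle Day 7 for the pattern described above.

    `ratios` is a list of daily urinary estrogen/progesterone metabolite ratios,
    with ratios[0] being cycle Day 1. Returns the cycle day of luteal transition, or None.
    """
    for start in range(6, len(ratios) - 4):           # index 6 corresponds to cycle Day 7
        window = ratios[start:start + 5]
        first = window[0]
        if first == max(window) and all(r <= 0.4 * first for r in window[3:]):
            return start + 2                           # the 2nd day of the window, as a cycle day
    return None

# Hypothetical cycle: the ratio peaks on Day 13 and then drops sharply.
example = [5, 6, 5, 7, 6, 8, 9, 10, 12, 11, 13, 14, 20, 12, 6, 5, 4, 4, 3]
print(day_of_luteal_transition(example))
```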

Journal ArticleDOI
TL;DR: This paper shows how one can adapt the pseudo-likelihood analyses developed by Kalbfleisch and Lawless (1988) to the analysis of data from two-stage case-control studies.
Abstract: Nested case-control studies, or case-control studies within a cohort, combine the advantages of cohort studies with the efficiency of case-control studies. Case-control studies can often be viewed as having two stages; the first stage consists of vital status, disease, and basic covariate ascertainment, and the second stage consists of detailed covariate and exposure ascertainment. Breslow and Cain (1988) and Breslow and Zhao (1988) recently showed that conventional analyses of such two-stage studies may ignore some of the available information. In this paper, we show how one can adapt the pseudo-likelihood analyses developed by Kalbfleisch and Lawless (1988) to the analysis of data from two-stage case-control studies.

Journal ArticleDOI
TL;DR: The method for assessing a threshold consists of an estimation procedure using the maximum-likelihood technique and a test procedure based on the likelihood-ratio statistic R, which under the null hypothesis (no threshold) follows a quasi one-sided chi-squared distribution with one degree of freedom.
Abstract: I describe a method for estimating and testing a threshold value in epidemiological studies. A threshold effect indicates an association between a risk factor and a defined outcome above the threshold value but none below it. An important field of application is occupational medicine where, for many chemical compounds and other agents which are non-carcinogenic health hazards, so-called threshold limit values or TLVs are specified. The method is presented within the framework of the logistic regression model, which is widely used in the analysis of the relationship between some explanatory variables and a dependent dichotomous outcome. In most available programs for this and also for other models the concept of a threshold is disregarded. The method for assessing a threshold consists of an estimation procedure using the maximum-likelihood technique and a test procedure based on the likelihood-ratio statistic R, which under the null hypothesis (no threshold) follows a quasi one-sided chi-squared distribution with one degree of freedom. The use of this distribution is supported by a simulation study. The method is applied to data from an epidemiological study of the relationship between occupational dust exposure and chronic bronchitic reactions. The results are confirmed by bootstrap resampling.
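
A sketch of the general idea: profile the likelihood over candidate thresholds using a hinge term max(x - tau, 0) in a logistic model, then compare the best threshold fit with the no-threshold (ordinary linear) model by a likelihood ratio. The data are simulated and the paper's exact parameterization may differ.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
dust = rng.uniform(0, 10, 800)                               # simulated exposure
true_eff = 0.6 * np.maximum(dust - 4.0, 0.0)                 # effect only above a threshold of 4
y = rng.binomial(1, 1 / (1 + np.exp(-(-2.0 + true_eff))))    # simulated bronchitis outcome

def loglik_at_threshold(tau):
    x = sm.add_constant(np.maximum(dust - tau, 0.0))
    return sm.Logit(y, x).fit(disp=0).llf

taus = np.linspace(0.5, 8.0, 31)                             # candidate thresholds
profile = np.array([loglik_at_threshold(t) for t in taus])
tau_hat = taus[profile.argmax()]

llf_null = loglik_at_threshold(0.0)                          # tau = 0 reduces to the linear model
lr = 2 * (profile.max() - llf_null)
print(f"estimated threshold {tau_hat:.1f}, likelihood-ratio statistic {lr:.1f}")
```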

Journal ArticleDOI
TL;DR: An overview of developments in the assessment of quality of life (QOL) in clinical trials over the last decade from the viewpoint of clinical biostatistics is intended, with critical conclusions outlined and suggestions for further research given.
Abstract: This paper is intended as an overview of developments in the assessment of quality of life (QOL) in clinical trials over the last decade from the viewpoint of clinical biostatistics. In the first part we deal with aspects of obtaining adequate measurements of quality of life. A literature survey shows that a large number of quite heterogeneous measurement approaches for use in clinical trials exist, a substantial percentage of which cannot be regarded as sufficient for their actual measuring purpose. In the second part we review statistical methods applied to and adapted for the analysis of QOL data. Underlying the analysis should be the assumption of QOL as a stochastic process. Applied analysis procedures are again investigated in a literature survey. Finally, critical conclusions are outlined and suggestions for further research are given.

Journal ArticleDOI
TL;DR: Two statistical models are considered: one is the standard variance component model adapted to censored data, and the other is a recent intensity based model with a random proportionality factor representing interindividual variation.
Abstract: For each of several individuals a sequence of repeated events, forming a renewal process, is observed up to some censoring time. The object is to estimate the average interevent time over the population of individuals as well as the variation of interevent times within and between individuals. Medical motivation comes from gastroenterology, and concerns the occurrence of certain cyclic movements in the small bowel during the fasting state. Two statistical models are considered: one is the standard variance component model adapted to censored data, and the other is a recent intensity based model with a random proportionality factor representing interindividual variation. These models are applied to the motility data, and their advantages are discussed. The intensity based model allows simple empirical Bayes estimation of the expected interevent times for an individual in the presence of censoring.

Journal ArticleDOI
TL;DR: A new approach to back-projection is described, which avoids parametric assumptions about the form of the HIV infection intensity and is based on a modification of an EM algorithm for maximum likelihood estimation that incorporates smoothing of the estimated parameters.
Abstract: The method of back-projection has been used to estimate the unobserved past incidence of infection with the human immunodeficiency virus (HIV) and to obtain projections of future AIDS incidence. Here a new approach to back-projection, which avoids parametric assumptions about the form of the HIV infection intensity, is described. This approach gives the data greater opportunity to determine the shape of the estimated intensity function. The method is based on a modification of an EM algorithm for maximum likelihood estimation that incorporates smoothing of the estimated parameters. It is easy to implement on a computer because the computations are based on explicit formulae. The method is illustrated with applications to AIDS data from Australia, U.S.A. and Japanese haemophiliacs.
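
A rough sketch of an EM deconvolution step with an added smoothing pass, in the spirit of the approach described; the AIDS counts and the discretized incubation distribution are invented, and the smoothing used in the paper may differ.

```python
import numpy as np

# Hypothetical quarterly AIDS diagnoses and a discretized incubation distribution
# f[d] = P(diagnosis d quarters after infection), truncated to the first 12 quarters.
aids = np.array([0, 1, 1, 3, 4, 6, 9, 12, 15, 19, 22, 27], dtype=float)
f = np.array([0.00, 0.01, 0.02, 0.04, 0.06, 0.08, 0.10, 0.11, 0.11, 0.10, 0.09, 0.08])

T = len(aids)
lam = np.full(T, aids.sum() / T)            # initial guess of quarterly infection intensities

for _ in range(200):
    # EM step for the Poisson convolution model: expected diagnoses at t are sum_s lam[s]*f[t-s].
    mu = np.array([sum(lam[s] * f[t - s] for s in range(t + 1)) for t in range(T)])
    ratio = np.where(mu > 0, aids / np.where(mu > 0, mu, 1.0), 0.0)
    for s in range(T):
        denom = sum(f[t - s] for t in range(s, T))
        if denom > 0:
            lam[s] *= sum(ratio[t] * f[t - s] for t in range(s, T)) / denom
    # S step: smooth the current estimates with a simple moving average (crude at the ends).
    lam = np.convolve(lam, [0.25, 0.5, 0.25], mode="same")

print(lam.round(1))
```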

Journal ArticleDOI
TL;DR: It is shown how a well-known multiple step-down significance testing procedure for comparing treatments with a control in balanced one-way layouts can be applied in unbalanced layouts (unequal sample sizes for the treatments).
Abstract: We show how a well-known multiple step-down significance testing procedure for comparing treatments with a control in balanced one-way layouts can be applied in unbalanced layouts (unequal sample sizes for the treatments). The method we describe has the advantage that it provides p-values, for each treatment versus control comparison, that take account of the multiple step-down testing nature of the procedure. These joint p-values can be used with any value of alpha, the fixed familywise type I error rate bound, that may be specified by the investigator. To determine the p-values, it is necessary to compute a multivariate Student t integral, for which a computer program is available. This procedure is more powerful than the step-down Bonferroni procedure of Holm and the single-step procedure of Dunnett. An example from the pharmaceutical literature is used to illustrate the procedure.

Journal ArticleDOI
TL;DR: This explosion of a relatively new method of evaluating clinical medicine presents a number of challenges to statisticians and those responsible for health care policy, most important are the problems raised by the need to update meta-analyses as new trials are published.
Abstract: Over 150 meta-analyses of randomized control trials have so far been published in the English language, and new ones are appearing at a rate of over fifteen per year. This explosion of a relatively new method of evaluating clinical medicine presents a number of challenges to statisticians and those responsible for health care policy. The pitfalls of retrospective research must be avoided, and the quality of the original trials should be evaluated. Heterogeneity of the control event rates and the treatment differences need to be dealt with statistically. Most important are the problems raised by the need to update meta-analyses as new trials are published.

Journal ArticleDOI
TL;DR: An adjustment to the McNemar test is suggested to account for the repeated measures clustering effect and a Monte Carlo simulation is reported on that evaluates the effectiveness of this approach.
Abstract: McNemar's one degree of freedom chi-square test for the equality of proportions appears frequently in the analysis of pairs of matched, binary outcome data (Y1i, Y2i). An assumption underlying this test is that the responses from pair to pair are mutually independent. In certain applications, however, the pairs may represent repeated measurements on the same experimental unit, and hence this assumption is violated. In this paper we suggest an adjustment to the McNemar test to account for the repeated measures clustering effect and we report on a Monte Carlo simulation that evaluates the effectiveness of this approach.

Journal ArticleDOI
TL;DR: There is a sentiment that well-designed, prospective trials are required to provide credible information on the accuracy of diagnostic technologies, and so a consensus on methodological standards is needed, paralleling the earlier development of such standards in clinical trials and epidemiology.
Abstract: Research on diagnostic medicine has been directed at a number of topics in the past decade. Issues which have received a lot of attention are ROC analysis and the identification and correction of various analytic biases. Other topics of widespread interest include the use of expert systems, the relationship of such systems to statistical data-based systems, and the evaluation of tests using cost-effectiveness analysis. Increasingly there is a sentiment that well-designed, prospective trials are required to provide credible information on the accuracy of diagnostic technologies, and so a consensus on methodological standards is needed, paralleling the earlier development of such standards in clinical trials and epidemiology.

Journal ArticleDOI
TL;DR: Methods of adjustment for estimation of the AR in case-control studies are reviewed and it is argued that this latter method has the greatest generality and flexibility, and includes the two other approaches as special cases.
Abstract: In the 1980's, progress was made in adjusting estimates of the attributable risk (AR) for confounding factors and in calculating associated confidence intervals. In this paper, methods of adjustment for estimation of the AR in case-control studies are reviewed. The limitations and problems associated with two methods based on stratification, the weighted-sum approach and the Mantel-Haenszel approach, are discussed. They include small-sample bias with the weighted-sum approach and the difficulty of taking interaction into account with the Mantel-Haenszel approach. A third method based on logistic regression is reviewed. It is argued that this latter method has the greatest generality and flexibility, and includes the two other approaches as special cases. Throughout the paper, an example of a case-control study of oesophageal cancer illustrates the use of the methods described.
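
A sketch of the logistic-regression route the abstract favours: fit a model with exposure and confounders, then estimate the adjusted attributable risk as one minus the average, over cases, of the inverse of each case's fitted odds ratio relative to the unexposed reference. The data are simulated, and the paper's estimator and its confidence intervals are not reproduced here.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 4000
confounder = rng.binomial(1, 0.4, n)                     # e.g. heavy alcohol use
exposure = rng.binomial(1, 0.2 + 0.3 * confounder)       # exposure associated with the confounder
logit = -3.0 + 1.2 * exposure + 0.8 * confounder
case = rng.binomial(1, 1 / (1 + np.exp(-logit)))

X = sm.add_constant(np.column_stack([exposure, confounder]))
fit = sm.Logit(case, X).fit(disp=0)
beta_exposure = fit.params[1]

# Adjusted attributable risk: 1 minus the average, over cases, of the inverse of each
# case's fitted odds ratio for its own exposure level relative to "unexposed".
inv_or = np.exp(-beta_exposure * exposure[case == 1])
ar = 1 - inv_or.mean()
print(f"adjusted attributable risk: {ar:.2f}")
```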

Journal ArticleDOI
TL;DR: EB estimates of mortality rates for common tumours are similar to SMRs; for rare tumours, the EB method identifies the extreme rates more clearly than SMRs by smoothing the SMRs with large variances.
Abstract: We give results of empirical Bayes (EB) estimation of mortality rates designed to smooth observed SMR when random fluctuation of the observed deaths is important. We have specially studied the case where the prior distributions of the EB method have a spatial structure. The need for spatial modelling of cancer mortality rates in France is first shown with testing autocorrelation and fitting autoregressive spatial models, conditional (CAR) or simultaneous (SAR). A positive autocorrelation of the rates is shown for most cancer sites studied. As expected, EB estimates of mortality rates for common tumours are similar to SMRs. For rare tumours, the EB method identifies the extreme rates more clearly than SMRs by smoothing the SMRs with large variances. CAR or SAR models are adequate prior distributions for autocorrelated rates and produce quite similar rate estimates.
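
As a much-simplified, non-spatial illustration of the shrinkage involved (a gamma-Poisson empirical Bayes model with crude moment estimates, not the CAR or SAR priors of the paper), the sketch below smooths invented area-level SMRs.

```python
import numpy as np

# Hypothetical observed (O) and expected (E) cancer deaths for 8 small areas.
O = np.array([0, 6, 3, 1, 14, 5, 30, 12], dtype=float)
E = np.array([1.2, 1.5, 2.0, 3.5, 6.0, 7.5, 28.0, 14.0])
smr = O / E

# Crude moment estimates of a gamma prior for the area relative risks:
# prior mean m = a/b and prior variance v = a/b^2.
m = O.sum() / E.sum()
v = max(np.mean((smr - m) ** 2) - m * np.mean(1 / E), 1e-4)  # subtract average Poisson noise
a, b = m**2 / v, m / v

# Posterior (EB) estimate for each area: (O_i + a) / (E_i + b), which shrinks unstable SMRs toward m.
eb = (O + a) / (E + b)
for i in range(len(O)):
    print(f"area {i}: SMR {smr[i]:.2f} -> EB {eb[i]:.2f}")
```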

Journal ArticleDOI
TL;DR: The best fitting model for females and males indicated an increased relative risk of mortality which lasted for approximately six months after bereavement; in the case of widows this relative risk was significantly increased, being 3.8 with 95 per cent confidence interval (1.4, 10.3).
Abstract: Previous studies on the effect of marital bereavement on mortality have suggested various time periods during which the risk of mortality is increased. As many of the studies compared the widowed group with national mortality statistics for the married, there has been no opportunity to adjust for confounders which might themselves be responsible for this increased risk after bereavement. In this paper the various hypotheses proposed are reviewed and then modelled on a dataset of 344 elderly persons who were living with a spouse and who were part of a survey of a population of people aged 75 years and over. The 344 index-cases and their spouses were followed up for seven years and the times of death (for those who died) of the index-case and spouse were noted. The data were analysed by fitting a proportional hazards model to the subject's survival time after adjustment for other factors such as mental and physical health which had already been shown to be associated with mortality. The bereavement effects were fitted as time-dependent covariates. The best fitting model for females and males indicated an increased relative risk of mortality which lasted for approximately six months after bereavement. In the case of widows this relative risk was significantly increased, being 3.8 with 95 per cent confidence interval (1.4, 10.3) while for widowers the risk was 0.03 with 95 per cent confidence interval (0.00, 37.3).

Journal ArticleDOI
TL;DR: The multivariate probit model, designed to regress a vector of correlated quantal variables on a mixture of continuous and discrete predictors, is reintroduced thereby showing its usefulness in medical problems.
Abstract: The multivariate probit model is designed to regress a vector of correlated quantal variables on a mixture of continuous and discrete predictors. Various applications can be found in the biological, economical and psychosociological literature, but the method is not yet widely used in medical applications. We reintroduce this model thereby showing its usefulness in medical problems. Software for this model is, however, not widely available. We have written a PC program to select predictors and estimate parameters in the multivariate probit framework. The performance and characteristics of the program are briefly illustrated.

Journal ArticleDOI
TL;DR: Indices of departure based on the Shapiro-Francia W' and the Shapiro-Wilk W statistics are derived, and shown to have a natural interpretation in relation to the normal probability plot.
Abstract: Departure of a sample from a normal distribution should be assessed by a quantity that is meaningful in terms of the data, rather than merely by the P-value from a test statistic. Indices of departure based on the Shapiro-Francia W' and the Shapiro-Wilk W statistics are derived, and shown to have a natural interpretation in relation to the normal probability plot. A new diagnostic plot is proposed. An example is given which shows the relationship between one of the new indices and errors in calculated reference ranges due to non-normality of the data.
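
A minimal sketch of the Shapiro-Francia W' statistic on which one of the indices is based: the squared correlation between the ordered sample and Blom-approximated expected normal order statistics. The departure indices themselves are defined in the paper.

```python
import numpy as np
from scipy.stats import norm

def shapiro_francia_w_prime(x):
    """W' = squared correlation between the ordered sample and Blom-approximated
    expected normal order statistics; values near 1 indicate near-normality."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    m = norm.ppf((np.arange(1, n + 1) - 0.375) / (n + 0.25))   # Blom scores
    return np.corrcoef(x, m)[0, 1] ** 2

rng = np.random.default_rng(4)
print(shapiro_francia_w_prime(rng.normal(size=200)))      # close to 1
print(shapiro_francia_w_prime(rng.lognormal(size=200)))   # noticeably below 1
```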



Journal ArticleDOI
TL;DR: Semi-parametric and non-parametric methodology is reviewed for the analysis of repeated measurements in which, for each subject, a univariate response variable is assessed at multiple fixed time points.
Abstract: Techniques applicable for the analysis of longitudinal data when the response variable is non-normal are not nearly as comprehensive as for normally-distributed outcomes. However, there have been several recent developments. Semi-parametric and non-parametric methodology for the analysis of repeated measurements is reviewed. The commonly encountered design in which, for each subject, one assesses a univariate response variable at multiple fixed time points, is considered. The types of outcomes considered include binary, ordered categorical, and continuous (but extremely non-normal) response variables. All of the methods considered allow for incomplete data due to the occurrence of missing observations. In addition, discrete and/or continuous covariates, which may be time-dependent, are accommodated by some of the approaches. The methods are demonstrated using data from three clinical trials.