
Showing papers in "Statistics in Medicine in 1988"


Journal ArticleDOI
Nan M. Laird
TL;DR: This paper will review the use of likelihood based analyses for longitudinal data with missing responses, both from the point of view of ease of implementation and appropriateness in view of the non-response mechanism.
Abstract: When observations are made repeatedly over time on the same experimental units, unbalanced patterns of observations are a common occurrence. This complication makes standard analyses more difficult or inappropriate to implement, means loss of efficiency, and may introduce bias into the results as well. Some possible approaches to dealing with missing data include complete case analyses, univariate analyses with adjustments for variance estimates, two-step analyses, and likelihood based approaches. Likelihood approaches can be further categorized as to whether or not an explicit model is introduced for the non-response mechanism. This paper will review the use of likelihood based analyses for longitudinal data with missing responses, both from the point of view of ease of implementation and appropriateness in view of the non-response mechanism. Models for both measured and dichotomous outcome data will be discussed. The appropriateness of some non-likelihood based analyses is briefly considered.

726 citations


Journal ArticleDOI
TL;DR: An alternative display is suggested which represents intervals as points on a bivariate graph and which has advantages; when the data are estimates of odds ratios from studies with a binary response, a log scale is preferred to a linear one.
Abstract: To display a number of estimates of a parameter obtained from different studies it is common practice to plot a sequence of confidence intervals. This can be useful but is often unsatisfactory. An alternative display is suggested which represents intervals as points on a bivariate graph, and which has advantages. When the data are estimates of odds ratios from studies with a binary response, it is argued that for either type of plot, a log scale should be used rather than a linear scale.

592 citations
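
A display of this kind is straightforward to sketch: each study's log odds ratio and standard error collapse to a single point, with precision (1/SE) on one axis and the standardized estimate on the other. The following is a minimal illustration with invented study results, one plausible realization of such a bivariate plot rather than the paper's exact construction:

```python
import numpy as np
import matplotlib.pyplot as plt

# Invented log odds ratios and standard errors for six hypothetical studies
log_or = np.array([0.41, -0.12, 0.88, 0.25, 0.60, -0.05])
se = np.array([0.30, 0.45, 0.55, 0.20, 0.35, 0.50])

precision = 1.0 / se          # x-axis: larger = more informative study
z = log_or / se               # y-axis: standardized estimate

fig, ax = plt.subplots()
ax.scatter(precision, z)
ax.axhline(0, linestyle="--")      # null: odds ratio = 1
ax.axhline(1.96, linestyle=":")    # approximate 95% significance bands
ax.axhline(-1.96, linestyle=":")
ax.set_xlabel("precision (1 / SE of log odds ratio)")
ax.set_ylabel("log odds ratio / SE")
plt.show()
```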


Journal ArticleDOI
Odd O. Aalen
TL;DR: A general class of mixing (or frailty) distributions is applied, extending a model of Hougaard; the extension allows part of the population to be non-susceptible, and contains the traditional gamma distribution as a special case.
Abstract: I discuss the impact of individual heterogeneity in survival analysis. It is well known that this phenomenon may distort what is observed. A general class of mixing (or frailty) distributions is applied, extending a model of Hougaard. The extension allows part of the population to be non-susceptible, and contains the traditional gamma distribution as a special case. I consider the mixing of both a constant and a Weibull individual rate, and also discuss the comparison of rates from two populations. A number of practical examples are mentioned. Finally, I analyse two data sets, the main one containing data from the Norwegian Cancer Registry on the survival of breast cancer patients. The statistical analysis is of necessity speculative, but may still provide some insight.

331 citations
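
The distorting effect of frailty described above can be seen in a few lines of simulation: give each individual a constant hazard multiplied by a random frailty, let a fraction of the population have frailty zero (non-susceptible), and watch the observed population hazard decline even though every individual rate is constant. This is a simplified sketch in the spirit of the model class, not Aalen's exact formulation; all parameter values are invented:

```python
import numpy as np

rng = np.random.default_rng(1988)
n = 200_000
p_nonsusceptible = 0.3      # invented: fraction with frailty zero
shape, scale = 2.0, 0.5     # invented gamma frailty parameters (mean 1)
base_rate = 0.1             # invented constant individual hazard

frailty = rng.gamma(shape, scale, size=n)
frailty[rng.random(n) < p_nonsusceptible] = 0.0

# Exponential lifetimes given frailty; non-susceptibles never fail
times = np.where(frailty > 0,
                 rng.exponential(1.0 / np.clip(frailty * base_rate, 1e-12, None)),
                 np.inf)

# Empirical population hazard on a grid: it falls over time as frail
# individuals are selected out, although each individual rate is constant
for t in np.arange(0, 40, 2.0):
    at_risk = (times > t).sum()
    events = ((times > t) & (times <= t + 2.0)).sum()
    print(f"t={t:5.1f}  observed hazard ~ {events / (at_risk * 2.0):.4f}")
```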


Journal ArticleDOI
TL;DR: The technique is a generalized person-years approach in that it treats each observation interval as a mini-follow-up study in which the current risk factor measurements are employed to predict an event in the interval.
Abstract: The purpose of this paper is to indicate how repeated measures on risk factors have been employed in the prediction of the development of disease in the Framingham Heart Study. Since these measures vary over time, the method accounts for time dependent covariates. The technique is a generalized person-years approach in that it treats each observation interval (of equal length) as a mini-follow-up study in which the current risk factor measurements are employed to predict an event in the interval. Observations over multiple intervals are pooled into a single sample to predict the short term risk of an event. This approach is compared to the long-term prediction of disease which utilizes only the baseline measurements and ignores subsequent repeated measures on the risk factors.

263 citations
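
The pooling idea translates directly into code: each subject contributes one record per observation interval, carrying the risk-factor values current at the start of that interval, and a single logistic model is fitted to the pooled person-intervals. A minimal sketch on fabricated data (statsmodels assumed available; the covariate and coefficients are invented):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)

# Fabricated person-interval data: one row per subject-interval, with the
# covariate measured at the start of the interval.
rows = []
for i in range(500):
    for j in range(4):
        sbp = rng.normal(130 + 2 * j, 15)            # time-varying covariate
        p = 1 / (1 + np.exp(-(-6 + 0.03 * sbp)))     # invented true model
        event = rng.random() < p
        rows.append((i, j, sbp, int(event)))
        if event:                                    # subject exits at first event
            break
df = pd.DataFrame(rows, columns=["id", "interval", "sbp", "event"])

# One logistic regression on the pooled sample of intervals
X = sm.add_constant(df[["sbp"]])
fit = sm.Logit(df["event"], X).fit(disp=0)
print(fit.params)   # short-term (per-interval) risk model
```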


Journal ArticleDOI
TL;DR: Tests are proposed, based on the first isotonic regression estimator under an order restriction, which allow for the effects of selecting a study region in the light of the data and have a simple form.
Abstract: Individual reactions to a report which identifies an excess of risk near a putative source are determined mainly by some quoted significance level. One reaction, involving a commonly used 'coincidence' argument is given a simple Bayesian explanation. It is argued that interpretations of such reports should if possible allow both for data selection and for uncertainty in the null expectations underlying the significance levels. Tests are proposed, based on the first isotonic regression estimator under an order restriction, which allow for the effects of selecting a study region in the light of the data and have a simple form. Data on cancer incidence around two nuclear plants are used to illustrate.

192 citations
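
The order-restricted estimate underlying such tests can be computed with off-the-shelf pool-adjacent-violators code, constraining risk to be non-increasing with distance from the source. A toy sketch with invented distance-band data, using scikit-learn's IsotonicRegression as a stand-in for the paper's estimator:

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

# Invented data: distance bands around a putative source, with observed
# cases and person-years at risk in each band.
distance = np.array([1, 2, 3, 4, 5, 6, 7, 8])       # km, ordered
cases = np.array([12, 9, 11, 6, 7, 4, 5, 3])
pyears = np.array([800, 900, 1500, 1100, 1600, 1200, 1800, 1400])
rate = cases / pyears

# Fit under the restriction that risk does not increase with distance,
# weighting bands by person-years (pool-adjacent-violators algorithm).
iso = IsotonicRegression(increasing=False)
rate_restricted = iso.fit_transform(distance, rate, sample_weight=pyears)
print(np.round(rate_restricted * 1000, 3))   # rates per 1000 person-years
```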


Journal ArticleDOI
TL;DR: An approach to estimation for certain 'difficult' situations associated with retrospective or incomplete prospective observation is presented and a pseudo-likelihood which enables the simple analysis of some sampling procedures is introduced.
Abstract: Data related to life histories of individuals can be obtained in many different ways, and the usefulness of multi-state models for statistical analysis is generally highly dependent on the type and nature of the data. In this paper, we focus on this, and present an approach to estimation for certain 'difficult' situations associated with retrospective or incomplete prospective observation. The paper begins with the identification of some problem areas in the analysis of data on life history processes. We discuss maximum likelihood estimation in some simple contexts and introduce a pseudo-likelihood which enables the simple analysis of some sampling procedures. This approach is illustrated on standard retrospective and case-cohort designs.

188 citations


Journal ArticleDOI
TL;DR: Three commonly-used statistical tests for assessing the association between an explanatory variable and a measured, binary, or survival-time, response variable are considered, and the loss in efficiency from mismodelling or mismeasuring the explanatory variable is investigated.
Abstract: We consider three commonly-used statistical tests for assessing the association between an explanatory variable and a measured, binary, or survival-time, response variable, and investigate the loss in efficiency from mismodelling or mismeasuring the explanatory variable. With respect to mismodelling, we examine the consequences of using an incorrect dose metameter in a test for trend, of mismodelling a continuous explanatory variable, and of discretizing a continuous explanatory variable. We also examine the consequences of classification errors for a discrete explanatory variable and of measurement errors for a continuous explanatory variable. For all three statistical tests, the asymptotic relative efficiency (ARE) corresponding to each type of mis-specification equals the square of the correlation between the correct and fitted form of the explanatory variable. This result is evaluated numerically for the different types of mis-specification to provide insight into the selection of tests, the interpretation of results, and the design of studies where the 'correct' explanatory variable cannot be measured exactly.

167 citations
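
The paper's central result, an ARE equal to the squared correlation between the correct and fitted forms of the covariate, can be checked numerically in a few lines. A simulated illustration of two familiar cases, dichotomizing at the median and classical measurement error (all settings invented):

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(size=1_000_000)            # the 'correct' continuous covariate

# Fitted form 1: dichotomize at the median
x_binary = (x > np.median(x)).astype(float)
# Fitted form 2: classical measurement error with reliability 0.7
x_noisy = x + rng.normal(scale=np.sqrt(0.3 / 0.7), size=x.size)

for name, fitted in [("median split", x_binary), ("noisy (rel. 0.7)", x_noisy)]:
    are = np.corrcoef(x, fitted)[0, 1] ** 2
    print(f"{name:18s} ARE ~ {are:.3f}")
# Median split gives ARE ~ 2/pi ~ 0.64; the noisy version gives ~ its reliability.
```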


Journal ArticleDOI
TL;DR: Several approaches for estimating and comparing the rates of change of a continuous variable in two treatment groups in the presence of informative right censoring are reviewed and compared.
Abstract: Several approaches for estimating and comparing the rates of change of a continuous variable in two treatment groups in the presence of informative right censoring are reviewed and compared. The comparisons are made under different models for the censoring probabilities and various types of treatment effects. Some recommendations are discussed regarding the application of these approaches to the different settings.

160 citations


Journal ArticleDOI
TL;DR: This work prefers Blomqvist's method of correcting the association between change and initial value for the regression effect; the shortcomings of alternative approaches are illustrated by computer simulations.
Abstract: Statistical analysis of whether the change in a variable depends on its initial value, in clinical trials and other studies, is complicated by the phenomenon of regression to the mean. We review this problem and examine some approaches for handling it. MacGregor's log-log plot fails to correct for the regression effect, while Oldham's method of plotting the change against the average of initial and final values is shown to give misleading results when the effect of treatment varies between subjects, or when subjects are selected for study if their initial observations fall above or below a specified cut-off point. These results are illustrated by computer simulations. We prefer the use of Blomqvist's method of correcting the association between change and initial value to allow for the regression effect.

135 citations
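
The phenomenon is easy to reproduce by simulation: with a stable underlying level and pure measurement error, the naive correlation between change and baseline is clearly negative, while Oldham's plot of change against the average of the two measurements shows essentially none. A sketch with invented blood-pressure-like numbers (recall from the abstract that Oldham's method itself misleads under between-subject treatment heterogeneity or baseline-selected samples):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 5000
true_level = rng.normal(150, 10, n)           # stable underlying value
x1 = true_level + rng.normal(0, 8, n)         # baseline measurement
x2 = true_level + rng.normal(0, 8, n)         # follow-up, no real change

change = x2 - x1
print("corr(change, baseline):", round(np.corrcoef(change, x1)[0, 1], 3))
print("corr(change, average): ", round(np.corrcoef(change, (x1 + x2) / 2)[0, 1], 3))
# Typical output: a clearly negative first correlation (regression to the
# mean) and a near-zero second one, despite no true dependence of change
# on initial value.
```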


Journal ArticleDOI
TL;DR: Basic methods for construction of variance estimators for epidemiologic effect estimates when adjusting for misclassification are presented and illustrated in a case-control analysis of the association of antibiotic use during pregnancy with sudden infant death syndrome.
Abstract: This paper presents basic methods for construction of variance estimators for epidemiologic effect estimates when adjusting for misclassification. Methods are described for differential and non-differential misclassification, and for external and internal estimates of classification rates. The methods take account of sampling variability in both the observed data and the estimated classification rates. The methods are illustrated in a case-control analysis of the association of antibiotic use during pregnancy with sudden infant death syndrome.

135 citations
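
For non-differential misclassification with externally known sensitivity and specificity, the basic point correction inverts the classification matrix to recover expected true counts; the paper's variance estimators additionally propagate sampling error in both the data and the classification rates, which this bare-bones sketch (with invented counts and rates) does not attempt:

```python
import numpy as np

# Invented observed 2x2 counts: exposure (possibly misclassified) x disease
#                 exposed  unexposed
observed = np.array([[60.0, 140.0],    # cases
                     [45.0, 255.0]])   # controls

sens, spec = 0.85, 0.95   # assumed external classification rates
# Classification matrix: observed counts = M @ true counts (per disease row)
M = np.array([[sens, 1 - spec],
              [1 - sens, spec]])
M_inv = np.linalg.inv(M)

corrected = observed @ M_inv.T        # correct each disease row separately
a, b = corrected[0]
c, d = corrected[1]
print("corrected counts:\n", np.round(corrected, 1))
print("corrected odds ratio:", round((a * d) / (b * c), 2))
print("naive odds ratio:   ", round((observed[0, 0] * observed[1, 1]) /
                                    (observed[0, 1] * observed[1, 0]), 2))
```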


Journal ArticleDOI
TL;DR: This paper presents sample size formulae for both continuous and dichotomous endpoints obtained from intervention studies that use the cluster as the unit of randomization, derived from Student's t-test with use of cluster summary measures and a variance that consists of within and between cluster components.
Abstract: This paper presents sample size formulae for both continuous and dichotomous endpoints obtained from intervention studies that use the cluster as the unit of randomization. The formulae provide the required number of clusters or the required number of individuals per cluster when the other number is given. The proposed formulae derive from Student's t-test with use of cluster summary measures and a variance that consists of within and between cluster components. Power contours are provided to help in the design of intervention studies that use cluster randomization. Sample size formulae for designs with and without stratification of clusters appear separately.
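
A common normal-approximation version of such a formula inflates the usual two-sample sample size by the design effect 1 + (m - 1) * rho, where m is the cluster size and rho the intracluster correlation implied by the within- and between-cluster variance components. A hedged sketch (the paper's own formulae are t-based and may differ in detail):

```python
from scipy.stats import norm

def clusters_per_arm(delta, sd, m, icc, alpha=0.05, power=0.80):
    """Approximate clusters per arm for a continuous endpoint.

    delta: difference in means to detect; sd: total SD of the outcome;
    m: individuals per cluster; icc: intracluster correlation.
    Normal-approximation version of the classic t-test formula.
    """
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    deff = 1 + (m - 1) * icc                           # design effect
    n_individuals = 2 * (z * sd / delta) ** 2 * deff   # per arm
    return n_individuals / m

# Invented example: detect a 5 mmHg difference, SD 15, clusters of 20, ICC 0.02
print(round(clusters_per_arm(delta=5, sd=15, m=20, icc=0.02), 1), "clusters/arm")
```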

Journal ArticleDOI
TL;DR: The conceptual and technical differences are discussed and recent work advancing both approaches is reviewed; the two approaches are illustrated through analysis of repeated observations on interval history of the respiratory symptom 'persistent wheeze' in preadolescent children.
Abstract: This paper discusses statistical methods for the analysis of repeated observations of categorical variables as they might arise in longitudinal studies. Two general types of models are described: marginal models that give representations for the marginal distribution of response at each occasion, and transitional models that give representations for the transition probabilities between outcome states at successive occasions. The conceptual and technical differences are discussed and recent work advancing both approaches is reviewed. The two approaches are illustrated through analysis of repeated observations on interval history of the respiratory symptom ‘persistent wheeze’ in preadolescent children.
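
The contrast can be made concrete with two logistic fits to the same simulated binary series: a marginal-style model for the response given a covariate alone, and a transitional model that also conditions on the previous outcome. All data and coefficients below are invented, and a genuine marginal analysis would normally use GEE-type machinery rather than a plain logistic fit; this is only meant to show that the two coefficients answer different questions:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(3)
records = []
for i in range(400):
    x = rng.normal()                     # subject-level covariate
    y_prev = 0
    for t in range(5):
        # Invented first-order Markov mechanism
        logit = -1.0 + 0.8 * x + 1.5 * y_prev
        y = int(rng.random() < 1 / (1 + np.exp(-logit)))
        records.append((i, t, x, y_prev, y))
        y_prev = y
df = pd.DataFrame(records, columns=["id", "t", "x", "y_prev", "y"])

marginal = sm.Logit(df["y"], sm.add_constant(df[["x"]])).fit(disp=0)
transitional = sm.Logit(df["y"], sm.add_constant(df[["x", "y_prev"]])).fit(disp=0)
print("marginal x-effect:    ", round(marginal.params["x"], 2))
print("transitional x-effect:", round(transitional.params["x"], 2))
# The two x-coefficients need not agree: one describes occasion-specific
# prevalence, the other transitions between outcome states.
```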

Journal ArticleDOI
TL;DR: How an analysis of a multistate model for survival data may compare with an ordinary survival analysis is discussed and a study of mortality and incidence of nephropathy in insulin dependent diabetes is used as illustration.
Abstract: How an analysis of a multistate model for survival data may compare with an ordinary survival analysis is discussed. A study of mortality and incidence of nephropathy in insulin dependent diabetes is used as illustration. Proportional hazards regression models with time-dependent covariates and regression models for relative mortality are used for the transition intensities. Alternative basic time scales are also considered.

Journal ArticleDOI
TL;DR: This paper presents a statistical analysis of treatment effects in 24-hour ambulatory blood pressure recordings, and uses a meta-analytical method to combine the results of all subjects in the study.
Abstract: This paper presents a statistical analysis of treatment effects in 24-hour ambulatory blood pressure recordings. The statistical models account for circadian rhythms, subject effects, and the effects of treatment with drugs or relaxation therapy. In view of the heterogeneity of the subjects, we fit a separate linear model to the data of each subject, use robust statistical procedures to estimate the parameters of the linear models, and trim the data on a subject by subject basis. We use a meta-analytical method to combine the results of all subjects in the study.
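
The subject-by-subject strategy can be sketched with a simple cosinor model: fit an intercept, cosine and sine terms for the 24-hour rhythm, and a treatment indicator to each subject's record, then pool the per-subject treatment estimates by inverse-variance weighting. Everything below is simulated, and the paper's robust estimation and trimming steps are omitted:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(24)
hours = np.arange(0, 48, 0.5)               # two days of half-hourly readings
treated = (hours >= 24).astype(float)       # invented: drug given on day 2

est, var = [], []
for subj in range(30):
    # Invented subject-specific circadian blood pressure with a drug effect
    mesor = rng.normal(130, 8)
    amp, phase = rng.normal(10, 2), rng.uniform(0, 2 * np.pi)
    effect = rng.normal(-6, 3)              # heterogeneous treatment effect
    bp = (mesor + amp * np.cos(2 * np.pi * hours / 24 - phase)
          + effect * treated + rng.normal(0, 5, hours.size))

    X = sm.add_constant(np.column_stack([
        np.cos(2 * np.pi * hours / 24),
        np.sin(2 * np.pi * hours / 24),
        treated,
    ]))
    fit = sm.OLS(bp, X).fit()               # separate linear model per subject
    est.append(fit.params[3])
    var.append(fit.bse[3] ** 2)

# Meta-analytic (inverse-variance) combination across subjects
w = 1 / np.asarray(var)
pooled = np.sum(w * np.asarray(est)) / w.sum()
print(f"pooled treatment effect: {pooled:.2f} mmHg (SE {w.sum() ** -0.5:.2f})")
```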

Journal ArticleDOI
TL;DR: A Bayesian method is proposed for assessing the plausible range of true treatment effect for any trial based on interim results, particularly useful for producing shrinkage of the unexpectedly large and imprecise observed treatment effects that arise in clinical trials that stop early.
Abstract: Stopping rules in clinical trials can lead to bias in point estimation of the magnitude of treatment difference. A simulation exercise, based on estimation of the risk ratio in a typical post-myocardial infarction trial, examines the nature of this exaggeration of treatment effect under various group sequential plans and also under continuous naive monitoring for statistical significance. For a fixed treatment effect the median bias in group sequential design is small, but it is greatest for effects that the trial has reasonable power to detect. Bias is evidently greater in trials that stop early and is dramatic under naive monitoring for significance. Group sequential plans lead to a multimodal sampling distribution of treatment effect, which poses problems for incorporating their estimates into meta-analyses. By simulating a population of trials with treatment effects modelled by an underlying distribution of true risk ratios, a Bayesian method is proposed for assessing the plausible range of true treatment effect for any trial based on interim results. This approach is particularly useful for producing shrinkage of the unexpectedly large and imprecise observed treatment effects that arise in clinical trials that stop early. Its implications for trial design are discussed.
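
The exaggeration under naive continuous monitoring is simple to reproduce: simulate accumulating normal data with a fixed true effect, test at each interim look, stop at the first |Z| > 1.96, and compare estimates from early-stopping trials with the truth. All design parameters are invented:

```python
import numpy as np

rng = np.random.default_rng(1988)
true_effect, sd = 0.2, 1.0
max_n, looks = 500, range(50, 501, 50)   # interim analyses every 50 subjects

stopped_estimates, stop_n = [], []
for _ in range(5000):
    data = rng.normal(true_effect, sd, max_n)
    for n in looks:
        z = data[:n].mean() / (sd / np.sqrt(n))
        if abs(z) > 1.96 or n == max_n:   # naive monitoring for significance
            stopped_estimates.append(data[:n].mean())
            stop_n.append(n)
            break

stopped_estimates = np.array(stopped_estimates)
stop_n = np.array(stop_n)
early = stop_n <= 150
print("true effect:               ", true_effect)
print("mean estimate, all trials: ", round(stopped_estimates.mean(), 3))
print("mean estimate, early stops:", round(stopped_estimates[early].mean(), 3))
# Trials that stop early report markedly exaggerated effects.
```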

Journal ArticleDOI
TL;DR: Examples are given in which the customary assumption of homogeneity with respect to the order of data collection is breached; it is suggested that, if the order of observations is known, a plot by time should be performed, perhaps using a cusum.
Abstract: It is customary to regard datasets as homogeneous with respect to the order of collection of the measurements. Examples are given in which this assumption is breached. Hidden time trends have implications for the design of studies, their analysis and interpretation. It is suggested that, if the order of observations is known, a plot by time should be performed, perhaps using a cusum.
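
A cusum of this kind is one line of arithmetic: cumulatively sum each observation's deviation from a reference value (here the overall mean) in order of collection, so a sustained drift appears as a ramp. A minimal sketch with fabricated data containing a hidden mid-series shift:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(5)
# Fabricated measurements in order of collection, with a hidden upward
# shift after observation 60 (e.g. a recalibrated instrument).
y = np.concatenate([rng.normal(10.0, 1.0, 60), rng.normal(10.8, 1.0, 40)])

cusum = np.cumsum(y - y.mean())   # cumulative deviations from overall mean

plt.plot(cusum)
plt.xlabel("order of collection")
plt.ylabel("cusum of deviations from the mean")
plt.show()   # a V-shaped ramp around observation 60 reveals the time trend
```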

Journal ArticleDOI
TL;DR: The Rvachev-Baroyan-Longini model is a space-time predictive model of the spread of influenza epidemics and the French use of the model has been satisfactory, although precision could be improved with more detailed information about passenger traffic.
Abstract: The Rvachev-Baroyan-Longini model is a space-time predictive model of the spread of influenza epidemics. It has been applied to 128 cities of the USSR, and more recently, to forecasting the spread of the pandemic of 1968–1969 throughout 52 large cities. It is a deterministic, mass-action, space and time continuous model. The model has been applied to the simulation of the influenza epidemic of 1984–1985 in the 22 French Metropolitan districts and results are presented. Estimates of the parameters of the model were made using the French Communicable Diseases Network data. These parameters are the contact rate, a (estimate = 0.55), which is the number of people with whom an infectious individual will make contact daily sufficient to pass infection, and the infectious period, 1/b, estimated as 2.49 days. The mean annual railroad passenger traffic from district i to district j varies from 0 to 1,991,000 persons depending on the districts. The computed spread of the epidemic is presented on weekly maps. Results are also presented on district charts, giving the size of district epidemics and the time of peak of the epidemic. The precision of the computer fittings was judged satisfactory: the calculated size of peak differed from the real one by less than 100 per cent in 17 out of 18 districts, and the calculated time of peak differed from the observed by less than two weeks in 14 out of 18 districts. Although precision could be improved with more detailed information about passenger traffic, the French use of the model has been satisfactory.
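
In outline, the model couples deterministic mass-action epidemics across districts through a passenger-traffic matrix. The sketch below borrows the abstract's parameter estimates (contact rate a = 0.55 per day, infectious period 1/b = 2.49 days) but uses an invented two-district travel matrix and a crude daily Euler step, and omits the real model's latent period and fitting machinery:

```python
import numpy as np

a, b = 0.55, 1 / 2.49          # contact rate and recovery rate (per day)
pop = np.array([2_000_000.0, 500_000.0])      # invented district populations
travel = np.array([[0.0, 4000.0],             # invented daily passengers i->j
                   [4000.0, 0.0]])

S = pop.copy()
I = np.array([100.0, 0.0])                    # epidemic seeded in district 0
R = np.zeros(2)

for day in range(150):
    new_inf = a * S * I / pop                 # mass-action incidence
    rec = b * I
    # Daily mixing of infectives proportional to passenger flows
    inflow = travel.T @ (I / pop)
    outflow = travel.sum(axis=1) * I / pop
    S, I, R = S - new_inf, I + new_inf - rec + inflow - outflow, R + rec
    if day % 15 == 0:
        print(f"day {day:3d}  infectives per district: {np.round(I).astype(int)}")
```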

Journal ArticleDOI
TL;DR: The prediction of outcome of anaesthesia in patients over 40 years of age was assessed using a multifactorial index based on current preoperative factors recorded prospectively using a split sample approach and a logistic regression model.
Abstract: The prediction of outcome of anaesthesia in patients over 40 years of age was assessed using a multifactorial index based on current preoperative factors recorded prospectively. The study was conducted using a representative sample of anaesthetizations (except for cardiac surgery) including 517 cases with major complication (occurring during or within 24 hours of anaesthesia) and a one in fifty random sample comprising 1538 cases without complication. A split sample approach was adopted and a logistic regression model was applied to two subsets of similar size. Four preoperative factors were significantly associated with the occurrence of complications: ASA physical status, age, surgical procedure (major/minor) and type (elective/emergency). Goodness-of-fit of the model was assessed using another sample of 332 cases with complication and a different subset of 987 cases without complication. The model fitted the data well (p = 0.15).

Journal ArticleDOI
TL;DR: This paper identifies a need for software for handling the data structures of complex event histories and shows that, with such software, most Markov and semi-Markov models of event history data may be dealt with in the framework of generalized linear models.
Abstract: This paper reviews methods for the analysis of event history data by both parametric (exponential/Poisson) and semi-parametric methods. It identifies a need for software for handling the data structures of complex event histories and shows that, with such software, most Markov and semi-Markov models of event history data may be dealt with in the framework of generalized linear models. Finally, the emergent ‘frailty’ models for associated risks are discussed together with their implications for statistical software.
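
The key computational point, that exponential/Markov event-history models fit within the GLM framework, amounts to Poisson regression on event counts with log person-time as an offset. A compact sketch with invented aggregated follow-up data:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Invented aggregated event-history data: one row per covariate stratum
# and follow-up period, with event counts and person-years at risk.
df = pd.DataFrame({
    "period":  [1, 1, 2, 2, 3, 3],
    "treated": [0, 1, 0, 1, 0, 1],
    "events":  [30, 18, 22, 12, 15, 9],
    "pyears":  [1000.0, 980.0, 850.0, 870.0, 700.0, 720.0],
})

X = pd.get_dummies(df["period"], prefix="period", drop_first=True).astype(float)
X["treated"] = df["treated"]
X = sm.add_constant(X)

# Piecewise-exponential model: Poisson counts, log person-years offset
fit = sm.GLM(df["events"], X, family=sm.families.Poisson(),
             offset=np.log(df["pyears"])).fit()
print(np.exp(fit.params["treated"]))   # estimated rate ratio for treatment
```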

Journal ArticleDOI
TL;DR: The authors summarize recent work on the errors-in-variables problem in generalized linear models, focusing on covariance analysis and in particular on testing for and estimation of treatment effects.
Abstract: We summarize some of the recent work on the errors-in-variables problem in generalized linear models. The focus is on covariance analysis, and in particular testing for and estimation of treatment effects. There is a considerable difference between the randomized and non-randomized models when testing for an effect. In randomized studies, simple techniques exist for testing for a treatment effect. In some instances, such as linear and multiplicative regression, simple methods exist for estimating the treatment effect. In other examples such as logistic regression, estimating a treatment effect requires careful attention to measurement error. In non-randomized studies, there is no recourse but to understand and model the measurement error. In particular, ignoring measurement error can lead to the wrong conclusions: for example, the true but unobserved data may indicate a positive effect for treatment, while the observed data indicate the opposite. Some of the possible methods are outlined and compared.
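
The simplest quantitative face of this problem is attenuation: classical measurement error multiplies a linear regression slope by the reliability lambda = var(x) / (var(x) + var(u)), so ignoring it understates effects. A simulated sketch with invented values:

```python
import numpy as np

rng = np.random.default_rng(11)
n = 200_000
x = rng.normal(size=n)                    # true covariate
u = rng.normal(scale=0.7, size=n)         # classical measurement error
w = x + u                                 # observed, error-prone covariate
y = 1.0 * x + rng.normal(size=n)          # invented outcome model, slope 1

slope_true = np.polyfit(x, y, 1)[0]
slope_naive = np.polyfit(w, y, 1)[0]
reliability = x.var() / w.var()

print(f"slope using true x:     {slope_true:.3f}")
print(f"slope using observed w: {slope_naive:.3f}")
print(f"reliability lambda:     {reliability:.3f}")   # naive ~ true * lambda
```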

Journal ArticleDOI
TL;DR: The present re-analysis exploits the case-control matching of the study while incorporating the effects of important risk determinants, notably year of birth, trimester of exposure and number of films exposed, to obtain time-invariant estimates of the extra risk per mGy.
Abstract: The association between obstetric X-raying and childhood cancer was first identified by the Oxford Survey of Childhood Cancers in 1956. The present re-analysis exploits the case-control matching of the study while incorporating the effects of important risk determinants, notably year of birth, trimester of exposure and number of films exposed. The decline in risk over time is closely mirrored by the estimated decline in dose per film and, by constraining these two relationships to be parallel, time-invariant estimates of the extra risk per mGy are obtained. For example, it is now estimated that irradiating 10^6 foetuses with 1 mGy of X-rays would, in the absence of other causes of death, yield 175 extra cases of cancer and leukaemia in the first 15 years of life.

Journal ArticleDOI
TL;DR: This paper illustrates the utility of meta-analysis in evaluating medical technologies described in non-randomized studies, provided proper attention is given to biases in those studies; the results show strong preferences for inborn status, especially for infants who weigh 1001-2000 grams.
Abstract: We have applied meta-analysis to investigate the relationship between birth place and the likelihood of neonatal survival, for infants of low birth weight (less than 2501 grams) in a series of 19 non-randomized studies. This paper illustrates the utility of meta-analysis in evaluating medical technologies described in non-randomized studies, if proper attention is given to biases in those studies. The results of this meta-analysis show strong preferences for inborn status, especially for infants who weigh 1001-2000 grams. For infants of lower or higher birth weight (that is, less than 1001 or greater than 2000 grams), the studies are inconsistent: some favour inborn status while others favour outborn status. This heterogeneity is not surprising, because selection bias is more problematic in studies of infants at these birth weights. We discuss potential causes of and solutions to selection bias and illustrate its potential magnitude by introducing the bias factor, which should be considered in the design of future studies. When selection bias cannot be ruled out, the results shown for those who weigh 1001-2000 grams are more appropriate for generating valid conclusions and subsequent policies regarding birth place preference for low birth weight infants.

Journal ArticleDOI
TL;DR: New interval estimators are proposed that, under a deterministic outcome model, improve upon the performance of the standard 'binomial confidence interval'.
Abstract: Consider an unbiased follow-up study designed to investigate the causal effect of a dichotomous exposure on a dichotomous disease outcome. Under a deterministic outcome model, a standard '95 per cent binomial confidence interval' may fail to cover the causal parameter of interest at the nominal rate when we take the causal parameter to be a parameter associated with the observed study population (regardless of whether the observed study population was sampled from a larger superpopulation). I propose new interval estimators that, in this setting, improve upon the performance of the standard 'binomial confidence interval.'

Journal ArticleDOI
TL;DR: A mechanism which allows for patients to be withdrawn during the course of the trial because of inadequate blood pressure control is developed while preserving the potential to make an unbiased comparison of the treatment effects.
Abstract: A major source of bias in hypertension trials can arise from patients who are withdrawn during the course of the trial because of inadequate blood pressure control. We develop a mechanism which allows for such withdrawals while preserving the potential to make an unbiased comparison of the treatment effects. The approach is illustrated using data from a large multicentre trial of two anti-hypertensive agents in patients with mild to moderate essential hypertension.

Journal ArticleDOI
TL;DR: This work compared mortality estimates that resulted from a multivariate model for epidemic forecasting with those obtained from univariate models, finding more accurate prediction of deaths for all age groups using the multivariate model.
Abstract: We employed multiple time series analysis to estimate the impact of influenza on mortality in different age groups, using a procedure for updating estimates as current data become available from national mortality data collected from 1962 to 1983. We compared mortality estimates that resulted from a multivariate model for epidemic forecasting with those obtained from univariate models. We found more accurate prediction of deaths for all age groups using the multivariate model. While the univariate models show an adequate fit to the data, the multivariate model often enables earlier detection of epidemics. Additionally, the multivariate approach provides insight into relationships among different age groups at different points in time. For both models, the largest excess mortality due to pneumonia and influenza during influenza epidemics occurred among those 65 years of age and older. Although multiple time series models appear useful in epidemiologic analysis, the complexity of the modelling process may limit routine application.

Journal ArticleDOI
TL;DR: An overview of issues and methods for analysing repeated measures of a continuous random variable is presented, covering modelling of the mean vector and covariance structure, statistical efficiency, regression diagnostics, and discrepancies between longitudinal and cross-sectional methods.
Abstract: We present an overview of issues and methods for analysing repeated measures of a continuous random variable. We discuss modelling the mean vector and covariance structure, statistical efficiency, regression diagnostics, and discrepancies between longitudinal and cross-sectional methods. We illustrate key points with examples and discuss areas requiring further development.

Journal ArticleDOI
TL;DR: For large samples, the quasi-likelihood estimates of the time-specific regression coefficients over the set of predetermined time points are shown to be approximately jointly normal.
Abstract: Suppose that subjects are observed repeatedly over a common set of time points with possibly time-dependent covariates and possibly missing observations. At each time point we model the marginal distribution of the response variable and the effect of the covariates on that distribution using a class of quasi-likelihood models studied in McCullagh and Nelder. No parametric model of dependence of the repeated observations of the subject is assumed. For large samples, the quasi-likelihood estimates of the time-specific regression coefficients over the set of predetermined time points are shown to be approximately jointly normal. This, coupled with various inference procedures, provides a global picture about the effects of the covariates on the response variable over the entire study period. A lack-of-fit test for testing the adequacy of the assumed quasi-likelihood model is also provided. All the methods considered here are illustrated with real-life examples.

Journal ArticleDOI
TL;DR: Study of quantitative aspects of bias in estimates of treatment effect in survival models when there is failure to adjust on balanced prognostic variables shows that the effect of omitting balanced covariates can be modest unless the variables are strongly prognostic or many in number.
Abstract: This paper discusses the quantitative aspects of bias in estimates of treatment effect in survival models when there is failure to adjust on balanced prognostic variables. A simple numerical example of this bias is given along with approximate formulae for its calculation in the multiplicative exponential survival model. The accuracy of the formulae is checked by simulation. In addition, approximate calculations and simulations of power loss and the effects of omitting more than one prognostic covariate are presented. The Weibull and Cox models are also examined using simulation. Study of this bias is pertinent to much applied work, and shows that the effect of omitting balanced covariates can be modest unless the variables are strongly prognostic or many in number. This work emphasizes the need for thorough comparisons of adjusted and unadjusted analyses for sensible interpretation of treatment effects.

Journal ArticleDOI
TL;DR: A variety of applications of the model are discussed, including univariate and multivariate analysis of incomplete repeated measures data, analysis of growth curves with missing data using random effects and time-series models, and applications to unbalanced longitudinal data.
Abstract: Incomplete and unbalanced multivariate data often arise in longitudinal studies due to missing or unequally-timed repeated measurements and/or the presence of time-varying covariates. A general approach to analysing such data is through maximum likelihood analysis using a linear model for the expected responses, and structural models for the within-subject covariances. Two important advantages of this approach are: (1) the generality of the model allows the analyst to consider a wider range of models than were previously possible using classical methods developed for balanced and complete data, and (2) maximum likelihood estimates obtained from incomplete data are often preferable to other estimates such as those obtained from complete cases from the standpoint of bias and efficiency. A variety of applications of the model are discussed, including univariate and multivariate analysis of incomplete repeated measures data, analysis of growth curves with missing data using random effects and time-series models, and applications to unbalanced longitudinal data.
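
In current software, this likelihood-based approach corresponds to fitting a linear mixed model by maximum likelihood to whatever observations exist, with no requirement of balance or completeness. A schematic example on invented growth-curve data, with statsmodels' MixedLM standing in for the structured-covariance ML machinery the paper describes:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(88)
rows = []
for i in range(120):
    intercept = rng.normal(50, 6)          # invented random intercept
    slope = rng.normal(1.5, 0.5)           # invented random slope
    for t in range(5):
        if t > 0 and rng.random() < 0.25:  # unbalanced: visits go missing
            continue
        rows.append((i, t, intercept + slope * t + rng.normal(0, 2)))
df = pd.DataFrame(rows, columns=["id", "time", "y"])

# Random-intercept-and-slope growth-curve model fitted by ML to all
# available (incomplete, unbalanced) measurements
model = smf.mixedlm("y ~ time", df, groups=df["id"], re_formula="~time")
fit = model.fit(reml=False)
print(fit.summary())
```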

Journal ArticleDOI
TL;DR: An autoregressive model is presented for the analysis of missing and/or unequally spaced examinations in longitudinal studies for continuous outcome variables and includes consideration of both time-dependent and fixed covariates.
Abstract: Missing and/or unequally spaced examinations are often present in longitudinal studies. An autoregressive model is presented for the analysis of such data for continuous outcome variables. The fitting of the model can be accomplished by weighted non-linear regression methods available in standard statistical packages. Some features of the model include consideration of both time-dependent and fixed covariates, assessment of the relationships between changes in outcome and exposure over short periods of time, and use of all available person-time for an individual. An illustration looking at the role of personal cigarette smoking on changes in pulmonary function in children is included.
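
The heart of such a model is serial correlation that decays with the actual time gap, for example predicting each measurement from the previous one through a term rho raised to the elapsed time, fitted by non-linear least squares. A stripped-down sketch on invented data, using scipy's curve_fit in place of the 'standard statistical packages' of the era and ignoring the weighting a full analysis would use:

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(9)

# Invented unequally spaced series: y regresses toward mu + beta*x between
# visits, at a rate governed by rho per unit of elapsed time.
mu_true, beta_true, rho_true = 100.0, 5.0, 0.8
prev_y, gap, xcov, y = [], [], [], []
for _ in range(300):
    xi = rng.normal()
    level = mu_true + beta_true * xi
    yi = level + rng.normal(0, 4)
    for _ in range(4):
        dt = rng.uniform(0.5, 3.0)                  # unequal spacing
        mean = level + rho_true ** dt * (yi - level)
        y_new = mean + rng.normal(0, 4 * np.sqrt(1 - rho_true ** (2 * dt)))
        prev_y.append(yi)
        gap.append(dt)
        xcov.append(xi)
        y.append(y_new)
        yi = y_new

def model(data, mu, beta, rho):
    prev, dt, xv = data
    level = mu + beta * xv
    return level + rho ** dt * (prev - level)

data = np.vstack([prev_y, gap, xcov])
params, _ = curve_fit(model, data, y, p0=[90.0, 1.0, 0.5],
                      bounds=([-np.inf, -np.inf, 1e-6], [np.inf, np.inf, 1.0]))
print(dict(zip(["mu", "beta", "rho"], np.round(params, 2))))
```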