Book

Multiple imputation and its application

TL;DR: The issues raised by missing data are clarified, the rationale for MI is outlined, the relationship between the various imputation models and associated algorithms is described, and guidance is given on how to consider and address the issues that arise in its application.
Abstract: A practical guide to analysing partially observed data. Collecting, analysing and drawing inferences from data is central to research in the medical and social sciences. Unfortunately, it is rarely possible to collect all the intended data. The literature on inference from the resulting incomplete data is now huge, and continues to grow both as methods are developed for large and complex data structures, and as increasing computer power and suitable software enable researchers to apply these methods. This book focuses on a particular statistical method for analysing and drawing inferences from incomplete data, called Multiple Imputation (MI). MI is attractive because it is both practical and widely applicable. The authors' aim is to clarify the issues raised by missing data, describing the rationale for MI, the relationship between the various imputation models and associated algorithms, and its application to increasingly complex data structures. Multiple Imputation and its Application: Discusses the issues raised by the analysis of partially observed data, and the assumptions on which analyses rest. Presents a practical guide to the issues to consider when analysing incomplete data from both observational studies and randomized trials. Provides a detailed discussion of the practical use of MI with real-world examples drawn from medical and social statistics. Explores handling non-linear relationships and interactions with multiple imputation, survival analysis, multilevel multiple imputation, sensitivity analysis via multiple imputation, using non-response weights with multiple imputation and doubly robust multiple imputation. Multiple Imputation and its Application is aimed at quantitative researchers and students in the medical and social sciences with the aim of clarifying the issues raised by the analysis of incomplete data, outlining the rationale for MI and describing how to consider and address the issues that arise in its application.
Citations
Journal Article
08 Jul 2020-Nature
TL;DR: A range of clinical factors associated with COVID-19-related death is quantified in one of the largest cohort studies on this topic so far; compared with people of white ethnicity, Black and South Asian people were at higher risk, even after adjustment for other factors.
Abstract: Coronavirus disease 2019 (COVID-19) has rapidly affected mortality worldwide [1]. There is unprecedented urgency to understand who is most at risk of severe outcomes, and this requires new approaches for the timely analysis of large datasets. Working on behalf of NHS England, we created OpenSAFELY, a secure health analytics platform that covers 40% of all patients in England and holds patient data within the existing data centre of a major vendor of primary care electronic health records. Here we used OpenSAFELY to examine factors associated with COVID-19-related death. Primary care records of 17,278,392 adults were pseudonymously linked to 10,926 COVID-19-related deaths. COVID-19-related death was associated with: being male (hazard ratio (HR) 1.59 (95% confidence interval 1.53-1.65)); greater age and deprivation (both with a strong gradient); diabetes; severe asthma; and various other medical conditions. Compared with people of white ethnicity, Black and South Asian people were at higher risk, even after adjustment for other factors (HR 1.48 (1.29-1.69) and 1.45 (1.32-1.58), respectively). We have quantified a range of clinical factors associated with COVID-19-related death in one of the largest cohort studies on this topic so far. More patient records are rapidly being added to OpenSAFELY; we will update and extend our results regularly.

4,263 citations

Journal Article
TL;DR: The missMDA package performs principal component analysis on incomplete data sets, aiming to obtain scores, loadings and graphical representations despite missing values, and can be used to perform single imputation to complete data involving continuous, categorical and mixed variables.
Abstract: We present the R package missMDA which performs principal component methods on incomplete data sets, aiming to obtain scores, loadings and graphical representations despite missing values. Package methods include principal component analysis for continuous variables, multiple correspondence analysis for categorical variables, factorial analysis on mixed data for both continuous and categorical variables, and multiple factor analysis for multi-table data. Furthermore, missMDA can be used to perform single imputation to complete data involving continuous, categorical and mixed variables. A multiple imputation method is also available. In the principal component analysis framework, variability across different imputations is represented by confidence areas around the row and column positions on the graphical outputs. This allows assessment of the credibility of results obtained from incomplete data sets.

758 citations

Journal Article
TL;DR: Multiple imputation is an alternative method to deal with missing data, which accounts for the uncertainty associated with missing data, and provides unbiased and valid estimates of associations based on information from the available data.
Abstract: Missing data are ubiquitous in clinical epidemiological research. Individuals with missing data may differ from those with no missing data in terms of the outcome of interest and prognosis in general. Missing data are often categorized into the following three types: missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR). In clinical epidemiological research, missing data are seldom MCAR. Missing data can constitute considerable challenges in the analyses and interpretation of results and can potentially weaken the validity of results and conclusions. A number of methods have been developed for dealing with missing data. These include complete-case analyses, missing indicator method, single value imputation, and sensitivity analyses incorporating worst-case and best-case scenarios. If applied under the MCAR assumption, some of these methods can provide unbiased but often less precise estimates. Multiple imputation is an alternative method to deal with missing data, which accounts for the uncertainty associated with missing data. Multiple imputation is implemented in most statistical software under the MAR assumption and provides unbiased and valid estimates of associations based on information from the available data. The method affects not only the coefficient estimates for variables with missing data but also the estimates for other variables with no missing data.
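
The contrast between MCAR and MAR in the abstract above can be illustrated with a small simulation. This is only a sketch under assumed settings: the function name `simulate_complete_case`, the data-generating model (y = x + noise) and the missingness probabilities are all hypothetical choices made for illustration, not taken from the cited paper. It compares the complete-case mean of y when values are dropped completely at random with the complete-case mean when the chance of dropping y depends on the fully observed x.

```python
import random
from statistics import mean


def simulate_complete_case(n=20000, seed=0):
    """Return (full-data mean, MCAR complete-case mean, MAR complete-case mean) of y.

    Hypothetical set-up: x ~ N(0, 1) is always observed and y = x + N(0, 1),
    so the true mean of y is 0.
    """
    rng = random.Random(seed)
    y_all, y_mcar, y_mar = [], [], []
    for _ in range(n):
        x = rng.gauss(0, 1)
        y = x + rng.gauss(0, 1)
        y_all.append(y)

        # MCAR: y is missing with probability 0.5, regardless of any data.
        if rng.random() >= 0.5:
            y_mcar.append(y)

        # MAR: the probability that y is missing depends only on the observed x.
        p_missing = 0.8 if x > 0 else 0.2
        if rng.random() >= p_missing:
            y_mar.append(y)

    return mean(y_all), mean(y_mcar), mean(y_mar)


if __name__ == "__main__":
    full, mcar, mar = simulate_complete_case()
    print(f"full data: {full:.3f}  MCAR complete cases: {mcar:.3f}  "
          f"MAR complete cases: {mar:.3f}")
```

Under these assumptions the MCAR complete-case mean should stay close to the full-data mean (with fewer observations, hence less precision), while the MAR complete-case mean is pulled clearly below zero because records with large x, and therefore large y, are dropped more often.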

562 citations


Cites background or methods from "Multiple imputation and its applica..."

  • ...Auxiliary variables that are strongly associated with both the value and the missingness are more likely to have an impact on the results of multiple imputation and reduce bias.(19) Based on our knowledge of the data, research question, or literature, we may [Figure 4: Selection of variables in order to create multiple imputed datasets when looking into the association between body mass index and transfusion risk]...

    [...]

  • ...In the third stage, measures of association from each imputed dataset are combined by Rubin’s rules, with the corresponding standard errors (and hence the confidence intervals [CIs]) accounting for both the between- and within-imputation variations (Figure 5).(19,23) Multiple imputation algorithms are implemented in all major statistical software (eg, SPSS, Stata, SAS, and R), which contain many detailed examples and step-by-step tutorials on both univariate and multivariate multiple imputations....

    [...]
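
The pooling step mentioned in the excerpt above (combining estimates with Rubin's rules) can be sketched as follows. This is a minimal illustration, not the implementation used in any of the cited software; the function name `pool_rubin` and the example numbers are hypothetical. The pooled estimate is the mean of the per-imputation estimates, and the total variance is the average within-imputation variance plus (1 + 1/m) times the between-imputation variance.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m point estimates and their squared standard errors with Rubin's rules.

    estimates: one point estimate per imputed dataset
    variances: the matching squared standard errors (within-imputation variances)
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need at least two imputations and matching variances")

    pooled = mean(estimates)       # pooled point estimate
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # sample variance across imputations (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, sqrt(total)


if __name__ == "__main__":
    # Hypothetical estimates from three imputed datasets, each with variance 0.5.
    est, se = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
    print(f"pooled estimate = {est:.3f}, pooled SE = {se:.3f}")
```

The sketch stops at the pooled standard error; confidence intervals additionally need the reference t distribution and its degrees of freedom, which are omitted here.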

Journal Article
TL;DR: An eight-step procedure for better validation of meta-analytic results in systematic reviews of randomised clinical trials is proposed; if followed, it will increase the validity of assessments of intervention effects in such reviews.
Abstract: Background: Thresholds for statistical significance when assessing meta-analysis results are being insufficiently demonstrated by traditional 95% confidence intervals and P-values. Assessment of intervention effects in systematic reviews with meta-analysis deserves greater rigour. Methods: Methodologies for assessing statistical and clinical significance of intervention effects in systematic reviews were considered. Balancing simplicity and comprehensiveness, an operational procedure was developed, based mainly on The Cochrane Collaboration methodology and the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) guidelines. Results: We propose an eight-step procedure for better validation of meta-analytic results in systematic reviews: (1) Obtain the 95% confidence intervals and the P-values from both fixed-effect and random-effects meta-analyses and report the most conservative results as the main results. (2) Explore the reasons behind substantial statistical heterogeneity using subgroup and sensitivity analyses (see step 6). (3) To take account of problems with multiplicity, adjust the thresholds for significance according to the number of primary outcomes. (4) Calculate required information sizes (≈ the a priori required number of participants for a meta-analysis to be conclusive) for all outcomes and analyse each outcome with trial sequential analysis. Report whether the trial sequential monitoring boundaries for benefit, harm, or futility are crossed. (5) Calculate Bayes factors for all primary outcomes. (6) Use subgroup analyses and sensitivity analyses to assess the potential impact of bias on the review results. (7) Assess the risk of publication bias. (8) Assess the clinical significance of the statistically significant review results. Conclusions: If followed, the proposed eight-step procedure will increase the validity of assessments of intervention effects in systematic reviews of randomised clinical trials.

431 citations


Cites methods from "Multiple imputation and its applica..."

  • ...For all meta-analyses, we recommend using at least two sensitivity analyses to assess the potential impact of the missing outcome data (risk of attrition bias) on the meta-analysis results [67]....

    [...]

Journal Article
TL;DR: The role of HSCs in liver fibrosis is outlined and novel strategies to suppress HSC activity are detailed, thereby providing new insights into potential treatments for liver fibrosis.
Abstract: Liver fibrosis is a reversible wound-healing process aimed at maintaining organ integrity, and presents as the critical pre-stage of liver cirrhosis, which will eventually progress to hepatocellular carcinoma in the absence of liver transplantation. Fibrosis generally results from chronic hepatic injury caused by various factors, mainly viral infection, schistosomiasis, and alcoholism; however, the exact pathological mechanisms are still unknown. Although numerous drugs have been shown to have antifibrotic activity in vitro and in animal models, none of these drugs have been shown to be efficacious in the clinic. Importantly, hepatic stellate cells (HSCs) play a key role in the initiation, progression, and regression of liver fibrosis by secreting fibrogenic factors that encourage portal fibrocytes, fibroblasts, and bone marrow-derived myofibroblasts to produce collagen and thereby propagate fibrosis. These cells are subject to intricate cross-talk with adjacent cells, resulting in scarring and subsequent liver damage. Thus, an understanding of the molecular mechanisms of liver fibrosis and their relationships with HSCs is essential for the discovery of new therapeutic targets. This comprehensive review outlines the role of HSCs in liver fibrosis and details novel strategies to suppress HSC activity, thereby providing new insights into potential treatments for liver fibrosis.

363 citations