Author

Stef van Buuren

Bio: Stef van Buuren is an academic researcher from Utrecht University. The author has contributed to research in the topics Imputation (statistics) and Missing data. The author has an h-index of 42 and has co-authored 120 publications receiving 16,146 citations. Previous affiliations of Stef van Buuren include Erasmus University Rotterdam and the Netherlands Organisation for Applied Scientific Research.


Papers
Journal ArticleDOI
TL;DR: The R package mice adds new functionality for imputing multilevel data, automatic predictor selection, data handling, post-processing imputed values, specialized pooling routines, model selection tools, and diagnostic graphs.
Abstract: The R package mice imputes incomplete multivariate data by chained equations. The software mice 1.0 appeared in the year 2000 as an S-PLUS library, and in 2001 as an R package. mice 1.0 introduced predictor selection, passive imputation and automatic pooling. This article documents mice, which extends the functionality of mice 1.0 in several ways. In mice, the analysis of imputed data is made completely general, whereas the range of models under which pooling works is substantially extended. mice adds new functionality for imputing multilevel data, automatic predictor selection, data handling, post-processing imputed values, specialized pooling routines, model selection tools, and diagnostic graphs. Imputation of categorical data is improved in order to bypass problems caused by perfect prediction. Special attention is paid to transformations, sum scores, indices and interactions using passive imputation, and to the proper setup of the predictor matrix. mice can be downloaded from the Comprehensive R Archive Network. This article provides a hands-on, stepwise approach to solve applied incomplete data problems.

10,234 citations
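
The abstract above mentions pooling of analyses across imputed datasets. As a rough illustration only, the sketch below applies Rubin's rules to a scalar estimate in Python. It is not the mice implementation (mice is an R package), and the function name `pool_scalar` and the example numbers are hypothetical.

```python
from statistics import fmean, variance


def pool_scalar(estimates, variances):
    """Pool one scalar estimate over m imputed datasets using Rubin's rules.

    estimates: the m point estimates, one per imputed dataset (m >= 2)
    variances: the m squared standard errors of those estimates
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    q_bar = fmean(estimates)        # pooled point estimate
    u_bar = fmean(variances)        # average within-imputation variance
    b = variance(estimates)         # between-imputation variance (divisor m - 1)
    total = u_bar + (1 + 1 / m) * b  # total variance of the pooled estimate
    return q_bar, total


# Hypothetical estimates from three imputed datasets.
estimate, total_variance = pool_scalar([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The total variance exceeds the average within-imputation variance whenever the estimates differ across imputations, which is how the extra uncertainty due to missing data enters the pooled result.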

Book
29 Mar 2012
TL;DR: The book covers the problem of missing data, the concepts of MCAR, MAR and MNAR, simple solutions that do not (always) work, multiple imputation in a nutshell, and some dangers, do's and don'ts.
Abstract:
Basics
Introduction: The problem of missing data; Concepts of MCAR, MAR and MNAR; Simple solutions that do not (always) work; Multiple imputation in a nutshell; Goal of the book; What the book does not cover; Structure of the book; Exercises.
Multiple imputation: Historic overview; Incomplete data concepts; Why and when multiple imputation works; Statistical intervals and tests; Evaluation criteria; When to use multiple imputation; How many imputations?; Exercises.
Univariate missing data: How to generate multiple imputations; Imputation under the normal linear model; Imputation under non-normal distributions; Predictive mean matching; Categorical data; Other data types; Classification and regression trees; Multilevel data; Non-ignorable methods; Exercises.
Multivariate missing data: Missing data pattern; Issues in multivariate imputation; Monotone data imputation; Joint Modeling; Fully Conditional Specification; FCS and JM; Conclusion; Exercises.
Imputation in practice: Overview of modeling choices; Ignorable or non-ignorable?; Model form and predictors; Derived variables; Algorithmic options; Diagnostics; Conclusion; Exercises.
Analysis of imputed data: What to do with the imputed data?; Parameter pooling; Statistical tests for multiple imputation; Stepwise model selection; Conclusion; Exercises.
Case studies
Measurement issues: Too many columns; Sensitivity analysis; Correct prevalence estimates from self-reported data; Enhancing comparability; Exercises.
Selection issues: Correcting for selective drop-out; Correcting for non-response; Exercises.
Longitudinal data: Long and wide format; SE Fireworks Disaster Study; Time raster imputation; Conclusion; Exercises.
Extensions
Conclusion: Some dangers, some do's and some don'ts; Reporting; Other applications; Future developments; Exercises.
Appendices: Software (R; S-Plus; Stata; SAS; SPSS; Other software).
References. Author Index. Subject Index.

2,156 citations

Journal ArticleDOI
TL;DR: FCS is a semi-parametric and flexible alternative that specifies the multivariate model by a series of conditional models, one for each incomplete variable, but its statistical properties are difficult to establish.
Abstract: The goal of multiple imputation is to provide valid inferences for statistical estimates from incomplete data. To achieve that goal, imputed values should preserve the structure in the data, as well as the uncertainty about this structure, and include any knowledge about the process that generated the missing data. Two approaches for imputing multivariate data exist: joint modeling (JM) and fully conditional specification (FCS). JM is based on parametric statistical theory, and leads to imputation procedures whose statistical properties are known. JM is theoretically sound, but the joint model may lack flexibility needed to represent typical data features, potentially leading to bias. FCS is a semi-parametric and flexible alternative that specifies the multivariate model by a series of conditional models, one for each incomplete variable. FCS provides tremendous flexibility and is easy to apply, but its statistical properties are difficult to establish. Simulation work shows that FCS behaves very well in ...

2,119 citations
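
The FCS idea described in the abstract above, one conditional model per incomplete variable, cycled repeatedly, can be sketched for a two-column dataset. This is a deliberately simplified Python illustration and not the authors' algorithm. It assumes linear conditional models with normal noise, it does not draw the regression parameters from their posterior (so the imputations are not "proper"), and it assumes each column has at least two observed, non-constant values. The names `fit_line` and `fcs_impute` are hypothetical.

```python
import random
from statistics import fmean


def fit_line(x, y):
    """Least-squares fit of y = a + b * x; returns (a, b, residual_sd)."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    return a, b, (rss / max(len(x) - 2, 1)) ** 0.5


def fcs_impute(rows, n_iter=10, seed=0):
    """Return one completed copy of two-column data; None marks a missing cell."""
    rng = random.Random(seed)
    missing = [[value is None for value in row] for row in rows]
    data = [list(row) for row in rows]
    # Initialise every gap with a random observed value from the same column.
    for j in (0, 1):
        observed = [row[j] for row in rows if row[j] is not None]
        for i, row in enumerate(data):
            if missing[i][j]:
                row[j] = rng.choice(observed)
    for _ in range(n_iter):
        for j in (0, 1):  # one conditional model per incomplete variable
            k = 1 - j     # the other column acts as the predictor
            train = [i for i in range(len(data)) if not missing[i][j]]
            a, b, sd = fit_line([data[i][k] for i in train],
                                [data[i][j] for i in train])
            # Redraw the missing cells of column j from its conditional model.
            for i, row in enumerate(data):
                if missing[i][j]:
                    row[j] = a + b * row[k] + rng.gauss(0, sd)
    return data


# Hypothetical data with one gap in each column.
rows = [(1.0, 2.1), (2.0, None), (3.0, 6.2), (None, 8.0), (5.0, 9.9), (6.0, 12.2)]
completed = fcs_impute(rows)
```

Running the function several times with different seeds would give multiple completed datasets, each of which could then be analysed separately before the results are combined.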

Journal ArticleDOI
15 Nov 2011-PLOS ONE
TL;DR: Overweight and obesity prevalences in 2009 were substantially higher than in 1980 and 1997; however, the overweight prevalence stabilized in the major cities, which might be an indication that the rising trend in overweight in the Netherlands is starting to turn.
Abstract: Objective: To assess the prevalence of overweight and obesity among Dutch children and adolescents, to examine the 30-year trend, and to create new body mass index reference charts. Design: Nationwide cross-sectional data collection by trained health care professionals. Participants: 10,129 children of Dutch origin aged 0-21 years. Main Outcome Measures: Overweight (including obesity) and obesity prevalences for Dutch children, defined by the cut-off values on body mass index references according to the International Obesity Task Force. Results: In 2009, 12.8% of the Dutch boys and 14.8% of the Dutch girls aged 2-21 years were overweight and 1.8% of the boys and 2.2% of the girls were classified as obese. This is a two- to threefold higher prevalence of overweight and a four- to sixfold increase in obesity since 1980. Since 1997, a substantial rise took place, especially in obesity, which increased 1.4 times in girls and doubled in boys. There was no increase in mean BMI SDS in the major cities since 1997. Conclusions: Overweight and obesity prevalences in 2009 were substantially higher than in 1980 and 1997. However, the overweight prevalence stabilized in the major cities. This might be an indication that the rising trend in overweight in the Netherlands is starting to turn. © 2011 Schonbeck et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

323 citations


Cited by
Journal ArticleDOI
TL;DR: The principles of multiple imputation by chained equations are described, along with how to impute categorical and quantitative variables (including skewed variables) and how to analyse multiply imputed data in practice, including model building and model checking.
Abstract: Multiple imputation by chained equations is a flexible and practical approach to handling missing data. We describe the principles of the method and show how to impute categorical and quantitative variables, including skewed variables. We give guidance on how to specify the imputation model and how many imputations are needed. We describe the practical analysis of multiply imputed data, including model building and model checking. We stress the limitations of the method and discuss the possible pitfalls. We illustrate the ideas using a data set in mental health, giving Stata code fragments. Copyright © 2010 John Wiley & Sons, Ltd.

6,349 citations

01 Jan 2016
TL;DR: Modern Applied Statistics with S is available through a digital library whose online access is set as public, so it can be downloaded instantly.
Abstract: Thank you very much for downloading Modern Applied Statistics with S. As you may know, people have searched hundreds of times for their favorite readings like this one, but end up with harmful downloads. Rather than reading a good book with a cup of coffee in the afternoon, they instead cope with some harmful virus inside their laptop. Modern Applied Statistics with S is available in our digital library; online access to it is set as public, so you can download it instantly. Our digital library is hosted in multiple countries, allowing you to get the lowest latency when downloading any of our books like this one. Modern Applied Statistics with S is compatible with any device.

5,249 citations

Journal ArticleDOI
TL;DR: These recommendations recognize the importance of social and environmental change to reduce the obesity epidemic but also identify ways healthcare providers and health care systems can be part of broader efforts.
Abstract: To revise 1998 recommendations on childhood obesity, an Expert Committee, comprised of representatives from 15 professional organizations, appointed experienced scientists and clinicians to 3 writing groups to review the literature and recommend approaches to prevention, assessment, and treatment. Because effective strategies remain poorly defined, the writing groups used both available evidence and expert opinion to develop the recommendations. Primary care providers should universally assess children for obesity risk to improve early identification of elevated BMI, medical risks, and unhealthy eating and physical activity habits. Providers can provide obesity prevention messages for most children and suggest weight control interventions for those with excess weight. The writing groups also recommend changing office systems so that they support efforts to address the problem. BMI should be calculated and plotted at least annually, and the classification should be integrated with other information such as growth pattern, familial obesity, and medical risks to assess the child’s obesity risk. For prevention, the recommendations include both specific eating and physical activity behaviors, which are likely to promote maintenance of healthy weight, but also the use of patient-centered counseling techniques such as motivational interviewing, which helps families identify their own motivation for making change. For assessment, the recommendations include methods to screen for current medical conditions and for future risks, and methods to assess diet and physical activity behaviors. For treatment, the recommendations propose 4 stages of obesity care; the first is brief counseling that can be delivered in a health care office, and subsequent stages require more time and resources. The appropriateness of higher stages is influenced by a patient’s age and degree of excess weight. 
These recommendations recognize the importance of social and environmental change to reduce the obesity epidemic but also identify ways healthcare providers and health care systems can be part of broader efforts.

4,272 citations

Journal ArticleDOI
TL;DR: A book review entry for Multiple Imputation for Nonresponse in Surveys by D. B. Rubin (Wiley, 1987).
Abstract: 25. Multiple Imputation for Nonresponse in Surveys. By D. B. Rubin. ISBN 0 471 08705 X. Wiley, Chichester, 1987. 258 pp. £30.25.

3,216 citations