Home
/
Authors
/
Ewout W. Steyerberg

Author

Ewout W. Steyerberg

Other affiliations: Indiana University, Ottawa Hospital, University of Edinburgh ...read more

Bio: Ewout W. Steyerberg is an academic researcher from Leiden University Medical Center. The author has contributed to research in topics: Population & Medicine. The author has an hindex of 139, co-authored 1226 publications receiving 84896 citations. Previous affiliations of Ewout W. Steyerberg include Indiana University & Ottawa Hospital.

Papers published on a yearly basis

2023
2022
2021
2020
2019
2018
2017
2016
2015
2014
2013
2012
2011
2010
2009
2008
2007
2006
2005
2004
2003
2002
2001
2000
1999
1998
1997
1996
1995
1994
1993
1992

Papers

PDF

Open Access

More filters

Journal Article•DOI•

Assessing the performance of prediction models: a framework for traditional and novel measures.

[...]

Ewout W. Steyerberg¹, Andrew J. Vickers², Nancy R. Cook³, Thomas A. Gerds⁴, Mithat Gonen², Nancy A Obuchowski, Michael J. Pencina⁵, Michael W. Kattan⁶ - Show less +4 more•Institutions (6)

Erasmus University Rotterdam¹, Memorial Sloan Kettering Cancer Center², Harvard University³, University of Copenhagen⁴, Boston University⁵, Case Western Reserve University⁶

01 Jan 2010-Epidemiology

TL;DR: It is suggested that reporting discrimination and calibration will always be important for a prediction model and decision-analytic measures should be reported if the predictive model is to be used for clinical decisions.

...read moreread less

Abstract: The performance of prediction models can be assessed using a variety of methods and metrics. Traditional measures for binary and survival outcomes include the Brier score to indicate overall model performance, the concordance (or c) statistic for discriminative ability (or area under the receiver operating characteristic [ROC] curve), and goodness-of-fit statistics for calibration.Several new measures have recently been proposed that can be seen as refinements of discrimination measures, including variants of the c statistic for survival, reclassification tables, net reclassification improvement (NRI), and integrated discrimination improvement (IDI). Moreover, decision-analytic measures have been proposed, including decision curves to plot the net benefit achieved by making decisions based on model predictions.We aimed to define the role of these relatively novel approaches in the evaluation of the performance of prediction models. For illustration, we present a case study of predicting the presence of residual tumor versus benign tissue in patients with testicular cancer (n = 544 for model development, n = 273 for external validation).We suggest that reporting discrimination and calibration will always be important for a prediction model. Decision-analytic measures should be reported if the predictive model is to be used for clinical decisions. Other measures of performance may be warranted in specific applications, such as reclassification metrics to gain insight into the value of adding a novel predictor to an established model.

...read moreread less

3,473 citations

Journal Article•DOI•

Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD): explanation and elaboration.

[...]

Karel G.M. Moons¹, Douglas G. Altman², Johannes B. Reitsma¹, John P. A. Ioannidis³, Petra Macaskill⁴, Ewout W. Steyerberg⁵, Andrew J. Vickers⁶, David F. Ransohoff, Gary S. Collins² - Show less +5 more•Institutions (6)

Utrecht University¹, University of Oxford², Stanford University³, University of Sydney⁴, Erasmus University Rotterdam⁵, Memorial Sloan Kettering Cancer Center⁶

06 Jan 2015-Annals of Internal Medicine

TL;DR: In virtually all medical domains, diagnostic and prognostic multivariable prediction models are being developed, validated, updated, and implemented with the aim to assist doctors and individuals in estimating probabilities and potentially influence their decision making.

...read moreread less

Abstract: The TRIPOD (Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis) Statement includes a 22-item checklist, which aims to improve the reporting of studies developing, validating, or updating a prediction model, whether for diagnostic or prognostic purposes. The TRIPOD Statement aims to improve the transparency of the reporting of a prediction model study regardless of the study methods used. This explanation and elaboration document describes the rationale; clarifies the meaning of each item; and discusses why transparent reporting is important, with a view to assessing risk of bias and clinical usefulness of the prediction model. Each checklist item of the TRIPOD Statement is explained in detail and accompanied by published examples of good reporting. The document also provides a valuable reference of issues to consider when designing, conducting, and analyzing prediction model studies. To aid the editorial process and help peer reviewers and, ultimately, readers and systematic reviewers of prediction model studies, it is recommended that authors include a completed checklist in their submission. The TRIPOD checklist can also be downloaded from www.tripod-statement.org.

...read moreread less

2,982 citations

Book•

Clinical Prediction Models

[...]

Ewout W. Steyerberg

16 Mar 2009

TL;DR: This paper presents a case study on survival analysis: Prediction of secondary cardiovascular events and lessons from case studies on overfitting and optimism in prediction models.

...read moreread less

Abstract: Introduction.- Applications of prediction models.- Study design for prediction models.- Statistical models for prediction.- Overfitting and optimism in prediction models.- Choosing between alternative statistical models.- Dealing with missing values.- Case study on dealing with missing values.- Coding of categorical and continuous predictors.- Restrictions on candidate predictors.- Selection of main effects.- Assumptions in regression models: Additivity and linearity.- Modern estimation methods.- Estimation with external methods.- Evaluation of performance.- Clinical usefulness.- Validation of prediction models.- Presentation formats.- Patterns of external validity.- Updating for a new setting.- Updating for a multiple settings.- Prediction of a binary outcome: 30-day mortality after acute myocardial infarction.- Case study on survival analysis: Prediction of secondary cardiovascular events.- Lessons from case studies.

...read moreread less

2,771 citations

Journal Article•DOI•

Prediction models for diagnosis and prognosis of covid-19: systematic review and critical appraisal

[...]

Laure Wynants¹, Laure Wynants², Ben Van Calster², Ben Van Calster³, Gary S. Collins⁴, Gary S. Collins⁵, Richard D Riley⁶, Georg Heinze⁷, Ewoud Schuit⁸, Marc J.M. Bonten⁸, Darren Dahly⁹, Johanna A A G Damen⁸, Thomas P. A. Debray⁸, Valentijn M.T. de Jong⁸, Maarten De Vos², Paula Dhiman⁵, Paula Dhiman⁴, Maria C Haller⁷, Michael O. Harhay¹⁰, Liesbet Henckaerts², Pauline Heus⁸, Michael Kammer⁷, Nina Kreuzberger¹¹, Anna Lohmann³, Kim Luijken³, Jie Ma⁴, Glen P. Martin¹², David J. McLernon¹³, Constanza L Andaur Navarro⁸, Johannes B. Reitsma⁸, Jamie C. Sergeant¹², Chunhu Shi¹⁴, Nicole Skoetz⁷, Luc J.M. Smits¹, Kym I E Snell⁶, Matthew Sperrin¹⁵, René Spijker¹⁶, René Spijker⁸, Ewout W. Steyerberg³, Toshihiko Takada⁸, Ioanna Tzoulaki¹⁷, Ioanna Tzoulaki¹⁸, Sander M. J. van Kuijk¹⁹, Bas C T van Bussel¹⁹, Bas C T van Bussel¹, Iwan C. C. van der Horst¹⁹, Florien S. van Royen⁸, Jan Y Verbakel⁵, Jan Y Verbakel², Christine Wallisch⁷, Christine Wallisch²⁰, Jack Wilkinson¹², Robert Wolff, Lotty Hooft⁸, Karel G.M. Moons⁸, Maarten van Smeden⁸ - Show less +52 more•Institutions (20)

Public Health Research Institute¹, Katholieke Universiteit Leuven², Leiden University³, John Radcliffe Hospital⁴, University of Oxford⁵, Keele University⁶, Medical University of Vienna⁷, University Medical Center Utrecht⁸, University College Cork⁹, University of Pennsylvania¹⁰, University of Cologne¹¹, Manchester Academic Health Science Centre¹², University of Aberdeen¹³, RMIT University¹⁴, University of Manchester¹⁵, University of Amsterdam¹⁶, University of Ioannina¹⁷, Imperial College London¹⁸, Maastricht University Medical Centre¹⁹, Humboldt University of Berlin²⁰

07 Apr 2020-BMJ

TL;DR: Proposed models for covid-19 are poorly reported, at high risk of bias, and their reported performance is probably optimistic, according to a review of published and preprint reports.

...read moreread less

Abstract: Objective To review and appraise the validity and usefulness of published and preprint reports of prediction models for diagnosing coronavirus disease 2019 (covid-19) in patients with suspected infection, for prognosis of patients with covid-19, and for detecting people in the general population at increased risk of covid-19 infection or being admitted to hospital with the disease. Design Living systematic review and critical appraisal by the COVID-PRECISE (Precise Risk Estimation to optimise covid-19 Care for Infected or Suspected patients in diverse sEttings) group. Data sources PubMed and Embase through Ovid, up to 1 July 2020, supplemented with arXiv, medRxiv, and bioRxiv up to 5 May 2020. Study selection Studies that developed or validated a multivariable covid-19 related prediction model. Data extraction At least two authors independently extracted data using the CHARMS (critical appraisal and data extraction for systematic reviews of prediction modelling studies) checklist; risk of bias was assessed using PROBAST (prediction model risk of bias assessment tool). Results 37 421 titles were screened, and 169 studies describing 232 prediction models were included. The review identified seven models for identifying people at risk in the general population; 118 diagnostic models for detecting covid-19 (75 were based on medical imaging, 10 to diagnose disease severity); and 107 prognostic models for predicting mortality risk, progression to severe disease, intensive care unit admission, ventilation, intubation, or length of hospital stay. The most frequent types of predictors included in the covid-19 prediction models are vital signs, age, comorbidities, and image features. Flu-like symptoms are frequently predictive in diagnostic models, while sex, C reactive protein, and lymphocyte counts are frequent prognostic factors. Reported C index estimates from the strongest form of validation available per model ranged from 0.71 to 0.99 in prediction models for the general population, from 0.65 to more than 0.99 in diagnostic models, and from 0.54 to 0.99 in prognostic models. All models were rated at high or unclear risk of bias, mostly because of non-representative selection of control patients, exclusion of patients who had not experienced the event of interest by the end of the study, high risk of model overfitting, and unclear reporting. Many models did not include a description of the target population (n=27, 12%) or care setting (n=75, 32%), and only 11 (5%) were externally validated by a calibration plot. The Jehi diagnostic model and the 4C mortality score were identified as promising models. Conclusion Prediction models for covid-19 are quickly entering the academic literature to support medical decision making at a time when they are urgently needed. This review indicates that almost all pubished prediction models are poorly reported, and at high risk of bias such that their reported predictive performance is probably optimistic. However, we have identified two (one diagnostic and one prognostic) promising models that should soon be validated in multiple cohorts, preferably through collaborative efforts and data sharing to also allow an investigation of the stability and heterogeneity in their performance across populations and settings. Details on all reviewed models are publicly available at https://www.covprecise.org/. Methodological guidance as provided in this paper should be followed because unreliable predictions could cause more harm than benefit in guiding clinical decisions. Finally, prediction model authors should adhere to the TRIPOD (transparent reporting of a multivariable prediction model for individual prognosis or diagnosis) reporting guideline. Systematic review registration Protocol https://osf.io/ehc47/, registration https://osf.io/wy245. Readers’ note This article is a living systematic review that will be updated to reflect emerging evidence. Updates may occur for up to two years from the date of original publication. This version is update 3 of the original article published on 7 April 2020 (BMJ 2020;369:m1328). Previous updates can be found as data supplements (https://www.bmj.com/content/369/bmj.m1328/related#datasupp). When citing this paper please consider adding the update number and date of access for clarity.

...read moreread less

2,183 citations

Journal Article•DOI•

Internal validation of predictive models: efficiency of some procedures for logistic regression analysis.

[...]

Ewout W. Steyerberg¹, Frank E. Harrell², Gerard J. J. M. Borsboom¹, Marinus J.C. Eijkemans¹, Yvonne Vergouwe¹, J. Dik F. Habbema¹ - Show less +2 more•Institutions (2)

Erasmus University Rotterdam¹, University of Virginia²

01 Aug 2001-Journal of Clinical Epidemiology

TL;DR: It is concluded that split-sample validation is inefficient, and bootstrapping is recommended for estimation of internal validity of a predictive logistic regression model.

...read moreread less

2,155 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse

Cited by

PDF

Open Access

More filters

Journal Article•DOI•

Measuring inconsistency in meta-analyses

[...]

Julian P T Higgins¹, Simon G. Thompson¹, Jonathan J Deeks, Douglas G. Altman•Institutions (1)

University of Cambridge¹

04 Sep 2003-BMJ

TL;DR: A new quantity is developed, I 2, which the authors believe gives a better measure of the consistency between trials in a meta-analysis, which is susceptible to the number of trials included in the meta- analysis.

...read moreread less

Abstract: Cochrane Reviews have recently started including the quantity I 2 to help readers assess the consistency of the results of studies in meta-analyses. What does this new quantity mean, and why is assessment of heterogeneity so important to clinical practice? Systematic reviews and meta-analyses can provide convincing and reliable evidence relevant to many aspects of medicine and health care.1 Their value is especially clear when the results of the studies they include show clinically important effects of similar magnitude. However, the conclusions are less clear when the included studies have differing results. In an attempt to establish whether studies are consistent, reports of meta-analyses commonly present a statistical test of heterogeneity. The test seeks to determine whether there are genuine differences underlying the results of the studies (heterogeneity), or whether the variation in findings is compatible with chance alone (homogeneity). However, the test is susceptible to the number of trials included in the meta-analysis. We have developed a new quantity, I 2, which we believe gives a better measure of the consistency between trials in a meta-analysis. Assessment of the consistency of effects across studies is an essential part of meta-analysis. Unless we know how consistent the results of studies are, we cannot determine the generalisability of the findings of the meta-analysis. Indeed, several hierarchical systems for grading evidence state that the results of studies must be consistent or homogeneous to obtain the highest grading.2–4 Tests for heterogeneity are commonly used to decide on methods for combining studies and for concluding consistency or inconsistency of findings.5 6 But what does the test achieve in practice, and how should the resulting P values be interpreted? A test for heterogeneity examines the null hypothesis that all studies are evaluating the same effect. The usual test statistic …

...read moreread less

45,105 citations

Journal Article•DOI•

Machine learning

[...]

Thomas G. Dietterich¹•Institutions (1)

Oregon State University¹

01 Dec 1996-ACM Computing Surveys

TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.

...read moreread less

Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

...read moreread less

13,246 citations

Book•

世界経済・社会統計 = World development indicators

[...]

泰彦鳥居, 裕秋山

01 Jan 1998

9,675 citations

Standards of Medical Care in Diabetes

[...]

Harry J. Morris

01 Jan 2014

TL;DR: These standards of care are intended to provide clinicians, patients, researchers, payors, and other interested individuals with the components of diabetes care, treatment goals, and tools to evaluate the quality of care.

...read moreread less

Abstract: XI. STRATEGIES FOR IMPROVING DIABETES CARE D iabetes is a chronic illness that requires continuing medical care and patient self-management education to prevent acute complications and to reduce the risk of long-term complications. Diabetes care is complex and requires that many issues, beyond glycemic control, be addressed. A large body of evidence exists that supports a range of interventions to improve diabetes outcomes. These standards of care are intended to provide clinicians, patients, researchers, payors, and other interested individuals with the components of diabetes care, treatment goals, and tools to evaluate the quality of care. While individual preferences, comorbidities, and other patient factors may require modification of goals, targets that are desirable for most patients with diabetes are provided. These standards are not intended to preclude more extensive evaluation and management of the patient by other specialists as needed. For more detailed information, refer to Bode (Ed.): Medical Management of Type 1 Diabetes (1), Burant (Ed): Medical Management of Type 2 Diabetes (2), and Klingensmith (Ed): Intensive Diabetes Management (3). The recommendations included are diagnostic and therapeutic actions that are known or believed to favorably affect health outcomes of patients with diabetes. A grading system (Table 1), developed by the American Diabetes Association (ADA) and modeled after existing methods, was utilized to clarify and codify the evidence that forms the basis for the recommendations. The level of evidence that supports each recommendation is listed after each recommendation using the letters A, B, C, or E.

...read moreread less

9,618 citations

Journal Article•DOI•

Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls

[...]

Paul Burton¹, David Clayton², Lon R. Cardon, Nicholas John Craddock³ +192 more•Institutions (4)

07 Jun 2007-Nature

TL;DR: This study has demonstrated that careful use of a shared control group represents a safe and effective approach to GWA analyses of multiple disease phenotypes; generated a genome-wide genotype database for future studies of common diseases in the British population; and shown that, provided individuals with non-European ancestry are excluded, the extent of population stratification in theBritish population is generally modest.

...read moreread less

Abstract: There is increasing evidence that genome-wide association ( GWA) studies represent a powerful approach to the identification of genes involved in common human diseases. We describe a joint GWA study ( using the Affymetrix GeneChip 500K Mapping Array Set) undertaken in the British population, which has examined similar to 2,000 individuals for each of 7 major diseases and a shared set of similar to 3,000 controls. Case-control comparisons identified 24 independent association signals at P < 5 X 10(-7): 1 in bipolar disorder, 1 in coronary artery disease, 9 in Crohn's disease, 3 in rheumatoid arthritis, 7 in type 1 diabetes and 3 in type 2 diabetes. On the basis of prior findings and replication studies thus-far completed, almost all of these signals reflect genuine susceptibility effects. We observed association at many previously identified loci, and found compelling evidence that some loci confer risk for more than one of the diseases studied. Across all diseases, we identified a large number of further signals ( including 58 loci with single-point P values between 10(-5) and 5 X 10(-7)) likely to yield additional susceptibility loci. The importance of appropriately large samples was confirmed by the modest effect sizes observed at most loci identified. This study thus represents a thorough validation of the GWA approach. It has also demonstrated that careful use of a shared control group represents a safe and effective approach to GWA analyses of multiple disease phenotypes; has generated a genome-wide genotype database for future studies of common diseases in the British population; and shown that, provided individuals with non-European ancestry are excluded, the extent of population stratification in the British population is generally modest. Our findings offer new avenues for exploring the pathophysiology of these important disorders. We anticipate that our data, results and software, which will be widely available to other investigators, will provide a powerful resource for human genetics research.

...read moreread less

9,244 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse