scispace - formally typeset
Author

Willi Sauerbrei

Bio: Willi Sauerbrei is an academic researcher from the University of Freiburg. He has contributed to research on regression analysis and breast cancer, has an h-index of 59, and has co-authored 227 publications receiving 30,936 citations. Previous affiliations of Willi Sauerbrei include University Medical Center Freiburg and the University of Erlangen-Nuremberg.


Papers
Journal ArticleDOI
TL;DR: The 10-year and 15-year effects of various systemic adjuvant therapies on breast cancer recurrence and survival are reported; the cumulative reduction in mortality is more than twice as great at 15 years after diagnosis as at 5 years.

6,309 citations

Journal Article
TL;DR: The age-specific benefits of polychemotherapy appeared to be largely irrespective of menopausal status at presentation, oestrogen receptor status of the primary tumour, and of whether adjuvant tamoxifen had been given.

2,945 citations

Journal ArticleDOI
TL;DR: In this article, the authors present guidelines for the reporting of tumor marker studies, which encourage transparent and complete reporting so that the relevant information will be available to others to help them to judge the usefulness of the data and understand the context in which the conclusions apply.
Abstract: Despite years of research and hundreds of reports on tumor markers in oncology, the number of markers that have emerged as clinically useful is pitifully small. Often, initially reported studies of a marker show great promise, but subsequent studies on the same or related markers yield inconsistent conclusions or stand in direct contradiction to the promising results. It is imperative that we attempt to understand the reasons that multiple studies of the same marker lead to differing conclusions. A variety of methodologic problems have been cited to explain these discrepancies. Unfortunately, many tumor marker studies have not been reported in a rigorous fashion, and published articles often lack sufficient information to allow adequate assessment of the quality of the study or the generalizability of study results. The development of guidelines for the reporting of tumor marker studies was a major recommendation of the National Cancer Institute-European Organisation for Research and Treatment of Cancer (NCI-EORTC) First International Meeting on Cancer Diagnostics in 2000. As for the successful CONSORT initiative for randomized trials and for the STARD statement for diagnostic studies, we suggest guidelines to provide relevant information about the study design, preplanned hypotheses, patient and specimen characteristics, assay methods, and statistical analysis methods. In addition, the guidelines suggest helpful presentations of data and important elements to include in discussions. The goal of these guidelines is to encourage transparent and complete reporting so that the relevant information will be available to others to help them to judge the usefulness of the data and understand the context in which the conclusions apply.

1,892 citations

Journal ArticleDOI
TL;DR: It is argued that the simplicity achieved is gained at a cost; dichotomization may create rather than avoid problems, notably a considerable loss of power and residual confounding.
Abstract: In medical research, continuous variables are often converted into categorical variables by grouping values into two or more categories. We consider in detail issues pertaining to creating just two groups, a common approach in clinical research. We argue that the simplicity achieved is gained at a cost; dichotomization may create rather than avoid problems, notably a considerable loss of power and residual confounding. In addition, the use of a data-derived 'optimal' cutpoint leads to serious bias. We illustrate the impact of dichotomization of continuous predictor variables using as a detailed case study a randomized trial in primary biliary cirrhosis. Dichotomization of continuous data is unnecessary for statistical analysis and in particular should not be applied to explanatory variables in regression models.
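The power loss from a median split can be seen in a short simulation. The sketch below (my illustration, not from the paper; all names and parameter values are arbitrary) compares the observed correlation with an outcome when a continuous predictor is used as-is versus dichotomized at its median:

```python
import numpy as np

rng = np.random.default_rng(0)
n, beta, n_sims = 100, 0.3, 2000

r_cont, r_dich = [], []
for _ in range(n_sims):
    x = rng.standard_normal(n)
    y = beta * x + rng.standard_normal(n)   # linear signal plus noise
    xd = (x > np.median(x)).astype(float)   # median split ("dichotomization")
    r_cont.append(np.corrcoef(x, y)[0, 1])
    r_dich.append(np.corrcoef(xd, y)[0, 1])

print(f"mean |r|, continuous predictor:   {np.mean(np.abs(r_cont)):.3f}")
print(f"mean |r|, dichotomized predictor: {np.mean(np.abs(r_dich)):.3f}")
```

The dichotomized predictor shows a systematically attenuated correlation (for a normal predictor, roughly a factor of 0.8), which translates directly into lost power, consistent with the paper's argument.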

1,853 citations

Journal Article
TL;DR: These guidelines are designed to encourage transparent and complete reporting of tumor marker studies so that the relevant information will be available to others to help them to judge the usefulness of the data and understand the context in which the conclusions apply.
Abstract: Despite years of research and hundreds of reports on tumor markers in oncology, the number of markers that have emerged as clinically useful is pitifully small. Often, initially reported studies of a marker show great promise, but subsequent studies on the same or related markers yield inconsistent conclusions or stand in direct contradiction to the promising results. It is imperative that we attempt to understand the reasons that multiple studies of the same marker lead to differing conclusions. A variety of methodologic problems have been cited to explain these discrepancies. Unfortunately, many tumor marker studies have not been reported in a rigorous fashion, and published articles often lack sufficient information to allow adequate assessment of the quality of the study or the generalizability of study results. The development of guidelines for the reporting of tumor marker studies was a major recommendation of the National Cancer Institute - European Organisation for Research and Treatment of Cancer (NCI - EORTC) First International Meeting on Cancer Diagnostics in 2000. As for the successful CONSORT initiative for randomized trials and for the STARD statement for diagnostic studies, we suggest guidelines to provide relevant information about the study design, preplanned hypotheses, patient and specimen characteristics, assay methods, and statistical analysis methods. In addition, the guidelines suggest helpful presentations of data and important elements to include in discussions. The goal of these guidelines is to encourage transparent and complete reporting so that the relevant information will be available to others to help them to judge the usefulness of the data and understand the context in which the conclusions apply.

1,782 citations


Cited by
Journal ArticleDOI
19 Apr 2000-JAMA
TL;DR: A checklist of specifications for reporting meta-analyses of observational studies in epidemiology, covering background, search strategy, methods, results, discussion, and conclusion, should improve the usefulness of meta-analyses for authors, reviewers, editors, readers, and decision makers.
Abstract: Objective: Because of the pressure for timely, informed decisions in public health and clinical practice and the explosion of information in the scientific literature, research results must be synthesized. Meta-analyses are increasingly used to address this problem, and they often evaluate observational studies. A workshop was held in Atlanta, Ga, in April 1997, to examine the reporting of meta-analyses of observational studies and to make recommendations to aid authors, reviewers, editors, and readers. Participants: Twenty-seven participants were selected by a steering committee, based on expertise in clinical practice, trials, statistics, epidemiology, social sciences, and biomedical editing. Deliberations of the workshop were open to other interested scientists. Funding for this activity was provided by the Centers for Disease Control and Prevention. Evidence: We conducted a systematic review of the published literature on the conduct and reporting of meta-analyses in observational studies using MEDLINE, Educational Research Information Center (ERIC), PsycLIT, and the Current Index to Statistics. We also examined reference lists of the 32 studies retrieved and contacted experts in the field. Participants were assigned to small-group discussions on the subjects of bias, searching and abstracting, heterogeneity, study categorization, and statistical methods. Consensus Process: From the material presented at the workshop, the authors developed a checklist summarizing recommendations for reporting meta-analyses of observational studies. The checklist and supporting evidence were circulated to all conference attendees and additional experts. All suggestions for revisions were addressed. Conclusions: The proposed checklist contains specifications for reporting of meta-analyses of observational studies in epidemiology, including background, search strategy, methods, results, discussion, and conclusion. Use of the checklist should improve the usefulness of meta-analyses for authors, reviewers, editors, readers, and decision makers. An evaluation plan is suggested and research areas are explored.

17,663 citations

Journal ArticleDOI
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. 
Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

13,246 citations

Journal ArticleDOI
31 Jan 2002-Nature
TL;DR: DNA microarray analysis on primary breast tumours of 117 young patients is used and supervised classification is applied to identify a gene expression signature strongly predictive of a short interval to distant metastases (‘poor prognosis’ signature) in patients without tumour cells in local lymph nodes at diagnosis, providing a strategy to select patients who would benefit from adjuvant therapy.
Abstract: Breast cancer patients with the same stage of disease can have markedly different treatment responses and overall outcome. The strongest predictors for metastases (for example, lymph node status and histological grade) fail to classify accurately breast tumours according to their clinical behaviour. Chemotherapy or hormonal therapy reduces the risk of distant metastases by approximately one-third; however, 70-80% of patients receiving this treatment would have survived without it. None of the signatures of breast cancer gene expression reported to date allow for patient-tailored therapy strategies. Here we used DNA microarray analysis on primary breast tumours of 117 young patients, and applied supervised classification to identify a gene expression signature strongly predictive of a short interval to distant metastases ('poor prognosis' signature) in patients without tumour cells in local lymph nodes at diagnosis (lymph node negative). In addition, we established a signature that identifies tumours of BRCA1 carriers. The poor prognosis signature consists of genes regulating cell cycle, invasion, metastasis and angiogenesis. This gene expression profile will outperform all currently used clinical parameters in predicting disease outcome. Our findings provide a strategy to select patients who would benefit from adjuvant therapy.

9,664 citations

Journal ArticleDOI
TL;DR: In this article, an easily interpretable index of predictive discrimination as well as methods for assessing calibration of predicted survival probabilities are discussed, which are particularly needed for binary, ordinal, and time-to-event outcomes.
Abstract: Multivariable regression models are powerful tools that are used frequently in studies of clinical outcomes. These models can use a mixture of categorical and continuous variables and can handle partially observed (censored) responses. However, uncritical application of modelling techniques can result in models that poorly fit the dataset at hand, or, even more likely, inaccurately predict outcomes on new subjects. One must know how to measure qualities of a model's fit in order to avoid poorly fitted or overfitted models. Measurement of predictive accuracy can be difficult for survival time data in the presence of censoring. We discuss an easily interpretable index of predictive discrimination as well as methods for assessing calibration of predicted survival probabilities. Both types of predictive accuracy should be unbiasedly validated using bootstrapping or cross-validation, before using predictions in a new data series. We discuss some of the hazards of poorly fitted and overfitted regression models and present one modelling strategy that avoids many of the problems discussed. The methods described are applicable to all regression models, but are particularly needed for binary, ordinal, and time-to-event outcomes. Methods are illustrated with a survival analysis in prostate cancer using Cox regression.
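The "easily interpretable index of predictive discrimination" discussed here is Harrell's concordance (c) index. As a sketch under that assumption (the function name and toy data are mine, purely illustrative), a minimal O(n²) implementation for right-censored survival data:

```python
import numpy as np

def concordance_index(time, event, risk):
    """Harrell's c-index: among usable pairs, the fraction where the
    subject with the higher predicted risk fails earlier.
    time: observed time; event: 1 if failure observed, 0 if censored;
    risk: higher value = predicted worse outcome."""
    n = len(time)
    concordant, usable = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            if time[i] == time[j]:
                continue  # tied times skipped in this simple sketch
            first, second = (i, j) if time[i] < time[j] else (j, i)
            if not event[first]:
                continue  # earlier time censored: ordering unknown
            usable += 1
            if risk[first] > risk[second]:
                concordant += 1.0
            elif risk[first] == risk[second]:
                concordant += 0.5  # ties in risk count half
    return concordant / usable

# toy data: predicted risks perfectly track failure order
time  = np.array([5.0, 3.0, 9.0, 7.0, 2.0])
event = np.array([1, 1, 0, 1, 1])
risk  = np.array([0.6, 0.8, 0.1, 0.3, 0.9])
print(concordance_index(time, event, risk))  # → 1.0
```

A value of 0.5 indicates no discrimination and 1.0 perfect discrimination; as the abstract stresses, the index should be validated with bootstrapping or cross-validation rather than computed only on the training data.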

7,879 citations

Journal ArticleDOI
TL;DR: The principles of the method are described, showing how to impute categorical and quantitative variables, including skewed variables; the practical analysis of multiply imputed data, including model building and model checking, is also covered.
Abstract: Multiple imputation by chained equations is a flexible and practical approach to handling missing data. We describe the principles of the method and show how to impute categorical and quantitative variables, including skewed variables. We give guidance on how to specify the imputation model and how many imputations are needed. We describe the practical analysis of multiply imputed data, including model building and model checking. We stress the limitations of the method and discuss the possible pitfalls. We illustrate the ideas using a data set in mental health, giving Stata code fragments. Copyright © 2010 John Wiley & Sons, Ltd.
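The paper illustrates the method with Stata code fragments; as a language-neutral illustration of the chained-equations idea only (not the paper's code; the toy data and all names are mine), here is a hand-rolled two-variable sketch: cycle through the columns, regress each on the other, and redraw its missing values from the fitted model plus residual noise.

```python
import numpy as np

rng = np.random.default_rng(1)

# toy data: two correlated variables with ~20% of values missing at random
n = 200
x1 = rng.standard_normal(n)
x2 = 0.8 * x1 + 0.6 * rng.standard_normal(n)
X = np.column_stack([x1, x2])
X_obs = X.copy()
X_obs[rng.random((n, 2)) < 0.2] = np.nan

# chained equations: start from mean imputation, then iterate
X_imp = np.where(np.isnan(X_obs), np.nanmean(X_obs, axis=0), X_obs)
for _ in range(10):                      # a few cycles usually suffice
    for j in (0, 1):
        k = 1 - j                        # the other column as predictor
        obs = ~np.isnan(X_obs[:, j])
        A = np.column_stack([np.ones(obs.sum()), X_imp[obs, k]])
        coef, *_ = np.linalg.lstsq(A, X_obs[obs, j], rcond=None)
        resid_sd = np.std(X_obs[obs, j] - A @ coef)
        mis = ~obs
        pred = coef[0] + coef[1] * X_imp[mis, k]
        # draw imputations, not just predictions, to preserve variability
        X_imp[mis, j] = pred + resid_sd * rng.standard_normal(mis.sum())

print("correlation after imputation:", np.corrcoef(X_imp.T)[0, 1].round(2))
```

Drawing from the predictive distribution (rather than filling in the regression prediction itself) preserves the variables' joint variability, which is what distinguishes multiple imputation from deterministic single imputation; in practice each imputation would be repeated to produce several completed data sets that are analysed and pooled.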

6,349 citations