Journal ArticleDOI

On Comparing Classifiers: Pitfalls to Avoid and a Recommended Approach

Steven L. Salzberg
31 Jan 1997
Vol. 1, Iss. 3, pp. 317-328
TL;DR: Several phenomena can, if ignored, invalidate an experimental comparison and the conclusions that follow; these pitfalls apply not only to classification but to computational experiments in almost any aspect of data mining.
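One of the phenomena the paper warns about is the multiplicity effect: running many significance tests at a fixed level makes spurious "significant" differences likely, and a Bonferroni-style adjustment is one remedy. A minimal sketch using the standard textbook formulas (these helper names and numbers are illustrative, not from the paper):

```python
def familywise_error(alpha, m):
    # Probability of at least one spurious "significant" result when
    # running m independent tests, each at significance level alpha.
    return 1 - (1 - alpha) ** m

def bonferroni_alpha(alpha, m):
    # Bonferroni-adjusted per-test threshold that keeps the
    # familywise error rate at (or below) alpha across m tests.
    return alpha / m

# With 14 pairwise comparisons at alpha = 0.05, the chance of at least
# one false positive already exceeds 50%.
print(familywise_error(0.05, 14))
print(bonferroni_alpha(0.05, 14))
```

This is why a comparison of many classifiers on many datasets needs corrected thresholds rather than repeated uncorrected t-tests.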


Citations
Journal Article

Statistical Comparisons of Classifiers over Multiple Data Sets

TL;DR: A set of simple yet safe and robust non-parametric tests for statistical comparisons of classifiers is recommended: the Wilcoxon signed-ranks test for comparing two classifiers, and the Friedman test with the corresponding post-hoc tests for comparing more classifiers over multiple data sets.
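As a rough sketch of the Friedman test mentioned above (without the post-hoc step, and with made-up accuracy values), the statistic can be computed from per-dataset average ranks using the standard formula chi2 = 12N/(k(k+1)) * (sum of R_j^2 - k(k+1)^2/4):

```python
def friedman_statistic(scores):
    # scores[i][j] = accuracy of classifier j on dataset i.
    # Returns (average ranks per classifier, Friedman chi-square statistic).
    N, k = len(scores), len(scores[0])
    rank_sums = [0.0] * k
    for row in scores:
        # Rank classifiers within this dataset, best (highest accuracy) = rank 1;
        # tied values receive the average of the ranks they span.
        order = sorted(range(k), key=lambda j: -row[j])
        ranks = [0.0] * k
        i = 0
        while i < k:
            j = i
            while j + 1 < k and row[order[j + 1]] == row[order[i]]:
                j += 1
            avg_rank = (i + j) / 2 + 1
            for t in range(i, j + 1):
                ranks[order[t]] = avg_rank
            i = j + 1
        for j2 in range(k):
            rank_sums[j2] += ranks[j2]
    R = [s / N for s in rank_sums]
    chi2 = 12 * N / (k * (k + 1)) * (sum(r * r for r in R) - k * (k + 1) ** 2 / 4)
    return R, chi2

# Hypothetical accuracies: 3 classifiers on 4 datasets, A always best.
R, chi2 = friedman_statistic([[0.9, 0.8, 0.7]] * 4)
print(R, chi2)
```

In practice one would compare the statistic against the appropriate distribution and, if significant, proceed to the post-hoc tests the paper recommends.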
Posted Content

Principles of data mining

TL;DR: This paper gives a lightning overview of data mining and its relation to statistics, with particular emphasis on tools for the detection of adverse drug reactions.
Journal ArticleDOI

Neural networks for classification: a survey

TL;DR: The issues of posterior probability estimation, the link between neural and conventional classifiers, learning and generalization tradeoff in classification, the feature variable selection, as well as the effect of misclassification costs are examined.
Journal ArticleDOI

Logistic regression and artificial neural network classification models: a methodology review

TL;DR: This review summarizes the differences and similarities of logistic regression and artificial neural network models from a technical point of view and compares them with other machine learning algorithms using a set of quality criteria.
Journal ArticleDOI

Querying and mining of time series data: experimental comparison of representations and distance measures

TL;DR: An extensive set of time series experiments is conducted: 8 representation methods and 9 similarity measures (with their variants) are re-implemented and tested for effectiveness on 38 time series data sets from a wide variety of application domains, providing a unified validation of some existing results.
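The difference between lock-step and elastic similarity measures that such comparisons evaluate can be illustrated with a minimal sketch (pure Python, made-up series; real benchmarks use optimized implementations):

```python
def lockstep_l1(a, b):
    # Lock-step distance: compares points at identical time indices only.
    return sum(abs(x - y) for x, y in zip(a, b))

def dtw_distance(a, b):
    # Dynamic Time Warping: allows elastic alignment of the time axis,
    # computed with the classic O(n*m) dynamic program.
    n, m = len(a), len(b)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

# Two series with the same shape, shifted by one step: DTW sees them as
# identical, while the lock-step distance penalizes the misalignment.
a = [0, 0, 1, 0, 0]
b = [0, 1, 0, 0, 0]
print(lockstep_l1(a, b), dtw_distance(a, b))
```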
References
Journal ArticleDOI

Bayesian Model Selection in Social Research

TL;DR: In this article, a Bayesian approach to hypothesis testing, model selection, and accounting for model uncertainty is presented; it is made straightforward by the simple and accurate BIC approximation and can be carried out using the output of standard software.
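As a brief illustration of why the BIC approximation makes this practical: BIC = k ln n - 2 ln L can be computed directly from quantities standard software reports (the numbers below are hypothetical):

```python
import math

def bic(log_likelihood, k, n):
    # Bayesian Information Criterion: k parameters, n observations,
    # log_likelihood = ln of the maximized likelihood. Lower is better.
    return k * math.log(n) - 2 * log_likelihood

# Two hypothetical models with equal fit: the one with more
# parameters is penalized and receives the higher (worse) BIC.
print(bic(-100.0, 2, 500))
print(bic(-100.0, 5, 500))
```

The difference in BIC between two models approximates twice the log of their Bayes factor, which is what enables Bayesian model comparison from ordinary regression output.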
Proceedings Article

Multi-Interval Discretization of Continuous-Valued Attributes for Classification Learning

TL;DR: This paper addresses the use of the entropy minimization heuristic for discretizing the range of a continuous-valued attribute into multiple intervals.
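A minimal sketch of one entropy-minimizing cut (the full method applies this recursively with a stopping criterion to obtain multiple intervals; the toy data here is made up):

```python
import math

def entropy(labels):
    # Shannon entropy (in bits) of a list of class labels.
    n = len(labels)
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def best_cut(values, labels):
    # Try each boundary between distinct sorted values and return the
    # cut point that minimizes the weighted class entropy of the two sides.
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_score, best_point = float("inf"), None
    for i in range(1, n):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no boundary between equal values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
        if score < best_score:
            best_score = score
            best_point = (pairs[i - 1][0] + pairs[i][0]) / 2
    return best_point

# A clean two-cluster attribute: the chosen cut separates the classes exactly.
print(best_cut([1, 2, 3, 10, 11, 12], ["a", "a", "a", "b", "b", "b"]))
```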
Journal ArticleDOI

Very Simple Classification Rules Perform Well on Most Commonly Used Datasets

TL;DR: On most datasets studied, the best of very simple rules that classify examples on the basis of a single attribute is as accurate as the rules induced by the majority of machine learning systems.
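A single-attribute rule of the kind this paper evaluates can be sketched as follows (hypothetical toy data; the original 1R algorithm also discretizes numeric attributes and selects the best attribute by training error, which this sketch omits):

```python
from collections import Counter

def one_rule(column, labels):
    # For each value of a single attribute, predict the majority class
    # among the training examples that have that value.
    rule = {}
    for v in set(column):
        ys = [y for x, y in zip(column, labels) if x == v]
        rule[v] = Counter(ys).most_common(1)[0][0]
    return rule

# Toy weather data: the rule predicts "no" for sunny days, "yes" for rain.
rule = one_rule(["sunny", "sunny", "rain", "rain", "sunny"],
                ["no", "no", "yes", "yes", "yes"])
print(rule)
```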