Statistical Comparisons of Classifiers over Multiple Data Sets
Citations
28,685 citations
15,935 citations
Cites background from "Statistical Comparisons of Classifi..."
...As noted above, there has recently been considerable interest in learning recognition from “weak” supervision (Duygulu et al 2002; Fergus et al 2007)....
[...]
3,945 citations
Cites background from "Statistical Comparisons of Classifi..."
...Demsar (2006) surveys how classifiers are compared over multiple data sets....
[...]
3,832 citations
Cites methods from "Statistical Comparisons of Classifi..."
...For the Wilcoxon’s test, a maximum of 30 domains is suggested [4]....
[...]
3,170 citations
Cites background or methods from "Statistical Comparisons of Classifi..."
...[87] found this non-parametric approach to be more robust....
[...]
...(b) Critical difference diagram [87]: the x-axis shows mean rank, blue bars link detectors for which there is insufficient evidence to declare them statistically significantly different (due to the relatively low number of performance samples and fairly high variance)....
[...]
...A further in-depth study by García and Herrera [88] concludes that the Nemenyi post-hoc test which was used by [87] (and also in the PASCAL challenge [14]) is too conservative for n × n comparisons such as in a benchmark....
[...]
...[87] introduced a series of powerful statistical tests that operate on an m dataset by n algorithm performance matrix (e....
[...]
References
28,685 citations
20,459 citations
"Statistical Comparisons of Classifi..." refers methods in this paper
...The simplest such methods are due to Holm (1979) and Hochberg (1988)....
[...]
12,940 citations
"Statistical Comparisons of Classifi..." refers methods in this paper
...We have compiled a sample of forty real-world data sets from the UCI machine learning repository (Blake and Merz, 1998); we have used the data sets with discrete classes and avoided artificial data sets like Monk problems....
[...]
12,871 citations
"Statistical Comparisons of Classifi..." refers methods in this paper
...3.1.3 WILCOXON SIGNED-RANKS TEST The Wilcoxon signed-ranks test (Wilcoxon, 1945) is a non-parametric alternative to the paired t-test, which ranks the differences in performances of two classifiers for each data set, ignoring the signs, and compares the ranks for the positive and the negative…...
[...]
...Since we will finally recommend the Wilcoxon (1945) signed-ranks test, it will be presented with more details....
[...]
10,225 citations
"Statistical Comparisons of Classifi..." refers methods in this paper
...4.1.1 DATA SETS AND LEARNING ALGORITHMS We based our experiments on several common learning algorithms and their variations: C4.5, C4.5 with m and C4.5 with cf fitted for optimal accuracy, another tree learning algorithm implemented in Orange (with features similar to the original C4.5), naive Bayesian learner that models continuous probabilities using LOESS (Cleveland, 1979), naive Bayesian learner with continuous attributes discretized using Fayyad-Irani’s discretization (Fayyad and Irani, 1993) and kNN (k=10, neighbour weights adjusted with the Gaussian kernel)....
[...]