A Simple Generalisation of the Area Under the ROC Curve for Multiple Class Classification Problems

doi:10.1023/A:1010920819831

Journal Article•DOI•

David J. Hand¹, Robert Till¹

Imperial College London¹

18 Oct 2001-Machine Learning (Kluwer Academic Publishers)-Vol. 45, Iss: 2, pp 171-186

TL;DR: This work extends the definition of the area under the ROC curve to more than two classes by averaging pairwise comparisons, and compares the resulting measure with proportion correct and with an alternative definition of proportion correct based on pairwise comparison of classes.

Abstract: The area under the ROC curve, or the equivalent Gini index, is a widely used measure of performance of supervised classification rules. It has the attractive property that it side-steps the need to specify the costs of the different kinds of misclassification. However, the simple form is only applicable to the case of two classes. We extend the definition to the case of more than two classes by averaging pairwise comparisons. This measure reduces to the standard form in the two class case. We compare its properties with the standard measure of proportion correct and an alternative definition of proportion correct based on pairwise comparison of classes for a simple artificial case and illustrate its application on eight data sets. On the data sets we examined, the measures produced similar, but not identical results, reflecting the different aspects of performance that they were measuring. Like the area under the ROC curve, the measure we propose is useful in those many situations where it is impossible to give costs for the different kinds of misclassification.
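
The pairwise averaging described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names `pairwise_auc` and `hand_till_m`, the input layout (a list of labels plus a list of per-class probability dictionaries), and the convention of counting tied scores as one half are all assumptions made for the example. For each pair of classes (i, j) the sketch estimates how often a member of class i receives a higher class-i score than a member of class j, repeats this with the roles swapped, averages the two values, and then averages over all pairs.

```python
from itertools import combinations


def pairwise_auc(pos_scores, neg_scores):
    """Fraction of (pos, neg) pairs in which pos scores higher.

    Tied pairs count as one half (an assumed tie convention).
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def hand_till_m(labels, probs):
    """Average of pairwise AUC values over all unordered class pairs.

    labels: list of true class labels.
    probs:  list of dicts mapping each class to its estimated probability.
    """
    classes = sorted(set(labels))
    total = 0.0
    for i, j in combinations(classes, 2):
        # Class-i scores for members of class i and of class j.
        pi_i = [p[i] for y, p in zip(labels, probs) if y == i]
        pi_j = [p[i] for y, p in zip(labels, probs) if y == j]
        # Class-j scores for members of class j and of class i.
        pj_j = [p[j] for y, p in zip(labels, probs) if y == j]
        pj_i = [p[j] for y, p in zip(labels, probs) if y == i]
        a_i_given_j = pairwise_auc(pi_i, pi_j)
        a_j_given_i = pairwise_auc(pj_j, pj_i)
        total += 0.5 * (a_i_given_j + a_j_given_i)
    c = len(classes)
    return 2.0 * total / (c * (c - 1))


if __name__ == "__main__":
    # Two-class toy data: the measure reduces to the ordinary AUC.
    labels = [0, 0, 1, 1]
    probs = [
        {0: 0.9, 1: 0.1},
        {0: 0.6, 1: 0.4},
        {0: 0.65, 1: 0.35},
        {0: 0.2, 1: 0.8},
    ]
    print(hand_till_m(labels, probs))  # 0.75
```

With two classes the result equals the ordinary AUC, which matches the abstract's statement that the measure reduces to the standard form in that case.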

Citations

Journal Article•DOI•

An introduction to ROC analysis

Tom Fawcett

01 Jun 2006-Pattern Recognition Letters

TL;DR: The purpose of this article is to serve as an introduction to ROC graphs and as a guide for using them in research.

17,017 citations

Cites background from "A Simple Generalisation of the Area..."

...Hand and Till (2001) take a different approach in their derivation of a multi-class generalization of the AUC....
[...]
...Hand and Till (2001) point out that Gini + 1 = 2 · AUC....
[...]

Journal Article•DOI•

Learning from Imbalanced Data

Haibo He¹, E.A. Garcia¹

Stevens Institute of Technology¹

01 Sep 2009-IEEE Transactions on Knowledge and Data Engineering

TL;DR: A critical review of the nature of the problem, the state-of-the-art technologies, and the current assessment metrics used to evaluate learning performance under the imbalanced learning scenario is provided.

Abstract: With the continuous expansion of data availability in many large-scale, complex, and networked systems, such as surveillance, security, Internet, and finance, it becomes critical to advance the fundamental understanding of knowledge discovery and analysis from raw data to support decision-making processes. Although existing knowledge discovery and data engineering techniques have shown great success in many real-world applications, the problem of learning from imbalanced data (the imbalanced learning problem) is a relatively new challenge that has attracted growing attention from both academia and industry. The imbalanced learning problem is concerned with the performance of learning algorithms in the presence of underrepresented data and severe class distribution skews. Due to the inherent complex characteristics of imbalanced data sets, learning from such data requires new understandings, principles, algorithms, and tools to transform vast amounts of raw data efficiently into information and knowledge representation. In this paper, we provide a comprehensive review of the development of research in learning from imbalanced data. Our focus is to provide a critical review of the nature of the problem, the state-of-the-art technologies, and the current assessment metrics used to evaluate learning performance under the imbalanced learning scenario. Furthermore, in order to stimulate future research in this field, we also highlight the major opportunities and challenges, as well as potential important research directions for learning from imbalanced data.

6,320 citations

Cites background or methods from "A Simple Generalisation of the Area..."

...Interested readers can refer to [131] for a more detailed overview of this technique....
[...]
...To eliminate this constraint, Hand and Till [131] proposed the M measure, a generalization approach that aggregates all pairs of classes based on the inherent characteristics of the AUC....
[...]
...Similarly, under the multiclass imbalanced learning scenario, the AUC values for two-class problems become multiple pairwise discriminability values [131]....
[...]

Book•

Applied Predictive Modeling

Max Kuhn, Kjell Johnson

17 May 2013

TL;DR: This book covers general strategies for predictive modeling, regression models, classification models, and other considerations.

Abstract: General Strategies.- Regression Models.- Classification Models.- Other Considerations.- Appendix.- References.- Indices.

3,672 citations

Cites methods from "A Simple Generalisation of the Area..."

...When there are multiple classes, the extensions of ROC curves described by Hanley and McNeil (1982), DeLong et al....
[...]

Journal Article•DOI•

A survey of collaborative filtering techniques

Xiaoyuan Su¹, Taghi M. Khoshgoftaar¹

Florida Atlantic University¹

01 Jan 2009-Advances in Artificial Intelligence

TL;DR: From basic techniques to the state-of-the-art, this paper attempts to present a comprehensive survey of CF techniques, which can serve as a roadmap for research and practice in this area.

Abstract: As one of the most successful approaches to building recommender systems, collaborative filtering (CF) uses the known preferences of a group of users to make recommendations or predictions of the unknown preferences for other users. In this paper, we first introduce CF tasks and their main challenges, such as data sparsity, scalability, synonymy, gray sheep, shilling attacks, privacy protection, etc., and their possible solutions. We then present three main categories of CF techniques: memory-based, model-based, and hybrid CF algorithms (that combine CF with other recommendation techniques), with examples for representative algorithms of each category, and analysis of their predictive performance and their ability to address the challenges. From basic techniques to the state-of-the-art, we attempt to present a comprehensive survey of CF techniques, which can serve as a roadmap for research and practice in this area.

3,406 citations

Journal Article•DOI•

What's In A Name? Malay Seals As Onomastic Sources

Annabel Teh Gallop¹

British Library¹

01 Jun 2018-Machine Learning

TL;DR: This article serves both as a tutorial introduction to ROC graphs and as a practical guide for using them in research.

Abstract: Receiver Operating Characteristics (ROC) graphs are a useful technique for organizing classifiers and visualizing their performance. ROC graphs are commonly used in medical decision making, and in recent years have been increasingly adopted in the machine learning and data mining research communities. Although ROC graphs are apparently simple, there are some common misconceptions and pitfalls when using them in practice. This article serves both as a tutorial introduction to ROC graphs and as a practical guide for using them in research.

2,046 citations

Cites background from "A Simple Generalisation of the Area..."

...Hand and Till (2001) take a different approach in their derivation of a multi-class generalization of the AUC....
[...]
...Hand and Till (2001) point out that Gini + 1 = 2 × AUC....
[...]

References

Book•

An introduction to the bootstrap

Bradley Efron¹, Robert Tibshirani

South Dakota School of Mines and Technology¹

01 Jan 1993

TL;DR: This article presents bootstrap methods for estimation, using simple arguments, with Minitab macros for implementing these methods, as well as some examples of how these methods could be used for estimation purposes.

Abstract: This article presents bootstrap methods for estimation, using simple arguments. Minitab macros for implementing these methods are given.

37,183 citations

Journal Article•DOI•

Classification and Regression Trees.

John Van Ryzin, Leo Breiman, Jerome H. Friedman, Richard A. Olshen, Charles J. Stone

01 Mar 1986-Journal of the American Statistical Association

21,694 citations

Journal Article•DOI•

The meaning and use of the area under a receiver operating characteristic (ROC) curve.

James A. Hanley, Barbara J. McNeil

01 Apr 1982-Radiology

TL;DR: A representation and interpretation of the area under a receiver operating characteristic (ROC) curve obtained by the "rating" method, or by mathematical predictions based on patient characteristics, is presented, and it is shown that in such a setting the area represents the probability that a randomly chosen diseased subject is (correctly) rated or ranked with greater suspicion than a randomly chosen non-diseased subject.

Abstract: A representation and interpretation of the area under a receiver operating characteristic (ROC) curve obtained by the "rating" method, or by mathematical predictions based on patient characteristics, is presented. It is shown that in such a setting the area represents the probability that a randomly chosen diseased subject is (correctly) rated or ranked with greater suspicion than a randomly chosen non-diseased subject. Moreover, this probability of a correct ranking is the same quantity that is estimated by the already well-studied nonparametric Wilcoxon statistic. These two relationships are exploited to (a) provide rapid closed-form expressions for the approximate magnitude of the sampling variability, i.e., standard error that one uses to accompany the area under a smoothed ROC curve, (b) guide in determining the size of the sample required to provide a sufficiently reliable estimate of this area, and (c) determine how large sample sizes should be to ensure that one can statistically detect difference...

19,398 citations

"A Simple Generalisation of the Area..." refers background or methods in this paper

...Amongst the most popular are misclassification (or error) rate, and the criterion with which this paper is concerned, the area under the Receiver Operating Characteristic (ROC) curve (e.g., see Hanley & McNeil, 1982; Zweig & Campbell, 1993; Bradley, 1997)....
[...]
...This relationship is well-known (e.g., see Hanley & McNeil, 1982 ), but both it and the implications for estimation seem not always to be appreciated, so we describe it here for completeness....
[...]
...Referring to Hanley and McNeil (1982) we obtain the standard error of $\hat{A}$ to be $\operatorname{se}(\hat{A}) = \sqrt{\dfrac{\hat{\theta}(1-\hat{\theta}) + (n_0-1)(Q_0-\hat{\theta}^2) + (n_1-1)(Q_1-\hat{\theta}^2)}{n_0 n_1}}$ (6), with $\hat{\theta} = \dfrac{S_0}{n_0 n_1}$ and $Q_0 = \tfrac{1}{6}(2n_0+2n_1+1)(n_0+n_1+1)(n_0+n_1) - Q_1$, where $Q_1 = \sum_{j=1}^{n_0}(r_j-1)^2$....
[...]
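
The Hanley and McNeil abstract above identifies the area under the ROC curve with the probability of a correct ranking and with the Wilcoxon statistic. The sketch below illustrates that equivalence on toy scores by computing the area two ways: by counting correctly ordered pairs, and from the rank sum of the diseased group. The function names (`midranks`, `auc_rank_sum`, `auc_pair_count`), the use of midranks for ties, and the half-credit given to tied pairs are assumptions made for this example, not details taken from either paper.

```python
def midranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2.0 + 1.0
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def auc_rank_sum(diseased, healthy):
    """AUC from the rank sum of the diseased group (Mann-Whitney U form)."""
    scores = list(diseased) + list(healthy)
    ranks = midranks(scores)
    n1, n0 = len(diseased), len(healthy)
    rank_sum = sum(ranks[:n1])
    u = rank_sum - n1 * (n1 + 1) / 2.0
    return u / (n1 * n0)


def auc_pair_count(diseased, healthy):
    """AUC as the fraction of (diseased, healthy) pairs ranked correctly."""
    wins = 0.0
    for d in diseased:
        for h in healthy:
            if d > h:
                wins += 1.0
            elif d == h:
                wins += 0.5
    return wins / (len(diseased) * len(healthy))


if __name__ == "__main__":
    diseased = [0.5, 0.9]
    healthy = [0.5, 0.2]
    print(auc_rank_sum(diseased, healthy))    # 0.875
    print(auc_pair_count(diseased, healthy))  # 0.875
```

The rank-based form needs only one sort, so it scales better than the quadratic pair count when the groups are large.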

Book•

Classification and regression trees

Leo Breiman

01 Jan 1983

TL;DR: The methodology used to construct tree-structured rules is the focus of this monograph, which covers the use of trees as a data analysis method and, in a more mathematical framework, proves some of their fundamental properties.

Abstract: The methodology used to construct tree structured rules is the focus of this monograph. Unlike many other statistical procedures, which moved from pencil and paper to calculators, this text's use of trees was unthinkable before computers. Both the practical and theoretical sides have been developed in the authors' study of tree methods. Classification and Regression Trees reflects these two sides, covering the use of trees as a data analysis method, and in a more mathematical framework, proving some of their fundamental properties.

14,825 citations

UCI Repository of machine learning databases

Catherine Blake

01 Jan 1998

12,940 citations