C4.5: Programs for Machine Learning (書評)

Home
/
Papers
/
C4.5: Programs for Machine Learning (書評)

C4.5: Programs for Machine Learning (書評)

重郎金田

01 May 1995-Vol. 10, Iss: 3, pp 475-476

About: The article was published on 1995-05-01 and is currently open access. It has received 1164 citations till now.

...read moreread less

Citations

PDF

Open Access

More filters

Book Chapter•DOI•

Text Categorization with Suport Vector Machines: Learning with Many Relevant Features

[...]

Thorsten Joachims¹•Institutions (1)

Technical University of Dortmund¹

21 Apr 1998

TL;DR: This paper explores the use of Support Vector Machines for learning text classifiers from examples and analyzes the particular properties of learning with text data and identifies why SVMs are appropriate for this task.

...read moreread less

Abstract: This paper explores the use of Support Vector Machines (SVMs) for learning text classifiers from examples. It analyzes the particular properties of learning with text data and identifies why SVMs are appropriate for this task. Empirical results support the theoretical findings. SVMs achieve substantial improvements over the currently best performing methods and behave robustly over a variety of different learning tasks. Furthermore they are fully automatic, eliminating the need for manual parameter tuning.

...read moreread less

8,658 citations

Journal Article•DOI•

Extremely randomized trees

[...]

Pierre Geurts¹, Damien Ernst¹, Louis Wehenkel¹•Institutions (1)

University of Liège¹

01 Apr 2006-Machine Learning

TL;DR: A new tree-based ensemble method for supervised classification and regression problems that consists of randomizing strongly both attribute and cut-point choice while splitting a tree node and builds totally randomized trees whose structures are independent of the output values of the learning sample.

...read moreread less

Abstract: This paper proposes a new tree-based ensemble method for supervised classification and regression problems. It essentially consists of randomizing strongly both attribute and cut-point choice while splitting a tree node. In the extreme case, it builds totally randomized trees whose structures are independent of the output values of the learning sample. The strength of the randomization can be tuned to problem specifics by the appropriate choice of a parameter. We evaluate the robustness of the default choice of this parameter, and we also provide insight on how to adjust it in particular situations. Besides accuracy, the main strength of the resulting algorithm is computational efficiency. A bias/variance analysis of the Extra-Trees algorithm is also provided as well as a geometrical and a kernel characterization of the models induced.

...read moreread less

5,246 citations

Cites methods from "C4.5: Programs for Machine Learning..."

...…s, this measure is given by: ScoreC (s, S) = 2I s c (S) Hs(S) + Hc(S) , where Hc(S) is the (log) entropy of the classification in S, Hs(S) is the split entropy (also called split information by Quinlan (1986)), and I sc (S) is the mutual information of the split outcome and the classification....
[...]

Journal Article•DOI•

Bayesian Network Classifiers

[...]

Nir Friedman¹, Dan Geiger², Moises Goldszmidt³•Institutions (3)

University of California, Berkeley¹, Technion – Israel Institute of Technology², SRI International³

01 Nov 1997-Machine Learning

TL;DR: Tree Augmented Naive Bayes (TAN) is single out, which outperforms naive Bayes, yet at the same time maintains the computational simplicity and robustness that characterize naive Baye.

...read moreread less

Abstract: Recent work in supervised learning has shown that a surprisingly simple Bayesian classifier with strong assumptions of independence among features, called naive Bayes, is competitive with state-of-the-art classifiers such as C4.5. This fact raises the question of whether a classifier with less restrictive assumptions can perform even better. In this paper we evaluate approaches for inducing classifiers from data, based on the theory of learning Bayesian networks. These networks are factored representations of probability distributions that generalize the naive Bayesian classifier and explicitly represent statements about independence. Among these approaches we single out a method we call Tree Augmented Naive Bayes (TAN), which outperforms naive Bayes, yet at the same time maintains the computational simplicity (no search involved) and robustness that characterize naive Bayes. We experimentally tested these approaches, using problems from the University of California at Irvine repository, and compared them to C4.5, naive Bayes, and wrapper methods for feature selection.

...read moreread less

4,775 citations

Cites background or methods from "C4.5: Programs for Machine Learning..."

...5: the decision-tree induction method developed by Quinlan (1993)...
[...]
...Table 2 displays the accuracies of the main classification approaches we have discussed throughout the paper using the abbreviations: NB: the naive Bayesian classifier BN: unrestricted Bayesian networks learned with the MDL score TANs: TAN networks learned according to Theorem 2, with smoothed parameters CLs: CL multinet classifier—Bayesian multinets learned according to Theorem 1—with smoothed parameters C4.5: the decision-tree induction method developed by Quinlan (1993) SNB: theselective naive Bayesian classifier, a wrapper-based feature selection applied to naive Bayes, using the implementation of John and Kohavi (1997) In the previous sections we discussed these results in some detail....
[...]
...5 (Quinlan, 1993), a state-of-the-art decision tree learner, we may infer that TAN should perform rather well in comparison to C4....
[...]
...To confirm this prediction, we performed experiments comparing TAN to C4.5, and also to theselective naive Bayesianclassifier (Langley & Sage, 1994; John & Kohavi, 1997)....
[...]
...These results also show that both TAN and the CL multinet classifier are roughly equivalent in terms of accuracy, dominate the naive Bayesian classifier, and compare favorably with both C4.5 and the selective naive Bayesian classifier....
[...]

A Short Introduction to Boosting

[...]

Yoav Freund¹, Robert E. Schapire¹•Institutions (1)

AT&T¹

01 Jan 1999

TL;DR: This short overview paper introduces the boosting algorithm AdaBoost, and explains the underlying theory of boosting, including an explanation of why boosting often does not suffer from overfitting as well as boosting’s relationship to support-vector machines.

...read moreread less

Abstract: Boosting is a general method for improving the accuracy of any given learning algorithm. This short overview paper introduces the boosting algorithm AdaBoost, and explains the underlying theory of boosting, including an explanation of why boosting often does not suffer from overfitting as well as boosting’s relationship to support-vector machines. Some examples of recent applications of boosting are also described.

...read moreread less

3,212 citations

Journal Article•DOI•

[...]

David W. Opitz¹, Richard Maclin²•Institutions (2)

University of Montana¹, University of Minnesota²

01 Jul 1999-Journal of Artificial Intelligence Research

TL;DR: This work suggests that most of the gain in an ensemble's performance comes in the first few classifiers combined; however, relatively large gains can be seen up to 25 classifiers when Boosting decision trees.

...read moreread less

Abstract: An ensemble consists of a set of individually trained classifiers (such as neural networks or decision trees) whose predictions are combined when classifying novel instances. Previous research has shown that an ensemble is often more accurate than any of the single classifiers in the ensemble. Bagging (Breiman, 1996c) and Boosting (Freund & Schapire, 1996; Schapire, 1990) are two relatively new but popular methods for producing ensembles. In this paper we evaluate these methods on 23 data sets using both neural networks and decision trees as our classification algorithm. Our results clearly indicate a number of conclusions. First, while Bagging is almost always more accurate than a single classifier, it is sometimes much less accurate than Boosting. On the other hand, Boosting can create ensembles that are less accurate than a single classifier - especially when using neural networks. Analysis indicates that the performance of the Boosting methods is dependent on the characteristics of the data set being examined. In fact, further results show that Boosting ensembles may overfit noisy data sets, thus decreasing its performance. Finally, consistent with previous studies, our work suggests that most of the gain in an ensemble's performance comes in the first few classifiers combined; however, relatively large gains can be seen up to 25 classifiers when Boosting decision trees.

...read moreread less

2,672 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse

References

PDF

Open Access

More filters

Book Chapter•DOI•

Text Categorization with Suport Vector Machines: Learning with Many Relevant Features

[...]

Thorsten Joachims¹•Institutions (1)

Technical University of Dortmund¹

21 Apr 1998

...read moreread less

8,658 citations

Journal Article•DOI•

Extremely randomized trees

[...]

Pierre Geurts¹, Damien Ernst¹, Louis Wehenkel¹•Institutions (1)

University of Liège¹

01 Apr 2006-Machine Learning

...read moreread less

5,246 citations

Journal Article•DOI•

Bayesian Network Classifiers

[...]

Nir Friedman¹, Dan Geiger², Moises Goldszmidt³•Institutions (3)

University of California, Berkeley¹, Technion – Israel Institute of Technology², SRI International³

01 Nov 1997-Machine Learning

TL;DR: Tree Augmented Naive Bayes (TAN) is single out, which outperforms naive Bayes, yet at the same time maintains the computational simplicity and robustness that characterize naive Baye.

...read moreread less

4,775 citations

A Short Introduction to Boosting

[...]

Yoav Freund¹, Robert E. Schapire¹•Institutions (1)

AT&T¹

01 Jan 1999

...read moreread less

3,212 citations