Decision tree

About: Decision tree is a research topic. Over its lifetime, 26,193 publications have been published within this topic, receiving 588,185 citations.

[Chart: papers published on a yearly basis, 1970 to 2024]

Papers

Journal Article

An assessment of the effectiveness of a random forest classifier for land-cover classification

Victor Rodriguez-Galiano¹, Bardan Ghimire², John Rogan², Mario Chica-Olmo¹, J.P. Rigol-Sánchez³

University of Granada¹, Clark University², University of Jaén³

01 Jan 2012 - ISPRS Journal of Photogrammetry and Remote Sensing

TL;DR: In this paper, the performance of the random forest classifier for land cover classification of a complex area is explored based on several criteria: mapping accuracy, sensitivity to data set size and noise.

Abstract: Land cover monitoring using remotely sensed data requires robust classification methods which allow for the accurate mapping of complex land cover and land use categories. Random forest (RF) is a powerful machine learning classifier that is relatively unknown in land remote sensing and has not been evaluated thoroughly by the remote sensing community compared to more conventional pattern recognition techniques. Key advantages of RF include its non-parametric nature, high classification accuracy, and capability to determine variable importance. However, the split rules for classification are unknown, so RF can be considered a black-box classifier. RF also provides an algorithm for estimating missing values and the flexibility to perform several types of data analysis, including regression, classification, survival analysis, and unsupervised learning. In this paper, the performance of the RF classifier for land cover classification of a complex area is explored. Evaluation was based on several criteria: mapping accuracy, sensitivity to data set size and noise. Landsat-5 Thematic Mapper data captured in European spring and summer were used with auxiliary variables derived from a digital terrain model to classify 14 different land categories in the south of Spain. Results show that the RF algorithm yields accurate land cover classifications, with 92% overall accuracy and a Kappa index of 0.92. RF is robust to training data reduction and noise because significant differences in kappa values were only observed for data reduction and noise addition values greater than 50 and 20%, respectively. Additionally, variables that RF identified as most important for classifying land cover coincided with expectations. A McNemar test indicates an overall better performance of the random forest model over a single decision tree at the 0.00001 significance level.
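
The abstract reports a Kappa index and a McNemar test without showing how either is computed. As a minimal sketch, assuming plain Python lists of labels and the common continuity-corrected form of the McNemar statistic (an assumption; the abstract does not say which variant the authors used), the two quantities could be computed as follows. The function names are illustrative only.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement from the marginal class frequencies.
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1:
        # Both label lists contain a single identical class; kappa is
        # undefined here, and this sketch reports full agreement.
        return 1.0
    return (observed - expected) / (1 - expected)


def mcnemar_statistic(y_true, pred_a, pred_b):
    """McNemar chi-square statistic (continuity-corrected, an assumption)."""
    # Count samples that exactly one of the two classifiers gets right.
    only_a = sum(a == t and b != t for t, a, b in zip(y_true, pred_a, pred_b))
    only_b = sum(a != t and b == t for t, a, b in zip(y_true, pred_a, pred_b))
    discordant = only_a + only_b
    if discordant == 0:
        return 0.0
    return (abs(only_a - only_b) - 1) ** 2 / discordant
```

The McNemar statistic is compared against a chi-square distribution with one degree of freedom; only the samples on which the two classifiers disagree contribute to it.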

1,901 citations

Journal Article

Improved use of continuous attributes in C4.5

J. R. Quinlan¹

University of Sydney¹

01 Jan 1996 - Journal of Artificial Intelligence Research

TL;DR: A reported weakness of C4.5 in domains with continuous attributes is addressed by modifying the formation and evaluation of tests on continuous attributes with an MDL-inspired penalty, leading to smaller decision trees with higher predictive accuracies.

Abstract: A reported weakness of C4.5 in domains with continuous attributes is addressed by modifying the formation and evaluation of tests on continuous attributes. An MDL-inspired penalty is applied to such tests, eliminating some of them from consideration and altering the relative desirability of all tests. Empirical trials show that the modifications lead to smaller decision trees with higher predictive accuracies. Results also confirm that a new version of C4.5 incorporating these changes is superior to recent approaches that use global discretization and that construct small trees with multi-interval splits.
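
The abstract does not give the form of the MDL-inspired penalty. As a hedged sketch, the code below assumes one commonly described form: the best information gain over candidate thresholds of a continuous attribute is reduced by log2(N - 1) / |D|, where N is the number of distinct attribute values and |D| the number of training cases. Both the penalty formula and the function names are assumptions for illustration, not a reproduction of C4.5.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((k / n) * math.log2(k / n) for k in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, penalized_gain) for one continuous attribute.

    Returns None when the attribute has fewer than two distinct values.
    """
    n = len(values)
    distinct = sorted(set(values))
    if len(distinct) < 2:
        return None
    base = entropy(labels)
    # Assumed MDL-style penalty: cost of choosing among N - 1 thresholds.
    penalty = math.log2(len(distinct) - 1) / n
    best = None
    for t in distinct[:-1]:  # candidate tests have the form value <= t
        left = [y for v, y in zip(values, labels) if v <= t]
        right = [y for v, y in zip(values, labels) if v > t]
        gain = base - (len(left) / n) * entropy(left) - (len(right) / n) * entropy(right)
        if best is None or gain > best[1]:
            best = (t, gain)
    return best[0], best[1] - penalty
```

Under this assumption, an attribute with many distinct values pays a larger penalty, so a test on it must separate the classes more cleanly before it is preferred over a test on a discrete attribute.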

1,832 citations

Proceedings Article

Mining time-changing data streams

Geoff Hulten¹, Laurie Spencer, Pedro Domingos¹

University of Washington¹

26 Aug 2001

TL;DR: Proposes CVFDT, an efficient algorithm for mining decision trees from continuously changing data streams, based on the ultra-fast VFDT decision tree learner. CVFDT stays current while making the most of old data by growing an alternative subtree whenever an old one becomes questionable, and replacing the old with the new when the new becomes more accurate.

Abstract: Most statistical and machine-learning algorithms assume that the data is a random sample drawn from a stationary distribution. Unfortunately, most of the large databases available for mining today violate this assumption. They were gathered over months or years, and the underlying processes generating them changed during this time, sometimes radically. Although a number of algorithms have been proposed for learning time-changing concepts, they generally do not scale well to very large databases. In this paper we propose an efficient algorithm for mining decision trees from continuously-changing data streams, based on the ultra-fast VFDT decision tree learner. This algorithm, called CVFDT, stays current while making the most of old data by growing an alternative subtree whenever an old one becomes questionable, and replacing the old with the new when the new becomes more accurate. CVFDT learns a model which is similar in accuracy to the one that would be learned by reapplying VFDT to a moving window of examples every time a new example arrives, but with O(1) complexity per example, as opposed to O(w), where w is the size of the window. Experiments on a set of large time-changing data streams demonstrate the utility of this approach.
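
The O(1) versus O(w) contrast in the abstract can be illustrated with a much simpler structure than CVFDT itself. The sketch below is not CVFDT: it only shows how class counts over a moving window of w examples can be kept current by adding the newest example and forgetting the oldest, instead of recounting the whole window on every arrival. The class name and its interface are hypothetical.

```python
from collections import Counter, deque


class WindowedClassCounts:
    """Class counts over the most recent `window_size` labels."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.window = deque()
        self.counts = Counter()

    def add(self, label):
        # Constant work per example: one increment for the new label ...
        self.window.append(label)
        self.counts[label] += 1
        # ... and one decrement for the label that leaves the window.
        if len(self.window) > self.window_size:
            old = self.window.popleft()
            self.counts[old] -= 1
            if self.counts[old] == 0:
                del self.counts[old]
```

A learner that rebuilt its statistics from the window on every arrival would touch all w stored examples each time; the incremental update above touches at most two.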

1,790 citations

Book

Decision Analysis: Introductory Lectures on Choices Under Uncertainty

Howard Raiffa

01 Jan 1968

1,721 citations

Proceedings Article

Scaling up the accuracy of Naive-Bayes classifiers: a decision-tree hybrid

Ron Kohavi

02 Aug 1996

TL;DR: A new algorithm, NBTree, is proposed, which induces a hybrid of decision-tree classifiers and Naive-Bayes classifiers: the decision-tree nodes contain univariate splits as in regular decision trees, but the leaves contain Naive-Bayesian classifiers.

Abstract: Naive-Bayes induction algorithms were previously shown to be surprisingly accurate on many classification tasks even when the conditional independence assumption on which they are based is violated. However, most studies were done on small databases. We show that in some larger databases, the accuracy of Naive-Bayes does not scale up as well as decision trees. We then propose a new algorithm, NBTree, which induces a hybrid of decision-tree classifiers and Naive-Bayes classifiers: the decision-tree nodes contain univariate splits as regular decision-trees, but the leaves contain Naive-Bayesian classifiers. The approach retains the interpretability of Naive-Bayes and decision trees, while resulting in classifiers that frequently outperform both constituents, especially in the larger databases tested.
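
The hybrid structure described in the abstract (a univariate split at the node, Naive-Bayes classifiers at the leaves) can be sketched with a single split. This is an illustration under strong assumptions, not NBTree: features are categorical, the split feature is passed in by the caller rather than selected by the algorithm, Laplace smoothing is used, and all class and method names are hypothetical.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesLeaf:
    """Categorical Naive-Bayes classifier with Laplace smoothing."""

    def __init__(self, rows, labels):
        self.n = len(labels)
        self.class_counts = Counter(labels)
        self.value_counts = defaultdict(Counter)  # (feature, class) -> value counts
        self.values = defaultdict(set)            # feature -> values seen at this leaf
        for row, y in zip(rows, labels):
            for i, v in enumerate(row):
                self.value_counts[(i, y)][v] += 1
                self.values[i].add(v)

    def predict(self, row):
        def log_score(c):
            score = math.log(self.class_counts[c] / self.n)
            for i, v in enumerate(row):
                seen = self.value_counts[(i, c)][v]
                score += math.log((seen + 1) / (self.class_counts[c] + len(self.values[i])))
            return score
        return max(self.class_counts, key=log_score)


class NBTreeStump:
    """One univariate split whose leaves are Naive-Bayes classifiers."""

    def __init__(self, rows, labels, split_feature):
        self.split_feature = split_feature
        groups = defaultdict(lambda: ([], []))
        for row, y in zip(rows, labels):
            group_rows, group_labels = groups[row[split_feature]]
            group_rows.append(row)
            group_labels.append(y)
        self.leaves = {v: NaiveBayesLeaf(r, l) for v, (r, l) in groups.items()}
        # Used when the split value was never seen during training.
        self.fallback = NaiveBayesLeaf(rows, labels)

    def predict(self, row):
        leaf = self.leaves.get(row[self.split_feature], self.fallback)
        return leaf.predict(row)
```

An XOR-style target shows why the split helps: a single Naive-Bayes model cannot represent the interaction between the two features, but after splitting on the first feature each leaf only has to model the second one.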

1,667 citations

Performance metrics

Papers: 34,221
Citations: 708,895

No. of papers in the topic in previous years
Year	Papers
2024	2
2023	2,596
2022	5,452
2021	1,895
2020	2,049
2019	1,935
