Topic

Random forest

About: Random forest is a research topic. Over the lifetime, 13345 publications have been published within this topic receiving 345395 citations. The topic is also known as: random forests & randomized trees.

...read moreread less

Papers published on a yearly basis

Papers

PDF

Open Access

More filters

Proceedings Article•DOI•

An empirical comparison of supervised learning algorithms

[...]

Rich Caruana¹, Alexandru Niculescu-Mizil¹•Institutions (1)

Cornell University¹

25 Jun 2006

TL;DR: A large-scale empirical comparison between ten supervised learning methods: SVMs, neural nets, logistic regression, naive bayes, memory-based learning, random forests, decision trees, bagged trees, boosted trees, and boosted stumps is presented.

...read moreread less

Abstract: A number of supervised learning methods have been introduced in the last decade. Unfortunately, the last comprehensive empirical evaluation of supervised learning was the Statlog Project in the early 90's. We present a large-scale empirical comparison between ten supervised learning methods: SVMs, neural nets, logistic regression, naive bayes, memory-based learning, random forests, decision trees, bagged trees, boosted trees, and boosted stumps. We also examine the effect that calibrating the models via Platt Scaling and Isotonic Regression has on their performance. An important aspect of our study is the use of a variety of performance criteria to evaluate the learning methods.

...read moreread less

2,450 citations

Journal Article•DOI•

Random forest classifier for remote sensing classification

[...]

Mahesh Pal¹•Institutions (1)

Yahoo!¹

01 Jan 2005-International Journal of Remote Sensing

TL;DR: It is suggested that the random forest classifier performs equally well to SVMs in terms of classification accuracy and training time and the number of user‐defined parameters required byrandom forest classifiers is less than the number required for SVMs and easier to define.

...read moreread less

Abstract: Growing an ensemble of decision trees and allowing them to vote for the most popular class produced a significant increase in classification accuracy for land cover classification. The objective of this study is to present results obtained with the random forest classifier and to compare its performance with the support vector machines (SVMs) in terms of classification accuracy, training time and user defined parameters. Landsat Enhanced Thematic Mapper Plus (ETM+) data of an area in the UK with seven different land covers were used. Results from this study suggest that the random forest classifier performs equally well to SVMs in terms of classification accuracy and training time. This study also concludes that the number of user‐defined parameters required by random forest classifiers is less than the number required for SVMs and easier to define.

...read moreread less

2,255 citations

Journal Article•DOI•

An Introduction to Recursive Partitioning: Rationale, Application and Characteristics of Classification and Regression Trees, Bagging and Random Forests

[...]

Carolin Strobl¹, James D. Malley¹, Gerhard Tutz²•Institutions (2)

Ludwig Maximilian University of Munich¹, Center for Information Technology²

01 Dec 2009-Psychological Methods

TL;DR: The aim of this work is to introduce the principles of the standard recursive partitioning methods as well as recent methodological improvements, to illustrate their usage for low and high-dimensional data exploration, but also to point out limitations of the methods and potential pitfalls in their practical application.

...read moreread less

Abstract: Recursive partitioning methods have become popular and widely used tools for nonparametric regression and classification in many scientific fields. Especially random forests, which can deal with large numbers of predictor variables even in the presence of complex interactions, have been applied successfully in genetics, clinical medicine, and bioinformatics within the past few years. High-dimensional problems are common not only in genetics, but also in some areas of psychological research, where only a few subjects can be measured because of time or cost constraints, yet a large amount of data is generated for each subject. Random forests have been shown to achieve a high prediction accuracy in such applications and to provide descriptive variable importance measures reflecting the impact of each variable in both main effects and interactions. The aim of this work is to introduce the principles of the standard recursive partitioning methods as well as recent methodological improvements, to illustrate their usage for low and high-dimensional data exploration, but also to point out limitations of the methods and potential pitfalls in their practical application. Application of the methods is illustrated with freely available implementations in the R system for statistical computing.

...read moreread less

2,001 citations

Journal Article•DOI•

An assessment of the effectiveness of a random forest classifier for land-cover classification

[...]

Victor Rodriguez-Galiano¹, Bardan Ghimire², John Rogan², Mario Chica-Olmo¹, J.P. Rigol-Sánchez³ - Show less +1 more•Institutions (3)

University of Granada¹, Clark University², University of Jaén³

01 Jan 2012-Isprs Journal of Photogrammetry and Remote Sensing

TL;DR: In this paper, the performance of the random forest classifier for land cover classification of a complex area is explored based on several criteria: mapping accuracy, sensitivity to data set size and noise.

...read moreread less

Abstract: Land cover monitoring using remotely sensed data requires robust classification methods which allow for the accurate mapping of complex land cover and land use categories. Random forest (RF) is a powerful machine learning classifier that is relatively unknown in land remote sensing and has not been evaluated thoroughly by the remote sensing community compared to more conventional pattern recognition techniques. Key advantages of RF include: their non-parametric nature; high classification accuracy; and capability to determine variable importance. However, the split rules for classification are unknown, therefore RF can be considered to be black box type classifier. RF provides an algorithm for estimating missing values; and flexibility to perform several types of data analysis, including regression, classification, survival analysis, and unsupervised learning. In this paper, the performance of the RF classifier for land cover classification of a complex area is explored. Evaluation was based on several criteria: mapping accuracy, sensitivity to data set size and noise. Landsat-5 Thematic Mapper data captured in European spring and summer were used with auxiliary variables derived from a digital terrain model to classify 14 different land categories in the south of Spain. Results show that the RF algorithm yields accurate land cover classifications, with 92% overall accuracy and a Kappa index of 0.92. RF is robust to training data reduction and noise because significant differences in kappa values were only observed for data reduction and noise addition values greater than 50 and 20%, respectively. Additionally, variables that RF identified as most important for classifying land cover coincided with expectations. A McNemar test indicates an overall better performance of the random forest model over a single decision tree at the 0.00001 significance level.

...read moreread less

1,901 citations

Journal Article•DOI•

Probability Estimates for Multi-class Classification by Pairwise Coupling

[...]

Tingfan Wu, Chih-Jen Lin, Ruby C. Weng

01 Dec 2004-Journal of Machine Learning Research

TL;DR: In this paper, the authors present two approaches for obtaining class probabilities, which can be reduced to linear systems and are easy to implement, and show conceptually and experimentally that the proposed approaches are more stable than the two existing popular methods: voting and the method by Hastie and Tibshirani (1998).

...read moreread less

Abstract: Pairwise coupling is a popular multi-class classification method that combines all comparisons for each pair of classes. This paper presents two approaches for obtaining class probabilities. Both methods can be reduced to linear systems and are easy to implement. We show conceptually and experimentally that the proposed approaches are more stable than the two existing popular methods: voting and the method by Hastie and Tibshirani (1998)

...read moreread less

1,888 citations

Collapse

Network Information

Performance

Metrics

29,141

Papers

532,363

Citations

No. of papers in the topic in previous years
Year	Papers
2024	1
2023	5,459
2022	10,287
2021	2,325
2020	2,251
2019	1,961

Random forest

Papers published on a yearly basis

Papers

Trending Questions (10)

Network Information

Related Topics (5)

Performance

Metrics