RRF-BD: Ranger Random Forest Algorithm for Big Data Classification

doi:10.1007/978-981-13-8676-3_2

Book Chapter

pp. 15-25

TLDR

A Ranger Random Forest (RRF) algorithm for high-dimensional data classification is presented and evaluated on three different datasets, with the aim of reducing runtime and memory utilization while retaining the efficiency of the traditional random forest.

Abstract:

Data are currently growing at an exponential rate, which makes suitable classification of statistical data a major challenge. The need to extract data and insights and to mine information from datasets efficiently and quickly has drawn attention to the search for the best classification strategy. This paper presents a Ranger Random Forest (RRF) algorithm for high-dimensional data classification. Random Forest (RF) is among the most popular ensemble techniques for classification because it provides measures such as variable importance, out-of-bag error, and proximities. To make classification feasible under these constraints, we use three different datasets and aim to reduce runtime and memory utilization while retaining the efficiency of the traditional random forest. We also describe the improvements over Random Forest in computational time and memory, achieved without affecting the efficiency of the traditional Random Forest. Experimental results show that the proposed RRF outperforms the other methods in terms of memory utilization and computation time.
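The out-of-bag error mentioned in the abstract can be illustrated with a small, self-contained sketch. This is not the RRF method from the chapter and does not use the ranger package; it is a toy bagged ensemble of one-split classifiers written with the Python standard library only. The function names (`fit_stump`, `predict_stump`, `oob_error`) and the synthetic data are assumptions made for illustration.

```python
import random

def fit_stump(rows, labels, feature):
    """Fit a one-split threshold classifier on a single feature (labels are 0 or 1)."""
    best = None
    for threshold in sorted(set(r[feature] for r in rows)):
        for left_label in (0, 1):
            # Count training errors if rows at or below the threshold get left_label.
            errors = sum(
                (left_label if r[feature] <= threshold else 1 - left_label) != y
                for r, y in zip(rows, labels)
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, left_label)
    _, threshold, left_label = best
    return feature, threshold, left_label

def predict_stump(stump, row):
    feature, threshold, left_label = stump
    return left_label if row[feature] <= threshold else 1 - left_label

def oob_error(rows, labels, n_trees=25, seed=0):
    """Estimate error using, for each row, only the trees that did not train on it."""
    rng = random.Random(seed)
    n = len(rows)
    votes = [[0, 0] for _ in range(n)]  # per-row vote counts for class 0 and class 1
    for _ in range(n_trees):
        sample = [rng.randrange(n) for _ in range(n)]  # bootstrap sample with replacement
        in_bag = set(sample)
        feature = rng.randrange(len(rows[0]))  # one randomly chosen feature per tree
        stump = fit_stump([rows[i] for i in sample],
                          [labels[i] for i in sample], feature)
        for i in range(n):
            if i not in in_bag:  # row i is out-of-bag for this tree
                votes[i][predict_stump(stump, rows[i])] += 1
    # Score only rows that were out-of-bag for at least one tree.
    scored = [(v, y) for v, y in zip(votes, labels) if sum(v) > 0]
    if not scored:
        return float("nan")
    wrong = sum((0 if v[0] >= v[1] else 1) != y for v, y in scored)
    return wrong / len(scored)

# Synthetic two-feature data (hypothetical): class 0 has small values, class 1 large ones.
rows = [(i, 2 * i) for i in range(10)] + [(i + 20, 2 * (i + 20)) for i in range(10)]
labels = [0] * 10 + [1] * 10
print("out-of-bag error estimate:", oob_error(rows, labels))
```

Because each bootstrap sample leaves out roughly a third of the rows, the held-out rows act as a built-in validation set, which is why a random forest can report an error estimate without a separate test split.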
