Book Chapter

RRF-BD: Ranger Random Forest Algorithm for Big Data Classification

TLDR
This chapter presents a Ranger Random Forest (RRF) algorithm for high-dimensional data classification, evaluated on three different datasets, that handles runtime and memory utilization effectively while retaining the same efficiency as the traditional random forest.
Abstract
Data are currently growing at an exponential rate, which makes suitable classification of statistical data a major challenge. The need to extract data, insights, and information from datasets efficiently and quickly has drawn attention to finding the best classification strategy. This paper presents a Ranger Random Forest (RRF) algorithm for high-dimensional data classification. Random Forest (RF) is regarded as one of the most popular ensemble classification techniques because it provides measures such as variable importance, out-of-bag error, and proximities. We use three different datasets to show that RRF handles runtime and memory utilization effectively while retaining the same efficiency as the traditional random forest. We also describe the improvements over Random Forest in computational time and memory, which are obtained without affecting the efficiency of the traditional Random Forest. Experimental results show that the proposed RRF outperforms the other methods in terms of memory utilization and computation time.


Citations
Book Chapter

A Hybrid and Improved Isolation Forest Algorithm for Anomaly Detection

TL;DR: This paper presents a hybrid anomaly detection algorithm that outperforms the existing Isolation Forest algorithm. It gives a basic introduction to the existing algorithms and then a comparative study of the existing algorithms and the hybrid algorithm.
Book Chapter

Ranger Random Forest-Based Efficient Ensemble Learning Approach for Detecting Malicious URLs

TL;DR: This paper presents an ensemble learning-based, faster, and memory-efficient random forest algorithm for detecting malicious URLs, which is shown to scale best with the number of instances and variables.