Proceedings ArticleDOI

Very Fast Decision Tree (VFDT) algorithm on Hadoop

TLDR
This paper replaces the serial execution of the VFDT algorithm with a series of Map and Reduce functions and reports an extensive analysis on various datasets, in which the proposed algorithm was more time-efficient than the other existing decision tree models.
Abstract
In the era of Big Data, where voluminous data is handled on a very large scale, traditional decision trees can be very time consuming and may even fail to work owing to the size of the dataset. Handling Big Data can also be costly because of its high demand for memory and other hardware resources. In this paper, we have chosen a decision tree algorithm named Very Fast Decision Tree (VFDT) after comparing it with other decision tree algorithms such as ID3 and C4.5. We have also proposed an algorithm for implementing VFDT on Hadoop, a distributed environment. This implementation can form a base for a large number of applications that handle Big Data. We have replaced the serial execution of the VFDT algorithm with a series of Map and Reduce functions. We have also conducted an extensive analysis on various datasets, which showed our proposed algorithm to be more efficient in terms of time than the other existing decision tree models.
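The abstract does not spell out the paper's Map and Reduce functions, so the following is only a minimal sketch of how a VFDT-style split decision could be phrased as map and reduce steps. It assumes categorical attributes, information gain as the split measure, and the standard Hoeffding bound used by VFDT; all function names (`map_counts`, `reduce_counts`, `should_split`) are hypothetical, and plain Python stands in for Hadoop.

```python
import math
from collections import Counter
from functools import reduce


def hoeffding_bound(value_range, delta, n):
    """Hoeffding bound: sqrt(R^2 * ln(1/delta) / (2n))."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def map_counts(chunk):
    """Map step (hypothetical): count (attribute, value, label) triples in one chunk."""
    counts = Counter()
    for features, label in chunk:
        for attr, value in enumerate(features):
            counts[(attr, value, label)] += 1
    return counts


def reduce_counts(left, right):
    """Reduce step (hypothetical): merge the partial counts of two chunks."""
    return left + right


def entropy(class_counts):
    """Shannon entropy (base 2) of a collection of class counts."""
    class_counts = list(class_counts)
    total = sum(class_counts)
    return -sum(c / total * math.log2(c / total) for c in class_counts if c > 0)


def information_gain(counts, attr):
    """Information gain of splitting on `attr`, computed from merged counts."""
    by_class = Counter()
    by_value = {}
    for (a, value, label), c in counts.items():
        if a != attr:
            continue
        by_class[label] += c
        by_value.setdefault(value, Counter())[label] += c
    n = sum(by_class.values())
    remainder = sum(
        sum(vc.values()) / n * entropy(vc.values()) for vc in by_value.values()
    )
    return entropy(by_class.values()) - remainder


def should_split(chunks, n_attrs, n_classes, delta=1e-7):
    """Split when the best gain beats the runner-up by more than the bound."""
    counts = reduce(reduce_counts, map(map_counts, chunks))
    # Every example contributes exactly one triple for attribute 0.
    n = sum(c for (a, _, _), c in counts.items() if a == 0)
    gains = sorted(
        (information_gain(counts, a) for a in range(n_attrs)), reverse=True
    )
    eps = hoeffding_bound(math.log2(n_classes), delta, n)
    return gains[0] - gains[1] > eps
```

Because the per-chunk counts are merged by simple addition, the map step can run on separate data blocks in any order; this is the property that makes count-based sufficient statistics a natural fit for MapReduce. Tie-breaking thresholds and numeric attributes, which a full VFDT implementation needs, are left out of this sketch.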


Citations
Journal ArticleDOI

A Secure AI-Driven Architecture for Automated Insurance Systems: Fraud Detection and Risk Measurement

TL;DR: A secure and automated insurance system framework that reduces human interaction, secures the insurance activities, alerts and informs about risky customers, detects fraudulent claims, and reduces monetary loss for the insurance sector is developed.
Proceedings ArticleDOI

Models for Hand Gesture Recognition using Deep Learning

TL;DR: A hand gesture recognition system which works in 4 steps to eliminate the ambiguity introduced in the results by inculcating variation in the background by developing a system which acts as a mediator between both.
Proceedings ArticleDOI

A Classification for Patients with Heart Disease Based on Hoeffding Tree

TL;DR: A dataset from the UCI Machine Learning Repository, with 199 samples and thirteen features, is used to predict the outcome of cardiovascular disease and is suitable for constructing a predictive system for people with heart disease using 10-fold cross validation.
Book ChapterDOI

Improving Accuracy of Classification Based on C4.5 Decision Tree Algorithm Using Big Data Analytics

TL;DR: The main objective of this research is to improve classification accuracy and reduce the time needed to build a classification model; the input space is reduced using the Bhattacharyya distance.
Proceedings ArticleDOI

Use of Ensemble Machine Learning to Detect Depression in Social Media Posts

TL;DR: In this article, a system to detect depression using ensemble learning and Natural Language Processing (NLP) techniques was proposed, and the best performing configuration gave an accuracy of 96.35%.
References
Journal ArticleDOI

A MapReduce Implementation of C4.5 Decision Tree Algorithm

TL;DR: This work proposes to implement a typical decision tree algorithm, C4.5, using MapReduce programming model, and transforms the traditional algorithm into a series of Map and Reduce procedures, showing both time efficiency and scalability.
Proceedings ArticleDOI

Efficient regression algorithms for classification of social media data

TL;DR: This paper implements a decision tree algorithm, compares it with other machine learning algorithms such as Naive Bayes and AdaBoost, and finds that decision tree algorithms give more accurate results than the other algorithms.

A Comparative study of Stream Data mining Algorithms

TL;DR: A comparative study is made of Hoeffding tree, VFDT (Very Fast Decision Tree) and CVFDT (Concept-adapting Very Fast Decision Tree), chosen from the various algorithms used for stream data classification.
Journal ArticleDOI

Efficient classification of big data using vfdt (very fast decision tree)

TL;DR: A comparative study of three popular decision tree algorithms, ID3 (Iterative Dichotomiser 3), C4.5 (an evolution of ID3) and VFDT (Very Fast Decision Tree), has been made and various conclusions have been drawn.