Book Chapter
An Empirical Comparison of Six Supervised Machine Learning Techniques on Spark Platform for Health Big Data
Gayathri Nagarajan, L. D. Dhinesh Babu
- pp 299-307
TLDR
This paper compares the performance of six machine learning techniques on the Spark platform for health big data and discusses the results of experiments on datasets with different characteristics.
Abstract:
Health care is one of the prominent industries that generate voluminous data, creating a need for machine learning techniques combined with big data solutions. The goal of this paper is to (i) compare the performance of six different machine learning techniques on the Spark platform, specifically for health big data, and (ii) discuss the results of experiments conducted on datasets with different characteristics, thereby drawing inferences and conclusions. Five benchmark health datasets are considered for experimentation. The metrics chosen for comparison are the accuracy and the computational time of the algorithms, and the experiments are conducted with different proportions of training data. The experimental results show that logistic regression and random forests perform well in terms of accuracy, and that naive Bayes outperforms the other techniques in terms of computational time for health big datasets.
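The evaluation protocol described in the abstract (accuracy and computational time, measured at several training proportions) can be illustrated with a minimal sketch. This is not the authors' Spark code: the nearest-centroid classifier, the synthetic two-class data, the helper names (`split`, `fit_nearest_centroid`, `evaluate`, `make_toy_rows`), and the training fractions are all assumptions chosen only to keep the example self-contained.

```python
import random
import time


def split(rows, train_fraction, seed=0):
    """Shuffle a copy of the rows and split into (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def fit_nearest_centroid(train):
    """Toy stand-in classifier (an assumption, not one of the paper's six):
    store the mean feature vector of each class and predict the closest one."""
    sums, counts = {}, {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    centroids = {
        label: [s / counts[label] for s in acc] for label, acc in sums.items()
    }

    def predict(features):
        return min(
            centroids,
            key=lambda label: sum(
                (a - b) ** 2 for a, b in zip(features, centroids[label])
            ),
        )

    return predict


def evaluate(fit, rows, train_fraction, seed=0):
    """Return (accuracy on the held-out rows, seconds spent fitting and predicting)."""
    train, test = split(rows, train_fraction, seed)
    start = time.perf_counter()
    predict = fit(train)
    predictions = [predict(features) for features, _ in test]
    elapsed = time.perf_counter() - start
    correct = sum(p == label for p, (_, label) in zip(predictions, test))
    return correct / len(test), elapsed


def make_toy_rows(n=200, seed=1):
    """Synthetic two-class data (hypothetical; not a health dataset)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 3.0 * label
        rows.append(([rng.gauss(center, 1.0), rng.gauss(center, 1.0)], label))
    return rows


if __name__ == "__main__":
    rows = make_toy_rows()
    # Illustrative training proportions; the paper's actual values are not given here.
    for fraction in (0.5, 0.6, 0.7, 0.8):
        accuracy, seconds = evaluate(fit_nearest_centroid, rows, fraction)
        print(f"train={fraction:.0%} accuracy={accuracy:.3f} time={seconds:.4f}s")
```

Passing a different `fit` function to `evaluate` would let several classifiers be compared on the same splits, which mirrors the kind of side-by-side comparison the abstract reports.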
Citations
Journal Article
Health big data analytics : current perspectives, challenges and potential solutions
TL;DR: The characteristics of Health Big Data as well as the challenges and solutions for health Big Data Analytics (BDA) are discussed and a pipelined framework for use as a guideline/reference in health BDA is designed and evaluated.
Proceedings Article
TrORF: Building Trading Areas Around Organizations Based on Machine Learning Techniques
TL;DR: This paper proposes a machine-learning-based methodology for determining business types and trading areas, named TrORF, to help investors select business types and build trading areas around organizations. The results indicate that this approach is well suited to building suitable business types around each university's trading area.
Journal Article
An Intelligent Technique to Predict the Autism Spectrum Disorder Using Big Data Platform
TL;DR: In this paper, the Intelligent Classification Based on Association rules (ICBA) algorithm is proposed for finding correlations between features to decide whether an individual has autism at an early stage, especially in childhood.
References
Proceedings Article
Resilient distributed datasets: a fault-tolerant abstraction for in-memory cluster computing
Matei Zaharia, Mosharaf Chowdhury, Tathagata Das, Ankur Dave, Justin Ma, Murphy McCauley, Michael J. Franklin, Scott Shenker, Ion Stoica
TL;DR: Resilient Distributed Datasets is presented, a distributed memory abstraction that lets programmers perform in-memory computations on large clusters in a fault-tolerant manner and is implemented in a system called Spark, which is evaluated through a variety of user applications and benchmarks.
Proceedings Article
An empirical comparison of supervised learning algorithms
TL;DR: A large-scale empirical comparison between ten supervised learning methods: SVMs, neural nets, logistic regression, naive bayes, memory-based learning, random forests, decision trees, bagged trees, boosted trees, and boosted stumps is presented.
Journal Article
Logistic regression and artificial neural network classification models: a methodology review
TL;DR: In this paper, the differences and similarities of these models are summarized from a technical point of view and compared with other machine learning algorithms, using a set of quality criteria for logistic regression and artificial neural networks.
Journal Article
A preliminary performance comparison of five machine learning algorithms for practical IP traffic flow classification
TL;DR: The performance impact of feature set reduction, using Consistency-based and Correlation-based feature selection, is demonstrated on Naïve Bayes, C4.5, Bayesian Network and Naïve Bayes Tree algorithms.
Journal Article
Deep neural networks, gradient-boosted trees, random forests: Statistical arbitrage on the S&P 500
TL;DR: In this article, the effectiveness of deep neural networks (DNN), gradient-boosted-trees (GBT), random forests (RAF), and several ensembles of these methods in the context of statistical arbitrage was evaluated.