Journal of Intelligent Learning Systems and Applications, 2017, 9, 1-16
http://www.scirp.org/journal/jilsa
ISSN Online: 2150-8410
ISSN Print: 2150-8402
DOI: 10.4236/jilsa.2017.91001
January 24, 2017
Survey of Machine Learning Algorithms for Disease Diagnostic
Meherwar Fatima¹, Maruf Pasha²
¹Institute of CS & IT, The Women University Multan, Multan, Pakistan
²Department of Information Technology, Bahauddin Zakariya University, Multan, Pakistan
Abstract
In medical imaging, Computer Aided Diagnosis (CAD) is a rapidly growing, dynamic area of research. In recent years, significant efforts have been made to enhance computer aided diagnosis applications, because errors in medical diagnostic systems can result in seriously misleading medical treatments. Machine learning is important in Computer Aided Diagnosis. Objects such as organs may not be described accurately by a simple equation, so pattern recognition fundamentally involves learning from examples. In the biomedical field, pattern recognition and machine learning promise improved accuracy in the perception and diagnosis of disease. They also promote the objectivity of the decision-making process. For the analysis of high-dimensional and multimodal biomedical data, machine learning offers a worthy approach for building sophisticated, automatic algorithms. This survey paper provides a comparative analysis of different machine learning algorithms for the diagnosis of diseases such as heart disease, diabetes, liver disease, dengue and hepatitis. It draws attention to the suite of machine learning algorithms and tools that are used for the analysis of diseases and the associated decision-making process.
Keywords
Machine Learning, Artificial Intelligence, Machine Learning Techniques
How to cite this paper: Fatima, M. and Pasha, M. (2017) Survey of Machine Learning Algorithms for Disease Diagnostic. Journal of Intelligent Learning Systems and Applications, 9, 1-16. https://doi.org/10.4236/jilsa.2017.91001
Received: October 17, 2016
Accepted: January 21, 2017
Published: January 24, 2017
Copyright © 2017 by authors and Scientific Research Publishing Inc. This work is licensed under the Creative Commons Attribution International License (CC BY 4.0). http://creativecommons.org/licenses/by/4.0/
Open Access
1. Introduction
Artificial Intelligence (AI) aims to enable computers to think and so makes them much more intelligent. Machine learning is a subfield of AI. Many researchers hold that intelligence cannot be developed without learning. There are many types of machine learning techniques, shown in Figure 1: supervised, unsupervised, semi-supervised, reinforcement, evolutionary and deep learning. These techniques are used to classify data sets.
Figure 1. Types of machine learning techniques.
1) Supervised learning: The algorithm is given a training set of examples with the correct targets and, on the basis of this training set, learns to respond correctly to all feasible inputs. Supervised learning is also called learning from exemplars. Classification and regression are its two types.
Classification: It predicts a discrete class, most simply Yes or No, for example, “Is this tumor cancerous?” or “Does this cookie meet our quality standards?”
Regression: It answers “How much?” and “How many?” questions by predicting a numeric value.
2) Unsupervised learning: Correct responses or targets are not provided. An unsupervised learning technique tries to find similarities between the input data and, based on these similarities, groups the data. This is also known as density estimation. Unsupervised learning includes clustering [1].
Clustering: It forms clusters on the basis of similarity.
3) Semi-supervised learning: Semi-supervised learning is a class of supervised learning techniques that also uses unlabeled data for training (generally a small amount of labeled data with a large amount of unlabeled data). It lies between unsupervised learning (unlabeled data) and supervised learning (labeled data).
4) Reinforcement learning: This type of learning is inspired by behaviorist psychology. The algorithm is told when its answer is wrong, but not how to correct it. It has to explore and test various possibilities until it finds the right answer. It is also known as learning with a critic, because the critic does not recommend improvements. Reinforcement learning differs from supervised learning in that correct input and output pairs are not provided, nor are sub-optimal actions explicitly corrected. Moreover, it focuses on on-line performance.
5) Evolutionary learning: Biological evolution can be considered a learning process: biological organisms adapt to improve their survival rates and their chances of having offspring. This model can be used in a computer by applying the idea of fitness to check how accurate a solution is [1].
6) Deep learning: This branch of machine learning is based on a set of algorithms that model high-level abstractions in data. It uses a deep graph with multiple processing layers, made up of many linear and nonlinear transformations.
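As a rough illustration of the difference between supervised and unsupervised learning described above, the following Python sketch uses made-up one-dimensional data. The data values, the function names, and the choice of a nearest-neighbour rule and a two-cluster k-means are assumptions made for illustration; they are not taken from the surveyed papers.

```python
# Toy, hypothetical data: one numeric feature per sample plus a target label.
labeled = [(1.0, "no"), (1.2, "no"), (3.9, "yes"), (4.1, "yes")]


def nearest_neighbor_predict(train, x):
    """Supervised: return the label of the closest labeled training example."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]


def two_means(values, iterations=10):
    """Unsupervised: split unlabeled values into two clusters (1-D k-means, k=2)."""
    centers = [min(values), max(values)]
    clusters = ([], [])
    for _ in range(iterations):
        clusters = ([], [])
        for v in values:
            # Assign each value to the nearer of the two current centers.
            idx = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            clusters[idx].append(v)
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return clusters


# The supervised model uses the targets; the clustering never sees them.
print(nearest_neighbor_predict(labeled, 3.5))   # yes
print(two_means([x for x, _ in labeled]))       # ([1.0, 1.2], [3.9, 4.1])
```

The classifier can name its answer ("yes" or "no") because it was trained on targets, whereas the clustering only reports which values belong together.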
Pattern recognition and data classification have long been valuable. Humans have a very strong ability to sense their environment and act on what they perceive [2]. Through the combined, multidisciplinary effort of machine learning, databases and statistics, big data is broken down into manageable chunks. Today, diagnostic testing is a serious task in the medical sciences. It is very important to reach an exact diagnosis of patients through clinical examination and assessment. Computer-based decision support systems may play a vital role in effective diagnosis and cost-effective management. The health care field generates big data about clinical assessments, patient reports, cures, follow-ups, medication and so on, which is complex to organize suitably. Inappropriate management of the data has affected the quality of its organization. The growing amount of data requires proper means to extract and process it effectively and efficiently [3]. One of the many machine learning applications is to build a classifier that divides the data, on the basis of their attributes, into two or more classes. Such classifiers are used for medical data analysis and disease detection.
Initially, ML algorithms were designed and employed to analyze medical data sets. Today, ML offers various tools for efficient data analysis. Especially in the last few years, the digital revolution has provided comparatively low-cost and accessible means for collecting and storing data. New and modern hospitals are equipped with machines for data collection and examination, which enable them to collect and share data in large information systems. ML technologies are very effective for the analysis of medical data, and a great deal of work has been done on diagnostic problems. In modern hospitals or their dedicated data sections, correct diagnostic data are kept as medical records or reports. To run an algorithm, the records of correctly diagnosed patients are entered into a computer as input. A classifier can then be derived automatically from these previously solved cases. Physicians can use this derived classifier to diagnose a new patient with greater speed and accuracy. These classifiers can also be used to train non-specialists or students to diagnose the problem [4].
In the past, ML has given us self-driving cars, speech recognition, efficient web search, and an improved understanding of the human genome. Today machine learning is so pervasive that one may use it many times a day without knowing it. Many researchers consider it the best way to make progress towards human-level intelligence. Machine learning techniques (MLT) explore electronic health records, which generally contain high-dimensional patterns and multiple data sets. Pattern recognition is the theme of MLT that supports prediction and decision-making for diagnosis and treatment planning. Machine learning algorithms are able to manage huge amounts of data, to combine data from dissimilar sources, and to integrate background information into the study [3].
2. Diagnosis of Diseases by Using Different Machine Learning Algorithms
Many researchers have worked on different machine learning algorithms for disease diagnosis and have accepted that these algorithms work well in the diagnosis of different diseases. A figurative overview of the diseases diagnosed by Machine Learning Techniques (MLT) is shown in Figure 2. The diseases covered in this survey are heart disease, diabetes, liver disease, dengue and hepatitis.
Figure 2. Diseases diagnosed by MLT.
2.1. Heart Disease
Otoom et al. [5] presented a system for analysis and monitoring that detects and monitors coronary artery disease. The Cleveland heart data set is taken from UCI. It consists of 303 cases and 76 attributes/features, of which 13 are used. Two tests are performed with three algorithms, Bayes Net, Support Vector Machine (SVM) and Functional Trees (FT), using the WEKA tool. In the Holdout test, SVM attains 88.3% accuracy. In the Cross Validation test, SVM and Bayes Net both reach 83.8% accuracy, while FT attains 81.5%. The 7 best features are then selected with the Best First selection algorithm and evaluated with the Cross Validation test. On these 7 selected features, Bayes Net attains 84.5% accuracy, SVM 85.1% and FT 84.5%.
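The Holdout and Cross Validation tests reported for Otoom et al. estimate accuracy in two different ways. The sketch below shows the general idea in plain Python under stated assumptions: the toy data, the nearest-centroid stand-in classifier, the 70% training fraction and the choice of 10 folds are all hypothetical, and the code is not the WEKA procedure used in the study.

```python
import random


def train_centroids(rows):
    """Fit a stand-in model: the mean feature value of each class."""
    sums = {}
    for x, y in rows:
        total, count = sums.get(y, (0.0, 0))
        sums[y] = (total + x, count + 1)
    return {y: total / count for y, (total, count) in sums.items()}


def accuracy(model, rows):
    """Fraction of rows whose nearest class centroid matches the true label."""
    hits = sum(1 for x, y in rows
               if min(model, key=lambda c: abs(model[c] - x)) == y)
    return hits / len(rows)


def holdout_accuracy(rows, train_fraction=0.7, seed=0):
    """Holdout: train once on one part of the data, test once on the rest."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return accuracy(train_centroids(shuffled[:cut]), shuffled[cut:])


def cross_val_accuracy(rows, k=10, seed=0):
    """k-fold cross-validation: every row is used for testing exactly once."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    scores = []
    for i, test_fold in enumerate(folds):
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        scores.append(accuracy(train_centroids(train), test_fold))
    return sum(scores) / k


# Hypothetical, well separated toy data: (feature value, class label).
rows = [(i * 0.1, 0) for i in range(20)] + [(5 + i * 0.1, 1) for i in range(20)]
print(holdout_accuracy(rows), cross_val_accuracy(rows, k=10))
```

A holdout estimate depends on a single split, while cross-validation averages over k splits, which is one plausible reason the two tests can report different accuracies for the same algorithm.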
Vembandasamy et al. [6] diagnosed heart disease with the Naive Bayes algorithm. Naive Bayes applies Bayes' theorem together with a strong independence assumption between the features. The data set was obtained from one of the leading diabetic research institutes in Chennai and consists of 500 patients. Weka is used as the tool, and classification is executed with a 70% Percentage Split. Naive Bayes achieves 86.419% accuracy.
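For reference, the standard Naive Bayes decision rule can be written out as follows. The notation is an assumption made here for illustration (c is a class such as "disease" or "no disease", and x_1, ..., x_n are the patient's attribute values); the surveyed paper does not state the formula.

```latex
% Bayes' theorem for a class c given attribute values x_1, ..., x_n
\[
P(c \mid x_1, \dots, x_n)
  = \frac{P(c)\, P(x_1, \dots, x_n \mid c)}{P(x_1, \dots, x_n)}
\]

% Naive (conditional) independence assumption
\[
P(x_1, \dots, x_n \mid c) = \prod_{i=1}^{n} P(x_i \mid c)
\]

% Resulting decision rule: the denominator does not depend on c and can be dropped
\[
\hat{c} = \arg\max_{c} \; P(c) \prod_{i=1}^{n} P(x_i \mid c)
\]
```

The independence assumption is what makes the method "naive": each attribute contributes its own factor, so the class-conditional probabilities can be estimated one attribute at a time.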
Chaurasia and Pal [7] suggested data mining approaches for heart disease detection. They use the WEKA data mining tool, which contains a set of machine learning algorithms for mining purposes, and apply Naive Bayes, J48 and Bagging. The heart disease data set comes from the UCI machine learning repository and consists of 76 attributes, of which only 11 are employed for prediction. Naive Bayes provides 82.31% accuracy, J48 gives 84.35% and Bagging achieves 85.03%, the best classification rate on this data set.
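Bagging (bootstrap aggregating), the best performer in the study above, trains several models on resampled copies of the training data and lets them vote. The following Python sketch shows the idea under stated assumptions: the toy data are made up, and a one-threshold decision stump stands in for the base learner, which is a simplification and not the WEKA configuration used by Chaurasia and Pal.

```python
import random
from collections import Counter


def fit_stump(rows):
    """Pick the threshold that misclassifies the fewest rows.

    The stump predicts class 1 when x > threshold, otherwise class 0.
    """
    best_threshold, best_errors = None, len(rows) + 1
    for threshold, _ in rows:
        errors = sum(1 for x, y in rows if (1 if x > threshold else 0) != y)
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


def fit_bagging(rows, n_models=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    return [fit_stump([rng.choice(rows) for _ in rows]) for _ in range(n_models)]


def predict_bagging(thresholds, x):
    """Majority vote over the individual stump predictions."""
    votes = Counter(1 if x > t else 0 for t in thresholds)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (feature value, class label).
rows = [(0.5, 0), (1.0, 0), (1.5, 0), (2.0, 0),
        (3.0, 1), (3.5, 1), (4.0, 1), (4.5, 1)]
model = fit_bagging(rows)
print(predict_bagging(model, 0.7), predict_bagging(model, 4.2))
```

Because each stump sees a slightly different sample, individual mistakes tend to be outvoted, which is the usual explanation for why bagging can edge out a single tree.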
Parthiban and Srivatsa [8] worked on the diagnosis of heart disease in diabetic patients using machine learning methods. Naive Bayes and SVM are applied using WEKA. The data set of 500 patients was collected from the Research Institute of Chennai: 142 patients have the disease and 358 do not. Naive Bayes obtains 74% accuracy, while SVM provides the highest accuracy of 94.60%.
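Because the two classes in this data set are unequal in size, it can help to compare the reported accuracies with a trivial baseline that always predicts the larger class. The short calculation below uses only the class counts reported above; the baseline itself is an illustrative assumption and is not part of the original study.

```python
# Class counts reported for the Parthiban and Srivatsa data set.
with_disease, without_disease = 142, 358
total = with_disease + without_disease

# A trivial model that always predicts the majority class ("no disease").
baseline_accuracy = max(with_disease, without_disease) / total

print(f"majority-class baseline: {baseline_accuracy:.1%}")  # 71.6%
```

Against this 71.6% baseline, the reported 74% for Naive Bayes is only a small improvement, whereas the 94.60% for SVM is a substantial one.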
Tan et al. [9] proposed a hybrid technique in which two machine learning algorithms, Genetic Algorithm (GA) and Support Vector Machine (SVM), are combined effectively using a wrapper approach. LIBSVM and the WEKA data mining tool are used in this analysis. Five data sets (Iris, diabetes, breast cancer, heart disease and hepatitis) are taken from the UC Irvine machine learning repository for this experiment. The hybrid GA and SVM approach attains 84.07% accuracy for heart disease, 78.26% for diabetes, 76.20% for breast cancer and 86.12% for hepatitis. A graphical representation of accuracy over time for the detection of heart disease is shown in Figure 3.
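In a wrapper approach, the search algorithm proposes feature subsets and the classifier's accuracy on each subset serves as the fitness score. The sketch below illustrates this loop in plain Python under stated assumptions: a nearest-centroid classifier stands in for the SVM, the data are made up, and the population size, crossover and mutation scheme are hypothetical choices, not the settings of Tan et al.

```python
import random


def centroid_accuracy(train, validation, mask):
    """Stand-in for the SVM: nearest-centroid accuracy on the features where mask[i] == 1."""
    cols = [i for i, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0
    groups = {}
    for features, label in train:
        groups.setdefault(label, []).append([features[i] for i in cols])
    centroids = {label: [sum(col) / len(pts) for col in zip(*pts)]
                 for label, pts in groups.items()}

    def predict(features):
        point = [features[i] for i in cols]
        return min(centroids, key=lambda c: sum((p - q) ** 2
                                                for p, q in zip(point, centroids[c])))

    return sum(predict(f) == y for f, y in validation) / len(validation)


def ga_select(train, validation, n_features, pop_size=12, generations=15, seed=0):
    """Genetic search over bit masks; fitness is the wrapped classifier's accuracy."""
    rng = random.Random(seed)

    def fitness(mask):
        return centroid_accuracy(train, validation, mask)

    # Start from the full feature set plus random masks.
    population = [[1] * n_features]
    population += [[rng.randint(0, 1) for _ in range(n_features)]
                   for _ in range(pop_size - 1)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]       # selection: keep the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)      # one-point crossover
            child = a[:cut] + b[cut:]
            flip = rng.randrange(n_features)        # mutation: flip one bit
            child[flip] = 1 - child[flip]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)


# Hypothetical data: feature 0 separates the classes, features 1 and 2 carry no signal.
train = [([0.0, 5.0, 1.0], 0), ([0.2, 1.0, 5.0], 0),
         ([10.0, 5.0, 1.0], 1), ([9.8, 1.0, 5.0], 1)]
validation = [([0.1, 5.0, 5.0], 0), ([9.9, 1.0, 1.0], 1),
              ([0.3, 1.0, 1.0], 0), ([9.7, 5.0, 5.0], 1)]
best = ga_select(train, validation, n_features=3)
print(best, centroid_accuracy(train, validation, best))
```

Keeping the fitter half of the population unchanged means the best mask found so far is never lost, so the final subset is at least as accurate on the validation rows as the full feature set it started from.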
Analysis:
In the existing literature, SVM offers the highest accuracy, 94.60% in 2012, as shown in Table 1. SVM performs well in many application areas, and it responded well to the attributes or features used by Parthiban and Srivatsa in 2012. In 2015, Otoom et al. used an SVM variant called SMO together with a feature selection (FS) technique to find the best features. SVM achieves 85.1% accuracy on these features, which is lower than the 2012 result. However, the two studies use different data sets, with different training and testing sets and different data types.

References
Journal article: Machine learning for medical diagnosis: history, state of the art and perspective
TL;DR: An overview of the development of intelligent data analysis in medicine from a machine learning perspective: a historical view, a state-of-the-art view, and a view on some future trends in this subfield of applied artificial intelligence.
Book: Machine Learning: An Algorithmic Perspective
TL;DR: This book describes algorithms with code examples backed up by a website that provides working implementations in Python and includes examples based on widely available datasets and practical and theoretical problems to test understanding and application of the material.
Journal article: Diagnosis of diabetes using classification mining techniques
TL;DR: The research hopes to propose a quicker and more efficient technique of diagnosing the disease, leading to timely treatment of the patients, by employing Decision Tree and Naive Bayes algorithms.
Proceedings article: Advantage and drawback of support vector machine functionality
TL;DR: The Support Vector Machine is one of the most efficient machine learning algorithms, which is mostly used for pattern recognition since its introduction in 1990s, and statistics was collected from journals and electronic sources published in the period of 2000 to 2013.
Classification Of Diabetes Disease Using Support Vector Machine
TL;DR: Results obtained show that support vector machine can be successfully used for diagnosing diabetes disease and the machine learning method focus on classifying diabetes disease from high dimensional medical dataset is successful.