A Survey of Data Mining Techniques on Medical Data for Finding Locally Frequent Diseases

Mohammed Abdul Khaleel, Sateesh Kumar Pradham

01 Jan 2013

TL;DR: The main focus of this paper is to analyze the data mining techniques required for medical data mining, especially those used to discover locally frequent diseases such as heart ailments, lung cancer, breast cancer and so on.

Abstract: In the last decade there has been increasing use of data mining techniques on medical data to discover useful trends or patterns that support diagnosis and decision making. Data mining techniques such as clustering, classification, regression, association rule mining and CART (Classification and Regression Tree) are widely used in the healthcare domain. Data mining algorithms, when appropriately used, are capable of improving the quality of prediction, diagnosis and disease classification. The main focus of this paper is to analyze the data mining techniques required for medical data mining, especially those used to discover locally frequent diseases such as heart ailments, lung cancer, breast cancer and so on. We evaluate the data mining techniques for finding locally frequent patterns in terms of cost, performance, speed and accuracy. We also compare data mining techniques with conventional methods.
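
Illustrative sketch (not taken from the paper): the abstract does not define "locally frequent", so the example below assumes it means a diagnosis whose share of the records within one locality reaches a minimum support threshold. The function name locally_frequent, the min_support parameter and the sample records are all hypothetical and were invented for illustration.

```python
from collections import Counter, defaultdict


def locally_frequent(records, min_support):
    """Return {locality: {disease: support}} for every disease whose share
    of that locality's records is at least min_support (assumed definition)."""
    counts = defaultdict(Counter)
    for locality, disease in records:
        counts[locality][disease] += 1

    result = {}
    for locality, counter in counts.items():
        total = sum(counter.values())
        # Support is the fraction of the locality's records with this diagnosis.
        result[locality] = {
            disease: n / total
            for disease, n in counter.items()
            if n / total >= min_support
        }
    return result


# Hypothetical (locality, diagnosis) records, invented for illustration only.
records = [
    ("north", "heart disease"), ("north", "heart disease"),
    ("north", "lung cancer"), ("north", "heart disease"),
    ("south", "breast cancer"), ("south", "lung cancer"),
    ("south", "breast cancer"), ("south", "heart disease"),
]
frequent = locally_frequent(records, min_support=0.5)
print(frequent)
```

Support is computed per locality rather than over the whole data set, so a diagnosis that is rare overall can still be reported for the one region where it is concentrated.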

Citations

Proceedings Article•DOI•

Analysis of data mining techniques for heart disease prediction

Marjia Sultana¹, Afrin Haider¹, Mohammad Shorif Uddin¹•Institutions (1)

Jahangirnagar University¹

01 Sep 2016

TL;DR: Based on the performance measures, the SMO and Bayes Net techniques outperform the KStar, Multilayer Perceptron and J48 techniques.

Abstract: Heart disease is considered one of the major causes of death throughout the world. It cannot be easily predicted by medical practitioners, as prediction is a difficult task that demands expertise and higher knowledge. This paper addresses the prediction of heart disease from input attributes using data mining techniques. We have investigated heart disease prediction using KStar, J48, SMO, Bayes Net and Multilayer Perceptron through the Weka software. The performance of these data mining techniques is measured by combining the results of predictive accuracy, ROC curve and AUC value, using a standard data set as well as a collected data set. Based on these performance measures, the SMO and Bayes Net techniques outperform the KStar, Multilayer Perceptron and J48 techniques.
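
Illustrative sketch (not the authors' Weka workflow): the abstract evaluates classifiers by predictive accuracy and AUC. The example below shows one common way to compute both from a classifier's scores in plain Python, using the pairwise-ranking definition of AUC. The labels and scores are hypothetical, and the 0.5 decision threshold is an assumption.

```python
def auc(labels, scores):
    """AUC as the probability that a randomly chosen positive example
    scores higher than a randomly chosen negative one (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of examples classified correctly at the given threshold."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Hypothetical labels (1 = disease present) and classifier scores.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.7]
print(auc(labels, scores), accuracy(labels, scores))
```

Accuracy depends on the chosen threshold, while AUC summarizes ranking quality across all thresholds, which is one reason the two are often reported together.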

79 citations

Cites background from "A Survey of Data Mining Techniques ..."

...Papers [2], [6], [7] studied the applications of different data mining techniques for prediction of various diseases....
[...]
...Khaled and Das [2] evaluated the different data mining techniques to find frequent pattern based on cost, performance, speed and accuracy....
[...]

Journal Article•DOI•

Optimistic Multi-granulation Rough Set Based Classification for Medical Diagnosis

Sudhir Kumar¹, H. Hannah Inbarani¹•Institutions (1)

Periyar University¹

01 Jan 2015-Procedia Computer Science

TL;DR: The performance of the proposed optimistic multi-granulation rough set based classification is compared with other rough set based (RS), Kth Nearest Neighbor (KNN) and Back Propagation Neural Network (BPN) approaches using various classification measures.

44 citations

Proceedings Article•DOI•

Dietary prediction for patients with Chronic Kidney Disease (CKD) by considering blood potassium level using machine learning algorithms

M.P.N.M. Wickramasinghe¹, D.M. Perera¹, K.A.D.C.P. Kahandawaarachchi¹•Institutions (1)

Sri Lanka Institute of Information Technology¹

01 Dec 2017

TL;DR: The experimental results show that Multiclass Decision Forest algorithm gives a better result than the other classification algorithms and produces 99.17% accuracy.

Abstract: Kidney damage and diminished function that lasts longer than three months is known as Chronic Kidney Disease (CKD). The primary goal of this research study is to identify the suitable diet plan for a CKD patient by applying the classification algorithms on the test result obtained from patients' medical records. The aim of this work is to control the disease using the suitable diet plan and to identify that suitable diet plan using classification algorithms. The suggested work pacts with the recommendation of various diet plans by using predicted potassium zone for CKD patients according to their blood potassium level. The experiment is performed on different algorithms like Multiclass Decision Jungle, Multiclass Decision Forest, Multiclass Neural Network and Multiclass Logistic Regression. The experimental results show that Multiclass Decision Forest algorithm gives a better result than the other classification algorithms and produces 99.17% accuracy.

34 citations

Cites background from "A Survey of Data Mining Techniques ..."

...For example, data mining can be utilized to mining medicinal information as health area generates a lot of data about ailments, pathologies and patients [2]....
[...]

Proceedings Article•DOI•

Data mining techniques for medical data: A review

Subhash Chandra Pandey¹•Institutions (1)

Birla Institute of Technology, Mesra¹

01 Oct 2016

TL;DR: The merits and demerits of frequently used data mining techniques in the domain of health care and medical data have been compared and an analytical approach regarding the uniqueness of medical data in health care is presented.

Abstract: Data mining is an important area of research and is pragmatically used in different domains like finance, clinical research, education, healthcare etc. Further, the scope of data mining has been thoroughly reviewed and surveyed by many researchers in the domain of healthcare, which is an active interdisciplinary area of research. In fact, knowledge extraction from medical data is a challenging and complex task. The main motive of this review paper is to give a review of data mining in the purview of healthcare. Moreover, the intertwining and interrelation of previous research have been presented in a novel manner. Furthermore, the merits and demerits of frequently used data mining techniques in the domain of health care and medical data have been compared. The use of different data mining tasks in health care is also discussed. An analytical approach regarding the uniqueness of medical data in health care is also presented.

25 citations

Book Chapter•DOI•

Machine Learning Techniques for Thyroid Disease Diagnosis: A Systematic Review

Shaik Razia¹, P. Siva Kumar¹, A. Srinivasa Rao¹•Institutions (1)

K L University¹

01 Jan 2020

TL;DR: In Disease Diagnosis recognition of patterns is so important for identifying the disease accurately and machine learning is the field which is used for building the models that can predict the output based upon the inputs which are correlated based on the previous data.

Abstract: In disease diagnosis, recognition of patterns is important for identifying the disease accurately. Machine learning is the field used for building models that can predict the output from the inputs, based on correlations in previous data. Disease identification is the most crucial task in treating any disease. Classification algorithms are used for classifying the disease. Several classification algorithms and dimensionality reduction algorithms are used. Machine learning gives computers the capacity to learn without being modified externally. By using a classification algorithm, the hypothesis that best fits a set of observations can be selected from a set of alternatives. Machine learning is used for high-dimensional and multi-dimensional data. Sophisticated and automatic algorithms can be developed using machine learning.

13 citations

References

Book•

Data mining techniques

Arun K. Pujari

01 Jan 2001

1,294 citations

Journal Article•DOI•

A comparison of linear genetic programming and neural networks in medical data mining

Markus Brameier, Wolfgang Banzhaf

01 Feb 2001-IEEE Transactions on Evolutionary Computation

TL;DR: Two acceleration methods for linear genetic programming (GP) are discussed: an efficient algorithm that eliminates intron code and a demetic approach that virtually parallelizes the system on a single processor. On medical classification problems, GP performs comparably to neural networks in classification and generalization.

Abstract: We introduce a new form of linear genetic programming (GP). Two methods of acceleration of our GP approach are discussed: 1) an efficient algorithm that eliminates intron code and 2) a demetic approach to virtually parallelize the system on a single processor. Acceleration of runtime is especially important when operating with complex data sets, because they are occurring in real-world applications. We compare GP performance on medical classification problems from a benchmark database with results obtained by neural networks. Our results show that GP performs comparably in classification and generalization.

482 citations

Journal Article•DOI•

From the guest editor medical data mining and knowledge discovery

Krzysztof J. Cios¹•Institutions (1)

University of Toledo¹

01 Jul 2000-IEEE Engineering in Medicine and Biology Magazine

TL;DR: This book discusses data Mining-Based Modeling of Human Visual Perception, and the discovery of Clinical Knowledge in Databases Extracted from Hospital Information Systems and Knowledge Discovery in Time Series.

Abstract: Medical Data Mining and Knowledge Discovery * Legal Policy and Security Issues in the Handling of Medical Data * Medical Natural Language Understanding as a Supporting Technology for Data Mining in Healthcare * Anatomic Pathology Data Mining * A Data Clustering and Visualization Methodology for Epidemiological Pathology Discoveries * Mining Structure-Function Associations in a Brain Image Database * ADRIS * Knowledge Discovery in Mortality Records: An Info-Fuzzy Approach * Consistent and Complete Data and "Expert" Mining in Medicine * A Medical Data Mining Application Based on Evolutionary Computation * Methods of Temporal Data Validation and Abstraction in High-Frequency Domains * Data Mining the Matrix Associated Regions for Gene Therapy * Discovery of Temporal Patterns in Sparse Course-of-Disease Data * Data Mining-Based Modeling of Human Visual Perception * Discovery of Clinical Knowledge in Databases Extracted from Hospital Information Systems * Knowledge Discovery in Time Series.

136 citations

Proceedings Article•DOI•

Combination Data Mining Methods with New Medical Data to Predicting Outcome of Coronary Heart Disease

Yanwei Xing¹, Jie Wang¹, Zhihong Zhao¹, Yonghong Gao²•Institutions (2)

New York Academy of Medicine¹, University of Science and Technology of China²

21 Nov 2007

TL;DR: The comparative study of multiple prediction models for survival of CHD patients along with a 10-fold cross-validation provided us with an insight into the relative prediction ability of different data.

Abstract: The prediction of survival of Coronary heart disease (CHD) has been a challenging research problem for medical society. The goal of this paper is to develop data mining algorithms for predicting survival of CHD patients based on 1000 cases. We carry out a clinical observation and a 6-month follow up to include 1000 CHD cases. The survival information of each case is obtained via follow up. Based on the data, we employed three popular data mining algorithms to develop the prediction models using the 502 cases. We also used 10-fold cross-validation methods to measure the unbiased estimate of the three prediction models for performance comparison purposes. The results indicated that the SVM is the best predictor with 92.1% accuracy on the holdout sample, artificial neural networks came out to be the second with 91.0% accuracy and the decision trees models came out to be the worst of the three with 89.6% accuracy. The comparative study of multiple prediction models for survival of CHD patients along with a 10-fold cross-validation provided us with an insight into the relative prediction ability of different data
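
Illustrative sketch (not the authors' code): the abstract relies on 10-fold cross-validation to compare models. The example below shows only the cross-validation skeleton in plain Python. A majority-class baseline stands in for the SVM, neural network and decision tree models, which are not reproduced here. The function names, the seed and the synthetic labels are hypothetical.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) for each of k folds.
    Every sample lands in exactly one test fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def majority_class(train_labels):
    """Placeholder 'model': always predict the most common training label."""
    return max(set(train_labels), key=train_labels.count)


def cross_validated_accuracy(labels, k=10, seed=0):
    """Mean test-fold accuracy of the majority-class baseline over k folds."""
    fold_scores = []
    for train, test in k_fold_indices(len(labels), k, seed):
        predicted = majority_class([labels[i] for i in train])
        correct = sum(labels[i] == predicted for i in test)
        fold_scores.append(correct / len(test))
    return sum(fold_scores) / len(fold_scores)


# Synthetic outcome labels (1 = survived), invented for illustration only.
labels = [1] * 70 + [0] * 30
mean_accuracy = cross_validated_accuracy(labels, k=10)
print(mean_accuracy)
```

Replacing majority_class with a real classifier's fit-and-predict step keeps the same loop, and a baseline of this kind gives a floor against which reported accuracies can be read.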

125 citations

Proceedings Article•DOI•

A Comparative Study of Medical Data Classification Methods Based on Decision Tree and Bagging Algorithms

My Chau Tu¹, Dongil Shin¹, Dongkyoo Shin¹•Institutions (1)

Sejong University¹

12 Dec 2009

TL;DR: This paper proposes the use of decision tree C4.5 algorithm, bagging with decision tree C4.5 algorithm and bagging with Naive Bayes algorithm to identify the heart disease of a patient.

Abstract: Medical data mining has been a popular data mining topic of late. In particular, diagnosing heart disease is one of the important issues, and many researchers have worked on developing intelligent medical decision support systems to help physicians. In this paper, we propose the use of decision tree C4.5 algorithm, bagging with decision tree C4.5 algorithm and bagging with Naive Bayes algorithm to identify the heart disease of a patient and compare the effectiveness, correction rate among them. The data we study is collected from patients with coronary artery disease.

102 citations