Proceedings Article

An empirical study on prediction of heart disease using classification data mining techniques

TL;DR: The paper proposes using pattern recognition and data mining techniques in risk prediction models for the clinical domain of cardiovascular medicine, with the data modelled and classified using classification data mining techniques.
Abstract: This research paper proposes the use of pattern recognition and data mining techniques in risk prediction models in the clinical domain of cardiovascular medicine. The data is to be modelled and classified using classification data mining techniques. A limitation of conventional medical scoring systems is that they rely on intrinsically linear combinations of the input variables, so they are not adept at modelling complex nonlinear interactions in medical domains. This research addresses that limitation with classification models, which can implicitly detect complex nonlinear relationships between dependent and independent variables and can detect possible interactions between predictor variables.
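
The limitation described in the abstract can be illustrated with a minimal sketch. This is not the authors' experiment: the four-row dataset, the weight grid, and the function names are all hypothetical. The outcome depends on an interaction between two binary risk factors (exactly one is present), so a linear score cannot classify every case, while a small decision tree can.

```python
from itertools import product

# Hypothetical toy data: the label is 1 when exactly one risk factor is present.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 1, 1, 0]


def accuracy(predict):
    """Fraction of rows for which predict(a, b) matches the label."""
    hits = sum(predict(a, b) == label for (a, b), label in zip(X, y))
    return hits / len(X)


def best_linear_accuracy(steps=21):
    """Grid-search a linear score w1*a + w2*b + bias > 0; return the best accuracy."""
    grid = [-2 + 4 * i / (steps - 1) for i in range(steps)]
    return max(
        accuracy(lambda a, b: int(w1 * a + w2 * b + bias > 0))
        for w1, w2, bias in product(grid, repeat=3)
    )


def tree_predict(a, b):
    """A depth-two decision tree: split on the first factor, then on the second."""
    if a == 0:
        return 1 if b == 1 else 0
    return 0 if b == 1 else 1


linear_acc = best_linear_accuracy()  # a linear score gets at most 3 of 4 rows right
tree_acc = accuracy(tree_predict)    # the tree captures the interaction exactly
print(linear_acc, tree_acc)
```

The grid search only illustrates the point; the ceiling of three correct rows out of four holds for any choice of weights because this labelling is not linearly separable.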
Citations
Journal ArticleDOI
TL;DR: This study presents a novel hybrid method for CAD diagnosis, including risk factor identification using correlation based feature subset (CFS) selection with particle swarm optimization (PSO) search method and K-means clustering algorithms, which outperforms other techniques.
Abstract: Coronary artery disease (CAD) is caused by atherosclerosis in coronary arteries and results in cardiac arrest and heart attack. For diagnosis of CAD, angiography is used, which is a costly, time-consuming and highly technical invasive method. Researchers are, therefore, prompted to seek alternative methods such as machine learning algorithms that could use noninvasive clinical data for diagnosing the disease and assessing its severity. In this study, we present a novel hybrid method for CAD diagnosis, including risk factor identification using correlation based feature subset (CFS) selection with particle swarm optimization (PSO) search method and K-means clustering algorithms. Supervised learning algorithms such as multi-layer perceptron (MLP), multinomial logistic regression (MLR), fuzzy unordered rule induction algorithm (FURIA) and C4.5 are then used to model CAD cases. We tested this approach on clinical data consisting of 26 features and 335 instances collected at the Department of Cardiology, Indira Gandhi Medical College, Shimla, India. MLR achieves the highest prediction accuracy of 88.4 %. We tested this approach on the benchmark Cleveland heart disease data as well. In this case also, MLR outperforms the other techniques. The proposed hybridized model improves the accuracy of classification algorithms from 8.3 % to 11.4 % for the Cleveland data. The proposed method is, therefore, a promising tool for identification of CAD patients with improved prediction accuracy.
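
The subset-scoring idea behind CFS can be sketched in a few lines. The merit formula below is the standard CFS heuristic, k * r_cf / sqrt(k + k(k-1) * r_ff), where r_cf is the mean feature-to-class correlation and r_ff the mean feature-to-feature correlation. Everything else is an assumption made for illustration: Pearson correlation is used as the correlation measure, the toy columns are invented, and the PSO search and K-means steps from the abstract are not reproduced.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences (0.0 if constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def cfs_merit(features, target):
    """CFS merit of a feature subset: rewards relevance, penalizes redundancy."""
    k = len(features)
    r_cf = sum(abs(pearson(f, target)) for f in features) / k
    pairs = list(combinations(features, 2))
    r_ff = sum(abs(pearson(a, b)) for a, b in pairs) / len(pairs) if pairs else 0.0
    return k * r_cf / sqrt(k + k * (k - 1) * r_ff)


# Hypothetical toy columns: f1 and f2 are uncorrelated, and the target is their sum.
f1 = [0, 0, 1, 1, 0, 0, 1, 1]
f2 = [0, 1, 0, 1, 0, 1, 0, 1]
target = [a + b for a, b in zip(f1, f2)]

merit_single = cfs_merit([f1], target)
merit_complementary = cfs_merit([f1, f2], target)  # two relevant, non-redundant features
merit_redundant = cfs_merit([f1, f1], target)      # a duplicated feature adds nothing
print(merit_single, merit_complementary, merit_redundant)
```

A search method such as PSO would propose candidate subsets and keep the one with the highest merit; here the three subsets are simply scored directly.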

161 citations


Cites methods from "An empirical study on prediction of..."

  • ...[47] employed Naïve Bayes, K-Nearest Neighbor and Decision Tree with CFS, Chi square, consistency subset, filtered attribute, filtered subset and gain ratio methods for dimensionality reduction....


Journal Article
TL;DR: This research intends to provide a detailed description of the Naive Bayes and decision tree classifiers that are applied in the research, particularly in the prediction of heart disease.
Abstract: The success of data mining in highly visible fields like marketing, e-business, and retail has led to its application in other sectors and industries. Healthcare is among these areas. There is an abundance of data available within healthcare systems. However, there is a scarcity of useful analysis tools to find hidden relationships in the data. This research intends to provide a detailed description of the Naive Bayes and decision tree classifiers that are applied in our research, particularly in the prediction of heart disease. Experiments were conducted to compare the performance of the predictive data mining techniques on the same dataset, and the results reveal that Decision Tree outperforms Bayesian classification.

117 citations


Cites methods from "An empirical study on prediction of..."

  • ...Somasundaram, Professor, Dept of CSE, presented a paper, “An Empirical Study on Prediction of Heart Disease using classification data mining technique” [5]....


Proceedings ArticleDOI
01 Jan 2016
TL;DR: The objective of the paper is to predict Chronic Kidney Disease using classification techniques like Naive Bayes and Artificial Neural Network, and the experimental results implemented in the Rapidminer tool show that Naive Bayes produces more accurate results than Artificial Neural Network.
Abstract: Data mining has been a current trend for attaining diagnostic results. A huge amount of unmined data is collected by the healthcare industry in order to discover hidden information for effective diagnosis and decision making. Data mining is the process of extracting hidden information from massive datasets and categorizing valid and unique patterns in data. There are many data mining techniques, such as clustering, classification, association analysis and regression. The objective of our paper is to predict Chronic Kidney Disease (CKD) using classification techniques like Naive Bayes and Artificial Neural Network (ANN). The experimental results implemented in the Rapidminer tool show that Naive Bayes produces more accurate results than Artificial Neural Network.

99 citations


Additional excerpts

  • ...in predicting heart disease [6], [7], [8], [9], [11], [12], [13], [16], [18], [22], [24]....


Journal ArticleDOI
TL;DR: In this article, the authors proposed a system for out-patient (OP) centric Long Term Evolution-Advanced (LTE-A) network optimization, where big data harvested from the OPs' medical records, along with current readings from their body-connected medical IoT sensors are processed and analyzed to predict the likelihood of a life-threatening medical condition, for instance, an imminent stroke.
Abstract: Big data analytics is one of the state-of-the-art tools to optimize networks and transform them from merely being a blind tube that conveys data into a cognitive, conscious, and self-optimizing entity that can intelligently adapt according to the needs of its users. This, in fact, can be regarded as one of the highest forthcoming priorities of future networks. In this paper, we propose a system for Out-Patient (OP) centric Long Term Evolution-Advanced (LTE-A) network optimization. Big data harvested from the OPs' medical records, along with current readings from their body-connected medical IoT sensors, are processed and analyzed to predict the likelihood of a life-threatening medical condition, for instance, an imminent stroke. This prediction is used to ensure that the OP is assigned optimal LTE-A Physical Resource Blocks (PRBs) to transmit their critical data to their healthcare provider with minimal delay. To the best of our knowledge, this is the first time big data analytics are utilized to optimize a cellular network in an OP-conscious manner. The PRB assignment is optimized using Mixed Integer Linear Programming (MILP) and a real-time heuristic. Two approaches are proposed, the Weighted Sum Rate Maximization (WSRMax) approach and the Proportional Fairness (PF) approach. The approaches increased the OPs' average SINR by 26.6% and 40.5%, respectively. The WSRMax approach increased the system's total SINR to a level higher than that of the PF approach; however, the PF approach reported higher SINRs for the OPs, better fairness and a lower margin of error.

89 citations

Proceedings Article
11 Mar 2015
TL;DR: The main objective of this paper is to develop a prototype that can determine and extract unknown knowledge related to heart disease from a database of past heart disease records, which can answer complicated queries for detecting heart disease and assist medical practitioners in making smart clinical decisions that traditional decision support systems could not.
Abstract: Heart disease prediction is regarded as one of the most complicated tasks in the field of medical sciences, so there is a need for a decision support system that detects heart disease in a patient. In this paper, we propose an efficient approach that hybridizes a genetic algorithm with the back propagation technique for heart disease prediction. The medical field has come a long way in treating patients with various kinds of diseases. Among the most threatening is heart disease, which cannot be observed with the naked eye and strikes suddenly once its limits are reached. Bad clinical decisions can cause the death of a patient, which no hospital can afford. To achieve correct and cost-effective treatment, computer-based support systems can be developed to make good decisions. Many hospitals use hospital information systems to manage their healthcare or patient data. These systems produce huge amounts of data in the form of images, text, charts and numbers. Sadly, this data is rarely used to support medical decision making. A great deal of hidden information in this data remains unexplored, which raises the important question of how to turn the data into useful information. There is therefore a need for a system that helps practitioners predict heart disease before it occurs. The main objective of this paper is to develop a prototype that can determine and extract unknown knowledge (patterns and relations) related to heart disease from a database of past heart disease records. It can answer complicated queries for detecting heart disease and thus assist medical practitioners in making smart clinical decisions that traditional decision support systems could not. By providing efficient treatments, it can help to reduce the cost of treatment.

77 citations

References
Journal ArticleDOI
TL;DR: In this paper, a survey of the available data mining techniques is provided and a comparative study of such techniques is presented, based on a database researcher's point-of-view.
Abstract: Mining information and knowledge from large databases has been recognized by many researchers as a key research topic in database systems and machine learning, and by many industrial companies as an important area with an opportunity of major revenues. Researchers in many different fields have shown great interest in data mining. Several emerging applications in information-providing services, such as data warehousing and online services over the Internet, also call for various data mining techniques to better understand user behavior, to improve the service provided and to increase business opportunities. In response to such a demand, this article provides a survey, from a database researcher's point of view, on the data mining techniques developed recently. A classification of the available data mining techniques is provided and a comparative study of such techniques is presented.

2,327 citations

Proceedings ArticleDOI
29 Nov 2001
TL;DR: The authors propose a new associative classification method, CMAR, i.e., Classification based on Multiple Association Rules, which extends an efficient frequent pattern mining method, FP-growth, constructs a class distribution-associated FP-tree, and mines large databases efficiently.
Abstract: Previous studies propose that associative classification has high classification accuracy and strong flexibility at handling unstructured data. However, it still suffers from the huge set of mined rules and sometimes biased classification or overfitting since the classification is based on only a single high-confidence rule. The authors propose a new associative classification method, CMAR, i.e., Classification based on Multiple Association Rules. The method extends an efficient frequent pattern mining method, FP-growth, constructs a class distribution-associated FP-tree, and mines large databases efficiently. Moreover, it applies a CR-tree structure to store and retrieve mined association rules efficiently, and prunes rules effectively based on confidence, correlation and database coverage. The classification is performed based on a weighted χ² analysis using multiple strong association rules. Our extensive experiments on 26 databases from the UCI machine learning database repository show that CMAR is consistent, highly effective at classification of various kinds of databases and has better average classification accuracy in comparison with CBA and C4.5. Moreover, our performance study shows that the method is highly efficient and scalable in comparison with other reported associative classification methods.
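
As a rough illustration of the correlation measure mentioned in this abstract, the sketch below computes the ordinary Pearson χ² statistic for a single association rule P -> c from support counts. It does not reproduce CMAR's weighting across multiple rules; the function name, its parameters, and the example counts are hypothetical.

```python
def rule_chi_square(n, sup_p, sup_c, sup_pc):
    """Pearson chi-square for the 2x2 table of a rule P -> c.

    n      : number of transactions in the database
    sup_p  : transactions matching the rule body P
    sup_c  : transactions with class label c
    sup_pc : transactions matching both P and c
    """
    a = sup_pc                      # P and c
    b = sup_p - sup_pc              # P and not c
    c = sup_c - sup_pc              # not P and c
    d = n - sup_p - sup_c + sup_pc  # neither P nor c
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 0.0  # a degenerate margin carries no correlation information
    return n * (a * d - b * c) ** 2 / denominator


# Hypothetical counts: 100 transactions, body and class each cover 50.
strong_rule = rule_chi_square(100, 50, 50, 40)       # body and class co-occur often
independent_rule = rule_chi_square(100, 50, 50, 25)  # co-occurrence as expected by chance
print(strong_rule, independent_rule)
```

A larger statistic indicates that the rule body and the class label are more strongly associated than chance would predict, which is the kind of evidence a classifier could combine across several matching rules.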

1,336 citations

Journal ArticleDOI
TL;DR: After a decade of fundamental interdisciplinary research in machine learning, the spadework in this field has been done; the 1990s should see the widespread exploitation of knowledge discovery as an aid to assembling knowledge bases.
Abstract: After a decade of fundamental interdisciplinary research in machine learning, the spadework in this field has been done; the 1990s should see the widespread exploitation of knowledge discovery as an aid to assembling knowledge bases. The contributors to the AAAI Press book Knowledge Discovery in Databases were excited at the potential benefits of this research. The editors hope that some of this excitement will communicate itself to AI Magazine readers of this article.

1,332 citations

01 Jan 1991
TL;DR: The contributors to the AAAI Press book Knowledge Discovery in Databases discuss the potential benefits of knowledge discovery research, and the editors hope that some of their excitement will communicate itself to AI Magazine readers of this article.
Abstract: After a decade of fundamental interdisciplinary research in machine learning, the spadework in this field has been done; the 1990s should see the widespread exploitation of knowledge discovery as an aid to assembling knowledge bases. The contributors to the AAAI Press book Knowledge Discovery in Databases were excited at the potential benefits of this research. The editors hope that some of this excitement will communicate itself to AI Magazine readers of this article.

1,292 citations