Proceedings ArticleDOI

Heart Disease Prediction Using Data Mining Techniques

TL;DR: Reports on taking advantage of various data mining techniques to develop prediction models for heart disease survivability.
Abstract: Studies have shown that heart disease has emerged as the number one cause of death. It is responsible for deaths in all age groups and is common among both males and females. A good solution to this problem is to predict what a patient's health status will be like in the future, so that doctors can start treatment much sooner, which yields better results. This is far preferable to acting at the last minute, when the patient is already at risk; hence heart disease prediction is a widely researched area, and much research and technological advancement has been recorded in related fields. This paper reports on taking advantage of various data mining techniques to develop prediction models for heart disease survivability.
Citations
Proceedings ArticleDOI
01 Mar 2018
TL;DR: This project proposes a model to predict whether a person has heart disease and to provide awareness or a diagnosis accordingly; it compares the accuracies of Support Vector Machine, Gradient Boosting, Random Forest, Naive Bayes and logistic regression classifiers on a regional dataset to arrive at an accurate model for predicting cardiovascular disease.
Abstract: Healthcare is an inevitable task in human life. Cardiovascular disease is a broad category for a range of diseases that affect the heart and blood vessels. Early forecasting of cardiovascular disease supports decisions about changes for high-risk patients, which in turn reduces their risk. The health care industry holds a great deal of medical data, so machine learning algorithms are required to make effective decisions in the prediction of heart disease. Recent research has delved into uniting these techniques to produce hybrid machine learning algorithms. In the proposed research, data pre-processing uses techniques such as removal of noisy data, removal of missing data, filling in default values where applicable, and classification of attributes for prediction and decision making at different levels. The performance of the diagnosis model is assessed using classification accuracy, sensitivity and specificity analysis. This project proposes a prediction model to predict whether a person has heart disease and to provide awareness or a diagnosis accordingly. This is done by comparing the accuracies of Support Vector Machine, Gradient Boosting, Random Forest, Naive Bayes and logistic regression classifiers on a dataset taken in one region, to present an accurate model for predicting cardiovascular disease.
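The evaluation criteria named above (accuracy, sensitivity, specificity) can be sketched in a few lines. This is an illustrative computation from binary confusion counts, not code from the paper; the function names and toy labels are assumptions:

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = disease present)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def evaluate(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    accuracy    = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)   # true positive rate: sick patients caught
    specificity = tn / (tn + fp)   # true negative rate: healthy patients cleared
    return accuracy, sensitivity, specificity

# Toy labels, purely for illustration.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]
print(evaluate(y_true, y_pred))  # (0.75, 0.75, 0.75)
```

Sensitivity matters most in a screening setting like this one: a false negative (a sick patient sent home) is usually costlier than a false positive.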

117 citations


Cites background from "Heart Disease Prediction Using Data..."

  • ...Many of the tools cannot deal with huge data, and most are not centralized, not deployed on the cloud, and consequently not accessible on the web [12]....

    [...]

Proceedings ArticleDOI
06 May 2021
TL;DR: In this paper, the authors describe a qualitative study examining community health workers' perceptions of an AI application for automated disease diagnosis in rural India, and characterize CHWs' knowledge, perceptions, and understandings of AI; and the benefits and challenges that CHWs anticipate as AI applications are integrated into their workflows.
Abstract: Recent advances in Artificial Intelligence (AI) suggest that AI applications could transform healthcare delivery in the Global South. However, as researchers and technology companies rush to develop AI applications that aid the health of marginalized communities, it is critical to consider the needs and perceptions of the community health workers (CHWs) who will have to integrate these AI applications into the essential healthcare services they provide to rural communities. We describe a qualitative study examining CHWs’ perceptions of an AI application for automated disease diagnosis. Drawing on data from 21 interviews with CHWs in rural India, we characterize (1) CHWs’ knowledge, perceptions, and understandings of AI; and (2) the benefits and challenges that CHWs anticipate as AI applications are integrated into their workflows, including their opinions on automation of their work, possible misdiagnosis and errors, data access and surveillance issues, security and privacy challenges, and questions concerning trust. We conclude by discussing the implications of our work for HCI and AI research in low-resource environments.

29 citations

Proceedings ArticleDOI
01 Dec 2018
TL;DR: A cloud-based 4-tier architecture that can significantly improve the prediction and monitoring of patients' health information; the Artificial Neural Network (ANN) achieved the highest performance of all.
Abstract: Heart disease prediction and detection has long been considered a critical issue, and early detection of heart disease is an important problem in health care services (HCS). In a growing number of health care systems, patients are offered therapies and operations that are quite expensive for developing countries. Heart disease is now a prominent chronic public health problem, e.g. a growing concern in the US. The main causes of these diseases are tobacco consumption, a poor lifestyle, lack of physical activity and alcohol intake. Therefore, there is a need for a cloud-based architecture that can efficiently predict and track health information. Machine learning techniques have already been established for solving clinical problems and medical diagnosis. In this study, we propose a cloud-based 4-tier architecture that can significantly improve the prediction and monitoring of patients' health information. We used five popular supervised machine learning techniques for early detection of heart disease. The major purpose of this study is to examine the performance of the selected classification techniques, using prominent evaluation criteria to identify the best-performing technique. We used ten-fold cross-validation to evaluate the performance of the five classifiers. The analysis results indicate that the Artificial Neural Network (ANN) achieved the highest performance of all. Health care researchers and practitioners can draw independent insight from this work when selecting machine learning techniques to apply in their area.
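The ten-fold cross-validation procedure mentioned above can be sketched in plain Python. The fold-splitting below is standard; the majority-class baseline is an illustrative stand-in, not one of the paper's five classifiers:

```python
import random

def kfold_indices(n, k=10, seed=0):
    """Split indices 0..n-1 into k shuffled, near-equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_validate(X, y, train_and_predict, k=10):
    """Mean accuracy of a classifier over k held-out folds."""
    accs = []
    for fold in kfold_indices(len(y), k):
        held_out = set(fold)
        train = [i for i in range(len(y)) if i not in held_out]
        preds = train_and_predict([X[i] for i in train], [y[i] for i in train],
                                  [X[i] for i in fold])
        accs.append(sum(p == y[i] for p, i in zip(preds, fold)) / len(fold))
    return sum(accs) / len(accs)

def majority_baseline(X_train, y_train, X_test):
    """Illustrative stand-in classifier: always predict the majority class."""
    major = max(set(y_train), key=y_train.count)
    return [major] * len(X_test)

# Toy data, purely for illustration.
X = [[i] for i in range(20)]
y = [0] * 12 + [1] * 8
print(round(cross_validate(X, y, majority_baseline), 2))
```

Each of the five classifiers would be passed in place of `majority_baseline`, so every model is scored on the same folds.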

29 citations


Cites background from "Heart Disease Prediction Using Data..."

  • ...Machine learning (ML) methods have been drawn upon to solve different medical and clinical problems [9]....

    [...]

Proceedings ArticleDOI
25 Apr 2019
TL;DR: The health care field has a vast amount of data, and certain techniques are used to process it; this system evaluates medical parameters using data mining classification techniques and reports which algorithm achieves the best accuracy for heart disease prediction.
Abstract: The health care field has a vast amount of data, and certain techniques are used to process it; data mining is one such technique. Heart disease is the leading cause of death worldwide. This system predicts the likelihood of developing heart disease and reports the outcome as a percentage. The datasets used are classified in terms of medical parameters, which the system evaluates using data mining classification techniques. The datasets are processed in Python using two main machine learning algorithms, namely the Decision Tree algorithm and the Naive Bayes algorithm, and the system reports which of the two achieves the better accuracy for heart disease prediction.
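Of the two algorithms compared above, Naive Bayes is the simpler to sketch. The following is a minimal Gaussian Naive Bayes on invented toy data, not the paper's medical dataset or implementation:

```python
import math

class GaussianNB:
    """Minimal Gaussian Naive Bayes: per-class feature means and variances."""

    def fit(self, X, y):
        self.model = {}
        for c in set(y):
            rows = [x for x, t in zip(X, y) if t == c]
            means = [sum(col) / len(rows) for col in zip(*rows)]
            variances = [sum((v - m) ** 2 for v in col) / len(rows) + 1e-9
                         for col, m in zip(zip(*rows), means)]
            self.model[c] = (len(rows) / len(y), means, variances)
        return self

    def predict_one(self, x):
        best, best_lp = None, float("-inf")
        for c, (prior, means, variances) in self.model.items():
            # log P(c) plus one log-Gaussian density term per feature
            lp = math.log(prior)
            for v, m, var in zip(x, means, variances):
                lp += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            if lp > best_lp:
                best, best_lp = c, lp
        return best

    def predict(self, X):
        return [self.predict_one(x) for x in X]

# Two well-separated toy clusters, purely for illustration.
X = [[1.0, 1.1], [1.2, 0.9], [0.8, 1.0], [5.0, 5.2], [4.8, 5.1], [5.2, 4.9]]
y = [0, 0, 0, 1, 1, 1]
clf = GaussianNB().fit(X, y)
print(clf.predict([[1.0, 1.0], [5.0, 5.0]]))  # [0, 1]
```

The "naive" assumption is the per-feature independence implicit in summing one density term per feature; it is often wrong for correlated medical attributes yet still yields a strong baseline.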

26 citations


Cites methods from "Heart Disease Prediction Using Data..."

  • ...Advantages and Disadvantages of each technique can be known using this paper [1]....

    [...]

Book ChapterDOI
24 Apr 2020
TL;DR: This paper assesses and compares three data mining classification methods, namely Naive Bayes (NB), Support Vector Machine (SVM) and Decision Tree, to determine, based on their predictive accuracy, potential approaches for predicting the possibility of heart disease in diabetic patients.
Abstract: The healthcare sector faces many difficulties and challenges in detecting diseases. Healthcare organizations collect bulk amounts of patient data, and data mining methods are utilized to uncover hidden information that is valuable to healthcare specialists for effective diagnostic decision making. Data mining strategies are used in the healthcare industry for a variety of purposes. The objective of this paper is to assess and compare three distinct data mining classification methods, namely Naive Bayes (NB), Support Vector Machine (SVM) and Decision Tree, to determine, based on their predictive accuracy, potential approaches for predicting the possibility of heart disease in diabetic patients.
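A full decision tree, one of the three methods compared above, is built by recursively applying threshold splits. As a hedged sketch of that building block, the following finds the single best split (a "decision stump") on binary-labelled toy data; it is illustrative code, not the paper's implementation:

```python
def fit_stump(X, y):
    """Find the (feature, threshold) split minimising misclassifications.

    Labels must be 0/1. Returns (feature, threshold, label_below, label_at_or_above).
    """
    best, best_err = None, float("inf")
    for f in range(len(X[0])):
        for t in sorted({x[f] for x in X}):
            below = [lab for x, lab in zip(X, y) if x[f] < t]
            above = [lab for x, lab in zip(X, y) if x[f] >= t]
            for lb in (0, 1):                       # try both label assignments
                la = 1 - lb
                err = sum(lab != lb for lab in below) + sum(lab != la for lab in above)
                if err < best_err:
                    best_err, best = err, (f, t, lb, la)
    return best

def predict_stump(stump, X):
    f, t, lb, la = stump
    return [lb if x[f] < t else la for x in X]

# Toy 1-D data with an obvious split between 3 and 10.
stump = fit_stump([[1], [2], [3], [10], [11], [12]], [0, 0, 0, 1, 1, 1])
print(predict_stump(stump, [[2], [11]]))  # [0, 1]
```

A decision tree simply repeats this search on each resulting subset until the leaves are pure enough, which is what makes the learned rules easy for clinicians to inspect.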

25 citations

References
Book
13 May 2011
TL;DR: The amount of data in our world has been exploding, and analyzing large data sets will become a key basis of competition, underpinning new waves of productivity growth, innovation, and consumer surplus, according to research by MGI and McKinsey.
Abstract: The amount of data in our world has been exploding, and analyzing large data sets—so-called big data— will become a key basis of competition, underpinning new waves of productivity growth, innovation, and consumer surplus, according to research by MGI and McKinsey's Business Technology Office. Leaders in every sector will have to grapple with the implications of big data, not just a few data-oriented managers. The increasing volume and detail of information captured by enterprises, the rise of multimedia, social media, and the Internet of Things will fuel exponential growth in data for the foreseeable future.

4,700 citations

Proceedings ArticleDOI
19 Oct 2003
TL;DR: This technique is the only one, to the best of the authors' knowledge, that reflects in the resulting embedding both the data coordinates and pairwise similarities and/or dissimilarities between the data elements.
Abstract: We present a novel family of data-driven linear transformations, aimed at visualizing multivariate data in a low-dimensional space in a way that optimally preserves the structure of the data. The well-studied PCA and Fisher's LDA (linear discriminant analysis) are shown to be special members in this family of transformations, and we demonstrate how to generalize these two methods such as to enhance their performance. Furthermore, our technique is the only one, to the best of our knowledge, that reflects in the resulting embedding both the data coordinates and pairwise similarities and/or dissimilarities between the data elements. Even more so, when information on the clustering (labeling) decomposition of the data is known, this information can be integrated in the linear transformation, resulting in embeddings that clearly show the separation between the clusters, as well as their infrastructure. All this makes our technique very flexible and powerful, and lets us cope with kinds of data that other techniques fail to describe properly.
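The family of linear transformations described above contains PCA as a special member. As a minimal sketch of that special case only (the synthetic data and function name are illustrative, not from the paper), the top-k projection follows from the eigendecomposition of the covariance matrix:

```python
import numpy as np

def pca(X, k):
    """Project the rows of X onto the top-k principal components."""
    Xc = X - X.mean(axis=0)                        # centre each feature
    cov = np.cov(Xc, rowvar=False)                 # feature covariance matrix
    vals, vecs = np.linalg.eigh(cov)               # eigenvalues in ascending order
    top = vecs[:, np.argsort(vals)[::-1][:k]]      # top-k eigenvectors as columns
    return Xc @ top

rng = np.random.default_rng(0)
# Synthetic correlated 2-D data: most variance lies along the y = x direction.
base = rng.normal(size=(200, 1))
X = np.hstack([base, base + 0.1 * rng.normal(size=(200, 1))])
Z = pca(X, 1)
print(Z.shape)  # (200, 1)
```

Fisher's LDA, the other special member, additionally uses class labels, maximizing between-class over within-class scatter rather than total variance; the paper's contribution is a family that generalizes both.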

82 citations

Proceedings ArticleDOI
05 Jul 2005
TL;DR: This paper uses multivariate visualization techniques and interactive visual exploration to study three problems: which dimensionality reduction technique best preserves the interrelationships within a set of text documents; what the sensitivity of the results is to the number of output dimensions; and whether redundant or unimportant words can be removed automatically from the vectors extracted from the documents while still preserving the majority of the information.
Abstract: In the text document visualization community, statistical analysis tools (e.g., principal component analysis and multidimensional scaling) and neurocomputation models (e.g., self-organizing feature maps) have been widely used for dimensionality reduction. Often the resulting dimensionality is set to two, as this facilitates plotting the results. The validity and effectiveness of these approaches largely depend on the specific data sets used and semantics of the targeted applications. To date, there has been little evaluation to assess and compare dimensionality reduction methods and dimensionality reduction processes, either numerically or empirically. The focus of this paper is to propose a mechanism for comparing and evaluating the effectiveness of dimensionality reduction techniques in the visual exploration of text document archives. We use multivariate visualization techniques and interactive visual exploration to study three problems: (a) Which dimensionality reduction technique best preserves the interrelationships within a set of text documents; (b) What is the sensitivity of the results to the number of output dimensions; (c) Can we automatically remove redundant or unimportant words from the vector extracted from the documents while still preserving the majority of information, and thus make dimensionality reduction more efficient. To study each problem, we generate supplemental dimensions based on several dimensionality reduction algorithms and parameters controlling these algorithms. We then visually analyze and explore the characteristics of the reduced dimensional spaces as implemented within a linked, multiview multidimensional visual exploration tool, XmdvTool. We compare the derived dimensions to features known to be present in the original data. Quantitative measures are also used in identifying the quality of results using different numbers of output dimensions.
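The third question above, removing redundant or unimportant words before dimensionality reduction, can be illustrated on a toy term-vector pipeline. This is a hedged sketch with invented documents and a crude document-frequency criterion, not the paper's XmdvTool workflow:

```python
import math
from collections import Counter

def tfidf(docs):
    """Smoothed TF-IDF vectors for tokenised documents."""
    n = len(docs)
    df = Counter(w for d in docs for w in set(d))   # document frequency
    vocab = sorted(df)
    idf = {w: math.log(n / df[w]) + 1.0 for w in vocab}
    return vocab, [[Counter(d)[w] / len(d) * idf[w] for w in vocab] for d in docs]

def prune(docs, vocab, vectors):
    """Drop words occurring in every document: under IDF they carry the least
    discriminative weight (a crude stand-in for redundant-word removal)."""
    n = len(docs)
    df = Counter(w for d in docs for w in set(d))
    keep = [j for j, w in enumerate(vocab) if df[w] < n]
    return [vocab[j] for j in keep], [[v[j] for j in keep] for v in vectors]

# Invented three-document corpus, purely for illustration.
docs = [["heart", "disease", "risk"],
        ["heart", "attack", "risk"],
        ["stock", "market", "risk"]]
vocab, vectors = tfidf(docs)
vocab_pruned, _ = prune(docs, vocab, vectors)
print(vocab_pruned)  # ['attack', 'disease', 'heart', 'market', 'stock']
```

Shrinking the vocabulary this way makes any subsequent dimensionality reduction cheaper, which is exactly the efficiency question the paper evaluates.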

58 citations

Proceedings ArticleDOI
06 Jul 2017
TL;DR: Stored patterns created by the self-organizing map algorithm are used to identify and recognize user patterns in e-transactions, detecting and preventing unauthorized access to banking transactions.
Abstract: Online banking is one of the most common services availed by almost all banking customers in the current era. Every second, banking organizations generate enormous amounts of valuable data from their customers and their transactions. These valuable data need to be stored and analysed effectively using big data analytic techniques so as to give the banking organizations the necessary insights. In today's market, analysing large data sets comprising a variety of data is highly important for discovering hidden patterns, market tendencies, customer preferences and other business insights. The purpose of this research paper is to suggest a machine learning and big data analytics technique to detect and prevent fraudulent online transactions. The model allows storage of a huge volume of online transaction data, which is then cleaned; features are extracted and reduced using the principal component analysis method. The reduced features are used to train the machine learning model, which identifies and recognizes the user patterns related to e-transactions. For any e-transaction carried out by the user, the algorithm first checks for matching user patterns; if there is a match, the transaction succeeds, otherwise it is reported as fraudulent. Thus the stored patterns created by the self-organizing map algorithm detect and prevent unauthorized access to banking transactions.
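A minimal sketch of the self-organizing map idea behind this approach: a 1-D map trained on normalised transaction amounts, where a new amount far from every learned unit is flagged. The data, unit count, learning schedule and tolerance are illustrative assumptions, not the paper's configuration:

```python
import random

def train_som(data, n_units=5, epochs=50, lr=0.5, seed=0):
    """Train a 1-D self-organising map on scalar data.

    Each unit holds one weight; the winning unit and its grid neighbours
    move towards each sample, so adjacent units encode similar values."""
    rng = random.Random(seed)
    weights = [rng.random() for _ in range(n_units)]
    for epoch in range(epochs):
        # Neighbourhood radius and learning rate both shrink over time.
        radius = max(1, n_units // 2 - epoch * n_units // (2 * epochs))
        for x in data:
            winner = min(range(n_units), key=lambda i: abs(weights[i] - x))
            for i in range(n_units):
                if abs(i - winner) <= radius:
                    weights[i] += lr * (1 - epoch / epochs) * (x - weights[i])
    return weights

def is_normal(weights, x, tol=0.2):
    """Treat an amount as normal if some learned unit lies within tol of it."""
    return min(abs(w - x) for w in weights) <= tol

# Invented normalised amounts of routine transactions, purely for illustration.
data = [0.1, 0.12, 0.15, 0.5, 0.52, 0.48, 0.9, 0.88]
som = train_som(data)
print(is_normal(som, 5.0))  # False: far from every stored pattern
```

Real deployments would train on multi-dimensional feature vectors (amount, time, location, device) after the PCA step described above, but the match-or-flag logic is the same.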

13 citations