Improving the Prediction of Heart Failure Patients’ Survival Using SMOTE and Effective Data Mining Techniques
Abid Ishaq, Saima Sadiq, Muhammad Umer, Saleem Ullah, Seyedali Mirjalili, Vaibhav Rupapara, Michele Nappi
TLDR
In this paper, the authors analyze heart failure survivors in a dataset of 299 hospitalized patients and identify significant features and effective data mining techniques that can improve the accuracy of survival prediction for cardiovascular patients.
Abstract
Cardiovascular disease is a substantial cause of mortality and morbidity in the world. In clinical data analytics, predicting the survival of heart disease patients is a great challenge. Data mining transforms the huge amounts of raw data generated by the health industry into useful information that can support informed decisions. Various studies have shown that significant features play a key role in improving the performance of machine learning models. This study analyzes heart failure survivors in a dataset of 299 patients admitted to hospital. The aim is to find significant features and effective data mining techniques that can boost the accuracy of survival prediction for cardiovascular patients. To predict patient survival, this study employs nine classification models: Decision Tree (DT), Adaptive Boosting classifier (AdaBoost), Logistic Regression (LR), Stochastic Gradient Descent classifier (SGD), Random Forest (RF), Gradient Boosting classifier (GBM), Extra Tree Classifier (ETC), Gaussian Naive Bayes classifier (G-NB) and Support Vector Machine (SVM). The class imbalance problem is handled with the Synthetic Minority Oversampling Technique (SMOTE). Furthermore, the machine learning models are trained on the highest-ranked features selected by RF, and the results are compared with those obtained by the same algorithms using the full set of features. Experimental results demonstrate that ETC outperforms the other models, achieving an accuracy of 0.9262 with SMOTE in predicting heart patients' survival.
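The SMOTE step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `smote` and `balance`, the neighbour count `k`, the seed, and the toy data are all assumptions made for illustration, and the sketch uses only the Python standard library. In practice a maintained implementation such as `SMOTE` from the imbalanced-learn package would likely be preferred. The idea is that each synthetic minority row is placed at a random position on the segment between a minority sample and one of its nearest minority-class neighbours.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic rows by interpolating between minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the other minority samples, sorted by distance to the base row.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(X, y, minority_label=1, k=5, seed=0):
    """Oversample the minority class until both classes have the same size."""
    minority = [row for row, label in zip(X, y) if label == minority_label]
    n_majority = len(y) - len(minority)
    n_needed = max(n_majority - len(minority), 0)
    new_rows = smote(minority, n_needed, k=k, seed=seed)
    return X + new_rows, y + [minority_label] * len(new_rows)


# Made-up toy data (not from the paper): three minority rows, six majority rows.
X_toy = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
         [5.0, 5.0], [6.0, 5.0], [5.0, 6.0],
         [6.0, 6.0], [5.5, 5.5], [6.5, 5.0]]
y_toy = [1, 1, 1, 0, 0, 0, 0, 0, 0]

X_bal, y_bal = balance(X_toy, y_toy)
print(y_bal.count(0), y_bal.count(1))
```

Oversampling of this kind is normally applied to the training split only, so that synthetic rows never leak into the data used for evaluation.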
Citations
Journal Article
A Neural Network Ensemble With Feature Engineering for Improved Credit Card Fraud Detection
TL;DR: The authors propose an efficient approach to detect credit card fraud using a neural network ensemble classifier and a hybrid data resampling method; the ensemble is obtained by using a long short-term memory (LSTM) neural network as the base learner in the adaptive boosting technique.
TL;DR: Results show that the classifiers performed better when trained with the resampled data, and the proposed LSTM ensemble outperformed the other algorithms by obtaining a sensitivity and specificity of 0.996 and 0.998, respectively.
Journal Article
A Comprehensive Investigation of the Performances of Different Machine Learning Classifiers with SMOTE-ENN Oversampling Technique and Hyperparameter Optimization for Imbalanced Heart Failure Dataset
Mirza Muntasir Nishat, Ishrak Jahan Ratul, Abdullah Al-Monsur, Abrar Mohammad Ar-Rafi, Sarker Mohammad Nasrullah, Md. Taslim Reza, Md. Rezaul Hoque Khan
TL;DR: This comprehensive investigation visualizes the applicability and compatibility of different machine learning algorithms on such an imbalanced dataset and presents the role of the SMOTE-ENN algorithm and hyperparameter optimization in enhancing the performance of the machine learning algorithms.
Journal Article
A CNN-based novel solution for determining the survival status of heart failure patients with clinical record data: numeric to image
TL;DR: In this paper, a heart failure dataset consisting only of numerical values is converted into image data so that it can be analyzed with the advantages of CNNs; the highest accuracy, 95.13%, is obtained with the ResNet18 model, which is superior to previous studies that used the raw numerical data.
Journal Article
Bidimensional and Tridimensional Poincaré Maps in Cardiology: A Multiclass Machine Learning Study
Leandro Donisi, Carlo Ricciardi, Giuseppe Cesarelli, Armando Coccia, Federica Amitrano, Sarah Adamo, Giovanni D'Addio
TL;DR: The study shows the proposed combination of unconventional features extracted from Poincaré maps and well-known machine learning algorithms represents a valuable approach to automatically classify patients with different cardiac diseases.
References
Journal Article
Random Forests
TL;DR: Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the forest, and are also applicable to regression.
Journal Article
Greedy function approximation: A gradient boosting machine.
TL;DR: A general gradient descent boosting paradigm is developed for additive expansions based on any fitting criterion, and specific algorithms are presented for least-squares, least absolute deviation, and Huber-M loss functions for regression, and multiclass logistic likelihood for classification.
Journal Article
Heart Disease and Stroke Statistics—2019 Update: A Report From the American Heart Association
Emelia J. Benjamin, Paul Muntner, Alvaro Alonso, Márcio Sommer Bittencourt, Clifton W. Callaway, April P. Carson, Alanna M. Chamberlain, Alex R. Chang, Susan Cheng, Sandeep R Das, Francesca N. Delling, Luc Djoussé, Mitchell S.V. Elkind, Jane F. Ferguson, Myriam Fornage, Lori C. Jordan, Sadiya S. Khan, Brett M. Kissela, Kristen L. Knutson, Tak W. Kwan, Daniel T. Lackland, Tené T. Lewis, Judith H. Lichtman, Chris T. Longenecker, Matthew Shane Loop, Pamela L. Lutsey, Seth S. Martin, Kunihiro Matsushita, Andrew E. Moran, Michael E. Mussolino, Martin O'Flaherty, Ambarish Pandey, Amanda M. Perak, Wayne D. Rosamond, Gregory A. Roth, Uchechukwu K.A. Sampson, Gary Satou, Emily B. Schroeder, Svati H. Shah, Nicole L. Spartano, Andrew Stokes, David L. Tirschwell, Connie W. Tsao, Mintu P. Turakhia, Lisa B. VanWagner, John T. Wilkins, Sally S. Wong, Salim S. Virani
TL;DR: The March 5, 2019 statistical update, prepared by the writing group members on behalf of the American Heart Association Council on Epidemiology and Prevention Statistics Committee and Stroke Statistics Subcommittee.
Journal Article
Extremely randomized trees
TL;DR: A new tree-based ensemble method for supervised classification and regression problems that consists of randomizing strongly both attribute and cut-point choice while splitting a tree node and builds totally randomized trees whose structures are independent of the output values of the learning sample.