Hybrid prediction model for Type-2 diabetic patients

doi:10.1016/J.ESWA.2010.05.078

Journal Article

B. M. Patil, +2 more

- 01 Dec 2010 -

Expert Systems With Applications

- Vol. 37, Iss: 12, pp 8102-8108

TL;DR

This study proposes a Hybrid Prediction Model (HPM) that uses the Simple K-means clustering algorithm to validate the chosen class labels of the given data and then applies a classification algorithm to the resulting set.

Abstract:

A wide range of computational methods and tools for data analysis is available. In this study we took advantage of these technological advances to develop models for predicting Type-2 diabetes in patients. We aim to investigate how the incidence of diabetes is affected by patients' characteristics and measurements. Medical researchers and practitioners require efficient predictive modeling. This study proposes a Hybrid Prediction Model (HPM) that uses the Simple K-means clustering algorithm to validate the chosen class labels of the given data (incorrectly classified instances are removed, i.e. the pattern is extracted from the original data) and subsequently applies a classification algorithm to the resulting set. The C4.5 algorithm is used to build the final classifier model, using the k-fold cross-validation method. The Pima Indians diabetes data was obtained from the University of California at Irvine (UCI) machine learning repository. Various researchers have previously applied a wide range of classification methods to this dataset in order to find the best-performing algorithm, achieving accuracies in the range of 59.4-84.05%. The proposed HPM, however, obtained a classification accuracy of 92.38%. To evaluate the performance of the proposed method, sensitivity and specificity, performance measures commonly used in medical classification studies, were used.
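The label-validation step and the two evaluation measures named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that "validating" means keeping only instances whose recorded class matches the majority class of their k-means cluster, it replaces random centroid seeding with a deterministic farthest-first choice so the run is repeatable, and it uses a tiny made-up 2-D dataset rather than the Pima Indians data. The function names (`kmeans`, `filter_by_cluster_agreement`, `sensitivity_specificity`) are hypothetical.

```python
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k=2, iters=100):
    """Lloyd's k-means; returns one cluster index per point.

    Assumption: centroids are seeded farthest-first (deterministic),
    not randomly as in typical library implementations.
    """
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))
    assign = None
    for _ in range(iters):
        new_assign = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in points]
        if new_assign == assign:  # converged: assignments stopped changing
            break
        assign = new_assign
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def filter_by_cluster_agreement(points, labels, k=2):
    """Indices of instances whose label equals their cluster's majority label."""
    assign = kmeans(points, k)
    majority = {}
    for c in set(assign):
        counts = Counter(l for l, a in zip(labels, assign) if a == c)
        majority[c] = counts.most_common(1)[0][0]
    return [i for i, (l, a) in enumerate(zip(labels, assign)) if majority[a] == l]


def sensitivity_specificity(y_true, y_pred, positive=1):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)


# Made-up data: two well-separated groups, with the last two rows
# deliberately given the "wrong" label for their neighbourhood.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (1.1, 1.3),
          (8.0, 8.0), (8.2, 7.9), (7.8, 8.1), (8.1, 8.3),
          (1.0, 0.9), (8.0, 8.2)]
labels = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]

kept = filter_by_cluster_agreement(points, labels)
print("kept indices:", kept)  # rows 8 and 9 are dropped

# Made-up predictions, only to show how the two measures are computed.
print(sensitivity_specificity([1, 1, 1, 0, 0, 0, 0, 1],
                              [1, 1, 0, 0, 0, 1, 0, 1]))
```

In a full pipeline of the kind the abstract describes, the retained rows would then be passed to a decision-tree learner and scored with k-fold cross-validation. That step is omitted here because it depends on the tree implementation chosen.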

Citations

Journal Article

Type 2 diabetes mellitus prediction model based on data mining

Wu Han, +4 more

- 01 Jan 2018 -

Informatics in Medicine Unlocked

TL;DR: A novel data mining model for predicting type 2 diabetes mellitus (T2DM), built on a series of preprocessing procedures, is proposed and shown to be useful for the realistic health management of diabetes.

Journal Article

Feature selection and classification systems for chronic disease prediction: A review

Divya Jain, +1 more

- 01 Nov 2018 -

Egyptian Informatics Journal

TL;DR: This work presents a comprehensive overview of various feature selection methods and their inherent pros and cons, and analyzes adaptive classification systems and parallel classification systems for chronic disease prediction.

Journal Article

Review: Knowledge discovery in medicine: Current issue and future trend

Nura Esfandiari, +3 more

- 01 Jul 2014 -

Expert Systems With Applications

TL;DR: The main idea in this paper is to describe key papers and provide some guidelines to help medical practitioners to explore previous works and identify interesting areas for future research.

Journal Article

Hybrid Prediction Model for Type 2 Diabetes and Hypertension Using DBSCAN-Based Outlier Detection, Synthetic Minority Over Sampling Technique (SMOTE), and Random Forest

Muhammad Fazal Ijaz, +3 more

- 08 Aug 2018 -

Applied Sciences

TL;DR: The result showed that by integrating DBSCAN-based outlier detection, SMOTE, and RF, diabetes and hypertension could be successfully predicted and the proposed HPM provided the best performance result as compared to other models for predicting diabetes as well as hypertension.

Journal Article

Hybrid prediction model with missing value imputation for medical data

Archana Purwar, +1 more

- 01 Aug 2015 -

Expert Systems With Applications

TL;DR: The proposed HPM-MI system significantly improves data quality by using the best imputation technique, selected after quantitative analysis of eleven imputation approaches, and will be very useful for prediction in the medical domain, especially when the number of missing values in the data set is large.

References

Some methods for classification and analysis of multivariate observations

James B. MacQueen

TL;DR: This paper describes the k-means algorithm, a process for partitioning an N-dimensional population into k sets on the basis of a sample; the k-means concept generalizes the ordinary sample mean, and the resulting partitions are shown to be reasonably efficient in the sense of within-class variance.

Book

Data Mining: Concepts and Techniques

Jiawei Han, +2 more

TL;DR: This book presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects, and provides a comprehensive, practical look at the concepts and techniques you need to get the most out of real business data.

Book

C4.5: Programs for Machine Learning

J. Ross Quinlan

TL;DR: A complete guide to the C4.5 system as implemented in C for the UNIX environment, which starts from simple core learning methods and shows how they can be elaborated and extended to deal with typical problems such as missing data and overfitting.

Book

Data Mining: Practical Machine Learning Tools and Techniques

Ian H. Witten, +2 more

TL;DR: This highly anticipated third edition of the most acclaimed work on data mining and machine learning will teach you everything you need to know about preparing inputs, interpreting outputs, evaluating results, and the algorithmic methods at the heart of successful data mining.

Journal Article

SMOTE: synthetic minority over-sampling technique

Nitesh V. Chawla, +3 more

- 01 Jan 2002 -

Journal of Artificial Intelligence Resea...

TL;DR: This article presents a method of over-sampling the minority class by creating synthetic minority class examples; the method is evaluated using the area under the Receiver Operating Characteristic curve (AUC) and the ROC convex hull strategy.
