Proceedings Article

Visual data mining techniques for classification of diabetic patients

TLDR
This research used three techniques (the EM algorithm, h-means+ clustering and a Genetic Algorithm) to form clusters of diabetic patients with similar symptoms. Analysis of the results showed that the h-means+ and double-crossover genetic process based techniques performed better on the performance comparison scale.
Abstract
Clustering is a data mining technique for finding important patterns in large, unorganized data collections. The likelihood approach to clustering is often used by researchers for classification because it is simple and easy to implement. It uses the Expectation-Maximization (EM) algorithm for sampling. The main focus of this research was the classification of diabetic patients. Patients were classified by applying data mining techniques to medical data from the Pima Indian Diabetes (PID) data set. Three techniques were used to form clusters of patients with similar symptoms: the EM algorithm, h-means+ clustering and a Genetic Algorithm (GA). Analysis of the results showed that the h-means+ and double-crossover genetic process based techniques performed better on the performance comparison scale. The simulation tests for the three models were performed in the WEKA software tool. The hypothesis that diabetes cases follow similar patterns in the PID data and in local hospital data was tested and supported, with a correlation coefficient of 0.96 between the two data sets. About 35% of the 768 test samples were found to indicate the presence of diabetes.
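To illustrate the likelihood-based clustering idea mentioned in the abstract, the following is a minimal sketch of EM for a two-component, one-dimensional Gaussian mixture. It is not the authors' WEKA configuration: the function names (`gaussian_pdf`, `em_two_clusters`), the crude min/max initialisation, the iteration count and the synthetic sample are all assumptions made for illustration, and the sample is not drawn from the PID data set.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a 1-D normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def responsibilities(values, weights, means, variances):
    """E-step: posterior probability of each component for every value."""
    resp = []
    for v in values:
        p = [weights[k] * gaussian_pdf(v, means[k], variances[k]) for k in range(2)]
        total = sum(p) or 1e-300  # guard against numerical underflow
        resp.append([pk / total for pk in p])
    return resp


def em_two_clusters(values, iterations=50):
    """Fit a two-component 1-D Gaussian mixture with EM.

    Returns (weights, means, variances, labels), where labels are hard
    cluster assignments taken from the final responsibilities.
    """
    n = len(values)
    overall_mean = sum(values) / n
    overall_var = max(sum((v - overall_mean) ** 2 for v in values) / n, 1e-6)
    # Illustrative initialisation (an assumption): start the two means at
    # the extremes of the data and share the overall variance.
    means = [min(values), max(values)]
    variances = [overall_var, overall_var]
    weights = [0.5, 0.5]

    for _ in range(iterations):
        resp = responsibilities(values, weights, means, variances)
        # M-step: re-estimate each component from its weighted members.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # leave an empty component unchanged
            weights[k] = nk / n
            means[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            var_k = sum(r[k] * (v - means[k]) ** 2 for r, v in zip(resp, values)) / nk
            variances[k] = max(var_k, 1e-6)

    resp = responsibilities(values, weights, means, variances)
    labels = [0 if r[0] >= r[1] else 1 for r in resp]
    return weights, means, variances, labels


# Synthetic one-feature sample with two well-separated groups (made up).
random.seed(0)
sample = [random.gauss(100, 10) for _ in range(100)]
sample += [random.gauss(160, 10) for _ in range(100)]
weights, means, variances, labels = em_two_clusters(sample)
print("weights:", weights)
print("means:", means)
```

Each iteration alternates between assigning soft memberships (E-step) and re-estimating the mixture parameters from those memberships (M-step); the hard labels are only derived at the end.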


Citations
Journal Article

Diagnosis of diabetes using classification mining techniques

TL;DR: The research hopes to propose a quicker and more efficient technique of diagnosing the disease, leading to timely treatment of the patients, by employing Decision Tree and Naive Bayes algorithms.
Journal Article

Diagnosis of Diabetes Using Classification Mining Techniques

TL;DR: In this article, the authors proposed a quicker and more efficient technique of diagnosing the disease, leading to timely treatment of the patients by employing Decision Tree and Naive Bayes algorithms.
Journal Article

Rule Extraction From Support Vector Machines Using Ensemble Learning Approach: An Application for Diagnosis of Diabetes

TL;DR: Results on China Health and Nutrition Survey data show that the proposed ensemble learning method generates rule sets with weighted average precision 94.2% and weighted average recall 93.9% for all classes, and it supports a second opinion for lay users.
Proceedings Article

Prediction and diagnosis of diabetes mellitus — A machine learning approach

TL;DR: A decision support system is proposed that uses the AdaBoost algorithm with Decision Stump as the base classifier, with classification performance greater than that of Support Vector Machine, Naive Bayes and Decision Tree.
Book Chapter

Classification of Diabetes Mellitus Disease (DMD): A Data Mining (DM) Approach

TL;DR: J48 and Naive Bayesian techniques are used for the early detection of diabetes, and a model is proposed and elaborated to help medical practitioners explore and better understand the discovered rules.