TL;DR: Data mining methods and techniques are explored to identify those suitable for efficient classification of a diabetes dataset and for mining useful patterns.
Abstract: Medical professionals need a reliable prediction methodology to diagnose diabetes. Data mining is the process of analysing data from different perspectives and summarizing it into useful information. The main goal of data mining is to discover new patterns and to interpret them so that they provide meaningful and useful information for users. Data mining is applied to find useful patterns that help in the important tasks of medical diagnosis and treatment. This project aims to mine relationships in diabetes data for efficient classification. Data mining methods and techniques will be explored to identify those suitable for efficient classification of a diabetes dataset and for mining useful patterns.
TL;DR: A diabetes prediction model is proposed for better classification of diabetes; it includes a few external factors responsible for diabetes along with regular factors such as Glucose, BMI, Age and Insulin.
TL;DR: A predictive analysis algorithm in a Hadoop/MapReduce environment is used to predict the prevalent diabetes types, the complications associated with them and the type of treatment to be provided; the system provides an efficient way to cure and care for patients, with better outcomes such as affordability and availability.
135 citations
Cites methods from "Application of Data Mining Methods ..."
...Prediction and classification of various type of diabetes using C4.5 classification algorithm was carried out in Pima Indians Diabetes Database [3]....
TL;DR: J48 and Naive Bayesian techniques are used for the early detection of diabetes, and a model is proposed and elaborated to help medical practitioners explore and understand the discovered rules better.
Abstract: Diabetes mellitus disease (DMD), commonly referred to as diabetes, is a significant public health problem. Predicting the disease at an early stage can save valuable human resources. Voluminous datasets are available in various medical data repositories in the form of clinical patient records and pathological test reports, which can be used in real-world applications to disclose hidden knowledge. Various data mining (DM) methods can be applied to these datasets, stored in data warehouses, for predicting DMD. The aim of this research is to predict diabetes using DM techniques such as classification and clustering, of which classification is one of the most suitable for predicting diabetes. In this study, J48 and Naive Bayesian techniques are used for the early detection of diabetes. This research will help to propose a quicker and more efficient technique for diagnosing the disease, leading to timely and proper treatment of patients. We have also proposed a model and elaborated it step by step, to help medical practitioners explore and understand the discovered rules better. The study also shows the algorithm generated on a dataset collected from a college medical hospital as well as from an online repository. Finally, the article outlines how an intelligent diagnostic system works. A clinical trial of the proposed method involving local patients is still continuing and requires longer research and experimentation.
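J48 is Weka's implementation of the C4.5 decision tree, which chooses split attributes by gain ratio. The following is a minimal sketch of that criterion for a single categorical attribute, not the study's own code; the function names and the toy glucose/label values are hypothetical, and handling of continuous attributes and pruning is left out.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(items)
    return -sum((c / total) * math.log2(c / total) for c in Counter(items).values())


def gain_ratio(values, labels):
    """Gain ratio obtained by splitting `labels` on the categorical attribute `values`."""
    total = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Expected entropy of the class label after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalises attributes with many distinct values.
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: a discretised glucose level and a diabetic (1) / non-diabetic (0) label.
glucose = ["high", "high", "low", "low"]
diabetic = [1, 1, 0, 0]
print(gain_ratio(glucose, diabetic))
```

An attribute that separates the classes perfectly, as in the toy data, receives the highest score; an attribute unrelated to the label scores zero.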
TL;DR: Experimental results and evaluation show that the Bagging ensemble technique performs better than single classifiers and the other ensemble techniques.
Abstract: Conventional techniques for clinical decision support systems are based on a single classifier, or a simple combination of such classifiers, used for disease diagnosis and prediction. Recently, much attention has been paid to improving the performance of disease prediction by using ensemble-based methods. In this paper, we use multiple ensemble classification techniques for diabetes datasets. Three types of decision trees (ID3, C4.5 and CART) are used as the base classifiers. The ensemble techniques used are Majority Voting, Adaboost, Bayesian Boosting, Stacking and Bagging. Two benchmark diabetes datasets are used, from the UCI and Bio Stat repositories respectively. Experimental results and evaluation show that the Bagging ensemble technique performs better than single classifiers and the other ensemble techniques.
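As a rough illustration of the Bagging idea mentioned in the abstract above, the sketch below trains several base classifiers on bootstrap samples and combines them by majority vote. It is an assumption-laden toy: one-level decision stumps stand in for the ID3, C4.5 and CART trees used in the paper, and all function names and the glucose readings are hypothetical.

```python
import random
from collections import Counter


def train_stump(rows, labels):
    """Fit a one-level tree: the (feature, threshold) pair with the fewest training errors."""
    majority = Counter(labels).most_common(1)[0][0]
    best = (len(labels) + 1, 0, float("inf"), majority, majority)
    for f in range(len(rows[0])):
        for threshold in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= threshold]
            right = [y for r, y in zip(rows, labels) if r[f] > threshold]
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
            if errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    return best[1:]


def predict_stump(stump, row):
    f, threshold, left_label, right_label = stump
    return left_label if row[f] <= threshold else right_label


def train_bagging(rows, labels, n_models=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement) of the training set."""
    rng = random.Random(seed)
    n = len(rows)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(train_stump([rows[i] for i in idx], [labels[i] for i in idx]))
    return models


def predict_bagging(models, row):
    """Combine the base classifiers by majority vote."""
    votes = Counter(predict_stump(m, row) for m in models)
    return votes.most_common(1)[0][0]


# Hypothetical glucose readings with diabetic (1) / non-diabetic (0) labels.
rows = [[85], [90], [95], [100], [150], [160], [170], [180]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
models = train_bagging(rows, labels)
print(predict_bagging(models, [80]), predict_bagging(models, [175]))
```

Because each stump sees a slightly different resample, an occasional poor base model is outvoted by the rest, which is the variance-reduction effect Bagging relies on.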
TL;DR: This book presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects, and provides a comprehensive, practical look at the concepts and techniques you need to get the most out of real business data.
Abstract: The increasing volume of data in modern business and science calls for more complex and sophisticated tools. Although advances in data mining technology have made extensive data collection much easier, the field is still evolving and there is a constant need for new techniques and tools that can help us transform this data into useful information and knowledge. Since the previous edition's publication, great advances have been made in the field of data mining. Not only does the third edition of Data Mining: Concepts and Techniques continue the tradition of equipping you with an understanding and application of the theory and practice of discovering patterns hidden in large data sets, it also focuses on new, important topics in the field: data warehouses and data cube technology, mining streams, mining social networks, and mining spatial, multimedia and other complex data. Each chapter is a stand-alone guide to a critical topic, presenting proven algorithms and sound implementations ready to be used directly or with strategic modification against live data. This is the resource you need if you want to apply today's most powerful data mining techniques to meet real business challenges.
* Presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects.
* Addresses advanced topics such as mining object-relational databases, spatial databases, multimedia databases, time-series databases, text databases, the World Wide Web, and applications in several fields.
* Provides a comprehensive, practical look at the concepts and techniques you need to get the most out of real business data.
"Application of Data Mining Methods ..." refers background in this paper
...Data Mining [4] refers to extracting or mining knowledge from large amounts of data....
[...]
...The accuracy [4] of a classifier on a given test set is the percentage of test set tuples that are correctly classified by the classifier....
[...]
...Knowledge Discovery in Databases (KDD) is the process of finding useful information and patterns in data which involves Selection, Pre-processing, Transformation, Data Mining and Evaluation....
[...]
...Volume 2, Issue 3, September 2012
Application of Data Mining Methods and
Techniques for Diabetes Diagnosis K. Rajesh, V. Sangeetha
Abstract-- Medical professionals need a reliable prediction methodology to diagnose Diabetes....
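The accuracy definition quoted in the excerpts above (the percentage of test set tuples that are correctly classified) can be written as a short function. This is a minimal sketch; the function name and the five example labels are hypothetical.

```python
def accuracy(true_labels, predicted_labels):
    """Percentage of test tuples whose predicted label matches the true label."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    if not true_labels:
        raise ValueError("test set must not be empty")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return 100.0 * correct / len(true_labels)


# Hypothetical labels for five test tuples (1 = diabetic, 0 = non-diabetic).
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 0]
print(accuracy(y_true, y_pred))  # 4 of 5 correct -> 80.0
```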
TL;DR: Feature Selection for Knowledge Discovery and Data Mining offers an overview of the methods developed since the 1970's and provides a general framework in order to examine these methods and categorize them and suggests guidelines for how to use different methods under various circumstances.
Abstract: From the Publisher:
With advanced computer technologies and their omnipresent usage, data accumulates at a speed unmatched by the human capacity to process it. To meet this growing challenge, the research community of knowledge discovery from databases emerged. The key issue studied by this community is, in layman's terms, how to make advantageous use of large stores of data. In order to make raw data useful, it is necessary to represent, process, and extract knowledge for various applications. Feature Selection for Knowledge Discovery and Data Mining offers an overview of the methods developed since the 1970s and provides a general framework in order to examine these methods and categorize them. This book employs simple examples to show the essence of representative feature selection methods and compares them using data sets with combinations of intrinsic properties according to the objective of feature selection. In addition, the book suggests guidelines for how to use different methods under various circumstances and points out new challenges in this exciting area of research. Feature Selection for Knowledge Discovery and Data Mining is intended to be used by researchers in machine learning, data mining, knowledge discovery, and databases as a toolbox of relevant tools that help in solving large real-world problems. This book is also intended to serve as a reference book or secondary text for courses on machine learning, data mining, and databases.
TL;DR: This study applies the Naive Bayes data mining classifier technique, which produces an optimal prediction model using a minimum training set; the proposed system predicts attributes such as age, sex, blood pressure and blood sugar and the chances of a diabetic patient getting heart disease.
Abstract: The objective of our paper is to predict the chances of a diabetic patient getting heart disease. In this study, we apply the Naive Bayes data mining classifier technique, which produces an optimal prediction model using a minimum training set. Data mining is the analysis step of the Knowledge Discovery in Databases (KDD) process. Data mining involves the use of techniques to find underlying structures and relationships in a large database. Diabetes is a set of related diseases in which the body cannot regulate the amount of sugar, specifically glucose, in the blood (hyperglycemia). The diagnosis of diseases plays a vital role in the medical field. Using a diabetic's diagnosis, the proposed system predicts attributes such as age, sex, blood pressure and blood sugar and the chances of a diabetic patient getting heart disease.
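To make the Naive Bayes step concrete, here is a minimal sketch of a Gaussian Naive Bayes classifier over numeric attributes. The abstract does not state which Naive Bayes variant was used, so the Gaussian likelihood is an assumption, and the function names, the [age, blood sugar] rows and the labels are hypothetical.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(rows, labels):
    """Estimate the class prior and a per-feature (mean, variance) for each class."""
    by_class = defaultdict(list)
    for row, y in zip(rows, labels):
        by_class[y].append(row)
    model = {}
    for y, members in by_class.items():
        stats = []
        for column in zip(*members):
            mean = sum(column) / len(column)
            # A small constant keeps the variance positive for constant features.
            var = sum((x - mean) ** 2 for x in column) / len(column) + 1e-9
            stats.append((mean, var))
        model[y] = (len(members) / len(rows), stats)
    return model


def predict_gaussian_nb(model, row):
    """Return the class with the highest log posterior, treating features as independent."""
    def log_posterior(y):
        prior, stats = model[y]
        score = math.log(prior)
        for x, (mean, var) in zip(row, stats):
            score += -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)


# Hypothetical [age, blood sugar] rows; label 1 = heart disease, 0 = no heart disease.
rows = [[30, 90], [35, 95], [40, 100], [55, 160], [60, 170], [65, 180]]
labels = [0, 0, 0, 1, 1, 1]
model = fit_gaussian_nb(rows, labels)
print(predict_gaussian_nb(model, [33, 92]), predict_gaussian_nb(model, [62, 175]))
```

The independence assumption is what lets the per-feature log likelihoods be summed, and it is also why the model can be estimated from a small training set.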