Journal ArticleDOI

Design of a hybrid system for the diabetes and heart diseases

01 Jul 2008-Expert Systems With Applications (Pergamon Press, Inc.)-Vol. 35, Iss: 1, pp 82-89
TL;DR: A new method for classifying data from a medical database is presented; its results are among the best when compared with those of related previous studies and those reported on the UCI website.
Abstract: Data can be classified according to their properties. Classification is implemented by developing a model from existing records using sample data. One of the aims of classification is to increase the reliability of the results obtained from the data. Fuzzy and crisp values are used together in medical data. Accordingly, this study presents a new method for classifying data from a medical database. A hybrid neural network that combines an artificial neural network (ANN) and a fuzzy neural network (FNN) was also developed. Two real-time problem datasets were investigated to determine the applicability of the proposed method. The data were obtained from the University of California at Irvine (UCI) machine learning repository. The datasets are Pima Indians diabetes and Cleveland heart disease. To evaluate the performance of the proposed method, accuracy, sensitivity and specificity, performance measures commonly used in medical classification studies, were used. The classification accuracies on these datasets were obtained by k-fold cross-validation. The proposed method achieved accuracy values of 84.24% and 86.8% for the Pima Indians diabetes dataset and the Cleveland heart disease dataset, respectively. These results are among the best when compared with results obtained in related previous studies and reported on the UCI website.
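The three performance measures named in the abstract can be illustrated with a minimal sketch. The function names, the choice of `1` as the positive (diseased) class, and the toy labels below are assumptions made for illustration only; they do not come from the paper and do not reproduce its results.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity and specificity as fractions in [0, 1]."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("no samples to evaluate")
    return {
        # Share of all samples that were classified correctly.
        "accuracy": (tp + tn) / total,
        # Share of truly positive samples that were detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Share of truly negative samples that were correctly rejected.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Toy labels (hypothetical): 1 = disease present, 0 = disease absent.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In a k-fold setting, these measures would typically be computed on each held-out fold and then averaged.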
Citations
Journal ArticleDOI
TL;DR: The proposed machine-learning-based decision support system will assist doctors in diagnosing heart patients efficiently and can easily distinguish people with heart disease from healthy people.
Abstract: Heart disease is one of the most critical human diseases in the world and affects human life very badly. In heart disease, the heart is unable to push the required amount of blood to other parts of the body. Accurate and timely diagnosis of heart disease is important for heart failure prevention and treatment. The diagnosis of heart disease through traditional medical history has been considered unreliable in many respects. To distinguish healthy people from people with heart disease, noninvasive methods such as machine learning are reliable and efficient. In the proposed study, we developed a machine-learning-based diagnosis system for heart disease prediction using a heart disease dataset. We used seven popular machine learning algorithms, three feature selection algorithms, the cross-validation method, and seven classifier performance evaluation metrics such as classification accuracy, specificity, sensitivity, Matthews' correlation coefficient, and execution time. The proposed system can easily distinguish people with heart disease from healthy people. Additionally, receiver operating characteristic curves and the area under the curve were computed for each classifier. We have discussed all of the classifiers, feature selection algorithms, preprocessing methods, validation method, and classifier performance evaluation metrics used in this paper. The performance of the proposed system has been validated on the full feature set and on a reduced set of features. Feature reduction has an impact on classifier performance in terms of accuracy and execution time. The proposed machine-learning-based decision support system will assist doctors in diagnosing heart patients efficiently.
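Of the metrics listed in this abstract, Matthews' correlation coefficient is the least self-explanatory, so a small sketch may help. It uses the standard confusion-matrix formula; the function name and the example counts are illustrative assumptions and are not taken from the cited paper. Returning 0.0 when the denominator is zero is a common convention, assumed here.

```python
import math


def matthews_corrcoef_from_counts(tp, fp, tn, fn):
    """Matthews' correlation coefficient from binary confusion-matrix counts.

    The value ranges from -1 (total disagreement) through 0 (no better
    than chance) to +1 (perfect prediction).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Assumed convention: undefined cases are reported as 0.0.
        return 0.0
    return numerator / denominator


# Hypothetical counts: 3 true positives, 1 false positive,
# 3 true negatives, 1 false negative.
mcc = matthews_corrcoef_from_counts(tp=3, fp=1, tn=3, fn=1)
perfect = matthews_corrcoef_from_counts(tp=5, fp=0, tn=5, fn=0)
print(mcc, perfect)
```

Unlike plain accuracy, this coefficient uses all four cells of the confusion matrix, which is why it is often reported alongside sensitivity and specificity on imbalanced medical datasets.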

336 citations


Cites methods from "Design of a hybrid system for the d..."

  • ...Kahramanli and Allahverdi [16] designed a heart disease classification system used a hybrid technique in which a neural network integrates a fuzzy neural network and artificial neural network....


Journal ArticleDOI
Wu Han1, Shengqi Yang1, Zhangqin Huang1, Jian He1, Xiaoyi Wang1 
TL;DR: A novel model based on data mining techniques for predicting type 2 diabetes mellitus (T2DM) based on a series of preprocessing procedures is proposed and is shown to be useful for the realistic health management of diabetes.

258 citations

Journal ArticleDOI
TL;DR: The experimental results show that the proposed feature selection algorithm (FCMIM) is feasible with classifier support vector machine for designing a high-level intelligent system to identify heart disease and it achieved good accuracy as compared to previously proposed methods.
Abstract: Heart disease is a complex disease that affects many people globally. Timely and efficient identification of heart disease plays a key role in healthcare, particularly in the field of cardiology. In this article, we proposed an efficient and accurate system to diagnose heart disease, based on machine learning techniques. The system is developed using classification algorithms including support vector machine, logistic regression, artificial neural network, K-nearest neighbor, Naive Bayes, and decision tree, while standard feature selection algorithms such as Relief, minimal redundancy maximal relevance, least absolute shrinkage and selection operator, and local learning have been used to remove irrelevant and redundant features. We also proposed a novel fast conditional mutual information feature selection algorithm to solve the feature selection problem. The feature selection algorithms are used to increase the classification accuracy and reduce the execution time of the classification system. Furthermore, the leave-one-subject-out cross-validation method has been used for learning the best practices of model assessment and for hyperparameter tuning. Performance metrics are used to assess the classifiers, and the classifiers have been evaluated on the features chosen by the feature selection algorithms. The experimental results show that the proposed feature selection algorithm (FCMIM) is feasible with a support vector machine classifier for designing a high-level intelligent system to identify heart disease. The suggested diagnosis system (FCMIM-SVM) achieved good accuracy compared with previously proposed methods. Additionally, the proposed system can easily be implemented in healthcare for the identification of heart disease.

247 citations


Cites methods from "Design of a hybrid system for the d..."

  • ...[23] designed HD classification system by utilizing a neural network with the integration of Fuzzy logic....


Journal ArticleDOI
01 Feb 2011
TL;DR: A novel fuzzy expert system can work effectively for diabetes decision support application and the semantic fuzzy decision making mechanism simulates the semantic description of medical staff for diabetes-related application.
Abstract: An increasing number of decision support systems based on domain knowledge are adopted to diagnose medical conditions such as diabetes and heart disease. It is widely pointed out that classical ontologies cannot sufficiently handle imprecise and vague knowledge for some real-world applications, but a fuzzy ontology can effectively resolve data and knowledge problems with uncertainty. This paper presents a novel fuzzy expert system for diabetes decision support applications. A five-layer fuzzy ontology, including a fuzzy knowledge layer, fuzzy group relation layer, fuzzy group domain layer, fuzzy personal relation layer, and fuzzy personal domain layer, is developed in the fuzzy expert system to describe knowledge with uncertainty. By applying the novel fuzzy ontology to the diabetes domain, the structure of the fuzzy diabetes ontology (FDO) is defined to model the diabetes knowledge. Additionally, a semantic decision support agent (SDSA), including a knowledge construction mechanism, fuzzy ontology generating mechanism, and semantic fuzzy decision making mechanism, is also developed. The knowledge construction mechanism constructs the fuzzy concepts and relations based on the structure of the FDO. The instances of the FDO are generated by the fuzzy ontology generating mechanism. Finally, based on the FDO and the fuzzy ontology, the semantic fuzzy decision making mechanism simulates the semantic description of medical staff for diabetes-related applications. Importantly, the proposed fuzzy expert system can work effectively for diabetes decision support applications.

243 citations


Cites result from "Design of a hybrid system for the d..."

  • ...The final experiment compares the accuracy of the proposed method with results of studies involving the PIDD [4], [5], [8]....


Journal ArticleDOI
TL;DR: The main idea in this paper is to describe key papers and provide some guidelines to help medical practitioners to explore previous works and identify interesting areas for future research.
Abstract: Data mining is a powerful method to extract knowledge from data. Raw data poses various challenges that make traditional methods unsuitable for knowledge extraction. Data mining is supposed to be able to handle various data types in all formats. The relevance of this paper is emphasized by the fact that data mining is an object of research in different areas. In this paper, we review previous works in the context of knowledge extraction from medical data. The main idea in this paper is to describe key papers and provide some guidelines to help medical practitioners. Medical data mining is a multidisciplinary field with contributions from medicine and data mining. Due to this fact, previous works should be classified to cover all users' requirements from various fields. Because of this, we have studied papers published between 1999 and 2013 that aim to extract knowledge from structural medical data. We clarify medical data mining and its main goals. Each paper is then studied based on six medical tasks: screening, diagnosis, treatment, prognosis, monitoring and management. In each task, five data mining approaches are considered: classification, regression, clustering, association and hybrid. At the end of each task, a brief summary and discussion are given. A standard framework according to CRISP-DM is additionally adapted to manage all activities. As a discussion, current issues and future trends are mentioned. The number of works published in this scope is substantial, and it is impossible to discuss all of them in a single work. We hope this paper will make it possible to explore previous works and identify interesting areas for future research.

220 citations

References
Book
08 Sep 2000
TL;DR: This book presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects, and provides a comprehensive, practical look at the concepts and techniques you need to get the most out of real business data.
Abstract: The increasing volume of data in modern business and science calls for more complex and sophisticated tools. Although advances in data mining technology have made extensive data collection much easier, it's still always evolving and there is a constant need for new techniques and tools that can help us transform this data into useful information and knowledge. Since the previous edition's publication, great advances have been made in the field of data mining. Not only does the third edition of Data Mining: Concepts and Techniques continue the tradition of equipping you with an understanding and application of the theory and practice of discovering patterns hidden in large data sets, it also focuses on new, important topics in the field: data warehouses and data cube technology, mining stream, mining social networks, and mining spatial, multimedia and other complex data. Each chapter is a stand-alone guide to a critical topic, presenting proven algorithms and sound implementations ready to be used directly or with strategic modification against live data. This is the resource you need if you want to apply today's most powerful data mining techniques to meet real business challenges. * Presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects. * Addresses advanced topics such as mining object-relational databases, spatial databases, multimedia databases, time-series databases, text databases, the World Wide Web, and applications in several fields. * Provides a comprehensive, practical look at the concepts and techniques you need to get the most out of real business data.

23,600 citations

Proceedings Article
Ron Kohavi1
20 Aug 1995
TL;DR: The results indicate that for real-world datasets similar to the authors', the best method to use for model selection is ten-fold stratified cross-validation, even if computation power allows using more folds.
Abstract: We review accuracy estimation methods and compare the two most common methods: cross-validation and bootstrap. Recent experimental results on artificial data and theoretical results in restricted settings have shown that for selecting a good classifier from a set of classifiers (model selection), ten-fold cross-validation may be better than the more expensive leave-one-out cross-validation. We report on a large-scale experiment, over half a million runs of C4.5 and a Naive-Bayes algorithm, to estimate the effects of different parameters on these algorithms on real-world datasets. For cross-validation, we vary the number of folds and whether the folds are stratified or not; for bootstrap, we vary the number of bootstrap samples. Our results indicate that for real-world datasets similar to ours, the best method to use for model selection is ten-fold stratified cross-validation, even if computation power allows using more folds.

11,185 citations
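The stratified ten-fold cross-validation recommended in the abstract above can be sketched as follows. This is a minimal standard-library illustration under stated assumptions: the function name `stratified_k_fold`, the round-robin assignment of shuffled indices to folds, and the toy label vector are all hypothetical and are not taken from Kohavi's experiments.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for stratified k-fold CV.

    Indices of each class are shuffled and dealt round-robin into k folds,
    so every fold keeps roughly the class proportions of the full dataset.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, index in enumerate(indices):
            folds[position % k].append(index)

    for held_out in range(k):
        test_idx = sorted(folds[held_out])
        train_idx = sorted(
            index
            for fold_no, fold in enumerate(folds)
            if fold_no != held_out
            for index in fold
        )
        yield train_idx, test_idx


# Toy labels (hypothetical): 60 negative and 40 positive samples.
labels = [0] * 60 + [1] * 40
splits = list(stratified_k_fold(labels, k=10, seed=42))
for train_idx, test_idx in splits[:2]:
    positives = sum(labels[i] for i in test_idx)
    print(len(train_idx), len(test_idx), positives)
```

With class sizes that are not multiples of `k`, the earlier folds receive one extra sample per class under this simple dealing scheme; a production implementation might balance fold sizes more carefully.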

Book
01 Jan 2009
TL;DR: This book covers a survey of previous comparisons and theoretical work, descriptions of methods, dataset descriptions, criteria for comparison and methodology (including validation), empirical results, and machine learning on machine learning.
Abstract: Survey of previous comparisons and theoretical work; descriptions of methods; dataset descriptions; criteria for comparison and methodology (including validation); empirical results; machine learning on machine learning.

2,325 citations