Medical data mining using BGA and RGA for weighting of features in fuzzy k-NN classification

doi:10.1109/ICMLC.2009.5212633

Proceedings Article•DOI•

Ping-Hung Tang¹, Ming-Hseng Tseng¹•Institutions (1)

Chung Shan Medical University¹

12 Jul 2009-Vol. 5, pp 3070-3075

TL;DR: Crisp k-NN, fuzzy k-NN, and feature-weighted fuzzy k-NN classifiers are compared using either all features or optimal three-feature subsets, with feature weights found by binary-coded and real-coded genetic algorithms.

Abstract: The k-nearest neighbor (k-NN) algorithm is commonly used in classification, data mining, and related areas because of its simplicity and effectiveness. In this study, both the full feature set and optimal three-feature subsets are investigated. For classification, crisp k-NN, fuzzy k-NN, and weighted fuzzy k-NN classifiers are compared. For feature weighting, two types of coding are evaluated: binary-coded genetic algorithms (BGA) and real-coded genetic algorithms (RGA). Experiments are conducted on the Wisconsin Diagnostic Breast Cancer (WDBC) dataset and the Pima Indians Diabetes (PIMA) dataset, and the classification accuracy, false negatives, and computation time are reported.
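
The abstract's weighted fuzzy k-NN idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes a weighted squared Euclidean distance, crisp training labels as neighbor memberships, and a fuzzifier m = 2. The function name `weighted_fuzzy_knn` and all parameters are hypothetical, chosen only for illustration.

```python
import numpy as np

def weighted_fuzzy_knn(X_train, y_train, x, weights, k=3, m=2.0, n_classes=2):
    """Return the class memberships of x (assumed formulation, not the paper's)."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=int)
    x = np.asarray(x, dtype=float)
    weights = np.asarray(weights, dtype=float)

    # Weighted squared Euclidean distance: each feature is scaled by its weight.
    d2 = ((X_train - x) ** 2 * weights).sum(axis=1)

    # Indices of the k nearest training samples.
    nearest = np.argsort(d2)[:k]

    # Inverse-distance vote. Raising the squared distance to 1/(m-1) is the
    # same as raising the distance to 2/(m-1). The floor avoids division by zero.
    votes = 1.0 / np.maximum(d2[nearest], 1e-12) ** (1.0 / (m - 1.0))

    # Accumulate votes per class (crisp neighbor labels), then normalize.
    memberships = np.zeros(n_classes)
    for idx, vote in zip(nearest, votes):
        memberships[y_train[idx]] += vote
    return memberships / memberships.sum()
```

In a GA-based setup such as the one the abstract describes, `weights` would be the vector a genetic algorithm searches over; setting a weight to zero removes that feature from the distance entirely.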

Citations

Journal Article•DOI•

Adaptive directed mutation for real-coded genetic algorithms

Ping-Hung Tang¹, Ming-Hseng Tseng¹•Institutions (1)

Chung Shan Medical University¹

01 Jan 2013

TL;DR: Results indicate that the proposed ADM-RCGA is fast, accurate, and reliable, and outperforms all the other GAs considered in the present study.

Abstract: The adaptive directed mutation (ADM) operator, a novel, simple, and efficient operator for real-coded genetic algorithms (RCGAs), is proposed and then employed to solve complex function optimization problems. The suggested ADM operator enhances the ability of GAs to find global optima and speeds convergence by integrating a local directional search strategy with adaptive random search strategies. Using 41 benchmark global optimization test functions, the performance of the new algorithm is compared with five conventional mutation operators and then with six genetic algorithms (GAs) reported in the literature. Results indicate that the proposed ADM-RCGA is fast, accurate, and reliable, and outperforms all the other GAs considered in the present study.

94 citations

Journal Article•DOI•

Design of fuzzy classifier for diabetes disease using Modified Artificial Bee Colony algorithm

Fayssal Beloufa, M.A. Chikh

01 Oct 2013-Computer Methods and Programs in Biomedicine

TL;DR: A novel Artificial Bee Colony (ABC) algorithm is proposed in which a mutation operator is added to improve performance by enhancing the diversity of ABC without compromising solution quality.

86 citations

Cites methods from "Medical data mining using BGA and R..."

...6 [29] RGA-fuzzy-KNN [29] (5xCV) 82 [29] ML-NN [31] (10xCV) 79....
[...]
...Tang and Tseng [29] developed GA-based methods to estimate a weight vector of the feature vector applied in the fuzzy k-NN estimation....
[...]

Journal Article•DOI•

Survey of Data Mining Techniques used in Healthcare Domain

Sheenal Patel, Hardik J. Patel

31 Mar 2016

TL;DR: Various Data Mining techniques such as classification, clustering, association, and also related work to analyse and predict human disease are highlighted.

Abstract: The health care industry produces an enormous quantity of data that holds complex information about patients and their medical conditions. Data mining is gaining popularity in different research areas because of its wide range of applications and its methods for extracting information correctly. Data mining techniques can discover hidden patterns or relationships among the objects in medical data. In the last decade, there has been an increase in the use of data mining techniques on medical data to determine useful trends or patterns for analysis and decision making. Data mining has great potential to use healthcare data more efficiently and effectively to predict different kinds of disease. This paper features various data mining techniques such as classification, clustering, and association, and also highlights related work to analyse and predict human disease.

57 citations

Proceedings Article•DOI•

Implementing WEKA for medical data classification and early disease prediction

Narander Kumar¹, Sabita Khatri¹•Institutions (1)

University Institute of Engineering and Technology, Panjab University¹

01 Feb 2017

TL;DR: This research work comprehensively compares different data classification techniques and their prediction accuracy on a chronic kidney disease dataset, using performance measures such as ROC, kappa statistics, RMSE, and MAE computed with the WEKA tool.

Abstract: In recent years, the advent of the latest web and data technologies has encouraged massive data growth in almost every sector. Businesses and leading industries view these huge data repositories as a tool for designing future strategies and prediction models by analyzing patterns and gaining knowledge from this unstructured data with different data mining techniques. The medical domain has become richer in terms of maintaining digital records of patients related to their diagnosis and treatment. These huge data repositories can range from patient personal data, diagnoses, treatment histories, and test results to images and various scans. These terabytes of medical data are rich in quantity but poor in information, lacking the knowledge and robust tools needed to identify hidden patterns, specifically in the medical sector. Data mining as a field of research has well-proven capabilities for identifying hidden patterns, analysis, and knowledge in different research domains, and it is now gaining popularity among researchers and scientists as a way to generate novel and deep insights from these large biomedical datasets. Uncovering new biomedical and healthcare-related knowledge to support clinical decision making is another dimension of data mining. A broad literature survey finds that early disease prediction is the most in-demand area of research in the health care sector. Because the health care domain is wide and diseases have different characteristics, different techniques have their own prediction efficiencies, which can be enhanced and tuned. In this research work, the authors comprehensively compare different data classification techniques and their prediction accuracy for chronic kidney disease. The authors compare J48, Naive Bayes, Random Forest, SVM, and k-NN classifiers using performance measures such as ROC, kappa statistics, RMSE, and MAE with the WEKA tool. The authors also compare these classifiers on accuracy measures such as TP rate, FP rate, precision, recall, and F-measure, implemented in WEKA. Experimental results show that the random forest classifier has better classification accuracy than the others on the chronic kidney disease dataset.

40 citations

A Survey of Data Mining Techniques on Medical Data for Finding Locally Frequent Diseases

Mohammed Abdul Khaleel, Sateesh Kumar Pradham

01 Jan 2013

TL;DR: The main focus of this paper is to analyze data mining techniques required for medical data mining especially to discover locally frequent diseases such as heart ailments, lung cancer, breast cancer and so on.

Abstract: In the last decade there has been increasing usage of data mining techniques on medical data for discovering useful trends or patterns that are used in diagnosis and decision making. Data mining techniques such as clustering, classification, regression, association rule mining, CART (Classification and Regression Tree) are widely used in healthcare domain. Data mining algorithms, when appropriately used, are capable of improving the quality of prediction, diagnosis and disease classification. The main focus of this paper is to analyze data mining techniques required for medical data mining especially to discover locally frequent diseases such as heart ailments, lung cancer, breast cancer and so on. We evaluate the data mining techniques for finding locally frequent patterns in terms of cost, performance, speed and accuracy. We also compare data mining techniques with conventional methods.

38 citations

References

Journal Article•DOI•

A fuzzy K-nearest neighbor algorithm

James M. Keller¹, M. R. Gray¹, J. A. Givens•Institutions (1)

University of Missouri¹

01 Jul 1985

TL;DR: The theory of fuzzy sets is introduced into the K-nearest neighbor technique to develop a fuzzy version of the algorithm, and three methods of assigning fuzzy memberships to the labeled samples are proposed.

Abstract: Classification of objects is an important area of research and application in a variety of fields. In the presence of full knowledge of the underlying probabilities, Bayes decision theory gives optimal error rates. In those cases where this information is not present, many algorithms make use of distance or similarity among samples as a means of classification. The K-nearest neighbor decision rule has often been used in these pattern recognition problems. One of the difficulties that arises when utilizing this technique is that each of the labeled samples is given equal importance in deciding the class memberships of the pattern to be classified, regardless of their "typicalness". The theory of fuzzy sets is introduced into the K-nearest neighbor technique to develop a fuzzy version of the algorithm. Three methods of assigning fuzzy memberships to the labeled samples are proposed, and experimental results and comparisons to the crisp version are presented.

2,323 citations

"Medical data mining using BGA and R..." refers methods in this paper

...[5] as a generalization of the k-NN algorithm to allow the assignment of fractional membership, instead of zero or one like k-NN, to each class....
[...]
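
For orientation, the fractional class membership mentioned in the excerpt is commonly written as below. The symbols are assumptions introduced here rather than taken from the excerpt: $x_1, \dots, x_k$ are the k nearest labeled samples to the query $x$, $u_{ij}$ is the membership of neighbor $x_j$ in class $i$, and $m > 1$ is the fuzzifier that controls how strongly distance is weighted.

```latex
u_i(x) =
  \frac{\displaystyle\sum_{j=1}^{k} u_{ij}\,\lVert x - x_j \rVert^{-2/(m-1)}}
       {\displaystyle\sum_{j=1}^{k} \lVert x - x_j \rVert^{-2/(m-1)}}
```

With crisp neighbor labels ($u_{ij} \in \{0, 1\}$) and $m = 2$, this reduces to an inverse-squared-distance weighted vote, and the memberships $u_i(x)$ sum to one over the classes whenever each neighbor's memberships do.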

Proceedings Article•

Genetic algorithms with sharing for multimodal function optimization

David E. Goldberg, Jon Richardson

01 Oct 1987

TL;DR: In this article, the authors developed and investigated the method of sharing functions to permit the formation of stable subpopulations of different strings within a GA, thereby permitting the parallel investigation of many peaks.

Abstract: Many practical search and optimization problems require the investigation of multiple local optima. In this paper, the method of sharing functions is developed and investigated to permit the formation of stable subpopulations of different strings within a genetic algorithm (GA), thereby permitting the parallel investigation of many peaks. The theory and implementation of the method are investigated and two one-dimensional test functions are considered. On a test function with five peaks of equal height, a GA without sharing loses strings at all but one peak; a GA with sharing maintains roughly equally sized subpopulations clustered about all five peaks. On a test function with five peaks of different sizes, a GA without sharing loses strings at all but the highest peak; a GA with sharing allocates decreasing numbers of strings to peaks of decreasing value as predicted by theory.
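
The sharing idea in this abstract can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the paper's code: individuals are one-dimensional real values, the sharing function is the commonly used triangular form sh(d) = 1 - (d / sigma_share)^alpha for d < sigma_share and 0 otherwise, and the names `shared_fitness`, `sigma_share`, and `alpha` are hypothetical.

```python
import numpy as np

def shared_fitness(population, fitness, sigma_share=0.1, alpha=1.0):
    """Derate each individual's fitness by its niche count (assumed 1-D phenotypes)."""
    pop = np.asarray(population, dtype=float)
    fit = np.asarray(fitness, dtype=float)

    # Pairwise distances between all individuals.
    d = np.abs(pop[:, None] - pop[None, :])

    # Sharing function: full sharing at distance 0, none beyond sigma_share.
    sh = np.where(d < sigma_share, 1.0 - (d / sigma_share) ** alpha, 0.0)

    # Niche count includes the individual itself (sh(0) = 1), so it is >= 1.
    niche_count = sh.sum(axis=1)
    return fit / niche_count
```

Two identical individuals on one peak each keep half their raw fitness, while a lone individual on another peak keeps all of it, which is the pressure that lets subpopulations persist on several peaks.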

2,154 citations

"Medical data mining using BGA and R..." refers background in this paper

...For more information about the genetic algorithm schemes, see references [6-9]....
[...]
...[8, 9] There are two types of coding to implement genetic operators, namely binary-coded genetic algorithms (BGA) and real-coded genetic algorithms (RGA)....
[...]
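
The difference between the two codings can be made concrete with a small Python sketch for a feature-weight vector. The details are assumptions for illustration only: weights lie in [0, 1], the binary coding uses 8 bits per weight, and the helper names (`decode_binary`, `random_binary_chromosome`, `random_real_chromosome`) are hypothetical rather than taken from the paper.

```python
import random

def decode_binary(bits, n_features, bits_per_weight=8):
    """Map a binary chromosome to n_features weights in [0, 1]."""
    weights = []
    for i in range(n_features):
        gene = bits[i * bits_per_weight:(i + 1) * bits_per_weight]
        # Interpret the gene as an unsigned integer, then scale to [0, 1].
        value = int("".join(str(b) for b in gene), 2)
        weights.append(value / (2 ** bits_per_weight - 1))
    return weights

def random_binary_chromosome(n_features, bits_per_weight=8, rng=random):
    """BGA-style individual: a flat list of 0/1 genes that must be decoded."""
    return [rng.randint(0, 1) for _ in range(n_features * bits_per_weight)]

def random_real_chromosome(n_features, rng=random):
    """RGA-style individual: the genes are the weights themselves."""
    return [rng.random() for _ in range(n_features)]
```

A binary chromosome needs a decoding step and its precision is fixed by the bit length, whereas a real-coded chromosome is used directly and relies on crossover and mutation operators defined on real numbers.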

Book•

Introduction to Genetic Algorithms for Scientists and Engineers

David A. Coley

29 Jan 1999

TL;DR: Topics include improving the algorithm, foundations, advanced operators, writing a genetic algorithm, and applications of genetic algorithms, along with the benefits of incorporating reinforcement learning into genetic algorithms.

Abstract: An introduction to genetic algorithms for scientists and engineers.

1,021 citations

"Medical data mining using BGA and R..." refers background in this paper

...For more information about the genetic algorithm schemes, see references [6-9]....
[...]
...[8, 9] There are two types of coding to implement genetic operators, namely binary-coded genetic algorithms (BGA) and real-coded genetic algorithms (RGA)....
[...]

Proceedings Article•DOI•

Genetic algorithms with dynamic niche sharing for multimodal function optimization

B.L. Miller¹, Michael J. Shaw¹•Institutions (1)

University of Illinois at Urbana–Champaign¹

20 May 1996

TL;DR: Dynamic niche sharing is developed to efficiently identify and search multiple niches (peaks) in a multimodal domain, and is shown to perform better than two other methods for multiple optima identification: standard sharing and deterministic crowding.

Abstract: Genetic algorithms utilize populations of individual hypotheses that converge over time to a single optimum, even within a multimodal domain. This paper examines methods that enable genetic algorithms to identify multiple optima within multimodal domains by maintaining population members within the niches defined by the multiple optima. A new mechanism, dynamic niche sharing, is developed that is able to efficiently identify and search multiple niches (peaks) in a multimodal domain. Dynamic niche sharing is shown to perform better than two other methods for multiple optima identification, standard sharing and deterministic crowding.

400 citations

Journal Article•DOI•

Using coarse scale forest variables as ancillary information and weighting of variables in k-NN estimation: a genetic algorithm approach

Erkki Tomppo¹, Merja Halme²•Institutions (2)

Finnish Forest Research Institute¹, Aalto University²

15 Jul 2004-Remote Sensing of Environment

TL;DR: Tests with practical forest inventory data show that the method performs noticeably better than other applications of k-NN estimation in forest inventories and that, for example, the problem of biases in the species volume predictions can be almost completely overcome with this new approach.

222 citations