Showing papers on "Dunn index" published in 2015

Proceedings Article•DOI•

Choice of effective fitness functions for genetic algorithm-aided dynamic fuzzy rule interpolation

Nitin Naik¹, Ren Diao, Qiang Shen²•Institutions (2)

United Kingdom Ministry of Defence¹, Aberystwyth University²

01 Aug 2015

TL;DR: Experimental investigation demonstrates that results obtained with the Dunn index or Davies-Bouldin index are better than those obtained with the Ball-Hall or BetaCV index, with the Davies-Bouldin index performing best overall.

Abstract: Fuzzy rule interpolation (FRI) has been a vital reasoning tool for sparse fuzzy rule-based systems. Throughout interpolative reasoning, an FRI system may produce a large number of interpolated rules, which generally serve no further purpose once the required outcomes have been obtained. However, this abandoned pool of interpolated rules can be used to improve the existing sparse rule base, because the rules contain useful information on the underlying problem domain. Efficient extraction of knowledge from such a pool of interpolated rules helps to analyse and update the sparse rule base, leading to a dynamic sparse fuzzy rule base for building an enhanced fuzzy system. Following this idea, a genetic algorithm (GA) based dynamic fuzzy rule interpolation framework has been proposed recently. This paper presents an extension of the dynamic FRI system. In particular, it investigates different fitness functions and their effects on the outcomes of the GA-based system. A variety of fitness functions based on cluster quality indices are employed and tested, including the Dunn Index, Davies-Bouldin Index, Ball-Hall Index and BetaCV Index. Experimental investigation demonstrates that results obtained with the Dunn index or Davies-Bouldin index are better than those obtained with the Ball-Hall or BetaCV index, with the Davies-Bouldin index performing best overall. Such results offer an empirical guideline for the selection of the fitness function in implementing accurate GA-based dynamic FRI systems.
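
The two indices that performed best here are short to compute, which may help explain their appeal as fitness functions. The following is a minimal sketch in plain Python, assuming Euclidean distance and hard cluster assignments. The function names and the toy data are illustrative only; the abstract does not describe the authors' implementation.

```python
import math
from itertools import combinations


def centroid(cluster):
    """Component-wise mean of a list of equal-length points."""
    n = len(cluster)
    return tuple(sum(coords) / n for coords in zip(*cluster))


def dunn_index(clusters):
    """Smallest between-cluster distance over largest cluster diameter.

    Higher is better. Assumes at least two clusters and at least one
    cluster holding two or more points.
    """
    min_separation = min(
        math.dist(p, q)
        for a, b in combinations(clusters, 2)
        for p in a
        for q in b
    )
    max_diameter = max(
        math.dist(p, q) for c in clusters for p, q in combinations(c, 2)
    )
    return min_separation / max_diameter


def davies_bouldin_index(clusters):
    """Mean over clusters of the worst (scatter_i + scatter_j) / centroid distance.

    Lower is better. Scatter is the mean distance of a cluster's points
    to its centroid.
    """
    centers = [centroid(c) for c in clusters]
    scatter = [
        sum(math.dist(p, v) for p in c) / len(c)
        for c, v in zip(clusters, centers)
    ]
    k = len(clusters)
    worst = [
        max(
            (scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in range(k)
            if j != i
        )
        for i in range(k)
    ]
    return sum(worst) / k


# Toy partition (hypothetical data): two tight, well-separated clusters.
toy = [[(0.0, 0.0), (0.0, 1.0)], [(5.0, 0.0), (5.0, 1.0)]]
dunn = dunn_index(toy)          # separation 5, diameter 1
db = davies_bouldin_index(toy)  # (0.5 + 0.5) / 5 for both clusters
```

A GA that maximizes fitness could use the Dunn index as is and the negated Davies-Bouldin index, since lower Davies-Bouldin values indicate better partitions. That mapping is an assumption of this sketch, not something stated in the abstract.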

17 citations

Journal Article•DOI•

Agent based Stock Clustering for Efficient Portfolio Management

Preeti Baser, Jatinderkumar R. Saini

22 Apr 2015-International Journal of Computer Applications

TL;DR: This research helped to assemble a diversified portfolio of stocks with the use of clustering, which can help the investor community in particular and, in turn, society and the economy in general through better allocation of wealth.

Abstract: This research paper proposes an agent-based framework for portfolio management using non-hierarchical clustering methods. The framework includes various agents: a data agent, clustering agent, ranking agent, portfolio manager and user agent. The data agent collected financial ratios of Nifty 50 companies from a financial database. Clustering agents generated clusters, and the DB index was computed to find the optimum cluster size for each method. A validation agent evaluated the performance of k-means, k-medoids and fast k-means using intra-class inertia. Clusters generated by k-means were used for investment and portfolio analysis with the Markowitz model. This research helped to assemble a diversified portfolio of stocks with the use of clustering.

Keywords: Clustering, Data mining (DM), Davies-Bouldin (DB) Index, Dunn Index, k-means, k-medoids, Partitioning Around Medoids (PAM), Silhouette index

1. INTRODUCTION

Data mining is a process of automatically discovering knowledge and predicting future trends from large financial markets. It creates opportunities for companies to make proactive, knowledge-driven decisions in order to gain a competitive advantage. A variety of DM techniques have become available over past decades, including classification, similarity search, cluster analysis and association rule mining. Data mining techniques are also widely applied in a number of financial areas, including predicting stock prices, predicting stock indices, portfolio management, portfolio risk management, trend detection and designing recommender systems [27, 28]. Portfolio management is one of the major problems in the financial domain. In today's competitive financial environment, an investor wants to earn maximum profit from his assets. An investor considering an investment in securities faces the problem of choosing from among a large number of securities and may be unsure which security to invest in. The choice depends upon the risk-return characteristics of individual securities.

The investor selects the most desirable securities and allocates funds over this group of securities. Again, he faces the problem of deciding which securities to select and how much to invest in each. The investor chooses the optimal portfolio taking into consideration the risk and return characteristics of all possible portfolios. This research work describes an agent-based framework for portfolio management using non-hierarchical clustering methods. The proposed framework consists of various agents such as a data agent, clustering agent, ranking agent, user agent and portfolio manager. This framework assists investors in strategic planning and investment decision-making. This research work can help to assemble a diversified portfolio of stocks with the help of clustering, which can help the investor community in particular and, in turn, society and the economy in general through better allocation of wealth. In this research paper,

8 citations

Book Chapter•DOI•

Identifying nutritional patterns through integrative multiview clustering

Beatriz Sevilla-Villanueva, Karina Gibert, Miquel Sànchez-Marrè

01 Jan 2015

TL;DR: The findings suggest that Integrative Multiview Clustering provides more compact and separated clusters, and that the resulting partition is clearer to interpret than the one obtained by classical approaches.

Abstract: The main goal of this work is to develop a methodology for finding nutritional patterns based on a variety of subject characteristics, which can contribute to a better understanding of the interactions between nutrition and health, given that the complexity of the phenomenon leads to poor performance with classical approaches. An innovative methodology based on advanced clustering techniques is proposed in order to find more compact patterns or clusters. Integrative Multiview Clustering (IMC) combines the Multiview Clustering approach with crossing operations over the several partitions obtained. A comparison with other classical clustering techniques is provided to assess the performance of our approach. The Dunn-like cluster validity index proposed by Bezdek & Pal is used for the comparison from a structural point of view, as it is more robust than the original Dunn index. According to the Dunn-like index, the IMC method performs better than other popular clustering techniques. Our findings suggest that Integrative Multiview Clustering provides more compact and separated clusters. In addition, IMC helps to reduce the high dimensionality of the data through a multiview division of attributes, and the resulting partition is easier to interpret. Using the Integrative Multiview Clustering approach, a good partition is obtained from a structural point of view. Also, the interpretation of the resulting partition is clearer than the one obtained by classical approaches.
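
The original Dunn index depends on single extreme pairs of points, so one outlier can change it sharply. Dunn-like indices replace those extremes with quantities that average over points. The sketch below shows one such variant, assuming centroid-to-centroid distance for separation and twice the mean distance to the centroid for diameter. The abstract does not say which of the Bezdek and Pal definitions the authors used, so this choice, the function names and the toy data are all illustrative.

```python
import math
from itertools import combinations


def centroid(cluster):
    """Component-wise mean of a list of equal-length points."""
    n = len(cluster)
    return tuple(sum(coords) / n for coords in zip(*cluster))


def dunn_like_index(clusters):
    """Dunn-style ratio built from centroid-based quantities (higher is better).

    Separation: distance between cluster centroids (assumed variant).
    Diameter: twice the mean distance of points to their centroid (assumed variant).
    """
    centers = [centroid(c) for c in clusters]
    min_separation = min(
        math.dist(u, v) for u, v in combinations(centers, 2)
    )
    max_diameter = max(
        2.0 * sum(math.dist(p, v) for p in c) / len(c)
        for c, v in zip(clusters, centers)
    )
    return min_separation / max_diameter


# Toy partition (hypothetical data): centroids 5 apart, mean radius 0.5.
toy = [[(0.0, 0.0), (0.0, 1.0)], [(5.0, 0.0), (5.0, 1.0)]]
score = dunn_like_index(toy)
```

Because every point contributes to the centroid and to the mean radius, a single stray point shifts this score only slightly, whereas it can set the numerator or denominator of the original index on its own.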

5 citations

Proceedings Article•DOI•

ACPSO: Hybridization of ant colony and particle swarm algorithm for optimization in data clustering using multiple objective functions

Dipali Kharche¹, Anuradha D. Thakare¹•Institutions (1)

Savitribai Phule Pune University¹

23 Apr 2015

TL;DR: A hybrid algorithm called ACPSO is proposed for optimal clustering, in which ACO discovers initial centroids through the simulation of an ant colony system; the experimental results show that the proposed method performs well compared with an existing algorithm on most evaluation metrics.

Abstract: K-means clustering groups similar information using a distance function. Even though it is a good algorithm for grouping, its clustering performance can suffer from poor cluster initialization. This has opened a research track on better algorithms with good initial centroids. This paper presents a hybrid algorithm, called ACPSO, for an optimal clustering process. The ACO algorithm is used to discover centroids through the simulation of an ant colony system. Once initial centroids are produced by the ACO algorithm, the PSO algorithm is applied to find optimal clusters with the help of different fitness functions, namely the XB index, Sym index, DB index, Connected DB index, Connected Dunn index and Mean Square Distance. Finally, experimentation is performed with the iris data and performance is evaluated with five different evaluation metrics. The experimental results show that the proposed method performs well compared with an existing algorithm on most evaluation metrics.

5 citations

Proceedings Article•DOI•

A modified brainstorm optimization for clustering using hard c-means

Reetika Roy¹, J. Anuradha¹•Institutions (1)

VIT University¹

01 Nov 2015

TL;DR: The algorithm has been implemented with the Iris data set, and its validity and effectiveness are tested with commonly used internal evaluation measures for clustering such as the Davies-Bouldin Index and Dunn Index.

Abstract: The main aim of the proposed study is to explore the performance of the Brainstorm Optimization algorithm in Hard c-means clustering of data. The rationale behind this analysis is to generate a random solution set of centroids and then modify the centroids so as to refine the clusters. Because Brainstorm Optimization is a form of evolutionary algorithm, this refinement of centroids happens through competition and cooperation with existing centroid values. The algorithm incorporates both exploitation and exploration of the search space to generate the new centroids. The algorithm has been implemented with the Iris data set, and its validity and effectiveness are tested with commonly used internal evaluation measures for clustering such as the Davies-Bouldin Index and Dunn Index.

5 citations

Journal Article•

Applying a decision support system for accident analysis by using data mining approach: A case study on one of the Iranian manufactures

Rouzbeh Ghousi¹•Institutions (1)

Iran University of Science and Technology¹

01 Jul 2015-Journal of Industrial and Systems Engineering

TL;DR: Large data sets of the accidents of a manufacturing and industrial unit have been studied by applying clustering methods and association rules as data mining methods, and the optimum number of clusters has been determined.

Abstract: Uncertain and stochastic states have always been taken into consideration in risk management and accident analysis, as in other fields of industrial engineering, and they make it difficult for managers to select corrective actions and control measures. In this research, large data sets of the accidents of a manufacturing and industrial unit were studied by applying clustering methods and association rules as data mining methods. First, the accident data was briefly studied. Then, the effective features of an accident were selected in consultation with industry experts and with production process information taken into account. Clustering divided the data into separate clusters, and the Dunn Index, used as the clustering validator, determined the optimum number of clusters. In the next stage, the Apriori algorithm, an association rule method, was used to identify the relations between these fields and to extract and analyze the association rules among them. Since managers need precise information for decision making, data mining methods, when used properly, may act as a supporting system.
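
Using the Dunn index to pick the number of clusters usually means clustering the data once per candidate k and keeping the k with the highest index. The sketch below illustrates that loop in plain Python, assuming k-means with a deterministic farthest-point seeding. The abstract names neither the clustering algorithm nor its settings, so the algorithm, the function names and the toy points are assumptions.

```python
import math
from itertools import combinations


def dunn_index(clusters):
    """Smallest between-cluster distance over largest cluster diameter."""
    min_separation = min(
        math.dist(p, q)
        for a, b in combinations(clusters, 2)
        for p in a
        for q in b
    )
    max_diameter = max(
        math.dist(p, q) for c in clusters for p, q in combinations(c, 2)
    )
    return min_separation / max_diameter


def farthest_point_init(points, k):
    """Deterministic seeding: first point, then repeatedly the point farthest from all seeds."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    return centers


def kmeans(points, k, iterations=50):
    """Plain Lloyd iterations; returns the non-empty clusters as lists of points."""
    centers = farthest_point_init(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # Recompute each center as its cluster mean; keep the old center if empty.
        new_centers = [
            tuple(sum(x) / len(c) for x in zip(*c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return [c for c in clusters if c]


def best_k_by_dunn(points, candidates):
    """Cluster once per candidate k and return the k with the highest Dunn index."""
    scores = {k: dunn_index(kmeans(points, k)) for k in candidates}
    return max(scores, key=scores.get), scores


# Hypothetical data: three unit squares placed far apart.
points = [
    (float(ox + dx), float(oy + dy))
    for ox, oy in [(0, 0), (10, 0), (0, 10)]
    for dx, dy in [(0, 0), (1, 0), (0, 1), (1, 1)]
]
best_k, scores = best_k_by_dunn(points, [2, 3, 4])
```

On this toy data the index peaks at k = 3: merging two squares inflates the largest diameter, and splitting a square shrinks the smallest separation.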

4 citations

Ensemble based Clustering of Plasmodium falciparum genes

Itunuoluwa Isewon, O. J. Oyelade, Ezekiel Adebiyi, Benedikt Brors

01 May 2015

TL;DR: Our results show that ensemble based clustering is indeed a good alternative for cluster analysis with the premise of an improved performance over traditional clustering algorithms.

Abstract: Ensemble learning is a recent and extended approach to the unsupervised data mining technique called clustering, which is used for finding natural groupings that exist in a dataset. Here, we applied an ensemble based clustering algorithm called Random Forests with Partition around Medoids (PAM) to multiple time series gene expression data of Plasmodium falciparum. The Random Forest algorithm is the most common ensemble learning approach that uses decision trees. Random Forest consists of a large number of classification trees (ranging from hundreds to thousands) built from bootstrap sampling of the dataset. We also applied the following internal cluster validity measures: Silhouette Width index, Connectivity Index and the Dunn Index to select the optimal number of final clusters. Our results show that ensemble based clustering is indeed a good alternative for cluster analysis with the premise of an improved performance over traditional clustering algorithms.

3 citations

Journal Article•DOI•

Two Level Clustering for Quality Improvement using Fuzzy Subtractive Clustering and Self-Organizing Map

Erick Alfons Lisangan, Aina Musdholifah¹, Sri Hartati¹•Institutions (1)

Gadjah Mada University¹

01 Aug 2015-Indonesian Journal of Electrical Engineering and Computer Science

TL;DR: FSC-SOM can improve the cluster centers of FSC with SOM in order to obtain better clustering results; the clustering result of FSC-SOM is better than or equal to that of FSC, as shown by the values of external and internal validity measures.

Abstract: Recently, clustering algorithms have combined conventional methods and artificial intelligence. FSC-SOM is designed to handle the problems of SOM, such as defining the number of clusters and the initial values of the neuron weights. FSC finds the number of clusters and the cluster centers, which become the parameters of SOM. FSC-SOM is expected to improve the quality of FSC, since the cluster centers are determined in two passes: searching for data with high density in FSC, then updating the cluster centers in SOM. FSC-SOM was tested on 10 datasets and measured with F-Measure, entropy, Silhouette Index and Dunn Index. The results showed that FSC-SOM can improve the cluster centers of FSC with SOM in order to obtain better clustering results. The clustering result of FSC-SOM is better than or equal to that of FSC, as shown by the values of external and internal validity measures.

3 citations

The investigation of TB patients features with K-Means clustering

Farzad Firuzi Jahantigh, Hakimeh Ameri

15 Dec 2015

TL;DR: According to the results of this study, the most important factors identified by clustering are hemoglobin, age, sex, smoking, alcohol and creatinine.

Abstract: Introduction: According to the World Health Organization, TB is the largest cause of death among infectious diseases. Due to the high percentage of tuberculosis infection and the high number of deaths among these patients, this study was carried out to categorize patients and find the relationships between different clinical and demographic characteristics. Method: This descriptive analytical study was done on 600 patients from the Masih Daneshvari hospital tuberculosis research center. K-means clustering, Apriori association rules, and data mining algorithms (SPSS Clementine software) were used for clustering and determining the common characteristics among patients. Results: Based on the Dunn index, 3 clusters were chosen as the optimal number of clusters. The common factors between clusters are described in detail in the findings section. According to the characteristics of each cluster, patients can be classified based on the effectiveness of various factors. Conclusion: According to the results of this study, the most important factors identified by clustering are hemoglobin, age, sex, smoking, alcohol and creatinine. Based on the association rules, the strongest relationship is found between cough, weight loss, and ESR.

3 citations

DOI•

Application of Fuzzy Clustering Technique for Analysis of North Indian Ocean Tropical Cyclone Tracks

Sankar Kumar Nath¹, S. D. Kotal¹, Prabir Kumar Kundu²•Institutions (2)

India Meteorological Department¹, Jadavpur University²

01 Sep 2015

TL;DR: In this article, a fuzzy c-means clustering technique is explored to investigate the track of tropical cyclones over the North Indian Ocean (NIO) for the period (1976-2014).

Abstract: A fuzzy c-means (FCM) clustering technique is explored to investigate the tracks of tropical cyclones (TCs) over the North Indian Ocean (NIO) for the period 1976-2014. A total of five clusters is objectively identified based on the partition index, partition coefficient, Dunn Index and separation index. The analysis shows that each cluster has unique features in terms of genesis location, landfall, travel duration, trajectory, seasonality, accumulated cyclone energy and intensity. Analysis of large-scale environmental parameters, constructed for the day preceding genesis, shows some of these parameters to be potential precursors to TC formation for almost all the clusters, most prominently mid-tropospheric humidity, zonal wind, vorticity and outgoing long wave radiation of the main developing regions. The individual clusters have several distinct features in their seasonal cycles. Cluster C5 shows a distinct bimodal distribution, whereas the other clusters form throughout the year. ENSO influenced the cyclone frequency in two of the five clusters. The MJO is found to play an important role in the genesis of cyclones. The post-monsoon season cyclone frequency is higher in MJO phases 2, 3 and 4. Operational forecasters can use the FCM technique as a guideline for the probable zone affected by TC tracks.
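
Among the validity measures named above, the partition coefficient is the simplest to state: it averages the squared membership degrees over all data points, giving 1 for a crisp partition and 1/c when every point belongs equally to all c clusters. A minimal sketch follows, assuming a membership matrix with one row per cyclone track and one column per cluster. The matrices hold made-up values, not the study's data.

```python
def partition_coefficient(memberships):
    """Mean over data points of the sum of squared fuzzy memberships.

    Each row is one data point and is assumed to sum to 1 across clusters.
    """
    n = len(memberships)
    return sum(u * u for row in memberships for u in row) / n


# Hypothetical membership matrices for three points and two clusters.
crisp = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]   # every point fully in one cluster
fuzzy = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]   # every point split evenly

pc_crisp = partition_coefficient(crisp)
pc_fuzzy = partition_coefficient(fuzzy)
```

Values closer to 1 indicate less overlap between clusters, so the coefficient can be compared across candidate cluster counts alongside the other indices.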

2 citations

Book Chapter•DOI•

Evaluation of Fitness Functions for Swarm Clustering Applied to Gene Expression Data

P. K. Nizar Banu¹, S. Andrews²•Institutions (2)

B. S. Abdur Rahman University¹, Mahendra Engineering College²

01 Jan 2015

TL;DR: The high usability of the algorithm and the encouraging results suggest that swarm clustering (PSO based clustering) with the Davies-Bouldin index as fitness function, assessed with respect to the Dunn index, can be a practical tool for analyzing gene expression patterns.

Abstract: The clustering problem is being studied by many researchers using swarm intelligence. However, the search is not carried out entirely at random; a proper fitness function is required to determine the next step in the search space. This paper studies Particle Swarm Optimization (PSO) based clustering with two different fitness functions, the Xie-Beni and Davies-Bouldin indices, on a brain tumor gene expression dataset. Clustering results are validated using Mean Absolute Error (MAE) and the Dunn Index (DI). To analyze the function of genes, genes that have similar expression patterns should be grouped and the datasets should be presented to physicians in a meaningful way. The high usability of the algorithm and the encouraging results suggest that swarm clustering (PSO based clustering) with the Davies-Bouldin index as fitness function, assessed with respect to the Dunn index, can be a practical tool for analyzing gene expression patterns.
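
A fitness function of this kind scores one candidate partition, and the swarm then moves toward candidates with better scores. The sketch below computes the Xie-Beni index for hard assignments, which is the special case of the fuzzy definition in which every membership is 0 or 1. It assumes Euclidean distance and uses cluster means as centers. The function name and toy data are illustrative, and the paper's own formulation may differ.

```python
import math
from itertools import combinations


def xie_beni_index(clusters):
    """Total squared distance to cluster centers over n times the smallest
    squared distance between centers (lower is better)."""
    centers = [tuple(sum(x) / len(c) for x in zip(*c)) for c in clusters]
    n = sum(len(c) for c in clusters)
    compactness = sum(
        math.dist(p, v) ** 2 for c, v in zip(clusters, centers) for p in c
    )
    min_separation_sq = min(
        math.dist(u, v) ** 2 for u, v in combinations(centers, 2)
    )
    return compactness / (n * min_separation_sq)


# Hypothetical partition: centers (0, 1) and (6, 1), every point 1 unit from its center.
toy = [[(0.0, 0.0), (0.0, 2.0)], [(6.0, 0.0), (6.0, 2.0)]]
xb = xie_beni_index(toy)  # 4 / (4 * 36)
```

In a PSO setting, each particle would typically encode a set of candidate centers, and the swarm would minimize this value. That encoding is an assumption here, since the abstract does not describe it.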

Journal Article•DOI•

Dunn Index Bootstrap (DIBS): A procedure to empirically select a cluster analysis method that identifies biologically and clinically relevant molecular disease subgroups

Iwona Pawlikowska¹,², Zhifa Liu², Lei Shi², Tong Lin², Tanja A. Gruber², Giles W. Robinson², Arzu Onar-Thomas², Stan Pounds²•Institutions (2)

University of Silesia in Katowice¹, St. Jude Children's Research Hospital²

23 Oct 2015-BMC Bioinformatics

TL;DR: Cluster analysis is widely used in cancer research to discover molecular subgroups that inform subsequent laboratory investigations and define risk classification criteria for subsequent clinical trials and frequently a specific CCAM is chosen without quantifying the validity of its results.

Abstract: Background Cluster analysis is widely used in cancer research to discover molecular subgroups that inform subsequent laboratory investigations and define risk classification criteria for subsequent clinical trials. However, for any data set, there are a very large number of candidate cluster analysis methods (CCAMs) due to the many choices for feature selection criteria, number of selected features, number of clusters to define, etc. Frequently, a specific CCAM is chosen without quantifying the validity of its results in terms of reproducibility or distinctiveness of the reported subgroups.
