Topic
Dunn index
About: Dunn index is a research topic. Over its lifetime, 150 publications have appeared within this topic, receiving 24,021 citations.
Papers
••
15 Dec 2020
TL;DR: In this article, the authors propose an approach for automatic clustering of text documents using a Self-Organizing Map (SOM), an unsupervised artificial neural network that is widely used for data analysis, data compression, clustering, and data mining.
Abstract: With the huge number of published research papers, retrieving relevant information is a difficult task for any researcher. Effective clustering algorithms can help improve and simplify the retrieval process. Here, we propose an approach for automatic clustering of text documents using a Self-Organizing Map (SOM), an unsupervised artificial neural network that is widely used for data analysis, data compression, clustering, and data mining. The quality and accuracy of a SOM algorithm depend on the values selected for some of its parameters: the initial learning rate, the SOM matrix dimensions, and the number of iterations. The best values are typically selected by trial and error; in the current paper, however, we suggest a more systematic approach to parameter optimization using a genetic algorithm. The proposed method is applied to cluster three datasets of scientific papers using their keywords. Similar research papers were mapped closer to each other. Clustering results were validated using the Dunn index.
5 citations
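For readers unfamiliar with the metric used for validation above, the Dunn index is usually defined as the smallest distance between points in different clusters divided by the largest cluster diameter, so higher values indicate compact, well-separated clusters. The following is a minimal sketch of that definition, not the authors' implementation; it assumes Euclidean distance, and the function name `dunn_index` and the toy points are illustrative only.

```python
from itertools import combinations
from math import dist  # Euclidean distance, available from Python 3.8


def dunn_index(points, labels):
    """Smallest between-cluster distance divided by largest cluster diameter."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("the Dunn index needs at least two clusters")

    # Largest diameter: maximum pairwise distance inside any single cluster.
    max_diameter = max(
        (dist(p, q)
         for members in clusters.values()
         for p, q in combinations(members, 2)),
        default=0.0,
    )
    # Smallest separation: minimum distance between points of different clusters.
    min_separation = min(
        dist(p, q)
        for a, b in combinations(clusters.values(), 2)
        for p in a
        for q in b
    )
    if max_diameter == 0.0:
        return float("inf")  # every cluster collapses to a single location
    return min_separation / max_diameter


# Toy data (hypothetical): two tight pairs of points, five units apart.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
good = dunn_index(points, [0, 0, 1, 1])  # clusters match the two pairs
bad = dunn_index(points, [0, 1, 0, 1])   # clusters straddle the gap
print(good, bad)
```

The labelling that matches the two pairs scores far higher than the one that splits them, which is the behaviour a validity index is expected to show. Note that this brute-force version compares every pair of points, so its cost grows quadratically with the number of points.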
••
01 Jan 2019
TL;DR: In this article, the authors cluster the temporal wind speed profiles associated with the South African renewable energy development zones, reducing each zone to a few archetypal mean daily profiles, which greatly reduces the computational cost of high-level capacity allocation optimization studies.
Abstract: This paper presents the results of an investigation to cluster the temporal wind speed profiles associated with the South African renewable energy development zones. The study makes use of a renewable energy resource dataset produced by the Council for Scientific and Industrial Research. The Clustering Large Applications (CLARA) algorithm, which is based on the partitioning around medoids (PAM) algorithm, is used in the clustering exercise. Results are presented for each of the eight South African renewable energy zones. These results include clustered mean daily temporal profiles of the wind speed obtained for the high-demand and low-demand seasons, as well as the corresponding geographical cluster maps. Clustering performance metrics, including the average within-cluster distance, the Dunn index, and the average silhouette width, are presented. The clustering results yield an optimal three to five clusters for each of the individual renewable energy development zones. This implies that the wind speed profiles associated with each of these zones can be reduced to three to five archetypal mean daily profiles, which greatly reduces the computational cost of high-level capacity allocation optimization studies.
5 citations
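The average silhouette width mentioned in the abstract above is commonly computed per point as s = (b - a) / max(a, b), where a is the mean distance to the other members of the point's own cluster and b is the mean distance to the nearest other cluster. The sketch below illustrates that standard definition only; it is not the study's code, it assumes Euclidean distance, and the short "profiles" are made-up numbers rather than data from the paper.

```python
from math import dist  # Euclidean distance, available from Python 3.8


def average_silhouette_width(points, labels):
    """Mean of s(i) = (b - a) / max(a, b) over all points."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("silhouette width needs at least two clusters")

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        # (the point's zero distance to itself does not change the sum).
        a = sum(dist(point, q) for q in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster.
        b = min(
            sum(dist(point, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical three-value wind speed profiles (m/s), two low and two high.
profiles = [(3.0, 5.0, 4.0), (3.2, 5.1, 4.1), (7.0, 9.0, 8.0), (7.1, 9.2, 8.1)]
matched = average_silhouette_width(profiles, [0, 0, 1, 1])
mismatched = average_silhouette_width(profiles, [0, 1, 0, 1])
print(matched, mismatched)
```

Values near 1 indicate that points sit much closer to their own cluster than to any other, while negative values suggest points assigned to the wrong cluster. In practice the score is evaluated for several candidate cluster counts and the count with the best value is kept.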
••
01 Jan 2019
TL;DR: This paper compares the performance of k-means and k-medoids in clustering objects with mixed variables, using a mixed-variable data set derived from modified cancer data, and indicates that k-medoids is a good clustering option when the measured variables are of mixed types.
Abstract: This paper compares the performance of k-means and k-medoids in clustering objects with mixed variables. The k-means algorithm was originally intended for clustering objects with continuous variables, as it uses Euclidean distance to compute the distance between objects. In contrast, k-medoids is designed to suit mixed-type variables, especially with PAM (partitioning around medoids). Using a mixed-variable data set derived from modified cancer data, we compared k-means and k-medoids on internal validity measures set up in an R package. The results indicate that k-medoids is a good clustering option when the measured variables are of mixed types.
5 citations
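To illustrate why k-medoids suits mixed variables, note that it needs only a pairwise dissimilarity and always picks an actual record as the cluster centre, so no mean of categorical values is ever required. The sketch below uses a Gower-style dissimilarity as an assumption; the abstract does not say which dissimilarity the paper used. The records, field layout, and function names are hypothetical.

```python
def gower_dissimilarity(x, y, numeric_ranges):
    """Gower-style dissimilarity for records mixing numeric and categorical fields.

    numeric_ranges maps the index of each numeric field to its range
    (max - min over the data set); every other field is treated as categorical.
    """
    total = 0.0
    for i, (xi, yi) in enumerate(zip(x, y)):
        if i in numeric_ranges:
            span = numeric_ranges[i]
            # Numeric field: absolute difference scaled to [0, 1].
            total += abs(xi - yi) / span if span else 0.0
        else:
            # Categorical field: 0 if the categories match, 1 otherwise.
            total += 0.0 if xi == yi else 1.0
    return total / len(x)


def medoid(records, numeric_ranges):
    """Return the record with the smallest total dissimilarity to all records."""
    return min(
        records,
        key=lambda r: sum(
            gower_dissimilarity(r, other, numeric_ranges) for other in records
        ),
    )


# Hypothetical records: (age in years, tumour size in mm, stage).
records = [(40, 12.0, "I"), (45, 15.0, "I"), (70, 30.0, "III")]
ranges = {0: 30, 1: 18.0}  # ranges of the two numeric fields in this toy set
centre = medoid(records, ranges)
print(centre)
```

A full PAM implementation would alternate between assigning records to their nearest medoid and swapping medoids to reduce the total dissimilarity; only the medoid-selection step for a single cluster is shown here.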
••
01 Dec 2009
TL;DR: The proposed system starts with a preprocessing stage that includes Digital Down Conversion, symbol rate estimation, base-band filtering, synchronization, and normalization, followed by an identifier stage that uses clustering algorithms, along with cluster validity measures if needed, to identify the digital modulation scheme used.
Abstract: This paper presents a software-radio-based receiver architecture for identification of linear, bi-dimensional modulation techniques via clustering algorithms. Identification of digital modulation schemes is of great importance in 3G and 4G cellular mobile systems, and it can be posed as a pattern recognition problem by using a vector space representation of digitally modulated signals. The proposed system starts with a preprocessing stage that includes Digital Down Conversion (DDC), symbol rate estimation, base-band filtering, synchronization, and normalization. The identifier stage then follows; it uses clustering algorithms, along with cluster validity measures if needed, to identify the digital modulation scheme used. Three clustering algorithms were compared: K-means clustering with the Dunn index as a validity measure, fuzzy C-means clustering with a minimum hard tendency validity measure, and density-based clustering. Simulation results for the three approaches are presented in the presence of additive white Gaussian noise (AWGN).
5 citations
••
19 Jun 2013
TL;DR: Experimental results indicate that TriWClustering can find significant triclusters and offers a useful tool for cross-species gene regulation analysis.
Abstract: Many different biological data mining methods have been used in gene expression data analysis. A common method is two-way clustering, also called biclustering, which is used to identify gene groups that behave similarly under a subset of experimental conditions. This paper introduces a novel approach called three-way clustering (TriWClustering) for cross-species gene regulation analysis, which mines coherent clusters, named triclusters, in three-dimensional (gene-condition-organism) gene expression datasets. The developed method has been applied to three different gene expression datasets obtained from NCBI's GEO data collection. The biological and statistical significance of the results is evaluated using Gene Ontology term enrichment analysis and the Dunn index (DI) metric, respectively. The experimental results indicate that TriWClustering can find significant triclusters and offers a useful tool for cross-species gene regulation analysis.
5 citations