Journal Article

A Heterogeneous Cluster Ensemble Model for Improving the Stability of Fuzzy Cluster Analysis

01 Dec 2016 - Procedia Computer Science (Elsevier) - Vol. 102, pp. 129-136
TL;DR: This paper provides a heterogeneous cluster ensemble approach to improve the stability of fuzzy cluster analysis: different fuzzy clustering algorithms are applied to the datasets to obtain multiple partitions, which are later fused into a final consensus matrix.
About: This article is published in Procedia Computer Science. The article was published on 2016-12-01 and is currently open access. It has received 10 citations to date. The article focuses on the topics: Fuzzy clustering & FLAME clustering.
Citations
Journal Article
TL;DR: A novel fuzzy clustering ensemble framework, based on a new fuzzy diversity measure and a fuzzy quality measure, is proposed to find the base-clusterings with the best performance; experiments on various standard datasets show its effectiveness compared to state-of-the-art methods.
Abstract: In spite of some attempts at improving the quality of the clustering ensemble methods, it seems that little research has been devoted to the selection procedure within the fuzzy clustering ensemble. In addition, quality and local diversity of base-clusterings are two important factors in the selection of base-clusterings. Very few of the studies have considered these two factors together for selecting the best fuzzy base-clusterings in the ensemble. We propose a novel fuzzy clustering ensemble framework based on a new fuzzy diversity measure and a fuzzy quality measure to find the base-clusterings with the best performance. Diversity and quality are defined based on the fuzzy normalized mutual information between fuzzy base-clusterings. In our framework, the final clustering of selected base-clusterings is obtained by two types of consensus functions: (1) a fuzzy co-association matrix is constructed from the selected base-clusterings and then, a single traditional clustering such as hierarchical agglomerative clustering is applied as consensus function over the matrix to construct the final clustering. (2) a new graph based fuzzy consensus function. The time complexity of the proposed consensus function is linear in terms of the number of data-objects. Experimental results reveal the effectiveness of the proposed approach compared to the state-of-the-art methods in terms of evaluation criteria on various standard datasets.
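
The first consensus function described above starts from a fuzzy co-association matrix. The following is a minimal sketch of one common way to build such a matrix, assuming that the co-association of two objects is the inner product of their membership vectors, averaged over the base-clusterings. That definition, the function name `fuzzy_co_association`, and the toy data are illustrative assumptions, not the authors' exact formulation.

```python
from typing import List

# A fuzzy partition: one row per object, one column per cluster,
# each row holding membership degrees that sum to 1.
Membership = List[List[float]]


def fuzzy_co_association(partitions: List[Membership]) -> List[List[float]]:
    """Average, over all base partitions, the inner product of the
    membership vectors of every pair of objects (assumed definition)."""
    n = len(partitions[0])
    co = [[0.0] * n for _ in range(n)]
    for u in partitions:
        if len(u) != n:
            raise ValueError("all partitions must cover the same objects")
        for i in range(n):
            for j in range(n):
                co[i][j] += sum(a * b for a, b in zip(u[i], u[j]))
    m = len(partitions)
    return [[value / m for value in row] for row in co]


# Two toy base-clusterings of three objects. The cluster columns of p2 are
# permuted relative to p1; no label alignment is needed because only
# pairwise agreement between objects enters the matrix.
p1 = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]
p2 = [[0.2, 0.8], [0.3, 0.7], [1.0, 0.0]]
co = fuzzy_co_association([p1, p2])
```

Objects 0 and 1 receive a higher co-association than objects 0 and 2. A conventional algorithm such as hierarchical agglomerative clustering could then be run on this matrix treated as a similarity matrix; that step is not shown here.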

46 citations

Journal Article
TL;DR: Experimental results on various standard datasets demonstrated the effectiveness of the proposed approach compared to the state-of-the-art methods in terms of evaluation criteria and clustering robustness.

41 citations

Book Chapter
05 Sep 2018
TL;DR: A hybrid fuzzy clustering model is presented that combines variants of fuzzy c-means clustering and density-based clustering for exploring well-structured user feedback data, intending to exploit the advantages of these two types of clustering approaches and diminish their drawbacks.
Abstract: In today’s dynamic environments, user feedback data are a valuable asset, providing orientation about the achieved quality and possible improvements of various products or services. In this paper we present a hybrid fuzzy clustering model combining variants of fuzzy c-means (FCM) clustering and density-based clustering for exploring well-structured user feedback data. Despite the multitude of successful applications where these algorithms are applied separately, they also suffer from drawbacks of various kinds. The FCM algorithm faces difficulties in detecting clusters of non-spherical shapes or densities, and it is sensitive to noise and outliers. Density-based clustering, on the other hand, is not easily adaptable to generate fuzzy partitions. Our hybrid clustering model intertwines density-based clustering and variations of FCM, intending to exploit the advantages of these two types of clustering approaches and diminish their drawbacks. Finally, we have assessed and compared our model in a real-world case study.
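
To make the noise sensitivity mentioned above concrete, here is a small sketch of the standard FCM membership update for a single data point, given its distances to the cluster centers. The function name `fcm_memberships` and the example distances are illustrative assumptions; the hybrid model of the chapter itself is not reproduced here.

```python
from typing import List


def fcm_memberships(distances: List[float], m: float = 2.0) -> List[float]:
    """Standard FCM membership update for one point:
    u_i = 1 / sum_j (d_i / d_j) ** (2 / (m - 1)), with fuzzifier m > 1."""
    if m <= 1.0:
        raise ValueError("the fuzzifier m must be greater than 1")
    # A point that coincides with one or more centers belongs fully to them.
    zero = [i for i, d in enumerate(distances) if d == 0.0]
    if zero:
        return [1.0 / len(zero) if d == 0.0 else 0.0 for d in distances]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_j) ** exponent for d_j in distances)
        for d_i in distances
    ]


# A point close to the first of two centers.
near = fcm_memberships([1.0, 3.0])

# A distant outlier that is equally far from both centers.
outlier = fcm_memberships([10.0, 10.0])
```

Because the memberships of each point are constrained to sum to 1, the distant outlier still receives a membership of 0.5 in each cluster and therefore pulls on both centers, which is one way to see why FCM is sensitive to noise and outliers.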

3 citations

Proceedings Article
12 Nov 2019
TL;DR: Several models employing fuzzy clustering techniques in data compression systems are demonstrated, and image compression based on fuzzy transforms for compression and decompression of color videos is described in detail.
Abstract: Data compression is the process of reducing the amount of memory needed to represent a given piece of information. This process is of great utility, especially in digital storage and transmission of multimedia information, and it typically involves various encoding/decoding schemes. In this work we focus primarily on compression schemes which employ specific forms of clustering known as fuzzy clustering. In the data mining context, fuzzy clustering is a versatile tool which analyzes heterogeneous collections of data, providing insights on the underlying structures involving the concept of partial membership. Several models employing fuzzy clustering techniques in data compression systems are demonstrated, and image compression based on fuzzy transforms for compression and decompression of color videos is described in detail.

1 citation


Cites background from "A Heterogeneous Cluster Ensemble Mo..."

  • ...In general, fuzzy clustering methods can be superior to that of its hard counterparts since they can represent the relationship between the input pattern data and clusters more naturally [9]....

Proceedings Article
01 Dec 2019
TL;DR: A new fuzzy clustering ensemble model based on the cluster forests (CF) method that reduces execution time and consists mainly of two steps: generation of cluster instances and aggregation of global models.
Abstract: With the accumulation of large volumes of data, clustering big data is a challenging task. Data reduction is considered a powerful approach that significantly reduces execution time. This work presents a new fuzzy clustering ensemble model based on the cluster forests (CF) method that reduces execution time and consists mainly of two steps: generation of cluster instances and aggregation of global models. First, the algorithm generates multiple cluster instances using the bdrFCM fuzzy clustering technique. Second, it aggregates these clusters to obtain the final results using Ncut spectral clustering. We call it the FCE-CF approach. The proposed method is guided by the kappa cluster validity index. Experimental results demonstrate that FCE-CF outperforms existing clustering methods in terms of time and memory on big data from the UCI repository.

1 citation

References
Journal Article
TL;DR: This paper introduces the problem of combining multiple partitionings of a set of objects into a single consolidated clustering without accessing the features or algorithms that determined these partitionings and proposes three effective and efficient techniques for obtaining high-quality combiners (consensus functions).
Abstract: This paper introduces the problem of combining multiple partitionings of a set of objects into a single consolidated clustering without accessing the features or algorithms that determined these partitionings. We first identify several application scenarios for the resultant 'knowledge reuse' framework that we call cluster ensembles. The cluster ensemble problem is then formalized as a combinatorial optimization problem in terms of shared mutual information. In addition to a direct maximization approach, we propose three effective and efficient techniques for obtaining high-quality combiners (consensus functions). The first combiner induces a similarity measure from the partitionings and then reclusters the objects. The second combiner is based on hypergraph partitioning. The third one collapses groups of clusters into meta-clusters which then compete for each object to determine the combined clustering. Due to the low computational costs of our techniques, it is quite feasible to use a supra-consensus function that evaluates all three approaches against the objective function and picks the best solution for a given situation. We evaluate the effectiveness of cluster ensembles in three qualitatively different application scenarios: (i) where the original clusters were formed based on non-identical sets of features, (ii) where the original clustering algorithms worked on non-identical sets of objects, and (iii) where a common data-set is used and the main purpose of combining multiple clusterings is to improve the quality and robustness of the solution. Promising results are obtained in all three situations for synthetic as well as real data-sets.
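
The objective "in terms of shared mutual information" can be illustrated with a short sketch that scores a candidate consensus labeling by its average normalized mutual information (NMI) with the base partitionings. The normalization by the geometric mean of the two entropies, the function names `nmi` and `average_nmi`, and the toy labelings are assumptions made for illustration; none of the three combiners from the paper is implemented here.

```python
import math
from collections import Counter
from typing import Hashable, List, Sequence


def entropy(counts: Counter, n: int) -> float:
    """Shannon entropy (natural log) of a labeling given its label counts."""
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def nmi(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Mutual information of two hard labelings, normalized by
    sqrt(H(a) * H(b)); returns 0.0 if either labeling has one cluster."""
    if len(a) != len(b):
        raise ValueError("labelings must have the same length")
    n = len(a)
    ca, cb, cab = Counter(a), Counter(b), Counter(zip(a, b))
    mi = sum(
        (n_xy / n) * math.log(n_xy * n / (ca[x] * cb[y]))
        for (x, y), n_xy in cab.items()
    )
    ha, hb = entropy(ca, n), entropy(cb, n)
    if ha == 0.0 or hb == 0.0:
        return 0.0
    return mi / math.sqrt(ha * hb)


def average_nmi(candidate: Sequence[Hashable],
                ensemble: List[Sequence[Hashable]]) -> float:
    """Score a candidate consensus by its mean NMI with the ensemble."""
    return sum(nmi(candidate, labels) for labels in ensemble) / len(ensemble)


same_up_to_renaming = nmi([0, 0, 1, 1], [1, 1, 0, 0])
independent = nmi([0, 0, 1, 1], [0, 1, 0, 1])
score = average_nmi([0, 0, 1, 1], [[1, 1, 0, 0], [0, 1, 0, 1]])
```

Since NMI depends only on how objects are grouped, not on the label names, partitionings produced by different algorithms can be compared without accessing the original features or aligning cluster labels.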

4,375 citations

Book
09 Jul 1999
TL;DR: This paper presents a meta-modelling framework that automates the labor-intensive, and therefore time-consuming and expensive, process of rule generation and estimation in the context of cluster dynamics.
Abstract: Introduction. Basic Concepts. Classical Fuzzy Clustering Algorithms. Linear and Ellipsoidal Prototypes. Shell Prototypes. Polygonal Object Boundaries. Cluster Estimation Models. Cluster Validity. Rule Generation with Clustering. Appendix. Bibliography.

925 citations

Proceedings Article
21 Aug 2003
TL;DR: Empirical results show that the proposed approach achieves better and more robust clustering performance compared to not only single runs of random projection/clustering but also clustering with PCA, a traditional data reduction method for high dimensional data.
Abstract: We investigate how random projection can best be used for clustering high dimensional data. Random projection has been shown to have promising theoretical properties. In practice, however, we find that it results in highly unstable clustering performance. Our solution is to use random projection in a cluster ensemble approach. Empirical results show that the proposed approach achieves better and more robust clustering performance compared to not only single runs of random projection/clustering but also clustering with PCA, a traditional data reduction method for high dimensional data. To gain insights into the performance improvement obtained by our ensemble method, we analyze and identify the influence of the quality and the diversity of the individual clustering solutions on the final ensemble performance.
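
As a rough illustration of how random projection can supply the diverse inputs that a cluster ensemble needs, the sketch below projects the same data set with several independently drawn random matrices. The Gaussian entries scaled by 1/sqrt(k), the function name `random_projection`, and the toy data are assumptions for illustration and may differ from the construction used in the paper; the clustering and combination steps are not shown.

```python
import math
import random
from typing import List


def random_projection(data: List[List[float]], k: int, seed: int) -> List[List[float]]:
    """Project d-dimensional rows onto k dimensions using a random matrix
    with independent N(0, 1) entries scaled by 1 / sqrt(k) (assumed choice)."""
    rng = random.Random(seed)
    d = len(data[0])
    matrix = [
        [rng.gauss(0.0, 1.0) / math.sqrt(k) for _ in range(k)]
        for _ in range(d)
    ]
    return [
        [sum(row[i] * matrix[i][j] for i in range(d)) for j in range(k)]
        for row in data
    ]


# Each seed yields a different low-dimensional view of the same data.
# Clustering every view separately gives the unstable individual solutions
# that an ensemble would then combine.
data = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
views = [random_projection(data, k=2, seed=s) for s in range(3)]
```

Each view would be clustered on its own and the resulting partitions merged, for instance through a co-association matrix. Fixing the seed keeps individual runs reproducible.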

655 citations

Proceedings Article
28 Jul 2002
TL;DR: The contribution of this paper is to formally define the cluster ensemble problem as an optimization problem and to propose three effective and efficient combiners for solving it based on a hypergraph model.
Abstract: It is widely recognized that combining multiple classification or regression models typically provides superior results compared to using a single, well-tuned model. However, there are no well known approaches to combining multiple non-hierarchical clusterings. The idea of combining cluster labelings without accessing the original features leads us to a general knowledge reuse framework that we call cluster ensembles. Our contribution in this paper is to formally define the cluster ensemble problem as an optimization problem and to propose three effective and efficient combiners for solving it based on a hypergraph model. Results on synthetic as well as real data sets are given to show that cluster ensembles can (i) improve quality and robustness, and (ii) enable distributed clustering.

474 citations