# Analysis of rough and fuzzy clustering

15 Oct 2010, pp. 679-686

TL;DR: An experimental comparison of both clustering techniques is provided and a procedure for conversion from fuzzy membership clustering to rough clustering is described, showing that descriptive fuzzy clustering may not always produce results that are as accurate as direct application of rough clustering.

Abstract: With the growing popularity of rough clustering, the soft computing research community is studying the relationships between rough and fuzzy clustering as well as their relative advantages. Both rough and fuzzy clustering are less restrictive than conventional clustering. Fuzzy clustering memberships are more descriptive than rough clustering memberships. In some cases descriptive fuzzy clustering may be advantageous, while in others it may lead to information overload. This paper provides an experimental comparison of the two clustering techniques and describes a procedure for converting fuzzy membership clustering to rough clustering. However, such a conversion is not always necessary, especially if one only needs lower and upper approximations. Experiments also show that descriptive fuzzy clustering may not always (particularly for high-dimensional objects) produce results that are as accurate as direct application of rough clustering. We present an analysis of the results from both techniques.
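The conversion procedure itself is not reproduced on this page. A minimal sketch of one plausible threshold-based conversion is shown below; the function name, the `epsilon` threshold, and the assignment rule are illustrative assumptions, not the authors' exact procedure:

```python
import numpy as np

def fuzzy_to_rough(U, epsilon=0.2):
    """Convert a fuzzy membership matrix U (n objects x k clusters)
    into rough lower/upper approximations.

    Rule (an assumption): an object goes to the lower approximation of
    its best cluster when no other membership comes within `epsilon` of
    the maximum; otherwise it goes to the upper approximations of all
    near-maximal clusters (the boundary regions).
    """
    n, k = U.shape
    lower = [set() for _ in range(k)]
    upper = [set() for _ in range(k)]
    for i in range(n):
        best = int(np.argmax(U[i]))
        # clusters whose membership is within epsilon of the maximum
        near = {j for j in range(k) if U[i, best] - U[i, j] <= epsilon}
        if near == {best}:
            lower[best].add(i)   # unambiguous: lower approximation
            upper[best].add(i)   # lower approximation is a subset of upper
        else:
            for j in near:       # ambiguous: boundary of each near cluster
                upper[j].add(i)
    return lower, upper

# Example: 3 objects, 2 clusters
U = np.array([[0.9, 0.1],
              [0.55, 0.45],
              [0.2, 0.8]])
lower, upper = fuzzy_to_rough(U, epsilon=0.2)
print(lower)  # [{0}, {2}]
print(upper)  # [{0, 1}, {1, 2}]
```

Object 1 here has memberships 0.55 and 0.45, too close to call, so it lands in both upper approximations but in neither lower approximation, which is exactly the information a rough clustering retains.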

##### Citations


••

TL;DR: An integrated clustering technique using multi-phase learning, called Integrated Rough Fuzzy Clustering using Random Forest, is proposed, in which the results of three aforementioned clustering techniques are used to compute the roughness measure.

29 citations

••

17 Aug 2012

TL;DR: Fuzzy and Rough variations of the popular K-means algorithm are compared as ways to obtain non-crisp clustering solutions, with a focus on the strengths of each approach.

Abstract: A crisp cluster does not share an object with other clusters, but in real-life situations such rigidity is not acceptable for several applications. Hence, Fuzzy and Rough variations of the popular K-means algorithm have been proposed to obtain non-crisp clustering solutions.
The Evidential c-means proposed by Masson and Denoeux [6] in the theoretical framework of belief functions uses Fuzzy c-means (FCM) to build basic belief assignments that determine cluster membership. Rough clustering, on the other hand, uses the concept of lower and upper approximations to synthesize clusters; a variation of the K-means algorithm, namely Rough k-means (RKM), has been proposed and experimented with on various datasets.
In this paper we analyze both algorithms using synthetic, real, and standard datasets to determine the similarities of the two clustering approaches, focusing on the strengths of each.
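The lower/upper approximation idea behind RKM can be illustrated with a Lingras-style centroid update, where a cluster's centroid is a weighted combination of its lower-approximation mean and its boundary-region mean. The weight values below are illustrative assumptions:

```python
import numpy as np

def rkm_centroid(X, lower_idx, boundary_idx, w_lower=0.7, w_boundary=0.3):
    """Rough k-means centroid update (Lingras-style sketch): weighted
    combination of the lower-approximation mean and the boundary-region
    mean; falls back to whichever region is non-empty."""
    if lower_idx and boundary_idx:
        return (w_lower * X[list(lower_idx)].mean(axis=0)
                + w_boundary * X[list(boundary_idx)].mean(axis=0))
    if lower_idx:
        return X[list(lower_idx)].mean(axis=0)
    return X[list(boundary_idx)].mean(axis=0)

X = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 0.0]])
# 0.7 * [1, 0] + 0.3 * [10, 0] = [3.7, 0]
print(rkm_centroid(X, lower_idx={0, 1}, boundary_idx={2}))
```

The boundary point pulls the centroid toward itself, but with less force than the points the cluster certainly owns; that is the sense in which RKM is less restrictive than crisp k-means.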

6 citations

••

01 Nov 2010

TL;DR: This paper applies the RKM, FCM and IKM algorithms to cluster web visits to an educational site and highlights various features of these three soft computing algorithms.

Abstract: Fuzzy C-means (FCM) and Rough K-means (RKM) are two popular soft clustering algorithms that allow for overlapping clusters. Overlapping clusters can be useful in applications where the restrictions imposed by crisp clustering, which forces assignment of every object to a unique cluster, may not be practical. Like RKM and FCM, an interval set representation of clusters also generates overlapping clusters. We present and discuss the interval set K-means algorithm (IKM). This paper applies the RKM, FCM and IKM algorithms to cluster web visits to an educational site. The experimental comparison highlights various features of these three soft computing algorithms.

3 citations

### Cites background from "Analysis of rough and fuzzy clustering"

...To resolve this problem, several researchers [2], [3], [4], [5] propose and systematically study rough clustering....

[...]

••

TL;DR: An integrated model combining rough k-means clustering with single moving average prediction is evaluated on online network traffic data collected from the WIDE backbone network using the MSE, RMSE and MAPE metrics; results show that the integrated model can be an effective way to improve prediction accuracy.

Abstract: In the last decade, real-time audio and video services have gained much popularity and now occupy a large portion of the total network traffic on the Internet. As real-time services become mainstream, the demand for Quality of Service (QoS) is greater than ever, and it is necessary to use network resources to the fullest to satisfy it. To address this issue, we need a prediction model for network traffic as a basis for network management tasks such as congestion control and bandwidth allocation. In this paper, we propose an integrated model that combines Rough K-Means (RKM) clustering with a Single Moving Average (SMA) time series model to improve the prediction of packet loads in network traffic. The single moving average model is used to predict packet-load volume in real network traffic, while clustering granules obtained with rough k-means are used to analyze the network data of each year separately. The proposed model integrates the predictions of the conventional single moving average model with the centroids of the clusters obtained from rough k-means clustering. The model is evaluated using online network traffic data collected from the WIDE backbone network, and the MSE, RMSE and MAPE metrics are used to examine the results. The experimental results show that the integrated model can be an effective way to improve prediction accuracy with the help of rough k-means clustering. A comparison between the conventional prediction model and our integrated model is presented.
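The single moving average forecast is simply the mean of the last few observations. A minimal sketch follows; the combination step with a cluster centroid is a hypothetical illustration of the integration idea (the weight `alpha` and the function names are assumptions, not the paper's formulation):

```python
import numpy as np

def single_moving_average(series, window):
    """Single Moving Average: the forecast for the next step is the
    mean of the last `window` observations."""
    series = np.asarray(series, dtype=float)
    return series[-window:].mean()

def integrated_forecast(series, window, cluster_centroid, alpha=0.5):
    """Hypothetical integration sketch: blend the SMA forecast with a
    rough-cluster centroid (alpha is an assumed weight)."""
    return (alpha * single_moving_average(series, window)
            + (1 - alpha) * cluster_centroid)

traffic = [120, 130, 125, 140, 135]           # packet loads per interval
print(single_moving_average(traffic, 3))      # (125+140+135)/3 = 133.333...
```

Error metrics such as MSE, RMSE and MAPE are then computed between forecasts like these and the observed traffic volumes.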

2 citations

### Cites methods from "Analysis of rough and fuzzy clustering"

...The formula of Single Moving Average is as follows: E. Rough K-means Approach (RKM) The notion of rough set was presented by Pawlak [13], [14]....

[...]


••

01 Dec 2014

TL;DR: A continuous hidden Markov model (CHMM) is designed using new c-element feature vectors derived from language-specific speech corpora using the clusters formed by the fuzzy c-means clustering algorithm.

Abstract: We propose new features for language recognition using Gaussian computations. The new features are derived from traditional features such as Mel frequency cepstral coefficients (MFCC) using the fuzzy c-means clustering algorithm. MFCC feature vectors derived from a large corpus of all languages under consideration are grouped into c clusters using fuzzy c-means, and one Gaussian distribution is modeled for each cluster. In the training phase, new feature vectors are derived from language-specific speech corpora using these clusters. In the testing phase, a similar procedure extracts c-element feature vectors from an unknown speech utterance using the same c Gaussians, and these are evaluated against language-specific HMMs. Language a priori knowledge (the usefulness of each feature vector) is considered to improve recognition performance. A continuous hidden Markov model (CHMM) is designed using the new features. The languages in the OGI database are used for the study, and good performance is achieved.
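One plausible reading of the c-element feature construction is that each frame is scored against the c per-cluster Gaussians, giving a c-dimensional vector of log-likelihoods. The sketch below assumes diagonal-covariance Gaussians and illustrative function names; the paper's exact parameterization is not given on this page:

```python
import numpy as np

def gaussian_logpdf(x, mean, var):
    """Log density of a diagonal-covariance Gaussian (an assumed form
    for the per-cluster Gaussians)."""
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

def c_element_features(frames, means, variances):
    """Map each MFCC frame to a c-element vector of per-cluster
    Gaussian log-likelihoods, one Gaussian per fuzzy c-means cluster."""
    return np.array([[gaussian_logpdf(f, m, v)
                      for m, v in zip(means, variances)]
                     for f in frames])

# Toy example: c = 2 clusters in a 2-dimensional "MFCC" space
means = np.array([[0.0, 0.0], [5.0, 5.0]])
variances = np.ones((2, 2))
frames = np.array([[0.1, 0.0], [5.0, 5.1]])
feats = c_element_features(frames, means, variances)  # shape (2, 2)
```

Each row of `feats` is a c-element feature vector; a frame near a cluster's Gaussian scores highest in that cluster's component.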

2 citations

##### References


01 Jan 1967

TL;DR: The k-means algorithm partitions an N-dimensional population into k sets on the basis of a sample; the k-means concept is a generalization of the ordinary sample mean, and the procedure is shown to give partitions that are reasonably efficient in the sense of within-class variance.

Abstract: The main purpose of this paper is to describe a process for partitioning an N-dimensional population into k sets on the basis of a sample. The process, which is called 'k-means,' appears to give partitions which are reasonably efficient in the sense of within-class variance. That is, if p is the probability mass function for the population, \(S = \{S_1, S_2, \ldots, S_k\}\) is a partition of \(E^N\), and \(u_i\), \(i = 1, 2, \ldots, k\), is the conditional mean of p over the set \(S_i\), then \(w^2(S) = \sum_{i=1}^{k} \int_{S_i} |z - u_i|^2 \, dp(z)\) tends to be low for the partitions S generated by the method. We say 'tends to be low,' primarily because of intuitive considerations, corroborated to some extent by mathematical analysis and practical computational experience. Also, the k-means procedure is easily programmed and is computationally economical, so that it is feasible to process very large samples on a digital computer. Possible applications include methods for similarity grouping, nonlinear prediction, approximating multivariate distributions, and nonparametric tests for independence among several variables. In addition to suggesting practical classification methods, the study of k-means has proved to be theoretically interesting. The k-means concept represents a generalization of the ordinary sample mean, and one is naturally led to study the pertinent asymptotic behavior, the object being to establish some sort of law of large numbers for the k-means. This problem is sufficiently interesting, in fact, for us to devote a good portion of this paper to it. The k-means are defined in section 2.1, and the main results which have been obtained on the asymptotic behavior are given there. The rest of section 2 is devoted to the proofs of these results. Section 3 describes several specific possible applications, and reports some preliminary results from computer experiments conducted to explore the possibilities inherent in the k-means idea. The extension to general metric spaces is indicated briefly in section 4.
The original point of departure for the work described here was a series of problems in optimal classification (MacQueen [9]) which represented special
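The within-class variance criterion above has a direct empirical analogue for a finite sample: sum, over clusters, the squared distances of each point to its cluster mean. A minimal sketch (function name assumed):

```python
import numpy as np

def within_class_variance(X, labels, k):
    """Empirical analogue of MacQueen's w^2(S): for each cluster, sum
    the squared distances of its points to the cluster mean (the
    conditional mean u_i), then total over clusters."""
    total = 0.0
    for j in range(k):
        pts = X[labels == j]
        if len(pts):
            total += ((pts - pts.mean(axis=0)) ** 2).sum()
    return total

X = np.array([[0.0], [2.0], [10.0], [12.0]])
labels = np.array([0, 0, 1, 1])
# cluster means are 1 and 11; each cluster contributes 1 + 1 = 2
print(within_class_variance(X, labels, 2))  # 4.0
```

k-means iteratively reassigns points to the nearest mean and recomputes the means, which never increases this quantity.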

24,320 citations

•

31 Jul 1981

Pattern Recognition with Fuzzy Objective Function Algorithms

15,662 citations

••

01 Jan 1973

TL;DR: Two fuzzy versions of the k-means optimal least-squared-error partitioning problem are formulated for finite subsets X of a general inner product space; in both cases, the extremizing solutions are shown to be fixed points of a certain operator T on the class of fuzzy k-partitions of X, and simple iteration of T provides an algorithm with the descent property relative to the least-squared-error criterion function.

Abstract: Two fuzzy versions of the k-means optimal, least-squared-error partitioning problem are formulated for finite subsets X of a general inner product space. In both cases, the extremizing solutions are shown to be fixed points of a certain operator T on the class of fuzzy k-partitions of X, and simple iteration of T provides an algorithm which has the descent property relative to the least-squared-error criterion function. In the first case, the range of T consists largely of ordinary (i.e. non-fuzzy) partitions of X and the associated iteration scheme is essentially the well-known ISODATA process of Ball and Hall. However, in the second case, the range of T consists mainly of fuzzy partitions and the associated algorithm is new; when X consists of k compact well separated (CWS) clusters X_i, this algorithm generates a limiting partition with membership functions which closely approximate the characteristic functions of the clusters X_i. However, when X is not the union of k CWS clusters, the limi...
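One iteration of the operator T in the fuzzy case is the familiar fuzzy c-means update: recompute centroids from memberships, then memberships from distances. A minimal sketch (the fuzzifier m = 2 is the conventional choice, assumed here):

```python
import numpy as np

def fcm_step(X, U, m=2.0):
    """One iteration of the fuzzy k-means operator T:
    1) centroids as membership-weighted means,
    2) memberships from inverse-distance ratios (fuzzifier m)."""
    Um = U ** m
    centroids = (Um.T @ X) / Um.sum(axis=0)[:, None]
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    d2 = np.maximum(d2, 1e-12)                 # guard against zero distance
    inv = d2 ** (-1.0 / (m - 1.0))
    return centroids, inv / inv.sum(axis=1, keepdims=True)

# Two compact, well-separated groups: memberships approach 0/1
rng = np.random.default_rng(0)
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 0.0], [5.1, 0.0]])
U = rng.random((4, 2))
U = U / U.sum(axis=1, keepdims=True)           # a random fuzzy 2-partition
for _ in range(50):
    centroids, U = fcm_step(X, U)
```

After iteration on CWS clusters like these, each row of `U` is nearly a characteristic function, matching the abstract's limiting-partition claim.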

5,787 citations