Author

Long Chen

Bio: Long Chen is an academic researcher from Nanjing University of Posts and Telecommunications. The author has contributed to research on topics including the canopy clustering algorithm and k-medians clustering. The author has an h-index of 2 and has co-authored 2 publications receiving 14 citations.

Papers
Journal ArticleDOI
TL;DR: A hybrid imbalanced measure of distance and density for rough c-means clustering is defined, and a modified rough c-means clustering algorithm is presented in this paper.

16 citations

Journal ArticleDOI
TL;DR: An improved rough k-means clustering algorithm based on a variable weighted distance measure is presented, and comparative experimental results on real-world data from the UCI repository demonstrate the validity of the proposed algorithm.
Abstract: The rough k-means algorithm has been shown to provide a reasonable set of lower and upper bounds for a given dataset. With the concepts of lower and upper approximation sets, rough k-means clustering and its emerging derivatives have become valid algorithms for clustering vague information. However, most available algorithms ignore the differences in the distances between data objects and cluster centers when computing the new mean for each cluster. To solve this issue, an improved rough k-means clustering algorithm based on a variable weighted distance measure is presented in this article. Comparative experimental results on real-world data from the UCI repository demonstrate the validity of the proposed algorithm.
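For context, the sketch below shows the classical rough k-means update in the style of Lingras and West: each cluster keeps a lower approximation (objects that certainly belong) and an upper approximation (objects that possibly belong), and the new center mixes the lower-approximation mean with the boundary mean. This is a minimal illustration only, not the weighted-distance variant proposed in the paper; the weights w_lower/w_upper and the tie threshold epsilon are illustrative assumptions.

```python
import numpy as np

def rough_kmeans(X, k, w_lower=0.7, w_upper=0.3, epsilon=1.2,
                 max_iter=100, seed=0):
    """Minimal rough k-means sketch (Lingras & West style).

    Objects whose two closest centers are within a ratio `epsilon`
    of each other are considered ambiguous and are placed only in
    the upper approximations (boundaries) of those clusters.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]

    for _ in range(max_iter):
        lower = [[] for _ in range(k)]
        upper = [[] for _ in range(k)]

        for x in X:
            d = np.linalg.norm(centers - x, axis=1)
            order = np.argsort(d)
            nearest = order[0]
            # Clusters nearly as close as the nearest one -> ambiguous object.
            ties = [j for j in order[1:] if d[j] / max(d[nearest], 1e-12) <= epsilon]
            if ties:                      # boundary object: upper approximations only
                for j in [nearest] + ties:
                    upper[j].append(x)
            else:                         # certain object: lower (and upper) approximation
                lower[nearest].append(x)
                upper[nearest].append(x)

        new_centers = []
        for j in range(k):
            low, up = np.array(lower[j]), np.array(upper[j])
            boundary = len(up) - len(low)
            if len(low) and boundary:     # mix lower-approximation mean and boundary mean
                bnd_mean = (up.sum(axis=0) - low.sum(axis=0)) / boundary
                new_centers.append(w_lower * low.mean(axis=0) + w_upper * bnd_mean)
            elif len(low):
                new_centers.append(low.mean(axis=0))
            elif len(up):
                new_centers.append(up.mean(axis=0))
            else:
                new_centers.append(centers[j])   # empty cluster: keep old center
        new_centers = np.array(new_centers)

        if np.allclose(new_centers, centers):
            break
        centers = new_centers

    return centers, lower, upper
```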

2 citations


Cited by
Journal ArticleDOI
TL;DR: This survey will directly help researchers understand the research developments of MSIF under RST and provide a state-of-the-art understanding of the specialized literature, as well as clarify the approaches and applications of MSIF in the RST research community.

105 citations

Journal ArticleDOI
TL;DR: A hierarchical cluster ensemble model based on knowledge granulation is proposed, aiming to provide a new way to deal with the cluster ensemble problem together with an ensemble-learning application of knowledge granulation.
Abstract: Cluster ensembles have been shown to be very effective in unsupervised classification learning by generating a large pool of different clustering solutions and then combining them into a final decision. However, the task becomes more difficult due to inherent complexities among base clustering results, such as uncertainty, vagueness and overlap. Granular computing is one of the fastest growing information-processing paradigms in the domains of computational intelligence and human-centric systems. As a core part of granular computing, rough set theory, which deals with inexact, uncertain, or vague information, has been widely applied in machine learning and knowledge discovery in recent years. From these perspectives, this paper proposes a hierarchical cluster ensemble model based on knowledge granulation, with the aim of providing a new way to deal with the cluster ensemble problem together with an ensemble-learning application of knowledge granulation. A novel rough distance is introduced to measure the dissimilarity between base partitions, and the notion of knowledge granulation is improved to measure the agglomeration degree of a given granule. Furthermore, a novel objective function for cluster ensembles is defined and the corresponding inferences are made. A hierarchical cluster ensemble algorithm based on knowledge granulation is designed. Experimental results on real-world data sets demonstrate the effectiveness of the proposed method for producing better cluster ensembles.
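As a rough orientation to the cluster-ensemble setting described above, the sketch below combines several base partitions through a standard co-association (evidence accumulation) matrix followed by average-linkage hierarchical clustering. This is a generic baseline, not the paper's knowledge-granulation model or its rough distance; the function and parameter names are illustrative assumptions.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def coassociation_ensemble(base_labels, k):
    """Combine several base partitions into one consensus partition.

    base_labels: list of 1-D integer label arrays, one per base clustering,
    all over the same n objects. The co-association matrix records how often
    two objects fall into the same base cluster; 1 minus that frequency is
    used as a dissimilarity for average-linkage hierarchical clustering.
    """
    base_labels = [np.asarray(labels) for labels in base_labels]
    n = len(base_labels[0])
    coassoc = np.zeros((n, n))
    for labels in base_labels:
        coassoc += (labels[:, None] == labels[None, :]).astype(float)
    coassoc /= len(base_labels)

    dist = 1.0 - coassoc
    np.fill_diagonal(dist, 0.0)
    Z = linkage(squareform(dist, checks=False), method="average")
    return fcluster(Z, t=k, criterion="maxclust")
```

For example, feeding this function the label vectors produced by several k-means runs with different seeds yields a single consensus labeling of the n objects.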

58 citations

Journal ArticleDOI
TL;DR: To mitigate the adverse effects of imbalanced clusters and decrease the computational cost, an interval type-2 fuzzy local measure for RKM clustering is proposed, on the basis of which a novel RKM clustering algorithm is developed that specifically gives due consideration to imbalanced clusters.
Abstract: Rough K-Means (RKM) is an efficient clustering algorithm for overlapping datasets and has attracted increasing attention in recent years. RKM algorithms focus mainly on further describing uncertain objects located in boundary regions in order to improve performance. However, most available RKM algorithms fail to account for the influence of imbalanced clusters, that is, imbalanced spatial distributions (cluster densities) and differing cluster sizes (object-count ratios). This paper seeks to address this deficiency and examines in detail some adverse effects caused by imbalanced clusters. To mitigate these adverse effects and decrease the computational cost, an interval type-2 fuzzy local measure for RKM clustering is proposed, on the basis of which a novel RKM clustering algorithm is developed that specifically gives due consideration to imbalanced clusters. The effectiveness and superiority of this algorithm are demonstrated through simulation and experimental analysis.
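One simple way to expose the density imbalance referred to in the abstract is to estimate a local density for each object from its nearest-neighbor distances and use it as a per-object weight when recomputing cluster centers. The sketch below is an illustrative assumption only; it is not the interval type-2 fuzzy local measure proposed in the paper, and the function and parameter names are hypothetical.

```python
import numpy as np

def local_density_weights(X, n_neighbors=5):
    """Illustrative local-density weights for imbalanced clusters.

    Objects in sparse regions receive smaller weights, so that a small,
    dense cluster is less easily absorbed by a large, sparse neighbor
    when cluster centers are recomputed.
    """
    X = np.asarray(X, dtype=float)
    # Pairwise distances; average distance to the n_neighbors nearest objects.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    knn_avg = np.sort(d, axis=1)[:, :n_neighbors].mean(axis=1)
    density = 1.0 / (knn_avg + 1e-12)
    return density / density.max()        # normalized to (0, 1]
```

Such weights could multiply each object's contribution to its cluster mean, which is one crude way to make the update less sensitive to differences in cluster density and size.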

34 citations

Book ChapterDOI
01 Jan 2017
TL;DR: The review starts with RST in the context of data preprocessing as well as the generation of both descriptive and predictive knowledge via decision rule induction, association rule mining and clustering.
Abstract: This chapter emphasizes the role played by rough set theory (RST) within the broad field of Machine Learning (ML). As a sound data analysis and knowledge discovery paradigm, RST has much to offer to the ML community. We surveyed the existing literature and reported on the most relevant RST theoretical developments and applications in this area. The review starts with RST in the context of data preprocessing (discretization, feature selection, instance selection and meta-learning) as well as the generation of both descriptive and predictive knowledge via decision rule induction, association rule mining and clustering. Afterward, we examined several special ML scenarios in which RST has recently been introduced, such as imbalanced classification, multi-label classification, dynamic/incremental learning, Big Data analysis and cost-sensitive learning.

31 citations

Journal ArticleDOI
TL;DR: It is shown how all these clustering approaches are able to manage, in different ways, the uncertainty associated with the two components of the Informational Paradigm, i.e., the Empirical and the Theoretical Information.

28 citations