Author

Li Zhang

Bio: Li Zhang is an academic researcher from Liaoning University. The author has contributed to research in topics: Canopy clustering algorithm & k-medians clustering. The author has an h-index of 1 and has co-authored 1 publication receiving 33 citations.

Papers
Journal Article
TL;DR: In this article, an interval number is introduced for attribute weighting in the weighted fuzzy c-means (WFCM) clustering, and it is illustrated that interval weighting can obtain appropriate weights more easily from the viewpoint of geometric probability.
Abstract: The fuzzy c-means (FCM) algorithm is a widely applied clustering technique, but the implicit assumption that each attribute of the object data has equal importance affects the clustering performance. At present, attribute weighted fuzzy clustering has become a very active area of research, and numerous approaches that develop numerical weights have been combined into fuzzy clustering. In this paper, interval numbers are introduced for attribute weighting in the weighted fuzzy c-means (WFCM) clustering, and it is illustrated that interval weighting can obtain appropriate weights more easily from the viewpoint of geometric probability. Moreover, a genetic heuristic strategy for attribute weight searching is proposed to guide the alternating optimization (AO) of WFCM, so that improved attribute weights in interval-constrained ranges and a reasonable data partition can be obtained simultaneously. The experimental results demonstrate that the proposed algorithm is superior in clustering performance. They reveal that interval weighted clustering can act as an optimization operator on top of traditional numerical weighted clustering, and that the effects of interval weight perturbation on clustering performance can be decreased.
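
The alternating optimization described in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the genetic weight search is omitted, the weights stay fixed during clustering, and each weight is assumed to multiply the squared per-attribute difference directly. The names `clip_to_intervals`, `weighted_sq_dist`, and `wfcm`, the toy data, and the interval bounds are all hypothetical.

```python
# Sketch of attribute-weighted fuzzy c-means (WFCM) with interval-constrained
# weights. Assumptions (not from the paper): weights are fixed during the
# alternating optimization and scale the squared per-attribute differences.

def clip_to_intervals(weights, intervals):
    """Clamp each candidate weight into its [low, high] interval."""
    return [min(max(w, lo), hi) for w, (lo, hi) in zip(weights, intervals)]

def weighted_sq_dist(x, v, weights):
    """Attribute-weighted squared Euclidean distance between x and v."""
    return sum(w * (a - b) ** 2 for a, b, w in zip(x, v, weights))

def wfcm(data, c, weights, m=2.0, max_iter=100, tol=1e-6):
    """Alternate membership and center updates until the centers settle."""
    n, dim = len(data), len(data[0])
    # Hypothetical initialization: the first c points (assumed distinct).
    centers = [list(p) for p in data[:c]]
    u = [[0.0] * n for _ in range(c)]
    for _ in range(max_iter):
        # Membership update: u_ik is proportional to d_ik^(-2/(m-1)).
        for k, x in enumerate(data):
            d2 = [weighted_sq_dist(x, v, weights) for v in centers]
            zero = [i for i, d in enumerate(d2) if d == 0.0]
            if zero:
                # The point coincides with one or more centers.
                for i in range(c):
                    u[i][k] = 1.0 / len(zero) if i in zero else 0.0
            else:
                inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
                total = sum(inv)
                for i in range(c):
                    u[i][k] = inv[i] / total
        # Center update: mean of the data weighted by u_ik^m.
        new_centers = []
        for i in range(c):
            um = [u[i][k] ** m for k in range(n)]
            total = sum(um)
            new_centers.append(
                [sum(um[k] * data[k][j] for k in range(n)) / total
                 for j in range(dim)]
            )
        shift = max(abs(a - b)
                    for vn, vo in zip(new_centers, centers)
                    for a, b in zip(vn, vo))
        centers = new_centers
        if shift < tol:
            break
    return u, centers

# Toy data: attribute 0 separates two groups, attribute 1 is noise.
data = [(0.0, 5.0), (5.0, -4.0), (0.2, -3.0),
        (5.2, 5.0), (0.1, 4.0), (4.9, -2.0)]
# Hypothetical intervals: attribute 0 matters, attribute 1 barely does.
intervals = [(0.8, 1.0), (0.0, 0.05)]
weights = clip_to_intervals([0.9, 0.5], intervals)
u, centers = wfcm(data, 2, weights)
labels = [max(range(2), key=lambda i: u[i][k]) for k in range(len(data))]
print(weights, labels)
```

In this sketch the interval constraint acts only as a clamp on a candidate weight vector. A search procedure such as the genetic strategy mentioned in the abstract would propose candidates inside those intervals and score each resulting partition.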

33 citations


Cited by
Journal Article
TL;DR: Two new hybrids of FCM and improved self-adaptive PSO are presented, which combine FCM with a recent version of PSO, the IDPSO, which adjusts PSO parameters dynamically during execution, aiming to provide better balance between exploration and exploitation, avoiding falling into local minima quickly and thereby obtaining better solutions.
Abstract: We present two new hybrids of FCM and improved self-adaptive PSO. The methods are based on the FCM-PSO algorithm. We use FCM to initialize one particle to achieve better results in fewer iterations. The new methods are compared to FCM-PSO using many real and synthetic datasets. The proposed methods consistently outperform FCM-PSO in three evaluation metrics. Fuzzy clustering has become an important research field with many applications to real world problems. Among fuzzy clustering methods, fuzzy c-means (FCM) is one of the best known for its simplicity and efficiency, although it shows some weaknesses, particularly its tendency to fall into local minima. To tackle this shortcoming, many optimization-based fuzzy clustering methods have been proposed in the literature. Some of these methods are based solely on a metaheuristic optimization, such as particle swarm optimization (PSO), whereas others are hybrid methods that combine a metaheuristic with a traditional partitional clustering method such as FCM. It is demonstrated in the literature that methods that hybridize PSO and FCM for clustering have an improved accuracy over traditional partitional clustering approaches. On the other hand, PSO-based clustering methods have poor execution time in comparison to partitional clustering techniques. Another problem with PSO-based clustering is that the current PSO algorithms require tuning a range of parameters before they are able to find good solutions. In this paper we introduce two hybrid methods for fuzzy clustering that aim to deal with these shortcomings. The methods, referred to as FCM-IDPSO and FCM2-IDPSO, combine FCM with a recent version of PSO, the IDPSO, which adjusts PSO parameters dynamically during execution, aiming to provide better balance between exploration and exploitation, avoiding falling into local minima quickly and thereby obtaining better solutions.
Experiments using two synthetic data sets and eight real-world data sets are reported and discussed. The experiments considered the proposed methods as well as some recent PSO-based fuzzy clustering methods. The results show that the methods introduced in this paper provide comparable or in many cases better solutions than the other methods considered in the comparison and were much faster than the other state of the art PSO-based methods.

128 citations

Journal Article
TL;DR: This paper proposes a novel clustering model, in which probabilistic information granules of missing values are incorporated into the Fuzzy C-Means clustering of incomplete data by involving the maximum likelihood criterion.
Abstract: Missing values are a common phenomenon when dealing with real-world data sets. Analysis of incomplete data sets has become an active area of research. In this paper, we focus on the problem of clustering incomplete data, which is intended to introduce some prior distribution information of the missing values into the algorithm of fuzzy clustering. First, non-parametric hypothesis testing is employed to describe the missing values adhering to a certain Gaussian distribution as probabilistic information granules based on the nearest neighbors of incomplete data. Second, we propose a novel clustering model, in which probabilistic information granules of missing values are incorporated into the Fuzzy C-Means clustering of incomplete data by involving the maximum likelihood criterion. Third, the clustering model is optimized by using a tri-level alternating optimization utilizing the method of Lagrange multipliers. The convergence and the time complexity of the clustering algorithm are also discussed. The experiments reported both on synthetic and real-world data sets demonstrate that the proposed approach can effectively realize clustering of incomplete data.

95 citations

Journal Article
TL;DR: A novel user clustering approach based on Quantum-behaved Particle Swarm Optimization (QPSO) is proposed for collaborative filtering based recommender systems, and the evaluation results show the usefulness of the generated recommendations and the users’ satisfaction with the proposed recommendation approach.

86 citations

Journal Article
TL;DR: A new bio-inspired clustering ensemble through aggregating swarm intelligence and fuzzy clustering models for user-based collaborative filtering is presented and the obtained results illustrate the advantageous performance of the proposed approach over its peer works of recent times.
Abstract: In recent years, internet technologies and their rapid growth have created a paradigm of digital services. In this new digital world, users suffer from the information overload problem, and recommender systems are widely used as a decision support tool to address this issue. Though recommender systems are a proven personalization tool, the need to improve their recommendation ability and efficiency is high. Among the various recommendation generation mechanisms available, collaborative filtering-based approaches are widely utilized to produce similarity-based recommendations. To improve the recommendation generation process of collaborative filtering approaches, clustering techniques are incorporated for grouping users. Though many traditional clustering mechanisms are employed for user clustering in existing works, the utilization of bio-inspired clustering techniques needs to be explored for the generation of optimal recommendations. This article presents a new bio-inspired clustering ensemble through aggregating swarm intelligence and fuzzy clustering models for user-based collaborative filtering. The presented recommendation approaches have been evaluated on the real-world large-scale datasets of Yelp and TripAdvisor for recommendation accuracy and stability through standard evaluation metrics. The obtained results illustrate the advantageous performance of the proposed approach over its peer works of recent times.

85 citations

Journal Article
TL;DR: It is demonstrated that the proposed IT2 FS based approach gives better clustering results for uncertain gene expression datasets and is scalable to large gene expression datasets.

58 citations