Proceedings Article

Data clustering using an advanced PSO variant

TL;DR: Applying the Subtractive Clustering methodology at the start of a PSO approach improves the clustering process by suggesting good initial cluster centers and the number of clusters in advance; an adaptive inertia weight factor and a boundary restriction strategy then speed up the subsequent clustering.
Abstract: This paper proposes an advanced PSO variant that uses the Subtractive Clustering methodology for data clustering. The algorithm is intended to provide fast, efficient, and appropriate solutions to complex clustering problems. It addresses basic challenges of existing PSO-based clustering techniques, including the need for prior knowledge of initial cluster centers, the dead unit problem, premature convergence to local optima, and stagnation. The results indicate that applying Subtractive Clustering at the start of a PSO approach improves the clustering process by suggesting good initial cluster centers and the number of clusters in advance, and that an adaptive inertia weight factor and a boundary restriction strategy then speed up the subsequent clustering. The proposed algorithm is tested against well-known clustering techniques on three datasets, where it shows better or comparable performance with respect to clustering accuracy and convergence rate.
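The initialization idea in the abstract can be illustrated with a minimal sketch of subtractive clustering in its commonly cited form (Chiu's density-based method). The listing above does not reproduce the paper's own equations, so the function name `subtractive_centers`, the parameters `ra`, `rb_factor`, and `stop_ratio`, and their default values are assumptions chosen for illustration, not the authors' implementation.

```python
import math

def subtractive_centers(points, ra=1.0, rb_factor=1.5, stop_ratio=0.15):
    """Suggest cluster centers (and hence a cluster count) from point density.

    ra, rb_factor and stop_ratio are illustrative defaults (assumptions),
    not values taken from the paper.
    """
    alpha = 4.0 / ra ** 2                  # neighbourhood radius for density
    beta = 4.0 / (rb_factor * ra) ** 2     # wider radius for density reduction

    def sqdist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    # Density of each point: large when many other points lie close to it.
    density = [sum(math.exp(-alpha * sqdist(p, q)) for q in points)
               for p in points]
    first_peak = max(density)
    centers = []
    while True:
        best = max(range(len(points)), key=density.__getitem__)
        peak = density[best]
        # Stop once the remaining peak is small relative to the first one.
        if peak < stop_ratio * first_peak:
            break
        centers.append(points[best])
        # Subtract the new center's influence so its neighbours are not re-picked.
        density = [d - peak * math.exp(-beta * sqdist(p, points[best]))
                   for p, d in zip(points, density)]
    return centers

# Two well-separated groups of three points each.
data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
centers = subtractive_centers(data)
```

The returned list gives both candidate initial centers and their count, which is the information a PSO-based clustering stage would otherwise need to be given in advance.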
Citations
Journal Article
TL;DR: A systematic mapping review on recent investigations of swarm-inspired algorithms to tackle clustering problems and provides an overview of how to apply the swarm methods together with a critical analysis of the current and future perspectives in the field.

59 citations

Posted Content
TL;DR: The different challenges associated with multidimensional data clustering and the scope of research on optimizing clustering problems using PSO are described, and a strategy to use a hybrid PSO variant for clustering multidimensional numerical, text, and image data is proposed.
Abstract: Optimization is a mathematical technique that finds the maxima or minima of a function of concern in some realistic region. Many competing optimization techniques have been proposed. Particle Swarm Optimization (PSO) is a new, advanced, and powerful optimization methodology that performs well empirically on several optimization problems. It is a widely used Swarm Intelligence (SI) inspired optimization algorithm for finding the global optimal solution in a multifaceted search region. Data clustering is a challenging real-world application that attracts research in a variety of fields. The applicability of different PSO variants to data clustering has been studied in the literature, and the analyzed research shows that PSO variants give poor results for multidimensional data. This paper describes the different challenges associated with multidimensional data clustering and the scope of research on optimizing clustering problems using PSO. We also propose a strategy to use a hybrid PSO variant for clustering multidimensional numerical, text, and image data.

4 citations

Proceedings Article
01 Aug 2017
TL;DR: This paper modifies the clustering scheme, using Universal Networking Language (UNL) generative feature vector, Subtractive Clustering approach combined with Boundary Restricted Particle Swarm Optimization (BR-APSO) algorithm for efficient document clustering.
Abstract: The World Wide Web, the largest shared information source, holds a remarkable number of text documents, which makes document clustering an attractive area of research. Document clustering facilitates automatic document organization, making it easier to navigate, summarize, and retrieve information effectively. Several techniques have been proposed to attain the basic objective of document clustering, i.e. clustered documents should have a high intra-cluster similarity rate and a low similarity rate to other clusters. Document clustering techniques fall into two basic categories: partitional and hierarchical. Partitional clustering techniques are extremely popular in the document clustering area. K-means inspired algorithms are the most efficient and fastest partitional clustering algorithms; they seek to divide a document collection into separate groups so as to optimize a clustering criterion. Clustering techniques frequently suffer from scalability, high dimensionality, and inaccurate cluster label issues. This paper modifies the clustering scheme, using a Universal Networking Language (UNL) generative feature vector and a Subtractive Clustering approach combined with the Boundary Restricted Particle Swarm Optimization (BR-APSO) algorithm for efficient document clustering. The proposed method compares and analyses the existing document clustering methodologies and improves entropy and purity rates.

3 citations


Cites background or methods from "Data clustering using an advanced P..."

  • ...Also, the use of boundary restriction strategy is a useful criterion to convince particles not to travel ahead of the search area makes sure the avoidance of continuous premature convergence to local optima [1] [4]....

    [...]

  • ..., xn}, calculation of density measure at data item xi can be done using equation (7) [1][5]....

    [...]

  • ...Calculate the fitness value for each particle using equation (9) [1][2][4][5]....

    [...]

  • ...For each particle, using boundary restriction strategy, update velocity and position using equation (11) & (12) [1][4][21][22]....

    [...]

  • ...The BR-APSO algorithm using Subtractive Clustering approach is applied to document vector created [1]....

    [...]
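The velocity and position update with boundary restriction mentioned in the excerpts above can be sketched as follows. The cited equations (11) and (12) are not reproduced in this listing, so this uses the standard PSO update; clamping positions to `[lo, hi]` with a velocity reset and the linearly decreasing `linear_inertia` schedule are stand-in assumptions, and the paper's actual boundary restriction and adaptive inertia rules may differ. All names and default values are hypothetical.

```python
import random

def pso_step(x, v, pbest, gbest, w, lo, hi, c1=2.0, c2=2.0, rng=random):
    """One standard PSO update for a single particle, restricted to [lo, hi].

    The clamp-and-reset boundary rule is an assumption made for illustration.
    """
    new_x, new_v = [], []
    for xi, vi, pi, gi in zip(x, v, pbest, gbest):
        r1, r2 = rng.random(), rng.random()
        # Inertia term + cognitive pull (own best) + social pull (swarm best).
        vi = w * vi + c1 * r1 * (pi - xi) + c2 * r2 * (gi - xi)
        xi = xi + vi
        if xi < lo or xi > hi:
            # Keep the particle inside the search area and stop its motion
            # along this dimension.
            xi = min(max(xi, lo), hi)
            vi = 0.0
        new_x.append(xi)
        new_v.append(vi)
    return new_x, new_v

def linear_inertia(t, t_max, w_start=0.9, w_end=0.4):
    """Inertia weight that decreases linearly from w_start to w_end (assumed schedule)."""
    return w_start - (w_start - w_end) * t / t_max
```

A large inertia weight early on favours exploration, while a smaller one later favours refinement around the best positions found so far.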

Proceedings Article
01 Sep 2019
TL;DR: A superior variant of Particle Swarm Optimization (PSO) algorithm based on one of the density based clustering methodologies i.e. Subtractive Clustering (SC) is used, intended to cluster image datasets for fine classification of images.
Abstract: Clustering plays an important role in almost every field of real time applications, and whenever optimization is the core concern, good quality clustering is crucial. Different optimization techniques have been proposed for clustering, with applications in areas like Image Processing and Pattern Recognition, Finance, Communication Networks, Biological Sequences, etc. This paper uses a superior variant of the Particle Swarm Optimization (PSO) algorithm based on one of the density based clustering methodologies, i.e. Subtractive Clustering (SC). The implementation of the algorithm proved its excellence in numerical and text data clustering by addressing several issues identified in the literature survey of PSO based clustering techniques. The algorithm is intended to cluster image datasets for fine classification of images. The performance of the algorithm is evaluated against K-means, K-PSO, and Subtractive-PSO over synthesized and MRI image datasets with respect to quantization error and intra- and inter-cluster distances. The obtained results showed a better or comparable performance.

Cites background from "Data clustering using an advanced P..."

  • ...Nodes are utilized to speak to ideas and arcs speak to connection between these ideas [6][26]....

    [...]

References
Dissertation
01 Jan 2002
TL;DR: This thesis presents a theoretical model that can be used to describe the long-term behaviour of the Particle Swarm Optimiser and results are presented to support the theoretical properties predicted by the various models, using synthetic benchmark functions to investigate specific properties.
Abstract: Many scientific, engineering and economic problems involve the optimisation of a set of parameters. These problems include examples like minimising the losses in a power grid by finding the optimal configuration of the components, or training a neural network to recognise images of people's faces. Numerous optimisation algorithms have been proposed to solve these problems, with varying degrees of success. The Particle Swarm Optimiser (PSO) is a relatively new technique that has been empirically shown to perform well on many of these optimisation problems. This thesis presents a theoretical model that can be used to describe the long-term behaviour of the algorithm. An enhanced version of the Particle Swarm Optimiser is constructed and shown to have guaranteed convergence on local minima. This algorithm is extended further, resulting in an algorithm with guaranteed convergence on global minima. A model for constructing cooperative PSO algorithms is developed, resulting in the introduction of two new PSO-based algorithms. Empirical results are presented to support the theoretical properties predicted by the various models, using synthetic benchmark functions to investigate specific properties. The various PSO-based algorithms are then applied to the task of training neural networks, corroborating the results obtained on the synthetic benchmark functions.

1,498 citations

Journal Article
TL;DR: The main idea of the principle of PSO is presented; the advantages and the shortcomings are summarized; and some kinds of improved versions of PSO and research situation are presented.
Abstract: Particle swarm optimization is a heuristic global optimization method and also an optimization algorithm, which is based on swarm intelligence. It comes from the research on the bird and fish flock movement behavior. The algorithm is widely used and rapidly developed for its easy implementation and few particles required to be tuned. The main idea of the principle of PSO is presented; the advantages and the shortcomings are summarized. At last this paper presents some kinds of improved versions of PSO and research situation, and the future research issues are also given.

699 citations


"Data clustering using an advanced P..." refers background in this paper

  • ...Particle Swarm Optimization (PSO) is a popular optimization technique of Swarm Intelligence (SI) domain, which studies collective behavior observed in biological elements like birds flocking, fish schooling, etc [21][16][17]....

    [...]

  • ...For each particle, velocity and position is updated at each time interval according to (3) and (4) respectively [6][21][22]....

    [...]

Proceedings Article
08 Jun 2005
TL;DR: This paper presents a particle swarm optimization (PSO) document clustering algorithm, which performs a globalized search in the entire solution space and shows that the hybrid PSO algorithm can generate more compact clustering results than the K-means algorithm.
Abstract: Fast and high-quality document clustering algorithms play an important role in effectively navigating, summarizing, and organizing information. Recent studies have shown that partitional clustering algorithms are more suitable for clustering large datasets. However, the K-means algorithm, the most commonly used partitional clustering algorithm, can only generate a local optimal solution. In this paper, we present a particle swarm optimization (PSO) document clustering algorithm. Contrary to the localized searching of the K-means algorithm, the PSO clustering algorithm performs a globalized search in the entire solution space. In the experiments we conducted, we applied the PSO, K-means and hybrid PSO clustering algorithm on four different text document datasets. The number of documents in the datasets ranges from 204 to over 800, and the number of terms ranges from over 5000 to over 7000. The results illustrate that the hybrid PSO algorithm can generate more compact clustering results than the K-means algorithm.

336 citations

Journal Article
TL;DR: This paper presents a literature survey on the PSO algorithm and its variants to clustering high-dimensional data and an attempt is made to provide a guide for the researchers who are working in the area of PSO and high-dimensional data clustering.
Abstract: Data clustering is one of the most popular techniques in data mining. It is a process of partitioning an unlabeled dataset into groups, where each group contains objects which are similar to each other with respect to a certain similarity measure and different from those of other groups. Clustering high-dimensional data is the cluster analysis of data which have anywhere from a few dozen to many thousands of dimensions. Such high-dimensional data spaces are often encountered in areas such as medicine, bioinformatics, biology, recommendation systems and the clustering of text documents. Many algorithms for large data sets have been proposed in the literature using different techniques. However, conventional algorithms have some shortcomings such as the slowness of their convergence and their sensitivity to initialization values. Particle Swarm Optimization (PSO) is a population-based globalized search algorithm that uses the principles of the social behavior of swarms. PSO produces better results in complicated and multi-peak problems. This paper presents a literature survey on the PSO algorithm and its variants to clustering high-dimensional data. An attempt is made to provide a guide for the researchers who are working in the area of PSO and high-dimensional data clustering.

267 citations


"Data clustering using an advanced P..." refers background in this paper

  • ...Analyzed Basic PSO algorithm [15][19][23][24][25]...

    [...]

  • ...PSO has its own importance in dealing with clustering multifaceted issues, hence considering clustering as a complex computational problem gave us a new scope of innovative research in PSO (SI) as well as data clustering [1][3][15]....

    [...]

Journal Article
TL;DR: An attempt is made to provide a guide for the researchers who are working in the area of PSO and data clustering to produce better results in complicated and multi-peak problems.
Abstract: Data clustering is one of the most popular techniques in data mining. It is a method of grouping data into clusters, in which each cluster must have data of great similarity and high dissimilarity with other cluster data. The most popular clustering algorithm K-mean and other classical algorithms suffer from disadvantages of initial centroid selection, local optima, low convergence rate problem etc. Particle Swarm Optimization (PSO) is a population based globalized search algorithm that mimics the capability (cognitive and social behavior) of swarms. PSO produces better results in complicated and multi-peak problems. This paper presents a literature survey on the PSO application in data clustering. PSO variants are also described in this paper. An attempt is made to provide a guide for the researchers who are working in the area of PSO and data clustering.

235 citations


Additional excerpts

  • ...4 for guaranteed global convergence using (5)[5][8][9][23][25],...

    [...]

  • ...Analyzed Basic PSO algorithm [15][19][23][24][25]...

    [...]