Home
/
Authors
/
Malay K. Pakhira

Author

Malay K. Pakhira

Bio: Malay K. Pakhira is an academic researcher from Kalyani Government Engineering College. The author has contributed to research in topics: Cluster analysis & Fuzzy clustering. The author has an hindex of 7, co-authored 20 publications receiving 988 citations.

Papers

PDF

Open Access

More filters

Journal Article•DOI•

Validity index for crisp and fuzzy clusters

[...]

Malay K. Pakhira¹, Sanghamitra Bandyopadhyay², Ujjwal Maulik¹•Institutions (2)

Kalyani Government Engineering College¹, Indian Statistical Institute²

01 Mar 2004-Pattern Recognition

TL;DR: A cluster validity index and its fuzzification is described, which can provide a measure of goodness of clustering on different partitions of a data set, and results demonstrating the superiority of the PBM-index in appropriately determining the number of clusters are provided.

...read moreread less

710 citations

Journal Article•DOI•

A study of some fuzzy cluster validity indices, genetic clustering and application to pixel classification

[...]

Malay K. Pakhira¹, Sanghamitra Bandyopadhyay², Ujjwal Maulik³•Institutions (3)

Kalyani Government Engineering College¹, Indian Statistical Institute², Jadavpur University³

01 Oct 2005-Fuzzy Sets and Systems

TL;DR: The effectiveness of the PBMF index as the optimization criterion along with a genetic fuzzy partitioning technique is demonstrated on a number of artificial and real data sets including a remote sensing image of the city of Kolkata.

...read moreread less

199 citations

Journal Article•DOI•

Clustering using simulated annealing with probabilistic redistribution

[...]

Sanghamitra Bandyopadhyay¹, Ujjwal Maulik², Malay K. Pakhira²•Institutions (2)

Indian Statistical Institute¹, Kalyani Government Engineering College²

01 Mar 2001-International Journal of Pattern Recognition and Artificial Intelligence

TL;DR: An efficient partitional clustering technique, called SAKM-clustering, that integrates the power of simulated annealing for obtaining minimum energy configuration, and the searching capability of K-means algorithm is proposed in this article.

...read moreread less

Abstract: An efficient partitional clustering technique, called SAKM-clustering, that integrates the power of simulated annealing for obtaining minimum energy configuration, and the searching capability of K-means algorithm is proposed in this article. The clustering methodology is used to search for appropriate clusters in multidimensional feature space such that a similarity metric of the resulting clusters is optimized. Data points are redistributed among the clusters probabilistically, so that points that are farther away from the cluster center have higher probabilities of migrating to other clusters than those which are closer to it. The superiority of the SAKM-clustering algorithm over the widely used K-means algorithm is extensively demonstrated for artificial and real life data sets.

...read moreread less

78 citations

Journal Article•DOI•

Finding Number of Clusters before Finding Clusters

[...]

Malay K. Pakhira¹•Institutions (1)

Kalyani Government Engineering College¹

01 Jan 2012-Procedia Technology

TL;DR: It is shown how VAT-based algorithms may be used for automatic determination of number of clusters very efficiently.

...read moreread less

23 citations

Proceedings Article•DOI•

A hardware pipeline for function optimization using genetic algorithms

[...]

Malay K. Pakhira¹, Rajat K. De²•Institutions (2)

Kalyani Government Engineering College¹, Indian Statistical Institute²

25 Jun 2005

TL;DR: A pipelined version of the commonly used Genetic Algorithms and a corresponding hardware platform is described and a detailed description of one such unit is presented, showing that PLGA may be even more effective than PGAs.

...read moreread less

Abstract: Genetic Algorithms (GAs) are very commonly used as function optimizers, basically due to their search capability. A number of different serial and parallel versions of GA exist. In this paper, a pipelined version of the commonly used Genetic Algorithms and a corresponding hardware platform is described. The main idea of achieving pipelined execution of different operations of GA is to use a stochastic selection function which works with the fitness value of the candidate chromosome only. The modified algorithm is termed PLGA (Pipelined Genetic Algorithm). When executed in a CGA (Classical Genetic Algorithm) framework, the stochastic selection gives comparable performances with the roulette-wheel selection. In the pipelined hardware environment, PLGA will be much faster than the CGA. When executed on similar hardware platforms, PLGA may attain a maximum speedup of four over CGA. However, if CGA is executed in a uniprocessor system the speedup is much more. A comparison of PLGA against PGA (Parallel Genetic Algorithms) shows that PLGA may be even more effective than PGAs. A scheme for realizing the hardware pipeline is also presented. Since a general function evaluation unit is essential, a detailed description of one such unit is presented.

...read moreread less

16 citations

1
2
3
4
…

Cited by

PDF

Open Access

More filters

Journal Article•DOI•

Performance evaluation of some clustering algorithms and validity indices

[...]

Ujjwal Maulik, Sanghamitra Bandyopadhyay¹•Institutions (1)

Indian Statistical Institute¹

01 Dec 2002-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: This article evaluates the performance of three clustering algorithms, hard K-Means, single linkage, and a simulated annealing (SA) based technique, in conjunction with four cluster validity indices, namely Davies-Bouldin index, Dunn's index, Calinski-Harabasz index, andA recently developed index I.

...read moreread less

Abstract: In this article, we evaluate the performance of three clustering algorithms, hard K-Means, single linkage, and a simulated annealing (SA) based technique, in conjunction with four cluster validity indices, namely Davies-Bouldin index, Dunn's index, Calinski-Harabasz index, and a recently developed index I. Based on a relation between the index I and the Dunn's index, a lower bound of the value of the former is theoretically estimated in order to get unique hard K-partition when the data set has distinct substructures. The effectiveness of the different validity indices and clustering methods in automatically evolving the appropriate number of clusters is demonstrated experimentally for both artificial and real-life data sets with the number of clusters varying from two to ten. Once the appropriate number of clusters is determined, the SA-based clustering technique is used for proper partitioning of the data into the said number of clusters.

...read moreread less

1,247 citations

Journal Article•DOI•

A Simulated Annealing-Based Multiobjective Optimization Algorithm: AMOSA

[...]

Sanghamitra Bandyopadhyay, Sriparna Saha, Ujjwal Maulik¹, Kalyanmoy Deb²•Institutions (2)

Jadavpur University¹, Indian Institute of Technology Kanpur²

01 Jun 2008-IEEE Transactions on Evolutionary Computation

TL;DR: A simulated annealing based multiobjective optimization algorithm that incorporates the concept of archive in order to provide a set of tradeoff solutions for the problem under consideration that is found to be significantly superior for many objective test problems.

...read moreread less

Abstract: This paper describes a simulated annealing based multiobjective optimization algorithm that incorporates the concept of archive in order to provide a set of tradeoff solutions for the problem under consideration. To determine the acceptance probability of a new solution vis-a-vis the current solution, an elaborate procedure is followed that takes into account the domination status of the new solution with the current solution, as well as those in the archive. A measure of the amount of domination between two solutions is also used for this purpose. A complexity analysis of the proposed algorithm is provided. An extensive comparative study of the proposed algorithm with two other existing and well-known multiobjective evolutionary algorithms (MOEAs) demonstrate the effectiveness of the former with respect to five existing performance measures, and several test problems of varying degrees of difficulty. In particular, the proposed algorithm is found to be significantly superior for many objective test problems (e.g., 4, 5, 10, and 15 objective problems), while recent studies have indicated that the Pareto ranking-based MOEAs perform poorly for such problems. In a part of the investigation, comparison of the real-coded version of the proposed algorithm is conducted with a very recent multiobjective simulated annealing algorithm, where the performance of the former is found to be generally superior to that of the latter.

...read moreread less

764 citations

Journal Article•DOI•

Automatic Clustering Using an Improved Differential Evolution Algorithm

[...]

Swagatam Das¹, Ajith Abraham², Amit Konar¹•Institutions (2)

Jadavpur University¹, Norwegian University of Science and Technology²

01 Jan 2008

TL;DR: Differential evolution has emerged as one of the fast, robust, and efficient global search heuristics of current interest as mentioned in this paper, which has been applied to the automatic clustering of large unlabeled data sets.

...read moreread less

Abstract: Differential evolution (DE) has emerged as one of the fast, robust, and efficient global search heuristics of current interest. This paper describes an application of DE to the automatic clustering of large unlabeled data sets. In contrast to most of the existing clustering techniques, the proposed algorithm requires no prior knowledge of the data to be classified. Rather, it determines the optimal number of partitions of the data "on the run." Superiority of the new method is demonstrated by comparing it with two recently developed partitional clustering techniques and one popular hierarchical clustering algorithm. The partitional clustering algorithms are based on two powerful well-known optimization algorithms, namely the genetic algorithm and the particle swarm optimization. An interesting real-world application of the proposed method to automatic segmentation of images is also reported.

...read moreread less

700 citations

Journal Article•DOI•

A Survey of Evolutionary Algorithms for Clustering

[...]

Eduardo R. Hruschka¹, Ricardo J. G. B. Campello¹, Alex A. Freitas², A.C.P.L.F. de Carvalho¹•Institutions (2)

University of São Paulo¹, University of Kent²

01 Mar 2009

TL;DR: An up-to-date overview that is fully devoted to evolutionary algorithms for clustering, is not limited to any particular kind of evolutionary approach, and comprises advanced topics like multiobjective and ensemble-based evolutionary clustering.

...read moreread less

Abstract: This paper presents a survey of evolutionary algorithms designed for clustering tasks. It tries to reflect the profile of this area by focusing more on those subjects that have been given more importance in the literature. In this context, most of the paper is devoted to partitional algorithms that look for hard clusterings of data, though overlapping (i.e., soft and fuzzy) approaches are also covered in the paper. The paper is original in what concerns two main aspects. First, it provides an up-to-date overview that is fully devoted to evolutionary algorithms for clustering, is not limited to any particular kind of evolutionary approach, and comprises advanced topics like multiobjective and ensemble-based evolutionary clustering. Second, it provides a taxonomy that highlights some very important aspects in the context of evolutionary data clustering, namely, fixed or variable number of clusters, cluster-oriented or nonoriented operators, context-sensitive or context-insensitive operators, guided or unguided operators, binary, integer, or real encodings, centroid-based, medoid-based, label-based, tree-based, or graph-based representations, among others. A number of references are provided that describe applications of evolutionary algorithms for clustering in different domains, such as image processing, computer security, and bioinformatics. The paper ends by addressing some important issues and open questions that can be subject of future research.

...read moreread less

690 citations

Proceedings Article•DOI•

A local search approximation algorithm for k-means clustering

[...]

Tapas Kanungo¹, David M. Mount², Nathan S. Netanyahu², Christine D. Piatko³, Ruth Silverman², Angela Y. Wu⁴ - Show less +2 more•Institutions (4)

IBM¹, University of Maryland, College Park², Johns Hopkins University Applied Physics Laboratory³, American University⁴

05 Jun 2002

TL;DR: This work considers the question of whether there exists a simple and practical approximation algorithm for k-means clustering, and presents a local improvement heuristic based on swapping centers in and out that yields a (9+ε)-approximation algorithm.

...read moreread less

Abstract: In k-means clustering we are given a set of n data points in d-dimensional space ℜd and an integer k, and the problem is to determine a set of k points in ℜd, called centers, to minimize the mean squared distance from each data point to its nearest center. No exact polynomial-time algorithms are known for this problem. Although asymptotically efficient approximation algorithms exist, these algorithms are not practical due to the extremely high constant factors involved. There are many heuristics that are used in practice, but we know of no bounds on their performance.We consider the question of whether there exists a simple and practical approximation algorithm for k-means clustering. We present a local improvement heuristic based on swapping centers in and out. We prove that this yields a (9+e)-approximation algorithm. We show that the approximation factor is almost tight, by giving an example for which the algorithm achieves an approximation factor of (9-e). To establish the practical value of the heuristic, we present an empirical study that shows that, when combined with Lloyd's algorithm, this heuristic performs quite well in practice.

...read moreread less

639 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199

Collapse