Cluster Validity with Fuzzy Sets

doi:10.1080/01969727308546047

Journal Article•DOI•

01 Jan 1973-Vol. 3, Iss: 3, pp 58-73

TL;DR: This paper uses membership function matrices associated with fuzzy c-partitions of X, together with their values in the Euclidean (matrix) norm, to formulate an a posteriori method for evaluating algorithmically suggested clusterings of X.

Abstract: Given a finite, unlabelled set of real vectors X, one often presumes the existence of (c) subsets (clusters) in X, the members of which somehow bear more similarity to each other than to members of adjoining clusters. In this paper, we use membership function matrices associated with fuzzy c-partitions of X, together with their values in the Euclidean (matrix) norm, to formulate an a posteriori method for evaluating algorithmically suggested clusterings of X. Several numerical examples are offered in support of the proposed technique.

Citations

Journal Article•DOI•

Fuzzy Model Identification Based on Cluster Estimation

Stephen L. Chiu¹•Institutions (1)

Rockwell Automation¹

01 May 1994-Journal of Intelligent and Fuzzy Systems

TL;DR: An efficient method for estimating cluster centers of numerical data that can be used to determine the number of clusters and their initial values for initializing iterative optimization-based clustering algorithms such as fuzzy C-means is presented.

Abstract: We present an efficient method for estimating cluster centers of numerical data. This method can be used to determine the number of clusters and their initial values for initializing iterative optimization-based clustering algorithms such as fuzzy C-means. Here we use the cluster estimation method as the basis of a fast and robust algorithm for identifying fuzzy models. A benchmark problem involving the prediction of a chaotic time series shows this model identification method compares favorably with other, more computationally intensive methods. We also illustrate an application of this method in modeling the relationship between automobile trips and demographic factors.

2,815 citations

Proceedings Article•DOI•

Fuzzy clustering with a fuzzy covariance matrix

Donald E. Gustafson, William Kessel

01 Jan 1978

TL;DR: Experimental results are presented which indicate that more accurate clustering may be obtained by using fuzzy covariances, a natural approach to fuzzy clustering.

Abstract: A class of fuzzy ISODATA clustering algorithms has been developed previously which includes fuzzy means. This class of algorithms is generalized to include fuzzy covariances. The resulting algorithm closely resembles maximum likelihood estimation of mixture densities. It is argued that use of fuzzy covariances is a natural approach to fuzzy clustering. Experimental results are presented which indicate that more accurate clustering may be obtained by using fuzzy covariances.

1,988 citations

Journal Article•DOI•

On cluster validity for the fuzzy c-means model

Nikhil R. Pal¹, James C. Bezdek¹•Institutions (1)

University of West Florida¹

01 Aug 1995-IEEE Transactions on Fuzzy Systems

TL;DR: Limit analysis indicates, and numerical experiments confirm, that the Fukuyama-Sugeno index is sensitive to both high and low values of m and may be unreliable because of this; calculations suggest that the best choice for m is probably in the interval [1.5, 2.5], whose mean and midpoint, m=2, have often been the preferred choice for many users of FCM.

Abstract: Many functionals have been proposed for validation of partitions of object data produced by the fuzzy c-means (FCM) clustering algorithm. We examine the role a subtle but important parameter, the weighting exponent m of the FCM model, plays in determining the validity of FCM partitions. The functionals considered are the partition coefficient and entropy indexes of Bezdek, the Xie-Beni (1991) and extended Xie-Beni indexes, and the Fukuyama-Sugeno index (1989). Limit analysis indicates, and numerical experiments confirm, that the Fukuyama-Sugeno index is sensitive to both high and low values of m and may be unreliable because of this. Of the indexes tested, the Xie-Beni index provided the best response over a wide range of choices for the number of clusters (2-10) and for m from 1.01 to 7. Finally, our calculations suggest that the best choice for m is probably in the interval [1.5, 2.5], whose mean and midpoint, m=2, have often been the preferred choice for many users of FCM.

1,724 citations

Journal Article•DOI•

Fuzzy c-means clustering with spatial information for image segmentation.

Keh-Shih Chuang¹, Hong Long Tzeng¹,², Sharon C.-A. Chen¹, Jay Wu¹,², Tzong-Jer Chen•Institutions (2)

National Tsing Hua University¹, Atomic Energy Council²

01 Jan 2006-Computerized Medical Imaging and Graphics

TL;DR: This paper presents a fuzzy c-means (FCM) algorithm that incorporates spatial information into the membership function for clustering and yields regions more homogeneous than those of other methods.

1,296 citations

Cites methods from "Cluster Validity with Fuzzy Sets"

...Disadvantages of Vpc and Vpe are that they measure only the fuzzy partition and lack a direct connection to the featuring property....
[...]
...As a result, the best clustering is achieved when the value Vpc is maximal or Vpe is minimal....
[...]
...The representative functions for the fuzzy partition are partition coefficient Vpc [9] and partition entropy Vpe [10]....
[...]
...They are defined as follows: $V_{pc} = \frac{\sum_{j}^{N} \sum_{i}^{c} u_{ij}^{2}}{N}$ (6) and $V_{pe} = -\frac{\sum_{j}^{N} \sum_{i}^{c} \left[ u_{ij} \log u_{ij} \right]}{N}$ (7). The idea of these validity functions is that the partition with less fuzziness means better performance....
[...]
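The two validity functions quoted above can be sketched in a few lines of Python. This is an illustrative sketch only, not code from the cited paper. The function names are hypothetical, and it makes three assumptions that the excerpt does not state: the membership matrix is stored as c rows (clusters) by N columns (data points), the logarithm is the natural logarithm, and a term with u_ij = 0 contributes nothing (the usual 0 log 0 = 0 convention).

```python
import math


def partition_coefficient(U):
    """V_pc = (1/N) * sum_j sum_i u_ij^2 for a c x N membership matrix U."""
    n_points = len(U[0])
    return sum(u * u for row in U for u in row) / n_points


def partition_entropy(U):
    """V_pe = -(1/N) * sum_j sum_i u_ij * log(u_ij).

    Assumptions: natural logarithm, and zero memberships are skipped
    (0 * log 0 is treated as 0).
    """
    n_points = len(U[0])
    return sum(-u * math.log(u) for row in U for u in row if u > 0.0) / n_points


# Hypothetical membership matrices for c = 2 clusters and N = 3 points.
# Each column sums to 1, as required for a fuzzy c-partition.
crisp = [[1.0, 1.0, 0.0],
         [0.0, 0.0, 1.0]]
uniform = [[0.5, 0.5, 0.5],
           [0.5, 0.5, 0.5]]

for name, U in (("crisp", crisp), ("uniform", uniform)):
    print(name, partition_coefficient(U), partition_entropy(U))
```

Under these assumptions, a hard (crisp) partition gives V_pc = 1 and V_pe = 0, while the maximally fuzzy partition with every membership equal to 1/c gives V_pc = 1/c and V_pe = log c. This matches the excerpt's reading that a larger V_pc or a smaller V_pe indicates a less fuzzy partition.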

Journal Article•DOI•

Network Anomaly Detection: Methods, Systems and Tools

Monowar H. Bhuyan¹, Dhruba K. Bhattacharyya¹, Jugal Kalita²•Institutions (2)

Tezpur University¹, University of Colorado Colorado Springs²

21 Jan 2014-IEEE Communications Surveys and Tutorials

TL;DR: This paper provides a structured and comprehensive overview of various facets of network anomaly detection so that a researcher can become quickly familiar with every aspect of network anomaly detection.

Abstract: Network anomaly detection is an important and dynamic research area. Many network intrusion detection methods and systems (NIDS) have been proposed in the literature. In this paper, we provide a structured and comprehensive overview of various facets of network anomaly detection so that a researcher can become quickly familiar with every aspect of network anomaly detection. We present attacks normally encountered by network intrusion detection systems. We categorize existing network anomaly detection methods and systems based on the underlying computational techniques used. Within this framework, we briefly describe and compare a large number of network anomaly detection methods and systems. In addition, we also discuss tools that can be used by network defenders and datasets that researchers in network anomaly detection can use. We also highlight research directions in network anomaly detection.

971 citations

Cites background from "Cluster Validity with Fuzzy Sets"

...Bezdek [76] Classification entropy CE = 1 N ∑k...
[...]

References

Journal Article•DOI•

A Fuzzy Relative of the ISODATA Process and Its Use in Detecting Compact Well-Separated Clusters

J. C. Dunn

01 Jan 1973

TL;DR: Two fuzzy versions of the k-means optimal, least squared error partitioning problem are formulated for finite subsets X of a general inner product space; in both cases, the extremizing solutions are shown to be fixed points of a certain operator T on the class of fuzzy k-partitions of X, and simple iteration of T provides an algorithm which has the descent property relative to the least squared error criterion function.

Abstract: Two fuzzy versions of the k-means optimal, least squared error partitioning problem are formulated for finite subsets X of a general inner product space. In both cases, the extremizing solutions are shown to be fixed points of a certain operator T on the class of fuzzy, k-partitions of X, and simple iteration of T provides an algorithm which has the descent property relative to the least squared error criterion function. In the first case, the range of T consists largely of ordinary (i.e. non-fuzzy) partitions of X and the associated iteration scheme is essentially the well known ISODATA process of Ball and Hall. However, in the second case, the range of T consists mainly of fuzzy partitions and the associated algorithm is new; when X consists of k compact well separated (CWS) clusters, Xi , this algorithm generates a limiting partition with membership functions which closely approximate the characteristic functions of the clusters Xi . However, when X is not the union of k CWS clusters, the limi...

5,787 citations

A fuzzy relative of the isodata process and its use in detecting compact well-separated clusters

J. C. Dunn

01 Jan 1973

TL;DR: In this paper, two fuzzy versions of the k-means optimal, least squared error partitioning problem are formulated for finite subsets X of a general inner product space, and the extremizing solutions are shown to be fixed points of a certain operator T on the class of fuzzy, k-partitions of X, and simple iteration of T provides an algorithm which has the descent property relative to the LSE criterion function.

5,254 citations

Journal Article•DOI•

Graph-Theoretical Methods for Detecting and Describing Gestalt Clusters

C.T. Zahn

01 Jan 1971-IEEE Transactions on Computers

TL;DR: A family of graph-theoretical algorithms based on the minimal spanning tree are capable of detecting several kinds of cluster structure in arbitrary point sets; description of the detected clusters is possible in some cases by extensions of the method.

Abstract: A family of graph-theoretical algorithms based on the minimal spanning tree are capable of detecting several kinds of cluster structure in arbitrary point sets; description of the detected clusters is possible in some cases by extensions of the method. Development of these clustering algorithms was based on examples from two-dimensional space because we wanted to copy the human perception of gestalts or point groupings. On the other hand, all the methods considered apply to higher dimensional spaces and even to general metric spaces. Advantages of these methods include determinacy, easy interpretation of the resulting clusters, conformity to gestalt principles of perceptual organization, and invariance of results under monotone transformations of interpoint distance. Brief discussion is made of the application of cluster detection to taxonomy and the selection of good feature spaces for pattern recognition. Detailed analyses of several planar cluster detection problems are illustrated by text and figures. The well-known Fisher iris data, in four-dimensional space, have been analyzed by these methods also. PL/1 programs to implement the minimal spanning tree methods have been fully debugged.

1,832 citations

Journal Article•DOI•

On Some Invariant Criteria for Grouping Data

H. P. Friedman¹, J. Rubin¹•Institutions (1)

IBM¹

01 Dec 1967-Journal of the American Statistical Association

TL;DR: This paper attacks the problem of exploring the structure of multivariate data in search of “clusters” by using a computer procedure to obtain the “best” partition of n objects into g groups.

Abstract: This paper deals with methods of “cluster analysis”. In particular we attack the problem of exploring the structure of multivariate data in search of “clusters”. The approach taken is to use a computer procedure to obtain the “best” partition of n objects into g groups. A number of mathematical criteria for “best” are discussed and related to statistical theory. A procedure for optimizing the criteria is outlined. Some of the criteria are compared with respect to their behavior on actual data. Results of data analysis are presented and discussed.

586 citations

Journal Article•DOI•

State of the art in pattern recognition

George Nagy¹•Institutions (1)

IBM¹

01 Jan 1968

TL;DR: This paper reviews statistical, adaptive, and heuristic techniques used in laboratory investigations of pattern recognition problems, including correlation methods, discriminant analysis, maximum likelihood decisions, minimax techniques, perceptron-like algorithms, feature extraction, preprocessing, clustering and nonsupervised learning.

Abstract: This paper reviews statistical, adaptive, and heuristic techniques used in laboratory investigations of pattern recognition problems. The discussion includes correlation methods, discriminant analysis, maximum likelihood decisions, minimax techniques, perceptron-like algorithms, feature extraction, preprocessing, clustering and nonsupervised learning. Two-dimensional distributions are used to illustrate the properties of the various procedures. Several experimental projects, representative of prospective applications, are also described.

317 citations